Software testability is the degree to which a software artifact (i.e. a software system, software module, requirements- or design document) supports testing in a given test context. If the testability of the software artifact is high, then finding faults in the system (if it has any) by means of testing is easier.
Formally, some systems are testable, and some are not. This classification can be achieved by noticing that, to be testable, for a functionality of the system under test "S", which takes input "I", a computable functional predicate "V" must exists such that is true when S, given input I, produce a valid output, false otherwise. This function "V" is known as the verification function for the system with input I.
Many software systems are untestable, or not immediately testable. For example, Google's ReCAPTCHA, without having any metadata about the images is not a testable system. Recaptcha, however, can be immediately tested if for each image shown, there is a tag stored elsewhere. Given this meta information, one can test the system.
Therefore, testability is often thought of as an extrinsic property which results from interdependency of the software to be tested and the test goals, test methods used, and test resources (i.e., the test context). Even though testability can not be measured directly (such as software size) it should be considered an intrinsic property of a software artifact because it is highly correlated with other key software qualities such as encapsulation, coupling, cohesion, and redundancy.
The correlation of 'testability' to good design can be observed by seeing that code that has weak cohesion, tight coupling, redundancy and lack of encapsulation is difficult to test.
In order to link the testability with the difficulty to find potential faults in a system (if they exist) by testing it, a relevant measure to assess the testability is how many test cases are needed in each case to form a complete test suite (i.e. a test suite such that, after applying all test cases to the system, collected outputs will let us unambiguously determine whether the system is correct or not according to some specification). If this size is small, then the testability is high. Based on this measure, a testability hierarchy has been proposed.
Testability, a property applying to empirical hypothesis, involves two components. The effort and effectiveness of software tests depends on numerous factors including:
- Properties of the software requirements
- Properties of the software itself (such as size, complexity and testability)
- Properties of the test methods used
- Properties of the development- and testing processes
- Qualification and motivation of the persons involved in the test process
Testability of software components
The testability of software components (modules, classes) is determined by factors such as:
- Controllability: The degree to which it is possible to control the state of the component under test (CUT) as required for testing.
- Observability: The degree to which it is possible to observe (intermediate and final) test results.
- Isolateability: The degree to which the component under test (CUT) can be tested in isolation.
- Separation of concerns: The degree to which the component under test has a single, well defined responsibility.
- Understandability: The degree to which the component under test is documented or self-explaining.
- Automatability: The degree to which it is possible to automate testing of the component under test.
- Heterogeneity: The degree to which the use of diverse technologies requires to use diverse test methods and tools in parallel.
The testability of software components can be improved by:
Based on the number of test cases required to construct a complete test suite in each context (i.e. a test suite such that, if it is applied to the implementation under test, then we collect enough information to precisely determine whether the system is correct or incorrect according to some specification), a testability hierarchy with the following testability classes has been proposed: 
- Class I: there exists a finite complete test suite.
- Class II: any partial distinguishing rate (i.e. any incomplete capability to distinguish correct systems from incorrect systems) can be reached with a finite test suite.
- Class III: there exists a countable complete test suite.
- Class IV: there exists a complete test suite.
- Class V: all cases.
It has been proved that each class is strictly included into the next. For instance, testing when we assume that the behavior of the implementation under test can be denoted by a deterministic finite-state machine for some known finite sets of inputs and outputs and with some known number of states belongs to Class I (and all subsequent classes). However, if the number of states is not known, then it only belongs to all classes from Class II on. If the implementation under test must be a deterministic finite-state machine failing the specification for a single trace (and its continuations), and its number of states is unknown, then it only belongs to classes from Class III on. Testing temporal machines where transitions are triggered if inputs are produced within some real-bounded interval only belongs to classes from Class IV on, whereas testing many non-deterministic systems only belongs to Class V (but not all, and some even belong to Class I). The inclusion into Class I does not require the simplicity of the assumed computation model, as some testing cases involving implementations written in any programming language, and testing implementations defined as machines depending on continuous magnitudes, have been proved to be in Class I. Other elaborated cases, such as the testing framework by Matthew Hennessy under must semantics, and temporal machines with rational timeouts, belong to Class II.
Testability of requirements
Requirements need to fulfill the following criteria in order to be testable:
- quantitative (a requirement like "fast response time" can not be verification/verified)
- verification/verifiable in practice (a test is feasible not only in theory but also in practice with limited resources)
Treating the requirement as axioms, testability can be treated via asserting existence of a function (software) such that input generates output , therefore . Therefore, the ideal software generates the tuple which is the input-output set , standing for specification.
Now, take a test input , which generates the output , that is the test tuple . Now, the question is whether or not or . If it is in the set, the test tuple passes, else the system fails the test input. Therefore, it is of imperative importance to figure out : can we or can we not create a function that effectively translates into the notion of the set indicator function for the specification set .
By the notion, is the testability function for the specification . The existence should not merely be asserted, should be proven rigorously. Therefore, obviously without algebraic consistency, no such function can be found, and therefore, the specification cease to be termed as testable.
- Shalloway, Alan; Trott, Jim (2004). Design Patterns Explained, 2nd Ed. p. 133. ISBN 978-0321247148.
- Rodríguez, Ismael; Llana, Luis; Rabanal, Pablo (2014). "A General Testability Theory: Classes, properties, complexity, and testing reductions". IEEE Transactions on Software Engineering. 40 (9): 862–894. doi:10.1109/TSE.2014.2331690. ISSN 0098-5589.
- Rodríguez, Ismael (2009). "A General Testability Theory". CONCUR 2009 - Concurrency Theory, 20th International Conference, CONCUR 2009, Bologna, Italy, September 1–4, 2009. Proceedings. pp. 572–586. doi:10.1007/978-3-642-04081-8_38. ISBN 978-3-642-04080-1.