Test validity

<h2 id="historical-background">Historical background</h2>
Although psychologists and educators were aware of several facets of validity before World War II, their methods for establishing validity were commonly restricted to <a href="/facts/Correlation/egkluAEm">correlations</a> of test scores with some known criterion.<a class="footnote-ref" id="fnref:10" href="#fn:10">10</a> Under the direction of <a href="/facts/Lee_Cronbach/M3YoPUIw">Lee Cronbach</a>, the 1954 Technical Recommendations for Psychological Tests and Diagnostic Techniques<a class="footnote-ref" id="fnref:11" href="#fn:11">11</a> attempted to clarify and broaden the scope of validity by dividing it into four parts: (a) <a href="/facts/Concurrent_validity/Xe5NEwsR">concurrent validity</a>, (b) <a href="/facts/Predictive_validity/95Hb2PbE">predictive validity</a>, (c) <a href="/facts/Content_validity/AmepFdfR">content validity</a>, and (d) <a href="/facts/Construct_validity/exjLSRJI">construct validity</a>. Cronbach and Meehl's subsequent publication<a class="footnote-ref" id="fnref:12" href="#fn:12">12</a> grouped predictive and concurrent validity into a "criterion-orientation", which eventually became <a href="/facts/Criterion_validity/fUy8MeKm">criterion validity</a>.
Over the next four decades, many theorists, including Cronbach himself,<a class="footnote-ref" id="fnref:13" href="#fn:13">13</a> voiced their dissatisfaction with this three-in-one model of validity.<a class="footnote-ref" id="fnref:14" href="#fn:14">14</a><a class="footnote-ref" id="fnref:15" href="#fn:15">15</a><a class="footnote-ref" id="fnref:16" href="#fn:16">16</a> Their arguments culminated in <a href="/facts/Samuel_Messick/ew5MC7ln">Samuel Messick's</a> 1995 article that described validity as a single construct, composed of six "aspects".<a class="footnote-ref" id="fnref:17" href="#fn:17">17</a> In his view, various inferences made from test scores may require different types of evidence, but not different validities.
The 1999 Standards for Educational and Psychological Testing<a class="footnote-ref" id="fnref:18" href="#fn:18">18</a> largely codified Messick's model. They describe five types of validity-supporting evidence that incorporate each of Messick's aspects, and make no mention of the classical model's content, criterion, and construct validities.

<h2 id="validation-process">Validation process</h2>
According to the 1999 Standards,<a class="footnote-ref" id="fnref:19" href="#fn:19">19</a> validation is the process of gathering evidence to provide "a sound scientific basis" for interpreting the scores as proposed by the test developer and/or the test user. Validation therefore begins with a framework that defines the scope and aspects (in the case of multi-dimensional scales) of the proposed interpretation. The framework also includes a rational justification linking the interpretation to the test in question.
Validity researchers then list a series of propositions that must hold if the interpretation is to be valid or, conversely, compile a list of issues that may threaten the validity of the interpretation. In either case, the researchers proceed by gathering evidence – be it original empirical research, meta-analysis or review of existing literature, or logical analysis of the issues – to support or to question the interpretation's propositions (or the threats to the interpretation's validity). Emphasis is placed on the quality, rather than the quantity, of the evidence.
A single interpretation of any test result may require several propositions to be true (or may be questioned by any one of a set of threats to its validity). Strong evidence in support of a single proposition does not lessen the requirement to support the other propositions.
Evidence to support (or question) the validity of an interpretation falls into one of five categories:

<ol><li>Evidence based on test content</li>
<li>Evidence based on response processes</li>
<li>Evidence based on internal structure</li>
<li>Evidence based on relations to other variables</li>
<li>Evidence based on consequences of testing</li></ol>
Techniques to gather each type of evidence should only be employed when they yield information that would support or question the propositions required for the interpretation in question.
Finally, each piece of evidence is integrated into a validity argument. The argument may call for a revision to the test, its administration protocol, or the theoretical constructs underlying the interpretations. If the test or the interpretations of its results are revised in any way, a new validation process must gather evidence to support the new version.
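The bookkeeping behind such a validity argument can be sketched in code. The following Python sketch is illustrative only: the class names, the example propositions, and the rule that a proposition counts as supported when it has at least one piece of evidence and none that questions it are all assumptions made for this example, not part of the Standards. In practice, the quality of each piece of evidence is weighed by judgment rather than counted.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class EvidenceSource(Enum):
    """The five categories of validity evidence listed above."""
    TEST_CONTENT = "test content"
    RESPONSE_PROCESSES = "response processes"
    INTERNAL_STRUCTURE = "internal structure"
    RELATIONS_TO_OTHER_VARIABLES = "relations to other variables"
    CONSEQUENCES_OF_TESTING = "consequences of testing"


@dataclass
class Evidence:
    source: EvidenceSource
    summary: str
    supports: bool  # True if it supports the proposition, False if it questions it


@dataclass
class Proposition:
    statement: str
    evidence: List[Evidence] = field(default_factory=list)

    def is_supported(self) -> bool:
        # Simplifying assumption: supported means some evidence exists
        # and none of it questions the proposition.
        return bool(self.evidence) and all(e.supports for e in self.evidence)


def unsupported_propositions(propositions: List[Proposition]) -> List[str]:
    """Return the statements that still lack adequate support."""
    return [p.statement for p in propositions if not p.is_supported()]


# Hypothetical propositions for interpreting a reading test score.
propositions = [
    Proposition(
        "Items sample the reading curriculum",
        [Evidence(EvidenceSource.TEST_CONTENT, "Expert review of item coverage", True)],
    ),
    Proposition(
        "Scores relate to teacher ratings of reading",
        [Evidence(EvidenceSource.RELATIONS_TO_OTHER_VARIABLES,
                  "Correlation study with teacher ratings", True)],
    ),
    Proposition("Scores are not distorted by unfamiliar item formats"),
]

gaps = unsupported_propositions(propositions)
argument_complete = not gaps
print(gaps)
print(argument_complete)
```

Here two propositions are well supported, yet the argument remains incomplete because the third has no evidence at all, mirroring the point that strong support for one proposition does not lessen the need to support the others.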

<h2 id="see-also">See also</h2>
<ul><li><a href="/facts/Validity_scale/nbKn9poa">Validity scale</a></li></ul>

<h2 id="references">References</h2>

<ol>
<li id="fn:1">American Educational Research Association; American Psychological Association; National Council on Measurement in Education (1999). Standards for educational and psychological testing. Washington, DC: American Educational Research Association. ISBN 978-0-935302-25-7. Archived from the original on 15 January 2025. <a href="#fnref:1" class="footnote-back-ref">↩</a></li>
<li id="fn:2">Guion, R. M. (1980). "On trinitarian doctrines of validity". Professional Psychology. 11: 385–398. doi:10.1037/0735-7028.11.3.385. <a href="#fnref:2" class="footnote-back-ref">↩</a></li>
<li id="fn:3">Messick, S. (1995). "Validity of psychological assessment: Validation of inferences from persons' responses and performances as scientific inquiry into score meaning" (PDF). American Psychologist. 50: 741–749. doi:10.1037/0003-066X.50.9.741. Archived (PDF) from the original on 11 December 2024. <a href="https://files.eric.ed.gov/fulltext/ED380496.pdf" target="_blank">https://files.eric.ed.gov/fulltext/ED380496.pdf</a> <a href="#fnref:3" class="footnote-back-ref">↩</a></li>
<li id="fn:4">Popham, W. J. (2008). "All About Assessment / A Misunderstood Grail". Educational Leadership. 66 (1): 82–83. Archived from the original on 27 January 2025. <a href="https://ascd.org/el/articles/a-misunderstood-grail" target="_blank">https://ascd.org/el/articles/a-misunderstood-grail</a> <a href="#fnref:4" class="footnote-back-ref">↩</a></li>
<li id="fn:5">Messick, S. (1995). "Validity of psychological assessment: Validation of inferences from persons' responses and performances as scientific inquiry into score meaning" (PDF). American Psychologist. 50: 741–749. doi:10.1037/0003-066X.50.9.741. Archived (PDF) from the original on 11 December 2024. <a href="https://files.eric.ed.gov/fulltext/ED380496.pdf" target="_blank">https://files.eric.ed.gov/fulltext/ED380496.pdf</a> <a href="#fnref:5" class="footnote-back-ref">↩</a></li>
<li id="fn:6">Nitko, J.J.; Brookhart, S. M. (2004). Educational assessment of students. Upper Saddle River, NJ: Merrill-Prentice Hall. <a href="#fnref:6" class="footnote-back-ref">↩</a></li>
<li id="fn:7">American Psychological Association; American Educational Research Association; National Council on Measurement in Education (1954). Technical recommendations for psychological tests and diagnostic techniques. Washington, DC: The Association. doi:10.1037/h0053479. <a href="#fnref:7" class="footnote-back-ref">↩</a></li>
<li id="fn:8">Messick, S. (1995). "Validity of psychological assessment: Validation of inferences from persons' responses and performances as scientific inquiry into score meaning" (PDF). American Psychologist. 50: 741–749. doi:10.1037/0003-066X.50.9.741. Archived (PDF) from the original on 11 December 2024. <a href="https://files.eric.ed.gov/fulltext/ED380496.pdf" target="_blank">https://files.eric.ed.gov/fulltext/ED380496.pdf</a> <a href="#fnref:8" class="footnote-back-ref">↩</a></li>
<li id="fn:9">American Educational Research Association; American Psychological Association; National Council on Measurement in Education (1999). Standards for educational and psychological testing. Washington, DC: American Educational Research Association. ISBN 978-0-935302-25-7. Archived from the original on 15 January 2025. <a href="#fnref:9" class="footnote-back-ref">↩</a></li>
<li id="fn:10">Angoff, W. H. (1988). "Validity: An evolving concept". In Wainer, H.; Braun, H. (eds.). Test Validity. Hillsdale, NJ: Lawrence Erlbaum. pp. 19–32. doi:10.4324/9780203056905. <a href="#fnref:10" class="footnote-back-ref">↩</a></li>
<li id="fn:11">American Psychological Association; American Educational Research Association; National Council on Measurement in Education (1954). Technical recommendations for psychological tests and diagnostic techniques. Washington, DC: The Association. doi:10.1037/h0053479. <a href="#fnref:11" class="footnote-back-ref">↩</a></li>
<li id="fn:12">Cronbach, L. J.; Meehl, P. E. (1955). "Construct validity in psychological tests". Psychological Bulletin. 52: 281–302. doi:10.1037/h0040957. hdl:11299/184279. Archived from the original on 10 September 2024. <a href="https://conservancy.umn.edu/server/api/core/bitstreams/1531d762-af5a-4515-95f3-90c01ac994be/content" target="_blank">https://conservancy.umn.edu/server/api/core/bitstreams/1531d762-af5a-4515-95f3-90c01ac994be/content</a> <a href="#fnref:12" class="footnote-back-ref">↩</a></li>
<li id="fn:13">Cronbach, L. J. (1969). Validation of educational measures. Proceedings of the 1969 Invitational Conference on Testing Problems. Princeton, NJ: Educational Testing Service. pp. 35–52. <a href="#fnref:13" class="footnote-back-ref">↩</a></li>
<li id="fn:14">Loevinger, J. (1957). "Objective tests as instruments of psychological theory" (PDF). Psychological Reports. 3: 635–694. doi:10.2466/pr0.1957.3.3.635. Archived (PDF) from the original on 7 July 2024. <a href="https://users.cla.umn.edu/~nwaller/prelim/loevenger.pdf" target="_blank">https://users.cla.umn.edu/~nwaller/prelim/loevenger.pdf</a> <a href="#fnref:14" class="footnote-back-ref">↩</a></li>
<li id="fn:15">Tenopyr, M. L. (1977). "Content-construct confusion". Personnel Psychology. 30: 47–54. doi:10.1111/j.1744-6570.1977.tb02320.x. <a href="#fnref:15" class="footnote-back-ref">↩</a></li>
<li id="fn:16">Guion, R. M. (1977). "Content validity–The source of my discontent". Applied Psychological Measurement. 1: 1–10. doi:10.1177/014662167700100103. Archived from the original on 27 January 2025. <a href="https://conservancy.umn.edu/items/5a30191e-e9b9-4086-929a-20ad7055ba3b" target="_blank">https://conservancy.umn.edu/items/5a30191e-e9b9-4086-929a-20ad7055ba3b</a> <a href="#fnref:16" class="footnote-back-ref">↩</a></li>
<li id="fn:17">Messick, S. (1995). "Validity of psychological assessment: Validation of inferences from persons' responses and performances as scientific inquiry into score meaning" (PDF). American Psychologist. 50: 741–749. doi:10.1037/0003-066X.50.9.741. Archived (PDF) from the original on 11 December 2024. <a href="https://files.eric.ed.gov/fulltext/ED380496.pdf" target="_blank">https://files.eric.ed.gov/fulltext/ED380496.pdf</a> <a href="#fnref:17" class="footnote-back-ref">↩</a></li>
<li id="fn:18">American Educational Research Association; American Psychological Association; National Council on Measurement in Education (1999). Standards for educational and psychological testing. Washington, DC: American Educational Research Association. ISBN 978-0-935302-25-7. Archived from the original on 15 January 2025. <a href="#fnref:18" class="footnote-back-ref">↩</a></li>
<li id="fn:19">American Educational Research Association; American Psychological Association; National Council on Measurement in Education (1999). Standards for educational and psychological testing. Washington, DC: American Educational Research Association. ISBN 978-0-935302-25-7. Archived from the original on 15 January 2025. <a href="#fnref:19" class="footnote-back-ref">↩</a></li>
</ol>
