Equivalence test

<h2 id="tost-procedure">TOST procedure</h2>
A very simple equivalence testing approach is the ‘two one-sided t-tests’ (TOST) procedure.<a class="footnote-ref" id="fnref:11" href="#fn:11">11</a> In the TOST procedure, an upper (Δ<sub>U</sub>) and a lower (–Δ<sub>L</sub>) equivalence bound are specified based on the smallest effect size of interest (e.g., a positive or negative difference of d = 0.3). Two composite null hypotheses are tested: H<sub>01</sub>: Δ ≤ –Δ<sub>L</sub> and H<sub>02</sub>: Δ ≥ Δ<sub>U</sub>. When both of these one-sided tests can be statistically rejected, we can conclude that –Δ<sub>L</sub> < Δ < Δ<sub>U</sub>, that is, the observed effect falls within the equivalence bounds, is statistically smaller than any effect deemed worthwhile, and can be considered practically equivalent.<a class="footnote-ref" id="fnref:12" href="#fn:12">12</a> Alternatives to the TOST procedure have been developed as well.<a class="footnote-ref" id="fnref:13" href="#fn:13">13</a> A recent modification to TOST makes the approach feasible for repeated measures and for assessing multiple variables.<a class="footnote-ref" id="fnref:14" href="#fn:14">14</a>
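
The two one-sided tests described above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: it assumes two independent samples with equal variances, equivalence bounds given on the raw mean difference (not as a standardized d), and SciPy 1.6 or later for the `alternative` argument of `scipy.stats.ttest_ind`. The function name `tost_ind` and its parameters are hypothetical.

```python
import numpy as np
from scipy import stats


def tost_ind(x, y, lower, upper, alpha=0.05):
    """Two one-sided t-tests (TOST) for two independent samples.

    `lower` and `upper` are the equivalence bounds on the raw mean
    difference mean(x) - mean(y); typically lower < 0 < upper.
    Assumption of this sketch: equal variances (pooled t-test).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # H01: delta <= lower. Shifting x by -lower turns this into a
    # one-sided test of "difference > 0".
    p_lower = stats.ttest_ind(x - lower, y, alternative="greater").pvalue

    # H02: delta >= upper. Shifting x by -upper turns this into a
    # one-sided test of "difference < 0".
    p_upper = stats.ttest_ind(x - upper, y, alternative="less").pvalue

    # Equivalence is concluded only if BOTH null hypotheses are rejected,
    # so the overall p-value is the larger of the two.
    p_tost = max(p_lower, p_upper)
    return p_lower, p_upper, p_tost, bool(p_tost < alpha)


if __name__ == "__main__":
    x = np.linspace(-1.0, 1.0, 50)
    y = x + 0.01  # nearly identical sample, shifted by 0.01
    print(tost_ind(x, y, lower=-0.5, upper=0.5))
```

Taking the maximum of the two p-values reflects the requirement that both one-sided null hypotheses be rejected at level α before equivalence is claimed.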

<h2 id="comparison-between-t-test-and-equivalence-test">Comparison between t-test and equivalence test</h2>
The equivalence test can be induced from the <a href="/facts/Student%27s_t-test/dMH8uC28">t-test</a>.<a class="footnote-ref" id="fnref:15" href="#fn:15">15</a> Consider a t-test at the significance level α<sub>t-test</sub> with a <a href="/facts/Power_(statistics)/5KU54nW1">power</a> of 1 − β<sub>t-test</sub> for a relevant effect size d<sub>r</sub>. If Δ = d<sub>r</sub>, α<sub>equiv.-test</sub> = β<sub>t-test</sub> and β<sub>equiv.-test</sub> = α<sub>t-test</sub>, i.e. the error types (type I and type II) are interchanged between the t-test and the equivalence test, then the t-test will obtain the same results as the equivalence test. To achieve this for the t-test, either the sample size calculation needs to be carried out correctly, or the t-test significance level α<sub>t-test</sub> needs to be adjusted, which is referred to as the revised t-test.<a class="footnote-ref" id="fnref:16" href="#fn:16">16</a> Both approaches have difficulties in practice, since sample size planning relies on unverifiable assumptions about the standard deviation, and the revised t-test yields numerical problems.<a class="footnote-ref" id="fnref:17" href="#fn:17">17</a> These limitations can be removed, while preserving the test behavior, by using an equivalence test.  
The figure below allows a visual comparison of the equivalence test and the t-test when the sample size calculation is affected by differences between the a priori standard deviation σ and the sample's standard deviation σ̂, which is a common problem. Using an equivalence test instead of a t-test additionally ensures that α<sub>equiv.-test</sub> is bounded, which the t-test does not do when σ̂ > σ, in which case the type II error can grow arbitrarily large. On the other hand, σ̂ < σ results in the t-test being stricter than the d<sub>r</sub> specified in the planning, which may randomly penalize the sample source (e.g., a device manufacturer). This makes the equivalence test safer to use.
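
The dependence of the t-test on the planned standard deviation can be illustrated numerically. The following is a rough sketch under stated assumptions, not the method of the cited work: it uses a normal approximation to a two-sided one-sample test, treats the standard deviation actually present in the data as the quantity that differs from the planning value, and the helper names `planned_n` and `type_ii_error` are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def planned_n(d_r, sigma, alpha=0.05, beta=0.2):
    """Sample size for detecting effect d_r with power 1 - beta.

    Assumption: two-sided one-sample test, normal approximation,
    planning standard deviation `sigma`.
    """
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_beta = _STD_NORMAL.inv_cdf(1 - beta)
    return ceil(((z_alpha + z_beta) * sigma / d_r) ** 2)


def type_ii_error(d_r, sigma_actual, n, alpha=0.05):
    """Approximate probability of not rejecting when the true effect is d_r.

    `sigma_actual` is the standard deviation actually present in the data,
    which may differ from the value assumed during planning.
    """
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = d_r * sqrt(n) / sigma_actual
    # Probability that the test statistic stays inside the acceptance region.
    return _STD_NORMAL.cdf(z_alpha - shift) - _STD_NORMAL.cdf(-z_alpha - shift)


if __name__ == "__main__":
    n = planned_n(d_r=0.5, sigma=1.0)
    for sigma_actual in (1.0, 1.5, 2.0, 4.0):
        beta = type_ii_error(0.5, sigma_actual, n)
        print(f"n={n}, actual sd={sigma_actual}: type II error ~ {beta:.3f}")
```

With these illustrative inputs (d_r = 0.5, planning σ = 1), the planned sample size is 32. The type II error stays below the planned 0.2 when the actual standard deviation matches the planning value, and rises to roughly 0.7 when it is twice as large.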

<h2 id="see-also">See also</h2>
<ul><li><a href="/facts/Bootstrap_(statistics)/zCHuBeIz">Bootstrap (statistics)</a>-based testing</li></ul>
<h2 id="literature">Literature</h2>
<ul><li>Walker, Esteban; Nowacki, Amy S. (February 2011). <a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3019319">"Understanding Equivalence and Noninferiority Testing"</a>. Journal of General Internal Medicine. 26 (2): 192–6. <a href="/facts/Doi_(identifier)/muM9Etpq">doi</a>:<a href="https://doi.org/10.1007%2Fs11606-010-1513-8">10.1007/s11606-010-1513-8</a>. <a href="/facts/PMC_(identifier)/dX1zMt71">PMC</a> <a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3019319">3019319</a>. <a href="/facts/PMID_(identifier)/JlHAvMHt">PMID</a> <a href="https://pubmed.ncbi.nlm.nih.gov/20857339">20857339</a>.</li></ul>

<h2 id="references">References</h2>

<ol>
<li id="fn:1">Snapinn, Steven M. (2000). "Noninferiority trials". Current Controlled Trials in Cardiovascular Medicine. 1 (1): 19–21. doi:10.1186/CVM-1-1-019. PMC 59590. PMID 11714400. <a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC59590" target="_blank">https://www.ncbi.nlm.nih.gov/pmc/articles/PMC59590</a> <a href="#fnref:1" class="footnote-back-ref">↩</a></li>
<li id="fn:2">Rogers, James L.; Howard, Kenneth I.; Vessey, John T. (1993). "Using significance tests to evaluate equivalence between two experimental groups". Psychological Bulletin. 113 (3): 553–565. doi:10.1037/0033-2909.113.3.553. PMID 8316613. <a href="#fnref:2" class="footnote-back-ref">↩</a></li>
<li id="fn:3">Statistics applied to clinical trials (4th ed.). Springer. 2009. ISBN 978-1402095221. <a href="#fnref:3" class="footnote-back-ref">↩</a></li>
<li id="fn:4">Piaggio, Gilda; Elbourne, Diana R.; Altman, Douglas G.; Pocock, Stuart J.; Evans, Stephen J. W.; CONSORT Group, for the (8 March 2006). "Reporting of Noninferiority and Equivalence Randomized Trials" (PDF). JAMA. 295 (10): 1152–60. doi:10.1001/jama.295.10.1152. PMID 16522836. <a href="https://researchonline.lshtm.ac.uk/id/eprint/12069/1/Reporting%20of%20Noninferiority%20and%20Equivalence%20Randomized%20Trials.pdf" target="_blank">https://researchonline.lshtm.ac.uk/id/eprint/12069/1/Reporting%20of%20Noninferiority%20and%20Equivalence%20Randomized%20Trials.pdf</a> <a href="#fnref:4" class="footnote-back-ref">↩</a></li>
<li id="fn:5">Piantadosi, Steven (28 August 2017). Clinical trials: a methodologic perspective (Third ed.). John Wiley & Sons. p. 8.6.2. ISBN 978-1-118-95920-6. <a href="#fnref:5" class="footnote-back-ref">↩</a></li>
<li id="fn:6">Lakens, Daniël (2017-05-05). "Equivalence Tests". Social Psychological and Personality Science. 8 (4): 355–362. doi:10.1177/1948550617697177. PMC 5502906. PMID 28736600. <a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5502906" target="_blank">https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5502906</a> <a href="#fnref:6" class="footnote-back-ref">↩</a></li>
<li id="fn:7">Siebert, Michael; Ellenberger, David (2019-04-10). "Validation of automatic passenger counting: introducing the t-test-induced equivalence test". Transportation. 47 (6): 3031–3045. arXiv:1802.03341. doi:10.1007/s11116-019-09991-9. ISSN 0049-4488. <a href="https://doi.org/10.1007%2Fs11116-019-09991-9" target="_blank">https://doi.org/10.1007%2Fs11116-019-09991-9</a> <a href="#fnref:7" class="footnote-back-ref">↩</a></li>
<li id="fn:8">Schnellbach, Teresa (2022). Hydraulic Data Analysis Using Python. doi:10.26083/tuprints-00022026. <a href="http://tuprints.ulb.tu-darmstadt.de/22026/" target="_blank">http://tuprints.ulb.tu-darmstadt.de/22026/</a> <a href="#fnref:8" class="footnote-back-ref">↩</a></li>
<li id="fn:9">Jahn, Nico; Siebert, Michael (2022). "Engineering the Neural Automatic Passenger Counter". Engineering Applications of Artificial Intelligence. 114. arXiv:2203.01156. doi:10.1016/j.engappai.2022.105148. <a href="https://www.sciencedirect.com/science/article/pii/S0952197622002652" target="_blank">https://www.sciencedirect.com/science/article/pii/S0952197622002652</a> <a href="#fnref:9" class="footnote-back-ref">↩</a></li>
<li id="fn:10">Mazzolari, Raffaele; Porcelli, Simone; Bishop, David J.; Lakens, Daniël (March 2022). "Myths and methodologies: The use of equivalence and non‐inferiority tests for interventional studies in exercise physiology and sport science". Experimental Physiology. 107 (3): 201–212. doi:10.1113/EP090171. ISSN 0958-0670. PMID 35041233. S2CID 246051376. <a href="https://onlinelibrary.wiley.com/doi/10.1113/EP090171" target="_blank">https://onlinelibrary.wiley.com/doi/10.1113/EP090171</a> <a href="#fnref:10" class="footnote-back-ref">↩</a></li>
<li id="fn:11">Schuirmann, Donald J. (1987-12-01). "A comparison of the Two One-Sided Tests Procedure and the Power Approach for assessing the equivalence of average bioavailability". Journal of Pharmacokinetics and Biopharmaceutics. 15 (6): 657–680. doi:10.1007/BF01068419. ISSN 0090-466X. PMID 3450848. S2CID 206788664. <a href="https://zenodo.org/record/1232484" target="_blank">https://zenodo.org/record/1232484</a> <a href="#fnref:11" class="footnote-back-ref">↩</a></li>
<li id="fn:12">Lakens, Daniël (May 2017). "Equivalence Tests: A Practical Primer for t Tests, Correlations, and Meta-Analyses". Social Psychological and Personality Science. 8 (4): 355–362. doi:10.1177/1948550617697177. ISSN 1948-5506. PMC 5502906. PMID 28736600. <a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5502906" target="_blank">https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5502906</a> <a href="#fnref:12" class="footnote-back-ref">↩</a></li>
<li id="fn:13">Wellek, Stefan (2010). Testing statistical hypotheses of equivalence and noninferiority. Chapman and Hall/CRC. ISBN 978-1439808184. <a href="#fnref:13" class="footnote-back-ref">↩</a></li>
<li id="fn:14">Rose, Evangeline M.; Mathew, Thomas; Coss, Derek A.; Lohr, Bernard; Omland, Kevin E. (2018). "A new statistical method to test equivalence: an application in male and female eastern bluebird song". Animal Behaviour. 145: 77–85. doi:10.1016/j.anbehav.2018.09.004. ISSN 0003-3472. S2CID 53152801. <a href="#fnref:14" class="footnote-back-ref">↩</a></li>
<li id="fn:15">Siebert, Michael; Ellenberger, David (2019-04-10). "Validation of automatic passenger counting: introducing the t-test-induced equivalence test". Transportation. 47 (6): 3031–3045. arXiv:1802.03341. doi:10.1007/s11116-019-09991-9. ISSN 0049-4488. <a href="https://doi.org/10.1007%2Fs11116-019-09991-9" target="_blank">https://doi.org/10.1007%2Fs11116-019-09991-9</a> <a href="#fnref:15" class="footnote-back-ref">↩</a></li>
<li id="fn:16">Siebert, Michael; Ellenberger, David (2019-04-10). "Validation of automatic passenger counting: introducing the t-test-induced equivalence test". Transportation. 47 (6): 3031–3045. arXiv:1802.03341. doi:10.1007/s11116-019-09991-9. ISSN 0049-4488. <a href="https://doi.org/10.1007%2Fs11116-019-09991-9" target="_blank">https://doi.org/10.1007%2Fs11116-019-09991-9</a> <a href="#fnref:16" class="footnote-back-ref">↩</a></li>
<li id="fn:17">Siebert, Michael; Ellenberger, David (2019-04-10). "Validation of automatic passenger counting: introducing the t-test-induced equivalence test". Transportation. 47 (6): 3031–3045. arXiv:1802.03341. doi:10.1007/s11116-019-09991-9. ISSN 0049-4488. <a href="https://doi.org/10.1007%2Fs11116-019-09991-9" target="_blank">https://doi.org/10.1007%2Fs11116-019-09991-9</a> <a href="#fnref:17" class="footnote-back-ref">↩</a></li>
</ol>
