Tuesday, November 3, 2009

Practical Significance Always Wins Out

Engineers and scientists are the most pragmatic people I know when it comes to analyzing and extracting key information with the statistical tools they have at hand. It is this pragmatism that often leads me to recommend equivalence tests for comparing one population mean to a standard value k, in place of the more common test of significance. Think about how a Student's t-test plays out when testing the hypotheses Null: μ = 50 ohm vs. Alternative: μ ≠ 50 ohm. If we reject the null hypothesis in favor of the alternative, we say that we have a statistically significant result. Once this is established, the next question is: how far is the mean from the target value of 50 ohm? In some cases this difference is small, say 0.05 ohm, and is of no practical consequence.
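
To see how a practically negligible difference can still come out statistically significant, consider a worked example with assumed numbers. The sample size n = 100, the sample standard deviation s = 0.2 ohm, and the sample mean of 50.05 ohm are illustrative values only, not taken from the data discussed in this post.

```latex
t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}
  = \frac{50.05 - 50}{0.2 / \sqrt{100}}
  = \frac{0.05}{0.02}
  = 2.5
```

With 99 degrees of freedom the two-sided 5% critical value is about 1.98, so under these assumptions the test would reject the null hypothesis even though the mean is only 0.05 ohm off target.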

The other possible outcome for this test of significance is that we do not reject the null hypothesis. Although failing to reject can never prove that μ = 50 ohm, we sometimes behave as if it had, and assume that the mean is no different from our standard value of 50. The natural question that arises is usually, "Can I say that the average resistance = 50 ohm?", to which I reply, "Not really."

My secret weapon for combining statistical and practical significance in one fell swoop is the Equivalence Test. Equivalence tests allow us to show that our mean is equivalent to a standard value within a stated bound. For instance, we can show that the average DC resistance of a cable is 50 ohm within ± 0.25 ohm. This is accomplished with two one-sided t-tests (TOST), one at each equivalence bound, and we must reject both null hypotheses simultaneously to conclude equivalence. The two sets of hypotheses are:

a) H0: μ ≤ 49.75 vs. H1: μ > 49.75 and
b) H0: μ ≥ 50.25 vs. H1: μ < 50.25.
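
The two sets of hypotheses above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the JMP procedure itself: the function names `t_sf` and `tost_one_sample` are hypothetical, the resistance readings are made-up values rather than the data analyzed in this post, and the t-distribution tail probability is approximated by numerically integrating the density.

```python
import math
from statistics import mean, stdev


def t_sf(t, df, steps=2000):
    """Approximate P(T > t) for Student's t with df degrees of freedom.

    Integrates the t density from 0 to |t| with Simpson's rule
    (steps must be even) and then uses the symmetry of the distribution.
    """
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    a = abs(t)
    h = a / steps
    total = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3                # P(0 < T < |t|)
    upper = max(0.0, 0.5 - area)        # P(T > |t|)
    return upper if t >= 0 else 1.0 - upper


def tost_one_sample(data, target, bound):
    """Return the p-values of the two one-sided t-tests.

    p_lower tests H0: mu <= target - bound vs. H1: mu > target - bound.
    p_upper tests H0: mu >= target + bound vs. H1: mu < target + bound.
    """
    n = len(data)
    se = stdev(data) / math.sqrt(n)     # standard error of the mean
    df = n - 1
    t_lower = (mean(data) - (target - bound)) / se
    t_upper = (mean(data) - (target + bound)) / se
    p_lower = t_sf(t_lower, df)         # upper-tail probability
    p_upper = 1.0 - t_sf(t_upper, df)   # lower-tail probability
    return p_lower, p_upper


# Made-up resistance readings in ohm, for illustration only.
readings = [49.9, 50.1, 50.2, 49.8, 50.0, 50.3, 49.7, 50.1, 49.9, 50.0]

alpha = 0.05
p_lower, p_upper = tost_one_sample(readings, target=50.0, bound=0.25)
equivalent = max(p_lower, p_upper) < alpha  # both tests must reject
print(f"p_lower={p_lower:.4g}, p_upper={p_upper:.4g}, equivalent={equivalent}")
```

Equivalence is concluded only when the larger of the two p-values falls below the chosen significance level, which is the same as requiring both one-sided tests to reject.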

The equivalence test output for this scenario is shown below. Notice that, at the 5% level of significance, neither p-value from the two one-sided t-tests is statistically significant, and therefore we have NOT shown that our mean is 50 ± 0.25 ohm. But why not? The test-retest error for our measurement device is 0.2 ohm, which is close to the equivalence bound of 0.25 ohm. As a general rule, the equivalence bound should be larger than the test-retest error.


Let's look at one more example using this data, this time to show that our mean is equivalent to 50 ohm within ± 0.6 ohm. We have chosen our equivalence bound to be 3 times the measurement error of 0.2 ohm. The JMP output below now shows that, at the 5% level of significance, both p-values from the two one-sided t-tests are statistically significant. Therefore, we have shown that the average resistance lies within the stated bounds of 49.4 and 50.6 ohm and is, for practical purposes, equivalent to 50 ohm.


To learn more about comparing average performance to a standard, and about one-sample equivalence tests, see Chapter 4 of our book, Analyzing and Interpreting Continuous Data Using JMP: A Step-by-Step Guide.

