Monday, November 30, 2009

Why Are My Control Limits So Narrow?

Statistical Process Control (SPC) charts are widely used in engineering applications to help us determine if our processes are predictable (in control). Below are Xbar and Range charts showing 25 subgroup averages and ranges for 5 Tensile Strength values (ksi) taken from each of 25 heats of steel. The Range chart tells us if our within subgroup variation is consistent from subgroup-to-subgroup and the Xbar chart tells us if our subgroup averages are similar. The Xbar chart has 19 out of 25 points outside of the limits. This process looks totally out-of-control, or does it?

Data taken from Wheeler and Chambers (1992), Understanding Statistical Process Control, 2nd edition, table 9.5, page 222.

The limits for Xbar are calculated using the within subgroup ranges, Rbar/d2. In other words, the within subgroup variation, which is a local measure of variation, is used as a yardstick to determine if the subgroup averages are predictable. In the context of our data, the within subgroup variation represents the variation among 5 samples of steel within one heat (batch) of the steel and the between subgroup variation represents the heat-to-heat variation. While the details are limited, we can imagine that every time we have to heat a batch of steel, we may be changing raw material lots, tweaking the oven conditions, or running them on a different shift, which can lead to more than one basic source of variation in the process.
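To make the Rbar/d2 idea concrete, here is a minimal sketch of the standard Xbar and Range chart limit calculations for subgroups of size 5. The data are invented for illustration (they are not the Wheeler and Chambers tensile strengths), and the constants d2, D3, and D4 are the usual tabulated SPC values for n = 5.

```python
# Standard Xbar/R limits from within-subgroup ranges (illustrative data).
from statistics import mean

subgroups = [
    [61, 64, 60, 63, 62],
    [58, 60, 59, 61, 60],
    [65, 66, 64, 67, 65],
    [59, 61, 60, 62, 60],
]

xbars = [mean(s) for s in subgroups]
ranges = [max(s) - min(s) for s in subgroups]
xbarbar, rbar = mean(xbars), mean(ranges)

n = 5
d2, D3, D4 = 2.326, 0.0, 2.114       # control chart constants for n = 5
A2 = 3 / (d2 * n ** 0.5)             # 3*sigma_xbar estimate: A2*Rbar, A2 ~ 0.577

xbar_limits = (xbarbar - A2 * rbar, xbarbar + A2 * rbar)
range_limits = (D3 * rbar, D4 * rbar)
print(xbar_limits, range_limits)
```

Because A2*Rbar reflects only the within-subgroup (local) variation, these Xbar limits will be too narrow whenever a real batch-to-batch component is present, which is exactly the situation described above.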

Having multiple sources of variation is quite common for processes that are batch driven, and the batch-to-batch variation is often the larger source. For the Tensile Strength data, the heat-to-heat variation accounts for 89% of the total variation in the data. When we form rational subgroups based upon a batch, the control limits for the Xbar chart will only reflect the within-batch variation, which may result in unusually tight control limits with many points outside of them.
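The split between within-batch and batch-to-batch variation can be estimated with a one-way random-effects layout. The sketch below uses simple method-of-moments (ANOVA) estimates on invented data, not the actual tensile strength values, so the resulting percentage is purely illustrative.

```python
# Method-of-moments variance components for a batch process (invented data).
from statistics import mean

heats = [
    [61, 64, 60, 63, 62],
    [55, 57, 56, 58, 56],
    [66, 68, 65, 67, 66],
]
n = len(heats[0])                     # samples per heat
k = len(heats)                        # number of heats
grand = mean(x for h in heats for x in h)

ss_within = sum((x - mean(h)) ** 2 for h in heats for x in h)
ss_between = n * sum((mean(h) - grand) ** 2 for h in heats)

ms_within = ss_within / (k * (n - 1))
ms_between = ss_between / (k - 1)

var_within = ms_within
var_between = max((ms_between - ms_within) / n, 0.0)  # truncate at zero
share = var_between / (var_between + var_within)
print(f"heat-to-heat share of total variance: {share:.0%}")
```

When this share is large, as with the 89% reported for the tensile strength data, limits based only on the within-heat ranges are bound to be too narrow.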

In order to make the Xbar chart more useful for this type of data we need to adjust the control limits to incorporate the batch-to-batch variation. While there are several ways to appropriately adjust the limits on the Xbar chart, the easiest way is to treat the subgroup averages as individual measurements and use an Individuals and Moving Range chart to calculate the control limits.
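The "treat the subgroup averages as individual measurements" adjustment can be sketched as follows. The heat averages below are hypothetical; the 2.66 factor is the usual Individuals-chart constant (3/d2 with d2 = 1.128 for moving ranges of size 2).

```python
# Individuals-chart limits computed from the moving ranges of the
# subgroup averages (hypothetical heat averages).
from statistics import mean

xbars = [62.0, 59.6, 65.4, 60.4, 63.0, 58.8, 64.2]

moving_ranges = [abs(a - b) for a, b in zip(xbars[1:], xbars)]
mrbar = mean(moving_ranges)
center = mean(xbars)

lcl = center - 2.66 * mrbar
ucl = center + 2.66 * mrbar
print(f"adjusted Xbar limits: ({lcl:.2f}, {ucl:.2f})")
```

Because successive moving ranges capture the heat-to-heat differences, these limits incorporate the between-batch component that the Rbar/d2 limits miss.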

The plot below shows the Tensile Strength data for the 25 heats of steel and was created using a JMP script for a 3-Way control chart. The first chart is the Xbar chart with the adjusted limits using the moving ranges for the subgroup averages, and the chart below it is the moving range chart for the subgroup averages. The third chart (not shown here) is the Range chart already presented earlier. Note that the limits on the Range chart do not require any adjustments. Now what do we conclude about the predictability of this process?

Indeed, the picture now looks quite different. No points on the Xbar chart fall outside of the limits and there are no run-rule violations. The Range chart, however, shows 3 points above the upper control limit, suggesting that these three heats of steel had higher within-subgroup variation. As Wheeler and Chambers point out, "this approach should not be used indiscriminately, and should only be used when the physical situation warrants its use".

Friday, November 20, 2009

Lack of Statistical Reasoning

In the Sunday Book Review's Up Front: Steven Pinker section of the New York Times, it was interesting to read Malcolm Gladwell's comment about "getting a master's degree in statistics" in order "to break into journalism today". This has been a great year for statistics, considering Google's chief economist Hal Varian's comment earlier this year, "I keep saying that the sexy job in the next 10 years will be statisticians", and the Wall Street Journal's The Best and Worst Jobs survey, which ranks Mathematician number 1 and Statistician number 3.

What really caught my attention in Sunday's Up Front was the remark by Prof. Steven Pinker, who wrote the review of Gladwell's new book "What the Dog Saw", when asked "what is the most important scientific concept that lay people fail to understand". He said: “Statistical reasoning. A difficulty in grasping probability underlies fallacies from medical quackery and stock-market scams to misinterpreting sex differences and the theory of evolution.”

I agree with him, but I believe it is not only lay people who lack statistical reasoning; as scientists and engineers we sometimes forget about Statistical Thinking ourselves. Statistical Thinking is a philosophy of learning and action that recognizes that:

  • All work occurs in a system of interconnected processes,
  • Variation exists in all processes, and
  • Understanding and reducing variation is key for success

Globalization and a focus on environmental issues are helping us to "think globally", or look at systems rather than individual processes. When it comes to realizing that variation exists in everything we do, however, we lose sight of it, as if we were in a "physics lab where there is no friction". We may believe that if we do things in "exactly" the same way, we'll get the same result. Process engineers know firsthand that doing things "exactly" the same way is a challenge because of variation in raw materials, equipment, methods, operators, environmental conditions, etc. They understand the need for operating "on target with minimum variation". Understanding and minimizing variation bring about consistency, provide more "elbow room" to move within specifications, and make it possible to achieve six sigma levels of quality.

This understanding of variation is key in other disciplines as well. I am waiting for the day when financial reports do not just compare a given metric with the previous year, but utilize process behavior (control) charts to show the distribution of the metric over time, giving us a picture of its trends, of its variation, helping us not to confuse the signals with the noise.

Monday, November 16, 2009

Happy Birthday JMP!

We know we're late, JMP's birthday was October 5, but we have been busy with PR activities for our book, which includes creating and maintaining this blog. That said, JMP is 20 years old and, in those 20 years, JMP has become one of our favorite software packages that we use daily.

John Sall, co-founder and Executive Vice President of SAS, who leads the JMP business division, recently wrote about JMP's 20th birthday in his blog, bLog-Normal Distribution. John describes the events that led up to the first release of JMP on October 5, 1989, and the niche that it filled for engineers and scientists as a desktop point-and-click software tool that takes full advantage of the graphical user interface.

As we reflect upon using JMP, both in our own work as statisticians and in collaborating with engineers and scientists, our experiences mirror, almost exactly, what is described in JMP is 20 Years Old. John wrote, "We learned that engineers and scientists were our most important customer segment. These people were smart, motivated and in a hurry - too impatient to spend time learning languages, and eager to just point and click on their data." Things have not changed much. Engineers and scientists are busier than ever, and want to be able to get quick answers to the challenges they face. They really value JMP's powerful and easy-to-use features.

"What was missing was the exploratory role, like a detective, whose job is to discover things we didn't already know", writes John. JMP has made detectives of all of us by giving us the ability to easily conduct Exploratory Data Analysis (EDA) using features such as linked graphs and data tables, excluding/including observations from plots and analysis on the fly, and drag-and-drop tools, such as the Graph Builder and the Table Builder (Tabulate).

Here are some of our old and new JMP favorites that we find ourselves using over and over again.

- Graph Builder: new drag and drop canvas for creating a variety of graphs allowing us to display many data dimensions in one plot.
- Profiler Simulator: awesome tool that gives us the ability to use simulation techniques to define and evaluate process operating windows.
- Variability/Gauge Chart: one of our all time favorites to study and quantify sources of variability and look for systematic patterns or profiles in the data.
- Distribution: a real work horse. Great to examine and fit different distributions to our data, calculate statistical intervals (confidence, prediction, tolerance), conduct simple tests of significance on the mean and standard deviation of a population, and perform capability analysis.
- Control Chart > Presummarize: this function makes it even easier to fit more appropriate control limits to Xbar charts for data from a batch process, which contains multiple sources of variation.
- Bubble Plot: a dynamic visualization tool that shows a scatter plot in motion and is sure to wow your friends.
- Reliability Platform: new and improved reliability tools that make it easy to fit and compare different distributions, as well as predict product performance.

Happy Birthday JMP. We look forward to 20 more years of discoveries and insights in engineering and science!

Brenda and José

Tuesday, November 10, 2009

Normal Calculus Scores

A few weeks ago I was reading the post Double Calculus on the Learning Curves blog, and the histogram of the grade distribution of the calculus scores really caught my attention. For starters, the histogram was generated using JMP, and I'm always glad to see other users of JMP, but most of all, the distribution looked quite normal. Quoting from the blog: "Can you believe this grade distribution? Way more normal than anything that comes out of my class. Skewness of 0.03."

Images of grading by the "curve", as well as "normal scores", came to mind, and this made me think of my favorite tool for assessing normality: the normal probability plot. The normal probability plot is a plot of the ordered data against the expected normal scores (Z scores) such that, if the normal distribution is a good approximation for the data, the points follow an approximate straight line.
A normal probability plot is easily generated in JMP using the distribution platform by clicking the contextual menu to the right of the histogram title.

In a normal probability plot the points do not have to fall exactly on a straight line, just hover around it so that a "fat pen" will cover them (the "fat pen" test). JMP also provides confidence bands around the line to facilitate interpretation.
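The mechanics behind the plot are easy to sketch: sort the data and pair each ordered value with its expected normal quantile. The scores below are invented, and the (i - 3/8)/(n + 1/4) plotting positions are one common convention (Blom's); JMP's exact choice may differ.

```python
# Expected normal scores for a normal probability plot (invented data).
from statistics import NormalDist, mean

scores = [55, 62, 68, 71, 74, 76, 79, 81, 85, 92]

n = len(scores)
ordered = sorted(scores)
expected_z = [NormalDist().inv_cdf((i - 0.375) / (n + 0.25))
              for i in range(1, n + 1)]

# If the data are roughly normal, (expected_z, ordered) hugs a straight
# line; the correlation of the two columns is one simple numeric check.
mx, my = mean(expected_z), mean(ordered)
sxy = sum((z - y0) * (y - my) for (z, y), y0 in zip(zip(expected_z, ordered), [mx] * n))
sxy = sum((z - mx) * (y - my) for z, y in zip(expected_z, ordered))
sxx = sum((z - mx) ** 2 for z in expected_z)
syy = sum((y - my) ** 2 for y in ordered)
r = sxy / (sxx * syy) ** 0.5
print(f"probability-plot correlation: {r:.3f}")
```

A correlation close to 1 plays the same role as the "fat pen" test: the closer the points are to the line, the better the normal approximation.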

We can clearly see that the calculus scores follow closely the straight line, indicating that the data can be well approximated by a normal distribution. These calculus scores are in fact normal scores!

Tuesday, November 3, 2009

Practical Significance Always Wins Out

Engineers and scientists are the most pragmatic people that I know when it comes to analyzing and extracting key information with the statistical tools they have at hand. It is this level of pragmatism that often leads me to recommend equivalence tests for comparing one population mean to a standard value k, in place of the more common test of significance. Think about how a Student's t-test plays out in an analysis to test the hypothesis, Null: μ = 50 ohm vs. Alternative: μ ≠ 50 ohm. If we reject the null hypothesis in favor of the alternative, then we say that we have a statistically significant result. Once this is established, the next question is: how far off is the mean from the target value of 50? In some cases, this difference is small, say 0.05 ohm, and is of no practical consequence.

The other possible outcome for this test of significance is that we do not reject the null hypothesis and, although we can never prove that μ = 50 ohm, we sometimes behave as if we had and assume that the mean is no different from our standard value of 50. The natural question that arises is usually, "can I say that the average resistance = 50 ohm?", to which I reply "not really".

My secret weapon for combining statistical and practical significance in one fell swoop is to use an Equivalence Test. Equivalence tests allow us to prove that our mean is equivalent to a standard value within a stated bound. For instance, we can prove that the average DC resistance of a cable is 50 ohm within ± 0.25 ohm. This is accomplished by using two one-sided t-tests (TOST) on either side of the boundary conditions, and we must simultaneously reject both sets of hypotheses to conclude equivalence. These two sets of hypotheses are:

a) H0: μ ≤ 49.75 vs. H1: μ > 49.75 and
b) H0: μ ≥ 50.25 vs. H1: μ < 50.25.

The equivalence test output for this scenario is shown below. Notice that, at the 5% level of significance, both p-values for the 2 one-sided t-tests are not statistically significant and therefore, we have NOT shown that our mean is 50 ± 0.25 ohm. But why not? The test-retest error for our measurement device is 0.2 ohm, which is close to the equivalence bound of 0.25 ohm. As a general rule, the equivalence bound should be larger than the test-retest error.

Let's look at one more example using this data to show that our mean is equivalent to 50 ohm within ± 0.6 ohm. We have chosen our equivalence bound to be 3 times the measurement error of 0.2 ohm. The JMP output below now shows that, at the 5% level of significance, both p-values from the 2 one-sided t-tests are statistically significant. Therefore, we have shown that the average resistance falls within the stated bounds of 49.4 and 50.6 ohm, i.e., it is equivalent to 50 ohm performance.
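The TOST mechanics can be sketched in a few lines. The resistance data, sample size, and the critical value t(0.95, 29) ≈ 1.699 below are all assumptions for illustration (they are not the data behind the JMP output); JMP or a statistics package would report exact p-values instead of comparing to a fixed critical value.

```python
# Two one-sided t-tests (TOST) for equivalence to 50 ohm within +/- 0.6 ohm.
# Data and the critical value are illustrative assumptions.
from statistics import mean, stdev

resistance = [50.1, 49.9, 50.3, 49.8, 50.0, 50.2, 49.7, 50.1, 49.9, 50.0,
              50.2, 49.8, 50.1, 50.0, 49.9, 50.3, 49.8, 50.0, 50.1, 49.9,
              50.0, 50.2, 49.9, 50.1, 49.8, 50.0, 50.1, 49.9, 50.2, 50.0]

target, delta = 50.0, 0.6
n = len(resistance)
xbar, s = mean(resistance), stdev(resistance)
se = s / n ** 0.5

t_lower = (xbar - (target - delta)) / se   # H0: mu <= 49.4 vs H1: mu > 49.4
t_upper = ((target + delta) - xbar) / se   # H0: mu >= 50.6 vs H1: mu < 50.6

t_crit = 1.699   # assumed t(0.95, df = 29) for this sketch
equivalent = t_lower > t_crit and t_upper > t_crit
print(f"t_lower={t_lower:.2f}, t_upper={t_upper:.2f}, equivalent={equivalent}")
```

Both one-sided tests must reject at the same time, which is why a bound that is too close to the measurement error, as in the ± 0.25 ohm example above, makes equivalence hard to demonstrate.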

To learn more about comparing average performance to a standard, and one-sample equivalence tests, see Chapter 4 of our book, Analyzing and Interpreting Continuous Data Using JMP: A Step-by-Step Guide.