Answer 1 (Question)
Dear Quartile Methodless,
The position of the first quartile Q1 is usually defined as (n+1)/4, where n is the number of observations, and interpolation is used between the neighboring ordered values. In your case (n+1)/4 = (10+1)/4 = 2.75. Thus, if x1, x2, …, xn are the n ordered observations, we interpolate using x2 + 0.75(x3-x2) = 2 + 0.75(3-2) = 2.75.
The position of the third quartile Q3 is usually defined as 3(n+1)/4, again with interpolation. In your case 3(n+1)/4 = 3(10+1)/4 = 8.25. Thus, with the same ordered observations, we interpolate using x8 + 0.25(x9-x8) = 8 + 0.25(9-8) = 8.25.
Note the IQR would be 8.25-2.75 = 5.5. Since defect counts may be right skewed, you may find that a box & whisker plot gives many possible outlier signals on the high side. You may also think about modeling the defect distribution and setting some outlier signals based on percentiles of this fitted distribution.
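For readers who like to check the arithmetic, here is a minimal Python sketch of the (n+1) position rule described above. The function name is hypothetical, and the data values 1 through 10 are an assumption inferred from the worked numbers (x2 = 2, x3 = 3, x8 = 8, x9 = 9), not your actual defect counts.

```python
def quantile_n_plus_1(values, fraction):
    """Quantile by the (n+1) position rule with linear interpolation."""
    ordered = sorted(values)
    n = len(ordered)
    position = fraction * (n + 1)      # 1-based position, e.g. 2.75
    lower = int(position)              # index of the observation just below
    weight = position - lower          # fractional part used to interpolate
    if lower < 1:                      # position falls before the first value
        return ordered[0]
    if lower >= n:                     # position falls at or past the last value
        return ordered[-1]
    below = ordered[lower - 1]         # convert 1-based position to 0-based index
    above = ordered[lower]
    return below + weight * (above - below)


# Assumed data: ten ordered observations 1, 2, ..., 10
data = list(range(1, 11))
q1 = quantile_n_plus_1(data, 0.25)
q3 = quantile_n_plus_1(data, 0.75)
iqr = q3 - q1
print(q1, q3, iqr)
```

Other software may use a different quartile rule by default, so small disagreements between packages are not unusual.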
We hope this will complete your quartile file.
Signed: Dear Abby-Normal
Answer 2 (Question)
Dear Lost in the p’s and q’s,
The overall test for the model is meant to control for the experiment-wise error rate. If this is significant, then you can look for active factors. However, if there is a high p-value for the model, then you should generally stop and rethink the problem. If the p-value is marginally significant and your more critical error is to miss a signal, then you may still want to look at the p-values for the individual model terms to get a clue as to what factors may be worthwhile to continue to explore in the next DOE.
Often people are piqued (which has both a p and q) by p-values. Hopefully you will now be able to find your way through the p’s and q’s.
Signed: Dear Abby-Normal
Answer 3 (Question)
Dear DisappOintEd,
The usual advice when factors are not found to be active is that you have to look for other factors. However, there can be more likely explanations. Perhaps a blocking factor needed to be added to the experiment, and without it the DrOnE of the block noise has DrOwnEd out the signal of the factors. Perhaps you have the right factors, but you set the factor levels too close together or even too far apart, so that you missed what was happening. If you experimented within the specifications, then it may not be surprising that you saw nothing active. If the specs are set correctly, they should limit the transmission of variation from the inputs to the outputs and you should see nothing active.
You are no DOpE to use your engineering knowledge as a guide. As a first look, I would try to assess whether you need to adjust the factor levels and then re-run the experiment. You need to set the levels to achieve a linear change in the response (perhaps a 2-sigma change in order to be adequately detected by the DOE).
Although you have DOubtEd DOE really works, it will work if DOnE correctly. The next experiment should DOnatE a glaDsOmE DOsE of DOminatE factors to reach the DOmE of success.
Signed: Dear Abby-Normal
Answer 4 (Question)
Dear Repeatedly Confused,
You are very astute to realize that there could be a problem here. When the numbers don’t match our expectations, we should be concerned about both the validity of our expectations and the validity of the analysis with the numbers. The %Study Var is calculated as (Total Gage R&R StdDev)/(Total Variation), expressed as a percentage, where the total variation is calculated from the 10 parts in the study. With so few parts in the study, generally one of two outcomes will result: either the parts are “randomly” selected but cannot fully represent lot-to-lot, week-to-week, batch-to-batch, machine-to-machine or other sources of variation, resulting in an UNDER-ESTIMATION of the total variation, or the parts are “purposely” selected to cover a wide operating range, probably wider than the standard production range, resulting in an OVER-ESTIMATION of the total variation. (Of course, it is possible that the result is close to the correct value of the total variation, but this would be a fortunate coincidence since it is very difficult to fairly estimate all sources of process and product variation from only 10 parts.) In the case of under-estimation of the total variation, the %Study Var would appear higher than it really should be, resulting in needless work on the measurement process. In the case of over-estimation of the total variation, the %Study Var would appear lower than it really should be, resulting in a missed opportunity to improve a measurement system that could be inadequate for use.
Since you suspect this result is too good to be true, you should check how the samples were selected for this study. You may well find that the samples were hand-picked to have unusually low to unusually high ash content. Choosing samples over such a wide range is not bad if an independent estimate of the total variation is used in the analysis.
Your remedy will be to go back to production data over a sufficiently long period of time and calculate a simple standard deviation of the percent ash contents. Input this as the historical standard deviation and you will be able to see the correct estimate of %R&R. This will be listed in a column called “%Process” corresponding to the row “Total Gage R&R”.
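To see how much the choice of denominator matters, here is a minimal Python sketch. Every number in it is hypothetical and chosen only for illustration; none comes from your ash study. It compares the gage R&R standard deviation against the total variation from hand-picked parts and against a historical production standard deviation.

```python
def percent_of_variation(grr_stdev, reference_stdev):
    """Gage R&R standard deviation as a percentage of a reference standard deviation."""
    if reference_stdev <= 0:
        raise ValueError("reference standard deviation must be positive")
    return 100.0 * grr_stdev / reference_stdev


# Hypothetical values, for illustration only
grr_stdev = 0.05          # Total Gage R&R standard deviation
study_total_stdev = 0.50  # total variation from 10 hand-picked, wide-ranging parts
historical_stdev = 0.20   # standard deviation from long-run production data

pct_study_var = percent_of_variation(grr_stdev, study_total_stdev)
pct_process = percent_of_variation(grr_stdev, historical_stdev)

print(f"%Study Var: {pct_study_var:.1f}")
print(f"%Process:   {pct_process:.1f}")
```

With these made-up numbers the same gage looks far better against the hand-picked parts than against the historical variation, which is the too-good-to-be-true pattern described above.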
You should reproduce this independent estimate with each gage study you perform so you will repeatedly get accurate information concerning your measurement system and precisely determine the correct action.
Signed: Dear Abby-Normal
Answer 5 (Question)
Dear Perplexed about p Values,
You share a common complaint. The notion of a p-value is often taught in a very superficial manner. There are three interpretations of a p-value that may be helpful to you. First, a p-value may be interpreted as the chance that one makes a false signal conclusion from the data (e.g. we conclude a factor is active when, in fact, it is not active). Second, a p-value may be interpreted as how rare your data are, assuming H0 is true (e.g. the p-value shows how rare our data are under the assumption that the factor is not active). Third, and I like this one as a way to explain p-values to those without statistical training, a p-value tells us how consistent our results are with the assumption of no effect, or no difference (i.e. how consistent our results are with H0). The higher the p-value, the more consistent our results are with the assumption of no effect. The lower the p-value, the more inconsistent our results are with the assumption of no effect.
I am troubled by a common misconception about p-values of which you may also want to be aware. It is often taught that we accept H0 if p>0.05 and reject H0 if p≤0.05. Life is seldom black and white. I would use a guideline to generally reject H0 if p≤0.05 and to generally fail to reject H0 if p≥0.10. However, if 0.05<p<0.10, I would consider the risks of either a false signal or a missed signal and make my conclusion to minimize the more serious risk. For example, if you are testing a measurement gauge for bias to a reference value and obtain a p-value of 0.08, you might consider a false signal error (needless calibration) and a missed signal error (using an out-of-calibration gauge in production). To protect yourself against the more serious error of using an out-of-calibration gauge in production, you might well want to tolerate a higher p-value. Your conclusion in this case might be to calibrate the gauge, as this is the less serious mistake.
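The three-zone guideline can be written down as a small decision rule. This is only a sketch of my rule of thumb, not a standard procedure: the function name and the labels "missed signal" and "false signal" are hypothetical, and the cutoffs 0.05 and 0.10 are the guideline values mentioned above.

```python
def p_value_decision(p, more_serious_error, low=0.05, high=0.10):
    """Three-zone guideline: reject, fail to reject, or weigh the risks in between."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p <= low:
        return "reject H0"
    if p >= high:
        return "fail to reject H0"
    # Gray zone: guard against whichever mistake would hurt more.
    if more_serious_error == "missed signal":
        return "reject H0"            # acting on a false signal is the cheaper error
    if more_serious_error == "false signal":
        return "fail to reject H0"    # missing a signal is the cheaper error
    raise ValueError("more_serious_error must be 'missed signal' or 'false signal'")


# Gauge bias example: p = 0.08, and using an out-of-calibration gauge
# (a missed signal) is judged the more serious mistake.
gauge_decision = p_value_decision(0.08, "missed signal")
print(gauge_decision)
```

Here rejecting H0 corresponds to concluding the gauge is biased and calibrating it.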
To see how p-values can tell us other information as well, consider this: if your p-value is 0.18, we would generally consider the result to be non-significant. However, it may be giving us a clue since it is close to being significant. If you did not take very much data and your results appear practically important, but the p-value is not significant though somewhat close to being significant, then you may want to consider taking some more data and checking again for statistical significance. How many times has an important factor been discarded or overlooked because the p-value fell just slightly above 0.05?
Finally, as a means to remember p-values, a rhyme may be helpful: “If p is low, the null must go; if p is high, the null doesn’t lie”. So remember, to nullify a bad experience with p-values, use the alternative explanation that the p-value tells us how consistent our results are with the assumption of no effect or no difference. Try this explanation, but always phrase your answers in terms of the problem, since your questioner may not be so familiar with nulls and alternatives. By increasing your p-value facility, you will also have greater power to convince others. A p-value should not stand for perplexing-value.
Signed: Dear Abby-Normal
Answer 6 (Question)
Dear In the Dark,
Sorry for the late reply as Abby-Normal has been abnormally not feeling quite normal the last week. Now to shed some light on your problem. First, let me make sure I understand your question by repeating it. You want to determine an upper spec so that you can achieve a Cpk of at least 1.67? I’m assuming that you do not have any lower spec since you did not mention one, and I’m assuming that you want the lowest such upper spec since otherwise you could just set it at plus infinity and have no need to write Abby-Normal. In fact, if you are so free to set the upper spec, this brings into question why the customer wants this spec in the first place. Ideally there would be a reliability reason, or a part performance reason, for determining the specs. To set the spec just to achieve a certain Cpk begs the question of why have an upper spec at all. Arbitrary specs have less relevance and in turn make the Cpk have less relevance.
But if you still must set a spec, because sometimes we are forced to do things in industry that otherwise do not quite make sense, then you might proceed along these lines. Out of the 245 data values you supplied, you have three high outliers (samples 4, 30 and 89). Leave them in if this is the usual practice when calculating a Cpk (this will only make your eventual upper spec higher). Then your mean is 1532.857143 and your standard deviation is 36.47770485. If you set Cpk = (USL-mean)/(3*stdev) and then solve for USL, you have USL = mean+1.67*3*stdev. Plugging in the values for the mean and stdev, we would have USL=1715.6. However, we should remember that the mean and stdev are only sample values and subject to variation in estimation. Accounting for this variation in estimation using 95% confidence intervals, we have the interval (1528.3, 1537.4) for the mean and (33.5, 40) for the stdev. The highest the USL could be would be when both the mean and stdev were highest. So using mean=1537.4 and stdev=40, and plugging back into the formula above for USL, would give USL=1737.8. If you repeated this with the 3 outliers removed, assuming they were not valid data, then the USL would be 1689.2.
There are better ways to set specs, and in fact, SystatS may be the only company offering such a course, because we received questions about setting specs so often over so many years that we put together a public training course called “How to Set and Evaluate Process Specifications”. You may well want to join that course at a later date. The section on statistical guardbanding in that course may also be of interest to you, as well as using statistical tolerance intervals to set a spec for standardization as you appear to be tasked to do here.
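The arithmetic can be checked with a few lines of Python. This is a minimal sketch assuming an upper spec only, so that Cpk = (USL - mean)/(3*stdev); the function name is hypothetical, and the inputs are the sample estimates and upper confidence limits quoted in this answer.

```python
def usl_for_target_cpk(mean, stdev, target_cpk):
    """Smallest upper spec limit giving the target Cpk (upper spec only)."""
    # Cpk = (USL - mean) / (3 * stdev)  =>  USL = mean + 3 * Cpk * stdev
    return mean + 3.0 * target_cpk * stdev


target_cpk = 1.67

# Point estimates from the 245 values, outliers left in
usl_point = usl_for_target_cpk(1532.857143, 36.47770485, target_cpk)

# Conservative case: upper 95% confidence limits for both mean and stdev
usl_conservative = usl_for_target_cpk(1537.4, 40.0, target_cpk)

print(f"USL from point estimates:   {usl_point:.1f}")
print(f"USL from upper conf limits: {usl_conservative:.1f}")
```

The conservative value simply plugs in the upper end of each interval, so it is a deliberately cautious bound and not a formal confidence limit on the USL itself.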
Hope this is of some use to you and we hope that the shine of your customer’s smile in getting the required Cpk will keep you out of the dark. Just don’t tell them that the resulting Cpk has very little meaning because the spec was set just to get the required Cpk. Now we’re all in the dark again.
Thanks for the question, and lights out,
Abby-Normal