## Z1.4 Sample Size

Question

I am trying to determine the sampling size using my ANSI/ASQ Z1.4 table and I wanted to get some clarification. If I am using Table II A and my Sample Size Code letter is D, what would be my sample size? If it falls on an arrow does it mean that I have to change to the next sample size based on where the arrow points?

From Charlie Cianfrani:

If you are using Z1.4, your sample size is selected based on your lot size.  You would pick the AQL you need based on the risk you are willing to take for the process average of percent defective.  It is important to understand what you are doing when using sampling plans, what they are and the protection you are trying to ensure. Thus, the important step is to determine the AQL. Then you select the sample size to provide the level of protection you are striving to ensure. It is more important to understand the theory behind the tables than to mechanically use the tables.

From Fred Schenkelberg:

Use the sample size where the arrow points. In the 2008 and 2013 versions it explains this in section 9.4, “When no sampling plan is available for a given combination of AQL and code letter, the tables direct the user to a different letter. The sample size to be used is given by the new code letter, not by the original letter.”

From Steven Walfish:

The standard sample size for Code Letter D from Table II-A is 8. Depending on your AQL, however, a sample size of 8 may be inappropriate, so the standard uses arrows to point to alternative sample sizes that reach the target AQL. Both your sample size and your accept/reject values change as a result. For example, at an AQL of 0.25, you would move down to a sample size of 50, with an accept/reject of 0/1. If the lot size is less than 50, you would need to do 100% inspection. In other words, there is no sampling plan that can give an AQL of 0.25 without a minimum sample size of 50.

From James Werner:

Yes. When using Z1.4, two items need to be known: the lot size and the AQL (Acceptance Quality Limit). Use Table I (Sample size code letters) to determine the sample size code letter from the lot or batch size. In the question above, that was determined to be “D”. The next step is to use Table II-A to find the sample size for code letter D and the chosen AQL. On Table II-A, go across the row for letter D until it intersects the column for the given AQL. If there is an arrow at that intersection, follow the arrow to the first sampling plan, then go back along that row to the sample size code letter column to find the actual sample size (if an up/down arrow is in there, then you choose).

Example 1. The code letter is D (as in the question above). Let’s say the AQL is 0.25. Starting at code letter D, move across that row until you reach the AQL 0.25 column. There is a down arrow at this row/column intersection. Follow the arrow downward until the “Ac Re” entry reads “0 1”. Staying on this row, go back to the sample size code letter column to find code letter H and sample size = 50. This means that for a lot size with code letter D and an AQL of 0.25, the sample size is 50: accept the entire lot if no nonconformances are found in the sample, and reject the entire lot if one or more are found.

Example 2. Let’s say the sample size code letter was determined from Table I to be “F”. Looking at Table II-A, if the AQL = 0.65, then the sample size would be 20 and the lot would be accepted only with zero nonconformances. But if the AQL = 0.15, then the sample size would be 80.

For more information on this topic, please visit ASQ’s website

## Sampling and Daylight Savings Time

Question

We are wondering if there is any generally accepted procedure to account for seemingly duplicate sample times when operating over the time period when clocks ‘fall back’ one hour for the end of daylight savings time? Our standard practice is to analyze a chemistry sample every half hour, so we foresee two each of the 01:00, 01:30 & 02:00 AM samples this next Sunday morning. Please advise on any generally accepted practice to account for such seemingly duplicate samples.

I do not know an industry accepted standard for this yet.

If used for control charting, just plot as normal, noting when each sample was taken. For the second 1:30 AM sample, for example, just note that it was taken after the time change, and continue monitoring the process as normal.

If used for lot sampling, analyze the results as normal.

If doing a daily average, adjust the calculation for the two extra samples, i.e., divide by the number of samples actually taken that day (50 instead of 48 for half-hourly sampling).

At most, the time change may require a slight adjustment to calculations that depend on the number of samples. Otherwise it is not a big issue and needs no more than a comment or note about the seemingly duplicate sample times (or the two missing times in spring).
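One way to keep the repeated clock times apart is to record each sample against UTC and attach the zone abbreviation to the local label. The sketch below is only an illustration: the date, the Pacific time zone, and the helper names `local_label` and `half_hourly_samples` are assumptions chosen for the example, not part of any standard practice.

```python
from datetime import datetime, timedelta, timezone

# Assumed example: US Pacific time, with clocks falling back on 2013-11-03.
PDT = timezone(timedelta(hours=-7), "PDT")
PST = timezone(timedelta(hours=-8), "PST")
FALL_BACK_UTC = datetime(2013, 11, 3, 9, 0, tzinfo=timezone.utc)  # 02:00 PDT


def local_label(ts_utc):
    """Wall-clock label with zone abbreviation, so repeated times stay distinct."""
    zone = PDT if ts_utc < FALL_BACK_UTC else PST
    return ts_utc.astimezone(zone).strftime("%H:%M %Z")


def half_hourly_samples(start_utc, end_utc):
    """Sample instants every 30 minutes from start (inclusive) to end (exclusive)."""
    samples = []
    t = start_utc
    while t < end_utc:
        samples.append(t)
        t += timedelta(minutes=30)
    return samples


# Local midnight to local midnight on the fall-back day spans 25 hours.
day_start = datetime(2013, 11, 3, 7, 0, tzinfo=timezone.utc)  # 00:00 PDT
day_end = datetime(2013, 11, 4, 8, 0, tzinfo=timezone.utc)    # 00:00 PST
samples = half_hourly_samples(day_start, day_end)
labels = [local_label(t) for t in samples]

print(len(samples))  # the count to divide by for that day's average
print(labels[2:6])   # the repeated 01:00 and 01:30 readings, now distinguishable
```

With the zone abbreviation in the label, the two 01:00 readings and the two 01:30 readings sort and plot in the order they were taken.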

Cheers,

Fred

Fred Schenkelberg
Reliability Engineering and Management Consultant
FMS Reliability
(408) 710-8248
fms@fmsreliability.com
www.fmsreliability.com
@fmsreliability


## Sample Size and Z1.4

Question

My question is if I’m trying to determine the sample size of migrated data to see if it migrated correctly to the target database, is the Z1.4 table applicable to that?

The scenario is data is being transferred from an old system to a new system and I want to do a quality check on the data in the new database to make sure everything was transferred correctly. I’m hoping to use the Z1.4 table to determine the sample size if its applicable. Is it applicable and if not, do you know of other standards that I should be looking into that is more applicable?

The movement of a database from one system to another certainly may introduce errors and it may also carry over errors that already exist. In some cases the move may also find and repair errors, yet that generally is done by design.

So, let’s say it’s just a move and you are checking for any new errors that are introduced.

Since you have access to the entire population (the database) both before the move (old system) and after it (new system), and I’m assuming you want to check only a sample rather than every entry, I would recommend a hypothesis test approach rather than a lot sampling approach.

A hypothesis test based on the binomial distribution may be appropriate as you are checking field entries to determine if they are correct or not (pass/fail).

You can set a threshold defect rate and test whether the new system is at least that good, or you can measure the old system and compare it with the new one, taking as the null hypothesis that the new system’s error rate equals the old system’s.

You can find a bit more information about a p-test in a good stats book or online at a short tutorial I wrote at https://creprep.wordpress.com/2013/06/01/hypothesis-tests-for-proportion/

The Z1.4 standard would require you to artificially define a lot or consider the entire database as one lot. The standard lot testing approach does not provide the control and statistical power of hypothesis testing, thus my recommendation. With the p-test you can define the confidence, defect rate to detect, and sample size to fit your needs concerning ability to make measurements, cost, and risk.
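As a rough illustration of the binomial hypothesis test described above, the sketch below computes a one-sided exact p-value for the number of errors found in a sample of migrated records. The function names, the 1% threshold, the 5% significance level, and the sample of 500 records are all hypothetical values chosen for the example.

```python
from math import comb


def binomial_upper_tail(errors, n, p0):
    """P(X >= errors) when X ~ Binomial(n, p0): a one-sided exact p-value."""
    return sum(
        comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(errors, n + 1)
    )


def exceeds_threshold(errors, n, p0, alpha=0.05):
    """True if the sample gives evidence that the error rate is above p0."""
    return binomial_upper_tail(errors, n, p0) < alpha


# Hypothetical check: 500 migrated records sampled, 1% threshold error rate.
for found in (5, 12):
    p_value = binomial_upper_tail(found, 500, 0.01)
    print(found, round(p_value, 4), exceeds_threshold(found, 500, 0.01))
```

A small p-value suggests the migrated data has an error rate above the threshold; a large one means the sample gives no evidence of that, which is not the same as proving the migration is error free.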

Cheers,

Fred

Fred Schenkelberg


## Sample Size

Question

If we have a lot size of 27 and we are using a normal inspection level II with an AQL of 2.5. What is the sample size?

Assuming an attribute is being measured, we use ANSI/ASQ Z1.4-2013 to find the sample size.

Given a lot size of 27, we first find in Table I (Sample size code letters) that code letter D is the sampling plan code letter for lot sizes between 26 and 50 for normal sampling (General Inspection Level II).

Then move to Table II-A (Single sampling plans for normal inspection), find the row for code letter D, and under the column for AQL 2.5 find an up arrow. This indicates that we should use code letter C, which gives a sampling plan of 5 samples: accept the lot if there are zero defects and reject the lot if there are one or more.
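The first lookup step can be sketched in code. The values below are transcribed from memory of Table I (General Inspection Level II) and the sample sizes attached to each code letter, so treat them as an assumption and check every row against your copy of the standard. The sketch does not resolve arrows in Table II-A; that still requires the table itself.

```python
# Upper lot-size bound and code letter for General Inspection Level II.
# Assumed transcription of Table I; verify against the standard before use.
LEVEL_II_CODE_LETTERS = [
    (8, "A"), (15, "B"), (25, "C"), (50, "D"), (90, "E"), (150, "F"),
    (280, "G"), (500, "H"), (1200, "J"), (3200, "K"), (10000, "L"),
    (35000, "M"), (150000, "N"), (500000, "P"),
]

# Sample size attached to each code letter before any arrow is followed.
SAMPLE_SIZE = {
    "A": 2, "B": 3, "C": 5, "D": 8, "E": 13, "F": 20, "G": 32, "H": 50,
    "J": 80, "K": 125, "L": 200, "M": 315, "N": 500, "P": 800, "Q": 1250,
}


def code_letter(lot_size):
    """Return the Level II sample size code letter for a lot size."""
    if lot_size < 2:
        raise ValueError("lot size must be at least 2")
    for upper_bound, letter in LEVEL_II_CODE_LETTERS:
        if lot_size <= upper_bound:
            return letter
    return "Q"  # lots larger than 500,000


letter = code_letter(27)
print(letter, SAMPLE_SIZE[letter])  # starting point before following any arrow
print(SAMPLE_SIZE["C"])             # sample size after the up arrow to letter C
```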

Hope that helps.

Cheers,

Fred

Fred Schenkelberg


## Z1.4: 2008 Sampling

Question:

We are having an interpretation issue regarding the ANSI/ASQ Z1.4:2008 standard with some of our component vendors. We have a number of different defects that fall into an AQL of 1.0.

Please note that the same question applies to all AQL levels, as our critical and minor defects can also have multiple defects.

Our interpretation of the standard is that if the sampling plan table (based on sample size and inspection level) shows Accept 7 / Reject 8 then all defects in this major category would be cumulative for the accept / reject criteria. (i.e. 3 that fail outer diameter, 3 that fail height of the bottle finish and 3 that fail weight – total of 9 – would constitute a rejection of the lot). The vendor’s interpretation is that each of the items within the major category should have an accept / reject allowance of 7 / 8 (so potentially, in this case, 56 defects would still be accepted).

Response:

In this case, it depends on the question the lot sampling is trying to answer. If you want to know whether individual units within the lot are acceptable, judged against all of the criteria that apply, then tallying all defects found is correct. This is further supported by the fact that any item with even one of its many specifications out of range would be deemed a failure.

On the other hand, if the lot sampling is meant to detect lots with specific faults, each isolated to a specific specification, then the defect types would be considered separately. If AQL 1.0 is suitable for each specific defect, then judging the 8 criteria separately would no longer provide an overall AQL 1.0 protection; the protection would be much weaker.

Your example of 56 defects being accepted underscores the point that the AQL protection is no longer 1.0.

I’m assuming the specifications and causes of the defects are independent, yet that may not be the case. When they are not independent, I’m not sure how to adjust the sample size to present the same AQL protection. When they are independent, you would need separate draws of samples for each defect of interest, then apply the Accept 7/Reject 8 criteria judging only the one specification.

In practice, if you want to inspect for isolated specifications, you should allocate the acceptable AQL and LTPD points across them and develop your sampling plan from there. Instead of a 1.0% defect rate for the AQL, the rate would need to be lower for each of the eight specifications; try 0.125% so that the tally of failure rates across the specifications of interest adds up to 1.0% (assuming the chance of failing any one specification is equal). This will lead to much larger sample sizes, which may be useful when troubleshooting specific faults.
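The allocation arithmetic can be checked with a short sketch. It assumes eight independent specifications with equal failure chances, as in the example above; the function names are made up for illustration.

```python
def allocate_aql(overall_aql_percent, n_specs):
    """Split an overall AQL (in percent) equally across specifications."""
    return overall_aql_percent / n_specs


def combined_defect_rate(per_spec_percent, n_specs):
    """Percent of units failing at least one of n independent specifications."""
    p = per_spec_percent / 100
    return 100 * (1 - (1 - p) ** n_specs)


per_spec = allocate_aql(1.0, 8)
print(per_spec)                           # per-specification AQL in percent
print(combined_defect_rate(per_spec, 8))  # overall rate, just under 1.0 percent
print(combined_defect_rate(1.0, 8))       # overall rate if each spec gets 1.0 percent
```

The last line shows why judging each specification separately at AQL 1.0 weakens the overall protection: the combined rate is several times the intended 1.0 percent.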

Cheers,
Fred


## Combating Contamination

Q: We want to ensure that we are receiving clean containers to package our products. How can we improve our incoming inspection process?

A: You should encourage your vendor to ship only clean containers. Then, be sure that the shipping and receiving process doesn’t cause contamination. If you can determine the source or sources of the contamination, the best fix is to remove the cause.

If that approach is not possible and you have incoming containers that may have some contamination, then consider the following elements in creating an efficient incoming inspection process.

1) How do you detect the contamination?

Apparently, you are able to detect the container contamination prior to filling them, or are able to detect the effect of the contamination on the final product. Given that you are interested in creating an incoming test, let’s assume you have one or more ways to detect faulty units.

As you may already know, there are many ways to detect contamination. Some are faster than others, and some are non-destructive. Ideally, a quick non-destructive test would permit you to inspect every unit and to divert faulty units to a cleaning process. If the testing has to be destructive, then you’ll have to consider lot sampling of some sort.

There are many testing options. One is the optical inspection technique, which may find gross discoloration or large debris effectively. Avoid using human inspectors unless it’s only a short term solution, as we humans are pretty poor visual inspectors.

Another approach is using light to illuminate the contamination, such as a black light (UVA). Depending on the nature and properties of the contamination, you may be able to find a suitable light to quickly spot units with problems.

Another approach, which is more time consuming, is conducting a chemical swab or solution rinse and a chemical analysis to find evidence of contamination. If the contamination is volatile, you might be able to use air to “rinse” the unit and conduct the analysis. This chemical approach may require specialized equipment. Depending on how fast the testing occurs, this approach may or may not be suitable for 100 percent screening.

There may be other approaches for detecting the faulty units, yet without more information about the nature and variety of contamination, it’s difficult to make a recommendation. Ideally, a very fast, effective and non-destructive inspection method is preferred over a slow, error prone, and destructive approach. Cost is also a consideration, since any testing will increase the production costs. Finding the right balance around these considerations is highly dependent on the nature of the issue, cost of failure, and local resources.

2) How many units do you have to inspect?

Ideally, the sample size is zero as you would first find and eliminate the source of the problem. If that is not possible or practical, then 100 percent inspection using a quick, inexpensive, and effective method permits you to avoid uncertainties with sampling.

If the inspection method requires lot sampling, then all of the basic lot sampling guidelines apply. There are many references available that will assist you in the selection of an appropriate sampling plan based on your desired sampling risk tolerance levels.

Another consideration is the percentage of contaminated units per lot. If there is a consistent low failure rate per lot, then lot sampling may require relatively large amounts of tested units. You’ll have to determine the level of bad units permitted to pass through to production. Short of 100 percent sampling, it’s difficult (and expensive) to find very low percentages of “bad” units in a lot using destructive testing.

3) Work to remove original source(s) of contamination to permit you to stop inspections.

I stress this approach because it’s the most cost effective in nearly all cases. In my opinion, incoming inspection should be stopped as soon as possible since the process to create, ship and receive components should not introduce contamination and require incoming inspection to “sort” the good from the bad.

Fred Schenkelberg
Voting member of U.S. TAG to ISO/TC 56 on Reliability
Voting member of U.S. TAG to ISO/TC 69 on Applications of Statistical Methods
Reliability Engineering and Management Consultant
FMS Reliability
www.fmsreliability.com


## Sampling in a Call Center

Q: I work as a quality assessor (QA) and I am assisting with a number of analyses in a call center. I need a little help with sampling. My questions are as follows:

1. How do I sample calls taken by an agent if there are six assessors and 20 call center agents that each make 100 calls per day?

2. I am assessing claims paid and I want to determine the error rate and the root cause. How many of those claims would have to be assessed by the same number of QAs if claims per day, per agent, exceed 100?

3. If there are 35 interventions made by an agent per day, with two QAs assessing 20 agents in this environment, then the total completed would amount to between 300 and 500 per month. What would the sample size be in this situation?

A: I may be able to provide some ideas to help solve your problem.

The first question is about sampling calls per day by you and your fellow assessors. It is clear that the six assessors are not able to cover all of the calls handled by the 20 call center agents.

What is missing from the question is what are you measuring — customer satisfaction, correct resolution of issues, whether agents are appropriately following call protocols, or something else? Be very clear on what you are measuring.

For the sake of providing a response, let’s say you are able to judge whether the agents are appropriately addressing callers’ issues or not. This is a binary response: a call is considered either good or not (pass/fail). While this may oversimplify your situation, it may be instructive on sampling.

Recalling some basic terms from statistics, remember that a sample is taken from some defined population in order to characterize or understand the population. Here, a sample of calls is assessed and you are interested in what portion of the calls are handled adequately (pass). If you could measure all calls, that would provide the answer. However, a limit on resources requires that we use sampling to estimate the population proportion of adequate calls.

Next, consider how sure you want the results of the sample to reflect the true and unknown population results. For example, if you don’t assess any calls and simply guess at the result, there would be little confidence in that result.

Confidence in sampling represents the likelihood that the true population value lies within a range about the sample’s result. A 90 percent confidence means that if we repeatedly draw samples from the population, then the result from the sample would be within a confidence bound (close to the actual and unknown result) 90 percent of the time. That also means that the estimate will be wrong 10 percent of the time due to errors caused by sampling. This error is simply the finite chance that a sample happens to draw more calls that “pass” or more calls that “fail” than the population proportion warrants, so that the sample does not accurately reflect the true population.

Setting the confidence is a reflection on how much risk one is willing to take related to the sample providing an inaccurate result. A higher confidence requires more samples.

Here is a simple sample size formula that may be useful in some situations:

$$n = \frac{\ln(1 - C)}{\ln(\pi)}$$

n is the sample size

C is the confidence, where 90% would be expressed as 0.9

pi ($\pi$ in the formula) is the proportion considered passing, in this case good calls

ln is the natural logarithm

If we want 90 percent confidence that at least 90 percent of all calls are judged good (pass), then we need at least 22 monitored calls.

This formula is a special case of the binomial sample size calculation and assumes that there are no failed calls among the calls monitored. That is, if we assess 22 calls and none fail, we have at least 90 percent confidence that the population has at least 90 percent good calls. If there is a failed call among the 22 assessments, we have less than 90 percent confidence of at least 90 percent good calls. This doesn’t provide information to estimate the actual proportion, yet it is a way to detect if the proportion falls below a set level.
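A minimal sketch of the zero-failure calculation, assuming the formula n = ln(1 - C) / ln(pi), which is consistent with the 22 calls quoted above. The function name is made up for illustration.

```python
from math import ceil, log


def zero_failure_sample_size(confidence, proportion_good):
    """Smallest n such that n passes with no failures support the claim.

    Assumes n = ln(1 - C) / ln(pi), rounded up to a whole call.
    """
    return ceil(log(1 - confidence) / log(proportion_good))


print(zero_failure_sample_size(0.90, 0.90))  # 22
```

Raising either the confidence or the proportion you want to demonstrate increases the number of calls that must all pass.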

If the intention is to estimate the population proportion of good vs. bad calls, then we use a slightly more complex formula:

$$n = \pi (1 - \pi) \left( \frac{z_{\alpha/2}}{E} \right)^2$$

pi is the same as before, the proportion of good calls

z is the standard normal critical value that leaves an area of alpha/2 in the upper tail. For 90 percent confidence, 90 percent = 100 × (1 − alpha) percent, so alpha is 0.1 and the corresponding z value is 1.645.

E is related to the accuracy of the result. It defines the range around the resulting estimate within which the true population value should lie. A higher value of E reduces the number of samples needed, yet the result may be further away from the true value than desired.

The value of E depends on the standard deviation of the population. If that is not known, just use an estimate from previous measurements or run a short experiment to determine a reasonable estimate. If the proportion of bad calls is the same from day to day and from agent to agent, then the standard deviation may be relatively small. If, on the other hand, there is agent-to-agent and day-to-day variation, the standard deviation may be relatively large and should be carefully estimated.

The z value is directly related to the confidence and affects the sample size as discussed above.

Notice that pi, the proportion of good calls, is in the formula. Thus, if you are taking the sample in order to estimate an unknown pi, assume pi is 0.5 when determining the sample size. This will generate the largest possible sample size and permit an estimate of pi with a confidence of 100 × (1 − alpha) percent and an accuracy of E or better. If you know pi from previous estimates, then use it to help reduce the sample size slightly.

Let’s do an example and say we want 90 percent confidence. The alpha is 0.1 and the z alpha/2 is 1.645. Let’s assume we do not have an estimate for pi, so we will use 0.5 for pi in the equation. Lastly, we want the final estimate based on the sample to be within 0.1 (estimate of pi +/- 0.1), so E is 0.1.

Running the calculation, 0.5 × 0.5 × (1.645 / 0.1)² ≈ 67.7, so we need to sample 68 calls to meet the constraints of confidence and accuracy. If that is more than can be assessed, loosening the allowable accuracy or accepting more sampling risk (higher E or lower C) will reduce the sample size.
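The example can be checked with a short sketch. It assumes the usual normal-approximation formula n = pi (1 - pi) (z / E)^2 and uses Python's `statistics.NormalDist` to get the z value; the function name and defaults are made up for illustration.

```python
from math import ceil
from statistics import NormalDist


def proportion_sample_size(confidence, margin, proportion=0.5):
    """Sample size to estimate a proportion within +/- margin.

    Assumes n = pi * (1 - pi) * (z / E)**2 with a two-sided z value,
    rounded up to a whole call.
    """
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return ceil(proportion * (1 - proportion) * (z / margin) ** 2)


print(proportion_sample_size(0.90, 0.10))  # 90% confidence, E = 0.1, pi = 0.5
print(proportion_sample_size(0.95, 0.05))  # tighter requirements need more calls
```

Because the margin E is squared in the denominator, halving E roughly quadruples the required sample size.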

It may occur that obtaining a daily sample rate with an acceptable confidence and accuracy is not possible. In that case, sample as many as you can. The results over a few days may provide enough of a sample to provide an estimate.

One consideration with the normal approximation of a binomial distribution used in the second sample size formula is that it breaks down when either n × pi or n × (1 − pi) is less than five. If either value is less than five, then the confidence interval is large enough to be of little value. If you are in this situation, use the binomial distribution directly rather than the normal approximation.

One last note. In most sampling cases, the overall size of the population doesn’t really matter too much. A population of about 100 is close enough to infinite that we really do not consider the population size. A small population and a need to sample may require special treatment of sampling with or without replacement, plus adjustments to the basic sample size formulas.

Creating the right sample size to a large degree depends on what you want to know about the population. In part, you need to know the final result to calculate the “right” sample size, so it is often just an estimate. By using the above equations and concepts, you can minimize the risk of an unclear result, yet it will always be an evolving process to determine the right sample size for each situation.

Fred Schenkelberg
Voting member of U.S. TAG to ISO/TC 56
Voting member of U.S. TAG to ISO/TC 69
Reliability Engineering and Management Consultant
FMS Reliability
http://www.fmsreliability.com

## AQL for Electricity Meter Testing

Q: We have implemented a program to test electricity meters that are already in use. This would target approximately 28,000 electricity meters that have been in operation for more than 15 years. Under this program, we plan to test a sample of meters and come to a conclusion about the whole batch  —  whether replacement is required or not. As per ANSI/ISO/ASQ 2859-1:1999: Sampling procedures for inspection by attributes — Part 1: Sampling schemes indexed by acceptance quality limit (AQL) for lot-by-lot inspection, we have selected a sample of 315 to be in line with the total number of electricity meters in the batch.

Please advise us on how to select an appropriate acceptable quality level (AQL) value to accurately reflect the requirement of our survey and to come to a decision on whether the whole batch should be rejected and replaced. Thank you.

A: One of the least liked phrases uttered by statisticians is “it depends.” Unfortunately, in response to your question, the selection of the AQL depends on a number of factors and considerations.

If one didn’t have to sample from a population to make a decision, meaning we could perform 100% inspection accurately and economically, we wouldn’t need to set an AQL. Likewise, if we were not able to test any units from the population at all, we wouldn’t need the AQL. It’s the sampling and associated uncertainty that it provides that requires some thought in setting an AQL value.

As you may notice, the lower the AQL the more samples are required. Think of it as reflecting the size of a needle. A very large needle (say, the size of a telephone pole) is very easy to find in a haystack. An ordinary needle is proverbially impossible to find. If you desire to determine if all the units are faulty or not (100% would fail the testing if the hypothesis is true), that would be a large needle and only one sample would be necessary. If, on the other hand, you wanted to find if only one unit of the entire population is faulty, that would be a relatively small needle and 100% sampling may be required, as the testing has the possibility of finding all are good except for the very last unit tested in the population.

AQL is not the needle or, in your case, the proportion of faulty fielded units. It is the acceptance quality limit, which is related to the proportion of bad units. The AQL is tied to the probability that a random sample, drawn from a population whose unknown actual failure rate equals the AQL (say 0.5%), has a sample failure rate low enough to accept the batch. We set the probability of acceptance relatively high, often 95%. This means that if the population is actually as good as or better than our AQL, we have a 95% chance of pulling a sample that will result in accepting the batch as being good.

The probability of acceptance is built into the sampling plan. Drafting an operating characteristic curve of your sampling plan is helpful in understanding the relationship between AQL, probability of acceptance, and other sampling related values.
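A few points on an operating characteristic curve can be computed with a short sketch. It assumes a single sampling plan judged with the binomial model (reasonable when the lot is large relative to the sample) and uses an illustrative plan of 50 samples with an acceptance number of zero; neither the plan nor the function name comes from the meter program itself.

```python
from math import comb


def probability_of_acceptance(n, accept_number, fraction_defective):
    """P(accept) for a single sampling plan under the binomial model."""
    p = fraction_defective
    return sum(
        comb(n, d) * p**d * (1 - p) ** (n - d) for d in range(accept_number + 1)
    )


# Illustrative plan: sample 50 units, accept only if none are defective.
for percent_defective in (0.1, 0.25, 0.5, 1.0, 2.0, 5.0):
    pa = probability_of_acceptance(50, 0, percent_defective / 100)
    print(f"{percent_defective:>5}% defective -> P(accept) = {pa:.3f}")
```

Plotting these points against percent defective gives the curve; substituting your own sample size and acceptance number shows how much protection the chosen plan actually offers.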

Now back to the comment of “it depends.” The AQL is the statement that basically says the population is good enough, with an acceptably low failure rate. For an electrical meter, the allowable number of out-of-specification units may be defined by contract or agreement with the utility or regulatory body. As an end customer, I would enjoy a meter that under reports my electricity use, as I would pay for less than I received. The utility company would not enjoy this situation, as it provides their service at a discount. And you can imagine the reverse situation and consequences. Some calculations and assumptions would permit you to determine the cost to the consumers or to the utility for various proportions of units out of specification, either over or under reporting. Balance the cost of testing against the cost of meter errors and you can find a reasonable sampling plan.

Besides the regulatory or contract requirements for acceptable percent defective, or the balance between costs, you should also consider the legal and publicity ramifications. If you accept 0.5% as the AQL, and there are one million end customers, that is 5,000 customers with possibly faulty meters. What is the cost of bad publicity or legal action? While not likely if the total number of faulty units is small, there does exist the possibility of a very expensive consequence.

Another consideration is the measurement error of the testing of the sampled units. If the measurement is not perfect, which is a reasonable assumption in most cases, then the results of the testing may have some finite possibilities to not represent the actual performance of the units. If the testing itself has repeatability and reproducibility issues, then setting a lower AQL may help to provide a margin to guard from this uncertainty. A good test (accurate, repeatable, reproducible, etc.) should have less of an effect on the AQL setting.

In summary, if the decision based on the sample results is important (major expensive recall, safety or loss of account, for example), then use a relatively lower AQL. If the test result is for an information gathering purpose which is not used for any major decisions, then setting a relatively higher AQL is fine.

If my meter is in the population under consideration, I am not sure I want my meter evaluated. There are three outcomes:

• The meter is fine and in specification, which is to be expected and nothing changes.
• The meter is overcharging me and is replaced with a new meter and my utility bill is reduced going forward. I may then pursue the return of past overcharging if the amount is worth the effort.
• The meter is undercharging me, in which case I wouldn’t want the meter changed nor the back charging bill from the utility (which I doubt they would do unless they found evidence of tampering).

As an engineer and good customer, I would want to be sure my meter is accurate, of course.

Fred Schenkelberg
Voting member of U.S. TAG to ISO/TC 56
Voting member of U.S. TAG to ISO/TC 69
Reliability Engineering and Management Consultant
FMS Reliability
http://www.fmsreliability.com


## Variation in Continuous and Discrete Measurements

Q: I would appreciate some advice on how I can fairly assess process variation for metrics derived from “discrete” variables over time.

For example, I am looking at “unit iron/unit air” rates for a foundry cupola melt furnace in which the “unit air” rate is derived from the “continuous” air blast, while the unit iron rate is derived from input weights made at “discrete” points in time every 3 to 5 minutes.

The coefficient of variation (CV) for the air rate is exceedingly small (good) due to its “continuous” nature, but the CV for the iron rate is quite large because of its “discrete” nature, even when I use moving averages for extended periods of time. Hence, that seemingly large variation for iron rate then carries over when computing the unit iron/unit air rate.

I think the discrete nature of some process variables results in unfairly high assessments of process variation, so I would appreciate some advice on any statistical methods that would more fairly assess process variation for metrics derived from discrete variables.

A: I’m not sure I fully understand the problem, but I do have a few assumptions and possibly a reasonable answer for you. As you know, when making a measurement using a discrete scale (red, blue, green; on/off; or similar), the item being measured is placed into one of the “discrete” buckets. For continuous measurements, we use some theoretically infinite scale to place the unit’s location on that scale. For this latter type of measurement, the precision we can achieve is often limited by the accuracy of the equipment.

In the question, you mention measurements of air from the “continuous” air blast. The air may be moving without interruption (continuously), yet the measurement is probably recorded periodically unless you are using a continuous chart recorder. Even so, matching up the reading with the unit iron readings every 3 to 5 minutes, does create individual readings for the air value. The unit iron reading is a “weights” based reading (not sure what is meant by derived, yet let’s assume the measurement is a weight scale of some sort.) Weight, like mass or length, is an infinite scale measurement, limited by the ability of the specific measurement system to differentiate between sufficiently small units.

I think you see where I’m heading with this line of thought. The variability with the unit iron reading may simply reflect the ability of the measurement process. I do not think either air rate or unit iron (weight based) is a discrete measurement, per se. Improve the ability to measure the unit iron and that may reduce some measurement error and subsequent variation. Or, it may confirm that the unit iron is variable to an unacceptable amount.
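To illustrate why the noisier reading dominates the variation of the ratio, here is a minimal sketch in Python. All numbers are made up for illustration and are not foundry data; the `simulate` helper and its parameters are hypothetical. It assumes the iron and air readings are independent, in which case the CV of the ratio is approximately the square root of the sum of the squared CVs.

```python
import random
import statistics


def cv(values):
    """Coefficient of variation: sample standard deviation over the mean."""
    return statistics.stdev(values) / statistics.mean(values)


def simulate(iron_sd, air_sd=0.5, n=5000, seed=0):
    """Return (CV iron, CV air, CV of iron/air) for hypothetical readings."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    iron = [rng.gauss(100.0, iron_sd) for _ in range(n)]
    air = [rng.gauss(50.0, air_sd) for _ in range(n)]
    ratio = [i / a for i, a in zip(iron, air)]
    return cv(iron), cv(air), cv(ratio)


# Compare a noisy iron measurement with an improved one (assumed values).
for iron_sd in (15.0, 5.0):
    cv_iron, cv_air, cv_ratio = simulate(iron_sd)
    approx = (cv_iron**2 + cv_air**2) ** 0.5  # first-order approximation
    print(f"iron sd={iron_sd:5.1f}  CV iron={cv_iron:.3f}  CV air={cv_air:.3f}  "
          f"CV ratio={cv_ratio:.3f}  approx={approx:.3f}")
```

In this sketch the ratio’s CV tracks the iron CV almost exactly, so reducing the scatter in the iron reading is what reduces the scatter in the ratio.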

Another assumption I could make is that the unit iron is measured for the batch that then has unit air rates regularly measured. The issue here may just be the time scales involved. Not being familiar with the particular process involved, I’ll assume some manner of metal forming, where a batch of metal is created then formed over time where the unit air is important. And, furthermore, assume the batch of metal takes an hour for the processing. That means we would have about a dozen or so readings of unit air for the one reading of unit iron.

If you recall, the standard error of an average is the standard deviation divided by the square root of n (the number of readings). In this case, there is about a 10 to 1 difference in n (10 for unit air to one for unit iron). Over many batches of metal, the ratio of readings remains at or about 10 to 1, which affects the relative stability of the two coefficients of variation. Get more readings for unit iron or reduce the unit air readings, and it may just even out. Or, again, you may discover that the unit iron readings and underlying process are just more variable.
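The effect of the number of readings can be sketched with a small simulation. It relies on the fact that the standard deviation of an average of n independent readings is the per-reading standard deviation divided by the square root of n. The per-reading mean and scatter below are assumed values, and `cv_of_batch_means` is a hypothetical helper, not part of any process described in the question.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable


def cv_of_batch_means(n_per_batch, n_batches=4000, mean=100.0, sd=10.0):
    """CV of per-batch averages when each average uses n_per_batch readings."""
    means = []
    for _ in range(n_batches):
        readings = [random.gauss(mean, sd) for _ in range(n_per_batch)]
        means.append(statistics.mean(readings))
    return statistics.stdev(means) / statistics.mean(means)


cv_one = cv_of_batch_means(1)   # one reading per batch (like unit iron)
cv_ten = cv_of_batch_means(10)  # ten readings per batch (like unit air)

print(f"CV with 1 reading per batch:   {cv_one:.4f}")
print(f"CV with 10 readings per batch: {cv_ten:.4f}")
print(f"Observed ratio: {cv_one / cv_ten:.2f}  (sqrt(10) = {math.sqrt(10):.2f})")
```

With identical per-reading scatter, the metric built from ten readings per batch looks roughly three times more stable, purely because of the averaging.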

From the information provided, I think this provides two areas to conduct further exploration. Good luck.

Fred Schenkelberg
Voting member of U.S. TAG to ISO/TC 56
Voting member of U.S. TAG to ISO/TC 69
Reliability Engineering and Management Consultant
FMS Reliability
http://www.fmsreliability.com

For more on this topic, please visit ASQ’s website.

## Question

Design for Six Sigma (DFSS) involves the discovery, development, and understanding of critical to quality areas and fosters innovation. However, studies have shown that using focus groups, interviews, and similar methods with current users only brings forth ideas for incremental innovation, as the only knowledge most customers have is of current products. But we know that the greatest potential for return is in radical innovation.

My question is: what useful tools are there for determining critical to quality areas of radical innovation products, or products that are new to market and of which customers have little to no knowledge?

A: These are great questions that are not easy to answer as posed.

One of the dilemmas I’ve seen with companies building radical innovation without enough knowledge to identify the important quality aspects is that the company is often under intense pressure to get to market. In some cases, the innovation presents clear aspects that have to be controlled to create an acceptable product. In some cases, the issues are unknown.

I do not agree that work within a group only reflects the knowledge already present. One of the best tools in these situations is carefully crafted questions posed to those most familiar with the new technology. Given my personal bias, I would ask: “What will fail? Why?” and then ask about material, process, and feature performance variation. Focusing on the failure mechanisms and variation will often lead the team to uncover those aspects of the product that require well-crafted specifications and monitoring.

Not a fancy tool, just a question or two. Yet, the focus is on what will cause the innovation to not meet the customer’s expectations. What could go wrong? Make it visible, talked about, and examined. Creating a safe atmosphere (no blame or personal attacks) to explore failure permits those most vested in making the product work to examine the boundaries and paths that lead to failure.

Once the process of safely examining failures starts, a range of tools assist with refinement and prioritization. Failure Modes and Effects Analysis (FMEA) and Highly Accelerated Life Testing (HALT) provide means to further discover and explore the paths to failure. I mention creating a safe environment first because using FMEA and HALT when someone’s reputation or status is threatened generally makes these tools very ineffective.
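One common way an FMEA supports prioritization is the risk priority number (RPN): the product of severity, occurrence, and detection ratings, each typically scored from 1 to 10. The sketch below shows only the ranking arithmetic. The failure modes and their ratings are invented for illustration, and the `FailureMode` class is a hypothetical structure, not part of any standard.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    description: str
    severity: int    # 1 (negligible) to 10 (most severe)
    occurrence: int  # 1 (remote) to 10 (almost certain)
    detection: int   # 1 (almost certain to detect) to 10 (cannot detect)

    @property
    def rpn(self) -> int:
        """Risk priority number: severity x occurrence x detection."""
        return self.severity * self.occurrence * self.detection


# Hypothetical failure modes gathered from "What will fail? Why?" discussions.
modes = [
    FailureMode("Seal degrades at high temperature", 8, 4, 6),
    FailureMode("Connector corrodes in humid air", 6, 3, 3),
    FailureMode("Firmware hangs on low battery", 5, 7, 2),
]

# Highest RPN first: these are candidates for early attention.
ranked = sorted(modes, key=lambda m: m.rpn, reverse=True)
for m in ranked:
    print(f"{m.rpn:4d}  {m.description}")
```

The ranking is only a starting point for discussion; the team’s judgment about each failure mechanism still decides what gets explored first.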

One more thought on a safe environment for the exploration of failures. Focus on the process, materials and interaction with customers and their environment. “How can we make this better, more resilient, more robust, etc.?” Not, “Why did you design it this way?” or, “This appears to be a design mistake.” All involved have the same goal to create a quality product or service, yet there may be a lot unknown related to those conditions that lead to product failure. An open and honest exploration to discover the margins and product weaknesses is most effective in a safe environment for those concerned. And, by the way, this includes vendors, contractors, suppliers, and all those involved with the supply chain, development and manufacturing processes.

Fred Schenkelberg
Voting member of U.S. TAG to ISO/TC 56
Voting member of U.S. TAG to ISO/TC 69
Reliability Engineering and Management Consultant
FMS Reliability
fmsreliability.com
