Recommended AQLs for Packaging Material in the Pharmaceutical Industry

Pharmaceutical sampling

Question:

Are there recommended AQLs (critical, major and minor defects) for packaging material (primary and secondary) for the Pharmaceutical Industry?
Thank you

Response:

Though there are no recommended AQLs (or LTPDs) for packaging materials, some industry standards have begun to surface. The following table is a guideline that I have seen used successfully for a risk-based approach to sampling. Based on the severity and criticality of the packaging materials, these guidelines can be adjusted up or down per your risk management process. Utilizing a c=0 sampling plan based on the binomial distribution, the sample size can be calculated using the following formula (with confidence and reliability expressed as fractions):

n = ln(1 − Confidence) / ln(Reliability)

              Primary                                       Secondary
              Confidence (%)  Reliability (%)  Sample Size  Confidence (%)  Reliability (%)  Sample Size
Critical      99.5            99               527          99              97               151
Major         99              97               151          97              95               68
Minor         97              95               68           95              90               28
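As a quick check, the table's sample sizes can be reproduced from that formula. A minimal Python sketch follows; note the rounding convention is inferred from the table (a strictly conservative plan would round up with math.ceil):

```python
import math

def c0_sample_size(confidence: float, reliability: float) -> int:
    """c = 0 (accept-on-zero-failures) plan: n samples, all must pass, to
    demonstrate `reliability` with `confidence`.  The table above appears
    to truncate n; math.ceil would be the more conservative choice."""
    return int(math.log(1 - confidence) / math.log(reliability))

# Reproduce the "Primary" column of the table:
for label, conf, rel in [("Critical", 0.995, 0.99),
                         ("Major",    0.99,  0.97),
                         ("Minor",    0.97,  0.95)]:
    print(label, c0_sample_size(conf, rel))   # 527, 151, 68
```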

I hope this assists you.

Steven Walfish
Secretary, U.S. TAG to ISO/TC 69
ASQ, CQE
Principal Statistician, BD
http://statisticaloutsourcingservices.com

Find more information related to AQLs here.

Pesticide Residues Surveillance Program Sampling

ISO 14004, Environmental Management System, EMS

Question

I am a plant production specialist working in the government sector. I manage a pesticide residues surveillance program that targets local commodities of fresh fruits and vegetables (F&V): we sample the targeted numbers and types of F&V on a regular basis throughout the year, analyze the samples and results, and produce an annual report. I have checked many similar programs in other countries, including the USDA program, but I did not find a methodology or statistical approach for identifying the sample size to target in a year, taking into account the type and number of crops, crop production, etc., when elaborating the annual sampling plan. My question is: how can I elaborate a sampling plan for this program considering all relevant factors?

Your cooperation is highly appreciated.

Answer

Sampling is a method to estimate population parameters. For example, if the goal is to determine the amount of unacceptable residue on store bought apples, and testing every individual apple is impractical, then we use a sample to estimate the proportion with unacceptable residue.

The sample plan must focus on the goal and balance it against the resources and technology available. If the goal is to accurately detect a very low proportion with residue, say 1 in 1 million, then the sample size will be larger than if the goal is to detect 1 in 100 with unacceptable residue. The goal to detect 1 in 100 is easier to accomplish (fewer apples tested), yet it does not reveal whether or not a 1-in-1,000 level is present.
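For a concrete sense of the difference, the sample size required to see at least one contaminated item follows from 1 − (1 − p)^n ≥ C. A short sketch, assuming pass/fail testing and random sampling:

```python
import math

def n_to_detect(p_defect: float, confidence: float = 0.95) -> int:
    """Smallest n such that a random sample contains at least one
    defective item with the given confidence: 1 - (1 - p)^n >= confidence."""
    return math.ceil(math.log(1 - confidence) / math.log(1 - p_defect))

print(n_to_detect(1 / 100))        # ~299 items to detect a 1-in-100 rate
print(n_to_detect(1 / 1_000_000))  # ~3 million items for 1-in-1,000,000
```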

A key element is to set a specific detection goal and design a sample plan capable of detecting at or better than the goal's level. "Capable" here includes accounting for measurement system errors and an understanding of the nature of how failures occur.

Another consideration is the nature of the measurement and the goal. If the test is only pass/fail for the presence of residue, then we have to use the relatively inefficient sampling plans based on the binomial distribution. If the data is a variable value, such as parts-per-million residue, then we can use more efficient sampling plans based on the appropriate continuous distribution. If the testing is destructive to the item being tested, that limits the sampling techniques available.

How is the lot defined? If this is an annual report, then the lot may be the annual production of a specific fruit or vegetable, say a specific variety of apples. Define the population clearly, along with any relevant subgroups of interest. If the data is only for an annual report, the sampling plan is markedly different than if the goal is a monthly monitoring and warning system.

Another consideration is the thresholds along with confidence. For sampling plan creation we use two specific points of interest. The Producer's Risk Point (PRP) is made up of the Acceptable Quality Level (AQL) and the producer's risk (Type I risk, or alpha: the probability of rejecting a good lot, or in this case stating the residue level is above a specific AQL or value when it actually is not). The second point is the Consumer's Risk Point (CRP), made up of the Lot Tolerance Percent Defective (LTPD) and the consumer's risk (Type II risk, or beta: the probability of accepting a bad lot, or in this case stating the residue level is below the LTPD when it actually is not).

The closer the AQL and LTPD are, the more samples it takes to determine an accurate estimate of the population. Likewise, the less risk either the producer or the consumer is willing to incur, the higher the sample size.
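To illustrate that trade-off, the sketch below searches for the smallest single sampling plan (n, c) meeting both risk points under a binomial model; the AQL/LTPD values in the examples are made up:

```python
from scipy.stats import binom

def find_plan(aql, alpha, ltpd, beta, n_max=10_000):
    """Smallest single sampling plan (n, c) with
    P(accept | p = aql) >= 1 - alpha and P(accept | p = ltpd) <= beta."""
    for n in range(1, n_max + 1):
        for c in range(n + 1):
            if binom.cdf(c, n, aql) >= 1 - alpha:   # producer's risk met
                if binom.cdf(c, n, ltpd) <= beta:   # consumer's risk met
                    return n, c
                break  # a larger c only raises P(accept | ltpd); try next n
    return None

print(find_plan(0.01, 0.05, 0.05, 0.10))  # AQL 1%, LTPD 5%
print(find_plan(0.01, 0.05, 0.02, 0.10))  # LTPD closer to AQL -> far larger n
```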

One more consideration, which is often overlooked, is the selection of samples for testing. Most sampling plans are based on the assumption that the samples are taken randomly from the entire population. For example, with say 50 million apples of a specific variety, we would create a system in which any specific apple has an equal chance of being selected. This is not a trivial matter in most cases. The availability and distribution of apples, along with their storage, shipping, and display, all contribute to limiting or biasing the selection of a random sample. If it is not possible to select test items randomly, then study the impact on the study and the means to account for a non-random sample.
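Once a complete sampling frame exists, the draw itself is mechanical; the hard part, as noted above, is enumerating the population. A minimal sketch (the frame identifiers are hypothetical):

```python
import random

# Hypothetical sampling frame: one identifier per unit (e.g., crate + position).
frame = [f"crate{c:04d}-pos{p:02d}" for c in range(1000) for p in range(50)]

random.seed(2024)                      # fixed seed: the draw is reproducible/auditable
sample = random.sample(frame, k=300)   # simple random sample without replacement
```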

In summary, for any sample plan:

  • Define the population
  • Define the desired goal of the study
  • Understand the measurement system
  • Use variables data if at all possible
  • Define PRP and CRP
  • Determine capable sampling plan
  • Design method to select random sample

This quick summary covers what I consider the essential elements, yet other factors may impact the sampling plan: for example, seasonal variations in production, the location in the supply chain where measurements are made, supply chain variations affecting the presence of residue, the differing nature of residue commonly found on different fruits or vegetables, and probably a few more. Understanding the goal, the measurement system, and random sampling will help determine the areas that require consideration.

Cheers,

Fred Schenkelberg

Combating Contamination

Workplace safety, OHSAS 18001, work environments

Q: We want to ensure that we are receiving clean containers to package our products. How can we improve our incoming inspection process?

A: You should encourage your vendor to ship only clean containers. Then, be sure that the shipping and receiving process doesn’t cause contamination. If you can determine the source or sources of the contamination, the best fix is to remove the cause.

If that approach is not possible and you have incoming containers that may have some contamination, then consider the following elements in creating an efficient incoming inspection process.

1) How do you detect the contamination?

Apparently, you are able to detect the container contamination prior to filling, or are able to detect the effect of the contamination on the final product. Given that you are interested in creating an incoming test, let's assume you have one or more ways to detect faulty units.

As you may already know, there are many ways to detect contamination. Some are faster than others, and some are non-destructive. Ideally, a quick non-destructive test would permit you to inspect every unit and to divert faulty units to a cleaning process. If the testing has to be destructive, then you’ll have to consider lot sampling of some sort.

There are many testing options. One is optical inspection, which may find gross discoloration or large debris effectively. Avoid relying on human inspectors except as a short-term solution, as we humans are pretty poor visual inspectors.

Another approach is using light to illuminate the contamination, such as a black light (UVA). Depending on the nature and properties of the contamination, you may be able to find a suitable light to quickly spot units with problems.

Another approach, which is more time consuming, is conducting a chemical swab or solution rinse and a chemical analysis to find evidence of contamination. If the contamination is volatile, you might be able to use air to “rinse” the unit and conduct the analysis. This chemical approach may require specialized equipment. Depending on how fast the testing occurs, this approach may or may not be suitable for 100 percent screening.

There may be other approaches for detecting the faulty units, yet without more information about the nature and variety of contamination, it’s difficult to make a recommendation. Ideally, a very fast, effective and non-destructive inspection method is preferred over a slow, error prone, and destructive approach. Cost is also a consideration, since any testing will increase the production costs. Finding the right balance around these considerations is highly dependent on the nature of the issue, cost of failure, and local resources.

2) How many units do you have to inspect?

Ideally, the sample size is zero as you would first find and eliminate the source of the problem. If that is not possible or practical, then 100 percent inspection using a quick, inexpensive, and effective method permits you to avoid uncertainties with sampling.

If the inspection method requires lot sampling, then all of the basic lot sampling guidelines apply. There are many references available that will assist you in the selection of an appropriate sampling plan based on your desired sampling risk tolerance levels.

Another consideration is the percentage of contaminated units per lot. If there is a consistently low failure rate per lot, then lot sampling may require a relatively large number of tested units. You'll have to determine the level of bad units permitted to pass through to production. Short of 100 percent sampling, it's difficult (and expensive) to find very low percentages of "bad" units in a lot using destructive testing.
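A short sketch shows why, assuming random sampling from a large lot: with a consistently low contamination rate, even a large destructive sample misses the problem more often than one might expect.

```python
def p_detect(p_contaminated: float, n: int) -> float:
    """Probability that a random sample of n units contains at least one
    contaminated unit (binomial approximation for a large lot)."""
    return 1 - (1 - p_contaminated) ** n

# With 0.1% contamination, 500 destructive tests still miss the problem
# most of the time:
print(p_detect(0.001, 500))   # ~0.39
```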

3) Work to remove original source(s) of contamination to permit you to stop inspections.

I stress this approach because it’s the most cost effective in nearly all cases. In my opinion, incoming inspection should be stopped as soon as possible since the process to create, ship and receive components should not introduce contamination and require incoming inspection to “sort” the good from the bad.

Fred Schenkelberg
Voting member of U.S. TAG to ISO/TC 56 on Reliability
Voting member of U.S. TAG to ISO/TC 69 on Applications of Statistical Methods
Reliability Engineering and Management Consultant
FMS Reliability
www.fmsreliability.com

Related Resources:

Digging for the Root Cause, Six Sigma Forum Magazine, open access

Many Six Sigma practitioners use the term “root cause” without a clear concept of its larger meaning, and similar situations occur in Six Sigma training programs. As a result, many practitioners overlook root causes. Read more.

The Bug and the Slurry: Bacterial Control in Aqueous Products, ASQ Knowledge Center Case Study, open access

When a customer reported a problem using the polycrystalline diamond (PCD) slurry supplied by Warren/Amplex, the company traced its product through the supply chain in order to identify the cause and quickly implement a solution. Read more.

Explore the ASQ Knowledge Center for more case studies, articles, benchmarking reports, and more.

Browse articles from ASQ magazines and journals here.

Z1.4 and Z1.9 in Micro Testing and API Chemical Analysis

Chemistry, micro testing, chemical analysis, sampling

Q: I work at a cosmetics manufacturing company that produces sunscreen in bulk amounts. When we make 3,000 kg of sunscreen, we will use that in 10,000 units of final sunscreen products which will weigh 300 g each.

How many samples do I need to collect from the 10,000 units to pass the qualification?

The products need to pass both attribute and variable sampling tests such as container damage, coding error, micro testing, and Active Pharmaceutical Ingredients (API)  failure. Almost 100 percent of final products were inspected for appearance error, but a small number of them should be measured for micro testing and API chemical analysis.

For Z1.4-2008: Sampling Procedures and Tables for Inspection by Attributes, we have to collect a sample of 200 (lot size of 3,201–10,000; general inspection level II; AQL 4.0; code letter L), and more than 179 should pass for qualification.

For Z1.9-2008: Sampling Procedures and Tables for Inspection by Variables for Percent Nonconforming, we have to collect a sample of 25 (lot size of 3,201–10,000; general inspection level II; AQL 4.0; code letter L) to meet the requirement of 1.12 percent nonconformance.

Which sampling plan should we follow for micro testing and API chemical analysis?

A: If the micro test is pass/fail, then you should use Z1.4. The API chemical test probably yields a numerical result for which you can calculate the average and standard deviation; in that case, the proper standard to use is Z1.9. If the micro test gives you a numerical result, then you can use Z1.9 for it as well.

One thing to consider is the fact that the materials are from a batch. If the batch can be assumed to be completely mixed, without settling or separation prior to loading into final packaging, then the API chemical test may only need to be done on the batch, not on the final product. Micro testing, which can be affected by the cleanliness of the packaging equipment, probably needs to be done on the final product.

Brenda Bishop
U.S. Liaison to TC 69/WG3
ASQ CQE, CQA, CMQ/OE, CRE, SSBB, CQIA
Belleville, Illinois

Related Resources:

Getting the Right Data Up Front: A Key Challenge, Quality Engineering, open access

Rational decisions require transforming data into useful information by appropriate analyses. Such analyses, however, can be only as good as the data upon which they are based. In this article, the authors urge that careful consideration be given, up front, to procuring the right data and provide some guidelines. Read more.

A Graphical Tool for Detection of Outliers in Completely Randomized, Unreplicated 2k and 2k-P Factorials, Quality Engineering, open access

With the increased awareness of statistical methods in industry today, many non-statisticians are implementing statistical studies and conducting statistically designed experiments (DOEs). With this increased use of DOEs by non-statisticians in applied settings, there is a need for more graphical methodologies to support both analysis and interpretations of DOE results. Read more.

Z1.4:2008 inspection levels

Q: I am reading ANSI/ASQ Z1.4-2008: Sampling procedures and tables for inspection by attributes, and there is a small section regarding inspection level (clause 9.2). Can I get further explanation of how one would justify that less discrimination is needed?

For example, my lot size is 720 which means, under general inspection level II, the sample size would be 80 (code J). However, we run a variety of tests, including microbial and heavy metal testing. These tests are very costly. We would like to justify that we can abide by level I or even lower if possible. Do you have any advice?

The product is a liquid dietary supplement.

A: Justification of a specific inspection level is the responsibility of the "responsible party." Rationale for using one of the special levels (S-1, S-2, S-3, S-4) could be based on the cost or time to perform a test. Less discrimination means that the stated Acceptable Quality Level (AQL) on the table underestimates the true AQL, because the sample size has been reduced from the table-suggested sample size (e.g., Table II-A has sample size 32 for code letter G, used for a lot size of 151 to 280 under general inspection level II, while general inspection level I would require letter E, or 13 samples, for the same lot size).

Justification of a sampling plan is based on risk: a plan can be justified on the cost of the test, provided you are willing to accept larger sampling risks. If you use one of the special sampling plans because of the cost of the test, it is helpful to calculate the actual AQL and Limiting Quality (LQ) of the plan. For a single sampling plan, the probability of accepting a lot with fraction defective $p$, given a sample size $n$ and allowed defects $x$, is

$$P_a(p) = \sum_{d=0}^{x} \binom{n}{d} p^d (1-p)^{n-d}$$

You solve this equation for $p$ at the two conventional acceptance probabilities: the AQL is the $p$ for which $P_a = 0.95$, and the LQ is the $p$ for which $P_a = 0.10$.
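These equations have no closed form, so a numerical root finder does the solving. A short sketch (the n = 32, x = 1 plan is a hypothetical example):

```python
from scipy.optimize import brentq
from scipy.stats import binom

def actual_aql_lq(n: int, x: int):
    """Solve Pa(p) = binom.cdf(x, n, p) for the p giving Pa = 0.95 (AQL)
    and the p giving Pa = 0.10 (LQ)."""
    aql = brentq(lambda p: binom.cdf(x, n, p) - 0.95, 1e-9, 1 - 1e-9)
    lq  = brentq(lambda p: binom.cdf(x, n, p) - 0.10, 1e-9, 1 - 1e-9)
    return aql, lq

aql, lq = actual_aql_lq(n=32, x=1)   # hypothetical plan: n = 32, accept on 1
print(f"AQL ~ {aql:.2%}, LQ ~ {lq:.2%}")
```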

Steven Walfish
Secretary, U.S. TAG to ISO/TC 69
ASQ CQE
Principal Statistician, BD
http://statisticaloutsourcingservices.com

Related Content:

Acceptance Sampling With Rectification When Inspection Errors Are Present, Journal of Quality Technology

In this paper the authors consider the problem of estimating the number of nonconformances remaining in outgoing lots after acceptance sampling with rectification when inspection errors can occur. Read more.

Zero Defect Sampling, World Conference on Quality and Improvement

Zero defect sampling is an alternative method to the obsolete Mil Std 105E sampling scheme previously used to accept or reject products, and the remaining ANSI Z1.4-1993 which is still in use. This paper discusses the development of zero defect sampling and compares it to Mil Std 105E. Read more.

Variation in Continuous and Discrete Measurements

Force majeure

Q: I would appreciate some advice on how I can fairly assess process variation for metrics derived from “discrete” variables over time.

For example, I am looking at “unit iron/unit air” rates for a foundry cupola melt furnace in which the “unit air” rate is derived from the “continuous” air blast, while the unit iron rate is derived from input weights made at “discrete” points in time every 3 to 5 minutes.

The coefficient of variation (CV) for the air rate is exceedingly small (good) due to its "continuous" nature, but the CV for the iron rate is quite large because of its "discrete" nature, even when I use moving averages over extended periods of time. Hence, that seemingly large variation for the iron rate then carries over when computing the unit iron/unit air rate.

I think the discrete nature of some process variables results in unfairly high assessments of process variation, so I would appreciate some advice on any statistical methods that would more fairly assess process variation for metrics derived from discrete variables.

A: I'm not sure I fully understand the problem, but I do have a few assumptions and possibly a reasonable answer for you. As you know, when making a measurement using a discrete scale (red, blue, green; on/off; or similar), the item being measured is placed into one of the "discrete" buckets. For continuous measurements, we use some theoretically infinite scale to place the unit's location on that scale. For this latter type of measurement, we are often limited by the accuracy of the equipment as to the level of precision with which the measurement can be made.

In the question, you mention measurements of air from the "continuous" air blast. The air may be moving without interruption (continuously), yet the measurement is probably recorded periodically unless you are using a continuous chart recorder. Even so, matching up the reading with the unit iron readings every 3 to 5 minutes does create individual readings for the air value. The unit iron reading is a weight-based reading (I am not sure what is meant by "derived," so let's assume the measurement comes from a weight scale of some sort). Weight, like mass or length, is an infinite-scale measurement, limited by the ability of the specific measurement system to differentiate between sufficiently small units.

I think you see where I’m heading with this line of thought. The variability with the unit iron reading may simply reflect the ability of the measurement process. I do not think either air rate or unit iron (weight based) is a discrete measurement, per se. Improve the ability to measure the unit iron and that may reduce some measurement error and subsequent variation. Or, it may confirm that the unit iron is variable to an unacceptable amount.

Another assumption I could make is that the unit iron is measured for the batch that then has unit air rates regularly measured. The issue here may just be the time scales involved. Not being familiar with the particular process involved, I’ll assume some manner of metal forming, where a batch of metal is created then formed over time where the unit air is important. And, furthermore, assume the batch of metal takes an hour for the processing. That means we would have about a dozen or so readings of unit air for the one reading of unit iron.

If you recall, the standard error of a mean is the standard deviation divided by the square root of n (the number of samples). In this case, there is about a 10-to-1 difference in n (roughly 10 unit air readings for every one unit iron reading). Over many batches of metal, the ratio of readings remains at or about 10 to 1, thus impacting the relative stability of the two coefficients of variation. Get more readings for unit iron, or reduce the unit air readings, and it may just even out. Or, again, you may discover the unit iron readings and the underlying process are just more variable.
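A quick simulation makes the effect visible; all process values here are invented for illustration. Averaging roughly ten air readings per batch shrinks the air CV by about the square root of 10 relative to a single iron reading with the same underlying variability:

```python
import numpy as np

rng = np.random.default_rng(7)
true_cv = 0.02          # assume both quantities truly vary by 2% (hypothetical)
n_batches = 200

# ~10 air readings are averaged per batch vs. one iron reading per batch.
air  = rng.normal(100, 100 * true_cv, size=(n_batches, 10)).mean(axis=1)
iron = rng.normal(100, 100 * true_cv, size=n_batches)

print(f"CV of averaged air readings: {air.std() / air.mean():.4f}")   # ~0.006
print(f"CV of single iron readings:  {iron.std() / iron.mean():.4f}") # ~0.02
```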

From the information provided, I think this provides two areas to conduct further exploration. Good luck.

Fred Schenkelberg
Voting member of U.S. TAG to ISO/TC 56
Voting member of U.S. TAG to ISO/TC 69
Reliability Engineering and Management Consultant
FMS Reliability
http://www.fmsreliability.com

ANOVA for Tailgate Samples

Automotive inspection, TS 16949, IATF 16949

Q: I have a question that is related to comparison studies done on incoming inspections.

My organization has a process for which it receives a “tailgate” sample from a supplier and then compares that data with three samples of the next three shipments to “qualify” them. The reason behind this comparison is to determine if the production process of the vendor has changed significantly from the “tailgate” sample, or if they picked the best of the best for the “tailgate.”

It seems a Student's t-test for comparing two means might be a simple and quick evaluation, but I believe an ANOVA might be in order for the various characteristics measured (there are multiple).

Can an expert provide some statistical advice to help me move forward in determining an effective solution?

A: Assuming the data is continuous, ANOVA (or MANOVA for multiple responses) should be employed. Since the tailgate sample is a control, Dunnett's multiple comparison test should be used if the p-value from ANOVA is less than 0.05. If the data is discrete (pass/fail), then comparing the lots would require the use of a chi-square test.
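A sketch of that workflow for a single characteristic (the measurement values are placeholders, and scipy.stats.dunnett requires SciPy 1.11 or later; the MANOVA case for multiple responses is not shown):

```python
import numpy as np
from scipy.stats import f_oneway, dunnett

# Placeholder data: tailgate control plus the next three shipments.
tailgate = np.array([10.1, 10.3,  9.9, 10.2, 10.0])
ship1    = np.array([10.2, 10.4, 10.1, 10.3, 10.2])
ship2    = np.array([ 9.8, 10.0,  9.9, 10.1,  9.9])
ship3    = np.array([10.6, 10.7, 10.5, 10.8, 10.6])

f_stat, p = f_oneway(tailgate, ship1, ship2, ship3)
print(f"ANOVA p-value: {p:.4f}")

if p < 0.05:
    # Dunnett's test: each shipment compared against the tailgate control.
    res = dunnett(ship1, ship2, ship3, control=tailgate)
    print(res.pvalue)   # one p-value per shipment-vs-control comparison
```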

Steven Walfish
Secretary, U.S. TAG to ISO/TC 69
ASQ CQE
Principal Statistician, BD
http://statisticaloutsourcingservices.com/

Guidance on Z1.4 Levels

Chart, graph, sampling, plan, calculation, z1.4

Q: My company is using ANSI/ASQ Z1.4-2008 Sampling Procedures and Tables for Inspection by Attributes, and we need some clarification on the levels and the sampling plans.

We are specifically looking at Acceptable Quality Limits (AQLs) 1.5, 2.5, 4.0, and 6.5 for post manufacturing of apparel, footwear, home products, and jewelry.

Do you have any guidelines to determine when and where to use levels I, II, and III? I understand that level II is the norm and used most of the time. However, we are not clear on levels I and III versus normal, tightened, and reduced.

Are there any recommended guidelines that correlate between levels I, II, III and single sampling plans, normal, tightened, and reduced?

The tables referenced in the standard show single sampling plans for normal, tightened, and reduced, can you confirm that these are for level II (pages 11, 12, 13)?

Do you have any tables showing the levels I and III for normal, tightened, and reduced?

A: Level I is used when you need less discrimination or when you are not as critical on the acceptance criteria. It is usually used for cosmetic defects where you may have color differences that are not noticeable in a single unit. Level III is used when you want to be very picky. It is a more difficult level to get acceptance with, so it needs to be used sparingly or it can cost you a lot of money.

Each level has a normal, tightened, and reduced scheme. I am not sure what you are asking with respect to the correlation between levels I, II, and III and normal, tightened, and reduced. The goal is simply to inspect the minimum amount needed to get an accept or reject decision. Since inspection costs money, we do not want to do too much. Likewise, we do not want to reject too much, since that also costs money, both in product availability and extra shipping.

Yes, the tables on pages 11, 12, and 13 are for normal, tightened, and reduced inspection, but if you look at the letters for sample size, you will note that in most cases there are different letters for levels I, II, and III. Accept and reject numbers are based on the defect level and the sample size. The switching rules tell you when you can switch to either a reduced or tightened level. The tables handle not just levels I, II, and III, but also the special levels.
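As a rough illustration, the core switching logic can be sketched as a small state machine. This is a simplified sketch: Z1.4 attaches additional conditions to entering reduced inspection (such as steady production and approval by the responsible authority), which are omitted here.

```python
def next_state(state: str, history: list) -> str:
    """history: recent lot results, most recent last (True = lot accepted).
    Simplified Z1.4 switching: normal -> tightened on 2 rejections within
    5 consecutive lots; tightened -> normal after 5 straight acceptances;
    normal -> reduced after 10 straight acceptances (other Z1.4 conditions
    omitted); reduced -> normal on any rejection."""
    if state == "normal":
        if history[-5:].count(False) >= 2:
            return "tightened"
        if len(history) >= 10 and all(history[-10:]):
            return "reduced"
    elif state == "tightened" and len(history) >= 5 and all(history[-5:]):
        return "normal"
    elif state == "reduced" and history and not history[-1]:
        return "normal"
    return state
```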

Jim Bossert
SVP Process Design Manager, Process Optimization
Bank of America
ASQ Fellow, CQE, CQA, CMQ/OE, CSSBB, CMBB
Fort Worth, TX

Z1.4 2008: AQL, Nonconformities, and Defects Explained

Pharmaceutical sampling

Q: My question is regarding nonconformities per hundred units and percent nonconforming. This topic is discussed in ANSI/ASQ Z1.4-2008 Sampling Procedures and Tables for Inspection by Attributes under sections 3.2 and 3.3 on page 2. Regardless of the explanations provided, I find myself puzzled as to what the following numbers refer to in "Table II-A: Single sampling plans for normal inspection (Master table)."

Specifically, I am having problems understanding the unit numbers just above the acceptance and rejection numbers (e.g., 0.010, 0.015, 0.025, 1000). Do these represent percent nonconformities? If so, does 0.010 = 0.01%, and conversely, how can 1000 = 1000%?

As you may see, I am very confused by these numbers, and I was hoping to have some light shed on this subject. Thank you for your answers in advance.

A: The numbers at the top of the table are just as the questioner stated: 0.010 = 0.01% defective. That is the acceptable quality limit (AQL) number. Generally, most companies want 1% or less, but as noted in the table, it does go up to 1000; AQLs above 10 are expressed in nonconformities per hundred units rather than percent defective, so a single unit can carry multiple nonconformities. It is extreme to think of something being more than 100%, but consider that it may be a minor or cosmetic defect that does not affect the function but just does not look good. Scratch-and-dent sales are a common result of these higher numbers.

The AQL is the worst process average you would consider acceptable at that level. The thing to remember is that these plans work best when the quality is very good or very bad. If you are right at the limit, you could end up taking more samples and spending a lot of time in tightened inspection.

Many people use percent nonconforming instead of percent defective, simply because of the connotation of "defective." No one wants to say they shipped a defective product. They may have shipped a nonconforming product that one customer could not use simply because that customer's requirements were too strict, while another customer may be able to use the same product because their requirements are less stringent.

Jim Bossert
SVP Process Design Manager, Process Optimization
Bank of America
ASQ Fellow, CQE, CQA, CMQ/OE, CSSBB, CMBB
Fort Worth, TX

Z1.9 Sigma for Variability Known Method

Audit, audit by exception

Q: I have a question about Z1.9-2008: Sampling Procedures and Tables for Inspection by Variables for Percent Nonconforming. I see there is a "Variability Known" method; however, I don't know how to get a sigma, so I don't know how to use this method. Could you please share how to get a sigma?

A: To get a sigma for the Variability Known method, you need data that has been collected over a period of time, from which you calculate the standard deviation. The rule of thumb here is at least six months of data with at least 50 data points. Depending on the process, if there are over 1,000 data points, the time requirement goes away, since you have an extremely large data set to work with.

Q: During the six months, the process should be under control, right? And the data should follow a normal distribution, right? Is any process control needed? And how do I maintain this process and sigma?

A: Yes, there is the assumption that the process is normally distributed and stable. That means some type of process control is being used. Ideally this would be an X-bar and R or an X-bar and S chart. If an out-of-control situation occurs and you can bring the process back into control, then you are OK.

Q: Could you tell me the meaning of "data point"? As you know, during the six months we will get lots of batches. For each batch, we will have a certificate of analysis (COA) and many data values. I am not sure how you combine data from different batches. How do you calculate this?

A: A data point, in the simplest format, could be the statistics associated with a batch: a mean and a standard deviation or range. Each batch gives you a new set of data points. You can combine the time-based data in a couple of different ways:

1. You can take each batch and use the means and plot them on an X-bar and R or an X-bar and S-chart.
2. You can take the raw data and combine it into one large distribution.

The preferred way is the control chart approach, since the data is already plotted and you will know whether the process is stable.
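A sketch of the control-chart approach follows. The batch data are hypothetical, and the pooled sigma here is the df-weighted root mean square of the batch standard deviations, without the c4 bias correction some texts apply:

```python
import numpy as np

# Hypothetical batch data: each row is one batch's raw measurements.
batches = [np.array([ 99.8, 100.1, 100.3,  99.9]),
           np.array([100.2, 100.4, 100.0, 100.1]),
           np.array([ 99.7,  99.9, 100.0,  99.8])]

means = np.array([b.mean() for b in batches])
stds  = np.array([b.std(ddof=1) for b in batches])
dfs   = np.array([len(b) - 1 for b in batches])

pooled_sigma = np.sqrt(np.sum(dfs * stds**2) / np.sum(dfs))
grand_mean = means.mean()
n = len(batches[0])           # measurements per batch

# X-bar chart limits for plotting the batch means:
ucl = grand_mean + 3 * pooled_sigma / np.sqrt(n)
lcl = grand_mean - 3 * pooled_sigma / np.sqrt(n)
print(f"sigma = {pooled_sigma:.3f}, limits = ({lcl:.2f}, {ucl:.2f})")
```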

Jim Bossert
SVP Process Design Manager, Process Optimization
Bank of America
ASQ Fellow, CQE, CQA, CMQ/OE, CSSBB, CMBB
Fort Worth, TX