IVD Technology
Magazine
Originally published November, 1997
Formal experimental design and analysis for immunochemical product development
John A. Wass
Part 2: Designed experiments can minimize the number of experimental runs required to capture adequate data and optimize outputs.
In IVD manufacturing, design of experiments (DOE) is all about pinpointing the most efficient combination of inputs required to generate, at the lowest cost, a product that meets exacting specifications. The first part of this article (IVDT, September 1997) described the designs that are most commonly used for screening runs and response surfaces. This second part demonstrates the application of these principles, using a practical IVD manufacturing example. As the example proceeds, it will look at some of the techniques for dealing with unruly data.
Master lot testing for a serum protein to be used on an automated immunoanalyzer provides a good example of the application of designed experiments. The experiment discussed in detail here was designed and analyzed using a program called ECHIP. Some commercially available DOE software packages are described in the Software Resources sidebar at the end of this article.
Generating the Screening Design
To identify values of a control substance within a desirable range, researchers typically mix probe, conjugate, and microparticle reagents and test them using designed experiments. Table I outlines the screening design for immunochemical data representative of this example. The probe is a biotinylated antibody molecule used in standard immunologic sandwich techniques (linking two other molecules). The conjugate is a fluorescently labeled antibiotin molecule. The microparticles are antibody-coated beads. The reaction is the fluorescence generated as all of these molecules bind together with the antigen from the sample.
| Trial | Conjugate | Probe | Microparticle |
|---|---|---|---|
| 4 | 0.10 | 1.000 | 2.00 |
| 2 | 0.10 | 0.050 | 0.50 |
| 5 | 0.10 | 1.000 | 0.50 |
| 7 | 0.10 | 0.050 | 2.00 |
| 3 | 1.00 | 0.050 | 0.50 |
| 8 | 1.00 | 1.000 | 0.50 |
| 1 | 1.00 | 1.000 | 2.00 |
| 9 | 0.55 | 0.525 | 1.25 |
| 5 | 0.10 | 1.000 | 0.50 |
| 1 | 1.00 | 1.000 | 2.00 |
| 4 | 0.10 | 1.000 | 2.00 |
| 3 | 1.00 | 0.050 | 0.50 |
| 6 | 1.00 | 0.050 | 2.00 |
| 2 | 0.10 | 0.050 | 0.50 |
Table I. Immunochemical screening design. Concentrations are in arbitrary units and run order is randomized.
For the sake of simplicity, this example uses only probe, microparticle, and conjugate reagents. In actual experiments, however, there are usually more candidate input variables than are needed to describe the process. Because the number of variables should usually be reduced, the software initially generates a screening design.
The software generates a linear-with-center-point design that includes sufficient runs to assess the replicate error, which is considered the noise floor. This type of design is mathematically simple, yet adequate for identifying which inputs have a physically relevant effect. It is usually the screening design of choice, unless there is a compelling reason to select a more complex model. Unless otherwise requested, the order of experimental runs is randomized to control systematic error. Technicians then run the design in the laboratory.
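The structure of such a design can be sketched in a few lines of Python. This is only an illustration of a two-level factorial with a center point, replicate runs, and randomized order, using the low and high settings from Table I. The function name, the random choice of which conditions to replicate, and the seed are assumptions; the article does not describe how ECHIP selects its runs.

```python
import itertools
import random

# Low and high settings taken from Table I (arbitrary concentration units).
LEVELS = {
    "conjugate": (0.10, 1.00),
    "probe": (0.050, 1.000),
    "microparticle": (0.50, 2.00),
}

def screening_design(levels, n_replicates=5, seed=0):
    """Two-level factorial plus one center point, with randomly chosen
    replicate runs, returned in randomized run order (hypothetical helper)."""
    rng = random.Random(seed)
    names = list(levels)
    # All low/high combinations: 2**k corner points.
    corners = [dict(zip(names, combo))
               for combo in itertools.product(*(levels[n] for n in names))]
    # Center point: midpoint of each factor's range.
    center = {n: round((lo + hi) / 2, 3) for n, (lo, hi) in levels.items()}
    points = corners + [center]
    # Label each distinct condition with a trial number (1-based).
    trials = [dict(trial=i + 1, **pt) for i, pt in enumerate(points)]
    # Repeat a random subset of conditions to estimate replicate error.
    runs = trials + rng.sample(trials, n_replicates)
    rng.shuffle(runs)  # randomize run order to guard against systematic error
    return runs

design = screening_design(LEVELS)
for run in design:
    print(run)
```

With three factors this gives eight corner points, one center point, and five replicates, the same 14-run layout as Table I, although the trial numbering and run order will differ from the published table.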
Excluding Insignificant Inputs
The results of the screening tests are shown in Table II. This table displays the significance of the various inputs to the output (here only the reaction rate). The stars within the table denote the significance of each effect and reflect the alpha significance values from the analysis of variance test (ANOVA) performed by the software. One star denotes the 5% significance level; two stars, 1%; and three stars, 0.1%.
| Reaction Rate | Inputs |
|---|---|
| *** | Conjugate |
| *** | Probe |
| *** | Microparticle |
| LOF | |
Table II. Summary of results for screening design. Stars denote significance of test results (see text); LOF = lack of fit.
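The star notation amounts to a simple lookup on the ANOVA p-value. The sketch below assumes strict inequalities at each threshold; the article does not say how the software treats values that fall exactly on a boundary, and the function name is hypothetical.

```python
def significance_stars(p_value):
    """Map an ANOVA p-value to the star notation described in the text."""
    if p_value < 0.001:
        return "***"   # significant at the 0.1% level
    if p_value < 0.01:
        return "**"    # significant at the 1% level
    if p_value < 0.05:
        return "*"     # significant at the 5% level
    return ""          # not significant over the range tested

for p in (0.0004, 0.006, 0.03, 0.2):
    print(f"p = {p:<7} {significance_stars(p)}")
```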
In this example, all main effects are considered highly important to the measured output, the reaction rate. This result is not surprising, since all the reagents are needed to generate the reaction.
If no stars were associated with a particular reagent, their absence would not necessarily indicate a lack of importance to the reaction but merely a lack of importance over the range of the reagent concentration used in the experiment. This distinction is important, because the range may need to be broadened.
The "LOF" message at the bottom of the reaction column in Table II denotes a lack of fit for this model to the data. This lack of fit is not of great concern, because most biochemical and immunochemical data are not readily fit by the overly simplistic straight line of a screening model. To verify testing adequacy, the statisticians need only examine the size of the deviations from the line in the residual table (see Table III). There they will invariably find small areas of large-enough deviation to trigger the LOF message. These residuals represent the difference between the observed data and what was calculated by the model.
| Trial | Residuals |
|---|---|
| 1 | 0.56 |
| 2 | 0.65 |
| 3 | 0.04 |
| 4 | 0.05 |
| 5 | 0.71 |
| 6 | 1.24 |
| 7 | 0.31 |
| 8 | 0.14 |
| 9 | 0.66 |
Table III. Residuals from screening experiment.
The next step is to exclude those input factors found to be statistically insignificant in the screening outcome and to design a response surface experiment with only the most important factors. In this case, those factors are only the main effects: microparticle, probe, and conjugate (see Table IV).
| Trial | Microparticles | Probe | Conjugate |
|---|---|---|---|
| 1 | 0.50 | 1.000 | 1.00 |
| 13 | 1.25 | 0.050 | 0.10 |
| 5 | 2.00 | 1.000 | 1.00 |
| 4 | 0.50 | 0.050 | 1.00 |
| 12 | 1.25 | 1.000 | 0.10 |
| 15 | 0.50 | 0.525 | 0.10 |
| 4 | 0.50 | 0.050 | 1.00 |
| 8 | 1.25 | 0.525 | 1.00 |
| 3 | 0.50 | 1.000 | 0.10 |
| 3 | 0.50 | 1.000 | 0.10 |
| 14 | 2.00 | 0.525 | 1.00 |
| 2 | 2.00 | 0.525 | 0.10 |
| 1 | 0.50 | 1.000 | 1.00 |
| 9 | 0.50 | 0.050 | 0.10 |
| 2 | 2.00 | 0.525 | 0.10 |
| 5 | 2.00 | 1.000 | 1.00 |
| 11 | 2.00 | 1.000 | 0.55 |
| 10 | 2.00 | 0.050 | 0.55 |
| 7 | 0.50 | 0.525 | 0.55 |
| 6 | 1.25 | 1.000 | 0.55 |
Table IV. Response surface design. Concentrations are in arbitrary units and run order is randomized.
The experimenters may perform more runs at this stage to define the response surface more completely. In most screening experiments, however, the total number of required runs is minimized by eliminating a number of input factors (see Table V).
| Trial | Reaction Rate |
|---|---|
| 1 | 6.9 |
| 13 | 3.6 |
| 5 | 9.8 |
| 4 | 3.95 |
| 12 | 6.75 |
| 15 | 2.9 |
| 4 | 4.0 |
| 8 | 6.75 |
| 3 | 5.88 |
| 3 | 5.25 |
| 14 | 7.0 |
| 2 | 5.7 |
| 1 | 6.49 |
| 9 | 1.99 |
| 2 | 5.89 |
| 5 | 9.56 |
| 11 | 8.67 |
| 10 | 5.99 |
| 7 | 3.4 |
| 6 | 7.49 |
Table V. Response surface data. Reaction rates are in arbitrary units, and run numbers correspond to the input conditions given in the design table (Table IV).
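Because trials 1 through 5 were each run twice, the data in Table V allow a direct estimate of the replicate error (the noise floor). The following sketch pools the within-trial variation of the duplicated runs; the function and variable names are illustrative and are not taken from ECHIP.

```python
from collections import defaultdict
from math import sqrt

# (trial, reaction rate) pairs copied from Table V.
TABLE_V = [
    (1, 6.9), (13, 3.6), (5, 9.8), (4, 3.95), (12, 6.75),
    (15, 2.9), (4, 4.0), (8, 6.75), (3, 5.88), (3, 5.25),
    (14, 7.0), (2, 5.7), (1, 6.49), (9, 1.99), (2, 5.89),
    (5, 9.56), (11, 8.67), (10, 5.99), (7, 3.4), (6, 7.49),
]

def pure_error(observations):
    """Return (sum of squares, degrees of freedom) from replicated trials."""
    groups = defaultdict(list)
    for trial, rate in observations:
        groups[trial].append(rate)
    ss, df = 0.0, 0
    for rates in groups.values():
        if len(rates) > 1:  # only replicated conditions carry pure error
            mean = sum(rates) / len(rates)
            ss += sum((r - mean) ** 2 for r in rates)
            df += len(rates) - 1
    return ss, df

ss_pe, df_pe = pure_error(TABLE_V)
replicate_sd = sqrt(ss_pe / df_pe)
print(f"pure-error SS = {ss_pe:.4f} on {df_pe} df; replicate SD = {replicate_sd:.3f}")
```

For these data the pooled replicate standard deviation comes to roughly 0.26 rate units on 5 degrees of freedom, which gives a sense of how small an effect the experiment can resolve.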
The response surface significance summary is presented in Table VI. All main effects retain significance on the reaction rate, and it can now be seen that the interaction between the probe molecule and the conjugate also exerts a significant effect on the reaction rate. The three stars next to the probe-squared row indicate that the program had to bend the response surface in proportion to the square of the value of the probe molecule concentration. The LOF message is now gone, indicating an adequate fit of the model to the data.
| Reaction Rate | Inputs |
|---|---|
| *** | Microparticles |
| *** | Probe |
| *** | Conjugate |
| * | Microparticles × Probe |
| * | Microparticles × Conjugate |
| ** | Probe × Conjugate |
| * | Microparticles² |
| *** | Probe² |
| * | Conjugate² |
Table VI. Summary of response surface design showing the statistical importance of factors. Stars denote significance of test results (see text); dots indicate insignificant interactions (p<0.10).
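The terms listed in Table VI correspond to a full quadratic model in the three inputs. As a rough illustration, the sketch below fits such a model to the data in Tables IV and V by ordinary least squares, using only the Python standard library. The choice of ordinary least squares on uncoded concentrations is an assumption; the article does not describe ECHIP's fitting procedure, so the coefficients need not match the software's, and all function names are hypothetical.

```python
# Settings per trial from Table IV: (microparticles, probe, conjugate).
DESIGN = {
    1: (0.50, 1.000, 1.00),  2: (2.00, 0.525, 0.10),  3: (0.50, 1.000, 0.10),
    4: (0.50, 0.050, 1.00),  5: (2.00, 1.000, 1.00),  6: (1.25, 1.000, 0.55),
    7: (0.50, 0.525, 0.55),  8: (1.25, 0.525, 1.00),  9: (0.50, 0.050, 0.10),
    10: (2.00, 0.050, 0.55), 11: (2.00, 1.000, 0.55), 12: (1.25, 1.000, 0.10),
    13: (1.25, 0.050, 0.10), 14: (2.00, 0.525, 1.00), 15: (0.50, 0.525, 0.10),
}
# (trial, reaction rate) pairs from Table V, in run order.
RATES = [
    (1, 6.9), (13, 3.6), (5, 9.8), (4, 3.95), (12, 6.75),
    (15, 2.9), (4, 4.0), (8, 6.75), (3, 5.88), (3, 5.25),
    (14, 7.0), (2, 5.7), (1, 6.49), (9, 1.99), (2, 5.89),
    (5, 9.56), (11, 8.67), (10, 5.99), (7, 3.4), (6, 7.49),
]

def quadratic_terms(m, p, c):
    """Intercept, main effects, two-way interactions, and squared terms."""
    return [1.0, m, p, c, m * p, m * c, p * c, m * m, p * p, c * c]

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for r in reversed(range(n)):  # back substitution
        tail = sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x

def fit(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

X = [quadratic_terms(*DESIGN[trial]) for trial, _ in RATES]
y = [rate for _, rate in RATES]
coef = fit(X, y)

def predict(m, p, c):
    """Fitted reaction rate at the given input settings."""
    return sum(b * t for b, t in zip(coef, quadratic_terms(m, p, c)))

# Residual = observed value minus the value calculated by the model.
residuals = [rate - predict(*DESIGN[trial]) for trial, rate in RATES]
print("largest |residual|:", round(max(abs(e) for e in residuals), 2))
print("predicted rate at microparticles=2, probe=1, conjugate=1:",
      round(predict(2.0, 1.0, 1.0), 2))
```

The prediction at that corner can be compared with the duplicate observations for trial 5 (9.8 and 9.56). In practice a statistics package would also report standard errors and significance tests for each coefficient, which this sketch omits.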
Dealing with Lack of Fit
Had there actually been a lack of fit, several strategies could have been used to better fit the model to the data. These strategies are:
Do Nothing, and Accept the Lack of Fit. Lack of fit due to chance alone occurs about 5% of the time. The experimenters may therefore examine the residuals and, if they are sufficiently small or if the lack of fit occurs only in an area of the response surface not important to the physical process, ignore it and proceed.
Remove Certain Data. There may be cause to remove certain points due to known exceptions to the experimental protocol. The experimenters may also apply statistical tests for outlier status, but the best method is to repeat the experiment in those areas where the anomalies occurred.
Transform the Data. If there is no cause to remove data, transforming them is the easiest method. However, it is not a good idea to pull down a list of transforms and apply them one at a time to the data until the LOF message disappears. Certain transforms are most useful in certain situations, and they may affect data in unwanted ways when applied in a random, shotgun fashion. A second rule is not to go to heroic lengths to remove the lack of fit. If complex and lengthy mathematical manipulations are required, chances are the data are best left alone.
Use a More-Complex Model. Although it is sometimes of value, use of a more-complex model requires further data collection. Time and resource availability may be the deciding factors here.
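For reference, the textbook lack-of-fit test compares the variation the model fails to explain with the variation among replicates. The article does not state how ECHIP computes its LOF flag, so the following should be read as the standard form rather than as the software's exact procedure. Here $n$ is the total number of runs, $m$ the number of distinct conditions, $p$ the number of model coefficients, $y_{ij}$ the $j$th replicate at condition $i$, and $SS_{\text{res}}$ the residual sum of squares from the fitted model:

$$
SS_{\text{PE}} = \sum_{i=1}^{m}\sum_{j=1}^{n_i}\left(y_{ij} - \bar{y}_i\right)^2,
\qquad
SS_{\text{LOF}} = SS_{\text{res}} - SS_{\text{PE}},
\qquad
F = \frac{SS_{\text{LOF}}/(m - p)}{SS_{\text{PE}}/(n - m)}
$$

For the response surface design in Table IV, $n = 20$ and $m = 15$; a full quadratic model in three inputs has $p = 10$ coefficients, so both the numerator and the denominator would carry 5 degrees of freedom. A large $F$ relative to the reference distribution signals lack of fit.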
Assessing Data Adequacy
Once the lack-of-fit issue is resolved, the experimenters may assess data adequacy by using standard plots (see Figure 1). In a normal probability plot of the studentized residuals (residuals scaled by their estimated standard deviation), a straight-line relationship implies that the errors (the disparities between what was expected and what was actually observed) are normally distributed. The plot of residuals against fitted values documents that these errors are independent (the points are scattered and not clustered) with nearly constant variance (all the points lie within ±3 standard deviations).
Figure 1. Assessment of data adequacy by residual plots.
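The coordinates for a normal probability plot of this kind can be computed with the Python standard library. In the sketch below the studentized residuals are invented purely for illustration (they are not the values behind Figure 1), and Blom's plotting positions are one common convention among several; the article does not say which one the software uses.

```python
from statistics import NormalDist

def normal_scores(n):
    """Expected standard-normal quantiles for n ordered values,
    using Blom's plotting positions (i - 0.375) / (n + 0.25)."""
    dist = NormalDist()
    return [dist.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]

# Hypothetical studentized residuals, made up for this example only.
studentized = [-1.4, 0.3, 0.9, -0.2, 1.7, -0.8, 0.1, -0.5, 0.6]

# Pair each ordered residual with its expected normal quantile;
# points falling near a straight line suggest normally distributed errors.
pairs = list(zip(normal_scores(len(studentized)), sorted(studentized)))
for score, resid in pairs:
    print(f"{score:6.2f}  {resid:6.2f}")

# Flag any residual beyond the +/-3 band mentioned in the text.
outside = [r for r in studentized if abs(r) > 3]
print("points outside +/-3:", outside)
```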
Other plots are available and may yield further insights depending on the error distribution and region of interest in the data. Nevertheless, statistical testing can take researchers only so far. Continued data aberrations that arise from problems with instrumentation or chemistry may require staff engineers or immunochemists to intervene and change the design of the product.
Optimizing Outputs
Assuming that such testing and intervention are not required, the experimenter requests that the software optimize the reaction rate value to a desired number, perhaps maximizing it (see Figure 2) or requesting a specific target value. The software then generates not only the proper settings for the inputs but also a guard band for the outputs that yields the upper and lower 95% confidence limits. These limits reflect the errors of prediction and, more important, the error that will occur when a new observation is taken. A new measurement, taken at the given settings of the input variables, may therefore be expected to lie between these limits.
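Limits of this kind are conventionally computed as a prediction interval for a new observation. The standard least-squares form is shown here as an assumption, since the article does not give the formula ECHIP uses. With $\hat{y}_0$ the fitted value at the chosen settings, $\mathbf{x}_0$ the vector of model terms at those settings, $\mathbf{X}$ the model matrix for the $n$ runs, $p$ the number of coefficients, and $s^2$ the residual mean square:

$$
\hat{y}_0 \;\pm\; t_{0.975,\,n-p}\; s\,\sqrt{1 + \mathbf{x}_0^{\top}\left(\mathbf{X}^{\top}\mathbf{X}\right)^{-1}\mathbf{x}_0}
$$

The term $\mathbf{x}_0^{\top}(\mathbf{X}^{\top}\mathbf{X})^{-1}\mathbf{x}_0$ carries the uncertainty in the fitted surface, while the added 1 under the square root accounts for the error of the new measurement itself; dropping it would give the narrower confidence interval for the mean response.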
Figure 2. Two-dimensional response surface plot. Reaction rate as a function of probe and microparticles, with conjugate held constant (=1.00).
Figure 3. Three-dimensional response surface plot. Reaction rate as a function of probe and microparticles, with conjugate held constant (=1.00).
Figure 3 gives a three-dimensional overview of the design space, displaying how the output varies with two selected input variables. A third, off-axis variable is fixed at the value given below the graph. In Figure 2, the reaction rate is maximized under the crosshairs in the upper right corner of the design space (bounded by the red lines). This graph indicates that use of one unit of probe with two units of microparticles and one unit of conjugate will generate a reaction rate of about 9.5 units (in most cases, between 8.52 and 10.63 units).
The above methodology allows observation of input variable interactions and configuration of the system to allow derivation of a desired output variable. Many more inputs and outputs could be tested. Many variations are possible in both the design and the analytic strategy. These methodologies may even be applied to gain a better understanding of the physical mechanisms underlying the process, for example, whether antigen a is binding more strongly to antibody x or to antibody y.
Conclusion
Formal design of experiments is based on well-accepted statistics and computational algorithms that are easily implemented using commercially available software. The methodologies are flexible enough to apply to a wide variety of industries and useful in designing cost-effective strategies in many settings. DOE may enhance the experimenters' insight into many physical processes and actually speed discovery.
SOFTWARE RESOURCES
Many commercial software packages are available for formal design of experiments. The following list is far from all-inclusive; it represents those programs the author has used or examined. The commentary is meant to orient the reader rather than to be a comparative review. The particular nuances of any package may be more or less attractive to the user based on personal preference and experience.
ECHIP. The ECHIP program is devoted entirely to experimental design and is not a general statistics package. It offers many standard designs as well as the ability to create customized designs. The main design screen takes the novice step by step through the variable definitions, designs, data entry, and results analysis screens. The software includes a power/sample-size calculator that is very useful for assessing the resolving ability of an experiment, which is the ability to find a prespecified difference if one really exists. The user manual has many helpful examples. There is also a reference manual for those interested in the details behind the designs.
Contact: ECHIP, Inc., 724 Yorklyn Rd., Hockessin, DE 19707-8703, phone 302/239-5429.
Minitab. The newest version (release 11) of the Minitab statistical package has a simplified DOE interface that reduces the programming required to straightforward button-pushing. A design may be created by either of two methods. Those unfamiliar with the process may request assistance from the program. Standard as well as customized designs are available, and choices are made via familiar dialog boxes. A user manual reviews the designs via the interface, while the reference manual gives the programming steps.
Contact: Minitab, Inc., 3081 Enterprise Dr., State College, PA 16801-3008, phone 814/238-3280.
SAS. Strictly a statisticians' program, SAS in its current release allows experimental design only via programming. The methods and steps are well documented but geared to the statistically sophisticated. Presently the SAS Institute is developing a graphical user interface for its DOE system. A new release featuring this enhancement should be available soon.
Contact: SAS Institute, Inc., 100 SAS Campus Drive, Cary, NC 27513-2414, phone 919/677-8000.
SAS JMP. To more immediately address the needs of the novice designer, the SAS Institute has developed SAS JMP, a user-friendly package specifically for exploratory data analysis and experimental design. The DOE section uses JMP's colorful, interactive graphics and offers a variety of design types at the click of a button. The 2-D contours are informative. The manuals strive for clarity through a number of real-world examples.
Other Packages. The latest versions of the following programs contain DOE modules but have not been reviewed by the author: Systat (version 7.0), SPSS, Inc., 444 N. Michigan Ave., Chicago, IL 60611-3962, phone 312/329-2400; Statistica, StatSoft, 2300 E. 14th St., Tulsa, OK 74104, phone 918/749-1119.
Bibliography
Atkinson AC, and Donev AN, Optimum Experimental Designs, Oxford, England, Clarendon Press, 1992.
Myers RH, and Montgomery DC, Response Surface Methodology: Process and Product Optimization Using Designed Experiments, New York, John Wiley, 1995.
Schmidt SR, and Launsby RG, Understanding Industrial Designed Experiments, 4th ed, Colorado Springs, CO, Air Academy Press, 1997.
Wheeler B, ECHIP Reference Manual, Hockessin, DE, ECHIP, 1993.
John A. Wass is a mathematical analyst in the scientific support group at Abbott Laboratories (Abbott Park, IL).



