IVD Technology
Magazine
Originally Published March 2000
Dealing with discrepancy analysis, Part 2: Alternative analytical strategies
IVD product developers can save time and money by designing study protocols and selecting analytical methods with scientific validity uppermost in their minds.
Cheryl L. Hayden and Michael L. Feldstein
When IVD manufacturers perform product testing to compare the accuracy of a new test to that of an existing one, any discrepancies between the results of the two tests will naturally raise a number of questions. Although it is commonplace for researchers to resolve such questions by focusing their attention on the discrepant results, such a practice has the potential to introduce statistical bias into the researchers' analysis of the test data.
The first installment of this article discussed some of the sources of statistical bias that can be encountered during discrepancy analysis, and suggested some approaches that can enable researchers to avoid introducing bias into their studies. In this installment, the authors look at the techniques currently available to researchers, with special reference to their adherence to emerging FDA guidances.
The Impact of the Problem
When manufacturers are seeking to demonstrate the performance of a new test, the problems inherent in discrepancy analysis can present a significant obstacle. The precise effect of such methodological problems on a comparison between two assay methods depends on the expected performance of the assay under investigation.
For example, an assay for an analyte such as a tumor marker is generally not expected to be especially sensitive or specific. Acceptable sensitivity for such a marker would be 75 to 80%, with comparable specificity. Furthermore, the reference assay would be expected to have accuracy in about the same range.
When the accuracies of both the reference assay and the assay under investigation are in such a range, it can be useful to determine which of the two tests better reflects the subject's disease status at the time the samples were collected. As described in the first installment of this article, examination of discrepant results can help to discover this information by establishing the accuracy of each test relative to a third, confirmatory assay. The results of such an examination can be reported as a separate analysis, supplementing the initial report on sensitivity and specificity developed by comparing the new assay to the reference assay.
Another, more rigorous, approach would be to determine the subject's disease status at the time that each sample is collected, and to report sensitivity and specificity for each method using patient status as the gold standard. In this manner, researchers can determine the clinical utility of both methods, and can compare them using results drawn from the same set of patient samples.
The effects of the methodological problems in discrepancy analysis are quite different when either the sensitivity or specificity of the test under investigation is very high. If the new test is theoretically more sensitive, more specific, and more accurate than the reference assay, researchers can find themselves without an accepted method for demonstrating the true performance of the new test.
Such is the case with viral DNA or RNA amplification tests, which are expected to have sensitivity in the range of 85 to 99% and specificity between 94 and 99%. Because DNA or RNA target amplification yields thousands of copies/ml of amplicon, the sensitivity of such tests is inherently greater than that of immunoassay methods that are used to detect antibodies to the same infectious organisms. Without some kind of further analysis, comparison to the results of standard tests could grossly underestimate the true performance of the new test.
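The size of this underestimate can be illustrated with a short calculation. The following is only a sketch under stated assumptions: the prevalence and the true sensitivity and specificity values are hypothetical figures chosen for illustration, and the two assays are assumed to err independently of each other given the subject's true status. None of these figures comes from a real study.

```python
def expected_relative_performance(prevalence, new_sens, new_spec, ref_sens, ref_spec):
    """Expected sensitivity and specificity of a new assay measured against
    an imperfect reference assay.

    Assumption (not from the article): the two assays make errors
    independently of each other, given the subject's true disease status.
    """
    diseased = prevalence
    healthy = 1.0 - prevalence

    # Probability that the reference assay is positive, and that both are positive.
    ref_pos = diseased * ref_sens + healthy * (1.0 - ref_spec)
    both_pos = (diseased * new_sens * ref_sens
                + healthy * (1.0 - new_spec) * (1.0 - ref_spec))

    # Probability that the reference assay is negative, and that both are negative.
    ref_neg = 1.0 - ref_pos
    both_neg = (diseased * (1.0 - new_sens) * (1.0 - ref_sens)
                + healthy * new_spec * ref_spec)

    return both_pos / ref_pos, both_neg / ref_neg


# Hypothetical inputs: a highly accurate new assay and a weaker reference assay.
rel_sens, rel_spec = expected_relative_performance(
    prevalence=0.20, new_sens=0.99, new_spec=0.99, ref_sens=0.80, ref_spec=0.95
)
print(f"Relative sensitivity vs. reference: {rel_sens:.1%}")
print(f"Relative specificity vs. reference: {rel_spec:.1%}")
```

Under these assumed inputs, a new assay whose true sensitivity is 99% appears to have a relative sensitivity of only about 79%, because many of its apparent "false positives" and "false negatives" are in fact errors made by the reference assay.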
Reporting Discrepant Results
Correcting sensitivity and specificity analyses by using results from a discrepancy analysis can introduce bias into the final statistical analysis. However, researchers can avoid this difficulty by making use of several other approaches that have greater scientific and statistical validity.
| New Assay | Existing Assay Positive | Existing Assay Negative | Total |
| Positive | 130 | 20 | 150 |
| Negative | 5 | 245 | 250 |
| Total | 135 | 265 | 400 |
Relative sensitivity: 130/135 x 100 = 96.3%
Relative specificity: 245/265 x 100 = 92.5%
Relative accuracy: [(130 + 245)/400] x 100 = 93.8%
Table I. A 2 x 2 table comparing a hypothetical new assay to an existing assay.
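The relative measures reported with Table I can be reproduced with a few lines of code. This is a minimal sketch using the counts from Table I; the function and variable names are illustrative choices, not part of any standard library.

```python
def relative_measures(both_pos, new_pos_only, new_neg_only, both_neg):
    """Return relative sensitivity, specificity, and accuracy (as percentages)
    of a new assay, treating the existing assay as the comparator."""
    existing_pos = both_pos + new_neg_only   # column total: existing assay positive
    existing_neg = new_pos_only + both_neg   # column total: existing assay negative
    total = existing_pos + existing_neg

    sensitivity = 100.0 * both_pos / existing_pos
    specificity = 100.0 * both_neg / existing_neg
    accuracy = 100.0 * (both_pos + both_neg) / total
    return sensitivity, specificity, accuracy


# Counts taken from Table I.
sens, spec, acc = relative_measures(
    both_pos=130,      # new positive, existing positive
    new_pos_only=20,   # new positive, existing negative
    new_neg_only=5,    # new negative, existing positive
    both_neg=245,      # new negative, existing negative
)
print(f"Relative sensitivity: {sens:.1f}%")
print(f"Relative specificity: {spec:.1f}%")
print(f"Relative accuracy:    {acc:.1f}%")
```

Because the existing assay is the comparator, these figures describe agreement with that assay only. They say nothing about which assay is correct for the 25 samples on which the two disagree.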
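The easiest and least costly approach is simply to present the data as they are; that is, to create a 2 x 2 table comparing the new assay to the existing assay and to report relative sensitivity, specificity, and accuracy (see Table I).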
| Table I Results | Third Assay Positive | Third Assay Negative | Total |
| New assay positive, existing assay negative | 15 | 5 | 20 |
| New assay negative, existing assay positive | 1 | 4 | 5 |
| Total | 16 | 9 | 25 |
Table II. A 2 x 2 table showing evaluation of the discrepant results from Table I.
A second approach is to present the data as they are, but also to conduct additional investigation and analysis of the samples that showed disagreement (see Table II, above). Such additional investigations could include any or all of the following.
- Reassay the discrepant samples by both methods, if appropriate for the analyte.
- Assay the discrepant samples by a third method that is considered more accurate.
- Gather and evaluate additional clinical information from the time of sample collection.
The additional information obtained by these methods should not be incorporated into the original statistical analysis as a corrected analysis, but should be reported separately. This approach still limits researchers to reporting relative sensitivity and specificity. However, it may also enable them to determine which assay more accurately reflects analyte level or patient status. In this hypothetical example, in which the discrepant results from Table I are reassayed using a third method, as shown in Table II, it is apparent that the new assay is more accurate than the existing assay. However, for such a conclusion to reflect "truth," the third assay must measure the analyte extremely accurately by an independent method. If the third assay measures the analyte in a manner similar to that of the new assay, the results presented will be biased in favor of the new assay.
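The conclusion drawn from Table II can be checked by counting how often the third assay sides with each of the two original assays. The sketch below uses the Table II counts; the variable names are illustrative only.

```python
# Third-assay results for the 25 discrepant samples, from Table II.
# Each entry: (third assay positive, third assay negative)
new_pos_existing_neg = (15, 5)
new_neg_existing_pos = (1, 4)

# The third assay agrees with the new assay when both are positive
# (first row, first column) or both are negative (second row, second column).
agrees_with_new = new_pos_existing_neg[0] + new_neg_existing_pos[1]

# It agrees with the existing assay in the remaining cells.
agrees_with_existing = new_pos_existing_neg[1] + new_neg_existing_pos[0]

total_discrepant = sum(new_pos_existing_neg) + sum(new_neg_existing_pos)

print(f"Third assay agrees with new assay:      {agrees_with_new} of {total_discrepant}")
print(f"Third assay agrees with existing assay: {agrees_with_existing} of {total_discrepant}")
```

This tally is a separate, supplementary result. It covers only the discrepant samples, so it should not be folded back into the relative sensitivity and specificity computed from Table I.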
| Table I Results | Third Assay Positive | Third Assay Negative | Total |
| New assay positive, existing assay negative | 15 | 5 | 20 |
| New assay negative, existing assay positive | 1 | 4 | 5 |
| Both assays positive | 25 | 5 | 30 |
| Both assays negative | 3 | 27 | 30 |
Table III. Table showing evaluation of discrepant results from Table I, plus a randomly selected group of nondiscrepant samples.
A slightly more rigorous version of this approach is to perform additional studies on all of the discrepant samples plus a randomly selected group of the nondiscrepant samples (see Table III). Using this approach enables researchers to estimate the accuracy of both assay methods, using a third method as the gold standard (see Tables IV(a) and IV(b)). This approach treats all possible initial test results similarly, selecting for retesting representative samples of results from all categories: true positives, true negatives, false positives, and false negatives. It has practical appeal because it does not require time-consuming and expensive retesting of every sample by a third method (as described below). Judging from past discussions between FDA and industry, this method may also offer a reasonable compromise that is acceptable to the agency in premarket submissions.
| Existing Assay Results | Third Assay Positive | Third Assay Negative | Total |
| Positive | 26 | 9 | 35 |
| Negative | 18 | 32 | 50 |
| Total | 44 | 41 | 85 |
Accuracy: [(26 + 32)/85] x 100 = 68.2%
Table IV(a). A 2 x 2 table comparing the hypothetical existing assay to a third assay defined as the gold standard.
| New Assay Results | Third Assay Positive | Third Assay Negative | Total |
| Positive | 40 | 10 | 50 |
| Negative | 4 | 31 | 35 |
| Total | 44 | 41 | 85 |
Accuracy: [(40 + 31)/85] x 100 = 83.5%
Table IV(b). A 2 x 2 table comparing the hypothetical new assay to a third assay defined as the gold standard.
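Tables IV(a) and IV(b) follow directly from Table III: each is obtained by collapsing the four rows of Table III according to the result of one assay. The following sketch shows that derivation using the Table III counts; the data layout and function names are illustrative assumptions.

```python
# Table III: (new assay result, existing assay result) ->
#            (third assay positive, third assay negative)
table_iii = {
    ("pos", "neg"): (15, 5),
    ("neg", "pos"): (1, 4),
    ("pos", "pos"): (25, 5),
    ("neg", "neg"): (3, 27),
}

NEW, EXISTING = 0, 1  # position of each assay's result in the row key


def versus_third(table, assay):
    """Collapse Table III into a 2 x 2 table for one assay vs. the third assay.

    Returns {"pos": [third_pos, third_neg], "neg": [third_pos, third_neg]}.
    """
    cells = {"pos": [0, 0], "neg": [0, 0]}
    for results, (third_pos, third_neg) in table.items():
        row = results[assay]
        cells[row][0] += third_pos
        cells[row][1] += third_neg
    return cells


def accuracy(cells):
    """Percentage of samples on which the assay agrees with the third assay."""
    agree = cells["pos"][0] + cells["neg"][1]
    total = sum(cells["pos"]) + sum(cells["neg"])
    return 100.0 * agree / total


existing_cells = versus_third(table_iii, EXISTING)  # Table IV(a)
new_cells = versus_third(table_iii, NEW)            # Table IV(b)

print("Existing assay:", existing_cells, f"accuracy {accuracy(existing_cells):.1f}%")
print("New assay:     ", new_cells, f"accuracy {accuracy(new_cells):.1f}%")
```

Note that the 85 retested samples contain all of the discrepant results but only a sample of the concordant ones, so these accuracy figures describe the retested subset rather than the full set of 400 samples.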
A third approach, and by far the most rigorous, is to reassay all samples by a third method whose results are considered definitive, as described in the first installment of this article. This approach is possible, of course, only if such a method exists.
A variation of this approach would be to obtain definitive clinical information about every sample, and to use this information to define clinical truth in the analysis. When valid clinical criteria are identified prior to initiating the study, this approach enables researchers to determine the absolute sensitivity and specificity of both test methods being evaluated, and thereby to compare their accuracy. If the new assay is truly more accurate than the reference test, this method has the advantage of providing the data needed to demonstrate such superiority. FDA has previously accepted this approach for use with tumor marker tests, and there is some reason to think that the agency would accept such a precedent as a model for other analytes. However, it is also the most costly approach.
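When clinical truth is available for every sample, absolute sensitivity and specificity can be computed for both assays from the same set of samples. The sketch below shows one way this calculation might be organized. The `Sample` record and the eight sample values are entirely hypothetical and serve only to make the example runnable; they do not represent study data.

```python
from typing import Iterable, NamedTuple, Tuple


class Sample(NamedTuple):
    """One patient sample (hypothetical record layout)."""
    diseased: bool   # clinical truth, by criteria fixed before the study began
    new_pos: bool    # result of the new assay
    ref_pos: bool    # result of the reference assay


def absolute_performance(samples: Iterable[Sample], field: str) -> Tuple[float, float]:
    """Return (sensitivity, specificity) of one assay against clinical truth.

    `field` names the assay result to evaluate: "new_pos" or "ref_pos".
    Assumes at least one diseased and one non-diseased sample.
    """
    tp = fn = tn = fp = 0
    for sample in samples:
        positive = getattr(sample, field)
        if sample.diseased:
            if positive:
                tp += 1
            else:
                fn += 1
        else:
            if positive:
                fp += 1
            else:
                tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Made-up illustrative samples: four diseased, four non-diseased.
samples = [
    Sample(True, True, True),
    Sample(True, True, False),
    Sample(True, True, True),
    Sample(True, False, False),
    Sample(False, False, False),
    Sample(False, False, True),
    Sample(False, False, False),
    Sample(False, True, False),
]

new_sens, new_spec = absolute_performance(samples, "new_pos")
ref_sens, ref_spec = absolute_performance(samples, "ref_pos")
print(f"New assay:       sensitivity {new_sens:.0%}, specificity {new_spec:.0%}")
print(f"Reference assay: sensitivity {ref_sens:.0%}, specificity {ref_spec:.0%}")
```

Because every sample is classified against the same clinical criteria, neither assay serves as the comparator for the other, and the two sets of figures can be compared directly.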
FDA Guidance
The National Committee for Clinical Laboratory Standards (NCCLS) has published an approved guideline for method comparison and bias estimation using patient samples, which includes a discussion of ways to handle outliers when comparing assay methods (1). However, this guideline was written for use by laboratories seeking to compare a new assay method to a previously used method for reporting similar assay results. The methods and sample sizes described in the guideline are unlikely to be adequate for establishing the clinical utility of a new assay that a manufacturer is attempting to bring to market.
Although FDA has historically accepted the use of discrepancy resolution analyses in premarket submissions for infectious-disease IVDs, the agency's current guidelines do not provide a method for resolving the issues that such use may raise. The agency's draft guidance on labeling says only that "discrepancy analysis may be performed and described in the submission and package insert if statistically valid techniques are employed," and encourages manufacturers to consult with the agency's reviewers or statisticians about appropriate techniques (2).
The Office of Surveillance and Biometrics at FDA's Center for Devices and Radiological Health has been developing a guidance that specifically addresses this problem, but no schedule has been announced for its publication. It is not known what approaches to discrepancy resolution FDA's eventual guidance might recommend, but it is certain that the agency will somehow have to deal with the problems inherent in the approach most commonly used for discrepancy analysis. Agency acceptance of some of the scientifically and statistically valid alternatives outlined above would go a long way toward enabling industry researchers to prepare useful information for their product approval submissions.
Whatever guidance FDA chooses to offer, however, manufacturers would do well to remember that agency guidances are advisory only and do not constitute regulatory requirements. With careful strategic planning that takes into account FDA expectations and current scientific thought, it is possible to negotiate with the agency to accept analyses that can be shown to be scientifically and statistically valid. To facilitate this process, companies should consult with key FDA reviewers and their department heads well in advance of submitting any data.
The approaches suggested here may or may not be appropriate for evaluating a particular new product. The statistical procedures and regulatory requirements for comparing new tests to those currently in use, and for any subsequent discrepancy analyses, are still undergoing refinement. Because that process is a public one, manufacturers should be active participants in reviewing and commenting on new agency guidances.
References
1. Method Comparison and Bias Estimation Using Patient Samples, approved guideline, NCCLS document EP9-A (Wayne, PA: National Committee for Clinical Laboratory Standards, 1995).
2. "Guidance on Labeling for Laboratory Tests," (Rockville, MD: Division of Clinical Laboratory Devices, Office of Device Evaluation, Center for Devices and Radiological Health, FDA, 1999), p 4.
Cheryl L. Hayden is a principal research scientist at New England Research Institutes (Watertown, MA); Michael L. Feldstein is director of clinical services at Medical Device Consultants Inc. (North Attleboro, MA).