Originally Published IVD Technology May 2005

Regulations & Standards

Mock regulatory submissions using microarray data

Laura Reid, Rich Cohn, Wendell Jones, Thomas Goralski, and Steve McPhail
Figure 1. Format of the mock submission.

Current IVD tests analyze one protein, DNA, or RNA marker at a time. In contrast, microarray experiments simultaneously measure the expression of thousands of gene sequences arranged in a microscopic grid. While such genomewide molecular tests have multiple applications, their multiplex format generates complex data sets that are difficult to manage and review. With microarray results being maintained in large electronic spreadsheets, key questions about data variability and standardization methods must be resolved as this postgenomic research tool transitions to patient care.

FDA has recognized the need for a cooperative effort among regulatory, industry, and technology experts to address the challenges of microarray data in diagnostic applications.1,2 The agency has been sponsoring meetings, releasing documents, and soliciting voluntary submissions to help develop appropriate guidelines for microarray test results.3,4 Although these early submissions contain actual microarray results for regulatory review, the data are provided to assist in organizational learning and need not be related to any other FDA submission or application.

Scientists at Expression Analysis Inc. (Durham, NC) initiated a collaboration with investigators at Schering-Plough Research Institute (Lafayette, NJ) to provide a mock submission of microarray data to the Nonclinical Pharmacogenomics Subcommittee of the Pharmacology-Toxicology Coordinating Committee at FDA’s Center for Drug Evaluation and Research (CDER). The goals of this exercise were to provide a framework for defining recommendations, contribute to a consensus about specific guidance issues, and assist in building a process by which microarray data may be submitted to FDA. This article summarizes the content of the mock submission, the lessons learned during the review process, and the value gained by all participants.

Data Set

The data set for the mock submission was derived from a toxicity study performed at the Schering-Plough Research Institute. Female rats were treated with either a vehicle control or a compound designed to reduce cholesterol levels by inhibiting HMGCoA reductase. The animals were sacrificed after 1, 7, and 30 days of dosing. RNA was extracted from liver samples for gene expression analysis. Additional assays were performed before and after the animals were sacrificed in order to identify any toxicological effects of the compound.

Biotinylated cRNA targets were prepared from three control animals and three treated animals for each time point. Eighteen targets were hybridized to rat genome U34A GeneChips by Affymetrix (Santa Clara, CA), which contain short oligonucleotide probes representing more than 7000 well-characterized rat genes. Microarray processing was performed at a facility unrelated to Expression Analysis, and the raw image files were used for the data analyses and quality reviews included in the mock submission.

Mock Submission Format

Due to the lack of any guidelines, the mock submission document did not follow a specific format. The mock submission was primarily designed to address developing needs in the microarray community and serve as an educational tool. As such, the mock submission also included background and descriptive information when appropriate. In future submissions to FDA, the degree of regulatory review will likely depend on the stage of product development, the biological questions being addressed, and the role of microarray data in supporting the evaluation of safety and efficacy.

Nevertheless, the mock submission was divided into seven sections (see Figure 1). The first two sections included details of the laboratory and data management infrastructure. The third section described controls that were used to evaluate array performance. The following two sections discussed statistical issues, such as bias mitigation and data analysis methods. The final two sections included toxicology results and an interpretation of the genomic data so that the microarray results could be linked to a biological system.

Laboratory Protocols and Data Management

Even though microarray laboratories follow similar protocols, their procedures, equipment, and reagents vary. Since such differences can affect data comparability, the mock submission began with a description of lab protocols to demonstrate the competency of the facility that generated the microarray data.

The laboratory infrastructure section contained details of the microarray design, detection method, and laser scanner. This section also included accuracy and reproducibility calculations demonstrating the microarray processing facility’s competency. Such details were based on validation data generated at the microarray facility, but were not technically part of the data set in the mock submission. For this reason, the laboratory quality data were separated from the study-specific quality data.

Although not all microarray facilities follow good laboratory practices (GLPs) or comply with 21 CFR Part 11 requirements, the thorough documentation and training required for GLP compliance provide an essential resource for identifying and quantifying the impact of laboratory variables over time. The informatics infrastructure section described a GLP-compliant informatics program, including validated software, a laboratory information management system, and relevant data transfer protocols. FDA indicated that this documentation and system validation were useful but should be maintained by the source institution.

RNA Standards and Quality Control Metrics

Several microarray controls can evaluate performance at multiple steps during target preparation and data collection. In the mock submission, descriptions of several quality parameters, including those not in the original study, were provided as a reference.

Figure 2. Standards and quality control metrics. The figure illustrates typical RNA labeling and microarray hybridization steps based on Affymetrix protocols. RNA standards and QC metrics described in the mock submission are highlighted in boxes and ovals, respectively.

Controls were divided into two complementary categories (see Figure 2). One category is RNA standards, which are external reference materials that confirm the reproducibility of the microarray protocols and reagents. Three microarray standards were described in the mock submission:

• Labeling controls are sense-strand, polyadenylated transcripts that are spiked into the total RNA sample before cDNA synthesis.
• Hybridization controls are antisense, biotin-labeled cRNA fragments that are prepared separately and then added to the hybridization cocktail for each sample.
• Batch controls are reference RNA samples that are labeled and hybridized concurrently with the biological samples. They are useful for monitoring microarray performance in larger experiments when the biological samples are segregated into sets or batches that are processed together using the same master mixes.

In the mock submission, the Affymetrix labeling and hybridization standards are referred to as polyA controls and eukaryotic controls, respectively. The transcripts are generated from bacterial and viral sequences that will not cross-hybridize to probes on eukaryotic arrays.

Similar labeling and hybridization standards are available on other microarray platforms, but their transcripts are derived from different species. This lack of standardization complicates cross-platform comparisons.

The External RNA Controls Consortium is developing a set of universal labeling standards (or external RNA controls) that will be useful for sample control on a variety of microarray platforms and by quantitative real-time polymerase chain reaction (qRT-PCR).5 The Clinical and Laboratory Standards Institute (CLSI; Wayne, PA) has a complementary initiative.6

The other category of controls is quality control (QC) metrics, which are measurements collected during target generation and hybridization that can evaluate the quality of microarray data. Four types of QC metrics were described in the mock submission and are distinguished by the template being evaluated:

• RNA sample metrics (e.g., 28S rRNA/18S rRNA).
• cDNA metrics (e.g., detection of high- and low-abundance transcripts by qRT-PCR).
• cRNA metrics (e.g., IVT yield, 260/280).
• Hybridization metrics (e.g., background, percent present, 3'/5' ratios).

Figure 3. Data evaluation process.

Not all laboratories use each of the RNA standards, nor do they collect all possible QC metrics. For the data set used in the mock submission, only data for the hybridization controls and hybridization metrics were available. For the other standards and metrics, results from typical experiments were presented. FDA reviewers were interested in the microarray controls and sought straightforward accept/reject criteria based on appropriate metrics. RNA, cRNA, and hybridization metrics, as well as image review, were identified as appropriate data evaluation metrics (see Figure 3).
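
The kind of accept/reject check that reviewers asked for can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the metric names, the function evaluate_array, and every threshold value below are hypothetical placeholders, not criteria taken from the mock submission or from FDA. A laboratory would substitute limits derived from its own validation data.

```python
from typing import Dict, List, Optional, Tuple

Limits = Dict[str, Tuple[Optional[float], Optional[float]]]

# Hypothetical (lower, upper) acceptance limits; None means no limit on that side.
# Placeholder values for illustration only, not published or FDA criteria.
HYPOTHETICAL_LIMITS: Limits = {
    "rrna_28s_18s_ratio": (1.5, None),  # RNA sample metric
    "a260_a280": (1.8, 2.2),            # cRNA metric
    "background": (None, 100.0),        # hybridization metric
    "percent_present": (30.0, None),    # hybridization metric
    "ratio_3p_5p": (None, 3.0),         # hybridization metric
}


def evaluate_array(metrics: Dict[str, float],
                   limits: Limits = HYPOTHETICAL_LIMITS) -> Tuple[bool, List[str]]:
    """Return (accepted, reasons) for one array; reasons lists each failed check."""
    reasons: List[str] = []
    for name, (low, high) in limits.items():
        value = metrics.get(name)
        if value is None:
            reasons.append(f"{name}: not reported")
        elif low is not None and value < low:
            reasons.append(f"{name}: {value} is below {low}")
        elif high is not None and value > high:
            reasons.append(f"{name}: {value} is above {high}")
    return (not reasons, reasons)


# Made-up metrics for a single array
example = {
    "rrna_28s_18s_ratio": 1.9,
    "a260_a280": 2.0,
    "background": 62.0,
    "percent_present": 24.5,
    "ratio_3p_5p": 1.4,
}
accepted, reasons = evaluate_array(example)
print("accept" if accepted else "reject", reasons)
```

Returning the list of failed checks alongside the decision keeps the rule transparent, which matters more to a reviewer than the specific cutoffs chosen here.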

Array Performance and Validation

In addition to the RNA standards and QC metrics, the mock submission included evidence of the validity of the data set. This evidence took the form of reproducibility data between appropriate sample replicates, which indicate how much confidence can be placed in the resulting lists of differentially expressed genes.

The mock submission included three biological replicates for each condition. To examine the reproducibility of the data set, the array results generated from different samples representing the same experimental condition were compared. Three reproducibility statistics were calculated:

• Correlation. The concordance of the intensity of each probe cell in one hybridization graphed against its corresponding intensity in a replicate hybridization.
• Detection call agreement. The number of transcripts called present in both samples.
• Signal value agreement. The number of transcripts with a less-than-twofold difference in signal values between replicates. Transcripts called absent in both samples are excluded, and signal values below 64 are censored.
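
As a rough sketch, the three statistics could be computed for one replicate pair as follows. Several details are assumptions rather than the authors' implementation: the function names are invented here, the correlation is taken to be Pearson's r, and "censored" is interpreted as raising any signal below 64 to 64 before the fold difference is calculated.

```python
import math
from typing import List, Sequence, Tuple


def probe_correlation(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation of probe-cell intensities (assumed correlation type)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def detection_call_agreement(present_a: Sequence[bool],
                             present_b: Sequence[bool]) -> int:
    """Number of transcripts called present in both replicates."""
    return sum(1 for pa, pb in zip(present_a, present_b) if pa and pb)


def signal_value_agreement(signal_a: Sequence[float], signal_b: Sequence[float],
                           present_a: Sequence[bool], present_b: Sequence[bool],
                           floor: float = 64.0) -> Tuple[int, int]:
    """Return (n_agree, n_considered) for the less-than-twofold criterion.

    Transcripts called absent in both replicates are skipped, and signals
    below `floor` are raised to `floor` (assumed meaning of "censored").
    """
    n_agree = 0
    n_considered = 0
    for sa, sb, pa, pb in zip(signal_a, signal_b, present_a, present_b):
        if not (pa or pb):
            continue  # absent in both samples: excluded
        n_considered += 1
        sa, sb = max(sa, floor), max(sb, floor)
        if max(sa, sb) / min(sa, sb) < 2.0:
            n_agree += 1
    return n_agree, n_considered


# Made-up values for four transcripts in two replicate hybridizations
sig_a: List[float] = [100.0, 500.0, 30.0, 40.0]
sig_b: List[float] = [150.0, 90.0, 200.0, 10.0]
call_a = [True, True, False, False]
call_b = [True, False, True, False]
print(detection_call_agreement(call_a, call_b))
print(signal_value_agreement(sig_a, sig_b, call_a, call_b))
```

Reporting the number of transcripts considered next to the number in agreement makes it easy to express the result as a percentage when comparing replicate pairs.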

Table I. Three reproducibility measures were calculated for the three possible replicate pairs in each of the treatment groups. Average reproducibility measures are shown. The signal value agreement is calculated from the subset of transcripts detected in at least one sample using a lower-limit signal value of 64.

The mock submission found that the data set had good reproducibility (see Table I). qRT-PCR is often considered the gold standard for RNA expression research and has been used extensively to validate microarray platforms. However, while qRT-PCR confirmation may be useful for some transcripts, there is considerable debate as to whether it is necessary for every study that includes microarray data. For most purposes, and given a validated platform, the reproducibility of the data combined with a set of appropriate QC metrics should be sufficient for judging array performance. In the mock submission, minimal qRT-PCR data were included, recognizing that investigators may choose to reserve qRT-PCR confirmation studies for transcripts with borderline significance values or biologically critical genes.

Data Analysis and Format

Data analysis may be one of the most variable aspects in microarray research.7 Future FDA submissions will likely utilize a variety of data extraction and analysis protocols. The mock submission provided two measures of gene expression: MicroArray Suite (MAS) 5.0 by Affymetrix and perfect match (PM)-background. Several other measures may also be suitable, depending on the number of samples analyzed.

Figure 4. Comparison of alternative signal measures. Each panel plots the average log2 signal of two similar sample sets (labeled G1 and G2). Signal values were calculated using either the PM-background method (A) or the MAS 5.0 algorithm (B). The data are derived from an Affymetrix Latin Square experiment (1251 run), where G1 is the average of samples n, o, and p, and G2 is the average of samples q, r, and s. Transcripts spiked at different concentrations into the two sample sets are highlighted in purple.

Each transcript on an Affymetrix GeneChip is represented by a set of perfect match (PM) probes, which are complementary to the target, and mismatch (MM) probes, which contain a single nucleotide difference. The MAS 5.0 algorithm calculated a signal value from the hybridization intensities of the PM probes after subtracting both background and the nonspecific hybridization estimated from the MM probes. The PM-background measure calculated a similar signal value, except that only local background was subtracted from the PM intensities. The two measures also used different normalization techniques and filtering criteria. Because it does not consider MM probes, the PM-background measure often reduces the variance in microarray data, especially at lower abundance levels (see Figure 4).
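
The difference between the two approaches can be illustrated with a toy calculation. The sketch below is deliberately simplified and is not the MAS 5.0 algorithm or the authors' PM-background implementation: it omits normalization, filtering, and robust averaging, and the function names, the floor value, and the intensities are all made up for illustration.

```python
import math
from statistics import pstdev
from typing import List, Sequence


def pm_background_values(pm: Sequence[float], background: float,
                         floor: float = 1.0) -> List[float]:
    """Per-probe log2(PM - local background); MM probes are not used."""
    return [math.log2(max(p - background, floor)) for p in pm]


def pm_mm_values(pm: Sequence[float], mm: Sequence[float], background: float,
                 floor: float = 1.0) -> List[float]:
    """Per-probe log2 of PM after removing background and an MM-based
    estimate of nonspecific hybridization (a simplification, not MAS 5.0)."""
    values = []
    for p, m in zip(pm, mm):
        nonspecific = max(m - background, 0.0)
        values.append(math.log2(max(p - background - nonspecific, floor)))
    return values


# Made-up intensities for one low-abundance probe set
pm = [120.0, 130.0, 110.0]
mm = [100.0, 125.0, 60.0]
background = 50.0

pm_bg = pm_background_values(pm, background)
pm_mm = pm_mm_values(pm, mm, background)

# Spread of the per-probe values under each measure
print(round(pstdev(pm_bg), 2), round(pstdev(pm_mm), 2))
```

When PM and MM intensities are close, as they tend to be for weakly expressed transcripts, subtracting MM leaves a small and noisy remainder, so the per-probe values in this example spread out more than the background-only values.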

Up- and down-regulated genes were identified using the significance analysis of microarrays (SAM) method.8 Such gene lists were presented in electronic files. Clustering was also used as a method of detecting potentially coregulated genes.
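
For readers unfamiliar with SAM, the per-gene statistic at its core (the "relative difference" described in reference 8) can be sketched as below. This is a partial illustration only: the function name is invented here, the fudge factor s0 is passed in rather than estimated from the data, and the permutation step that SAM uses to estimate the false discovery rate is omitted. It does not show how the analysis in the mock submission was actually run.

```python
import math
from typing import Sequence


def sam_relative_difference(treated: Sequence[float],
                            control: Sequence[float],
                            s0: float = 0.0) -> float:
    """Relative difference d = (mean_treated - mean_control) / (s + s0).

    s is the pooled standard error of the difference in means; s0 is a small
    positive constant that damps the statistic for low-variance genes.
    """
    n1, n2 = len(treated), len(control)
    mean1 = sum(treated) / n1
    mean2 = sum(control) / n2
    sum_sq = (sum((x - mean1) ** 2 for x in treated)
              + sum((x - mean2) ** 2 for x in control))
    a = (1.0 / n1 + 1.0 / n2) / (n1 + n2 - 2)
    s = math.sqrt(a * sum_sq)
    if s + s0 == 0.0:
        raise ValueError("s + s0 must be positive")
    return (mean1 - mean2) / (s + s0)


# Made-up log2 signals for one gene: three treated and three control animals
treated = [10.0, 11.0, 9.0]
control = [8.0, 9.0, 7.0]
print(sam_relative_difference(treated, control))
print(sam_relative_difference(treated, control, s0=1.0))
```

With s0 set to zero the statistic reduces to an ordinary two-sample t statistic; a positive s0 shrinks the score, which keeps genes with very small variance from dominating the list.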

The volume of information contained in microarray data files can be challenging. Further work is needed to identify the type of data required and to compile such information in a user-friendly format.

Toxicogenomics

Pictured from left: Laura Reid, PhD, is director of research and development; Rich Cohn, PhD, is director of statistical sciences; Wendell Jones, PhD, is senior statistician; Thomas Goralski, PhD, is laboratory director; and Steve McPhail is president and chief executive officer at Expression Analysis Inc. (Durham, NC).

The mock submission’s final two sections demonstrated the role of the microarray data in a toxicological report. The toxicology section summarized the animal observations and measurements collected during the study. Although decreases in body weight were observed in the treated animals, this study showed no significant toxicity from the compound in the liver or other tissues at the dose tested. The section on interpretation of the genomic data linked some of the expression results identified in the microarray study to the pharmacological mechanism of the drug compound.

Several features of the microarray data supported the biological conclusions. First, the two data extraction and normalization methods (MAS 5.0 and PM-background) generated similar clusters of differentially expressed genes. Second, both analyses reported increased expression of the HMGCoA reductase gene in the treated animals, as expected after exposure to an inhibitor compound. Third, expression changes in other genes in the cholesterol biosynthesis pathway and related metabolic pathways were detected. These interpretations required informed connections between the toxicogenomic and toxicology results. FDA reviewers recommended some alternatives in content and format for the interpretation discussion, which may reflect the difficulty of this stage.

Conclusion

FDA reviewers indicated that the mock submission was helpful in delineating the complexities and the amount of data generated in a microarray study. When judging the quality of the data set, they found the measures of reproducibility between replicate samples and variability (i.e., coefficient of variation) in the RNA standards and QC metrics to be most useful.

A portion of this exercise was aimed at familiarizing FDA reviewers with microarray technology. Several of the sections included detailed descriptions and background information that would not be included in a typical submission. Likewise, the multiple RNA standards and QC metrics were included for informational purposes and were not intended to be an exhaustive list. Not every metric included in the mock submission will necessarily be required in future FDA submissions. Rather, it is assumed the agency will provide guidance on the metrics that would be most helpful during the review process.

This exercise was beneficial to all participants. While FDA learned from experiences obtained in working with a variety of microarray users, Expression Analysis and Schering-Plough gained an understanding of FDA’s needs in handling data submissions. The format used in this mock submission represents only one possible approach. Two other companies have also presented mock submissions to the CDER Nonclinical Pharmacogenomics Subcommittee. The goals and design of those submissions differ from and complement the Expression Analysis and Schering-Plough submission. Many of the lessons learned from the review of the three mock submissions will likely be reflected in the next FDA guidance document on microarray data submissions.

References


1. EF Petricoin et al., “Medical Applications of Microarray Technologies: A Regulatory Science Perspective,” Nature Genetics, supplement 32 (2002): 474–479.
 
2. JL Hackett and LJ Lesko, “Microarray Data—the U.S. FDA, Industry, and Academia,” Nature Biotechnology 21, no. 7 (2003): 742–743.
 
3. LJ Lesko et al., “Pharmacogenetics and Pharmacogenomics in Drug Development and Regulatory Decision Making: Report of the First FDA-PWG-PhRMA-DruSafe Workshop,” Journal of Clinical Pharmacology 43 (2003): 342–358.
 
4. “FDA Guidance for Industry: Pharmacogenomic Data Submissions,” FDA Web site (Rockville, MD: 2003 [accessed 14 April 2005]); available from Internet: www.fda.gov/cder/guidance/6400fnl.pdf.
 
5. “NIST Workshop for the External RNA Controls Consortium,” National Institute of Standards and Technology Web site (Gaithersburg, MD: [accessed 14 April 2005]); available from Internet: www.cstl.nist.gov/biotech/workshops/ERCC2003/index.html.
 
6. “Proposed Guideline MM16-P: Use of External RNA Controls in Gene Expression Assays,” Clinical and Laboratory Standards Institute Web site (Wayne, PA: [accessed 14 April 2005]); available from Internet: www.clsi.org.
 
7. The Tumor Analysis Best Practices Working Group, “Expression Profiling—Best Practices for Data Generation and Interpretation in Clinical Trials,” Nature Reviews Genetics 5 (2004): 229–237.
 
8. VG Tusher, R Tibshirani, and G Chu, “Significance Analysis of Microarrays Applied to Ionizing Radiation Response,” Proceedings of the National Academy of Sciences 98, no. 9 (2001): 5116–5121.

Copyright ©2005 IVD Technology