Chapter 23 - The Evaluation of Genomic Applications in Practice and Prevention (EGAPP™) initiative: methods of the EGAPP™ Working Group (Tables)
Human Genome Epidemiology (2nd ed.): Building the evidence for using genetic information to improve health and prevent disease
Steven M. Teutsch, Linda A. Bradley, Glenn E. Palomaki, James E. Haddow, Margaret Piper, Ned Calonge, W. David Dotson, Michael P. Douglas, and Alfred O. Berg
Table 23-1
Categories of genetic test applications and some characteristics of how clinical validity and utility are assessed
Application of test | Clinical validity | Clinical utility
---|---|---
Diagnosis (symptomatic patient) | Association of marker with disorder | Improved clinical outcomes*: health outcomes based on diagnosis and subsequent intervention or treatment; availability of information useful for personal or clinical decision making; end of diagnostic odyssey
Disease screening (asymptomatic patient) | Association of marker with disorder | Improved health outcome based on early intervention for screen-positive individuals to identify a disorder for which there is intervention or treatment, or provision of information useful for personal or clinical decision making
Risk assessment/susceptibility | Association of marker with future disorder (consider possible effect of penetrance) | Improved health outcomes based on prevention or early detection strategies
Prognosis of diagnosed disease | Association of marker with natural history benchmarks of the disorder | Improved health outcomes, or outcomes of value to patients, based on changes in patient management
Predicting treatment response or adverse events (pharmacogenomics) | Association of marker with a phenotype/metabolic state that relates to drug efficacy or adverse drug reactions | Improved health outcomes or adherence based on drug selection or dosage
*Clinical outcomes are the net health benefit (benefits and harms) for the patients and/or population in which the test is used.
Table 23-2
Criteria for preliminary ranking of topics
Category | Questions to consider
---|---
Criteria related to health burden | What is the potential public health impact based on the prevalence/incidence of the disorder, the prevalence of gene variants, or the number of individuals likely to be tested? What is the severity of the disease? How strong is the reported relationship between a test result and a disease/drug response? Is there an effective intervention for those with a positive test or their family members? Who will use the information in clinical practice (e.g., health care providers, payers), and how relevant might this review be to their decision making?
Criteria related to practice issues | What is the availability of the test in clinical practice? Is inappropriate test use possible or likely? What is the potential impact of an evidence review or recommendations on clinical practice? On consumers?
Other considerations | How does the test add to the portfolio of EGAPP™ evidence-based reviews? (As a pilot project, EGAPP™ aims to develop a portfolio of evidence reviews that adequately tests the process and methodologies.) Will it be possible to make a recommendation, given the body of data available? (EGAPP™ is attempting to balance selection of somewhat established tests against emerging tests, for which insufficient evidence or unpublished data are more likely.) Are there other practical considerations, such as avoiding duplication of evidence reviews already underway by other groups? How does this test contribute to diversity in reviews? In what category is this test?
Table 23-3
Hierarchies of data sources and study designs for the components of evaluation
Level* | Analytic validity | Clinical validity | Clinical utility
---|---|---|---
1 | Collaborative study using a large panel of well-characterized samples; summary data from well-designed external proficiency testing schemes or interlaboratory comparison programs | Well-designed longitudinal cohort studies; validated clinical decision rule† | Meta-analysis of randomized controlled trials (RCTs)
2 | Other data from proficiency testing schemes; well-designed peer-reviewed studies (e.g., method comparisons, validation studies); expert panel-reviewed FDA summaries | Well-designed case-control studies | A single randomized controlled trial
3 | Less well-designed peer-reviewed studies | Lower quality case-control and cross-sectional studies; unvalidated clinical decision rule† | Controlled trial without randomization; cohort or case-control study
4 | Unpublished and/or non-peer-reviewed research, clinical laboratory, or manufacturer data; studies on performance of the same basic methodology, but used to test for a different target | Case series; unpublished and/or non-peer-reviewed research, clinical laboratory, or manufacturer data; consensus guidelines; expert opinion | Case series; unpublished and/or non-peer-reviewed studies; clinical laboratory or manufacturer data; consensus guidelines; expert opinion
*Highest level is 1.
†A clinical decision rule is an algorithm leading to result categorization. It can also be defined as a clinical tool that quantifies the contributions made by different variables (e.g., test result, family history) in order to determine classification/interpretation of a test result (e.g., for diagnosis, prognosis, therapeutic response) in situations requiring complex decision making (55).
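As a toy illustration of the clinical decision rule concept defined in the footnote (not a validated rule from any EGAPP™ review), the sketch below combines a hypothetical quantitative test result and family history into a result category; the weighting and cutoffs are invented for illustration only.

```python
# Toy "clinical decision rule": combine several variables into a result category.
# The weight and cutoffs below are hypothetical, not from any validated rule.

def classify(test_value: float, family_history: bool) -> str:
    score = test_value + (2.0 if family_history else 0.0)  # hypothetical weighting
    if score >= 8.0:
        return "high risk"
    if score >= 4.0:
        return "intermediate risk"
    return "low risk"

print(classify(test_value=7.5, family_history=True))   # high risk
print(classify(test_value=3.0, family_history=False))  # low risk
```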
Table 23-4
Criteria for assessing quality of individual studies (internal validity) (55)
Analytic validity:
- Adequate descriptions of the index test (test under evaluation): source and inclusion of positive and negative control materials; quality control/assurance
- Adequate descriptions of the test under evaluation: specific methods/platforms evaluated; number of positive samples and negative controls tested
- Adequate descriptions of the basis for the “right answer”: comparison to a “gold standard” referent test; consensus (e.g., external proficiency testing); characterized control materials (e.g., NIST, sequenced)
- Avoidance of biases: blinded testing and interpretation; specimens represent routinely analyzed clinical specimens in all aspects (e.g., collection, transport, processing); reporting of test failures and uninterpretable or indeterminate results
- Analysis of data: point estimates of analytic sensitivity and specificity with 95% confidence intervals; sample size/power calculations addressed

Clinical validity:
- Clear description of the disorder/phenotype and outcomes of interest: status verified for all cases; appropriate verification of controls; verification does not rely on the index test result; prevalence estimates are provided
- Adequate description of the study design and test/methodology
- Adequate description of the study population: inclusion/exclusion criteria; sample size and demographics; study population defined and representative of the clinical population to be tested; allele/genotype frequencies or analyte distributions known in general and in subpopulations
- Independent blind comparison with appropriate, credible reference standard(s): independent of the test; used regardless of test results; description of handling of indeterminate results and outliers; blinded testing and interpretation of results
- Analysis of data: possible biases are identified and their potential impact discussed; point estimates of clinical sensitivity and specificity with 95% confidence intervals; estimates of positive and negative predictive values

Clinical utility:
- Clear description of the outcomes of interest: what was the relative importance of the outcomes measured? Which were prespecified primary outcomes and which were secondary?
- Clear presentation of the study design: was there a clear definition of the specific outcomes or decision options to be studied (clinical and other endpoints)? Was interpretation of outcomes/endpoints blinded? Were negative results verified? Was data collection prospective or retrospective? If an experimental study design was used, were subjects randomized? Were intervention and evaluation of outcomes blinded? Did the study include comparison with current practice/empirical treatment (value added)?
- Intervention: what interventions were used? What were the criteria for the use of the interventions?
- Analysis of data: is the information provided sufficient to rate the quality of the studies? Are the data relevant to each outcome identified? Is the analysis or modeling explicit and understandable? Are analytic methods prespecified, adequately described, and appropriate for the study design? Were losses to follow-up and the resulting potential for bias accounted for? Is there assessment of other sources of bias and confounding? Are there point estimates of impact with 95% CI? Is the analysis adequate for the proposed use?
NIST = National Institute of Standards and Technology.
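The “Analysis of data” items in Table 23-4 call for point estimates of clinical sensitivity, specificity, and positive and negative predictive values with 95% confidence intervals. As a minimal illustration (not part of the EGAPP™ methods), the sketch below computes these quantities from a hypothetical 2×2 table of test results versus disease status; the Wilson score interval is used as one common choice of interval method, and all counts are invented.

```python
# Minimal sketch: point estimates and 95% CIs for sensitivity, specificity,
# PPV, and NPV from a 2x2 table. The Wilson score interval is one common
# choice; the EGAPP methods do not prescribe a specific interval method.
from math import sqrt

def wilson_ci(successes, total, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    if total == 0:
        return (float("nan"), float("nan"))
    p = successes / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return (center - half, center + half)

def test_performance(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV, and NPV with 95% CIs from a 2x2 table."""
    measures = {
        "sensitivity": (tp, tp + fn),
        "specificity": (tn, tn + fp),
        "PPV": (tp, tp + fp),
        "NPV": (tn, tn + fn),
    }
    return {name: (num / den, wilson_ci(num, den)) for name, (num, den) in measures.items()}

# Hypothetical counts, for illustration only (not from any EGAPP review):
for name, (est, (lo, hi)) in test_performance(tp=90, fp=15, fn=10, tn=885).items():
    print(f"{name}: {est:.3f} (95% CI {lo:.3f}-{hi:.3f})")
```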
Table 23-5
Grading the quality of evidence for the individual components of the chain of evidence (key questions) (57)
Adequacy of information to answer key questions | Analytic validity | Clinical validity | Clinical utility
---|---|---|---
Convincing | Studies that provide confident estimates of analytic sensitivity and specificity using intended sample types from representative populations: two or more Level 1 or 2 studies that are generalizable, have a sufficient number and distribution of challenges, and report consistent results; or one Level 1 or 2 study that is generalizable and has an appropriate number and distribution of challenges | Well-designed and conducted studies in representative population(s) that measure the strength of association between a genotype or biomarker and a specific and well-defined disease or phenotype: systematic review/meta-analysis of Level 1 studies with homogeneity; validated clinical decision rule; high quality Level 1 cohort study | Well-designed and conducted studies in representative population(s) that assess specified health outcomes: systematic review/meta-analysis of randomized controlled trials showing consistency in results; at least one large randomized controlled trial (Level 2)
Adequate | Two or more Level 1 or 2 studies that lack the appropriate number and/or distribution of challenges, or that are consistent but not generalizable; modeling showing that lower quality (Level 3, 4) studies may be acceptable for a specific, well-defined clinical scenario | Systematic review of lower quality studies; review of Level 1 or 2 studies with heterogeneity; case-control study with good reference standards; unvalidated clinical decision rule (Level 2) | Systematic review with heterogeneity; one or more controlled trials without randomization (Level 3); systematic review of Level 3 cohort studies with consistent results
Inadequate | Combinations of higher quality studies that show important unexplained inconsistencies; one or more lower quality studies (Level 3 or 4); expert opinion | Single case-control study (nonconsecutive cases, or reference standards not consistently applied); single Level 2 or 3 cohort/case-control study (reference standard defined by the test or not used systematically, or study not blinded); Level 4 data | Systematic review of Level 3 quality studies or studies with heterogeneity; single Level 3 cohort or case-control study; Level 4 data
Table 23-6
Recommendations based on certainty of evidence, magnitude of net benefit, and contextual issues
Level of certainty | Recommendation
---|---
High or moderate | Recommend for… if the magnitude of net benefit is Substantial, Moderate, or Small*, unless additional considerations warrant caution. Consider the importance of each relevant contextual factor and its magnitude or finding.
High or moderate | Recommend against… if the magnitude of net benefit is zero or there are net harms. Consider the importance of each relevant contextual factor and its magnitude or finding.
Low | Insufficient evidence… if the evidence for clinical utility or clinical validity is insufficient in quantity or quality to support conclusions or make a recommendation. Consider the importance of each contextual factor and its magnitude or finding. Determine whether the recommendation should be Insufficient (neutral), Insufficient (encouraging), or Insufficient (discouraging). Provide information on key information gaps to drive a research agenda.
*Categories for the “magnitude of effect” or “magnitude of net benefit” used are Substantial, Moderate, Small, and Zero (57).
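As an informal illustration of the core mapping in Table 23-6 (setting aside the contextual factors the EGAPP™ Working Group also weighs before finalizing a recommendation), the sketch below encodes level of certainty and magnitude of net benefit as strings and returns the corresponding recommendation category; the encoding is hypothetical and is not an EGAPP™ tool.

```python
# Minimal sketch of the certainty x net-benefit mapping in Table 23-6,
# omitting contextual factors. Category names follow the table; the
# string encoding is illustrative only.

def recommendation(certainty: str, net_benefit: str) -> str:
    """Map level of certainty and magnitude of net benefit to a recommendation category."""
    certainty = certainty.lower()
    net_benefit = net_benefit.lower()
    if certainty in ("high", "moderate"):
        if net_benefit in ("substantial", "moderate", "small"):
            return "Recommend for (unless additional considerations warrant caution)"
        return "Recommend against"  # zero net benefit or net harms
    # Low certainty: insufficient evidence; the Working Group further labels it
    # neutral, encouraging, or discouraging based on contextual factors.
    return "Insufficient evidence"

print(recommendation("moderate", "small"))   # Recommend for ...
print(recommendation("high", "zero"))        # Recommend against
print(recommendation("low", "substantial"))  # Insufficient evidence
```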