Chapter 23 - The Evaluation of Genomic Applications in Practice and Prevention (EGAPP™) initiative: methods of the EGAPP™ Working Group (Tables)
Human Genome Epidemiology (2nd ed.): Building the evidence for using genetic information to improve health and prevent disease
Steven M. Teutsch, Linda A. Bradley, Glenn E. Palomaki, James E. Haddow, Margaret Piper, Ned Calonge, W. David Dotson, Michael P. Douglas, and Alfred O. Berg
Table 23-1
Categories of genetic test applications and some characteristics of how clinical validity and utility are assessed
Application of test | Clinical validity | Clinical utility
---|---|---
Diagnosis (symptomatic patient) | Association of marker with disorder | Improved clinical outcomes*: health outcomes based on diagnosis and subsequent intervention or treatment; availability of information useful for personal or clinical decision making; end of diagnostic odyssey
Disease screening (asymptomatic patient) | Association of marker with disorder | Improved health outcome based on early intervention for screen-positive individuals to identify a disorder for which there is intervention or treatment, or provision of information useful for personal or clinical decision making
Risk assessment/susceptibility | Association of marker with future disorder (consider possible effect of penetrance) | Improved health outcomes based on prevention or early detection strategies
Prognosis of diagnosed disease | Association of marker with natural history benchmarks of the disorder | Improved health outcomes, or outcomes of value to patients, based on changes in patient management
Predicting treatment response or adverse events (pharmacogenomics) | Association of marker with a phenotype/metabolic state that relates to drug efficacy or adverse drug reactions | Improved health outcomes or adherence based on drug selection or dosage
*Clinical outcomes are the net health benefit (benefits and harms) for the patients and/or population in which the test is used.
Table 23-2
Criteria for preliminary ranking of topics
Category | Questions to consider
---|---
Criteria related to health burden | What is the potential public health impact based on the prevalence/incidence of the disorder, the prevalence of gene variants, or the number of individuals likely to be tested? What is the severity of the disease? How strong is the reported relationship between a test result and a disease/drug response? Is there an effective intervention for those with a positive test or their family members? Who will use the information in clinical practice (e.g., health care providers, payers), and how relevant might this review be to their decision making?
Criteria related to practice issues | What is the availability of the test in clinical practice? Is inappropriate test use possible or likely? What is the potential impact of an evidence review or recommendations on clinical practice? On consumers?
Other considerations | How does the test add to the portfolio of EGAPP™ evidence-based reviews? (As a pilot project, EGAPP™ aims to develop a portfolio of evidence reviews that adequately tests the process and methodologies.) Will it be possible to make a recommendation, given the body of data available? (EGAPP™ is attempting to balance selection of somewhat established tests against emerging tests, for which insufficient evidence or unpublished data are more likely.) Are there other practical considerations, such as avoiding duplication of evidence reviews already underway by other groups? How does this test contribute to diversity in reviews? In what category is this test?
Table 23-3
Hierarchies of data sources and study designs for the components of evaluation
Level* | Analytic validity | Clinical validity | Clinical utility
---|---|---|---
1 | Collaborative study using a large panel of well-characterized samples; summary data from well-designed external proficiency testing schemes or interlaboratory comparison programs | Well-designed longitudinal cohort studies; validated clinical decision rule† | Meta-analysis of randomized controlled trials (RCTs)
2 | Other data from proficiency testing schemes; well-designed peer-reviewed studies (e.g., method comparisons, validation studies); expert panel-reviewed FDA summaries | Well-designed case-control studies | A single randomized controlled trial
3 | Less well-designed peer-reviewed studies | Lower quality case-control and cross-sectional studies; unvalidated clinical decision rule† | Controlled trial without randomization; cohort or case-control study
4 | Unpublished and/or non-peer-reviewed research, clinical laboratory, or manufacturer data; studies on performance of the same basic methodology, but used to test for a different target | Case series; unpublished and/or non-peer-reviewed research, clinical laboratory, or manufacturer data; consensus guidelines; expert opinion | Case series; unpublished and/or non-peer-reviewed studies; clinical laboratory or manufacturer data; consensus guidelines; expert opinion
*Highest level is 1.
†A clinical decision rule is an algorithm leading to result categorization. It can also be defined as a clinical tool that quantifies the contributions made by different variables (e.g., test result, family history) in order to determine classification/interpretation of a test result (e.g., for diagnosis, prognosis, therapeutic response) in situations requiring complex decision making (55).
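As a toy illustration of the clinical decision rule concept defined in the footnote (not a validated rule from any EGAPP™ review), the sketch below combines a hypothetical quantitative test result and family history into a result category; the weighting and cutoffs are invented for illustration only.

```python
# Toy "clinical decision rule": combine several variables into a result category.
# The weight and cutoffs below are hypothetical, not from any validated rule.

def classify(test_value: float, family_history: bool) -> str:
    score = test_value + (2.0 if family_history else 0.0)  # hypothetical weighting
    if score >= 8.0:
        return "high risk"
    if score >= 4.0:
        return "intermediate risk"
    return "low risk"

print(classify(test_value=7.5, family_history=True))   # high risk
print(classify(test_value=3.0, family_history=False))  # low risk
```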
Table 23-4
Criteria for assessing quality of individual studies (internal validity) (55)
Analytic validity:
- Adequate descriptions of the index test (test under evaluation): source and inclusion of positive and negative control materials; quality control/assurance
- Adequate descriptions of the test under evaluation: specific methods/platforms evaluated; number of positive samples and negative controls tested
- Adequate descriptions of the basis for the “right answer”: comparison to a “gold standard” referent test; consensus (e.g., external proficiency testing); characterized control materials (e.g., NIST, sequenced)
- Avoidance of biases: blinded testing and interpretation; specimens represent routinely analyzed clinical specimens in all aspects (e.g., collection, transport, processing); reporting of test failures and uninterpretable or indeterminate results
- Analysis of data: point estimates of analytic sensitivity and specificity with 95% confidence intervals; sample size/power calculations addressed

Clinical validity:
- Clear description of the disorder/phenotype and outcomes of interest: status verified for all cases; appropriate verification of controls; verification does not rely on the index test result; prevalence estimates are provided
- Adequate description of the study design and test/methodology
- Adequate description of the study population: inclusion/exclusion criteria; sample size and demographics; study population defined and representative of the clinical population to be tested; allele/genotype frequencies or analyte distributions known in general and in subpopulations
- Independent blind comparison with appropriate, credible reference standard(s): independent of the test; used regardless of test results; description of handling of indeterminate results and outliers; blinded testing and interpretation of results
- Analysis of data: possible biases are identified and their potential impact discussed; point estimates of clinical sensitivity and specificity with 95% confidence intervals; estimates of positive and negative predictive values

Clinical utility:
- Clear description of the outcomes of interest: what was the relative importance of the outcomes measured? Which were prespecified primary outcomes and which were secondary?
- Clear presentation of the study design: was there a clear definition of the specific outcomes or decision options to be studied (clinical and other endpoints)? Was interpretation of outcomes/endpoints blinded? Were negative results verified? Was data collection prospective or retrospective? If an experimental study design was used, were subjects randomized? Were intervention and evaluation of outcomes blinded? Did the study include comparison with current practice/empirical treatment (value added)?
- Intervention: what interventions were used? What were the criteria for the use of the interventions?
- Analysis of data: is the information provided sufficient to rate the quality of the studies? Are the data relevant to each outcome identified? Is the analysis or modeling explicit and understandable? Are analytic methods prespecified, adequately described, and appropriate for the study design? Were losses to follow-up and the resulting potential for bias accounted for? Is there assessment of other sources of bias and confounding? Are there point estimates of impact with 95% CI? Is the analysis adequate for the proposed use?
NIST = National Institute of Standards and Technology.
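The “Analysis of data” items in Table 23-4 call for point estimates of clinical sensitivity, specificity, and positive and negative predictive values with 95% confidence intervals. As a minimal illustration (not part of the EGAPP™ methods), the sketch below computes these quantities from a hypothetical 2×2 table of test results versus disease status; the Wilson score interval is used as one common choice of interval method, and all counts are invented.

```python
# Minimal sketch: point estimates and 95% CIs for sensitivity, specificity,
# PPV, and NPV from a 2x2 table. The Wilson score interval is one common
# choice; the EGAPP methods do not prescribe a specific interval method.
from math import sqrt

def wilson_ci(successes, total, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    if total == 0:
        return (float("nan"), float("nan"))
    p = successes / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return (center - half, center + half)

def test_performance(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV, and NPV with 95% CIs from a 2x2 table."""
    measures = {
        "sensitivity": (tp, tp + fn),
        "specificity": (tn, tn + fp),
        "PPV": (tp, tp + fp),
        "NPV": (tn, tn + fn),
    }
    return {name: (num / den, wilson_ci(num, den)) for name, (num, den) in measures.items()}

# Hypothetical counts, for illustration only (not from any EGAPP review):
for name, (est, (lo, hi)) in test_performance(tp=90, fp=15, fn=10, tn=885).items():
    print(f"{name}: {est:.3f} (95% CI {lo:.3f}-{hi:.3f})")
```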
Table 23-5
Grading the quality of evidence for the individual components of the chain of evidence (key questions) (57)
Adequacy of information to answer key questions | Analytic validity | Clinical validity | Clinical utility
---|---|---|---
Convincing | Studies that provide confident estimates of analytic sensitivity and specificity using intended sample types from representative populations: two or more Level 1 or 2 studies that are generalizable, have a sufficient number and distribution of challenges, and report consistent results; or one Level 1 or 2 study that is generalizable and has an appropriate number and distribution of challenges | Well-designed and conducted studies in representative population(s) that measure the strength of association between a genotype or biomarker and a specific and well-defined disease or phenotype: systematic review/meta-analysis of Level 1 studies with homogeneity; validated clinical decision rule; high quality Level 1 cohort study | Well-designed and conducted studies in representative population(s) that assess specified health outcomes: systematic review/meta-analysis of randomized controlled trials showing consistency in results; at least one large randomized controlled trial (Level 2)
Adequate | Two or more Level 1 or 2 studies that lack the appropriate number and/or distribution of challenges, or that are consistent but not generalizable; modeling showing that lower quality (Level 3, 4) studies may be acceptable for a specific, well-defined clinical scenario | Systematic review of lower quality studies; review of Level 1 or 2 studies with heterogeneity; case-control study with good reference standards; unvalidated clinical decision rule (Level 2) | Systematic review with heterogeneity; one or more controlled trials without randomization (Level 3); systematic review of Level 3 cohort studies with consistent results
Inadequate | Combinations of higher quality studies that show important unexplained inconsistencies; one or more lower quality studies (Level 3 or 4); expert opinion | Single case-control study (nonconsecutive cases, or reference standards not consistently applied); single Level 2 or 3 cohort/case-control study (reference standard defined by the test or not used systematically, or study not blinded); Level 4 data | Systematic review of Level 3 quality studies or studies with heterogeneity; single Level 3 cohort or case-control study; Level 4 data
Table 23-6
Recommendations based on certainty of evidence, magnitude of net benefit, and contextual issues
Level of certainty | Recommendation
---|---
High or moderate | Recommend for… if the magnitude of net benefit is Substantial, Moderate, or Small*, unless additional considerations warrant caution. Consider the importance of each relevant contextual factor and its magnitude or finding.
High or moderate | Recommend against… if the magnitude of net benefit is zero or there are net harms. Consider the importance of each relevant contextual factor and its magnitude or finding.
Low | Insufficient evidence… if the evidence for clinical utility or clinical validity is insufficient in quantity or quality to support conclusions or make a recommendation. Consider the importance of each contextual factor and its magnitude or finding. Determine whether the recommendation should be Insufficient (neutral), Insufficient (encouraging), or Insufficient (discouraging). Provide information on key information gaps to drive a research agenda.
*Categories for the “magnitude of effect” or “magnitude of net benefit” used are Substantial, Moderate, Small, and Zero (57).
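As an informal illustration of the core mapping in Table 23-6 (setting aside the contextual factors the EGAPP™ Working Group also weighs before finalizing a recommendation), the sketch below encodes level of certainty and magnitude of net benefit as strings and returns the corresponding recommendation category; the encoding is hypothetical and is not an EGAPP™ tool.

```python
# Minimal sketch of the certainty x net-benefit mapping in Table 23-6,
# omitting contextual factors. Category names follow the table; the
# string encoding is illustrative only.

def recommendation(certainty: str, net_benefit: str) -> str:
    """Map level of certainty and magnitude of net benefit to a recommendation category."""
    certainty = certainty.lower()
    net_benefit = net_benefit.lower()
    if certainty in ("high", "moderate"):
        if net_benefit in ("substantial", "moderate", "small"):
            return "Recommend for (unless additional considerations warrant caution)"
        return "Recommend against"  # zero net benefit or net harms
    # Low certainty: insufficient evidence; the Working Group further labels it
    # neutral, encouraging, or discouraging based on contextual factors.
    return "Insufficient evidence"

print(recommendation("moderate", "small"))   # Recommend for ...
print(recommendation("high", "zero"))        # Recommend against
print(recommendation("low", "substantial"))  # Insufficient evidence
```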