U.S. Genome Variation Estimates

The U.S. Genome Variation Estimates are the first allele frequency and genotype prevalence estimates of human genetic variants for the entire U.S. population. These estimates are presented here in summary tables stratified by age, sex, and race/ethnicity for 90 polymorphisms in 50 candidate genes. The estimates are based on DNA data collected from participants in the Third National Health and Nutrition Examination Survey (NHANES III). This resource also includes a summary of the demographic characteristics of the participants and a description of the genotyping methods used.

These nationally representative data on allele frequency and genotype prevalence were published in the American Journal of Epidemiology article: Prevalence in the United States of Selected Candidate Gene Variants: Third National Health and Nutrition Examination Survey, 1991-1994.

NHANES III DNA Bank
NHANES is a program of population-based studies, directed by CDC’s National Center for Health Statistics (NCHS), that assesses the health and nutritional status of adults and children in the United States. DNA samples were collected during the second phase (1991-1994) of NHANES III. To create a DNA bank, white blood cells were frozen and cell lines were immortalized using Epstein-Barr virus. The DNA bank contains specimens from 7,159 participants aged 12 years or older. It is jointly maintained by NCHS and CDC’s National Center for Environmental Health (NCEH). Samples from this DNA bank were used to measure the prevalence of selected genetic variants in the U.S. population and to test their association with health outcomes and diseases of public health importance. Read more about the NHANES III DNA bank.

Accessing the NHANES III Data
These NHANES III genetic data are not publicly accessible; they are available only within the Research Data Center (RDC) at NCHS in Hyattsville, Maryland. Approval from the NCHS Ethics Review Board (ERB) is required before the data can be accessed. More information on how to apply for access is available on the NCHS Web site.

NHANES III Collaborative Genomics Project
The U.S. Genome Variation Estimates were developed by the Office of Public Health Genomics based on an initiative of the CDC and the National Cancer Institute called the NHANES III Collaborative Genomics Project. Read more about this initiative.

Suggested Citation
We recommend the following citation for these online data estimates:

Office of Public Health Genomics, Centers for Disease Control and Prevention. U.S. Genome Variation Estimates 2008. [cited Oct 23, 2008]. Available from URL: https://www.cdc.gov/genomics/population/genvar/index.htm.

Please use the following citation for the paper published in the American Journal of Epidemiology:

Prevalence in the United States of Selected Candidate Gene Variants: Third National Health and Nutrition Examination Survey, 1991-1994
Man-huei Chang; Mary Lou Lindegren; Mary A. Butler; Stephen J. Chanock; Nicole F. Dowling; Margaret Gallagher; Ramal Moonesinghe; Cynthia A. Moore; Renee M. Ned; Mary R. Reichler; Christopher L. Sanders; Robert Welch; Ajay Yesupriya; Muin J. Khoury; for the CDC/NCI NHANES III Genomics Working Group
American Journal of Epidemiology 2008; doi: 10.1093/aje/kwn286

Disclaimer: This online resource is sponsored by the Office of Public Health Genomics at the Centers for Disease Control and Prevention (CDC). The content in these Web pages should not be construed as official positions of the CDC or the U.S. Department of Health and Human Services. In addition, any mention of trade names, commercial products, or organizations does not imply endorsement by the U.S. Government.