Genomic Data Commons To Facilitate Data, Clinical Information Sharing
The Genomic Data Commons (GDC), a unified data system that promotes sharing of genomic and clinical data between researchers, launched June 6. An initiative of the National Cancer Institute, the GDC will be a core component of the National Cancer Moonshot and the President’s Precision Medicine Initiative (PMI). It benefits from $70 million allocated to NCI to lead efforts in cancer genomics as part of PMI for Oncology. The GDC will centralize, standardize and make accessible data from large-scale NCI programs such as The Cancer Genome Atlas (TCGA) and its pediatric equivalent, Therapeutically Applicable Research to Generate Effective Treatments (TARGET).
Together, TCGA and TARGET represent some of the largest and most comprehensive cancer genomics datasets in the world, consisting of more than 2 petabytes of data (1 petabyte is equivalent to 223,000 DVDs filled to capacity with data). In addition, the GDC will accept submissions of cancer genomic and clinical data from researchers around the world who wish to share their data broadly. Researchers who submit data will be able to use the GDC’s state-of-the-art analytic methods and compare their findings with other data in the GDC.
Data in the GDC, representing thousands of cancer patients and tumors, will be harmonized using standardized software algorithms so that they are accessible and broadly useful to any cancer researcher. Storing raw genomic data in the GDC will also allow the data to be reanalyzed as computational methods and genome annotations improve.
“With the GDC, NCI has made a major commitment to maintaining long-term storage of cancer genomic data and providing researchers with free access to these data,” said NCI acting director Dr. Douglas Lowy. “Importantly, the explanatory power of data in the GDC will grow over time as data from more patients are included, and ultimately the GDC will accelerate our efforts in precision medicine.”
The GDC is being built and managed by the University of Chicago Center for Data Intensive Science, in collaboration with the Ontario Institute for Cancer Research, all under an NCI contract with Leidos Biomedical Research, Frederick, Md.
“Of particular significance, the GDC will also house data from a number of newer NCI programs that will sequence the DNA of patients enrolled in NCI clinical trials,” said Dr. Louis Staudt, co-chief of NCI’s Lymphoid Malignancies Branch. “These datasets will lead to a much deeper understanding of which therapies are most effective for individual cancer patients. With each new addition, the GDC will evolve into a smarter, more comprehensive knowledge system that will foster important discoveries in cancer research and increase the success of cancer treatment for patients.”
The hope is that the GDC will form the basis for a comprehensive knowledge system for cancer. Researchers using the GDC will be able to integrate genetic and clinical data, such as cancer imaging and histological data, with information on the molecular profiles of tumors as well as treatment response.