PMI cohort will require data management on a massive scale

Vanderbilt University center is making plans to acquire and organize an enormous database of precision medicine indicators, says Josh Denny, MD.


Few health data management efforts are as ambitious or challenging as the National Institutes of Health’s Precision Medicine Initiative (PMI) Cohort Program.

The landmark longitudinal research study will collect genomic information and electronic health records, as well as lifestyle and environmental exposure data, from 1 million or more U.S. volunteers. The job of acquiring, organizing, and securing what will be one of the world’s largest and most diverse datasets for precision medicine research falls squarely on the shoulders of Vanderbilt University Medical Center (VUMC) in Nashville, Tenn.

Last month, VUMC was awarded a five-year, $71.6 million grant from NIH to establish and operate a Data and Research Support Center for the PMI Cohort Program.

The center, directed by Josh Denny, MD, associate professor of biomedical informatics and medicine at Vanderbilt, also will provide research support and analysis tools to the scientists who will mine the enormous database of health data. They’ll be sifting through the data to better understand the factors that influence health and disease.

“It’s a lot of data, and we have a skilled team to tackle the different challenges of the robust datasets across several domains,” says Denny, who also serves as co-chair of the PMI Cohort Program Steering and Executive Committees. “Our approach is to assemble team members that are world-class in each of the particular areas of the program to bring together diverse expertise.”

As Denny points out, VUMC’s Department of Biomedical Informatics is home to the largest group of informatics faculty in an academic medical center. The department’s pioneering efforts in precision medicine include BioVU, one of the nation’s largest DNA databanks, with 216,000 unique samples of human DNA linked to 2.5 million de-identified electronic health records. They also include PREDICT (Pharmacogenomic Resource for Enhanced Decisions in Care and Treatment), a clinical decision-support program that tests patients for genetic variations that may affect their response to certain drugs.

“We’ve done a lot with genomic data. But, really what our forte at Vanderbilt has been is dealing with electronic health record data and how to use that efficiently for clinical genomic research integrated with clinical trials,” according to Denny.

In addition, under the PMI Cohort Program, VUMC’s Data and Research Support Center will be working with the Broad Institute in Cambridge, Mass., and Verily Life Sciences (formerly Google Life Sciences) of Mountain View, Calif.

“We’re leveraging the best from different parts of the data community, from Verily for large-scale data management and Broad, which has a lot of experience with that as well—particularly genomic data analysis,” adds Denny. “We will have truly centralized data that is harmonized and will bring researchers into the cloud to replicate the data they need in their own environment/workbench to play with it. Our goal is also to come up with all sorts of web tools to make a lot of the routine analyses easy to do.”

He notes that Verily is currently helping to build the “raw repository” where data will initially arrive at VUMC’s Data and Research Support Center. “We’re putting everything in the Google cloud and so they are the experts.”

Vanderbilt’s other collaborators include Columbia University Medical Center in New York, Northwestern University Feinberg School of Medicine in Chicago, the University of Michigan School of Public Health in Ann Arbor, and the University of Texas School of Biomedical Informatics in Houston.

NIH’s PMI Cohort Program will begin enrolling participants this fall and aims to meet its enrollment goal of 1 million volunteers by 2020 with the assistance of healthcare provider organizations, including regional medical centers, federally qualified health centers and Department of Veterans Affairs’ medical centers.

“We potentially will have just a huge number of EHR systems that will be donating data, putting that into common data models that will enable rapid querying,” says Denny. At the same time, he contends that maintaining data security and privacy is a priority for the PMI Cohort Program and involves all of the team members.
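To illustrate what mapping records from many EHR systems into a common data model can look like, here is a minimal Python sketch. It is not the PMI Cohort Program's actual design: the `Observation` schema, the two source formats, the field names, and the sample rows are all hypothetical and chosen only to show the idea of harmonizing first and querying afterward.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Observation:
    """One row in a hypothetical common data model (not the program's real schema)."""
    participant_id: str
    concept: str       # shared vocabulary term, e.g. "weight"
    value: float
    unit: str
    observed_on: date


# Hypothetical mapping from site A's local codes to shared concept names.
SITE_A_CODES = {"SBP": "systolic_bp", "WT": "weight"}


def from_site_a(row: dict) -> Observation:
    """Translate a site A export row (assumed field names) into the common model."""
    return Observation(
        participant_id=row["pid"],
        concept=SITE_A_CODES[row["code"]],
        value=float(row["val"]),
        unit=row["units"],
        observed_on=date.fromisoformat(row["date"]),
    )


def from_site_b(row: dict) -> Observation:
    """Translate a site B export row; this site is assumed to report weight in pounds."""
    value, unit = float(row["result"]), row["uom"]
    if row["test_name"] == "weight" and unit == "lb":
        # Convert pounds to kilograms so every weight shares one unit.
        value, unit = round(value * 0.45359237, 1), "kg"
    return Observation(
        participant_id=row["patient"],
        concept=row["test_name"],
        value=value,
        unit=unit,
        observed_on=date.fromisoformat(row["taken"]),
    )


def query(observations: list, concept: str) -> list:
    """Return every observation for one concept, regardless of which site sent it."""
    return [o for o in observations if o.concept == concept]


# Made-up sample rows, one per hypothetical site.
site_a_rows = [{"pid": "P1", "code": "WT", "val": "70.0", "units": "kg", "date": "2016-07-01"}]
site_b_rows = [{"patient": "P2", "test_name": "weight", "result": "154", "uom": "lb", "taken": "2016-07-02"}]

harmonized = [from_site_a(r) for r in site_a_rows] + [from_site_b(r) for r in site_b_rows]
weights = query(harmonized, "weight")
```

Once every source is translated at ingestion, a single query such as `query(harmonized, "weight")` works across all contributing sites without per-site special cases, which is the property that makes rapid querying possible.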

For its part in the program, a Participant Technologies Center, awarded to the Scripps Research Institute in San Diego and Vibrent Health of Fairfax, Va., will oversee the direct enrollment of volunteers and develop mobile health technologies and smartphone apps that participants will use to record their lifestyle data and environmental exposures in real time. In addition, a PMI Cohort Program Biobank, which is being built by the Mayo Clinic in Rochester, Minn., will oversee the collection, storage and analysis of blood and urine samples.

“We sit in the center of all the data,” concludes Denny. “We’re having multiple planning calls per week with both the [Participant Technologies Center] and the Biobank as we’re designing all the connection pieces between us.”
