Principal Investigators
Benedict Paten, PhD, co-Investigator, University of California Santa Cruz
Research Description
The LungMAP 3 Data Coordination Center (DCC) serves as a nexus of LungMAP collective knowledge and activities. The DCC is responsible for data collation, re-analysis, and integration; secondary annotation tracking; developing tools to facilitate data collection, sharing, and dissemination; operating a web resource for data, expertise, and collaboration; and coordinating activities across the Research Centers (RCs) and Human Tissue Core. The DCC also facilitates investigator literacy in the use of developed tools and in best practices for analysis, data provenance, and metadata annotation, and engages the larger research community. To host the DCC, we have assembled a multidisciplinary team with data network leadership, along with leaders in single-cell genomics, image analysis, functional inference, and data re-utilization. The DCC leverages unique expertise at CCHMC, UCSC, and the Broad Institute to make pulmonary-oriented single-cell and high-resolution imaging data interoperable with other atlas programs. We also include world-renowned pulmonary researchers in our leadership team to ensure that the data and knowledge we provide to the research community have the greatest scientific impact. Collectively, we propose to accelerate the LungMAP scientific agenda by coordinating efforts across funded Centers, the NIH, and the pulmonary research community; cross-validate, annotate, deposit, and link Consortium datasets and metadata that encompass advanced AI-enabled tools, molecular -omics, spatial transcriptomics, imaging, and associated structural models; and enable sharing of data, results, and models within LungMAP and the research community. The lung disease datasets and results derived from the RCs are expected to yield significant new insights into intra-donor variation and disease pathogenesis.
To ensure the underlying data produced by the RCs are findable, accessible, interoperable, and re-usable (FAIR), the DCC will work closely with the RCs to establish and share best practices, coordinate metadata annotation, ensure studies are sufficiently powered, assist with the deposition of harmonized, high-integrity data to secure repositories, and provide data access and standardized analysis workflows. Through the continued development of structured ontologies and metadata frameworks, RC-derived datasets will be annotated and harmonized using emerging best practices. The DCC will support the ingestion and validation of data and analyses from new technologies as they emerge. We will support the generation of centralized, cloud-enabled data processing workflows that are compatible with external initiatives such as HuBMAP, BRAIN, and the HCA. Providing these functions in a web-enabled LungMAP Commons will promote interaction across many stakeholders.