= German Human Genome-Phenome Archive =

German Human Genome-Phenome Archive
- Abbreviation: GHGA
- Formation: 2020-10-01
- Status: Consortium in Nationale Forschungsdateninfrastruktur (NFDI) e.V.
- Location City: Heidelberg
- Location Country: Germany
- Products: https://data.ghga.de
- Leader Name: Oliver Stegle (Speaker)
- Board Of Directors: Oliver Kohlbacher, Jan Korbel, Oliver Stegle, Eva Winkler

The German Human Genome-Phenome Archive (GHGA) is a consortium within the German National Research Data Infrastructure (NFDI).

== Mission ==

As a secure national omics data infrastructure, GHGA enables the use of human omics data in research while ensuring data security and preventing misuse. The data is to be made accessible in accordance with the FAIR principles (findable, accessible, interoperable, reusable), which enables the secondary use of data primarily collected in diagnostics, personalized medicine and biomedical research.

Human omics data is sensitive personal data and requires careful protection to minimise the risk of re-identification of the data subject. It is therefore protected under the European General Data Protection Regulation (GDPR). GHGA addresses the legal basis for data processing and consent in the national context and takes a multilayered approach to data security. Advanced infrastructure allows the data to be archived and shared safely. In addition, a framework for GDPR-compliant data processing helps data producers to inform patients and navigate consent. GHGA considers the ethical and social implications of sharing human omics data by involving patients in its conception and governance. Controlled, yet FAIR, data access is the last layer, ensuring that data is protected while still fulfilling its potential for advancing research.

Goals:
- Establishing a national, secure long-term archive for human omics data.
- Tackling legal and ethical obstacles to data sharing through the implementation of a unified ethico-legal framework.
- Increasing the FAIRness of omics data and facilitating its embedding in national and international data resources and infrastructures.
- Democratising access to and analysis of large-scale omics data for research via a cloud-based analysis platform.
- Increasing the value of research data by integrating multiple omics modalities and linking omics data with phenotype data.
- Training the next generation of scientists in the efficient and responsible use and management of omics data in research.

== Resources and services ==

GHGA is developing a variety of services for the research community. Aside from setting up a data portal, the focus is on tackling ethical and legal issues. GHGA also works on data analysis tools. Its resources and services include:

- Infrastructure for GDPR-compliant sharing of human omics data for secondary purposes
- Standardised, interoperable and reproducible omics workflows for the scientific community, including continuous benchmarking efforts
- Legal and ethical basis for omics research, including the development of a legal basis for data sharing and tools on consent
- Metadata model to provide standardised information on submitted omics data and to facilitate data findability
- Educational material for and about omics research and its societal relevance

== National and international context ==

GHGA plays a crucial role in Germany's genomic medicine initiatives by providing the data infrastructure for secure storage and research use of genomic data.
Having partnered with genomDE during the conception phase, GHGA now operates Genome Data Centers within the genome sequencing model project, ensuring safe archiving of, and controlled access to, sequencing data for both clinical care and research.

Within Europe, GHGA is part of the federated network of the European Genome-phenome Archive (EGA), known as the Federated EGA (FEGA). Because GHGA functions as the German FEGA node, its data can be found and used together with data from other European studies through compatible standards and metadata. In the context of the Genomic Data Infrastructure (GDI) project, funded by the European Commission and the Federal Ministry of Education and Research (Germany), GHGA ensures that German data collections can also be used within the framework of the "1+ Million Genomes" initiative.

== History ==

On 4 July 2019, the German Cancer Research Center, as the applicant institution, submitted the binding pre-application (Letter of Intent) to the Head Office of the German Research Foundation (DFG). On 26 June 2020, the Joint Science Conference approved funding for GHGA, together with eight other consortia, in the first application round.

In March 2023, the GHGA Metadata Catalog was made available as part of the project's first phase. The GHGA Metadata Catalog is a public portal for searching study data from German research institutions.

In August 2024, GHGA announced the official launch of the GHGA Archive phase. A major feature release of the GHGA Data Portal allows secure access to the first datasets through a fully integrated data access management system.

== Participating institutions ==

- German Cancer Research Center
- University of Tübingen
- University Hospital Tübingen
- Charité
- Berlin Institute for Health at Charité
- Technical University of Munich
- European Molecular Biology Laboratory
- Max Delbrück Center for Molecular Medicine in the Helmholtz Association
- TU Dresden
- University Hospital Heidelberg
- Heidelberg University
- University of Cologne
- Kiel University
- Helmholtz Zentrum München
- German Center for Neurodegenerative Diseases
- Saarland University
- NAKO e.V.

== Partner institutions ==

- European Bioinformatics Institute
- Helmholtz Centre for Information Security
- Leibniz Supercomputing Centre
- Helmholtz Centre for Infection Research
- National Centre for Tumor Diseases
