About the PGS Catalog

This page contains information regarding the PGS Catalog Project.

What is a Polygenic Score?

A polygenic score (PGS) aggregates the effects of many genetic variants into a single number that predicts an individual's genetic predisposition for a phenotype. A PGS is typically composed of hundreds to millions of genetic variants (usually SNPs), combined as a weighted sum of allele dosages in which each weight is the variant's effect size, as estimated from a relevant genome-wide association study (GWAS).
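
As a rough illustration of the weighted sum described above, the following minimal sketch computes a score for one individual. The variant IDs, weights, and dosages are made up for the example, and the function name `polygenic_score` is hypothetical; it is not part of any PGS Catalog tool.

```python
from typing import Mapping


def polygenic_score(dosages: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Weighted sum of effect-allele dosages over the variants in `weights`.

    Variants absent from `dosages` contribute nothing in this sketch; real
    pipelines usually treat missing genotypes more carefully.
    """
    return sum(weight * dosages.get(variant, 0.0) for variant, weight in weights.items())


# Hypothetical per-allele effect sizes, as might be estimated from a GWAS.
weights = {"rs0001": 0.5, "rs0002": -0.25, "rs0003": 1.0}

# Hypothetical dosages for one individual: copies of the effect allele (0 to 2).
dosages = {"rs0001": 2, "rs0002": 1, "rs0003": 0}

score = polygenic_score(dosages, weights)
print(score)  # 0.5*2 + (-0.25)*1 + 1.0*0 = 0.75
```

Keying both inputs by variant ID keeps the sketch independent of variant order; with millions of variants an array-based representation would likely be preferable.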

PGS nomenclature is heterogeneous: scores are also referred to as genetic scores or genomic scores, and as polygenic risk scores (PRS) or genomic risk scores (GRS) when they predict a discrete phenotype, such as a disease.

The PGS Catalog Project

The PGS Catalog is an open database of published polygenic scores (PGS). Each PGS in the Catalog is consistently annotated with relevant metadata, including scoring files (variants, effect alleles/weights), annotations of how the PGS was developed and applied, and evaluations of its predictive performance. See the PGS Catalog Data Description page for a complete description of the metadata captured for PGS, Samples, Performance Metrics, Traits, and Publications.
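
To make the idea of a scoring file (variants, effect alleles, weights) more concrete, here is a minimal sketch that reads a small tab-separated table with `#`-prefixed metadata lines. The file contents, the column names (`rsID`, `effect_allele`, `effect_weight`), the `#key=value` header convention, and the function name `read_scoring_file` are assumptions made for illustration; the Data Description page is the authoritative reference for the actual format.

```python
import csv
import io

# Hypothetical scoring-file contents (assumed layout, not an official example).
EXAMPLE = """\
#pgs_id=PGS_EXAMPLE
#trait_reported=Example trait
rsID\teffect_allele\teffect_weight
rs0001\tA\t0.5
rs0002\tG\t-0.25
"""


def read_scoring_file(handle):
    """Split '#key=value' metadata lines from the tab-separated variant table."""
    metadata = {}
    data_lines = []
    for line in handle:
        if line.startswith("#"):
            key, sep, value = line[1:].strip().partition("=")
            if sep:  # keep only well-formed key=value lines
                metadata[key] = value
        elif line.strip():
            data_lines.append(line)

    variants = [
        {
            "rsID": row["rsID"],
            "effect_allele": row["effect_allele"],
            "effect_weight": float(row["effect_weight"]),
        }
        for row in csv.DictReader(data_lines, delimiter="\t")
    ]
    return metadata, variants


metadata, variants = read_scoring_file(io.StringIO(EXAMPLE))
print(metadata["pgs_id"], len(variants))
```

The same function accepts any iterable of text lines, so an opened file can be passed in place of the in-memory example.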

Citation

Development of the PGS Catalog is led by Samuel Lambert under the supervision of Michael Inouye (University of Cambridge & Baker Institute), in collaboration with Health Data Research UK (Laurent Gil) and the EBI Samples, Phenotypes and Ontologies team / NHGRI-EBI GWAS Catalog (Helen Parkinson, Aoife McMahon, Laura Harris).

The Catalog is under active development, and we continue to add new features and curate new data. If you use the Catalog in your research, we ask that you cite our recent paper:

Samuel A. Lambert, Laurent Gil, Simon Jupp, Scott C. Ritchie, Yu Xu, Annalisa Buniello, Aoife McMahon, Gad Abraham, Michael Chapman, Helen Parkinson, John Danesh, Jacqueline A. L. MacArthur and Michael Inouye.

The Polygenic Score Catalog as an open database for reproducibility and systematic evaluation

Nature Genetics, doi: 10.1038/s41588-021-00783-5 (2021).

Individual PGS obtained from the database should also be cited appropriately, and used in accordance with any licensing restrictions set by the authors (see our Terms of Use for more information).

PGS Catalog Inclusion Criteria

For a publication's data to be included in the PGS Catalog it must contain one of the following:

A complete description of the data captured for each PGS and publication can be found on the PGS Catalog Data Description page.

In the pilot PGS Catalog (presented at ASHG 2019) we focused on curating PGS developed after 2010, and included well-studied scores for the following traits: coronary artery disease (CAD), diabetes (types 1 and 2), obesity / body mass index (BMI), breast cancer, prostate cancer and Alzheimer’s disease. The Catalog, however, is not limited to these traits, and we have since focused on broadening the coverage of traits and scores. Researchers are invited to submit their PGS and evaluations to us by e-mail for curation and inclusion in the PGS Catalog. We plan to provide a streamlined interface for submitting these data in the future.

Data Submission

If you have a PGS or publication that meets the Catalog's eligibility requirements, we invite you to submit your data by e-mail (pgs-info@ebi.ac.uk). To ensure speedy curation and inclusion in the Catalog, it is helpful to provide the following information about your study:

Pre-publication submissions: The PGS Catalog also accepts pre-publication submissions that authors may wish to embargo until publication. In this case the journal name can be provided, and a completed curation template is required. Scores can then be assigned PGS Catalog IDs so that the IDs can be included in the manuscript.
Missing PGS studies: You can also report/recommend studies for inclusion in the PGS Catalog using this form: Report missing PGS study. However, please send us the PGS by e-mail if you are the paper’s author and can share the variant-level score information.

PGS Catalog Software/Tools

All code developed by the PGS Catalog is publicly available on GitHub (PGSCatalog). Here are some of the tools that can be useful for the community:

Features Under Development

Feedback & Contact Information

To submit a PGS to the catalog, provide feedback, or ask questions please contact the PGS Catalog team at pgs-info@ebi.ac.uk.

Acknowledgements

We wish to acknowledge the following people and teams for their support of the PGS Catalog:

PGS Catalog Team: Sam Lambert (1), Laurent Gil (2), Benjamin Wingfield (3), Florent Yvon (4), Aoife McMahon (1,5), Santhi Ramachandran (5), Elizabeth Lewis (5), Laura Harris (5), Helen Parkinson (6), Richard Houghton (2), Prof. John Danesh (2), Michael Inouye (4)

Previous Contributors: Emily Tinsley (3), Shirin Saverimuttu (3), Jackie MacArthur (3), Simon Jupp (3), James Hayhurst (3), Trish Whetzel (3), Michael Chapman (2), Jonathan Marten (4), Petar Scepanovic (4), Gad Abraham (4)

The PGS Catalog is delivered through a collaboration between EMBL-EBI and the University of Cambridge, and is funded by NHGRI (1U24HG012542-01), Health Data Research UK and the Baker Heart & Diabetes Institute.

University of Cambridge
EMBL-EBI
HDR-UK (Cambridge)
Baker Heart and Diabetes Institute
National Human Genome Research Institute