This page contains information regarding the PGS Catalog Project.
A polygenic score (PGS) aggregates the effects of many genetic variants into a single number that predicts genetic predisposition for a phenotype. PGS are typically composed of hundreds to millions of genetic variants (usually SNPs), which are combined as a weighted sum of allele dosages: each dosage is multiplied by its corresponding effect size, as estimated from a relevant genome-wide association study (GWAS).
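As a minimal sketch of the weighted sum described above, the following Python function computes a score for a single individual. The function name, variant identifiers, and numeric values are hypothetical and chosen only for illustration. Dosages are assumed to be counts of the effect allele (0, 1, or 2, or fractional values for imputed genotypes).

```python
def polygenic_score(dosages, weights):
    """Return the sum over variants of effect-allele dosage times effect weight.

    dosages: dict mapping variant ID -> effect-allele dosage (0 to 2)
    weights: dict mapping variant ID -> effect size (e.g. from a GWAS)
    Variants missing from `dosages` contribute nothing (an assumption of this sketch).
    """
    return sum(w * dosages.get(variant, 0.0) for variant, w in weights.items())


# Hypothetical example values, not taken from any real score.
weights = {"variant_a": 0.5, "variant_b": -0.25, "variant_c": 1.0}
dosages = {"variant_a": 2, "variant_b": 1, "variant_c": 0}

score = polygenic_score(dosages, weights)
print(score)  # 0.5*2 + (-0.25)*1 + 1.0*0 = 0.75
```

Treating missing genotypes as contributing zero is only one possible convention; real pipelines may instead impute missing dosages or exclude the variant.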
PGS nomenclature is heterogeneous: they can also be referred to as genetic scores or genomic scores, and as polygenic risk scores (PRS) or genomic risk scores (GRS) if they predict a discrete phenotype, such as a disease.
The PGS Catalog is an open database of published polygenic scores (PGS). Each PGS in the Catalog is consistently annotated with relevant metadata, including scoring files (variants, effect alleles/weights), annotations of how the PGS was developed and applied, and evaluations of its predictive performance. See the PGS Catalog Data Description page for a complete description of the metadata captured for PGS, Samples, Performance Metrics, Traits, and Publications.
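To illustrate how a scoring file's variants, effect alleles, and weights might be read programmatically, here is a small sketch that parses a tab-delimited table. The column names (`variant_id`, `effect_allele`, `effect_weight`), the `#` comment convention, and the inline example data are assumptions made for this illustration; the Data Description page is the place to check the actual file layout.

```python
import csv
import io

# Hypothetical scoring-file content; real files may use different columns.
EXAMPLE = """\
# example header comment
variant_id\teffect_allele\teffect_weight
variant_a\tA\t0.5
variant_b\tG\t-0.25
"""


def read_scoring_file(handle):
    """Parse a tab-delimited scoring table, skipping '#' comment lines."""
    data_lines = (line for line in handle if not line.startswith("#"))
    reader = csv.DictReader(data_lines, delimiter="\t")
    return [
        {
            "variant_id": row["variant_id"],
            "effect_allele": row["effect_allele"],
            "effect_weight": float(row["effect_weight"]),  # weights are numeric
        }
        for row in reader
    ]


variants = read_scoring_file(io.StringIO(EXAMPLE))
for v in variants:
    print(v["variant_id"], v["effect_allele"], v["effect_weight"])
```

The same function accepts any open text file handle in place of the in-memory example.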
Development of the PGS Catalog is led by Samuel Lambert under the supervision of Michael Inouye (University of Cambridge & Baker Institute), in collaboration with Health Data Research UK (Laurent Gil) and the EBI Samples, Phenotypes and Ontologies team / NHGRI-EBI GWAS Catalog (Helen Parkinson, Aoife McMahon, Laura Harris).
The Catalog is under active development, and we continue to add new features and curate new data. If you use the Catalog in your research we ask that you cite our recent paper:
Samuel A. Lambert, Laurent Gil, Simon Jupp, Scott C. Ritchie, Yu Xu, Annalisa Buniello, Aoife McMahon, Gad Abraham, Michael Chapman, Helen Parkinson, John Danesh, Jacqueline A. L. MacArthur and Michael Inouye.
The Polygenic Score Catalog as an open database for reproducibility and systematic evaluation
Nature Genetics, doi: 10.1038/s41588-021-00783-5 (2021).
Individual PGS obtained from the database should also be cited appropriately, and used in accordance with any licensing restrictions set by the authors (see our Terms of Use for more information).
For a publication's data to be included in the PGS Catalog it must contain one of the following:
A complete description of the data captured for each PGS and publication can be found on the PGS Catalog Data Description page.
In the pilot PGS Catalog (presented at ASHG 2019) we focused on curating PGS developed after 2010, and included well-studied scores for the following traits: coronary artery disease (CAD), diabetes (types 1 and 2), obesity / body mass index (BMI), breast cancer, prostate cancer and Alzheimer’s disease. The Catalog, however, is not limited to these traits, and since then we have focused on broadening its coverage of traits and scores. Researchers are invited to submit their PGS and evaluations to us by e-mail for curation and inclusion in the PGS Catalog. We plan to provide a streamlined interface for submitting these data in the future.
If you have a PGS or publication that meets the Catalog's eligibility requirements, we invite you to submit your data by e-mail (pgs-info@ebi.ac.uk). To help us curate and include your data in the Catalog quickly, please provide the following information about your study:
All code developed for the PGS Catalog is publicly available on GitHub [PGSCatalog]. Some of these tools may be useful to the community:
To submit a PGS to the catalog, provide feedback, or ask questions please contact the PGS Catalog team at pgs-info@ebi.ac.uk.
We wish to acknowledge the following people and teams for their support of the PGS Catalog:
The development of the PGS Catalog is supported by: