Multi-trait analysis of genome-wide association summary statistics using MTAG methods
Aim. Evidence-backed execution summary of the methods in "Multi-trait analysis of genome-wide association summary statistics using MTAG" (Turley et al., 2018).
This experiment, in seven questions
Jump straight to the part of the recipe you need. Data and provenance labels stay close to the action they support.
Shopping and prep list
What do I need before I start?
Human GWAS summary statistics
Input data for the analysis.
- Use
- confirm full cohort details in the source paper
ONLINE METHODS
Method description from the source paper.
- Use
- MTAG is a generalized method of moments (GMM) estimator. To obtain the key moment conditions we will use, we consider the best linear prediction of the GWAS estimate for trait s, β ^ j, s, from the SNP's true effect on trait t, β j, t. We use a first-order condition of this best linear predicti...
Summary
The MTAG results for SNP j are obtained in three steps: (i) estimate the variance-covariance matrix of the GWAS estimation error, ∑ ^ j, by using a sequence of LD score regressions, (ii) estimate the variance-covariance matrix of the SNP effects, Ω ^, using method of moments, and (iii) for each SNP, sub...
Biological Annotation
For brevity, we discuss the specific results only for DEP; the results for NEUR and SWB are similar but more limited. For the tissues tested by DEPICT, plots the P values based on both the GWAS and MTAG results. As expected, nearly all of the enrichment of signal is found in the nervous system. To facilitate interpr...
CODE AVAILABILITY
MTAG software available at: https://github.com/omeed-maghzian/mtag.
ONLINE METHODS
For expositional simplicity, our derivations above and in are parameterized in terms of the parameter vector β ^ j. We note, however, that the input to the MTAG software is the standard output from meta-analysis software: z -statistics and sample sizes. Because MTAG is applied to z -statistics, the GWAS summar...
Special Cases
There are three special cases of MTAG that may often be relevant in practice and for which the estimation procedure is made faster and more efficient. The MTAG software offers the option to specialize the analysis for these cases.
Polygenic Prediction
We used the Health and Retirement Study (HRS) and the National Longitudinal Study of Adolescent to Adult Health (Add Health) as our prediction cohorts. We applied the same SNP filters as in the main MTAG analyses. Additionally, we restricted the set of SNPs used to construct the scores to HapMap3 SNPs for comparabil...
∑ ^ j accurately captures sample overlap
Software used for acquisition, scoring, statistics, or reporting.
- Use
- MTAG relies on bivariate LD score regression (and by extension its assumptions) to estimate the correlation in GWAS estimation error due to sample overlap. To gauge MTAG's performance, we simulate an extreme case of sample overlap using real data from the UK Biobank (UKB). We run three GWASs of height, each us...
Polygenic Prediction
Software used for acquisition, scoring, statistics, or reporting.
- Use
- and summarize the results from our pooled analysis of Add Health and HRS. The GWAS-based polygenic scores have incremental R 2 's of 1.00% for DEP, 1.27% for NEUR, and 1.20% for SWB. The corresponding MTAG-based polygenic scores all have greater predictive power: 1.17% for DEP, 1.65% for NEUR, and 1.57% for SW...
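The relative gain implied by these figures is simple arithmetic. The short Python sketch below recomputes it from the incremental R 2 values quoted above; the dictionary names are illustrative and are not part of the MTAG software.

```python
# Incremental R^2 values (in percent) as quoted in the text above.
gwas_r2 = {"DEP": 1.00, "NEUR": 1.27, "SWB": 1.20}
mtag_r2 = {"DEP": 1.17, "NEUR": 1.65, "SWB": 1.57}

# Relative improvement of the MTAG-based score over the GWAS-based score.
relative_gain = {
    trait: (mtag_r2[trait] - gwas_r2[trait]) / gwas_r2[trait]
    for trait in gwas_r2
}

for trait, gain in relative_gain.items():
    print(f"{trait}: {gain:.1%}")
```

On these inputs the MTAG-based scores improve on the GWAS-based scores by roughly 17% for DEP, 30% for NEUR, and 31% for SWB in relative terms.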
ONLINE METHODS
Software used for acquisition, scoring, statistics, or reporting.
- Use
- For expositional simplicity, our derivations above and in are parameterized in terms of the parameter vector β ^ j. We note, however, that the input to the MTAG software is the standard output from meta-analysis software: z -statistics and sample sizes. Because MTAG is applied to z -statistics, the GWAS summar...
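Because the derivations are written in terms of β ^ j while the software takes z -statistics and sample sizes, it can help to see how the two scales relate. The sketch below assumes the standardization used throughout this page (trait and genotype both with variance one) and small per-SNP effects, under which the standard error of a marginal estimate is approximately 1/√N. The function name is hypothetical, and this approximation is an assumption of the sketch rather than a description of the MTAG software's internals.

```python
import numpy as np

def z_to_standardized_beta(z, n):
    """Approximate standardized effect estimates from z-statistics.

    Assumes unit-variance trait and genotype, so SE(beta_hat) ~ 1 / sqrt(N)
    and therefore beta_hat ~ z / sqrt(N).
    """
    z = np.asarray(z, dtype=float)
    n = np.asarray(n, dtype=float)
    return z / np.sqrt(n)

# Hypothetical inputs: two SNPs with different GWAS sample sizes.
beta_hat = z_to_standardized_beta([2.0, -1.5], [10_000, 40_000])
print(beta_hat)
```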
Before you run
What should be confirmed before execution?
First confirmation
Equipment is listed but no product mappings are linked.
Confirm before execution
This page is backed by a publishable Replication Data Ledger package with zero critical source-verification issues.
Confirm before execution
Open the source paper before finalizing run-specific details.
Procurement checkpoint
Use source-stated vendors where present. Treat mapped products as sourcing options unless the page marks an exact source match.
Step-by-step procedure
What do I do, in order?
MTAG Framework
In the framework that follows, all traits and genotypes are standardized to have mean zero and variance one. For SNP j, we denote the vector of marginal (i.e., not controlling for other SNPs), true effects on each of the T traits by β j. We treat these true effects as random effects with E ( β j ) = 0 and Var ( β j ) = Ω. If the true effects are correlated across traits, then the off-diagonal elements of Ω are non-zero. MTAG's key assumption is that Ω is homogeneous across SNPs, i.e., it does not depend on j.
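The random-effects framework can be illustrated with a small simulation. The sketch below is a simplified illustration, not the MTAG software's estimator: it assumes normally distributed effects and estimation errors (the framework itself only fixes the mean and variance), a single estimation-error matrix shared by all SNPs, and made-up values for Ω and the error matrix. It then recovers Ω from the moment condition E[β ^ j β ^ j ′] = Ω + ∑, which follows when the estimation error has mean zero and is uncorrelated with the true effects.

```python
import numpy as np

def simulate_gwas_estimates(omega, sigma, n_snps, seed=0):
    """Draw true effects beta_j ~ N(0, omega) and add estimation error ~ N(0, sigma)."""
    rng = np.random.default_rng(seed)
    n_traits = omega.shape[0]
    beta = rng.multivariate_normal(np.zeros(n_traits), omega, size=n_snps)
    error = rng.multivariate_normal(np.zeros(n_traits), sigma, size=n_snps)
    return beta + error

def moment_estimate_omega(beta_hat, sigma):
    """Method-of-moments estimate: mean of beta_hat beta_hat' minus the error matrix."""
    second_moment = beta_hat.T @ beta_hat / beta_hat.shape[0]
    return second_moment - sigma

# Hypothetical parameter values for two traits (not taken from the paper).
omega = np.array([[1.0e-4, 0.6e-4],
                  [0.6e-4, 1.0e-4]])
sigma = np.array([[2.0e-4, 0.5e-4],
                  [0.5e-4, 2.0e-4]])

beta_hat = simulate_gwas_estimates(omega, sigma, n_snps=200_000)
omega_hat = moment_estimate_omega(beta_hat, sigma)
print(omega_hat)
```

The non-zero off-diagonal entries of omega encode correlated true effects across traits, and the off-diagonal entries of sigma play the role of correlated estimation error, for example from sample overlap.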
Polygenic Prediction
We measure the predictive power of each polygenic score by its incremental R 2, defined as the increase in coefficient of determination ( R 2 ) as we move from a regression of the trait only on a set of controls (year of birth, year of birth squared, sex, their interactions, and 10 principal components of the genetic data) to a regression that additionally includes the polygenic score as an independent variable.
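The incremental R 2 defined above is the difference between two ordinary least squares fits. The sketch below computes it with NumPy on simulated data; the function names, the three generic control columns, and all coefficients are hypothetical stand-ins, and the real analysis uses the controls listed in the text.

```python
import numpy as np

def r_squared(y, X):
    """R^2 of an OLS regression of y on X with an intercept."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return 1.0 - resid.var() / y.var()

def incremental_r2(y, controls, score):
    """Gain in R^2 from adding the polygenic score to the controls-only model."""
    base = r_squared(y, controls)
    full = r_squared(y, np.column_stack([controls, score]))
    return full - base

# Simulated example (all values hypothetical).
rng = np.random.default_rng(1)
n = 20_000
controls = rng.normal(size=(n, 3))
score = rng.normal(size=n)
y = 0.3 * controls[:, 0] + 0.1 * score + rng.normal(size=n)

inc = incremental_r2(y, controls, score)
print(f"incremental R2 = {inc:.4f}")
```

Because the two models are nested, the incremental R 2 computed this way cannot be negative.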
Biological Annotation
For a final comparison, we analyze both the GWAS and MTAG results using the bioinformatics tool DEPICT. We present the prioritized genes, enriched gene sets, and enriched tissues identified by DEPICT at the standard FDR threshold of 5%.
ONLINE METHODS
There are T traits, which may be binary or quantitative. We standardize each trait and the genotype for each single-nucleotide polymorphism (SNP) j so that they all have mean zero and variance one. The length- T vector of marginal (i.e., not controlling for other SNPs), true effects of SNP j on each of the traits is denoted β j. We assume that these are random effects with mean 0 and variance-covariance matrix Ω that is the same across j. The mean is zero because we treat the choice of reference allele as arbitrary. We make the common assumption that the β j 's are identically distributed across j. The assumption implies that the expected amount of phenotypic variance explained is equal for each SNP, regardless of SNP characteristics such as allele frequency.
Perfect genetic correlation and equal heritabilities
This special case corresponds to the "traits" being (the same measure of) a single trait; in other words, applying MTAG instead of inverse-variance-weighted meta-analysis to T GWAS results. Doing so can be useful if there is sample overlap in the GWAS results. In this case, as noted in the main text, MTAG specializes to $\hat{\beta}_{\mathrm{MTAG},j,t} = \frac{\mathbf{1}' \Sigma_j^{-1}}{\mathbf{1}' \Sigma_j^{-1} \mathbf{1}} \hat{\beta}_j$ for all t, and it is no longer necessary to estimate Ω.
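The single-trait special case is compact enough to write out directly. The sketch below implements the weighted combination 1′∑ j⁻¹ β ^ j / (1′∑ j⁻¹ 1) with NumPy; the function name and the numeric inputs are hypothetical, and the error matrix is taken as given rather than estimated by LD score regression.

```python
import numpy as np

def mtag_same_trait(beta_hat, sigma):
    """Combine T estimates of one trait using weights proportional to sigma^{-1} 1."""
    beta_hat = np.asarray(beta_hat, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    ones = np.ones(beta_hat.shape[0])
    # sigma is symmetric, so solving sigma w = 1 gives the weights 1' sigma^{-1}.
    weights = np.linalg.solve(sigma, ones)
    return float(weights @ beta_hat / (weights @ ones))

# Two GWAS estimates of the same SNP effect (hypothetical values).
beta_hat = [0.02, 0.04]

# No sample overlap: diagonal error matrix of squared standard errors.
sigma_independent = np.diag([1e-4, 4e-4])
# Sample overlap: positive error covariance between the two GWASs.
sigma_overlap = np.array([[1.0e-4, 0.5e-4],
                          [0.5e-4, 4.0e-4]])

est_independent = mtag_same_trait(beta_hat, sigma_independent)
est_overlap = mtag_same_trait(beta_hat, sigma_overlap)
print(est_independent, est_overlap)
```

With the diagonal matrix the result equals the inverse-variance-weighted mean, 0.024. With correlated errors the noisier estimate contributes less independent information, so its weight shrinks and the combined estimate moves toward the more precise one.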
Measurement outputs
What raw and processed outputs should exist?
There are several useful special cases of MTAG (Online Methods). When all estimates are for the same trait (implying ω t ω t ' ω t t = Ω and ω t ω...
- Raw artifact
- Per-SNP GWAS summary statistics for each trait (z -statistics and sample sizes)
- Processed artifact
- Per-SNP MTAG estimates for each trait, together with the estimated ∑ ^ j and Ω ^ used to produce them
- Reported as
- Trait-specific MTAG summary statistics compared against the single-trait GWAS results
In standard meta-analysis, the diagonal elements of ∑ ^ j would be constructed using the squared standard errors from the GWAS results, and the off-diagonal elements of ...
Therefore, MTAG proceeds by running linkage disequilibrium (LD) score regressions on the GWAS results and using the estimated intercepts to construct the diagonal elements of ...
The MTAG results for SNP j are obtained in three steps: (i) estimate the variance-covariance matrix of the GWAS estimation error, ∑ ^ j, by using a sequence of LD score r...
Analysis plan
How should the outputs become interpretable results?
Acquisition
Collect raw experimental outputs with enough metadata to preserve sample identity, condition, and timing.
inferred from protocol
Preprocessing / cleaning
This section briefly discusses three analytic formulas we have derived regarding the expected performance of MTAG for each trait: its mean squared error (MSE) across SNPs, its statistical power to detect a true single-SNP association, and its false discovery rate (FDR) (Online...
from paper
Scoring or quantification
Quantify the primary readouts for this experiment:
- There are several useful special cases of MTAG (Online Methods). When all estimates are for the same trait (implying ω t ω t ' ω t t = Ω and ω t ω...
- In standard meta-analysis, the diagonal elements of ∑ ^ j would be constructed using the squared standard errors from the GWAS results, and the off-diagonal elements of ...
- Therefore, MTAG proceeds by running linkage disequilibrium (LD) score regressions on the GWAS results and using the estimated intercepts to construct the diagonal elements of ...
- The MTAG results for SNP j are obtained in three steps: (i) estimate the variance-covariance matrix of the GWAS estimation error, ∑ ^ j, by using a sequence of LD score r...
from paper
Statistical comparison
- This section briefly discusses three analytic formulas we have derived regarding the expected performance of MTAG for each trait: its mean squared error (MSE) across SNPs, its s...
- The power and FDR formulas (in contrast to the fully general MSE formula) assume that the true effect sizes β j are drawn from some known mean-zero mixture of multivariate...
- The derivation of MTAG relies on three important assumptions: (1) Ω is homogeneous across SNPs, (2) sampling variation in Ω ^ and ∑ ^ j can be ignored, and (3) t...
- If the homogeneous- Ω assumption is violated, then there are different types of SNPs with different Ω 's. Because MTAG combines the GWAS estimates using the geno...
from paper
Reporting output
Report representative outputs alongside summary comparisons for:
- There are several useful special cases of MTAG (Online Methods). When all estimates are for the same trait (implying ω t ω t ' ω t t = Ω and ω t ω...
- In standard meta-analysis, the diagonal elements of ∑ ^ j would be constructed using the squared standard errors from the GWAS results, and the off-diagonal elements of ...
- Therefore, MTAG proceeds by running linkage disequilibrium (LD) score regressions on the GWAS results and using the estimated intercepts to construct the diagonal elements of ...
- The MTAG results for SNP j are obtained in three steps: (i) estimate the variance-covariance matrix of the GWAS estimation error, ∑ ^ j, by using a sequence of LD score r...
inferred from protocol
Structured statistical methods
- This section briefly discusses three analytic formulas we have derived regarding the expected performance of MTAG for each trait: its mean squared error (MSE) across SNPs, its s...
- The power and FDR formulas (in contrast to the fully general MSE formula) assume that the true effect sizes β j are drawn from some known mean-zero mixture of multivariate...
- The derivation of MTAG relies on three important assumptions: (1) Ω is homogeneous across SNPs, (2) sampling variation in Ω ^ and ∑ ^ j can be ignored, and (3) t...
- If the homogeneous- Ω assumption is violated, then there are different types of SNPs with different Ω 's. Because MTAG combines the GWAS estimates using the geno...
source structured
Source and audit
What supports the facts on this page?
Evidence quotes (5)
In the framework that follows, all traits and genotypes are standardized to have mean zero and variance one. For SNP j, we denote the vector of marginal (i.e., not controlling for other SNPs), true effects on each of the T traits by β j. We treat these true effects as random effects with E ( β j ) = 0 and Var ( β j ) = Ω. If the true effects are correlated across traits, then the off-diagonal elements of Ω are non-zero. MTAG's key assumption is that Ω is homogeneous across SNPs, i.e., it does not depend on j.
We measure the predictive power of each polygenic score by its incremental R 2, defined as the increase in coefficient of determination ( R 2 ) as we move from a regression of the trait only on a set of controls (year of birth, year of birth squared, sex, their interactions, and 10 principal components of the genetic data) to a regression that additionally includes the polygenic score as an independent variable.
For a final comparison, we analyze both the GWAS and MTAG results using the bioinformatics tool DEPICT. We present the prioritized genes, enriched gene sets, and enriched tissues identified by DEPICT at the standard FDR threshold of 5%.
There are T traits, which may be binary or quantitative. We standardize each trait and the genotype for each single-nucleotide polymorphism (SNP) j so that they all have mean zero and variance one. The length- T vector of marginal (i.e., not controlling for other SNPs), true effects of SNP j on each of the traits is denoted β j. We assume that these are random effects with mean 0 and variance-covariance matrix Ω that is the same across j. The mean is zero because we treat the choice of reference allele as arbitrary. We make the common assumption that the β j 's are identically distributed across j. The assumption implies that the expected amount of phenotypic variance explained is equal for each SNP, regardless of SNP characteristics such as allele frequency.
This special case corresponds to the "traits" being (the same measure of) a single trait; in other words, applying MTAG instead of inverse-variance-weighted meta-analysis to T GWAS results. Doing so can be useful if there is sample overlap in the GWAS results. In this case, as noted in the main text, MTAG specializes to $\hat{\beta}_{\mathrm{MTAG},j,t} = \frac{\mathbf{1}' \Sigma_j^{-1}}{\mathbf{1}' \Sigma_j^{-1} \mathbf{1}} \hat{\beta}_j$ for all t, and it is no longer necessary to estimate Ω.
Machine-readable layer
[
{
"@context": "https://schema.org",
"@type": "HowTo",
"name": "Multi-trait analysis of genome-wide association summary statistics using MTAG methods",
"description": "Evidence-backed execution summary for Multi-trait analysis of genome-wide association summary statistics using MTAG methods from Multi-trait analysis of genome-wide association summary statistics using MTAG.",
"step": [
{
"@type": "HowToStep",
"position": 1,
"name": "MTAG Framework",
"text": "In the framework that follows, all traits and genotypes are standardized to have mean zero and variance one. For SNP j, we denote the vector of marginal (i.e., not controlling for other SNPs), true effects on each of the T traits by β j. We treat these true effects as random effects with E ( β j ) = 0 and Var ( β j ) = Ω. If the true effects are correlated across traits, then the off-diagonal elements of Ω are non-zero. MTAG's key assumption is that Ω is homogeneous across SNPs, i.e., it does not depend on j."
},
{
"@type": "HowToStep",
"position": 2,
"name": "Polygenic Prediction",
"text": "We measure the predictive power of each polygenic score by its incremental R 2, defined as the increase in coefficient of determination ( R 2 ) as we move from a regression of the trait only on a set of controls (year of birth, year of birth squared, sex, their interactions, and 10 principal components of the genetic data) to a regression that additionally includes the polygenic score as an independent variable."
},
{
"@type": "HowToStep",
"position": 3,
"name": "Biological Annotation",
"text": "For a final comparison, we analyze both the GWAS and MTAG results using the bioinformatics tool DEPICT. We present the prioritized genes, enriched gene sets, and enriched tissues identified by DEPICT at the standard FDR threshold of 5%."
},
{
"@type": "HowToStep",
"position": 4,
"name": "ONLINE METHODS",
"text": "There are T traits, which may be binary or quantitative. We standardize each trait and the genotype for each single-nucleotide polymorphism (SNP) j so that they all have mean zero and variance one. The length- T vector of marginal (i.e., not controlling for other SNPs), true effects of SNP j on each of the traits is denoted β j. We assume that these are random effects with mean 0 and variance-covariance matrix Ω that is the same across j. The mean is zero because we treat the choice of reference allele as arbitrary. We make the common assumption,, that the β j 's are identically distributed across j. The assumption implies that the expected amount of phenotypic variance explained is equal for each SNP, regardless of SNP characteristics such as allele frequency."
},
{
"@type": "HowToStep",
"position": 5,
"name": "Perfect genetic correlation and equal heritabilities",
"text": "This special case corresponds to the \"traits\" being (the same measure of) a single trait; in other words, applying MTAG instead of inverse-variance-weighted meta-analysis to T GWAS results. Doing so can be useful if there is sample overlap in the GWAS results. In this case, as noted in the main text, MTAG specializes to β ^ MTAG, j, t = 1 ′ ∑ j - 1 1 ′ ∑ j - 1 1 β ^ j for all t, and it is no longer necessary to estimate Ω."
}
],
"tool": [
{
"@type": "HowToTool",
"name": "Summary"
},
{
"@type": "HowToTool",
"name": "Biological Annotation"
},
{
"@type": "HowToTool",
"name": "CODE AVAILABILITY"
},
{
"@type": "HowToTool",
"name": "ONLINE METHODS"
},
{
"@type": "HowToTool",
"name": "Special Cases"
},
{
"@type": "HowToTool",
"name": "Polygenic Prediction"
}
],
"supply": [
{
"@type": "HowToSupply",
"name": "Summary"
},
{
"@type": "HowToSupply",
"name": "ONLINE METHODS"
}
],
"isBasedOn": {
"@type": "ScholarlyArticle",
"headline": "Multi-trait analysis of genome-wide association summary statistics using MTAG",
"datePublished": "2018",
"author": [
{
"@type": "Person",
"name": "Patrick Turley"
},
{
"@type": "Person",
"name": "Raymond K. Walters"
},
{
"@type": "Person",
"name": "Omeed Maghzian"
},
{
"@type": "Person",
"name": "Aysu Okbay"
},
{
"@type": "Person",
"name": "James J. Lee"
},
{
"@type": "Person",
"name": "Mark Alan Fontana"
},
{
"@type": "Person",
"name": "Tuan Anh Nguyen-Viet"
},
{
"@type": "Person",
"name": "Robbee Wedow"
},
{
"@type": "Person",
"name": "Meghan Zacher"
},
{
"@type": "Person",
"name": "Nicholas A. Furlotte"
},
{
"@type": "Person",
"name": "Patrik Magnusson"
},
{
"@type": "Person",
"name": "Sven Oskarsson"
},
{
"@type": "Person",
"name": "Magnus Johannesson"
},
{
"@type": "Person",
"name": "Peter M. Visscher"
},
{
"@type": "Person",
"name": "David Laibson"
},
{
"@type": "Person",
"name": "David Cesarini"
},
{
"@type": "Person",
"name": "Benjamin M. Neale"
},
{
"@type": "Person",
"name": "Daniel J. Benjamin"
}
],
"identifier": "10.1038/s41588-017-0009-4"
}
},
{
"@context": "https://schema.org",
"@type": "BreadcrumbList",
"itemListElement": [
{
"@type": "ListItem",
"position": 1,
"name": "Experiments",
"item": "https://replicatescience.com/experiments"
},
{
"@type": "ListItem",
"position": 2,
"name": "Multi-trait analysis of genome-wide association summary statistics using MTAG methods",
"item": "https://replicatescience.com/experiments/multi-trait-analysis-of-genome-wide-association-summary-statistics-using-mtag-methods-patrick-turley-pmc5805593/multi-trait-analysis-of-genome-wide-association-summary-statistics-using-mtag-mlpgy8ee"
}
]
}
]