ReplicateScience

Multi-trait analysis of genome-wide association summary statistics using MTAG methods

Aim. Evidence-backed execution summary of the MTAG method, drawn from "Multi-trait analysis of genome-wide association summary statistics using MTAG" (Turley et al., 2018).

Nature Genetics · 2018 · model: mouse · steps: 5 · full RDL package · methods backed · source paper only

DOI: 10.1038/s41588-017-0009-4 · PMCID: PMC5805593

Source title: Multi-trait analysis of genome-wide association summary statistics using MTAG

Authors: Patrick Turley, Raymond K. Walters, Omeed Maghzian et al.

Readiness: Full RDL Package · 100%

Page URL: replicatescience.com/experiments/multi-trait-analysis-of-genome-wide-association-summary-statistics-using-mtag-methods-patrick-turley-pmc5805593/multi-trait-analysis-of-genome-wide-association-summary-statistics-using-mtag-mlpgy8ee

02 · Shopping and prep

Shopping and prep list

What do I need before I start?

biological · source linked

mouse

Listed as the subject model for the experiment. The source quote below mentions mice only when describing DEPICT gene sets; the cohorts named elsewhere on this page (UK Biobank, HRS, Add Health) are human.

Use: confirm full cohort details in the source paper

The results contain some intriguing findings. For example, while hypotheses regarding major depression and related traits have tended to focus on monoamine neurotransmitters, our results as a whole point much more strongly to glutamatergic neurotransmission. Moreover, the particular glutamate-receptor genes prioritized by DEPICT (GRIK3, GRM1, GRM5, and GRM8) suggest the importance of processes involving communication between neurons on an intermediate timescale, such as learning and memory. Such processes are also implicated by many of the enriched gene sets, which relate to altered reactions to stress and novelty in mice (e.g., 'decreased exploration in a new environment,' 'increased anxiety-related response,' 'behavioral fear response').

reagent · source linked

Summary

Reagent used in the protocol.

Use: The MTAG results for SNP j are obtained in three steps: (i) estimate the variance-covariance matrix of the GWAS estimation error, $\hat{\Sigma}_j$, by using a sequence of LD score regressions, (ii) estimate the variance-covariance matrix of the SNP effects, $\hat{\Omega}$, using method of moments, and (iii) for each SNP, sub...

source-linked evidence quote

reagent · source linked

ONLINE METHODS

Reagent used in the protocol.

Use: MTAG is a generalized method of moments (GMM) estimator. To obtain the key moment conditions we will use, we consider the best linear prediction of the GWAS estimate for trait s, $\hat{\beta}_{j,s}$, from the SNP's true effect on trait t, $\beta_{j,t}$. We use a first-order condition of this best linear predicti...

source-linked evidence quote

instrument · source linked

Summary

Use: The MTAG results for SNP j are obtained in three steps: (i) estimate the variance-covariance matrix of the GWAS estimation error, $\hat{\Sigma}_j$, by using a sequence of LD score regressions, (ii) estimate the variance-covariance matrix of the SNP effects, $\hat{\Omega}$, using method of moments, and (iii) for each SNP, sub...

source-linked evidence quote

instrument · source linked

Biological Annotation

Use: For brevity, we discuss the specific results only for DEP; the results for NEUR and SWB are similar but more limited. For the tissues tested by DEPICT, plots the P values based on both the GWAS and MTAG results. As expected, nearly all of the enrichment of signal is found in the nervous system. To facilitate interpr...

source-linked evidence quote

instrument · source linked

CODE AVAILABILITY

Use: MTAG software available at: https://github.com/omeed-maghzian/mtag.

source-linked evidence quote

instrument · source linked

ONLINE METHODS

Use: For expositional simplicity, our derivations above and in are parameterized in terms of the parameter vector $\hat{\beta}_j$. We note, however, that the input to the MTAG software is the standard output from meta-analysis software: z-statistics and sample sizes. Because MTAG is applied to z-statistics, the GWAS summar...

source-linked evidence quote
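
The z-statistic input described above can be related to the effect-size parameterization used in the derivations. What follows is a minimal sketch, not code from the MTAG package. It assumes unit-variance genotypes and traits and a small per-SNP effect, so that the standard error of the estimate is approximately 1/sqrt(N). The function name and the example numbers are hypothetical.

```python
import math

def z_to_standardized_beta(z: float, n: int) -> tuple[float, float]:
    """Approximate a standardized effect and its standard error from a z-statistic.

    Assumption (not stated on this page): with standardized genotype and trait
    and a small effect, the standard error is roughly 1 / sqrt(N).
    """
    if n <= 0:
        raise ValueError("sample size must be positive")
    se = 1.0 / math.sqrt(n)
    return z * se, se

# Hypothetical SNP: z = 5 in a GWAS of 10,000 individuals.
beta_hat, se = z_to_standardized_beta(z=5.0, n=10_000)
print(beta_hat, se)
```

Because the conversion depends only on N, two GWASs with the same z-statistic but different sample sizes imply different standardized effect estimates.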

instrument · source linked

Special Cases

Use: There are three special cases of MTAG that may often be relevant in practice and for which the estimation procedure is made faster and more efficient. The MTAG software offers the option to specialize the analysis for these cases.

source-linked evidence quote

instrument · source linked

Polygenic Prediction

Use: We used the Health and Retirement Study (HRS) and the National Longitudinal Study of Adolescent to Adult Health (Add Health) as our prediction cohorts. We applied the same SNP filters as in the main MTAG analyses. Additionally, we restricted the set of SNPs used to construct the scores to HapMap3 SNPs for comparabil...

source-linked evidence quote

software · source linked

$\hat{\Sigma}_j$ accurately captures sample overlap

Software used for acquisition, scoring, statistics, or reporting.

Use: MTAG relies on bivariate LD score regression (and by extension its assumptions) to estimate the correlation in GWAS estimation error due to sample overlap. To gauge MTAG's performance, we simulate an extreme case of sample overlap using real data from the UK Biobank (UKB). We run three GWASs of height, each us...

source-linked evidence quote
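
To make the role of sample overlap concrete, here is a small sketch of the error correlation that overlap is expected to induce between two GWASs. The expression used, phenotypic correlation times the number of shared individuals divided by sqrt(N1 · N2), is the standard expectation for the bivariate LD score regression intercept. It is not stated on this page, so treat it as an assumption to check against the source paper. The function name and sample sizes are hypothetical.

```python
import math

def expected_error_correlation(n1: int, n2: int, n_overlap: int, pheno_corr: float) -> float:
    """Expected correlation of GWAS estimation errors induced by shared samples.

    Assumed form: pheno_corr * n_overlap / sqrt(n1 * n2), where pheno_corr is the
    phenotypic correlation among the overlapping individuals.
    """
    if n_overlap < 0 or n_overlap > min(n1, n2):
        raise ValueError("overlap must lie between 0 and the smaller sample size")
    return pheno_corr * n_overlap / math.sqrt(n1 * n2)

# Same trait measured in fully overlapping samples (the extreme case).
full = expected_error_correlation(100_000, 100_000, 100_000, 1.0)
# Same trait, half of each sample shared.
half = expected_error_correlation(100_000, 100_000, 50_000, 1.0)
# Disjoint samples: no induced error correlation.
disjoint = expected_error_correlation(100_000, 50_000, 0, 0.4)
print(full, half, disjoint)
```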

software · source linked

Polygenic Prediction

Software used for acquisition, scoring, statistics, or reporting.

Use: and summarize the results from our pooled analysis of Add Health and HRS. The GWAS-based polygenic scores have incremental $R^2$ values of 1.00% for DEP, 1.27% for NEUR, and 1.20% for SWB. The corresponding MTAG-based polygenic scores all have greater predictive power: 1.17% for DEP, 1.65% for NEUR, and 1.57% for SW...

source-linked evidence quote

software · source linked

ONLINE METHODS

Software used for acquisition, scoring, statistics, or reporting.

Use: For expositional simplicity, our derivations above and in are parameterized in terms of the parameter vector $\hat{\beta}_j$. We note, however, that the input to the MTAG software is the standard output from meta-analysis software: z-statistics and sample sizes. Because MTAG is applied to z-statistics, the GWAS summar...

source-linked evidence quote

03 · Execution checks

Before you run

What should be confirmed before execution?

01

First confirmation

Equipment is listed but no product mappings are linked.

02

Confirm before execution

This page is backed by a publishable Replication Data Ledger package with zero critical source-verification issues.

03

Confirm before execution

Open the source paper before finalizing run-specific details.

Procurement checkpoint

Use source-stated vendors where present. Treat mapped products as sourcing options unless the page marks an exact source match.

04 · Procedure

Step-by-step procedure

What do I do, in order?

01 · extracted step

1 evidence link

MTAG Framework

In the framework that follows, all traits and genotypes are standardized to have mean zero and variance one. For SNP j, we denote the vector of marginal (i.e., not controlling for other SNPs), true effects on each of the T traits by $\beta_j$. We treat these true effects as random effects with $E(\beta_j) = 0$ and $\mathrm{Var}(\beta_j) = \Omega$. If the true effects are correlated across traits, then the off-diagonal elements of $\Omega$ are non-zero. MTAG's key assumption is that $\Omega$ is homogeneous across SNPs, i.e., it does not depend on j.

Needed: source paper and local SOP

Timing: not specified

Conditions: Directly quoted from source evidence; verify all lab-specific constraints against the source paper before execution.

Output: There are several useful special cases of MTAG (Online Methods). When all estimates are for the same trait (implying $\omega_t \omega_t' / \omega_{tt} = \Omega$ and ω t ω...
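
The homogeneous-Ω assumption can be made concrete with a small simulation. This is a sketch under simplifying assumptions that are not from the source: normal draws, two traits, and an estimation-error covariance Σ that is the same for every SNP. Under those assumptions the second moment of the GWAS estimates across SNPs is approximately Ω + Σ, so subtracting Σ recovers Ω. The method-of-moments estimator in the MTAG software may differ in detail.

```python
import numpy as np

rng = np.random.default_rng(0)
n_snps = 200_000

# Hypothetical values in arbitrary units: Omega is the covariance of true
# effects across traits, Sigma the covariance of GWAS estimation error.
omega = np.array([[1.0, 0.6],
                  [0.6, 1.0]])
sigma = np.array([[2.0, 0.5],
                  [0.5, 3.0]])

# True effects are mean-zero random effects with the same Omega for every SNP.
beta = rng.multivariate_normal(np.zeros(2), omega, size=n_snps)
# GWAS estimates are the true effects plus estimation error.
beta_hat = beta + rng.multivariate_normal(np.zeros(2), sigma, size=n_snps)

# Moment condition: E[beta_hat beta_hat'] = Omega + Sigma.
second_moment = beta_hat.T @ beta_hat / n_snps
omega_hat = second_moment - sigma
print(np.round(omega_hat, 2))
```

If Ω differed across groups of SNPs, the same calculation would return an average over those groups rather than any single group's covariance.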

02 · extracted step

1 evidence link

Polygenic Prediction

We measure the predictive power of each polygenic score by its incremental $R^2$, defined as the increase in coefficient of determination ($R^2$) as we move from a regression of the trait only on a set of controls (year of birth, year of birth squared, sex, their interactions, and 10 principal components of the genetic data) to a regression that additionally includes the polygenic score as an independent variable.

Needed: Polygenic Prediction

Timing: not specified

Conditions: Directly quoted from source evidence; verify all lab-specific constraints against the source paper before execution.

Output: There are several useful special cases of MTAG (Online Methods). When all estimates are for the same trait (implying $\omega_t \omega_t' / \omega_{tt} = \Omega$ and ω t ω...
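
The incremental $R^2$ definition above can be sketched with two ordinary least-squares fits. This is an illustration on simulated data. The three random control columns are hypothetical stand-ins for the controls listed in the step (year of birth, sex, principal components, and so on), and the helper names are not from the source.

```python
import numpy as np

def r_squared(y: np.ndarray, X: np.ndarray) -> float:
    """R^2 from an OLS regression of y on X plus an intercept."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    centered = y - y.mean()
    return 1.0 - (resid @ resid) / (centered @ centered)

def incremental_r2(y: np.ndarray, controls: np.ndarray, score: np.ndarray) -> float:
    """Increase in R^2 when the polygenic score is added to the controls."""
    base = r_squared(y, controls)
    full = r_squared(y, np.column_stack([controls, score]))
    return full - base

# Simulated example with a weak true effect of the score on the trait.
rng = np.random.default_rng(1)
n = 5_000
controls = rng.normal(size=(n, 3))
score = rng.normal(size=n)
trait = 0.3 * controls[:, 0] + 0.1 * score + rng.normal(size=n)

delta = incremental_r2(trait, controls, score)
print(f"incremental R^2: {delta:.4f}")
```

Because the two models are nested, the in-sample increment cannot be negative; its size is what distinguishes a useful score from noise.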

03 · extracted step

1 evidence link

Biological Annotation

For a final comparison, we analyze both the GWAS and MTAG results using the bioinformatics tool DEPICT. We present the prioritized genes, enriched gene sets, and enriched tissues identified by DEPICT at the standard FDR threshold of 5%.

Needed: Biological Annotation

Timing: not specified

Conditions: Directly quoted from source evidence; verify all lab-specific constraints against the source paper before execution.

Output: There are several useful special cases of MTAG (Online Methods). When all estimates are for the same trait (implying $\omega_t \omega_t' / \omega_{tt} = \Omega$ and ω t ω...

04 · extracted step

1 evidence link

ONLINE METHODS

There are T traits, which may be binary or quantitative. We standardize each trait and the genotype for each single-nucleotide polymorphism (SNP) j so that they all have mean zero and variance one. The length-T vector of marginal (i.e., not controlling for other SNPs), true effects of SNP j on each of the traits is denoted $\beta_j$. We assume that these are random effects with mean 0 and variance-covariance matrix $\Omega$ that is the same across j. The mean is zero because we treat the choice of reference allele as arbitrary. We make the common assumption that the $\beta_j$'s are identically distributed across j. The assumption implies that the expected amount of phenotypic variance explained is equal for each SNP, regardless of SNP characteristics such as allele frequency.

Needed: ONLINE METHODS

Timing: not specified

Conditions: Directly quoted from source evidence; verify all lab-specific constraints against the source paper before execution.

Output: There are several useful special cases of MTAG (Online Methods). When all estimates are for the same trait (implying $\omega_t \omega_t' / \omega_{tt} = \Omega$ and ω t ω...
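
The standardization described in this step can be sketched as follows. This is an illustration on simulated allele counts, assuming genotypes coded 0/1/2 and using the sample mean and standard deviation of each SNP. The allele frequencies and function name are hypothetical.

```python
import numpy as np

def standardize_genotypes(dosages: np.ndarray) -> np.ndarray:
    """Scale each SNP column to mean zero and variance one."""
    g = np.asarray(dosages, dtype=float)
    sd = g.std(axis=0)
    if np.any(sd == 0):
        raise ValueError("a monomorphic SNP cannot be standardized")
    return (g - g.mean(axis=0)) / sd

# Simulated allele counts for three SNPs with different allele frequencies.
rng = np.random.default_rng(2)
freqs = np.array([0.05, 0.3, 0.5])
dosages = rng.binomial(2, freqs, size=(2_000, 3))

standardized = standardize_genotypes(dosages)
print(standardized.mean(axis=0).round(6), standardized.std(axis=0).round(6))
```

After this scaling, rare and common SNPs are on the same footing, which is what lets a single Ω describe the effect distribution regardless of allele frequency.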

05 · extracted step

1 evidence link

Perfect genetic correlation and equal heritabilities

This special case corresponds to the "traits" being (the same measure of) a single trait; in other words, applying MTAG instead of inverse-variance-weighted meta-analysis to T GWAS results. Doing so can be useful if there is sample overlap in the GWAS results. In this case, as noted in the main text, MTAG specializes to $\hat{\beta}_{\mathrm{MTAG},j,t} = \frac{\mathbf{1}' \Sigma_j^{-1}}{\mathbf{1}' \Sigma_j^{-1} \mathbf{1}} \hat{\beta}_j$ for all t, and it is no longer necessary to estimate $\Omega$.

Needed: source paper and local SOP

Timing: not specified

Conditions: Directly quoted from source evidence; verify all lab-specific constraints against the source paper before execution.

Output: There are several useful special cases of MTAG (Online Methods). When all estimates are for the same trait (implying $\omega_t \omega_t' / \omega_{tt} = \Omega$ and ω t ω...
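
The special-case formula in this step can be computed directly. This is a minimal sketch, not the MTAG software's implementation. It takes the estimates for one SNP and the error covariance matrix for that SNP as given; the function name and numbers are hypothetical.

```python
import numpy as np

def mtag_same_trait(beta_hat, sigma) -> float:
    """Combine T estimates of one trait: (1' S^-1 beta_hat) / (1' S^-1 1)."""
    beta_hat = np.asarray(beta_hat, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    ones = np.ones(len(beta_hat))
    # sigma is symmetric, so solving sigma x = 1 gives the weights 1' sigma^-1.
    weights = np.linalg.solve(sigma, ones)
    return float(weights @ beta_hat / (weights @ ones))

# With a diagonal error covariance this is inverse-variance weighting.
no_overlap = mtag_same_trait([1.0, 2.0], [[1.0, 0.0], [0.0, 4.0]])
# A non-zero off-diagonal term (sample overlap) shifts the weights.
with_overlap = mtag_same_trait([1.0, 2.0], [[1.0, 0.5], [0.5, 4.0]])
print(no_overlap, with_overlap)
```

In the diagonal case the weights are 1 and 0.25, giving (1 + 0.5) / 1.25 = 1.2. With the overlap term the noisier estimate is down-weighted further, giving 1.125.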

05 · Measurement

Measurement outputs

What raw and processed outputs should exist?

from paper

There are several useful special cases of MTAG (Online Methods). When all estimates are for the same trait (implying $\omega_t \omega_t' / \omega_{tt} = \Omega$ and ω t ω...

Raw artifact: Per-sample or per-animal endpoint measurements collected during the experiment
Processed artifact: Structured table with cleaned measurements ready for comparison
Reported as: Summary statistics and between-group or across-timepoint comparisons

from paper

In standard meta-analysis, the diagonal elements of $\hat{\Sigma}_j$ would be constructed using the squared standard errors from the GWAS results, and the off-diagonal elements of ...

Raw artifact: Per-sample or per-animal endpoint measurements collected during the experiment
Processed artifact: Structured table with cleaned measurements ready for comparison
Reported as: Summary statistics and between-group or across-timepoint comparisons

from paper

Therefore, MTAG proceeds by running linkage disequilibrium (LD) score regressions on the GWAS results and using the estimated intercepts to construct the diagonal elements of ...

Raw artifact: Per-sample or per-animal endpoint measurements collected during the experiment
Processed artifact: Structured table with cleaned measurements ready for comparison
Reported as: Summary statistics and between-group or across-timepoint comparisons

from paper

The MTAG results for SNP j are obtained in three steps: (i) estimate the variance-covariance matrix of the GWAS estimation error, $\hat{\Sigma}_j$, by using a sequence of LD score r...

Raw artifact: Per-sample or per-animal endpoint measurements collected during the experiment
Processed artifact: Structured table with cleaned measurements ready for comparison
Reported as: Summary statistics and between-group or across-timepoint comparisons
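
The third of the three steps, combining the estimated matrices for each SNP, can be sketched as below. The general estimator is not reproduced on this page. The form used here is the one I understand the source paper's main text to give, so treat it as an assumption and verify it against the paper. It does reduce to each trait's own GWAS estimate when traits and errors are uncorrelated, and to the same-trait formula quoted in step 05. The function name and inputs are hypothetical, and this is not the MTAG software's code.

```python
import numpy as np

def mtag_estimate(beta_hat, omega, sigma, t: int) -> float:
    """Sketch of an MTAG-style estimate for trait t at one SNP.

    Assumed form, with w = omega_t / omega_tt and
    A = omega - omega_t omega_t' / omega_tt + sigma:
        (w' A^-1 beta_hat) / (w' A^-1 w)
    """
    beta_hat = np.asarray(beta_hat, dtype=float)
    omega = np.asarray(omega, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    omega_t = omega[:, t]
    w = omega_t / omega[t, t]
    a = omega - np.outer(omega_t, omega_t) / omega[t, t] + sigma
    a_inv_w = np.linalg.solve(a, w)  # a is symmetric, so this also gives w' a^-1
    return float(a_inv_w @ beta_hat / (a_inv_w @ w))

# Uncorrelated traits and errors: the other trait carries no information.
own = mtag_estimate([0.3, -0.1], np.diag([1.0, 2.0]), np.diag([0.5, 0.7]), t=0)
# All rows measure the same trait: inverse-variance weighting.
pooled = mtag_estimate([1.0, 2.0], np.ones((2, 2)), np.diag([1.0, 4.0]), t=0)
print(own, pooled)
```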

06 · Analysis

Analysis plan

How should the outputs become interpretable results?

01

Acquisition

Collect raw experimental outputs with enough metadata to preserve sample identity, condition, and timing.

inferred from protocol

02

Preprocessing / cleaning

This section briefly discusses three analytic formulas we have derived regarding the expected performance of MTAG for each trait: its mean squared error (MSE) across SNPs, its statistical power to detect a true single-SNP association, and its false discovery rate (FDR) (Online...

from paper

03

Scoring or quantification

Quantify the primary readouts for this experiment: There are several useful special cases of MTAG (Online Methods). When all estimates are for the same trait (implying $\omega_t \omega_t' / \omega_{tt} = \Omega$ and ω t ω...; In standard meta-analysis, the diagonal elements of $\hat{\Sigma}_j$ would be constructed using the squared standard errors from the GWAS results, and the off-diagonal elements of ...; Therefore, MTAG proceeds by running linkage disequilibrium (LD) score regressions on the GWAS results and using the estimated intercepts to construct the diagonal elements of ...; The MTAG results for SNP j are obtained in three steps: (i) estimate the variance-covariance matrix of the GWAS estimation error, $\hat{\Sigma}_j$, by using a sequence of LD score r...

from paper

04

Statistical comparison

This section briefly discusses three analytic formulas we have derived regarding the expected performance of MTAG for each trait: its mean squared error (MSE) across SNPs, its s...; The power and FDR formulas (in contrast to the fully general MSE formula) assume that the true effect sizes $\beta_j$ are drawn from some known mean-zero mixture of multivariate...; The derivation of MTAG relies on three important assumptions: (1) $\Omega$ is homogeneous across SNPs, (2) sampling variation in $\hat{\Omega}$ and $\hat{\Sigma}_j$ can be ignored, and (3) t...; If the homogeneous-$\Omega$ assumption is violated, then there are different types of SNPs with different $\Omega$'s. Because MTAG combines the GWAS estimates using the geno...

from paper

05

Reporting output

Report representative outputs alongside summary comparisons for: There are several useful special cases of MTAG (Online Methods). When all estimates are for the same trait (implying $\omega_t \omega_t' / \omega_{tt} = \Omega$ and ω t ω...; In standard meta-analysis, the diagonal elements of $\hat{\Sigma}_j$ would be constructed using the squared standard errors from the GWAS results, and the off-diagonal elements of ...; Therefore, MTAG proceeds by running linkage disequilibrium (LD) score regressions on the GWAS results and using the estimated intercepts to construct the diagonal elements of ...; The MTAG results for SNP j are obtained in three steps: (i) estimate the variance-covariance matrix of the GWAS estimation error, $\hat{\Sigma}_j$, by using a sequence of LD score r...

inferred from protocol

Structured statistical methods

This section briefly discusses three analytic formulas we have derived regarding the expected performance of MTAG for each trait: its mean squared error (MSE) across SNPs, its s...; The power and FDR formulas (in contrast to the fully general MSE formula) assume that the true effect sizes $\beta_j$ are drawn from some known mean-zero mixture of multivariate...; The derivation of MTAG relies on three important assumptions: (1) $\Omega$ is homogeneous across SNPs, (2) sampling variation in $\hat{\Omega}$ and $\hat{\Sigma}_j$ can be ignored, and (3) t...; If the homogeneous-$\Omega$ assumption is violated, then there are different types of SNPs with different $\Omega$'s. Because MTAG combines the GWAS estimates using the geno...

source structured

07 · Source layer

Source and audit

What supports the facts on this page?

Source identity: available

Structured protocol: available

Methods evidence: available

Materials/equipment listed: available

Specific product links: needs review

Evidence quotes (5)

In the framework that follows, all traits and genotypes are standardized to have mean zero and variance one. For SNP j, we denote the vector of marginal (i.e., not controlling for other SNPs), true effects on each of the T traits by $\beta_j$. We treat these true effects as random effects with $E(\beta_j) = 0$ and $\mathrm{Var}(\beta_j) = \Omega$. If the true effects are correlated across traits, then the off-diagonal elements of $\Omega$ are non-zero. MTAG's key assumption is that $\Omega$ is homogeneous across SNPs, i.e., it does not depend on j.
Source method evidence

We measure the predictive power of each polygenic score by its incremental $R^2$, defined as the increase in coefficient of determination ($R^2$) as we move from a regression of the trait only on a set of controls (year of birth, year of birth squared, sex, their interactions, and 10 principal components of the genetic data) to a regression that additionally includes the polygenic score as an independent variable.
Source method evidence

For a final comparison, we analyze both the GWAS and MTAG results using the bioinformatics tool DEPICT. We present the prioritized genes, enriched gene sets, and enriched tissues identified by DEPICT at the standard FDR threshold of 5%.
Source method evidence

There are T traits, which may be binary or quantitative. We standardize each trait and the genotype for each single-nucleotide polymorphism (SNP) j so that they all have mean zero and variance one. The length-T vector of marginal (i.e., not controlling for other SNPs), true effects of SNP j on each of the traits is denoted $\beta_j$. We assume that these are random effects with mean 0 and variance-covariance matrix $\Omega$ that is the same across j. The mean is zero because we treat the choice of reference allele as arbitrary. We make the common assumption that the $\beta_j$'s are identically distributed across j. The assumption implies that the expected amount of phenotypic variance explained is equal for each SNP, regardless of SNP characteristics such as allele frequency.
Source method evidence

This special case corresponds to the "traits" being (the same measure of) a single trait; in other words, applying MTAG instead of inverse-variance-weighted meta-analysis to T GWAS results. Doing so can be useful if there is sample overlap in the GWAS results. In this case, as noted in the main text, MTAG specializes to $\hat{\beta}_{\mathrm{MTAG},j,t} = \frac{\mathbf{1}' \Sigma_j^{-1}}{\mathbf{1}' \Sigma_j^{-1} \mathbf{1}} \hat{\beta}_j$ for all t, and it is no longer necessary to estimate $\Omega$.
Source method evidence

Machine-readable layer

[
  {
    "@context": "https://schema.org",
    "@type": "HowTo",
    "name": "Multi-trait analysis of genome-wide association summary statistics using MTAG methods",
    "description": "Evidence-backed execution summary for Multi-trait analysis of genome-wide association summary statistics using MTAG methods from Multi-trait analysis of genome-wide association summary statistics using MTAG.",
    "step": [
      {
        "@type": "HowToStep",
        "position": 1,
        "name": "MTAG Framework",
        "text": "In the framework that follows, all traits and genotypes are standardized to have mean zero and variance one. For SNP j, we denote the vector of marginal (i.e., not controlling for other SNPs), true effects on each of the T traits by β j. We treat these true effects as random effects with E ( β j ) = 0 and Var ( β j ) = Ω. If the true effects are correlated across traits, then the off-diagonal elements of Ω are non-zero. MTAG's key assumption is that Ω is homogeneous across SNPs, i.e., it does not depend on j."
      },
      {
        "@type": "HowToStep",
        "position": 2,
        "name": "Polygenic Prediction",
        "text": "We measure the predictive power of each polygenic score by its incremental R 2, defined as the increase in coefficient of determination ( R 2 ) as we move from a regression of the trait only on a set of controls (year of birth, year of birth squared, sex, their interactions, and 10 principal components of the genetic data) to a regression that additionally includes the polygenic score as an independent variable."
      },
      {
        "@type": "HowToStep",
        "position": 3,
        "name": "Biological Annotation",
        "text": "For a final comparison, we analyze both the GWAS and MTAG results using the bioinformatics tool DEPICT. We present the prioritized genes, enriched gene sets, and enriched tissues identified by DEPICT at the standard FDR threshold of 5%."
      },
      {
        "@type": "HowToStep",
        "position": 4,
        "name": "ONLINE METHODS",
        "text": "There are T traits, which may be binary or quantitative. We standardize each trait and the genotype for each single-nucleotide polymorphism (SNP) j so that they all have mean zero and variance one. The length- T vector of marginal (i.e., not controlling for other SNPs), true effects of SNP j on each of the traits is denoted β j. We assume that these are random effects with mean 0 and variance-covariance matrix Ω that is the same across j. The mean is zero because we treat the choice of reference allele as arbitrary. We make the common assumption,, that the β j 's are identically distributed across j. The assumption implies that the expected amount of phenotypic variance explained is equal for each SNP, regardless of SNP characteristics such as allele frequency."
      },
      {
        "@type": "HowToStep",
        "position": 5,
        "name": "Perfect genetic correlation and equal heritabilities",
        "text": "This special case corresponds to the \"traits\" being (the same measure of) a single trait; in other words, applying MTAG instead of inverse-variance-weighted meta-analysis to T GWAS results. Doing so can be useful if there is sample overlap in the GWAS results. In this case, as noted in the main text, MTAG specializes to β ^ MTAG, j, t = 1 ′ ∑ j - 1 1 ′ ∑ j - 1 1 β ^ j for all t, and it is no longer necessary to estimate Ω."
      }
    ],
    "tool": [
      {
        "@type": "HowToTool",
        "name": "Summary"
      },
      {
        "@type": "HowToTool",
        "name": "Biological Annotation"
      },
      {
        "@type": "HowToTool",
        "name": "CODE AVAILABILITY"
      },
      {
        "@type": "HowToTool",
        "name": "ONLINE METHODS"
      },
      {
        "@type": "HowToTool",
        "name": "Special Cases"
      },
      {
        "@type": "HowToTool",
        "name": "Polygenic Prediction"
      }
    ],
    "supply": [
      {
        "@type": "HowToSupply",
        "name": "Summary"
      },
      {
        "@type": "HowToSupply",
        "name": "ONLINE METHODS"
      }
    ],
    "isBasedOn": {
      "@type": "ScholarlyArticle",
      "headline": "Multi-trait analysis of genome-wide association summary statistics using MTAG",
      "datePublished": "2018",
      "author": [
        {
          "@type": "Person",
          "name": "Patrick Turley"
        },
        {
          "@type": "Person",
          "name": "Raymond K. Walters"
        },
        {
          "@type": "Person",
          "name": "Omeed Maghzian"
        },
        {
          "@type": "Person",
          "name": "Aysu Okbay"
        },
        {
          "@type": "Person",
          "name": "James J. Lee"
        },
        {
          "@type": "Person",
          "name": "Mark Alan Fontana"
        },
        {
          "@type": "Person",
          "name": "Tuan Anh Nguyen-Viet"
        },
        {
          "@type": "Person",
          "name": "Robbee Wedow"
        },
        {
          "@type": "Person",
          "name": "Meghan Zacher"
        },
        {
          "@type": "Person",
          "name": "Nicholas A. Furlotte"
        },
        {
          "@type": "Person",
          "name": "Patrik Magnusson"
        },
        {
          "@type": "Person",
          "name": "Sven Oskarsson"
        },
        {
          "@type": "Person",
          "name": "Magnus Johannesson"
        },
        {
          "@type": "Person",
          "name": "Peter M. Visscher"
        },
        {
          "@type": "Person",
          "name": "David Laibson"
        },
        {
          "@type": "Person",
          "name": "David Cesarini"
        },
        {
          "@type": "Person",
          "name": "Benjamin M. Neale"
        },
        {
          "@type": "Person",
          "name": "Daniel J. Benjamin"
        }
      ],
      "identifier": "10.1038/s41588-017-0009-4"
    }
  },
  {
    "@context": "https://schema.org",
    "@type": "BreadcrumbList",
    "itemListElement": [
      {
        "@type": "ListItem",
        "position": 1,
        "name": "Experiments",
        "item": "https://replicatescience.com/experiments"
      },
      {
        "@type": "ListItem",
        "position": 2,
        "name": "Multi-trait analysis of genome-wide association summary statistics using MTAG methods",
        "item": "https://replicatescience.com/experiments/multi-trait-analysis-of-genome-wide-association-summary-statistics-using-mtag-methods-patrick-turley-pmc5805593/multi-trait-analysis-of-genome-wide-association-summary-statistics-using-mtag-mlpgy8ee"
      }
    ]
  }
]
