Confronting false discoveries in single-cell differential expression methods
Aim. Evidence-backed execution summary of the methods in "Confronting false discoveries in single-cell differential expression" (Squair et al., 2021).
This experiment, in seven questions
Jump straight to the part of the recipe you need. Data and provenance labels stay close to the action they support.
Shopping and prep list
What do I need before I start?
Biological model pending
Subject model for the experiment.
- Use
- confirm full cohort details in the source paper
RNAscope
Reagent used in the protocol.
- Use
- To corroborate the results suggested by DE analysis of scRNA-seq data, we analyzed the in situ co-localization of putatively DE genes and cell type marker genes using RNAscope (Advanced Cell Diagnostics). Lists of putatively DE genes were obtained for representative single-cell and pseudobulk DE methods (the Wilcox...
Discussion
Our results demonstrate that single-cell DE methods are poised to produce false discoveries. This understanding uncovers an enormous risk for the field. Our findings suggest that many published findings may be false. Moreover, if left unresolved, substantial research funding may be allocated to follow up on these fa...
Impact of mean expression
We initially hypothesized that differences between single-cell DE analysis methods could be attributed to their differing sensitivities towards lowly expressed genes. To explore this hypothesis, we performed the following analyses. First, we divided genes from the eighteen gold standard datasets into three equally s...
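The binning step described above can be sketched as follows. This is a minimal illustration, assuming the truncated sentence refers to three equally sized groups of genes ranked by mean expression; the function name `split_into_terciles`, the bin labels, and the toy gene names are hypothetical and do not come from the paper.

```python
from statistics import mean


def split_into_terciles(expression):
    """Group genes into three equally sized bins by mean expression.

    `expression` maps a gene name to its per-cell expression values.
    Returns a dict with the (assumed) labels "low", "medium", "high".
    """
    # Rank genes from lowest to highest mean expression.
    ranked = sorted(expression, key=lambda gene: mean(expression[gene]))
    n = len(ranked)
    cut1, cut2 = n // 3, 2 * n // 3
    return {
        "low": ranked[:cut1],
        "medium": ranked[cut1:cut2],
        "high": ranked[cut2:],
    }


# Toy data: six hypothetical genes measured in four cells each.
toy = {
    "GeneA": [0, 0, 1, 0],
    "GeneB": [1, 0, 1, 0],
    "GeneC": [2, 2, 2, 2],
    "GeneD": [3, 4, 2, 3],
    "GeneE": [10, 9, 11, 10],
    "GeneF": [20, 21, 19, 20],
}
bins = split_into_terciles(toy)
```

Each DE method could then be scored separately within `bins["low"]`, `bins["medium"]`, and `bins["high"]` to check whether performance depends on expression level.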
Mixed models
Having established that the performance of DE methods is contingent on their ability to account for biological replicates, we asked why mixed models failed to match the performance of pseudobulk methods. In addition to the linear mixed model described above, we implemented generalized linear mixed models (GLMMs) bas...
Differential expression analysis methods
Software used for acquisition, scoring, statistics, or reporting.
- Use
- The seven single-cell methods analyzed here included a t-test, a Wilcoxon rank-sum test, logistic regression, negative binomial and Poisson generalized linear models, a likelihood ratio test, and the two-part hurdle model implemented by MAST. The implementation provided in the Seurat function 'FindMarkers...
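To make one of the listed single-cell methods concrete, here is a rough sketch of a two-sided Wilcoxon rank-sum test applied to one gene, with every cell treated as an independent observation. It is not the Seurat implementation: it assumes the normal approximation with a tie correction and no continuity correction, and the function name `rank_sum_test` and the toy values are hypothetical.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation, tie-corrected).

    Returns (U statistic for x, p-value). No continuity correction is applied.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # Sort pooled values while remembering each value's original position.
    pooled = sorted((value, idx) for idx, value in enumerate(list(x) + list(y)))
    ranks = [0.0] * n
    tie_term = 0.0
    start = 0
    while start < n:
        end = start
        while end + 1 < n and pooled[end + 1][0] == pooled[start][0]:
            end += 1
        avg_rank = (start + end) / 2 + 1  # mean of the tied 1-based ranks
        for k in range(start, end + 1):
            ranks[pooled[k][1]] = avg_rank
        t = end - start + 1
        tie_term += t**3 - t
        start = end + 1
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    var = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var == 0:  # all values identical: no evidence of a difference
        return u1, 1.0
    z = (u1 - mu) / math.sqrt(var)
    return u1, math.erfc(abs(z) / math.sqrt(2))


# Toy expression values for one gene in control and treated cells.
control_cells = [0, 0, 1, 0, 2, 1]
treated_cells = [3, 2, 4, 5, 3, 6]
u_stat, p_value = rank_sum_test(control_cells, treated_cells)
```

Because the test sees only cells, it has no notion of which biological replicate each cell came from, which is the property the source examines when comparing single-cell and pseudobulk methods.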
Dissecting pseudobulk DE methods
Software used for acquisition, scoring, statistics, or reporting.
- Use
- These experiments led us to suspect that discarding information about the inherent variability of biological replicates caused both the bias towards highly expressed genes and the attendant decrease in performance. To test this hypothesis, we compared the variance of gene expression in pseudobulks and pseudo-replica...
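The comparison described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: pseudobulks are formed by summing counts within each replicate, and pseudo-replicates are assumed to be formed by randomly reassigning cells to replicate labels (the source sentence is truncated, so the exact construction is not confirmed). The helper `pseudobulk` and all toy values are hypothetical.

```python
import random
from statistics import variance


def pseudobulk(counts, labels):
    """Sum one gene's per-cell counts within each replicate label."""
    totals = {}
    for count, label in zip(counts, labels):
        totals[label] = totals.get(label, 0) + count
    return totals


rng = random.Random(0)

# Toy data: three replicates of 50 cells, each with a different baseline level.
replicate_baseline = {"rep1": 2, "rep2": 10, "rep3": 20}
labels, counts = [], []
for rep, base in replicate_baseline.items():
    for _ in range(50):
        labels.append(rep)
        counts.append(base + rng.randint(0, 2))

# True pseudobulks keep each cell with its replicate of origin.
true_bulk = pseudobulk(counts, labels)

# Assumed pseudo-replicates: the same cells with shuffled replicate labels.
shuffled_labels = labels[:]
rng.shuffle(shuffled_labels)
pseudo_rep = pseudobulk(counts, shuffled_labels)

var_true = variance(list(true_bulk.values()))
var_shuffled = variance(list(pseudo_rep.values()))
```

In this toy setting `var_true` is much larger than `var_shuffled`, because shuffling averages away the between-replicate differences while leaving the total count unchanged.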
Before you run
What should be confirmed before execution?
First confirmation
Species or subject information is missing.
Confirm before execution
Equipment is listed but no product mappings are linked.
Confirm before execution
This page is backed by a publishable Replication Data Ledger package with zero critical source-verification issues.
Confirm before execution
Open the source paper before finalizing run-specific details.
Procurement checkpoint
Use source-stated vendors where present. Treat mapped products as sourcing options unless the page marks an exact source match.
Step-by-step procedure
What do I do, in order?
Application to spinal cord injury
Experiments were conducted on adult male or female C57BL/6 mice (15-35 g body weight, 12-30 weeks of age). Vglut2:Cre (Jackson Laboratory 016963) transgenic mice were used and maintained on a mixed genetic background (129/C57BL/6). Housing, surgery, behavioral experiments and euthanasia were performed in compliance with the Swiss Veterinary Law guidelines. Animal care, including manual bladder voiding, was performed twice daily for the first 3 weeks after injury and once daily for the remaining post-injury period. All procedures and surgeries were approved by the Veterinary Office of the Canton of Geneva (Switzerland; GE/57/20 A).
Surgical procedures and post-surgical care
Surgical procedures were performed as previously described. Briefly, a laminectomy was made at the mid-thoracic level (T9 vertebra). We performed a contusion injury using a force-controlled spinal cord impactor (IH-0400 Impactor, Precision Systems and Instrumentation LLC, USA), as previously described. The applied force was set to 90 kdyn. Analgesia (buprenorphine, Essex Chemie AG, Switzerland, 0.01-0.05 mg per kg, s.c.) was provided for three days after surgery.
Mixed models
Having established that the performance of DE methods is contingent on their ability to account for biological replicates, we asked why mixed models failed to match the performance of pseudobulk methods. In addition to the linear mixed model described above, we implemented generalized linear mixed models (GLMMs) based on the negative binomial or Poisson distributions, adapting implementations provided in the 'muscat' R package. For each of these models, we evaluated the impact of incorporating the library size factors as an offset term, and compared the Wald test of model coefficients to a likelihood ratio test against a reduced model, yielding a total of four GLMMs from each distribution. The enormous computational requirements of the GLMMs prevented us from evaluating these models in the full ground truth datasets; instead, we analyzed a series of downsampled datasets,...
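The two design choices named in this step, a library-size offset and the Wald test versus the likelihood ratio test, can be shown on a much simpler model. The sketch below is a fixed-effects two-group Poisson GLM, not a GLMM: it has no random effect for biological replicate and does not reproduce the 'muscat' implementations. It is used here only because its maximum-likelihood estimates have a closed form. The function name and toy data are hypothetical.

```python
import math


def poisson_two_group_tests(counts, lib_sizes, groups):
    """Wald and likelihood ratio tests for a two-group Poisson GLM.

    Model: count ~ Poisson(lib_size * exp(b0 + b1 * group)), so log(lib_size)
    enters as an offset. `groups` holds 0 or 1 per observation.
    Fixed effects only: there is no random effect for replicate.
    """
    y = [0.0, 0.0]  # total counts per group
    s = [0.0, 0.0]  # total library size per group
    for count, size, group in zip(counts, lib_sizes, groups):
        y[group] += count
        s[group] += size
    if min(y) <= 0:
        raise ValueError("each group needs at least one count")
    rates = [y[0] / s[0], y[1] / s[1]]        # MLE of the rate in each group
    pooled = (y[0] + y[1]) / (s[0] + s[1])    # MLE under the reduced model
    # Wald test: estimated coefficient divided by its standard error.
    log_fc = math.log(rates[1] / rates[0])
    se = math.sqrt(1 / y[0] + 1 / y[1])
    wald_p = math.erfc(abs(log_fc / se) / math.sqrt(2))
    # Likelihood ratio test against the reduced (no group effect) model.
    lrt = 2 * (y[0] * math.log(rates[0] / pooled)
               + y[1] * math.log(rates[1] / pooled))
    lrt = max(lrt, 0.0)
    lrt_p = math.erfc(math.sqrt(lrt / 2))     # chi-square tail, 1 d.f.
    return {"log_fc": log_fc, "wald_p": wald_p, "lrt_stat": lrt, "lrt_p": lrt_p}


# Toy data: one gene, three observations per condition.
result = poisson_two_group_tests(
    counts=[5, 7, 6, 20, 24, 22],
    lib_sizes=[1000, 1200, 1100, 1000, 1200, 1100],
    groups=[0, 0, 0, 1, 1, 1],
)
```

Dropping the offset would amount to setting every library size to 1, which is one way to see what the offset contributes. A mixed model would add a per-replicate random effect on top of this structure, and fitting it requires iterative numerical methods rather than the closed form used here.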
Measurement outputs
What raw and processed outputs should exist?
This result raised the possibility that the aggregation procedure itself was directly responsible for the superiority of pseudobulk methods. To evaluate this notion, we applied...
- Raw artifact
- Per-sample or per-animal endpoint measurements collected during the experiment
- Processed artifact
- Structured table with cleaned measurements ready for comparison
- Reported as
- Summary statistics and between-group or across-timepoint comparisons
We sought to understand the common factors that could explain the decreased performance of pseudobulk methods in these two experiments. We recognized that both experiments entai...
- Raw artifact
- Per-sample or per-animal endpoint measurements collected during the experiment
- Processed artifact
- Structured table with cleaned measurements ready for comparison
- Reported as
- Summary statistics and between-group or across-timepoint comparisons
Single-cell RNA-seq (scRNA-seq) enables the quantification of RNA abundance at the resolution of individual cells. The maturation of single-cell technologies now enables large-...
- Raw artifact
- Per-sample or per-animal endpoint measurements collected during the experiment
- Processed artifact
- Structured table with cleaned measurements ready for comparison
- Reported as
- Summary statistics and between-group or across-timepoint comparisons
These divergences emphasize the importance of developing a sound epistemological foundation for differential expression in single-cell data. In this work, we reasoned that deve...
- Raw artifact
- Per-sample or per-animal endpoint measurements collected during the experiment
- Processed artifact
- Structured table with cleaned measurements ready for comparison
- Reported as
- Summary statistics and between-group or across-timepoint comparisons
Analysis plan
How should the outputs become interpretable results?
Acquisition
Collect raw experimental outputs with enough metadata to preserve sample identity, condition, and timing.
inferred from protocol
Preprocessing / cleaning
Single-cell RNA-seq (scRNA-seq) enables the quantification of RNA abundance at the resolution of individual cells.
from paper
Scoring or quantification
Quantify the primary readouts for this experiment:
- This result raised the possibility that the aggregation procedure itself was directly responsible for the superiority of pseudobulk methods. To evaluate this notion, we applied...
- We sought to understand the common factors that could explain the decreased performance of pseudobulk methods in these two experiments. We recognized that both experiments entai...
- Single-cell RNA-seq (scRNA-seq) enables the quantification of RNA abundance at the resolution of individual cells. The maturation of single-cell technologies now enables large-...
- These divergences emphasize the importance of developing a sound epistemological foundation for differential expression in single-cell data. In this work, we reasoned that deve...
from paper
Statistical comparison
- Single-cell RNA-seq (scRNA-seq) enables the quantification of RNA abundance at the resolution of individual cells. The maturation of single-cell technologies now enables large-...
- These divergences emphasize the importance of developing a sound epistemological foundation for differential expression in single-cell data. In this work, we reasoned that deve...
- We aimed to compare available statistical methods for differential expression (DE) analysis based on their ability to generate biologically accurate results. We reasoned that pe...
- We selected a total of fourteen DE methods, representing the most widely used statistical approaches for single-cell transcriptomics, to compare (Methods, "Differential ex...
from paper
Reporting output
Report representative outputs alongside summary comparisons for:
- This result raised the possibility that the aggregation procedure itself was directly responsible for the superiority of pseudobulk methods. To evaluate this notion, we applied...
- We sought to understand the common factors that could explain the decreased performance of pseudobulk methods in these two experiments. We recognized that both experiments entai...
- Single-cell RNA-seq (scRNA-seq) enables the quantification of RNA abundance at the resolution of individual cells. The maturation of single-cell technologies now enables large-...
- These divergences emphasize the importance of developing a sound epistemological foundation for differential expression in single-cell data. In this work, we reasoned that deve...
inferred from protocol
Structured statistical methods
- Single-cell RNA-seq (scRNA-seq) enables the quantification of RNA abundance at the resolution of individual cells. The maturation of single-cell technologies now enables large-...
- These divergences emphasize the importance of developing a sound epistemological foundation for differential expression in single-cell data. In this work, we reasoned that deve...
- We aimed to compare available statistical methods for differential expression (DE) analysis based on their ability to generate biologically accurate results. We reasoned that pe...
- We selected a total of fourteen DE methods, representing the most widely used statistical approaches for single-cell transcriptomics, to compare (Methods, "Differential ex...
source structured
Source and audit
What supports the facts on this page?
Evidence quotes (3)
Experiments were conducted on adult male or female C57BL/6 mice (15-35 g body weight, 12-30 weeks of age). Vglut2:Cre (Jackson Laboratory 016963) transgenic mice were used and maintained on a mixed genetic background (129/C57BL/6). Housing, surgery, behavioral experiments and euthanasia were performed in compliance with the Swiss Veterinary Law guidelines. Animal care, including manual bladder voiding, was performed twice daily for the first 3 weeks after injury and once daily for the remaining post-injury period. All procedures and surgeries were approved by the Veterinary Office of the Canton of Geneva (Switzerland; GE/57/20 A).
Surgical procedures were performed as previously described. Briefly, a laminectomy was made at the mid-thoracic level (T9 vertebra). We performed a contusion injury using a force-controlled spinal cord impactor (IH-0400 Impactor, Precision Systems and Instrumentation LLC, USA), as previously described. The applied force was set to 90 kdyn. Analgesia (buprenorphine, Essex Chemie AG, Switzerland, 0.01-0.05 mg per kg, s.c.) was provided for three days after surgery.
Having established that the performance of DE methods is contingent on their ability to account for biological replicates, we asked why mixed models failed to match the performance of pseudobulk methods. In addition to the linear mixed model described above, we implemented generalized linear mixed models (GLMMs) based on the negative binomial or Poisson distributions, adapting implementations provided in the 'muscat' R package. For each of these models, we evaluated the impact of incorporating the library size factors as an offset term, and compared the Wald test of model coefficients to a likelihood ratio test against a reduced model, yielding a total of four GLMMs from each distribution. The enormous computational requirements of the GLMMs prevented us from evaluating these models in the full ground truth datasets; instead, we analyzed a series of downsampled datasets, each containing between 25 and 1,000 cells. To quantify the computational resources required by each DE method, we monitored peak memory usage using the 'peakRAM' R package, and the base R function 'system.time' to record wall time.
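The resource monitoring described in this quote was done in R with 'peakRAM' and 'system.time'. A loosely analogous Python sketch is shown below, assuming `tracemalloc` as a stand-in for peak memory tracking (it reports only Python-level allocations, not total process memory) and `time.perf_counter` for wall time. The function `toy_de_method` is a hypothetical placeholder for a real DE method, and the intermediate size of 100 cells is an assumption; the source states only the range of 25 to 1,000 cells.

```python
import random
import time
import tracemalloc


def profile(func, *args):
    """Run func once and return (result, wall_seconds, peak_bytes)."""
    tracemalloc.start()
    start = time.perf_counter()
    try:
        result = func(*args)
        elapsed = time.perf_counter() - start
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, elapsed, peak


def toy_de_method(cells):
    """Hypothetical placeholder for a DE method: the mean of the sampled cells."""
    return sum(cells) / len(cells)


rng = random.Random(1)
all_cells = [rng.random() for _ in range(5000)]

# Downsample to a series of sizes within the 25 to 1,000 cell range.
timings = {}
for n_cells in (25, 100, 1000):
    subset = rng.sample(all_cells, n_cells)
    _, seconds, peak_bytes = profile(toy_de_method, subset)
    timings[n_cells] = (seconds, peak_bytes)
```

Recording both numbers per downsampled size makes it possible to see how cost grows with the number of cells before committing to a full-size run.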
Machine-readable layer
[
{
"@context": "https://schema.org",
"@type": "HowTo",
"name": "Confronting false discoveries in single-cell differential expression methods",
"description": "Evidence-backed execution summary for Confronting false discoveries in single-cell differential expression methods from Confronting false discoveries in single-cell differential expression.",
"totalTime": "PT72000M",
"step": [
{
"@type": "HowToStep",
"position": 1,
"name": "Application to spinal cord injury",
"text": "Experiments were conducted on adult male or female C57BL/6 mice (15-35 g body weight, 12-30 weeks of age). Vglut2:Cre (Jackson Laboratory 016963) transgenic mice were used and maintained on a mixed genetic background (129/C57BL/6). Housing, surgery, behavioral experiments and euthanasia were performed in compliance with the Swiss Veterinary Law guidelines. Animal care, including manual bladder voiding, was performed twice daily for the first 3 weeks after injury and once daily for the remaining post-injury period. All procedures and surgeries were approved by the Veterinary Office of the Canton of Geneva (Switzerland; GE/57/20 A)."
},
{
"@type": "HowToStep",
"position": 2,
"name": "Surgical procedures and post-surgical care",
"text": "Surgical procedures were performed as previously described, -. Briefly, a laminectomy was made at the mid-thoracic level (T9 vertebra). We performed a contusion injury using a force-controlled spinal cord impactor (IH-0400 Impactor, Precision Systems and Instrumentation LLC, USA ), as previously described,. The applied force was set to 90 kdyn. Analgesia (buprenorphine, Essex Chemie AG, Switzerland, 0.01-0.05 mg per kg, s.c.) was provided for three days after surgery."
},
{
"@type": "HowToStep",
"position": 3,
"name": "Mixed models",
"text": "Having established that the performance of DE methods is contingent on their ability to account for biological replicates, we asked why mixed models failed to match the performance of pseudobulk methods. In addition to the linear mixed model described above, we implemented generalized linear mixed models (GLMMs) based on the negative binomial or Poisson distributions, adapting implementations provided in the 'muscat' R package. For each of these models, we evaluated the impact of incorporating the library size factors as an offset term, and compared the Wald test of model coefficients to a likelihood ratio test against a reduced model, yielding a total of four GLMMs from each distribution. The enormous computational requirements of the GLMMs prevented us from evaluating these models in the full ground truth datasets; instead, we analyzed a series of downsampled datasets,..."
}
],
"tool": [
{
"@type": "HowToTool",
"name": "Discussion"
},
{
"@type": "HowToTool",
"name": "Impact of mean expression"
},
{
"@type": "HowToTool",
"name": "Mixed models"
}
],
"supply": [
{
"@type": "HowToSupply",
"name": "RNAscope"
}
],
"isBasedOn": {
"@type": "ScholarlyArticle",
"headline": "Confronting false discoveries in single-cell differential expression",
"datePublished": "2021",
"author": [
{
"@type": "Person",
"name": "Jordan W. Squair"
},
{
"@type": "Person",
"name": "Matthieu Gautier"
},
{
"@type": "Person",
"name": "Claudia Kathe"
},
{
"@type": "Person",
"name": "Mark A. Anderson"
},
{
"@type": "Person",
"name": "Nicholas D. James"
},
{
"@type": "Person",
"name": "Thomas H. Hutson"
},
{
"@type": "Person",
"name": "Rémi Hudelle"
},
{
"@type": "Person",
"name": "Taha Qaiser"
},
{
"@type": "Person",
"name": "Kaya J. E. Matson"
},
{
"@type": "Person",
"name": "Quentin Barraud"
},
{
"@type": "Person",
"name": "Ariel J. Levine"
},
{
"@type": "Person",
"name": "Gioele La Manno"
},
{
"@type": "Person",
"name": "Michael A. Skinnider"
},
{
"@type": "Person",
"name": "Grégoire Courtine"
}
],
"identifier": "10.1038/s41467-021-25960-2"
}
},
{
"@context": "https://schema.org",
"@type": "BreadcrumbList",
"itemListElement": [
{
"@type": "ListItem",
"position": 1,
"name": "Experiments",
"item": "https://replicatescience.com/experiments"
},
{
"@type": "ListItem",
"position": 2,
"name": "Confronting false discoveries in single-cell differential expression methods",
"item": "https://replicatescience.com/experiments/confronting-false-discoveries-in-single-cell-differential-expression-methods-jordan-w-squair-pmc8479118/confronting-false-discoveries-in-single-cell-differential-expression-mlph9h2r"
}
]
}
]