02 · Shopping and prep

Shopping and prep list

What do I need before I start?

biological · source linked

Biological model pending

Subject model for the experiment.

Use: confirm full cohort details in the source paper

This result raised the possibility that the aggregation procedure itself was directly responsible for the superiority of pseudobulk methods. To evaluate this notion, we applied the aggregation procedure to random groups of cells, which produced a pseudobulk matrix composed of 'pseudo-replicates' (Fig. ). This experiment induced a similar decrease in the performance of pseudobulk methods, combined with the re-emergence of a bias towards highly expressed genes (Fig. and Supplementary Fig. ).

Confirm cohort

reagent · source linked

RNAscope

Reagent used in the protocol.

Use: To corroborate the results suggested by DE analysis of scRNA-seq data, we analyzed the in situ co-localization of putatively DE genes and cell type marker genes using RNAscope (Advanced Cell Diagnostics). Lists of putatively DE genes were obtained for representative single-cell and pseudobulk DE methods (the Wilcox...

source-linked evidence quote

Confirm item

instrument · source linked

Discussion

Use: Our results demonstrate that single-cell DE methods are poised to produce false discoveries. This understanding uncovers an enormous risk for the field. Our findings suggest that many published findings may be false. Moreover, if left unresolved, substantial research funding may be allocated to follow up on these fa...

source-linked evidence quote

Confirm apparatus

instrument · source linked

Impact of mean expression

Use: We initially hypothesized that differences between single-cell DE analysis methods could be attributed to their differing sensitivities towards lowly expressed genes. To explore this hypothesis, we performed the following analyses. First, we divided genes from the eighteen gold standard datasets into three equally s...

source-linked evidence quote

Confirm apparatus
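
The quoted passage is cut off after "three equally s...", so the exact binning rule is not recoverable from this page. The sketch below assumes, as a guess, that genes are ranked by mean expression and split into three equally sized groups. The function name, the bin labels, and the gene names are hypothetical.

```python
def split_into_terciles(mean_expression):
    """Rank genes by mean expression and split them into three bins.

    mean_expression: dict mapping gene name -> mean expression value
    returns: dict with 'low', 'medium', 'high' lists of gene names
    Assumption: equal-sized bins by rank; the source quote is truncated.
    """
    ranked = sorted(mean_expression, key=mean_expression.get)
    n = len(ranked)
    cut1, cut2 = n // 3, (2 * n) // 3
    return {
        "low": ranked[:cut1],
        "medium": ranked[cut1:cut2],
        "high": ranked[cut2:],
    }


# Toy values, not taken from the paper.
means = {"GeneA": 0.2, "GeneB": 5.1, "GeneC": 1.3,
         "GeneD": 12.0, "GeneE": 0.7, "GeneF": 3.4}
terciles = split_into_terciles(means)
```

Performance metrics could then be computed separately per bin to see whether a method behaves differently for lowly and highly expressed genes.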

instrument · source linked

Mixed models

Use: Having established that the performance of DE methods is contingent on their ability to account for biological replicates, we asked why mixed models failed to match the performance of pseudobulk methods. In addition to the linear mixed model described above, we implemented generalized linear mixed models (GLMMs) bas...

source-linked evidence quote

Confirm apparatus

software · source linked

Discussion

Software used for acquisition, scoring, statistics, or reporting.

Use: Our results demonstrate that single-cell DE methods are poised to produce false discoveries. This understanding uncovers an enormous risk for the field. Our findings suggest that many published findings may be false. Moreover, if left unresolved, substantial research funding may be allocated to follow up on these fa...

source-linked evidence quote

Confirm software

software · source linked

Differential expression analysis methods

Software used for acquisition, scoring, statistics, or reporting.

Use: The seven single-cell methods analyzed here included a t-test, a Wilcoxon rank-sum test, logistic regression, negative binomial and Poisson generalized linear models, a likelihood ratio test, and the two-part hurdle model implemented by MAST. The implementation provided in the Seurat function 'FindMarkers'...

source-linked evidence quote

Confirm software

software · source linked

Dissecting pseudobulk DE methods

Software used for acquisition, scoring, statistics, or reporting.

Use: These experiments led us to suspect that discarding information about the inherent variability of biological replicates caused both the bias towards highly expressed genes and the attendant decrease in performance. To test this hypothesis, we compared the variance of gene expression in pseudobulks and pseudo-replica...

source-linked evidence quote

Confirm software

03 · Execution checks

Before you run

What should be confirmed before execution?

First confirmation

Species or subject information is missing.

Confirm before execution

Equipment is listed but no product mappings are linked.

Confirm before execution

This page is backed by a publishable Replication Data Ledger package with zero critical source-verification issues.

Confirm before execution

Open the source paper before finalizing run-specific details.

Procurement checkpoint

Use source-stated vendors where present. Treat mapped products as sourcing options unless the page marks an exact source match.

04 · Procedure

Step-by-step procedure

What do I do, in order?

01 · extracted step

1 evidence link

Application to spinal cord injury

Experiments were conducted on adult male or female C57BL/6 mice (15-35 g body weight, 12-30 weeks of age). Vglut2:Cre (Jackson Laboratory 016963) transgenic mice were used and maintained on a mixed genetic background (129/C57BL/6). Housing, surgery, behavioral experiments and euthanasia were performed in compliance with the Swiss Veterinary Law guidelines. Animal care, including manual bladder voiding, was performed twice daily for the first 3 weeks after injury and once daily for the remaining post-injury period. All procedures and surgeries were approved by the Veterinary Office of the Canton of Geneva (Switzerland; GE/57/20 A).

Needed: source paper and local SOP

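Timing: not specified as a step duration (the quoted "12-30 weeks" is the animals' age range; post-injury care ran twice daily for the first 3 weeks, then once daily)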

Conditions: Directly quoted from source evidence; verify all lab-specific constraints against the source paper before execution.

Output: This result raised the possibility that the aggregation procedure itself was directly responsible for the superiority of pseudobulk methods. To evaluate this notion, we applied...

02 · extracted step

1 evidence link

Surgical procedures and post-surgical care

Surgical procedures were performed as previously described. Briefly, a laminectomy was made at the mid-thoracic level (T9 vertebra). We performed a contusion injury using a force-controlled spinal cord impactor (IH-0400 Impactor, Precision Systems and Instrumentation LLC, USA), as previously described. The applied force was set to 90 kdyn. Analgesia (buprenorphine, Essex Chemie AG, Switzerland, 0.01-0.05 mg per kg, s.c.) was provided for three days after surgery.

Needed: source paper and local SOP

Timing: not specified

Conditions: Directly quoted from source evidence; verify all lab-specific constraints against the source paper before execution.

Output: This result raised the possibility that the aggregation procedure itself was directly responsible for the superiority of pseudobulk methods. To evaluate this notion, we applied...

03 · extracted step

1 evidence link

Mixed models

Having established that the performance of DE methods is contingent on their ability to account for biological replicates, we asked why mixed models failed to match the performance of pseudobulk methods. In addition to the linear mixed model described above, we implemented generalized linear mixed models (GLMMs) based on the negative binomial or Poisson distributions, adapting implementations provided in the 'muscat' R package. For each of these models, we evaluated the impact of incorporating the library size factors as an offset term, and compared the Wald test of model coefficients to a likelihood ratio test against a reduced model, yielding a total of four GLMMs from each distribution. The enormous computational requirements of the GLMMs prevented us from evaluating these models in the full ground truth datasets; instead, we analyzed a series of downsampled datasets,...

Needed: Mixed models

Timing: not specified

Conditions: Directly quoted from source evidence; verify all lab-specific constraints against the source paper before execution.

Output: This result raised the possibility that the aggregation procedure itself was directly responsible for the superiority of pseudobulk methods. To evaluate this notion, we applied...
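
The step above contrasts two design choices for each count model: library size factors as an offset versus no offset, and a Wald test versus a likelihood ratio test against a reduced model. The sketch below illustrates both choices for one gene using a plain two-group Poisson model with a closed-form fit. It deliberately omits the replicate-level random effect, so it is not a GLMM and does not reproduce the 'muscat'-based implementations the source describes. The function name and all inputs are hypothetical.

```python
import math


def poisson_offset_tests(counts, size_factors, is_treated):
    """Two-group Poisson model for one gene: count_i ~ Poisson(sf_i * rate_group).

    Passing all-ones size_factors corresponds to fitting without an offset.
    Assumption: fixed effects only (no random effect for replicate).
    """
    y = [0.0, 0.0]  # summed counts per group (0 = control, 1 = treated)
    s = [0.0, 0.0]  # summed size factors per group
    for count, sf, treated in zip(counts, size_factors, is_treated):
        y[int(treated)] += count
        s[int(treated)] += sf
    if min(y) <= 0:
        raise ValueError("both groups need a positive total count")

    rate = [y[0] / s[0], y[1] / s[1]]      # maximum-likelihood group rates
    pooled = (y[0] + y[1]) / (s[0] + s[1])  # rate under the reduced model

    # Wald test on the log rate ratio (the group coefficient).
    log_fc = math.log(rate[1] / rate[0])
    se = math.sqrt(1.0 / y[0] + 1.0 / y[1])
    wald_p = math.erfc(abs(log_fc / se) / math.sqrt(2.0))

    # Likelihood ratio test: full (two rates) vs reduced (one pooled rate).
    lrt_stat = 2.0 * (y[0] * math.log(rate[0] / pooled)
                      + y[1] * math.log(rate[1] / pooled))
    lrt_p = math.erfc(math.sqrt(max(lrt_stat, 0.0) / 2.0))  # chi-square, 1 df

    return {"log_fc": log_fc, "wald_p": wald_p, "lrt_p": lrt_p}


# Toy counts for one gene in four cells, without an offset (size factors = 1).
result = poisson_offset_tests([10, 12, 30, 28], [1.0] * 4, [0, 0, 1, 1])
```

The two tests usually agree closely for a well-behaved gene; they can diverge when counts are small, which is one reason to compare them.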

05 · Measurement

Measurement outputs

What raw and processed outputs should exist?

from paper

This result raised the possibility that the aggregation procedure itself was directly responsible for the superiority of pseudobulk methods. To evaluate this notion, we applied...

Raw artifact: Per-sample or per-animal endpoint measurements collected during the experiment
Processed artifact: Structured table with cleaned measurements ready for comparison
Reported as: Summary statistics and between-group or across-timepoint comparisons
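
The aggregation experiment quoted above can be pictured with a minimal sketch: counts are summed per biological replicate to form pseudobulks, and the same summation applied to shuffled labels yields 'pseudo-replicates'. The function names and toy data are hypothetical. The quote does not say whether shuffling was restricted to cells of one condition or cell type, so this sketch simply shuffles across all cells passed in.

```python
import random
from collections import defaultdict


def aggregate(counts, groups):
    """Sum per-cell count vectors within each group label.

    counts: list of per-cell lists (one count per gene)
    groups: list of group labels, one per cell
    returns: dict mapping group label -> summed count vector
    """
    n_genes = len(counts[0])
    sums = defaultdict(lambda: [0] * n_genes)
    for cell, label in zip(counts, groups):
        acc = sums[label]
        for g, value in enumerate(cell):
            acc[g] += value
    return dict(sums)


def pseudo_replicate_labels(replicates, seed=0):
    """Shuffle replicate labels across cells, keeping group sizes fixed."""
    shuffled = list(replicates)
    random.Random(seed).shuffle(shuffled)
    return shuffled


# Toy data: 6 cells x 2 genes from three hypothetical replicates.
counts = [[5, 0], [3, 1], [0, 4], [1, 6], [2, 2], [4, 3]]
replicates = ["r1", "r1", "r2", "r2", "r3", "r3"]

pseudobulk = aggregate(counts, replicates)
pseudo_reps = aggregate(counts, pseudo_replicate_labels(replicates, seed=1))
```

Both matrices contain the same total counts; only the assignment of cells to columns differs, which is what isolates the effect of real replicate structure.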

from paper

We sought to understand the common factors that could explain the decreased performance of pseudobulk methods in these two experiments. We recognized that both experiments entai...

Raw artifact: Per-sample or per-animal endpoint measurements collected during the experiment
Processed artifact: Structured table with cleaned measurements ready for comparison
Reported as: Summary statistics and between-group or across-timepoint comparisons
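
Elsewhere on this page the source is quoted as comparing the variance of gene expression in pseudobulks and pseudo-replicates. A minimal sketch of that kind of comparison follows, assuming raw aggregated counts and the per-gene sample variance; the source's normalization and exact statistic are not given here. The numbers are invented for illustration and are not results from the paper.

```python
from statistics import variance


def per_gene_variance(group_vectors):
    """Sample variance of each gene across aggregated groups.

    group_vectors: list of equal-length count vectors, one per group
    returns: list with one variance per gene
    """
    return [variance(gene_values) for gene_values in zip(*group_vectors)]


# Hypothetical aggregated counts (rows: groups, columns: genes).
true_replicates = [[8, 1], [1, 10], [6, 5]]
pseudo_replicates = [[5, 5], [5, 6], [5, 5]]

var_true = per_gene_variance(true_replicates)
var_pseudo = per_gene_variance(pseudo_replicates)

# A ratio well below 1 means the shuffled groups look far less variable
# than the real replicates for that gene.
ratio = [p / t for p, t in zip(var_pseudo, var_true)]
```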

from paper

Single-cell RNA-seq (scRNA-seq) enables the quantification of RNA abundance at the resolution of individual cells. The maturation of single-cell technologies now enables large-...

Raw artifact: Per-sample or per-animal endpoint measurements collected during the experiment
Processed artifact: Structured table with cleaned measurements ready for comparison
Reported as: Summary statistics and between-group or across-timepoint comparisons

from paper

These divergences emphasize the importance of developing a sound epistemological foundation for differential expression in single-cell data. In this work, we reasoned that deve...

Raw artifact: Per-sample or per-animal endpoint measurements collected during the experiment
Processed artifact: Structured table with cleaned measurements ready for comparison
Reported as: Summary statistics and between-group or across-timepoint comparisons

06 · Analysis

Analysis plan

How should the outputs become interpretable results?

Acquisition

Collect raw experimental outputs with enough metadata to preserve sample identity, condition, and timing.

inferred from protocol

Preprocessing / cleaning

Single-cell RNA-seq (scRNA-seq) enables the quantification of RNA abundance at the resolution of individual cells.

from paper

Scoring or quantification

Quantify the primary readouts for this experiment: This result raised the possibility that the aggregation procedure itself was directly responsible for the superiority of pseudobulk methods. To evaluate this notion, we applied...; We sought to understand the common factors that could explain the decreased performance of pseudobulk methods in these two experiments. We recognized that both experiments entai...; Single-cell RNA-seq (scRNA-seq) enables the quantification of RNA abundance at the resolution of individual cells. The maturation of single-cell technologies now enables large-...; These divergences emphasize the importance of developing a sound epistemological foundation for differential expression in single-cell data. In this work, we reasoned that deve....

from paper

Statistical comparison

Single-cell RNA-seq (scRNA-seq) enables the quantification of RNA abundance at the resolution of individual cells. The maturation of single-cell technologies now enables large-...; These divergences emphasize the importance of developing a sound epistemological foundation for differential expression in single-cell data. In this work, we reasoned that deve...; We aimed to compare available statistical methods for differential expression (DE) analysis based on their ability to generate biologically accurate results. We reasoned that pe...; We selected a total of fourteen DE methods, representing the most widely used statistical approaches for single-cell transcriptomics, to compare (Methods, "Differential ex...

from paper

Reporting output

Report representative outputs alongside summary comparisons for This result raised the possibility that the aggregation procedure itself was directly responsible for the superiority of pseudobulk methods. To evaluate this notion, we applied..., We sought to understand the common factors that could explain the decreased performance of pseudobulk methods in these two experiments. We recognized that both experiments entai..., Single-cell RNA-seq (scRNA-seq) enables the quantification of RNA abundance at the resolution of individual cells. The maturation of single-cell technologies now enables large-..., These divergences emphasize the importance of developing a sound epistemological foundation for differential expression in single-cell data. In this work, we reasoned that deve....

inferred from protocol

Structured statistical methods

source structured

07 · Source layer

Source and audit

What supports the facts on this page?

Source identity: available

Structured protocol: available

Methods evidence: available

Materials/equipment listed: available

Specific product links: needs review

Evidence quotes (3)

Experiments were conducted on adult male or female C57BL/6 mice (15-35 g body weight, 12-30 weeks of age). Vglut2:Cre (Jackson Laboratory 016963) transgenic mice were used and maintained on a mixed genetic background (129/C57BL/6). Housing, surgery, behavioral experiments and euthanasia were performed in compliance with the Swiss Veterinary Law guidelines. Animal care, including manual bladder voiding, was performed twice daily for the first 3 weeks after injury and once daily for the remaining post-injury period. All procedures and surgeries were approved by the Veterinary Office of the Canton of Geneva (Switzerland; GE/57/20 A).
Source method evidence

Surgical procedures were performed as previously described. Briefly, a laminectomy was made at the mid-thoracic level (T9 vertebra). We performed a contusion injury using a force-controlled spinal cord impactor (IH-0400 Impactor, Precision Systems and Instrumentation LLC, USA), as previously described. The applied force was set to 90 kdyn. Analgesia (buprenorphine, Essex Chemie AG, Switzerland, 0.01-0.05 mg per kg, s.c.) was provided for three days after surgery.
Source method evidence

Having established that the performance of DE methods is contingent on their ability to account for biological replicates, we asked why mixed models failed to match the performance of pseudobulk methods. In addition to the linear mixed model described above, we implemented generalized linear mixed models (GLMMs) based on the negative binomial or Poisson distributions, adapting implementations provided in the 'muscat' R package. For each of these models, we evaluated the impact of incorporating the library size factors as an offset term, and compared the Wald test of model coefficients to a likelihood ratio test against a reduced model, yielding a total of four GLMMs from each distribution. The enormous computational requirements of the GLMMs prevented us from evaluating these models in the full ground truth datasets; instead, we analyzed a series of downsampled datasets, each containing between 25 and 1,000 cells. To quantify the computational resources required by each DE method, we monitored peak memory usage using the 'peakRAM' R package, and the base R function 'system.time' to record wall time.
Source method evidence
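
The last quote describes benchmarking on downsampled datasets of 25 to 1,000 cells, with peak memory recorded via 'peakRAM' and wall time via 'system.time' in R. The sketch below is a rough Python analogue, not the source's code: it swaps in `time.perf_counter` for wall time and `tracemalloc` for peak memory, and `tracemalloc` only sees Python-level allocations rather than whole-process memory. The intermediate size of 100 cells and the `toy_method` stand-in are assumptions.

```python
import random
import time
import tracemalloc


def downsample(cells, n, seed=0):
    """Draw n cells without replacement, reproducibly."""
    return random.Random(seed).sample(cells, n)


def profile(func, *args):
    """Run func(*args) and return (result, wall seconds, peak traced bytes)."""
    tracemalloc.start()
    start = time.perf_counter()
    result = func(*args)
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, elapsed, peak


def toy_method(cells):
    """Hypothetical stand-in for a DE method: per-gene mean count."""
    n = len(cells)
    return [sum(column) / n for column in zip(*cells)]


# Synthetic dataset: 1,000 cells x 5 genes of random counts.
rng = random.Random(42)
cells = [[rng.randint(0, 10) for _ in range(5)] for _ in range(1000)]

report = []
for n_cells in (25, 100, 1000):
    _, seconds, peak_bytes = profile(toy_method, downsample(cells, n_cells))
    report.append((n_cells, seconds, peak_bytes))
```

Fixing the seed in `downsample` keeps the subsets identical across methods, so differences in time and memory reflect the method rather than the draw.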

Machine-readable layer

[
  {
    "@context": "https://schema.org",
    "@type": "HowTo",
    "name": "Confronting false discoveries in single-cell differential expression methods",
    "description": "Evidence-backed execution summary for Confronting false discoveries in single-cell differential expression methods from Confronting false discoveries in single-cell differential expression.",
    "totalTime": "PT72000M",
    "step": [
      {
        "@type": "HowToStep",
        "position": 1,
        "name": "Application to spinal cord injury",
        "text": "Experiments were conducted on adult male or female C57BL/6 mice (15-35 g body weight, 12-30 weeks of age). Vglut2:Cre (Jackson Laboratory 016963) transgenic mice were used and maintained on a mixed genetic background (129/C57BL/6). Housing, surgery, behavioral experiments and euthanasia were performed in compliance with the Swiss Veterinary Law guidelines. Animal care, including manual bladder voiding, was performed twice daily for the first 3 weeks after injury and once daily for the remaining post-injury period. All procedures and surgeries were approved by the Veterinary Office of the Canton of Geneva (Switzerland; GE/57/20 A)."
      },
      {
        "@type": "HowToStep",
        "position": 2,
        "name": "Surgical procedures and post-surgical care",
        "text": "Surgical procedures were performed as previously described, -. Briefly, a laminectomy was made at the mid-thoracic level (T9 vertebra). We performed a contusion injury using a force-controlled spinal cord impactor (IH-0400 Impactor, Precision Systems and Instrumentation LLC, USA ), as previously described,. The applied force was set to 90 kdyn. Analgesia (buprenorphine, Essex Chemie AG, Switzerland, 0.01-0.05 mg per kg, s.c.) was provided for three days after surgery."
      },
      {
        "@type": "HowToStep",
        "position": 3,
        "name": "Mixed models",
        "text": "Having established that the performance of DE methods is contingent on their ability to account for biological replicates, we asked why mixed models failed to match the performance of pseudobulk methods. In addition to the linear mixed model described above, we implemented generalized linear mixed models (GLMMs) based on the negative binomial or Poisson distributions, adapting implementations provided in the 'muscat' R package. For each of these models, we evaluated the impact of incorporating the library size factors as an offset term, and compared the Wald test of model coefficients to a likelihood ratio test against a reduced model, yielding a total of four GLMMs from each distribution. The enormous computational requirements of the GLMMs prevented us from evaluating these models in the full ground truth datasets; instead, we analyzed a series of downsampled datasets,..."
      }
    ],
    "tool": [
      {
        "@type": "HowToTool",
        "name": "Discussion"
      },
      {
        "@type": "HowToTool",
        "name": "Impact of mean expression"
      },
      {
        "@type": "HowToTool",
        "name": "Mixed models"
      }
    ],
    "supply": [
      {
        "@type": "HowToSupply",
        "name": "RNAscope"
      }
    ],
    "isBasedOn": {
      "@type": "ScholarlyArticle",
      "headline": "Confronting false discoveries in single-cell differential expression",
      "datePublished": "2021",
      "author": [
        {
          "@type": "Person",
          "name": "Jordan W. Squair"
        },
        {
          "@type": "Person",
          "name": "Matthieu Gautier"
        },
        {
          "@type": "Person",
          "name": "Claudia Kathe"
        },
        {
          "@type": "Person",
          "name": "Mark A. Anderson"
        },
        {
          "@type": "Person",
          "name": "Nicholas D. James"
        },
        {
          "@type": "Person",
          "name": "Thomas H. Hutson"
        },
        {
          "@type": "Person",
          "name": "Rémi Hudelle"
        },
        {
          "@type": "Person",
          "name": "Taha Qaiser"
        },
        {
          "@type": "Person",
          "name": "Kaya J. E. Matson"
        },
        {
          "@type": "Person",
          "name": "Quentin Barraud"
        },
        {
          "@type": "Person",
          "name": "Ariel J. Levine"
        },
        {
          "@type": "Person",
          "name": "Gioele La Manno"
        },
        {
          "@type": "Person",
          "name": "Michael A. Skinnider"
        },
        {
          "@type": "Person",
          "name": "Grégoire Courtine"
        }
      ],
      "identifier": "10.1038/s41467-021-25960-2"
    }
  },
  {
    "@context": "https://schema.org",
    "@type": "BreadcrumbList",
    "itemListElement": [
      {
        "@type": "ListItem",
        "position": 1,
        "name": "Experiments",
        "item": "https://replicatescience.com/experiments"
      },
      {
        "@type": "ListItem",
        "position": 2,
        "name": "Confronting false discoveries in single-cell differential expression methods",
        "item": "https://replicatescience.com/experiments/confronting-false-discoveries-in-single-cell-differential-expression-methods-jordan-w-squair-pmc8479118/confronting-false-discoveries-in-single-cell-differential-expression-mlph9h2r"
      }
    ]
  }
]
