Optimization of an RNA-Seq Differential Gene Expression Analysis Depending on Biological Replicate Number and Library Size

Overview of attention for article published in Frontiers in Plant Science, February 2018

About this Attention Score

  • In the top 25% of all research outputs scored by Altmetric
  • High Attention Score compared to outputs of the same age (85th percentile)
  • High Attention Score compared to outputs of the same age and source (92nd percentile)

Mentioned by

  • 1 blog
  • 8 X users
  • 1 Redditor

Citations

  • 75 Dimensions

Readers on

  • 307 Mendeley
Title
Optimization of an RNA-Seq Differential Gene Expression Analysis Depending on Biological Replicate Number and Library Size
Published in
Frontiers in Plant Science, February 2018
DOI 10.3389/fpls.2018.00108
Authors

Sophie Lamarre, Pierre Frasse, Mohamed Zouine, Delphine Labourdette, Elise Sainderichin, Guojian Hu, Véronique Le Berre-Anton, Mondher Bouzayen, Elie Maza

Abstract

RNA-Seq is a widely used technology that allows efficient genome-wide quantification of gene expression, for example for differential expression (DE) analysis. After a brief review of the main issues, methods and tools related to the DE analysis of RNA-Seq data, this article focuses on the impact of both the replicate number and library size in such analyses. While the main drawback of previous relevant studies is the lack of generality, we conducted both an analysis of a two-condition experiment (with eight biological replicates per condition) to compare the results with previous benchmark studies, and a meta-analysis of 17 experiments with up to 18 biological conditions, eight biological replicates and 100 million (M) reads per sample. As a global trend, we concluded that the replicate number has a larger impact than the library size on the power of the DE analysis, except for low-expressed genes, for which both parameters seem to have the same impact. Our study also provides new insights for practitioners aiming to enhance their experimental designs. For instance, by analyzing both the sensitivity and specificity of the DE analysis, we showed that the optimal threshold to control the false discovery rate (FDR) is approximately 2^(-r), where r is the replicate number. Furthermore, we showed that the false positive rate (FPR) is rather well controlled by all three studied R packages: DESeq, DESeq2, and edgeR. We also analyzed the impact of both the replicate number and library size on gene ontology (GO) enrichment analysis. Interestingly, we concluded that increases in the replicate number and library size tend to enhance the sensitivity and specificity, respectively, of the GO analysis. Finally, we recommend to RNA-Seq practitioners the production of a pilot data set to strictly analyze the power of their experimental design, or the use of a public data set, which should be similar to the data set they will obtain.
For individuals working on tomato research, on the basis of the meta-analysis, we recommend at least four biological replicates per condition and 20 M reads per sample to be almost sure of obtaining about 1000 DE genes if they exist.
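
As a rough illustration of the threshold mentioned in the abstract, the sketch below tabulates the value for a few replicate numbers. It assumes the threshold is read as 2 raised to the power of minus r; the function name `fdr_threshold` is hypothetical and is not taken from the article or from any of the R packages it studies.

```python
def fdr_threshold(replicates: int) -> float:
    """Approximate FDR threshold 2**(-r) for r biological replicates.

    Assumption: this only evaluates the rule of thumb quoted in the
    abstract; it does not reproduce the authors' analysis.
    """
    if replicates < 1:
        raise ValueError("replicate number must be a positive integer")
    return 2.0 ** (-replicates)


# Tabulate the threshold for a few replicate numbers.
for r in (2, 3, 4, 8):
    print(f"r = {r}: FDR threshold ~ {fdr_threshold(r):.4f}")
```

With four replicates per condition, the minimum the abstract recommends for tomato research, this gives a threshold of 0.0625.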

X Demographics

The data shown below were collected from the profiles of 8 X users who shared this research output.
Mendeley readers

The data shown below were compiled from readership statistics for 307 Mendeley readers of this research output.

Geographical breakdown

Country    Count    As %
Unknown    307      100%

Demographic breakdown

Readers by professional status                  Count    As %
Student > Ph. D. Student                        72       23%
Researcher                                      52       17%
Student > Master                                46       15%
Student > Doctoral Student                      22       7%
Student > Bachelor                              17       6%
Other                                           27       9%
Unknown                                         71       23%

Readers by discipline                           Count    As %
Agricultural and Biological Sciences            93       30%
Biochemistry, Genetics and Molecular Biology    79       26%
Medicine and Dentistry                          12       4%
Neuroscience                                    9        3%
Immunology and Microbiology                     7        2%
Other                                           31       10%
Unknown                                         76       25%

Attention Score in Context

This research output has an Altmetric Attention Score of 12. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 21 April 2021.

  • All research outputs: #2,658,180 of 23,577,654 outputs
  • Outputs from Frontiers in Plant Science: #1,197 of 21,636 outputs
  • Outputs of similar age: #64,116 of 448,969 outputs
  • Outputs of similar age from Frontiers in Plant Science: #37 of 468 outputs

Altmetric has tracked 23,577,654 research outputs across all sources so far. Compared to these, this one has done well and is in the 88th percentile: it's in the top 25% of all research outputs ever tracked by Altmetric.
So far Altmetric has tracked 21,636 research outputs from this source. They receive a mean Attention Score of 3.9. This one has done particularly well, scoring higher than 94% of its peers.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 448,969 tracked outputs that were published within six weeks on either side of this one in any source. This one has done well, scoring higher than 85% of its contemporaries.
We're also able to compare this research output to 468 others from the same source and published within six weeks on either side of this one. This one has done particularly well, scoring higher than 92% of its contemporaries.
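
The percentile figures quoted above can be checked against the ranks with simple arithmetic. The sketch below is an assumption about how the percentiles relate to the ranks, not Altmetric's documented method: it takes the share of outputs ranked below this one and rounds down. The helper name `percentile_from_rank` is hypothetical.

```python
def percentile_from_rank(rank: int, total: int) -> int:
    """Percentage of outputs ranked below `rank`, rounded down.

    Assumption: rank 1 is the highest-scoring output and the percentile
    is floor(100 * (total - rank) / total). Altmetric's exact formula
    is not stated on this page.
    """
    if not 1 <= rank <= total:
        raise ValueError("rank must lie between 1 and total")
    return (total - rank) * 100 // total


# Ranks and totals as quoted on this page.
contexts = {
    "All research outputs": (2_658_180, 23_577_654),
    "Outputs from Frontiers in Plant Science": (1_197, 21_636),
    "Outputs of similar age": (64_116, 448_969),
    "Similar age, same source": (37, 468),
}

for label, (rank, total) in contexts.items():
    print(f"{label}: {percentile_from_rank(rank, total)}")
```

Under this assumption the four contexts give 88, 94, 85 and 92, which matches the percentages stated in the text.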