7 Summary Page

This page gives you a brief overview of your experiment. Your project's unique ID is shown in a black box in the top-left corner. Alongside it, an orange box links to the sequencing quality control and statistics report generated by the pipeline.

The effective library size plot shows the number of reads aligned to genes for each sample, colored by treatment group. The same information is displayed in tabular form to the right of the plot.

Figure 7.1: Effective library size plot colored by experimental groups
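As a rough illustration of what the plot and its table summarize, the sketch below sums gene-level counts per sample. The count table, sample names, and group labels are made up for illustration, and the function name `library_sizes` is hypothetical. Whether the pipeline additionally scales these totals by normalization factors is not stated here, so the sketch does not.

```python
# Hypothetical gene-by-sample count table (values are invented).
counts = {
    "GeneA": {"ctrl_1": 120, "ctrl_2": 98, "treat_1": 310, "treat_2": 287},
    "GeneB": {"ctrl_1": 15, "ctrl_2": 22, "treat_1": 9, "treat_2": 14},
    "GeneC": {"ctrl_1": 860, "ctrl_2": 901, "treat_1": 790, "treat_2": 845},
}

# Hypothetical treatment group for each sample (used for coloring in the plot).
groups = {
    "ctrl_1": "control",
    "ctrl_2": "control",
    "treat_1": "treated",
    "treat_2": "treated",
}


def library_sizes(count_table):
    """Return the total number of reads assigned to genes for each sample."""
    sizes = {}
    for gene_counts in count_table.values():
        for sample, n in gene_counts.items():
            sizes[sample] = sizes.get(sample, 0) + n
    return sizes


sizes = library_sizes(counts)

# Print the tabular view: sample, group, reads aligned to genes.
for sample in sorted(sizes):
    print(f"{sample}\t{groups[sample]}\t{sizes[sample]}")
```

Samples whose totals differ sharply from the rest of their group are the ones worth checking in the quality control report.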

The gene-wise variation plot (BCV plot) shows the relationship between the biological coefficient of variation (BCV) for each gene and its expression. BCV is a measure of a gene's expression variation across all samples. Black dots represent the BCV for each gene. Hover your mouse over a dot to see the name of the gene. The red line shows how the average variation changes with expression, and the blue line shows the average variation for all genes across all expression levels. The blue line is a helpful metric for comparing your data with similar experiments. We tend to see cell culture RNA-seq experiments with an average variation of less than 0.2. Primary cultures and mouse models tend to range from ~0.2 to ~0.4, and patient samples tend to range from ~0.2 to ~0.6. If your experiment falls outside the range typical for its sample type, additional biological replicates may be needed to identify differentially expressed genes.

For a more in-depth discussion of the biological coefficient of variation, see the edgeR manual, chapter 2.8.2, or the paper on the topic by Davis McCarthy, Yunshun Chen, and Gordon Smyth (McCarthy, Chen, and Smyth 2012).
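To make the BCV concrete, here is a minimal sketch that estimates it for a single gene and compares an average BCV with the rough ranges quoted above. It assumes a negative binomial model in which variance = mean + dispersion × mean², with BCV equal to the square root of the dispersion. The method-of-moments estimate is a simplification chosen for illustration and is not the estimator edgeR uses. The function names and the replicate counts are hypothetical, and the counts are assumed to come from replicates of one group with comparable library sizes.

```python
import math
from statistics import mean, variance


def rough_bcv(replicate_counts):
    """Method-of-moments BCV estimate for one gene.

    Under the assumed model var = mu + phi * mu**2, the dispersion is
    phi = (var - mu) / mu**2 and BCV = sqrt(phi). Estimates below zero
    (less variation than Poisson noise alone) are clamped to zero.
    """
    mu = mean(replicate_counts)
    if mu == 0:
        return 0.0
    phi = max((variance(replicate_counts) - mu) / mu ** 2, 0.0)
    return math.sqrt(phi)


def typical_for(average_bcv):
    """List the sample types whose quoted range contains this average BCV."""
    matches = []
    if average_bcv < 0.2:
        matches.append("cell culture")
    if 0.2 <= average_bcv <= 0.4:
        matches.append("primary culture or mouse model")
    if 0.2 <= average_bcv <= 0.6:
        matches.append("patient samples")
    return matches or ["outside the typical ranges"]


# Invented counts for one gene across four replicates of the same group.
gene_bcv = rough_bcv([100, 120, 80, 110])
print(f"BCV for this gene: {gene_bcv:.3f}")
print("Typical for:", ", ".join(typical_for(gene_bcv)))
```

The range boundaries in `typical_for` are the approximate values from the text, so treat a result near a boundary as a hint rather than a verdict.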

Figure 7.2: Biological coefficient of variation plot.

Figure 7.2 is a typical BCV plot. You can expect to see higher variation for low-expressing genes and lower variation for mid- to high-expressing genes. A small proportion of genes is always highly variable.

Substantial deviations from Figure 7.2 indicate either technical problems with the samples or pre-sequencing issues such as poor RNA quality or an inconsistent sorting protocol (see Figure 7.3 below).

Figure 7.3: Unusual BCV plots

The P-value histogram shows at a glance how many genes were found significant in your study. The ideal histogram, shown in Figure 7.4, has a relatively even distribution of P-values with a spike at small P-values, indicating that a set of genes was strongly affected by the treatment. The FDR histogram should look similar, with a smaller spike at low values. It is not uncommon for low-powered studies to have all genes with FDR > 0.5, or even FDR equal to 1. This indicates a serious need for additional replicates or a new experimental design.

Figure 7.4: P-value histogram

For an extensive and visually appealing discussion of the subject, please refer to David Robinson's blog post.
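The link between the P-value and FDR histograms can be sketched in a few lines. The example assumes the FDR values are Benjamini-Hochberg adjusted P-values, which is an assumption about the pipeline rather than something stated on this page. The function names and the P-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value to the smallest so that adjusted
    # values never increase as the raw P-values decrease.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def histogram(values, bins=10):
    """Count values in equal-width bins over the interval [0, 1]."""
    counts = [0] * bins
    for v in values:
        # A value of exactly 1.0 goes into the last bin.
        counts[min(int(v * bins), bins - 1)] += 1
    return counts


# Invented P-values with no spike near zero, as in a low-powered study.
flat_pvalues = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
flat_fdr = benjamini_hochberg(flat_pvalues)

print("P-value bins:", histogram(flat_pvalues))
print("FDR bins:    ", histogram(flat_fdr))
```

With evenly spread P-values and no spike near zero, every adjusted value in this example ends up in the highest bin, which is the low-power pattern described above.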

Finally, besides the effective library size plot, the gene-wise variation plot, and the P-value/FDR histograms, this page also provides a table of the experimental covariates (conditions) that were used in the analysis.

References

McCarthy, Davis J., Yunshun Chen, and Gordon K. Smyth. 2012. “Differential expression analysis of multifactor RNA-Seq experiments with respect to biological variation.” Nucleic Acids Research 40 (10): 4288–97. doi:10.1093/nar/gks042.