Difference between revisions of "HLab:Qualitymetrics"

Latest revision as of 16:09, 23 January 2017

ChIP-seq

ChIP-seq Guidelines from ENCODE (2012) article: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3431496/

FRIP: Fraction of Reads in Peaks. A value greater than or equal to 0.01 is considered good.
NSC: Normalized Strand Coefficient, the normalized ratio between the fragment-length cross-correlation peak and the background cross-correlation. A value greater than or equal to 1.05 is considered good.
RSC: Relative Strand Correlation, the ratio between the fragment-length peak and the read-length peak. A value greater than or equal to 0.8 is considered good.
Complexity: Computed using Georgi's method; if there are more than 10M reads, it is computed on a sample of the reads. A value greater than or equal to 0.80 is considered good.
Duplication level: The duplication rate reported by FastQC. It is based on all reads, even those that do not map.
Percent GC: The percentage of nucleotides that are G or C, as reported by FastQC. It is based on all reads, even those that do not map.
MAD of log ratios: Mean absolute deviation of log ratios, Rafa's measure of replicate quality.
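
As a rough illustration of how the ratio-based metrics above relate to their cutoffs, here is a minimal Python sketch. It is not the lab's pipeline code. The function and parameter names are hypothetical, the FRIP denominator (total reads counted) is an assumption because the page does not state it, and the RSC function assumes the background cross-correlation is subtracted from both peaks before taking the ratio, which the page does not spell out.

```python
# Cutoffs copied from the ChIP-seq list above; metrics with no stated
# cutoff (duplication level, percent GC, MAD) are deliberately left out.
THRESHOLDS = {"FRIP": 0.01, "NSC": 1.05, "RSC": 0.8, "Complexity": 0.80}


def frip(reads_in_peaks, total_reads):
    """Fraction of Reads in Peaks (denominator choice is an assumption)."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return reads_in_peaks / total_reads


def nsc(cc_fragment, cc_background):
    """Fragment-length cross-correlation peak divided by the background."""
    return cc_fragment / cc_background


def rsc(cc_fragment, cc_read, cc_background):
    """Fragment-length peak over read-length peak.

    Assumption: the background is subtracted from both peaks first.
    """
    return (cc_fragment - cc_background) / (cc_read - cc_background)


def check_metrics(metrics):
    """Map each metric that has a cutoff to True (good) or False."""
    return {
        name: metrics[name] >= cutoff
        for name, cutoff in THRESHOLDS.items()
        if name in metrics
    }


if __name__ == "__main__":
    # Made-up example values, for illustration only.
    sample = {
        "FRIP": frip(20_000, 1_000_000),
        "NSC": nsc(0.5, 0.25),
        "RSC": rsc(0.5, 0.375, 0.25),
    }
    print(check_metrics(sample))
```

Keeping the cutoffs in one dictionary means a change to a threshold on this page only needs to be mirrored in one place.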

RNA-seq

Standards, Guidelines and Best Practices for RNA-Seq V1.0 (June 2011): http://encodeproject.org/ENCODE/protocols/dataStandards/ENCODE_RNAseq_Standards_V1.0.pdf

FRIT: Fraction of Reads in Target; for Total script this is anywhere in the gene.
Percent rRNA: Percentage of the alignments that overlap rRNA, using the rRNA annotation downloaded from UCSC.
Number of expressed genes: Count of genes with RSEM TPM greater than 1. Should be similar between replicates.
Number of spike-ins: Count of reads mapping to spike-ins.
Duplication level: The duplication rate reported by FastQC. It is based on all reads, even those that do not map.
Percent GC: The percentage of nucleotides that are G or C, as reported by FastQC. It is based on all reads, even those that do not map.
Strand specificity: Three numbers from RSeQC infer_experiment.py that describe the strandedness of the reads. The script infers how the RNA-seq sequencing was configured, in particular how reads were stranded for strand-specific RNA-seq data, by comparing the reads' mapping information to the underlying gene model. For stranded experiments the numbers are expected to approach .99:.01:0 or .01:.99:0; for unstranded experiments, .50:.50:0.
MAD of log ratios: Mean absolute deviation of log ratios, Rafa's measure of replicate quality. Only looks at genes that are non-zero in both replicates.
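
Three of the RNA-seq metrics above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the lab's actual scripts. All function names are hypothetical. The strandedness cutoffs (0.9, and 0.4 to 0.6) are arbitrary illustrative values, since the page only gives the values the numbers should approach. The page does not give a formula for the MAD, so the sketch assumes log2 ratios of replicate 1 over replicate 2, with deviations taken about the mean of those ratios.

```python
import math


def count_expressed(tpms, cutoff=1.0):
    """Number of genes with TPM strictly greater than the cutoff."""
    return sum(1 for tpm in tpms if tpm > cutoff)


def classify_strandedness(first, second, undetermined):
    """Label a triple of fractions such as .99:.01:0.

    The order of the three numbers follows the page; the cutoffs
    below are illustrative assumptions, not RSeQC behaviour.
    """
    if undetermined < 0.1 and max(first, second) >= 0.9:
        return "stranded"
    if 0.4 <= first <= 0.6 and 0.4 <= second <= 0.6:
        return "unstranded"
    return "ambiguous"


def mad_of_log_ratios(rep1, rep2):
    """Mean absolute deviation of log2(rep1 / rep2).

    Only genes that are non-zero in both replicates are used, as the
    page describes. Deviation about the mean is an assumption.
    """
    ratios = [math.log2(a / b) for a, b in zip(rep1, rep2) if a > 0 and b > 0]
    if not ratios:
        raise ValueError("no genes are non-zero in both replicates")
    center = sum(ratios) / len(ratios)
    return sum(abs(r - center) for r in ratios) / len(ratios)


if __name__ == "__main__":
    # Made-up values, for illustration only.
    print(count_expressed([0.5, 1.0, 1.5, 20.0]))
    print(classify_strandedness(0.99, 0.01, 0.0))
    print(mad_of_log_ratios([2.0, 8.0, 0.0, 4.0], [1.0, 2.0, 5.0, 4.0]))
```

A lower MAD indicates closer agreement between replicates; what counts as acceptable is not stated on this page.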

Return to HLab:Main

@@ Line 1: / Line 1: @@
-ChIP-seq
+== ChIP-seq ==
-*FRIP
+ChIP-seq Guidelines from ENCODE (2012) [http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3431496/ article]
-   Fraction of Reads in Peaks, a value of greater than or equal to 0.01 is considered good.
-*NSC
-   Normalized Strand Coefficient, The normalized ratio between the fragment-length cross-correlation peak and the background cross-correlation.  A value of greater than or equal to 1.05 is considered good.
-*RSC
-   Relative Strand Correlation, ratio between the fragment-length peak and the read-length peak. A value of greater than or equal to 0.8 is considered good.
-*Complexity
-   This is computed on a sampling of the reads if more than 10M, using Georgi's method. A value greater than or equal to 0.80 is considered good.
-**ChIP-seq Guidelines from ENCODE (2012) [http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3431496/ article]
-RNA-seq
+;FRIP
-*FRIT
+: Fraction of Reads in Peaks, a value of greater than or equal to 0.01 is considered good.
-   Fraction of Reads in Target, for Total script this is anywhere in the gene.
+;NSC
-*Percent rRNA
+: Normalized Strand Coefficient, The normalized ratio between the fragment-length cross-correlation peak and the background cross-correlation.  A value of greater than or equal to 1.05 is considered good.
-   Percent of the alignments that overlap rRNA as downloaded from UCSC.
+;RSC
-*Number of expressed genes
+: Relative Strand Correlation, ratio between the fragment-length peak and the read-length peak. A value of greater than or equal to 0.8 is considered good.
-*Number of spike-ins
+;Complexity
+: This is computed on a sampling of the reads if more than 10M, using Georgi's method. A value greater than or equal to 0.80 is considered good.
+;Duplication level
+: The duplication rate reported by FastQC.  It is based on all reads even those that do not map.
+;Percent GC
+: The percent of nucleotides that are GC as reported by FastQC.  It is based on all reads even those that do not map.
+;MAD of log ratios
+: mean absolute deviation of log ratios, Rafa's measure for replicate quality
+== RNA-seq ==
+Standards, [http://encodeproject.org/ENCODE/protocols/dataStandards/ENCODE_RNAseq_Standards_V1.0.pdf Guidelines] and Best Practices for RNA-Seq V1.0 (June 2011)
+;FRIT
+: Fraction of Reads in Target, for Total script this is anywhere in the gene.
+;Percent rRNA
+: Percent of the alignments that overlap rRNA as downloaded from UCSC.
+;Number of expressed genes
+: Count of genes with RSEM TPM greater than 1.  Should be similar between replicates.
+;Number of spike-ins
+: Count of reads mapping to spike-ins.
+;Duplication level
+: The duplication rate reported by FastQC.  It is based on all reads even those that do not map.
+;Percent GC
+: The percent of nucleotides that are GC as reported by FastQC.  It is based on all reads even those that do not map.
+;Strand specificity
+: Three numbers from RSeQC infer_experiment.py that describe the strandedness of the reads. It speculates how RNA-seq sequencing were configured, especially how reads were stranded for strand-specific RNA-seq data, through comparing reads’ mapping information to the underneath gene model. For stranded experiments the numbers are expected to approach .99:.01:0, or .01:.99:0 for unstranded .50:.50:0.
+;MAD of log ratios
+: mean absolute deviation of log ratios, Rafa's measure of replicate quality. Only looks at genes that are non-zero in both replicates.
+Return to [[HLab:Main]]