HLab:Qualitymetrics

From CCGB

Latest revision as of 15:09, 23 January 2017

ChIP-seq

ChIP-seq Guidelines from ENCODE (2012): http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3431496/

FRIP
Fraction of Reads in Peaks. A value greater than or equal to 0.01 is considered good.
NSC
Normalized Strand Coefficient: the normalized ratio between the fragment-length cross-correlation peak and the background cross-correlation. A value greater than or equal to 1.05 is considered good.
RSC
Relative Strand Correlation: the ratio between the fragment-length cross-correlation peak and the read-length cross-correlation peak. A value greater than or equal to 0.8 is considered good.
Complexity
Library complexity, computed using Georgi's method. If there are more than 10M reads, it is computed on a sample of the reads. A value greater than or equal to 0.80 is considered good.
Duplication level
The duplication rate reported by FastQC. It is based on all reads, including those that do not map.
Percent GC
The percentage of nucleotides that are G or C, as reported by FastQC. It is based on all reads, including those that do not map.
MAD of log ratios
Mean absolute deviation of log ratios between replicates; Rafa's measure of replicate quality.
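
The ratio-based ChIP-seq metrics above can be sketched in a few lines of Python. This is only an illustration, not the lab's pipeline: the function names and the `CHIP_THRESHOLDS` table are hypothetical, the inputs (read counts and cross-correlation values) are assumed to come from an upstream peak caller and cross-correlation tool, and the RSC formula assumes the common convention of subtracting the background cross-correlation from both peaks before taking the ratio. Complexity is only thresholded here, since the page does not spell out how it is computed.

```python
# Minimal sketch of the ChIP-seq ratio metrics and their "good" cutoffs.
# All names here are hypothetical; inputs come from upstream tools.

# Cutoffs quoted on this page (value >= cutoff is considered good).
CHIP_THRESHOLDS = {"FRIP": 0.01, "NSC": 1.05, "RSC": 0.8, "Complexity": 0.80}


def frip(reads_in_peaks, total_reads):
    """Fraction of Reads in Peaks."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return reads_in_peaks / total_reads


def nsc(cc_fragment, cc_background):
    """Fragment-length cross-correlation peak over the background value."""
    if cc_background <= 0:
        raise ValueError("cc_background must be positive")
    return cc_fragment / cc_background


def rsc(cc_fragment, cc_read, cc_background):
    """Fragment-length peak over read-length peak.

    Assumption: both peaks are measured relative to the background.
    """
    if cc_read == cc_background:
        raise ValueError("read-length peak equals background")
    return (cc_fragment - cc_background) / (cc_read - cc_background)


def passes(metrics):
    """Map each known metric name to True if it meets its cutoff."""
    return {
        name: metrics[name] >= cutoff
        for name, cutoff in CHIP_THRESHOLDS.items()
        if name in metrics
    }


if __name__ == "__main__":
    # Made-up example values, for illustration only.
    example = {
        "FRIP": frip(250_000, 10_000_000),
        "NSC": nsc(0.30, 0.20),
        "RSC": rsc(0.30, 0.28, 0.20),
        "Complexity": 0.85,
    }
    print(example)
    print(passes(example))
```

Keeping the cutoffs in one table makes it easy to report all pass/fail flags for a sample in a single call.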

RNA-seq

Standards, Guidelines and Best Practices for RNA-Seq V1.0 (June 2011): http://encodeproject.org/ENCODE/protocols/dataStandards/ENCODE_RNAseq_Standards_V1.0.pdf

FRIT
Fraction of Reads in Target. For Total script this is anywhere in the gene.
Percent rRNA
Percentage of alignments that overlap rRNA, using the rRNA annotation downloaded from UCSC.
Number of expressed genes
Count of genes with RSEM TPM greater than 1. Should be similar between replicates.
Number of spike-ins
Count of reads mapping to spike-ins.
Duplication level
The duplication rate reported by FastQC. It is based on all reads, including those that do not map.
Percent GC
The percentage of nucleotides that are G or C, as reported by FastQC. It is based on all reads, including those that do not map.
Strand specificity
Three numbers from RSeQC infer_experiment.py that describe the strandedness of the reads. The tool infers how the RNA-seq library was configured, in particular how reads were stranded in strand-specific data, by comparing the reads' mapping information to the underlying gene model. For stranded experiments the numbers are expected to approach .99:.01:0 or .01:.99:0; for unstranded experiments, .50:.50:0.
MAD of log ratios
Mean absolute deviation of log ratios between replicates; Rafa's measure of replicate quality. Only genes that are non-zero in both replicates are considered.
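
Two of the RNA-seq metrics above are simple enough to sketch in Python. This is an illustration under stated assumptions, not the actual implementation: the function names are hypothetical, expression values are assumed to be per-gene dictionaries (for example RSEM TPM), the logarithm is assumed to be base 2, and deviations are assumed to be taken around the mean log ratio. The page itself specifies only the TPM greater than 1 cutoff and the restriction to genes that are non-zero in both replicates.

```python
import math


def expressed_gene_count(tpm_by_gene, cutoff=1.0):
    """Count genes whose TPM is strictly greater than the cutoff."""
    return sum(1 for tpm in tpm_by_gene.values() if tpm > cutoff)


def mad_of_log_ratios(rep1, rep2):
    """Mean absolute deviation of log2(rep1 / rep2) across shared genes.

    Only genes that are non-zero in both replicates are used.
    Assumptions: log base 2, deviations taken around the mean log ratio.
    """
    shared = [
        gene for gene in rep1
        if gene in rep2 and rep1[gene] > 0 and rep2[gene] > 0
    ]
    if not shared:
        raise ValueError("no genes are non-zero in both replicates")
    log_ratios = [math.log2(rep1[g] / rep2[g]) for g in shared]
    center = sum(log_ratios) / len(log_ratios)
    return sum(abs(r - center) for r in log_ratios) / len(log_ratios)


if __name__ == "__main__":
    # Made-up expression values, for illustration only.
    rep1 = {"geneA": 4.0, "geneB": 2.0, "geneC": 0.0, "geneD": 8.0}
    rep2 = {"geneA": 2.0, "geneB": 2.0, "geneC": 5.0, "geneD": 2.0}
    print(expressed_gene_count(rep1), expressed_gene_count(rep2))
    print(mad_of_log_ratios(rep1, rep2))
```

A smaller value means the replicates agree more closely; the expressed-gene counts of the two replicates can be compared directly, since they should be similar.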

Return to HLab:Main