Difference between revisions of "HLab:Qualitymetrics"
From CCGB
Latest revision as of 15:09, 23 January 2017
ChIP-seq
ChIP-seq Guidelines from ENCODE (2012) [http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3431496/ article]
- FRIP
- Fraction of Reads in Peaks. A value greater than or equal to 0.01 is considered good.
- NSC
- Normalized Strand Coefficient: the normalized ratio between the fragment-length cross-correlation peak and the background cross-correlation. A value greater than or equal to 1.05 is considered good.
- RSC
- Relative Strand Correlation: the ratio between the fragment-length cross-correlation peak and the read-length cross-correlation peak. A value greater than or equal to 0.8 is considered good.
- Complexity
- Computed using Georgi's method. If there are more than 10M reads, it is computed on a sample of the reads. A value greater than or equal to 0.80 is considered good.
- Duplication level
- The duplication rate reported by FastQC. It is based on all reads, including those that do not map.
- Percent GC
- The percentage of nucleotides that are G or C, as reported by FastQC. It is based on all reads, including those that do not map.
- MAD of log ratios
- Mean absolute deviation of log ratios, Rafa's measure of replicate quality.
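The ChIP-seq ratios and cutoffs above can be sketched in a few lines of Python. This is only an illustration, not the lab's pipeline: the function names and the `THRESHOLDS` table are hypothetical, and the background subtraction in `rsc` follows the phantompeakqualtools convention, which this page does not state explicitly.

```python
# Illustrative sketch only; names are hypothetical, not the lab's actual scripts.

# Cutoffs quoted in the list above ("greater than or equal to" is considered good).
THRESHOLDS = {"FRIP": 0.01, "NSC": 1.05, "RSC": 0.8, "Complexity": 0.80}


def frip(reads_in_peaks: int, total_reads: int) -> float:
    """Fraction of reads that fall inside called peaks."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return reads_in_peaks / total_reads


def nsc(cc_fragment: float, cc_background: float) -> float:
    """Fragment-length cross-correlation divided by the background value."""
    return cc_fragment / cc_background


def rsc(cc_fragment: float, cc_read: float, cc_background: float) -> float:
    """Fragment-length peak over read-length peak.

    Assumption: the background is subtracted from both peaks first,
    as phantompeakqualtools does.
    """
    return (cc_fragment - cc_background) / (cc_read - cc_background)


def passes(metric: str, value: float) -> bool:
    """True if the value meets the cutoff listed for that metric."""
    return value >= THRESHOLDS[metric]


if __name__ == "__main__":
    value = frip(reads_in_peaks=150_000, total_reads=10_000_000)
    print("FRIP", value, "good" if passes("FRIP", value) else "low")
```

Keeping the cutoffs in one table makes it easy to flag every metric of a sample in the same way.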
RNA-seq
Standards, [http://encodeproject.org/ENCODE/protocols/dataStandards/ENCODE_RNAseq_Standards_V1.0.pdf Guidelines] and Best Practices for RNA-Seq V1.0 (June 2011)
- FRIT
- Fraction of Reads in Target. For the Total script, the target is anywhere in the gene.
- Percent rRNA
- Percentage of alignments that overlap rRNA annotations downloaded from UCSC.
- Number of expressed genes
- Count of genes with RSEM TPM greater than 1. Should be similar between replicates.
- Number of spike-ins
- Count of reads mapping to spike-ins.
- Duplication level
- The duplication rate reported by FastQC. It is based on all reads, including those that do not map.
- Percent GC
- The percentage of nucleotides that are G or C, as reported by FastQC. It is based on all reads, including those that do not map.
- Strand specificity
- Three numbers from RSeQC infer_experiment.py that describe the strandedness of the reads. The script infers how the RNA-seq experiment was configured, in particular how reads were stranded in strand-specific data, by comparing the reads' mapping information to the underlying gene model. For stranded experiments the numbers are expected to approach .99:.01:0 or .01:.99:0; for unstranded experiments, .50:.50:0.
- MAD of log ratios
- Mean absolute deviation of log ratios, Rafa's measure of replicate quality. Only genes that are non-zero in both replicates are considered.
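Three of the RNA-seq metrics above lend themselves to a short Python sketch. It is a rough illustration under stated assumptions rather than the lab's code: all function names are hypothetical, the MAD is read literally as the mean absolute deviation of log2 ratios around their mean (the page does not give the exact formula), and the 0.9 and 0.1 cutoffs used to label strandedness are arbitrary choices made for this example.

```python
# Illustrative sketch only; names, formula details and cutoffs are assumptions.
import math
from typing import Mapping, Sequence


def count_expressed(tpm_by_gene: Mapping[str, float], min_tpm: float = 1.0) -> int:
    """Number of genes whose TPM is strictly greater than min_tpm."""
    return sum(1 for tpm in tpm_by_gene.values() if tpm > min_tpm)


def mad_of_log_ratios(rep1: Sequence[float], rep2: Sequence[float]) -> float:
    """Mean absolute deviation of log2(rep1 / rep2).

    Only genes that are non-zero in both replicates are used.
    Assumption: deviations are taken around the mean log ratio.
    """
    ratios = [math.log2(a / b) for a, b in zip(rep1, rep2) if a > 0 and b > 0]
    if not ratios:
        raise ValueError("no gene is non-zero in both replicates")
    center = sum(ratios) / len(ratios)
    return sum(abs(r - center) for r in ratios) / len(ratios)


def classify_strandedness(first: float, second: float) -> str:
    """Label the first two infer_experiment.py fractions (hypothetical cutoffs)."""
    if max(first, second) >= 0.9:
        return "stranded"
    if abs(first - second) <= 0.1:
        return "unstranded"
    return "ambiguous"


if __name__ == "__main__":
    print(count_expressed({"geneA": 0.5, "geneB": 1.0, "geneC": 2.3}))
    print(mad_of_log_ratios([1, 2, 0, 4], [2, 2, 5, 1]))
    print(classify_strandedness(0.99, 0.01))
```

A lower MAD value indicates closer agreement between replicates, and the expressed-gene counts of two replicates can be compared directly with `count_expressed`.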
Return to HLab:Main