Difference between revisions of "SLab:Run Processing"

From CCGB
Revision as of 12:12, 26 March 2010

454

rigs
  • schuster-flx1
  • schuster-flx2
  • schuster-flx3
  • schuster-flx4

on-rig processing

  • run directories are stored in /data
    • /data/YYYY_MM_DD/R_YYYY_MM_DD_HH_MM_SS_RIGNAME_OPERATOR_RUNNAME
  • when a run finishes processing, the rig calls the /usr/local/rig/bin/postAnalysisScript.sh script
    • rsyncs the run directory to s2:/zfs/md1k-4/data/sequencing/temp/454
    • connects to c1.persephone over ssh to submit a job
      • depending on the run, calls one of
        • c1.persephone:/usr/local/bin/submit-signalProcessing.sh
        • c1.persephone:/usr/local/bin/submit-fullProcessing.sh
    • status email is sent to 454pipeline@bx.psu.edu
    • our postAnalysisScript.sh is kept in the /home/adminrig/postAnalysisScript directory on each rig
      • kept under revision control with RCS
        •  % co -l postAnalysisScript.sh
        •  % vi postAnalysisScript.sh
        •  % ci -u postAnalysisScript.sh
      • Makefile in this directory installs our version into /usr/local/rig/bin
        •  % make install
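
The hand-off steps listed above can be sketched as a small shell function. This is only an illustration, not the contents of the real postAnalysisScript.sh: the `hand_off` and `run` functions, the `signal`/`full` mode switch, the `DRY_RUN` variable, and the assumption that the submit scripts take the run directory name as their only argument are all hypothetical. The hosts and paths are the ones named in the list.

```shell
# Illustrative sketch only; NOT the lab's actual postAnalysisScript.sh.
DEST="s2:/zfs/md1k-4/data/sequencing/temp/454"   # rsync target from the list above
SUBMIT_HOST="c1.persephone"

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# hand_off RUN_DIR MODE
# MODE is "signal" or "full" (a hypothetical switch for "depending on the run").
hand_off() {
    local run_dir="${1%/}" mode="$2" submit
    case "$mode" in
        signal) submit=/usr/local/bin/submit-signalProcessing.sh ;;
        full)   submit=/usr/local/bin/submit-fullProcessing.sh ;;
        *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac
    # No trailing slash on the source, so rsync copies the directory itself.
    run rsync -a "$run_dir" "$DEST/"
    # Assumption: the submit script takes the run directory name as its argument.
    run ssh "$SUBMIT_HOST" "$submit" "$(basename "$run_dir")"
}
```

Calling `DRY_RUN=1 hand_off /data/YYYY_MM_DD/R_... signal` prints the rsync and ssh commands without running them, which is a cheap way to check the paths before touching a real run.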

signal processing

  • signal processing for runs is performed on the persephone cluster
  • depending on the run, the job is submitted with qsub using one of
    • c1.persephone:/usr/local/bin/signalProcessing.qsub
    • c1.persephone:/usr/local/bin/fullProcessing.qsub
  • status email is sent to 454pipeline@bx.psu.edu

Before exiting, each signal processing job marks itself as finished by touching a file with the same name as the run directory:

/afs/bx.psu.edu/depot/data/schuster_lab/sequencing/temp/454/.processing_finished/RUN_DIR_NAME
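
As a rough sketch, the last step of a processing job could look like the function below. The `mark_finished` name and the `FINISHED_DIR` variable are hypothetical; the default path is the one given above, and it can be overridden for testing.

```shell
# Illustrative sketch; the function and variable names are assumptions.
FINISHED_DIR="${FINISHED_DIR:-/afs/bx.psu.edu/depot/data/schuster_lab/sequencing/temp/454/.processing_finished}"

# mark_finished RUN_DIR
# Creates an empty marker file named after the run directory.
mark_finished() {
    local run_dir="${1%/}"
    touch "$FINISHED_DIR/$(basename "$run_dir")"
}
```

The marker carries no content; only its name matters, so `touch` is enough.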

staging

A cron job on s2 checks the /zfs/md1k-4/data/sequencing/temp/454/.processing_finished directory once a minute for signal processing jobs that have finished. When it finds one, it moves the run directory to the staging directory:

/afs/bx.psu.edu/depot/data/schuster_lab/sequencing/staging/454/RUN_DIR

Once the run is in the staging directory, the files in the run directory are modified as needed so that they have the correct owner, group, and permissions.
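
A minimal sketch of one pass of such a staging job follows. It is not the actual cron script: the `stage_finished` function, the permission mode, and the choice to delete the marker after a successful move are assumptions made for illustration. Owner and group changes are left out because the correct values are not stated here.

```shell
# Illustrative sketch of one staging pass; NOT the actual cron job on s2.
# stage_finished FINISHED_DIR TEMP_DIR STAGING_DIR
stage_finished() {
    local finished="$1" temp="$2" staging="$3" marker name
    for marker in "$finished"/*; do
        [ -e "$marker" ] || continue       # empty directory: glob did not match
        name=$(basename "$marker")
        [ -d "$temp/$name" ] || continue   # marker without a run directory: skip
        # Move the run, normalise permissions (hypothetical mode), then clear
        # the marker so the run is not picked up again on the next pass.
        mv "$temp/$name" "$staging/$name" &&
            chmod -R u=rwX,go=rX "$staging/$name" &&
            rm -f "$marker"
    done
}
```

A crontab entry of the form `* * * * * /path/to/wrapper-script` would run a wrapper around this once a minute; the wrapper path is a placeholder. Note that `mv` between different filesystems copies the data and then deletes the source, so a large run can take longer than the one-minute interval and a real job would need some guard against overlapping passes.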

archive

To archive a run, move it into one of the archive folders: md1k-1, md1k-2, or md1k-3 on s3, or md1k-4, md1k-5, or md1k-6 on s2.

The /zfs/md1k-N/archive filesystem is compressed and exported read-only.

s3:/zfs/md1k-{1,2,3}/archive/sequencing/454/YYYY/YYYY_MM_DD/
s2:/zfs/md1k-{4,5,6}/archive/sequencing/454/YYYY/YYYY_MM_DD/
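
The destination for a given run can be derived from the pool number and the run directory name, as in the sketch below. The `archive_dest` function is hypothetical, and it assumes that the YYYY and YYYY_MM_DD components are taken from the date embedded in the run directory name (R_YYYY_MM_DD_...).

```shell
# Illustrative sketch; archive_dest is a hypothetical helper.
# archive_dest POOL_NUMBER RUN_DIR
# Prints HOST:/zfs/md1k-N/archive/sequencing/454/YYYY/YYYY_MM_DD/
archive_dest() {
    local pool="$1" name host year date_dir
    name=$(basename "$2")
    case "$pool" in
        1|2|3) host=s3 ;;
        4|5|6) host=s2 ;;
        *) echo "unknown archive pool: md1k-$pool" >&2; return 1 ;;
    esac
    # Assumption: the date comes from the run directory name.
    case "$name" in
        R_[0-9][0-9][0-9][0-9]_[0-9][0-9]_[0-9][0-9]_*) ;;
        *) echo "unexpected run directory name: $name" >&2; return 1 ;;
    esac
    year=$(printf '%s\n' "$name" | cut -d_ -f2)
    date_dir=$(printf '%s\n' "$name" | cut -d_ -f2-4)
    printf '%s:/zfs/md1k-%s/archive/sequencing/454/%s/%s/\n' \
        "$host" "$pool" "$year" "$date_dir"
}
```

Because the archive filesystem is exported read-only, the move itself presumably has to be run on s2 or s3 directly rather than from a client that mounts the export.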

After the run has been archived, the links in the following directory need to be updated:

/afs/bx.psu.edu/depot/data/schuster_lab/sequencing/archive/454
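
If these links are symbolic links named after the run directories (an assumption; the layout of the link directory is not described here), repointing one could be sketched as follows. The `update_archive_link` function is hypothetical, and the target path is left to the caller because the path under which the archive is visible from AFS clients is not stated.

```shell
# Illustrative sketch; assumes the links are symlinks named after run directories.
# update_archive_link LINK_DIR RUN_DIR_NAME TARGET
update_archive_link() {
    local link_dir="$1" name="$2" target="$3"
    # -s: symbolic link, -f: replace an existing link,
    # -n: do not follow an existing link that points at a directory
    #     (otherwise the new link would be created inside that directory).
    ln -sfn "$target" "$link_dir/$name"
}
```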

Illumina

systems
  • illumina-ga