(3 intermediate revisions by 2 users not shown) |
Line 1: |
− | = Resources =
| + | Current SGE documentation is at '''[[BX:SGE]]'''. |
− | * linne cluster
| |
− | ** [http://docs.sun.com/app/docs/doc/817-6117 Sun Grid Engine 6.0u6 User's Guide]
| |
− | * persephone cluster
| |
− | ** [http://wikis.sun.com/display/GridEngine/Using+Sun+Grid+Engine Sun Grid Engine 6.2u1 User's Guide]
| |
− | * both clusters
| |
− | ** [http://en.wikipedia.org/wiki/Sun_Grid_Engine Sun Grid Engine Wikipedia Page]
| |
− | ** [http://gridengine.sunsource.net/ Open Source Project Page]
| |
− | ** [http://www.sun.com/software/sge/ Commercial Project Page]
| |
| | | |
− | = Submitting a Job =
| + | Linne is running a much older version of SGE, which may not support all of the options available with the latest SGE. Documentation for the SGE running on Linne (6.0u6) can be found here: |
− | | + | * [http://docs.sun.com/app/docs/doc/817-6117 Sun Grid Engine 6.0u6 User's Guide] |
− | == Creating a Job Script ==
| |
− | | |
− | A job script is a standard shell script that contains the commands needed to run your job on the cluster. Sun Grid Engine uses the two character combination <code>#$</code> to specify arguments to <code>qsub</code>. Check the <code>qsub</code> documentation for a complete list of command line arguments.
| |
− | | |
− | <pre>
| |
− | #!/usr/bin/env bash
| |
− | #
| |
− | # Submit this job to the all.q queue
| |
− | #$ -q all.q
| |
− | #
| |
− | # Send email when the job begins, ends, is aborted or rescheduled, or is suspended
| |
− | #$ -m beas
| |
− | #
| |
− | # Email address to send email to
| |
− | #$ -M rico@bx.psu.edu
| |
− | #
| |
− | # Shell to use to run job
| |
− | #$ -S /bin/bash
| |
− | #
| |
− | # Send job standard output to $HOME/sge-out
| |
− | #$ -o $HOME/sge-out
| |
− | #
| |
− | # Send job standard error to $HOME/sge-out
| |
− | #$ -e $HOME/sge-out
| |
− | | |
− | SEQ_DIR=/afs/bx.psu.edu/depot/data/schuster_lab/sequencing
| |
− | RUN=100201_HWUSI-EAS610_0006
| |
− | RUN_DIR=$SEQ_DIR/archive/illumina/by_date/2010/2010_02_01/$RUN
| |
− | | |
− | cd $RUN_DIR/Data/Intensities/BaseCalls/GERALD_03-02-2010_rico
| |
− | wc -l s_6_sequence.txt
| |
− | </pre>
| |
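To make the <code>#$</code> convention concrete, the following sketch lists the <code>qsub</code> options embedded in a job script. It is only an illustration: the temporary file holds a cut-down copy of the example script, and the <code>sed</code> filter is an assumption about how one might inspect the directives, not something SGE itself provides.

```shell
#!/usr/bin/env bash
# Sketch: show which qsub options a job script carries in its "#$" lines.
# The script below is a shortened copy of the example job script,
# written to a temporary file so the sketch is self-contained.
job=$(mktemp)
cat > "$job" <<'EOF'
#!/usr/bin/env bash
#
# Submit this job to the all.q queue
#$ -q all.q
#$ -m beas
#$ -S /bin/bash
wc -l s_6_sequence.txt
EOF

# Keep only lines that start with "#$" and strip that prefix.
# Ordinary comments ("# ...") and commands are ignored.
directives=$(sed -n 's/^#\$[[:space:]]*//p' "$job")
printf '%s\n' "$directives"

rm -f "$job"
```

Each printed line corresponds to an argument that could otherwise be given on the <code>qsub</code> command line.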
− | | |
− | == Adding write permissions (only for persephone cluster) ==
| |
− | | |
− | When using the persephone cluster, you need to modify the permissions of any AFS directory that is receiving output.
| |
− | | |
− | Create the <code>~/sge-out</code> directory.
| |
− | <pre>
| |
− | % mkdir ~/sge-out
| |
− | </pre>
| |
− | | |
− | Allow <code>svc/sge/persephone</code> to access your home directory.
| |
− | <pre>
| |
− | % fs setacl -dir ~ -acl svc/sge/persephone l
| |
− | </pre>
| |
− | | |
− | Allow <code>svc/sge/persephone</code> to read and write to the <code>~/sge-out</code> directory.
| |
− | <pre>
| |
− | % fs setacl -dir ~/sge-out -acl svc/sge/persephone rliw
| |
− | </pre>
| |
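The three steps above can be collected in one place. The sketch below only prints the commands so they can be reviewed before running them by hand; the function name <code>afs_setup_commands</code> is hypothetical, and <code>mkdir -p</code> is used instead of plain <code>mkdir</code> so that an existing directory is not an error.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print (not run) the commands from this section
# for a given output directory. Nothing here touches AFS.
afs_setup_commands() {
  local out_dir=$1
  local principal=svc/sge/persephone   # service principal named in this section

  # Create the output directory (-p is an assumption, see lead-in).
  printf 'mkdir -p %s\n' "$out_dir"
  # Let the SGE service look up entries in the home directory.
  printf 'fs setacl -dir %s -acl %s l\n' "$HOME" "$principal"
  # Let the SGE service read and write in the output directory.
  printf 'fs setacl -dir %s -acl %s rliw\n' "$out_dir" "$principal"
}

afs_setup_commands "$HOME/sge-out"
```

After checking the printed commands, they can be pasted into a shell on a machine with AFS tokens.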
− | | |
− | == Submitting the Job Script to SGE ==
| |
− | | |
− | You can submit jobs to Sun Grid Engine using the <code>qsub</code> command.
| |
− | | |
− | <pre>
| |
− | % qsub job.sh
| |
− | Your job 3784 ("job.sh") has been submitted.
| |
− | </pre>
| |
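Scripts that submit jobs often need the job ID afterwards, for example to pass it to <code>qstat</code> or <code>qdel</code>. A minimal sketch, assuming the confirmation message has the form shown above; the message is hard-coded here so the example runs without SGE.

```shell
#!/usr/bin/env bash
# Sample message copied from the qsub output above.
# In real use it would come from something like: msg=$(qsub job.sh)
msg='Your job 3784 ("job.sh") has been submitted.'

# Pull out the number that follows "Your job ".
job_id=$(printf '%s\n' "$msg" | sed -n 's/^Your job \([0-9][0-9]*\) .*/\1/p')
echo "$job_id"
```

If the message format differs on a given SGE version, <code>job_id</code> will be empty, so a caller should check for that before using it.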
− | | |
− | = Checking Job Status =
| |
− | | |
− | You can check the status of jobs using the <code>qstat</code> command.
| |
− | | |
− | The following command and output show the status of only my submitted jobs.
| |
− | <pre>
| |
− | % qstat
| |
− | job-ID prior name user state submit/start at queue slots ja-task-ID
| |
− | -----------------------------------------------------------------------------------------------------------------
| |
− | 3773 0.55500 baseCallin rico r 03/23/2010 15:06:19 all.q@c4.persephone.bx.psu.edu 1
| |
− | 3774 0.55500 baseCallin rico r 03/23/2010 15:06:19 all.q@c4.persephone.bx.psu.edu 1
| |
− | 3779 0.55500 baseCallin rico r 03/23/2010 15:06:19 all.q@c4.persephone.bx.psu.edu 1
| |
− | 3780 0.55500 baseCallin rico r 03/23/2010 15:06:19 all.q@c4.persephone.bx.psu.edu 1
| |
− | 3784 0.55500 job.sh rico r 03/23/2010 18:31:04 all.q@c6.persephone.bx.psu.edu 1
| |
− | 3781 0.00000 baseCallin rico hqw 03/23/2010 14:49:47 1
| |
− | </pre>
| |
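The fifth column of the <code>qstat</code> listing holds the job state (<code>r</code> for running, <code>hqw</code> for held and waiting in the example above). The sketch below counts jobs per state from a trimmed copy of that listing; it assumes whitespace-separated columns and job names without spaces, and would read from <code>qstat</code> directly on a real cluster.

```shell
#!/usr/bin/env bash
# Trimmed copy of the qstat output above, so the sketch runs without SGE.
qstat_output='job-ID prior name user state submit/start at queue slots ja-task-ID
-----------------------------------------------------------------
3773 0.55500 baseCallin rico r 03/23/2010 15:06:19 all.q@c4.persephone.bx.psu.edu 1
3784 0.55500 job.sh rico r 03/23/2010 18:31:04 all.q@c6.persephone.bx.psu.edu 1
3781 0.00000 baseCallin rico hqw 03/23/2010 14:49:47 1'

# Skip the header and separator lines (NR > 2), then tally column 5.
summary=$(printf '%s\n' "$qstat_output" |
  awk 'NR > 2 { count[$5]++ } END { for (s in count) print s, count[s] }' |
  sort)
echo "$summary"
```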
− | | |
− | The following command shows the current status of all queues and jobs.
| |
− | | |
− | <pre>
| |
− | % qstat -f -u '*'
| |
− | </pre>
| |
− | | |
− | The following command and output demonstrate checking the all.q queue on the persephone cluster.
| |
− | | |
− | <pre>
| |
− | % qstat -f -q all.q -u '*'
| |
− | queuename qtype resv/used/tot. load_avg arch states
| |
− | ---------------------------------------------------------------------------------
| |
− | all.q@c1.persephone.bx.psu.edu BIP 0/0/8 0.00 lx24-amd64
| |
− | ---------------------------------------------------------------------------------
| |
− | all.q@c2.persephone.bx.psu.edu BIP 0/0/8 0.00 lx24-amd64
| |
− | ---------------------------------------------------------------------------------
| |
− | all.q@c3.persephone.bx.psu.edu BIP 0/1/8 7.77 lx24-amd64
| |
− | 3646 0.55500 rico_metho suw17 r 03/23/2010 13:59:04 1
| |
− | ---------------------------------------------------------------------------------
| |
− | all.q@c4.persephone.bx.psu.edu BIP 0/4/8 4.01 lx24-amd64
| |
− | 3773 0.55500 baseCallin rico r 03/23/2010 15:06:19 1
| |
− | 3774 0.55500 baseCallin rico r 03/23/2010 15:06:19 1
| |
− | 3779 0.55500 baseCallin rico r 03/23/2010 15:06:19 1
| |
− | 3780 0.55500 baseCallin rico r 03/23/2010 15:06:19 1
| |
− | ---------------------------------------------------------------------------------
| |
− | all.q@c5.persephone.bx.psu.edu BIP 0/0/8 0.00 lx24-amd64
| |
− | ---------------------------------------------------------------------------------
| |
− | all.q@c6.persephone.bx.psu.edu BIP 0/0/8 0.00 lx24-amd64
| |
− | 3784 0.55500 job.sh rico r 03/23/2010 18:31:04 1
| |
− | ---------------------------------------------------------------------------------
| |
− | all.q@c7.persephone.bx.psu.edu BIP 0/0/8 0.00 lx24-amd64 s
| |
− | | |
− | ############################################################################
| |
− | - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS
| |
− | ############################################################################
| |
− | 3781 0.00000 baseCallin rico hqw 03/23/2010 14:49:47 1
| |
− | </pre>
| |
− | | |
− | = Viewing Job Output =
| |
− | | |
− | Viewing the standard output.
| |
− | <pre>
| |
− | % cat ~/sge-out/job.sh.o3784
| |
− | 53299724 s_6_sequence.txt
| |
− | </pre>
| |
− | | |
− | Viewing the standard error (this example file is empty).
| |
− | <pre>
| |
− | % cat ~/sge-out/job.sh.e3784
| |
− | </pre>
| |
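In the examples above, the output files are named after the job name and job ID: <code>job.sh.o3784</code> for standard output and <code>job.sh.e3784</code> for standard error. A small sketch that builds both paths, assuming that naming pattern and an output directory given with <code>-o</code> and <code>-e</code>; the function name <code>sge_output_files</code> is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the expected stdout and stderr file paths
# for a job, following the <name>.o<id> / <name>.e<id> pattern seen above.
sge_output_files() {
  local out_dir=$1
  local job_name=$2
  local job_id=$3
  printf '%s/%s.o%s\n' "$out_dir" "$job_name" "$job_id"
  printf '%s/%s.e%s\n' "$out_dir" "$job_name" "$job_id"
}

sge_output_files "$HOME/sge-out" job.sh 3784
```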
− | | |
− | = Deleting a Job =
| |
− | | |
− | Jobs can be deleted using the <code>qdel</code> command.
| |
− | | |
− | The following command and output show how to delete the job with job ID 3781.
| |
− | | |
− | <pre>
| |
− | % qdel 3781
| |
− | rico has deleted job 3781
| |
− | </pre>
| |
− | | |
− | The following command and output demonstrate deleting all jobs submitted by user <code>rico</code>.
| |
− | | |
− | <pre>
| |
− | % qdel -u rico
| |
− | rico has deleted job 3773
| |
− | rico has deleted job 3774
| |
− | rico has deleted job 3779
| |
− | rico has deleted job 3780
| |
− | </pre>
| |