Difference between revisions of "SLab:Using Sun Grid Engine"

From CCGB
(Checking Job Status)
 
(3 intermediate revisions by 2 users not shown)
= Resources =
Current SGE documentation is at '''[[BX:SGE]]'''.
* linne cluster
** [http://docs.sun.com/app/docs/doc/817-6117 Sun Grid Engine 6.0u6 User's Guide]
* persephone cluster
** [http://wikis.sun.com/display/GridEngine/Using+Sun+Grid+Engine Sun Grid Engine 6.2u1 User's Guide]
* both clusters
** [http://en.wikipedia.org/wiki/Sun_Grid_Engine Sun Grid Engine Wikipedia Page]
** [http://gridengine.sunsource.net/ Open Source Project Page]
** [http://www.sun.com/software/sge/ Commercial Project Page]
 
  
= Submitting a Job =
Linne runs a much older version of SGE (6.0u6), which may not support every option available in the latest release. Documentation for that version can be found here:
 
* [http://docs.sun.com/app/docs/doc/817-6117 Sun Grid Engine 6.0u6 User's Guide]
== Creating a Job Script ==
 
 
 
A job script is a standard shell script that contains the commands needed to run your job on the cluster.  Lines that begin with the two-character prefix <code>#$</code> are passed to <code>qsub</code> as arguments, so options can be kept in the script itself.  See the <code>qsub</code> documentation for a complete list of command-line arguments.
 
 
 
<pre>
#!/usr/bin/env bash
#
# Submit this job to the all.q queue
#$ -q all.q
#
# Send email when the job begins, ends, is aborted or rescheduled, or is suspended
#$ -m beas
#
# Email address to send email to
#$ -M rico@bx.psu.edu
#
# Shell to use to run the job
#$ -S /bin/bash
#
# Send job standard output to $HOME/sge-out
#$ -o $HOME/sge-out
#
# Send job standard error to $HOME/sge-out
#$ -e $HOME/sge-out

SEQ_DIR=/afs/bx.psu.edu/depot/data/schuster_lab/sequencing
RUN=100201_HWUSI-EAS610_0006
RUN_DIR=$SEQ_DIR/archive/illumina/by_date/2010/2010_02_01/$RUN

# Stop if the run directory is missing, rather than counting lines in the wrong place
cd "$RUN_DIR/Data/Intensities/BaseCalls/GERALD_03-02-2010_rico" || exit 1
wc -l s_6_sequence.txt
</pre>
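
Before submitting, it can help to review which options a job script will pass to <code>qsub</code>. The following is a minimal sketch, not part of SGE itself: the function name <code>list_sge_directives</code> and the throwaway demo script are assumptions made for illustration.

```shell
# Print the embedded qsub options of a job script, one per line,
# with the leading "#$" prefix removed.
# (list_sge_directives is a hypothetical helper, not an SGE command.)
list_sge_directives() {
    local script=$1
    # Only lines that begin with the two characters "#$" are printed.
    sed -n 's/^#\$[[:space:]]*//p' "$script"
}

# Demo on a temporary script with illustrative contents.
demo=$(mktemp)
cat > "$demo" <<'EOF'
#!/usr/bin/env bash
# A plain comment, ignored by SGE
#$ -q all.q
#$ -S /bin/bash
echo hello
EOF
list_sge_directives "$demo"
rm -f "$demo"
```

Ordinary comments and the <code>#!</code> line are skipped because they do not start with <code>#$</code>.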
 
 
 
== Adding write permissions (only for persephone cluster) ==
 
 
 
When using the persephone cluster, you need to modify the permissions (ACLs) of any AFS directory that receives job output.
 
 
 
Create the <code>~/sge-out</code> directory.
 
<pre>
 
% mkdir ~/sge-out
 
</pre>
 
 
 
Allow <code>svc/sge/persephone</code> to look up entries in your home directory (the <code>l</code> right).
 
<pre>
 
% fs setacl -dir ~ -acl svc/sge/persephone l
 
</pre>
 
 
 
Allow <code>svc/sge/persephone</code> to read and write to the <code>~/sge-out</code> directory.
 
<pre>
 
% fs setacl -dir ~/sge-out -acl svc/sge/persephone rliw
 
</pre>
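
The three steps above can be collected into one helper. This is only a sketch under assumptions: the function names <code>run_or_print</code> and <code>prepare_sge_outdir</code> and the <code>DRY_RUN</code> variable are hypothetical, and by default the commands are printed rather than executed so they can be reviewed first.

```shell
# Print a command when DRY_RUN is 1 (the default), otherwise execute it.
# (run_or_print and DRY_RUN are hypothetical names used for this sketch.)
run_or_print() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Create the output directory and grant the SGE principal the same
# AFS rights as in the steps above.
prepare_sge_outdir() {
    local outdir=${1:-$HOME/sge-out}
    local principal=${2:-svc/sge/persephone}
    run_or_print mkdir -p "$outdir"
    run_or_print fs setacl -dir "$HOME" -acl "$principal" l
    run_or_print fs setacl -dir "$outdir" -acl "$principal" rliw
}

# Dry run: shows the commands without changing anything.
prepare_sge_outdir
```

Setting <code>DRY_RUN=0</code> before calling <code>prepare_sge_outdir</code> would run the commands for real.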
 
 
 
== Submitting the Job Script to SGE ==
 
 
 
You can submit jobs to Sun Grid Engine using the <code>qsub</code> command.
 
 
 
<pre>
% qsub job.sh
Your job 3784 ("job.sh") has been submitted.
</pre>
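
Scripts that submit jobs often need the job ID for later <code>qstat</code> or <code>qdel</code> calls. The sketch below extracts it from the confirmation message shown above. The function name <code>parse_job_id</code> is an assumption, and the pattern only covers the single-job message format in this example (array jobs print a different message).

```shell
# Read qsub's confirmation message on stdin and print the job ID.
# (parse_job_id is a hypothetical helper, not an SGE command.)
parse_job_id() {
    sed -n 's/^Your job \([0-9][0-9]*\) .*/\1/p'
}

# Demo using the message from the example above instead of a live qsub call.
msg='Your job 3784 ("job.sh") has been submitted.'
job_id=$(printf '%s\n' "$msg" | parse_job_id)
echo "Submitted job $job_id"
```

On the cluster, the same function could be used as <code>job_id=$(qsub job.sh | parse_job_id)</code>.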
 
 
 
= Checking Job Status =
 
 
 
You can check the status of jobs using the <code>qstat</code> command.
 
 
 
The following command and output show the status of only my submitted jobs.
 
<pre>
% qstat
job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID
-----------------------------------------------------------------------------------------------------------------
   3773 0.55500 baseCallin rico         r     03/23/2010 15:06:19 all.q@c4.persephone.bx.psu.edu     1
   3774 0.55500 baseCallin rico         r     03/23/2010 15:06:19 all.q@c4.persephone.bx.psu.edu     1
   3779 0.55500 baseCallin rico         r     03/23/2010 15:06:19 all.q@c4.persephone.bx.psu.edu     1
   3780 0.55500 baseCallin rico         r     03/23/2010 15:06:19 all.q@c4.persephone.bx.psu.edu     1
   3784 0.55500 job.sh     rico         r     03/23/2010 18:31:04 all.q@c6.persephone.bx.psu.edu     1
   3781 0.00000 baseCallin rico         hqw   03/23/2010 14:49:47                                    1
</pre>
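
When many jobs are listed, a per-state count is easier to read than the full table. The following sketch assumes the plain <code>qstat</code> column layout shown above, where the state is the fifth field. The function name <code>summarize_states</code> is hypothetical.

```shell
# Count jobs per state from plain `qstat` output read on stdin.
# Header and separator lines are skipped because their first field
# is not a numeric job ID.
# (summarize_states is a hypothetical helper, not an SGE command.)
summarize_states() {
    awk '$1 ~ /^[0-9]+$/ { count[$5]++ }
         END { for (s in count) print s, count[s] }' | sort
}

# Demo on a few lines copied from the output above.
summarize_states <<'EOF'
job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID
-----------------------------------------------------------------------------------------------------------------
   3773 0.55500 baseCallin rico         r     03/23/2010 15:06:19 all.q@c4.persephone.bx.psu.edu     1
   3784 0.55500 job.sh     rico         r     03/23/2010 18:31:04 all.q@c6.persephone.bx.psu.edu     1
   3781 0.00000 baseCallin rico         hqw   03/23/2010 14:49:47                                    1
EOF
```

On the cluster this would be run as <code>qstat | summarize_states</code>.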
 
 
 
The following command shows the current status of all queues and jobs.
 
 
 
<pre>
 
% qstat -f -u '*'
 
</pre>
 
 
 
The following command and output demonstrate checking the all.q queue on the persephone cluster.
 
 
 
<pre>
% qstat -f -q all.q -u '*'
queuename                      qtype resv/used/tot. load_avg arch          states
---------------------------------------------------------------------------------
all.q@c1.persephone.bx.psu.edu BIP   0/0/8          0.00     lx24-amd64
---------------------------------------------------------------------------------
all.q@c2.persephone.bx.psu.edu BIP   0/0/8          0.00     lx24-amd64
---------------------------------------------------------------------------------
all.q@c3.persephone.bx.psu.edu BIP   0/1/8          7.77     lx24-amd64
   3646 0.55500 rico_metho suw17        r     03/23/2010 13:59:04     1
---------------------------------------------------------------------------------
all.q@c4.persephone.bx.psu.edu BIP   0/4/8          4.01     lx24-amd64
   3773 0.55500 baseCallin rico         r     03/23/2010 15:06:19     1
   3774 0.55500 baseCallin rico         r     03/23/2010 15:06:19     1
   3779 0.55500 baseCallin rico         r     03/23/2010 15:06:19     1
   3780 0.55500 baseCallin rico         r     03/23/2010 15:06:19     1
---------------------------------------------------------------------------------
all.q@c5.persephone.bx.psu.edu BIP   0/0/8          0.00     lx24-amd64
---------------------------------------------------------------------------------
all.q@c6.persephone.bx.psu.edu BIP   0/0/8          0.00     lx24-amd64
   3784 0.55500 job.sh     rico         r     03/23/2010 18:31:04     1
---------------------------------------------------------------------------------
all.q@c7.persephone.bx.psu.edu BIP   0/0/8          0.00     lx24-amd64    s

############################################################################
 - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS
############################################################################
   3781 0.00000 baseCallin rico         hqw   03/23/2010 14:49:47     1
</pre>
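
To see at a glance which queue instances are busy, the <code>resv/used/tot.</code> column of the full listing can be filtered. This sketch assumes the <code>qstat -f</code> layout shown above, where queue-instance lines contain <code>@</code> in the first field. The function name <code>busy_queue_instances</code> is hypothetical.

```shell
# From `qstat -f` output on stdin, print each queue instance that has
# at least one used slot, followed by used/total slots.
# (busy_queue_instances is a hypothetical helper, not an SGE command.)
busy_queue_instances() {
    awk '$1 ~ /@/ {
             split($3, slots, "/")          # resv/used/tot.
             if (slots[2] + 0 > 0) print $1, slots[2] "/" slots[3]
         }'
}

# Demo on a few lines copied from the output above.
busy_queue_instances <<'EOF'
all.q@c1.persephone.bx.psu.edu BIP   0/0/8          0.00     lx24-amd64
all.q@c3.persephone.bx.psu.edu BIP   0/1/8          7.77     lx24-amd64
   3646 0.55500 rico_metho suw17        r     03/23/2010 13:59:04     1
all.q@c4.persephone.bx.psu.edu BIP   0/4/8          4.01     lx24-amd64
EOF
```

On the cluster this would be run as <code>qstat -f -u '*' | busy_queue_instances</code>.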
 
 
 
= Viewing Job Output =
 
 
 
Viewing the standard output.
 
<pre>
% cat ~/sge-out/job.sh.o3784
53299724 s_6_sequence.txt
</pre>
 
 
 
Viewing the standard error (this example file is empty).
 
<pre>
 
% cat ~/sge-out/job.sh.e3784
 
</pre>
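
In the examples above, the output files are named after the job name and job ID (<code>job.sh.o3784</code> and <code>job.sh.e3784</code>). The sketch below builds both paths from those two values. It assumes that <code>-o</code> and <code>-e</code> point at a directory, as in the job script on this page. The function name <code>sge_output_paths</code> is hypothetical.

```shell
# Print the standard output and standard error file paths for a job,
# given its name, its job ID, and optionally the output directory.
# (sge_output_paths is a hypothetical helper, not an SGE command.)
sge_output_paths() {
    local name=$1 id=$2 dir=${3:-$HOME/sge-out}
    printf '%s/%s.o%s\n' "$dir" "$name" "$id"
    printf '%s/%s.e%s\n' "$dir" "$name" "$id"
}

# Demo with the job from the examples above.
sge_output_paths job.sh 3784
```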
 
 
 
= Deleting a Job =
 
 
 
Jobs can be deleted using the <code>qdel</code> command.
 
 
 
The following command and output show deleting the job with job_id 3781.
 
 
 
<pre>
% qdel 3781
rico has deleted job 3781
</pre>
 
 
 
The following command and output demonstrate deleting all jobs submitted by user <code>rico</code>.
 
 
 
<pre>
% qdel -u rico
rico has deleted job 3773
rico has deleted job 3774
rico has deleted job 3779
rico has deleted job 3780
</pre>
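
Sometimes only the jobs that have not started yet should be deleted. The following sketch selects job IDs whose state contains <code>qw</code> (for example <code>qw</code> or <code>hqw</code>) from plain <code>qstat</code> output. It assumes the column layout shown earlier on this page. The function name <code>pending_job_ids</code> is hypothetical.

```shell
# From plain `qstat` output on stdin, print the IDs of jobs whose
# state (fifth field) contains "qw", i.e. jobs still waiting in the queue.
# (pending_job_ids is a hypothetical helper, not an SGE command.)
pending_job_ids() {
    awk '$1 ~ /^[0-9]+$/ && $5 ~ /qw/ { print $1 }'
}

# Demo on two lines copied from the qstat output on this page.
pending_job_ids <<'EOF'
   3784 0.55500 job.sh     rico         r     03/23/2010 18:31:04 all.q@c6.persephone.bx.psu.edu     1
   3781 0.00000 baseCallin rico         hqw   03/23/2010 14:49:47                                    1
EOF

# On the cluster, the IDs could then be handed to qdel, for example:
#   ids=$(qstat | pending_job_ids); [ -n "$ids" ] && qdel $ids
```

Running jobs are left untouched because their state does not contain <code>qw</code>.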
 

Latest revision as of 16:51, 4 October 2010
