|
|
(12 intermediate revisions by 2 users not shown) |
Line 1: |
Line 1: |
− | = Resources =
| + | Current SGE documentation is at '''[[BX:SGE]]'''. |
− | * linne cluster
| |
− | ** [http://docs.sun.com/app/docs/doc/817-6117 Sun Grid Engine 6.0u6 User's Guide]
| |
− | * persephone cluster
| |
− | ** [http://wikis.sun.com/display/GridEngine/Using+Sun+Grid+Engine Sun Grid Engine 6.2u1 User's Guide]
| |
− | * both clusters
| |
− | ** [http://en.wikipedia.org/wiki/Sun_Grid_Engine Sun Grid Engine Wikipedia Page]
| |
− | ** [http://gridengine.sunsource.net/ Open Source Project Page]
| |
− | ** [http://www.sun.com/software/sge/ Commercial Project Page]
| |
| | | |
− | = Submitting a Job =
| + | Linne is running a much older version of SGE, which may not support all of the options available with the latest SGE. Documentation for the SGE running on Linne (6.0u6) can be found here: |
− | | + | * [http://docs.sun.com/app/docs/doc/817-6117 Sun Grid Engine 6.0u6 User's Guide] |
− | == Creating a Job Script ==
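A minimal sketch of what a job script might look like is shown here. The file name <code>job.sh</code> matches the submission example on this page, but the job name <code>example_job</code>, the <code>report</code> helper, and the choice of options are assumptions for illustration, not the cluster's required settings. The <code>-o</code> and <code>-e</code> paths assume the <code>~/sge-out</code> directory described on this page.

```shell
#!/bin/bash
# Hypothetical job.sh. Lines starting with "#$" are read by qsub as
# embedded options; to the shell they are ordinary comments.
#
# Run the job under bash.
#$ -S /bin/bash
# Job name shown by qstat (hypothetical).
#$ -N example_job
# Write the stdout and stderr files into ~/sge-out.
#$ -o $HOME/sge-out
#$ -e $HOME/sge-out

report() {
    # JOB_ID is set by SGE inside a running job; fall back to "none"
    # so the script also runs outside the scheduler.
    echo "job ${JOB_ID:-none} on host $(uname -n)"
}

report
```

Because the <code>#$</code> lines are comments to the shell, the script can be run directly for a quick check before it is handed to <code>qsub</code>.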
| |
− | | |
− | == Adding write permissions (only for persephone cluster) ==
| |
− | | |
− | On the persephone cluster, you must change the permissions of every AFS directory that receives job output so that SGE can write to it.
| |
− | | |
− | Create the <code>~/sge-out</code> directory.
| |
− | <pre>
| |
− | % mkdir ~/sge-out
| |
− | </pre>
| |
− | | |
− | Allow <code>svc/sge/persephone</code> to access your home directory.
| |
− | <pre>
| |
− | % fs setacl -dir ~ -acl svc/sge/persephone l
| |
− | </pre>
| |
− | | |
− | Allow <code>svc/sge/persephone</code> to read and write to the <code>~/sge-out</code> directory.
| |
− | <pre>
| |
− | % fs setacl -dir ~/sge-out -acl svc/sge/persephone rliw
| |
− | </pre>
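The three steps above can be collected into one script. The following is a sketch only: the <code>DRY_RUN</code> switch and the <code>run</code> and <code>setup_sge_out</code> helper names are assumptions introduced for illustration. By default it prints the commands instead of executing them, so the ACL changes can be reviewed first.

```shell
#!/bin/sh
# Sketch: the AFS setup steps for persephone in one place.
# DRY_RUN, run and setup_sge_out are hypothetical names.
SGE_PRINCIPAL="svc/sge/persephone"
OUT_DIR="${HOME}/sge-out"

run() {
    # Print the command when DRY_RUN=1 (the default); execute it otherwise.
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

setup_sge_out() {
    # -p avoids an error if the directory already exists.
    run mkdir -p "$OUT_DIR"
    # Lookup on the home directory, then read/lookup/insert/write on sge-out.
    run fs setacl -dir "$HOME" -acl "$SGE_PRINCIPAL" l
    run fs setacl -dir "$OUT_DIR" -acl "$SGE_PRINCIPAL" rliw
}

setup_sge_out
```

Setting <code>DRY_RUN=0</code> in the environment makes the script apply the changes.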
| |
− | | |
− | == Submitting the Job Script to SGE ==
| |
− | | |
− | You can submit jobs to Sun Grid Engine using the <code>qsub</code> command.
| |
− | | |
− | <pre>
| |
− | % qsub job.sh
| |
− | Your job 2103 ("job.sh") has been submitted.
| |
− | </pre>
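Scripts that submit a job often need its job ID afterwards, for example to pass it to <code>qstat</code> or <code>qdel</code>. The sketch below extracts the ID from the confirmation message shown above. The function name <code>job_id_from_qsub</code> is hypothetical, and the pattern assumes the plain "Your job N ..." message format; other message formats are not handled.

```shell
#!/bin/sh
# Hypothetical helper: read qsub's confirmation message on stdin
# and print the numeric job ID it contains.
job_id_from_qsub() {
    sed -n 's/^Your job \([0-9][0-9]*\) .*/\1/p'
}

# Intended use on a cluster (not executed here):
#   job_id=$(qsub job.sh | job_id_from_qsub)
```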
| |
− | | |
− | = Checking Job Status =
| |
− | | |
− | You can check the status of jobs using the <code>qstat</code> command.
| |
− | | |
− | The following command shows the current status of all queues and jobs.
| |
− | | |
− | <pre>
| |
− | % qstat -f -u '*'
| |
− | </pre>
| |
− | | |
− | The following command and output demonstrate checking the <code>all.q</code> queue on the persephone cluster.
| |
− | | |
− | <pre>
| |
− | % qstat -f -q all.q -u '*'
| |
− | queuename qtype resv/used/tot. load_avg arch states
| |
− | ---------------------------------------------------------------------------------
| |
− | all.q@c1.persephone.bx.psu.edu BIP 0/0/8 0.00 lx24-amd64
| |
− | ---------------------------------------------------------------------------------
| |
− | all.q@c2.persephone.bx.psu.edu BIP 0/0/8 0.00 lx24-amd64
| |
− | ---------------------------------------------------------------------------------
| |
− | all.q@c3.persephone.bx.psu.edu BIP 0/1/8 7.77 lx24-amd64
| |
− | 3646 0.55500 rico_metho suw17 r 03/23/2010 13:59:04 1
| |
− | ---------------------------------------------------------------------------------
| |
− | all.q@c4.persephone.bx.psu.edu BIP 0/4/8 4.01 lx24-amd64
| |
− | 3773 0.55500 baseCallin rico r 03/23/2010 15:06:19 1
| |
− | 3774 0.55500 baseCallin rico r 03/23/2010 15:06:19 1
| |
− | 3779 0.55500 baseCallin rico r 03/23/2010 15:06:19 1
| |
− | 3780 0.55500 baseCallin rico r 03/23/2010 15:06:19 1
| |
− | ---------------------------------------------------------------------------------
| |
− | all.q@c5.persephone.bx.psu.edu BIP 0/0/8 0.00 lx24-amd64
| |
− | ---------------------------------------------------------------------------------
| |
− | all.q@c6.persephone.bx.psu.edu BIP 0/0/8 0.00 lx24-amd64
| |
− | ---------------------------------------------------------------------------------
| |
− | all.q@c7.persephone.bx.psu.edu BIP 0/0/8 0.00 lx24-amd64 s
| |
− | | |
− | ############################################################################
| |
− | - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS
| |
− | ############################################################################
| |
− | 3781 0.00000 baseCallin rico hqw 03/23/2010 14:49:47 1
| |
− | </pre>
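Long listings like the one above can be summarized with a short filter. The following sketch counts jobs per user and state. It assumes the column layout shown in the sample output (job ID, priority, name, user, state, date, time, slots), and the function name <code>count_jobs_by_user_state</code> is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read "qstat -f" output on stdin and print
# one line per (user, state) pair with the number of jobs.
count_jobs_by_user_state() {
    # Job lines start with a numeric job ID and have at least 8 fields;
    # queue lines and separator lines do not match.
    awk '$1 ~ /^[0-9]+$/ && NF >= 8 { n[$4 " " $5]++ }
         END { for (k in n) print k, n[k] }' | sort
}

# Intended use on a cluster (not executed here):
#   qstat -f -u '*' | count_jobs_by_user_state
```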
| |
− | | |
− | | |
− | = Deleting a Job =
| |
− | | |
− | Jobs can be deleted using the <code>qdel</code> command.
| |
− | | |
− | The following command and output show how to delete the job with job ID 3781.
| |
− | | |
− | <pre>
| |
− | % qdel 3781
| |
− | rico has deleted job 3781
| |
− | </pre>
| |
− | | |
− | The following command and output demonstrate deleting all jobs submitted by user <code>rico</code>.
| |
− | | |
− | <pre>
| |
− | % qdel -u rico
| |
− | rico has deleted job 3773
| |
− | rico has deleted job 3774
| |
− | rico has deleted job 3779
| |
− | rico has deleted job 3780
| |
− | </pre>
| |
Linne is running a much older version of SGE, which may not support all of the options available with the latest SGE. Documentation for the SGE running on Linne (6.0u6) can be found here: