Difference between revisions of "SLab:Using Sun Grid Engine"
(→linne) |
|||
Line 1: | Line 1: | ||
− | + | = Resources = | |
− | Resources | + | * linne cluster |
− | * linne | ||
** [http://docs.sun.com/app/docs/doc/817-6117 Sun Grid Engine 6.0u6 User's Guide] | ** [http://docs.sun.com/app/docs/doc/817-6117 Sun Grid Engine 6.0u6 User's Guide] | ||
− | * persephone | + | * persephone cluster |
** [http://wikis.sun.com/display/GridEngine/Using+Sun+Grid+Engine Sun Grid Engine 6.2u1 User's Guide] | ** [http://wikis.sun.com/display/GridEngine/Using+Sun+Grid+Engine Sun Grid Engine 6.2u1 User's Guide] | ||
− | * both | + | * both clusters |
** [http://en.wikipedia.org/wiki/Sun_Grid_Engine Sun Grid Engine Wikipedia Page] | ** [http://en.wikipedia.org/wiki/Sun_Grid_Engine Sun Grid Engine Wikipedia Page] | ||
** [http://gridengine.sunsource.net/ Open Source Project Page] | ** [http://gridengine.sunsource.net/ Open Source Project Page] | ||
** [http://www.sun.com/software/sge/ Commercial Project Page] | ** [http://www.sun.com/software/sge/ Commercial Project Page] | ||
+ | |||
+ | = Submitting a Job = | ||
+ | |||
+ | == Creating a Job Script == | ||
+ | |||
+ | == Adding write permissions (only for persephone cluster) == | ||
+ | |||
+ | When using the persephone cluster, you need to modify the permissions of any AFS directory that is receiving output. | ||
+ | |||
+ | Create the <code>~/sge-out</code> directory. | ||
+ | <pre> | ||
+ | % mkdir ~/sge-out | ||
+ | </pre> | ||
+ | |||
+ | Allow <code>svc/sge/persephone</code> to access your home directory. | ||
+ | <pre> | ||
+ | % fs setacl -dir ~ -acl svc/sge/persephone l | ||
+ | </pre> | ||
+ | |||
+ | Allow <code>svc/sge/persephone</code> to read and write to the <code>~/sge-out</code> directory. | ||
+ | <pre> | ||
+ | % fs setacl -dir ~/sge-out -acl svc/sge/persephone rliw | ||
+ | </pre> | ||
+ | |||
+ | == Submitting the Job Script to SGE == | ||
+ | |||
+ | You can submit jobs to Sun Grid Engine using the <code>qsub</code> command. | ||
+ | |||
+ | <pre> | ||
+ | % qsub job.sh | ||
+ | Your job 2103 ("job.sh") has been submitted. | ||
+ | </pre> | ||
+ | |||
+ | = Checking Job Status = | ||
+ | |||
+ | You can check the status of jobs using the <code>qstat</code> command. | ||
+ | |||
+ | The following command shows the current status of all queues and jobs. | ||
+ | |||
+ | <pre> | ||
+ | % qstat -f -u '*' | ||
+ | </pre> | ||
+ | |||
+ | The following command and output demonstrate checking the all.q queue on the persephone cluster. | ||
+ | |||
+ | <pre> | ||
+ | % qstat -f -q all.q -u '*' | ||
+ | queuename qtype resv/used/tot. load_avg arch states | ||
+ | --------------------------------------------------------------------------------- | ||
+ | all.q@c1.persephone.bx.psu.edu BIP 0/0/8 0.00 lx24-amd64 | ||
+ | --------------------------------------------------------------------------------- | ||
+ | all.q@c2.persephone.bx.psu.edu BIP 0/0/8 0.00 lx24-amd64 | ||
+ | --------------------------------------------------------------------------------- | ||
+ | all.q@c3.persephone.bx.psu.edu BIP 0/1/8 7.77 lx24-amd64 | ||
+ | 3646 0.55500 rico_metho suw17 r 03/23/2010 13:59:04 1 | ||
+ | --------------------------------------------------------------------------------- | ||
+ | all.q@c4.persephone.bx.psu.edu BIP 0/4/8 4.01 lx24-amd64 | ||
+ | 3773 0.55500 baseCallin rico r 03/23/2010 15:06:19 1 | ||
+ | 3774 0.55500 baseCallin rico r 03/23/2010 15:06:19 1 | ||
+ | 3779 0.55500 baseCallin rico r 03/23/2010 15:06:19 1 | ||
+ | 3780 0.55500 baseCallin rico r 03/23/2010 15:06:19 1 | ||
+ | --------------------------------------------------------------------------------- | ||
+ | all.q@c5.persephone.bx.psu.edu BIP 0/0/8 0.00 lx24-amd64 | ||
+ | --------------------------------------------------------------------------------- | ||
+ | all.q@c6.persephone.bx.psu.edu BIP 0/0/8 0.00 lx24-amd64 | ||
+ | --------------------------------------------------------------------------------- | ||
+ | all.q@c7.persephone.bx.psu.edu BIP 0/0/8 0.00 lx24-amd64 s | ||
+ | |||
+ | ############################################################################ | ||
+ | - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS | ||
+ | ############################################################################ | ||
+ | 3781 0.00000 baseCallin rico hqw 03/23/2010 14:49:47 1 | ||
+ | </pre> | ||
+ | |||
+ | |||
+ | = Deleting a Job = | ||
+ | |||
+ | Jobs can be deleted using the <code>qdel</code> command. | ||
+ | |||
+ | The following command and output show deleting the job with job_id 3781. | ||
+ | |||
+ | <pre> | ||
+ | % qdel 3781 | ||
+ | rico has deleted job 3781 | ||
+ | </pre> | ||
+ | |||
+ | The following command and output demonstrate deleting all jobs submitted by user <code>rico</code>. | ||
+ | |||
+ | <pre> | ||
+ | % qdel -u rico | ||
+ | rico has deleted job 3773 | ||
+ | rico has deleted job 3774 | ||
+ | rico has deleted job 3779 | ||
+ | rico has deleted job 3780 | ||
+ | </pre> |
Revision as of 17:10, 23 March 2010
Resources
- linne cluster
- persephone cluster
- both clusters
Submitting a Job
Creating a Job Script
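This revision leaves the section without an example, so here is a minimal sketch of what a job.sh could look like. The job name example_job and the echo workload are hypothetical placeholders, and sending output to ~/sge-out is an assumption based on the directory created in the next section. The lines beginning with "#$" use standard qsub options (-S, -N, -o, -j); check them against the user's guide for your cluster's SGE version.

```shell
#!/bin/bash
# Hypothetical job.sh (illustrative only, not taken from this page).
# Lines beginning with "#$" are read by qsub as embedded options;
# a plain shell treats them as ordinary comments.
# -S: shell used to run the job, -N: job name (placeholder),
# -o: directory for the output file, -j y: merge stderr into stdout.
#$ -S /bin/bash
#$ -N example_job
#$ -o $HOME/sge-out
#$ -j y

# Placeholder workload: record where and when the job ran.
job_host="$(uname -n)"
echo "Job started on ${job_host} at $(date)"
```

Because the embedded options are comments to the shell, the script can also be run directly (for example with bash job.sh) to check the workload before submitting it.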
Adding write permissions (only for persephone cluster)
When using the persephone cluster, you need to modify the permissions of any AFS directory that will receive job output.
Create the ~/sge-out directory.
% mkdir ~/sge-out
Allow svc/sge/persephone to access your home directory (the l right grants lookup only).
% fs setacl -dir ~ -acl svc/sge/persephone l
Allow svc/sge/persephone to read and write to the ~/sge-out directory.
% fs setacl -dir ~/sge-out -acl svc/sge/persephone rliw
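The three steps above can be wrapped in a small helper when more than one output directory is needed. This is a sketch only: the function name setup_sge_outdir is hypothetical, and the final fs listacl call is an added check that is not part of the original steps. If the fs command is not installed, the helper creates the directory and skips the ACL changes.

```shell
# Hypothetical helper: create an output directory and grant the
# persephone SGE service the rights described above.
setup_sge_outdir() {
    # Create the directory (and any missing parents).
    mkdir -p "$1" || return 1

    # Without the AFS client tools there is nothing more to do.
    if ! command -v fs >/dev/null 2>&1; then
        echo "fs command not found; skipping AFS ACL setup" >&2
        return 0
    fi

    # Lookup on the home directory, then read/lookup/insert/write on
    # the output directory, then print the resulting ACL for review.
    fs setacl -dir "$HOME" -acl svc/sge/persephone l &&
    fs setacl -dir "$1" -acl svc/sge/persephone rliw &&
    fs listacl "$1"
}
```

For example, setup_sge_outdir ~/sge-out reproduces the commands shown above.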
Submitting the Job Script to SGE
You can submit jobs to Sun Grid Engine using the qsub command.
% qsub job.sh
Your job 2103 ("job.sh") has been submitted.
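The job ID in the confirmation message is needed later for qstat and qdel, so it can be useful to capture it in a script. The sketch below pulls it out of a message with the format shown above; the function name parse_job_id is hypothetical, and the sample message is the one from this page rather than live qsub output.

```shell
# Hypothetical helper: extract the job ID from a qsub confirmation
# message of the form: Your job 2103 ("job.sh") has been submitted.
parse_job_id() {
    # The job ID is the third whitespace-separated word of the message.
    printf '%s\n' "$1" | awk '$1 == "Your" && $2 == "job" { print $3 }'
}

# Sample message copied from the example above.
msg='Your job 2103 ("job.sh") has been submitted.'
job_id="$(parse_job_id "$msg")"
echo "Submitted job ${job_id}"
```

On a cluster, the message would come from the command itself, for example job_id="$(parse_job_id "$(qsub job.sh)")".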
Checking Job Status
You can check the status of jobs using the qstat command.
The following command shows the current status of all queues and jobs.
% qstat -f -u '*'
The following command and output demonstrate checking the all.q queue on the persephone cluster.
% qstat -f -q all.q -u '*'
queuename                      qtype resv/used/tot. load_avg arch          states
---------------------------------------------------------------------------------
all.q@c1.persephone.bx.psu.edu BIP   0/0/8          0.00     lx24-amd64
---------------------------------------------------------------------------------
all.q@c2.persephone.bx.psu.edu BIP   0/0/8          0.00     lx24-amd64
---------------------------------------------------------------------------------
all.q@c3.persephone.bx.psu.edu BIP   0/1/8          7.77     lx24-amd64
   3646 0.55500 rico_metho suw17        r     03/23/2010 13:59:04     1
---------------------------------------------------------------------------------
all.q@c4.persephone.bx.psu.edu BIP   0/4/8          4.01     lx24-amd64
   3773 0.55500 baseCallin rico         r     03/23/2010 15:06:19     1
   3774 0.55500 baseCallin rico         r     03/23/2010 15:06:19     1
   3779 0.55500 baseCallin rico         r     03/23/2010 15:06:19     1
   3780 0.55500 baseCallin rico         r     03/23/2010 15:06:19     1
---------------------------------------------------------------------------------
all.q@c5.persephone.bx.psu.edu BIP   0/0/8          0.00     lx24-amd64
---------------------------------------------------------------------------------
all.q@c6.persephone.bx.psu.edu BIP   0/0/8          0.00     lx24-amd64
---------------------------------------------------------------------------------
all.q@c7.persephone.bx.psu.edu BIP   0/0/8          0.00     lx24-amd64    s

############################################################################
 - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS
############################################################################
   3781 0.00000 baseCallin rico         hqw   03/23/2010 14:49:47     1
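In the output above, each job line carries a state code in its fifth column: r marks a running job, and hqw marks a job that is queued and waiting with a hold on it. For a long listing it can help to count jobs per state. The sketch below does that with awk; the function name summarize_job_states is hypothetical, and it assumes job lines have the eight-column layout shown above with a numeric job ID first.

```shell
# Hypothetical helper: count jobs per state code in qstat -f output
# read from standard input.
summarize_job_states() {
    # Job lines start with a numeric job ID and have at least eight
    # fields; queue lines, separators, and banners are ignored.
    awk '$1 ~ /^[0-9]+$/ && NF >= 8 { count[$5]++ }
         END { for (s in count) print s, count[s] }' | sort
}
```

On a cluster it would be used as qstat -f -u '*' | summarize_job_states; for the listing above it should report five jobs in state r and one in state hqw.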
Deleting a Job
Jobs can be deleted using the qdel command.
The following command and output show deleting the job with job_id 3781.
% qdel 3781
rico has deleted job 3781
The following command and output demonstrate deleting all jobs submitted by user rico.
% qdel -u rico
rico has deleted job 3773
rico has deleted job 3774
rico has deleted job 3779
rico has deleted job 3780