BX:SGE
Overview
A single central SGE installation handles job scheduling across all BX clusters, work servers, and workstations, with the exception of the okinawa, linne, and galaxy clusters. Merging the existing clusters, work servers, and workstations into it is still a work in progress.
The central grid engine has a pair of fully redundant master servers to ensure continuous job scheduling. Losing both SGE masters does not kill jobs that are currently running or queued, but it does prevent any further job submissions. When one SGE master fails, there is a failover period of approximately 5 minutes before the other SGE master starts up.
Status
- Current BX Grid load can be viewed in Ganglia at http://ganglia.bx.psu.edu
- A web version of qstat (XSL formatted version of qstat -f -u '*' -xml) is available at http://qstat.bx.psu.edu
Usage
To submit a job, put the command(s) into a script and submit the script with qsub. Job resource requirements can be specified with -l resource=foo.
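As a minimal sketch of such a job script, the example below embeds qsub options in #$ comment lines, which SGE reads at submission time. The job name, the output file name, and the h_rt run-time limit are illustrative assumptions; which resources can actually be requested with -l depends on how the BX grid is configured.

```shell
#!/bin/bash
# Lines starting with "#$" are read by qsub as if given on the command line.
#$ -S /bin/bash          # run the job under bash
#$ -N example_job        # job name (hypothetical)
#$ -cwd                  # run in the directory the job was submitted from
#$ -j y                  # merge stderr into stdout
#$ -o example_job.log    # output file (hypothetical name)
#$ -l h_rt=00:10:00      # assumed resource request: 10 minute run-time limit

# SGE sets JOB_ID inside a running job; fall back to "local" when the
# script is run by hand outside the scheduler.
job_label="${JOB_ID:-local}"

echo "Job ${job_label} running on $(hostname)"

# Replace the line below with the real command(s) for the job.
echo "Job ${job_label} finished"
```

Saved under a name such as example_job.sh (hypothetical), the script would be submitted with qsub example_job.sh. Options given on the qsub command line override the ones embedded in the script.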
SGE host status can be seen with qhost.
Job queue and status can be seen with qstat -f, which shows only your own jobs. To see everyone's jobs, use qstat -f -u '*'. Note that qstat behaves differently than in previous versions of SGE.
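The status commands above can be collected into a short script. This is only a sketch: it assumes the SGE client tools are on the PATH of the machine it runs on, and it skips the checks if they are not. The sge_available variable is a hypothetical name used for illustration.

```shell
#!/bin/bash
# Sketch: show host and queue status, or skip cleanly when SGE is absent.
if command -v qstat >/dev/null 2>&1; then
    qhost              # status of each execution host
    qstat -f           # full queue listing, your own jobs only
    qstat -f -u '*'    # full queue listing, every user's jobs
    sge_available=yes
else
    echo "SGE client tools not found on this machine" >&2
    sge_available=no
fi
```

The quotes around '*' keep the shell from expanding it into file names before qstat sees it.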
For more detailed usage and examples, please see the SGE Documentation Site: SGE 6.2u5 documentation