Revision as of 15:59, 3 September 2010

Overview

There is one central SGE installation that handles job scheduling across all of the BX clusters, work servers, and workstations, with the exception of the okinawa, linne, and galaxy clusters. Merging the existing clusters, work servers, and workstations is still in progress.

The central grid engine has a pair of fully redundant master servers to ensure continuous job scheduling. Losing both SGE masters does not kill jobs that are currently running or queued, but it does prevent any further job submissions. There is a failover period of approximately 5 minutes between the failure of one SGE master and the startup of the other.

Status

Current BX Grid load can be seen through Ganglia at http://ganglia.bx.psu.edu
A web version of qstat (an XSL-formatted version of qstat -f -u '*' -xml) is available at http://qstat.bx.psu.edu

Usage

To submit a job, put the command(s) into a script, and use qsub. Various job resource requirements can be specified with -l resource=foo.
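
The submission workflow can be sketched as follows. This is a minimal example, not a BX-specific recipe: the script name hello_job.sh, the job name, and the h_rt run-time limit are illustrative assumptions, and the resource names actually available depend on how the grid is configured (qconf -sc lists the configured resource attributes).

```shell
# Write a small job script. Lines starting with "#$" are read by qsub as options.
cat > hello_job.sh <<'EOF'
#!/bin/bash
#$ -S /bin/bash
#$ -N hello_job
#$ -cwd
#$ -j y
#$ -l h_rt=00:10:00
echo "Running on $(hostname)"
EOF

# Submit only where the SGE client tools are installed.
if command -v qsub >/dev/null 2>&1; then
    qsub hello_job.sh
else
    echo "qsub not found; run this on an SGE submit host"
fi
```

Here -cwd runs the job in the submission directory and -j y merges stderr into stdout. The same options can be given on the qsub command line instead of inside the script.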

SGE host status can be seen with qhost.

Job queue/status can be seen with qstat -f, which will show just your jobs. To see everyone's jobs, use qstat -f -u '*'. Note that qstat behaves differently than in previous versions of SGE.
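
The status commands above can be collected into one short session. The sge_try helper is a hypothetical wrapper added for this sketch so that it also runs on machines without the SGE client tools; on a submit host the commands can simply be typed directly.

```shell
# Run an SGE client command if it is installed; otherwise report that it is missing.
# sge_try is a hypothetical helper for this example, not part of SGE.
sge_try() {
    if command -v "$1" >/dev/null 2>&1; then
        "$@"
    else
        echo "$1 not found; run this on an SGE submit host"
    fi
}

sge_try qhost              # status of the execution hosts
sge_try qstat -f           # full queue listing, your jobs only
sge_try qstat -f -u '*'    # full queue listing, everyone's jobs
```

The quotes around '*' matter: they stop the shell from expanding the asterisk into file names before qstat sees it.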

For more detailed usage and examples, please see the SGE 6.2u5 documentation: http://wikis.sun.com/display/gridengine62u5/Using

Difference between revisions of "BX:SGE"