Submits builds to the Sun Grid Engine (SGE) batch scheduler.  Both the open source version of SGE 2011.11 and the commercial Univa Grid Engine (UGE) 8.3.1 are supported.

Features

This plugin adds a new type of build step, Run job on SGE, that submits batch jobs to SGE. The build step monitors the job status and periodically (by default, once a minute) appends the progress to the build's Console Output. Should the build fail, the errors and the exit status of the job also appear. If the job is terminated in Jenkins, it is also terminated in SGE.

Builds are submitted to SGE by a new type of cloud, SGE Cloud.  The cloud is given a label like any other agent.  When a job with a matching label is run, SGE Cloud submits the build to SGE.

...

An email can optionally be sent when the job finishes.

Installation

Install the prerequisite plugins:

  • SSH Slaves plugin version 1.9 or greater
  • Copy to Slave plugin version 1.4.3 or greater

The SGE Cloud Plugin is not yet uploaded to the Jenkins download center, so you must compile and load it yourself.  After you clone the git repository, build it using Maven:

Code Block
cd sge-cloud-plugin/
mvn install     # Sometimes 'mvn clean install' works better

In Manage Jenkins > Plugin Manager, select the Advanced tab.  Use Upload Plugin to upload the plugin file sge-cloud-plugin/target/sge-cloud.hpi to Jenkins.

Configure SGE

In SGE, add your Jenkins master host as an SGE submit host.

...

Name              Value
SGE_ROOT          /path/to/uge
SGE_BIN           /path/to/uge/bin/lx-amd64 or .../darwin-x64 or .../win-x86
SGE_CELL          Your cell name
SGE_CLUSTER_NAME  Your cluster name
SGE_EXECD_PORT    64455
SGE_QMASTER_PORT  64444

Environment variable troubleshooting

The SGE error message:

Code Block
Unable to initialize environment because of error: cell directory "/path/to/sge/default" doesn't exist

means that SGE_CELL is undefined (it defaults to default).
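
A quick way to catch this kind of problem is to check the variables before a build runs. The following is a minimal sketch, not part of the plugin: the function name check_sge_env is hypothetical, and it assumes the cell directory lives at $SGE_ROOT/$SGE_CELL, as the error message above suggests.

```shell
# check_sge_env (hypothetical helper): report which SGE variables are unset
# and whether the cell directory $SGE_ROOT/$SGE_CELL exists.
check_sge_env() {
    missing=0
    for name in SGE_ROOT SGE_BIN SGE_CELL SGE_CLUSTER_NAME SGE_EXECD_PORT SGE_QMASTER_PORT; do
        # Indirect lookup of the variable named in $name (POSIX sh).
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "unset: $name"
            missing=1
        fi
    done
    # SGE falls back to the cell name "default" when SGE_CELL is undefined.
    cell_dir="${SGE_ROOT:-}/${SGE_CELL:-default}"
    if [ ! -d "$cell_dir" ]; then
        echo "missing cell directory: $cell_dir"
        missing=1
    fi
    return "$missing"
}

check_sge_env || echo "SGE environment is incomplete"
```

Run it in the same environment that Jenkins uses to launch builds; a shell with different settings may give a misleading result.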

Create the cloud

In Manage Jenkins > Configure System, add a new cloud of type SGE Cloud.  Fill in the required information for the newly created cloud.

Set up a project to run on SGE

In a project, specify the Label that you specified in SGE Cloud.

...

Now, when Jenkins runs the project, it will run on the SGE Cloud with the matching label.

Set your script to fail on the first failure

By default, the exit status of the last command determines the success or failure of the build step.  For example, the following script would be inappropriately considered a success:

...

Code Block
set -e
ls /nonexistent    # Error, exit status 2
echo "This is never executed because the above ls command failed."

Additional qsub options

So that you can see the qsub command that was used to submit jobs, the SGE Plugin prints the qsub command to the Jenkins job Console Output:

...

Code Block
qsub: Unknown option

Job States

...

Unfinished jobs

The qstat man page describes the following job states (job status) defined in SGE.  Each state is a string whose first character is most meaningful:

  • "d", for deletion
  • "E", for error
  • "h", for hold
  • "r", for running
  • "R", for restarted
  • "s", for suspended
  • "S", for queue suspended
  • "t", for transferring
  • "T", for threshold
  • "w", for waiting
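
The first-character rule above can be sketched as a small lookup. This is an illustration only, not the plugin's code; the function name describe_qstat_state is hypothetical, and any state whose first character is not in the list is reported as unknown.

```shell
# describe_qstat_state (hypothetical helper): map a qstat state string
# to a description, using only its first character.
describe_qstat_state() {
    case "$1" in
        d*) echo "deletion" ;;
        E*) echo "error" ;;
        h*) echo "hold" ;;
        r*) echo "running" ;;
        R*) echo "restarted" ;;
        s*) echo "suspended" ;;
        S*) echo "queue suspended" ;;
        t*) echo "transferring" ;;
        T*) echo "threshold" ;;
        w*) echo "waiting" ;;
        *)  echo "unknown" ;;
    esac
}

describe_qstat_state r
describe_qstat_state Eqw
describe_qstat_state dr
```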

Finished jobs

SGE qstat states cover only unfinished jobs, yet the SGE Cloud Plugin expects that finished jobs also have a state.  Therefore the Plugin uses the shell exit status as the state of a finished job:

  • "0" (zero) for a successfully finished job
  • "1" through "255" for a job that failed with a nonzero exit status

An exit status above 128 indicates that a signal terminated the job.  See Job Exit Status for an explanation of some exit statuses.

Finally, when the Jenkins SGE plugin could not even submit the job to SGE, the job is given the state:

  • "J", for Jenkins SGE plugin failure to submit the job
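
To make the finished-job states concrete, here is a minimal sketch of how such a state might be interpreted. It is not the plugin's implementation; the function name describe_finished_state is hypothetical. It relies on the usual shell convention that a status above 128 equals 128 plus the signal number.

```shell
# describe_finished_state (hypothetical helper): interpret the state of a
# finished job, which is either "J" or a shell exit status from 0 to 255.
describe_finished_state() {
    case "$1" in
        J) echo "not submitted: the plugin could not submit the job to SGE" ;;
        0) echo "success" ;;
        ''|*[!0-9]*) echo "unknown state: $1" ;;
        *)
            if [ "$1" -gt 128 ]; then
                # Shell convention: status = 128 + signal number.
                echo "failed: terminated by signal $(( $1 - 128 ))"
            else
                echo "failed: exit status $1"
            fi
            ;;
    esac
}

describe_finished_state 0
describe_finished_state 2
describe_finished_state 137
describe_finished_state J
```

For example, a state of 137 corresponds to signal 9 (SIGKILL), since 137 - 128 = 9.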

Viewing the Job Workspace

Each project has a Workspace button that you can use to view the project workspace files in your web browser.  This handy feature relies on the slave that executed the job.  SGE slaves are reused and if kept busy they can live a long and productive life.  However, slaves left idle for an extended time are deleted.  Once the slave is gone, the Workspace button will no longer work.  Then the files can only be viewed using other methods like the command line.

...

Jenkins adds environment variables to the environment, and these are imported into the SGE job environment.  Then SGE adds some more.  There is just one variable name collision: JOB_NAME.  So before SGE overwrites Jenkins' JOB_NAME, the Jenkins value is preserved in environment variable JENKINS_JOB_NAME.
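
The effect of this renaming can be simulated in a few lines. This is a sketch for illustration, not the plugin's code; the two values assigned to JOB_NAME are made-up examples standing in for what Jenkins and SGE would set.

```shell
# Simulation only: both JOB_NAME values below are invented examples.
JOB_NAME="my-jenkins-project"         # as Jenkins would set it
JENKINS_JOB_NAME="$JOB_NAME"          # copy kept under a non-colliding name
export JENKINS_JOB_NAME
JOB_NAME="sge_job_script"             # as SGE would later overwrite it

echo "Jenkins project: $JENKINS_JOB_NAME"
echo "SGE job name:    $JOB_NAME"
```

Inside a build script, read JENKINS_JOB_NAME when you need the Jenkins project name and JOB_NAME when you need the SGE job name.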

Project History

LSF Cloud Plugin

sge-cloud-plugin was forked from lsf-cloud-plugin and modified to work with SGE instead of LSF.

SGE Cloud Plugin

sge-cloud-plugin is being used in production at Wave Computing's Grid Engine compute farm.

While it might be nice to integrate sge-cloud-plugin and lsf-cloud-plugin into a single Jenkins plugin, this would be difficult to test, as few organizations have all batch systems installed.

Condor Cloud Plugin (future)

From time to time people inquire about a Condor version of this plugin. To create this you would fork the SGE plugin, then replace the SGE commands it sends with Condor commands.  No official Jenkins Condor Plugin has materialized, but potential candidates do turn up in a search of GitHub. Good luck.