This plugin uses Gearman to support multiple Jenkins masters.
We on the OpenStack infrastructure team use Jenkins extensively. At peak load, our Jenkins servers run 7000+ jobs per day. At that load we require many Jenkins slaves (200+) to process all of those build jobs. We found that our requirements were pushing Jenkins beyond its limits, so we decided to create the Gearman Plugin to support multiple Jenkins masters. The Gearman plugin was designed to support extra slaves, allow load balancing of build jobs, and provide redundancy.
Jenkins core does not support multiple masters. You can set up multiple Jenkins masters, but there is no coordination between them. One problem with scheduling builds on a Jenkins master (“MasterA”) is that MasterA only knows about its connected slaves. If all slaves on MasterA are busy, then MasterA will just put the next scheduled build on the Jenkins server queue. Now MasterA needs to wait for an available slave to run the build. This is very inefficient if your builds take a long time to run. So what if there is another Jenkins master (“MasterB”) that has free slaves to service the next scheduled build on the server's queue? You're probably saying, “Then slaves on MasterB should run the build instead of waiting for slaves on MasterA to run it”, and I would say "good thought!". However, MasterB will never service the builds on MasterA's queue. The client that schedules the builds must know about MasterB and then schedule builds on MasterB. This is what we mean by lack of coordination between masters. The Gearman plugin attempts to fill that gap.
This plugin integrates Gearman with Jenkins so that any Jenkins slave on any Jenkins master can service a job in the queue. The plugin essentially replaces the Jenkins (master) build queue with the Gearman job queue. A job stays in the Gearman queue until there is a Jenkins node (master or slave) that can run it. Because the Gearman job queue is shared by multiple Jenkins masters, Gearman can hand out jobs to the next available slave on any Jenkins master.
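The effect of a shared queue can be illustrated with a small simulation. This is only a sketch of the idea, not how Gearman or the plugin is implemented: the `dispatch` function, the job names, and the slave names are hypothetical, and real Gearman workers pull jobs from the server rather than being assigned them in a loop.

```python
from collections import deque


def dispatch(jobs, idle_executors):
    """Pair queued jobs with idle executors, whichever master owns them.

    jobs: job names in queue order.
    idle_executors: (master, node) tuples for executors that are free.
    Returns (assignments, still_queued).
    """
    queue = deque(jobs)
    idle = deque(idle_executors)
    assignments = []
    # A job leaves the shared queue only when some executor is free.
    while queue and idle:
        assignments.append((queue.popleft(), idle.popleft()))
    return assignments, list(queue)


# MasterA has no idle executors; MasterB has two.
idle = [("MasterB", "slave-1"), ("MasterB", "slave-2")]
assigned, waiting = dispatch(["job-1", "job-2", "job-3"], idle)
```

Here the first two jobs run on MasterB's slaves even though MasterA is saturated, and the third job waits in the shared queue until any executor on either master becomes free.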
- High availability (ish). When one master goes down, the other master(s) will continue to execute builds; however, the in-flight jobs on the down master will be lost.
- Slaves are (by default) always shared between masters. The only way to un-share a slave is to take it offline or disconnect it.
- Horizontal scalability. Just continue to add more Jenkins masters to distribute the load between masters.
- Gearman jobs can start a jenkins build
- Gearman jobs can stop or abort a jenkins build
- Gearman jobs can change a build description
- Gearman jobs can pass in parameters to jenkins builds
- Gearman jobs can automatically set a slave to offline after running a build
- Gearman is aware of Jenkins project status: meaning that gearman will register/unregister projects when the project is enabled or disabled.
- Gearman is aware of slave status: meaning that gearman will register/unregister slaves when a slave is set online/offline and connected/disconnected.
- Plugin reloads on jenkins restart: meaning that when jenkins restarts the gearman worker threads are automatically restarted and reconnect to a gearman server.
- Gearman functions are updated dynamically when the Jenkins executors change.
- This plugin does not register functions correctly for Jenkins Matrix Projects.
This assumes some familiarity with Jenkins and Gearman.
- Install the Gearman plugin like any other Jenkins plugin; refer to the Jenkins documentation for details. You can also get the plugin directly from the Jenkins CI Repository.
- If you don't already have a Gearman server up and running somewhere you should install one. This plugin will work with the following gearman servers:
- The python gear package (recommended). This is the one we use and test with.
- The C gearmand server, version 1.1.7 and later. To get later versions of the Gearman server you may need to build from source.
- Theoretically the plugin should work with any gearman server, but we've only used it with the python gear implementation.
After installation the Gearman plugin configuration should appear in the Jenkins global configuration page. You should test the connection to your Gearman Server before running the workers. The 'Enable Gearman' checkbox will start the gearman workers on the Jenkins server. Click on the help bubbles if you need additional help with the configuration.
Starting the Gearman workers:
- When the gearman plugin is enabled a gearman worker thread is spawned for each executor on the master and slave nodes. We'll call these "executor worker threads". Each executor worker thread is associated 1:1 with a Jenkins executor.
- We spawn one more Gearman worker thread to handle job management (e.g. abort, update). We'll call it the "management worker thread". It registers a "stop:$hostname" and a "set_description:$hostname" function with the Gearman server. We use these functions to manage Jenkins builds.
- The Gearman plugin will register Gearman functions for each Gearman executor depending on the projects, labels, and nodes that have been set up on the Jenkins master. You can check the registered Gearman functions using the administration protocol. It should look something like this:
- Red text denotes gearman admin commands
- Blue text denotes gearman workers. There is a default manager worker for the master and an executor worker for a jenkins executor on master. There are two gearman executor workers for oneiric-668599 slave (exec-0 & exec-1). These executor workers map to two jenkins executors on the oneiric-668599 slave.
- Functions like "build:guava:ubuntu" map to "build:$projectName:$nodeLabel"
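As a rough illustration of checking registered functions, the sketch below sends the text command `status` of the Gearman administration protocol to the server and parses the reply, where each line is `FUNCTION<TAB>TOTAL<TAB>RUNNING<TAB>AVAILABLE_WORKERS` and the reply ends with a line containing only `.`. The helper names `parse_status` and `fetch_status`, the default host, and port 4730 (Gearman's usual default) are assumptions for this example.

```python
import socket


def parse_status(text):
    """Parse the reply to the Gearman admin "status" command."""
    functions = {}
    for line in text.splitlines():
        if line == ".":  # a lone dot terminates the reply
            break
        if not line.strip():
            continue
        name, total, running, workers = line.split("\t")
        functions[name] = {
            "total": int(total),      # jobs in the queue for this function
            "running": int(running),  # jobs currently being worked on
            "workers": int(workers),  # workers able to run this function
        }
    return functions


def fetch_status(host="localhost", port=4730, timeout=5.0):
    """Ask a Gearman server (assumed host/port) for its function status."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(b"status\n")
        data = b""
        # Read until the terminating ".\n" line arrives.
        while not (data == b".\n" or data.endswith(b"\n.\n")):
            chunk = conn.recv(4096)
            if not chunk:
                break
            data += chunk
    return parse_status(data.decode("utf-8"))
```

Calling `fetch_status("my-gearman-server")` (a hypothetical hostname) returns a dictionary keyed by function name, so a function such as "build:guava:ubuntu" should appear with a non-zero worker count once the plugin's executor workers have registered.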
Here's the corresponding Jenkins:
A Gearman client can be written in any language. Here are a few sample clients that work with this plugin:
- gearman-plugin-client is a simple test client (below examples use this client)
- java client is a simple client included with jenkins-plugin.hpi
- Zuul client is the smart client we use in production.
Running a Jenkins build
To start a Jenkins build, the Gearman client just needs to provide the Gearman hostname, port, function, and UUID.
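A minimal sketch of such a client follows, assuming the python gear package is installed. The helper names `build_request` and `submit_build` are hypothetical, passing build parameters as a JSON object is an assumption about the payload format, and the gear calls (`Client`, `addServer`, `waitForServer`, `Job`, `submitJob`) should be checked against the version of the package you have installed.

```python
import json
import uuid


def build_request(project, label=None, params=None):
    """Return (function, payload, unique) for a build job.

    The function name follows the build:$projectName:$nodeLabel pattern;
    omitting the label is an assumption of this sketch.
    """
    function = "build:" + project
    if label:
        function += ":" + label
    payload = json.dumps(params or {})  # assumed payload format: JSON object
    unique = uuid.uuid4().hex           # UUID identifying this job request
    return function, payload, unique


def submit_build(host, port, project, label=None, params=None):
    """Submit the build job through the third-party python gear package."""
    import gear  # imported here so build_request works without it

    function, payload, unique = build_request(project, label, params)
    client = gear.Client()
    client.addServer(host, port)
    client.waitForServer()
    job = gear.Job(function.encode("utf-8"), payload.encode("utf-8"),
                   unique=unique.encode("utf-8"))
    client.submitJob(job)
    return job
```

For example, `submit_build("my-gearman-server", 4730, "guava", "ubuntu")` (hypothetical hostname) would request the "build:guava:ubuntu" function.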
Stopping/aborting a jenkins build
A Gearman job can stop/abort a jenkins build.
The job is stopped differently depending on the current state of the job. The list below explains each state or transition and what a cancellation does at that point:
- Sending a job request to Gearman puts it on the Gearman queue. Cancellation: the job is removed from the Gearman queue.
- Jobs on the Gearman queue transition to the Jenkins queue. Cancellation: the job is removed from the Jenkins queue.
- Jobs on the Jenkins queue transition to a Jenkins executor to run. Cancellation: the build is aborted while on the Jenkins executor.
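The state-dependent behaviour described above can be summarised as a simple lookup. This is only an illustration of the three cases; the state labels and helper names are hypothetical and are not identifiers used by the plugin. The "stop:$hostname" function name is the one registered by the management worker thread.

```python
# Hypothetical state labels mapped to the effect of a cancellation.
CANCEL_EFFECT = {
    "gearman_queue": "job is removed from the Gearman queue",
    "jenkins_queue": "job is removed from the Jenkins queue",
    "jenkins_executor": "build is aborted on the Jenkins executor",
}


def stop_function(master_hostname):
    """Name of the management function that stops builds on one master."""
    return "stop:" + master_hostname


def cancel_effect(state):
    """Describe what cancelling does for a job in the given state."""
    try:
        return CANCEL_EFFECT[state]
    except KeyError:
        raise ValueError("unknown job state: %r" % (state,))
```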
Updating a build description
You can send a gearman job to update a build's description. To do this you pass in the following parameters: name of project, build number, description.
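A sketch of building such a request is shown below. The function name "set_description:$hostname" is the one registered by the management worker thread; the JSON key names (`name`, `number`, `html_description`) and the helper name are assumptions for illustration and should be verified against the plugin or one of the sample clients.

```python
import json


def set_description_request(master_hostname, project, build_number, description):
    """Return (function, payload) for updating a build's description."""
    function = "set_description:" + master_hostname
    payload = json.dumps({
        "name": project,                  # assumed key: name of project
        "number": str(build_number),      # assumed key: build number
        "html_description": description,  # assumed key: new description
    })
    return function, payload
```

The resulting function name and payload can then be submitted to the Gearman server with whichever client you use.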
Set slave to offline after a build completes
Our infrastructure employs many 'single use slaves', so what we like to do is run a job and then immediately set the slave offline. You can do this by passing in the parameter 'OFFLINE_NODE_WHEN_COMPLETE'.
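As a sketch, the parameter can be merged into the build parameters before the job is submitted. The helper name is hypothetical, and both the value "1" and the JSON payload format are assumptions; check which values the plugin actually accepts.

```python
import json


def single_use_params(params=None):
    """Return a JSON payload that asks for the node to go offline afterwards."""
    merged = dict(params or {})  # copy so the caller's dict is not modified
    merged["OFFLINE_NODE_WHEN_COMPLETE"] = "1"  # assumed value
    return json.dumps(merged)
```

The returned string would be used as the payload of a "build:..." job in place of the plain build parameters.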
Plugin In Action
Plugin In Production
The above images just show how the plugin might work in a simple case. To see the plugin used in production, check out the OpenStack Jenkins servers (yes, that's servers with an s):
- jenkins01 - we use this master to run operational jobs
- jenkins02 - we use this master to run openstack project builds
- jenkins03 - this is essentially a mirror of jenkins01.
All of the above masters use this plugin, which means all of them can run any jobs that are sent to the Gearman server. We have lots of documentation on how we run the system in production.
- 0.0.1 - initial release.
- Bunch of fixes
- ability to cancel gearman jobs from its queue
- ability to set jenkins job descriptions
- ignore non-deterministic build failure and log it
- Don't wait for the worker thread to join
- remove restriction on slave to run single job at a time
- Use more fine-grained synchronization in GearmanProxy
- Rework starting/stopping executors
- moved python examples to jenkins wiki
- Add OFFLINE_NODE_WHEN_COMPLETE option
- Set a node offline even if there is an exception
- Always return WORK_COMPLETE when a build finishes regardless of the result