Amazon EC2 Container Service Plugin

Use Amazon ECS containers to set up elastic, Docker-based build executors.

About

Amazon EC2 Container Service (ECS) is the AWS service for Docker container orchestration. It lets you deploy Docker-based applications on a cluster.

This plugin lets you use Amazon EC2 Container Service to manage Jenkins cloud agents.

Jenkins delegates the execution of builds to Amazon ECS, which runs them on Docker-based agents.
Each Jenkins build is executed in a dedicated Docker container that is destroyed at the end of the build.

The ECS cluster is composed of Amazon EC2 virtual machines instantiated within the boundaries of the user's account (typically in an Amazon VPC). These virtual machines can be declared statically or managed dynamically by AWS ECS using AWS Auto Scaling and AWS CloudFormation.

Jenkins agents are connected to the Jenkins master using the JNLP protocol.

Requirements

  • Jenkins version 1.609 or later
  • AWS account with permissions to create an ECS cluster

Installation

Navigate to the "Plugin Manager" screen, install the "Amazon EC2 Container Service" plugin and restart Jenkins.

Configuration

Amazon ECS cluster

As a pre-requisite, you must have created an Amazon ECS cluster with associated ECS instances. These instances can be statically associated with the ECS cluster or can be dynamically created with Amazon Auto Scaling.

The Jenkins Amazon EC2 Container Service plugin will use this ECS cluster and will automatically create the required Task Definition.

Jenkins System Configuration

Navigate to the "Configure System" screen.

In the "Jenkins Location" section, ensure that the "Jenkins URL" is reachable from the container instances of the Amazon ECS cluster. See the section "Network and firewalls" for more details.

If the global Jenkins URL configuration does not fit your needs (e.g. if your ECS agents must reach Jenkins through some kind of tunnel) you can also override the Jenkins URL in the Advanced Configuration of the ECS cloud.

At the bottom of the screen, click on "Add a new Cloud" and select "Amazon EC2 Container Service Cloud".

Amazon EC2 Container Service Cloud

Then enter the configuration details of the Amazon EC2 Container Service Cloud:

  • Name: name for your ECS cloud (e.g. `ecs-cloud`)
  • Amazon ECS Credentials: Amazon IAM Access Key with privileges to create Task Definitions and Tasks on the desired ECS cluster
  • ECS Cluster: desired ECS cluster on which Jenkins will send builds as ECS tasks
  • ECS Template: click on "Add" to create the desired ECS template or templates


Advanced Configuration

  • Tunnel connection through: tunnelling options (e.g. when Jenkins runs behind a load balancer).
  • Alternative Jenkins URL: the URL used as the Jenkins URL within the ECS containers of the configured cloud. It can be used to override the default Jenkins URL from the global configuration if needed.

ECS Agent Templates

One or several ECS agent templates can be defined for the Amazon EC2 Container Service Cloud. The main reason to create more than one ECS agent template is to use several Docker images to perform builds (e.g. java-build-tools, php-build-tools...).

  • Template name: used (prefixed with the cloud's name) as the name of the task definition in ECS.
  • Label: agent labels used in conjunction with the job level configuration "Restrict where the project can be run / Label expression". The ECS agent label could identify the Docker image used for the agent (e.g. `docker` for the `jenkinsci/jnlp-slave` image).
  • Docker image: identifier of the Docker image to use to create the agents.
  • Filesystem root: working directory used by Jenkins (e.g. `/home/jenkins/`).
  • Memory: number of MiB of memory reserved for the container. If your container attempts to exceed the memory allocated here, the container is killed.
  • CPU units: the number of CPU units to reserve for the container. A container instance has 1,024 CPU units for every CPU core.
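
The memory and CPU reservations above determine how many agent containers fit on one container instance. The following is a minimal sketch of that arithmetic, not plugin code; the function name and the instance sizes are hypothetical, and real capacity is somewhat lower because the ECS agent and the operating system also consume memory.

```python
# Rough capacity estimate for one ECS container instance (a sketch, not plugin code).
CPU_UNITS_PER_CORE = 1024  # from the template description: 1,024 CPU units per core


def agents_per_instance(cores, instance_memory_mib, agent_cpu_units, agent_memory_mib):
    """Return how many agent containers fit on one instance, limited by CPU and memory."""
    if agent_cpu_units <= 0 or agent_memory_mib <= 0:
        raise ValueError("reservations must be positive")
    by_cpu = (cores * CPU_UNITS_PER_CORE) // agent_cpu_units
    by_memory = instance_memory_mib // agent_memory_mib
    # The scarcer resource decides how many containers can be placed.
    return min(by_cpu, by_memory)


# Hypothetical instance: 2 cores and 7,680 MiB available to containers;
# each agent template reserves 512 CPU units and 2,048 MiB.
print(agents_per_instance(2, 7680, 512, 2048))
```

In this hypothetical case CPU would allow four agents but memory only three, so memory is the limiting reservation.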

Advanced Configuration

  • Override entrypoint: overrides the Docker image entrypoint. The container command can't be overridden, as it is used to pass the Jenkins agent connection parameters.
  • JVM arguments: additional arguments for the JVM, such as `-XX:MaxPermSize` or GC options.

Network and firewalls

Running the Jenkins master and the ECS container instances in the same Amazon VPC and in the same subnet is the simplest setup; the default settings will work out of the box.

Firewalls

If you enable network restrictions between the Jenkins master and the ECS cluster container instances:

  • Fix the TCP listen port for JNLP agents of the Jenkins master (e.g. `5000`) in the "Manage Jenkins / Configure Global Security" screen.
  • Allow TCP traffic from the ECS cluster container instances to the Jenkins master on the listen port for JNLP agents (see above) and on the HTTP(S) port.
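
One way to check these firewall rules is to attempt a plain TCP connection from a container instance to the Jenkins master. The sketch below uses only the Python standard library; the helper names are hypothetical, and a successful connection only shows that the port is open, not that Jenkins is answering correctly on it.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_jenkins_ports(host, ports):
    """Map each port to whether it is reachable from the machine running this script."""
    return {port: can_connect(host, port) for port in ports}
```

For example, running `check_jenkins_ports("jenkins.example.com", [8080, 5000])` on a container instance would report on an assumed HTTP port 8080 and the fixed JNLP port 5000; the host name and the HTTP port are placeholders for your own values.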

Network Address Translation and Reverse Proxies

If Network Address Translation rules apply between the ECS cluster container instances and the Jenkins master, ensure that the JNLP agents use the proper hostname to connect to the Jenkins master by doing one of the following:

  • Define the proper hostname of the Jenkins master by setting the system property `hudson.TcpSlaveAgentListener.hostName` in the launch command.
  • Use the advanced configuration option "Tunnel connection through" in the configuration of the Jenkins Amazon EC2 Container Service Cloud (see above).

IAM role

We recommend that you create a dedicated Amazon IAM role to delegate Jenkins access to your ECS cluster. The role needs the following permissions:

Effect  Action                          Resource
Allow   ecs:ListClusters                *
Allow   ecs:DescribeContainerInstances  *
Allow   ecs:RegisterTaskDefinition      *
Allow   ecs:ListTaskDefinitions         *
Allow   ecs:DescribeTaskDefinition      *
Allow   ecs:RunTask                     arn:aws:ecs:<region>:<accountId>:task-definition/<cloud name>-<template name>:*
Allow   ecs:StopTask                    arn:aws:ecs:<region>:<accountId>:cluster/<clusterName>
                                        arn:aws:ecs:<region>:<accountId>:task/*
Allow   ecs:ListContainerInstances      arn:aws:ecs:<region>:<accountId>:cluster/<clusterName>
Allow   ecs:DescribeTasks               arn:aws:ecs:<region>:<accountId>:task/*

Here is a sample policy file, if you prefer using one:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Stmt1452746887373",
            "Action": [
                "ecs:RegisterTaskDefinition",
                "ecs:ListClusters",
                "ecs:DescribeContainerInstances",
                "ecs:ListTaskDefinitions",
                "ecs:DescribeTaskDefinition"
            ],
            "Effect": "Allow",
            "Resource": "*"
        },
        {
            "Sid": "Stmt1452746887374",
            "Action": [
                "ecs:StopTask",
                "ecs:ListContainerInstances"
            ],
            "Effect": "Allow",
            "Resource": "arn:aws:ecs:<region>:<accountId>:cluster/<clusterName>"
        },
        {
            "Sid": "Stmt1452746887375",
            "Action": [
                "ecs:RunTask"
            ],
            "Effect": "Allow",
            "Resource": "arn:aws:ecs:<region>:<accountId>:task-definition/jenkins-agent:*"
        },
        {
            "Sid": "Stmt1452746887376",
            "Action": [
                "ecs:StopTask",
                "ecs:DescribeTasks"
            ],
            "Effect": "Allow",
            "Resource": "arn:aws:ecs:<region>:<accountId>:task/*"
        }
    ]
}
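
If you would rather generate the policy than edit the placeholders by hand, a small script can fill them in. This is a sketch under assumptions: the function name and all argument values are hypothetical, the optional `Sid` fields are omitted, and the task definition ARN follows the `<cloud name>-<template name>` naming described for the `ecs:RunTask` permission.

```python
import json


def build_ecs_policy(region, account_id, cluster_name, cloud_name, template_name):
    """Build the IAM policy document for the plugin as a Python dict."""
    prefix = f"arn:aws:ecs:{region}:{account_id}"
    cluster_arn = f"{prefix}:cluster/{cluster_name}"
    task_arn = f"{prefix}:task/*"
    # Assumption: the task definition family is "<cloud name>-<template name>".
    task_definition_arn = f"{prefix}:task-definition/{cloud_name}-{template_name}:*"

    def statement(actions, resource):
        return {"Effect": "Allow", "Action": actions, "Resource": resource}

    return {
        "Version": "2012-10-17",
        "Statement": [
            statement(
                [
                    "ecs:RegisterTaskDefinition",
                    "ecs:ListClusters",
                    "ecs:DescribeContainerInstances",
                    "ecs:ListTaskDefinitions",
                    "ecs:DescribeTaskDefinition",
                ],
                "*",
            ),
            statement(["ecs:StopTask", "ecs:ListContainerInstances"], cluster_arn),
            statement(["ecs:RunTask"], task_definition_arn),
            statement(["ecs:StopTask", "ecs:DescribeTasks"], task_arn),
        ],
    }


# Hypothetical values; replace them with your own region, account and names.
policy = build_ecs_policy("us-east-1", "123456789012", "jenkins", "ecs-cloud", "docker")
print(json.dumps(policy, indent=4))
```

The printed JSON can be pasted into the IAM console as a policy document.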

Usage

The ECS agents can be used for any job and any type of job (Freestyle job, Maven job, Workflow job...). You just have to restrict the execution of the job to one of the labels used in the ECS Agent Template configuration. For example, with a label named `docker`, enter `docker` in the job's "Restrict where the project can be run / Label expression" field.

In the console output of the executed builds, you can verify that the build was performed on the ECS cluster by checking the agent name, which is composed of the ECS cloud name and a random identifier. For a build executed on an agent managed by an ECS cloud named `ecs-cloud`, the agent name starts with `ecs-cloud-`.

Docker Images for ECS Agents

The Jenkins Amazon EC2 Container Service Cloud can use any Docker image designed to act as a Jenkins JNLP agent. Compatible Docker images include `jenkinsci/jnlp-slave` and `cloudbees/jnlp-slave-with-java-build-tools`.

You can easily extend one of these images to add tools or you can create your own Docker image.

Versions

see Changelog

43 Comments

  1. If there is a more correct place to post this, obviously let me know.

    Works wonderfully with the recommended jnlp-slaves from Docker Hub. I built a copy of the jnlp-slave that is CentOS-based and stored in ECR. I can manually log in to the ECS host and run the image. But through the plugin on that same host the task is never kicked off. I tried using the short name, e.g. "myrepo/jnlp-slave", and the ECR fully qualified name "blah blah aws/myrepo/jnlp-slave".

    1. Looking at the code, should be working.   User being used is an Admin for now.  So full permissions.

    2. Start the teasing.  

  2. Can the EC2 instances be automatically terminated/killed, as with the Amazon EC2 Plugin (https://wiki.jenkins-ci.org/display/JENKINS/Amazon+EC2+Plugin)?

  3. Trying to get this working with a Windows ECS cluster but not having any luck connecting the slave to the master over JNLP. The agent name and secret key are not coming in. This works fine for us with Linux Amazon ECS. Is Windows ECS supported, and if so, would you be able to provide any examples of this for Windows?

    1. I figured out my issue. There were a few things that I wasn't understanding, and I was able to see my problems by looking at the stopped Docker container.

      1. The plugin sends in the URL, slave name, and secret using RunTask, and the way you access these parameters is by having an ENTRYPOINT. That ENTRYPOINT then automatically receives the entire run command, in this case the JNLP connection parameters, as its arguments.
      2. In your docker file you will need to download java and also download slave.jar from Jenkins.  Best url for this is 
      3. The ENTRYPOINT for my dockerfile looks like this:
        1. SHELL ["cmd"]
          ENTRYPOINT ["c:\\jenkins\\start_slave.cmd"]
        2. start_slave.cmd looks like this:
          java -cp slave.jar hudson.remoting.jnlp.Main -headless %*
      1. Could you please clarify how you configured it? I can't build a job on a Windows ECS cluster. Also, I tried to use https://github.com/alisade/ecs-windows-jnlp-slave instead of jenkins/jnlp-slave because of Windows (smile)
        But it doesn't work (sad)

        Jenkins log attached: jenkins-log.txt

  4. Does anyone else face the error "Jenkins doesn’t have label ecs" during the build process? I have already created a label during Add Template for the cloud.

    Using Ubuntu 16, Jenkins 2.60.1 

    1. Maybe you have a space after the word 'ecs ' in the 'Restrict where this project can be run' field?

  5. Could you add a Usage option to this plugin, so I can set usage to "use this node as much as possible"? Then I could set # of executors to 0 to ensure that all builds run in newly provisioned containers instead of on the Jenkins master. The Docker plugin has this option.

    http://imgur.com/a/1y4v2

  6. Found a typo while trying to figure this out: "Match on container defintion"

    Is there a way to diagnose why it is not matching container definitions?

    1. Ok, I didn't understand what this meant; it is just saying that it hasn't found the task already running. I thought this had something to do with why the task wasn't actually starting - I'll have to look elsewhere for that apparently. (However, that typo really does exist. (smile) )

      1. FTR: My issue was I was using jenkinsci/slave instead of jenkinsci/jnlp-slave.

  7. I created a fork and a PR to add "Container User" - we had a need to override the container user. Not sure if anyone else will find that useful.

  8. Hey, thanks for the great plugin. We use it to GREAT effect dozens of times per day... Anyway, curious if you could shed some light on how to direct the plugin to use a different task placement strategy than binpack. This strategy doesn't make effective use of the multiple ECS host instances that we have dedicated to agent containers, as it sequentially launches the containers on one instance and only moves to the next available instance when the first runs out of CPU units. Thanks!

  9. I am looking at the ECS plugin and the Spot Fleet Plugin. 

    https://wiki.jenkins.io/display/JENKINS/Amazon+EC2+Container+Service+Plugin
    https://wiki.jenkins.io/display/JENKINS/Amazon+EC2+Fleet+Plugin

    Ideally I would like to be able to use container images AND spot/spot fleet (for the autoscale slave nodes). Is this possible? 

    1. I haven't used the Fleet Plugin, but I would wager that you could have both installed. You likely would have to specifically point workloads at specific stacks (i.e. it probably couldn't intelligently decide to use one build agent stack over another depending on input criteria such as time of day, etc.), but I can't think of any reason why it wouldn't work.

    2. Would pointing your ECS plugin to a cluster running on spot instances help? I mean, there isn't a specific spot container type, as far as I'm aware; either your ECS-backing nodes are on-demand/reserved or they are spot.

  10. Is it possible to use this plugin with a multibranch pipeline?

  11. Are there any plans to support ECS Fargate? This would be ideal to support a fully dynamic ECS environment. Obviously this would require some modification to the plugin as EC2 tasks are not compatible with Fargate clusters.

    1. Someone has in fact opened a PR on the GitHub project for this plugin to add support for this very thing (https://github.com/jenkinsci/amazon-ecs-plugin/pull/51). Unfortunately, as there is currently no maintainer for this plugin, it has been sitting around in the PR review queue for some time.

      What I did was fork the plugin repo on GitHub and merge the PR code in so that I can use the plugin with Fargate. I also added a couple of PRs of my own for task execution role and ulimit config support (#53 and #54).

      You can find my fork here if you want to check it out, until such time as the PRs are merged into the official repo:

      https://github.com/jcragg/amazon-ecs-plugin

  12. Is there a way to use this within a pipeline so that the same container will be used for multiple stages? Currently it terminates the container after the first stage.

  13. Isn't the point of using Docker slaves to specify a container including the specific tools for one pipeline? What's the point of a plugin that just runs a single type of Docker container with no tools? Or am I missing something? Can I specify a container with a variant of e.g. agent { docker 'ruby' }?

  14. I've been messing with this for a few hours and I can't get it to work. I'm seeing the following error message.

    WARNING: Unexpected exception encountered while provisioning agent ECS Slave generic
    com.amazonaws.services.ecs.model.ClientException: The user value contains invalid characters. Enter a value that matches the pattern ^([a-z0-9_][a-z0-9_-]{0,30})$ (Service: AmazonECS; Status Code: 400;
    1. What's the value of your "user"?

      1. Good question. The logs don't say. $USER = jenkins.  I'm not quite sure what "user" even is in this context. Mine is set up to allow ECS access via IAM role. 

        1. I had this regression apparently with 1.13 too. Managed to fix it by clicking Advanced under ECS Slave Templates for my agent cloud, and setting ContainerUser to something non-empty (which it was defaulted to).

          Not sure what this is being wired up to in the ECS API, but suggest it should have a sensible default.

    2. I have this problem too, and I had to downgrade back to version 1.12.

    3. I forgot to mention it here, but v1.14 has been released and should fix that issue. Sorry for the inconvenience!

  15. Is there any Docker image like jenkinsci/jnlp-slave or cloudbees/jnlp-slave-with-java-build-tools that I can use to build my application's Docker image?

    1. I derived my image from one of these, added Docker CE, and mounted the host's /var/run/docker.sock into the container at /var/run/docker.sock. Afterwards I used the Amazon ECR plugin as a credential provider for ECR and the CloudBees Docker Build and Publish plugin as a step to build and publish the image.

  16. What are the requirements to execute a container in privileged mode? When I tick privileged mode for my template, I get the following error:

    04-May-2018 17:19:41.468 WARNING [Computer.threadPoolForRemoting [#6288]] com.cloudbees.jenkins.plugins.amazonecs.ECSService.runEcsTask Slave testECS-37ce414184fbe3 - Failure to run task with definition arn:aws:ecs:us-west-2:6546456465:task-definition/testECS-DefaultCentosJSlaveWithDockerDaemon:5 on ECS cluster arn:aws:ecs:us-west-2:6546456465:cluster/jenkins
    04-May-2018 17:19:41.469 WARNING [Computer.threadPoolForRemoting [#6288]] com.cloudbees.jenkins.plugins.amazonecs.ECSService.runEcsTask Slave testECS-37ce414184fbe3 - Failure reason=ATTRIBUTE, arn=arn:aws:ecs:us-west-2:6546456465:container-instance/180f62b9-a8aa-4222-9ce9-468df185fce9
  17. Hi there,

    Is there a way to set the Task Role on an ECS slave template?

    My Docker container has the AWS CLI installed and so far calls it with the secret/access key from environment variables. But I'm looking to remove all access/secret keys and only use a role.

    However, I could not find a way to assign a role to a task run by this plugin.

    1. When you click on "Advanced", you can also define the TaskRole ARN.

  18. From my understanding of this plugin with the latest release (1.6), are we still required to manually set up an ECS Task Definition in AWS first when we want to launch a slave in FARGATE?

    I have set up all the configs, yet I get an error:

    • No existing task definition found for family or ARN: xxx 

    • No Fargate configuration exists for given values.

    By that definition, does that mean that on top of the configs we set up in Jenkins for this plugin, we also need to prepare an ECS Task Definition in AWS that matches the exact configs we gave to Jenkins?

    But if that is the case, it does not make sense to me because there is a config for "Task Definition Override" which stated "Externally-managed ECS task definition to use, instead of creating task definitions using the Template Name. This value takes precedence over all other container settings." So I assumed the plugin does have the ability to create its own task definition?

    1. Basically, it should work both ways:
      1) Set up all the settings inside Jenkins, and the plugin creates the task definition automatically
      2) Define your own task definition and set the ARN in "Task Definition Override". For Fargate, you still have to provide Subnets and SecurityGroup

      Actually, I haven't tried letting the plugin create a Fargate task definition myself. Maybe there's still a bug?

      1. Yes, it does seem like the plugin is not creating the Task definition when using Fargate

        From the logs

        Asked to provision 1 slave(s) for: ecs-java
        Jun 12, 2018 12:41:34 PM INFO com.cloudbees.jenkins.plugins.amazonecs.ECSCloud provision
        Will provision ECS Slave ecs-java, for label: ecs-java
        Jun 12, 2018 12:41:34 PM INFO hudson.slaves.NodeProvisioner$StandardStrategyImpl apply
        Started provisioning ECS Slave ecs-java from myproject-ecs with 1 executors. Remaining excess workload: 0
        Jun 12, 2018 12:41:34 PM INFO com.cloudbees.jenkins.plugins.amazonecs.ECSCloud$ProvisioningCallback call
        Created Slave: myproject-ecs-751ea9bb93e6
        Jun 12, 2018 12:41:35 PM INFO com.cloudbees.jenkins.plugins.amazonecs.ECSService findTaskDefinition
        No existing task definition found for family or ARN: myproject-ecs-ecs-java-slave
        Jun 12, 2018 12:41:35 PM WARNING com.cloudbees.jenkins.plugins.amazonecs.ECSCloud$ProvisioningCallback call
        Slave {0} - Cannot create ECS Task
        Jun 12, 2018 12:41:44 PM WARNING hudson.slaves.NodeProvisioner$2 run
        Unexpected exception encountered while provisioning agent ECS Slave ecs-java
        com.amazonaws.services.ecs.model.ClientException: No Fargate configuration exists for given values.

        And my plugin settings were just

    2. You need to specify valid values for CPU and Memory reservation for FARGATE. You cannot use any value. 

      Please see the table provided in the AWS config. I had the same issue as you and fixed it using the right values.

      https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task-cpu-memory-error.html

      I also specified Hard Memory Reservation instead of Soft.

      Hope it helps!

  19. Hi All,

    I have successfully configured this plugin and it works fine, but it has the following limitation:

    If I use the MultiJob plugin with more than two subjobs, only two jobs run in parallel, and the others wait for an available executor.

    Do you know how to solve this? It looks like a maximum of 3 ECS container slaves can run with the same label in parallel.

    Thank you!

    Eli

  20. Hi, All

    I managed to configure the plugin with Fargate. I can confirm that the Fargate launch type can't create the task definition; I had to create it manually. I have the same question as Eli Sh: are there any limitations on the number of tasks running at the same time for one label?

    Regards
    Alex

  21. I am running Jenkins (2.121.2) as a container backed by EFS as file store with amazon-ecs:1.11.

    I recently upgraded the plugin to amazon-ecs:1.16 by modifying plugins.txt (which is run in the Jenkins Dockerfile).

    After the upgrade I can see that the plugin version is the latest, but the configuration page for ECS still shows the old screen (no Fargate, Template Name/Task Definition Override, etc.).

    I then uninstalled the amazon-ecs:1.16 plugin, restarted Jenkins, and installed amazon-ecs:1.16 through the console again.

    Even after that, I am unable to see the new configuration. Is there something wrong with my upgrade process? If so, please guide me.

  22. Would somebody please give us some advice on how to HARVEST AND KEEP agent logs after the slave is destroyed? Most of the time our jobs reach some point where they suddenly drop the I/O channel and close the connection between agent and master without any further information. The channel is being closed from the slave side without any clue as to what caused the termination (perhaps OOM? too many open files? hard to know!). Thanks!

    1. I am having exactly the same problem: build starts, task is executed in ecs cluster, task runs for 2-3 minutes and then gets terminated, essentially cancelling/interrupting the build. Could not figure out the root cause, but will let you know if I find something.

      Speaking of agent logs, you can find them:

      1. Some logs are available directly in your ECS cluster task log output
      2. The same logs are also visible in AWS CloudWatch 
      3. You can also see agent logs in  https://yourjenkins.com/log/all