We were lucky enough to get early hands-on access to the Amazon EC2 Container Service (ECS) preview. In this article we give an overview of what is currently possible with the service and what is still under development, which should give you a good idea of what to expect from the final product.
What is ECS?
Amazon EC2 Container Service is basically “a highly scalable, high performance container management service that supports Docker containers and allows you to easily run distributed applications on a managed cluster of Amazon EC2 instances. Amazon EC2 Container Service lets you launch and stop container-enabled applications with simple API calls, allows you to query the state of your cluster from a centralized service, and gives you access to many familiar Amazon EC2 features like security groups, EBS volumes and IAM roles.” as stated on the AWS product and services page.
Prerequisites
To get started with the Amazon Container Service, two things are needed: an IAM role that allows the individual EC2 instances to communicate with the Amazon Container Service API, and a special preview version of the AWS CLI. Unfortunately there isn’t a web GUI at this early stage; the container service can only be accessed via the CLI.
Launching the cluster
Now it is time to launch the cluster and its first instance. There is a special Amazon Linux AMI that you can use for this, but the agent runs on pretty much any image that can run Docker. So we opted for a base Ubuntu 14.04 image and fired up an EC2 instance with the IAM role created earlier.
The first step is to create a new cluster with the AWS CLI:
$ aws ecs create-cluster --cluster-name HappyCloudSolutions
{
    "cluster": {
        "clusterName": "HappyCloudSolutions",
        "status": "ACTIVE",
        "clusterArn": "arn:aws:ecs:us-east-1:aws_account_id:cluster/HappyCloudSolutions"
    }
}
After logging in to the newly launched instance, you need to install Docker and then launch a special ECS container that acts as an agent for the Amazon Container Service:
$ curl -sSL https://get.docker.com/ubuntu/ | sudo sh
$ sudo docker run --name ecs-agent -d \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v /var/log/ecs/:/log -p 127.0.0.1:51678:51678 \
    -e ECS_LOGFILE=/log/ecs-agent.log \
    -e ECS_LOGLEVEL=info \
    -e ECS_CLUSTER=HappyCloudSolutions \
    amazon/amazon-ecs-agent:latest
The ECS_CLUSTER parameter needs to match the cluster name created earlier; this makes it possible to run different clusters throughout the EC2 environment. The docker ps -a command shows the running container:
$ docker ps -a
CONTAINER ID   IMAGE                             COMMAND    CREATED         STATUS         PORTS                        NAMES
ff80a2cd157a   amazon/amazon-ecs-agent:0dec3e3   "/agent"   8 seconds ago   Up 7 seconds   127.0.0.1:51678->51678/tcp   ecs-agent
Using the AWS CLI you can now list the container instances registered for the Amazon Container Service:
$ aws ecs list-container-instances --cluster HappyCloudSolutions
{
    "containerInstanceArns": [
        "arn:aws:ecs:us-east-1:aws_account_id:container-instance/bad8b85e-637d-42be-9264-e56e77687133"
    ]
}
This reveals the registered instance and its unique container instance ID, bad8b85e-637d-42be-9264-e56e77687133. Using this ID, more information can be obtained with the describe-container-instances command:
$ aws ecs describe-container-instances --cluster HappyCloudSolutions --container-instances bad8b85e-637d-42be-9264-e56e77687133
{
    "failures": [],
    "containerInstances": [
        {
            "status": "ACTIVE",
            "registeredResources": [
                { "integerValue": 1024, "longValue": 0, "type": "INTEGER", "name": "CPU", "doubleValue": 0.0 },
                { "integerValue": 3764, "longValue": 0, "type": "INTEGER", "name": "MEMORY", "doubleValue": 0.0 },
                { "name": "PORTS", "longValue": 0, "doubleValue": 0.0, "stringSetValue": [ "2376", "22", "51678", "2375" ], "type": "STRINGSET", "integerValue": 0 }
            ],
            "ec2InstanceId": "i-0a2112f4",
            "agentConnected": true,
            "containerInstanceArn": "arn:aws:ecs:us-east-1:aws_account_id:container-instance/bad8b85e-637d-42be-9264-e56e77687133",
            "remainingResources": [
                { "integerValue": 1024, "longValue": 0, "type": "INTEGER", "name": "CPU", "doubleValue": 0.0 },
                { "integerValue": 3764, "longValue": 0, "type": "INTEGER", "name": "MEMORY", "doubleValue": 0.0 },
                { "name": "PORTS", "longValue": 0, "doubleValue": 0.0, "stringSetValue": [ "2376", "22", "51678", "2375" ], "type": "STRINGSET", "integerValue": 0 }
            ]
        }
    ]
}
The most interesting part is remainingResources, as it shows the amount of CPU (1024 units, or one core) and memory (3764 MB) available. This means our one-node cluster is now up and running and ready to run containers, but how do we start them?
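If you want to read these values in a script rather than by eye, the JSON can be flattened with a few lines of Python. This is only a sketch: the sample is a trimmed copy of the output above, and the helper names arn_resource_id and summarize_resources are our own, not part of the AWS CLI.

```python
import json

# Trimmed copy of the describe-container-instances output shown above.
SAMPLE = """
{
  "containerInstances": [
    {
      "ec2InstanceId": "i-0a2112f4",
      "containerInstanceArn": "arn:aws:ecs:us-east-1:aws_account_id:container-instance/bad8b85e-637d-42be-9264-e56e77687133",
      "remainingResources": [
        {"name": "CPU", "type": "INTEGER", "integerValue": 1024},
        {"name": "MEMORY", "type": "INTEGER", "integerValue": 3764},
        {"name": "PORTS", "type": "STRINGSET",
         "stringSetValue": ["2376", "22", "51678", "2375"]}
      ]
    }
  ]
}
"""


def arn_resource_id(arn):
    """Return the bare ID, i.e. everything after the last '/' of an ARN."""
    return arn.rsplit("/", 1)[-1]


def summarize_resources(resources):
    """Flatten a resource list into a {name: value} dictionary."""
    summary = {}
    for res in resources:
        if res["type"] == "STRINGSET":
            # Port numbers arrive as strings; sort them numerically for display.
            summary[res["name"]] = sorted(res["stringSetValue"], key=int)
        else:
            summary[res["name"]] = res["integerValue"]
    return summary


instance = json.loads(SAMPLE)["containerInstances"][0]
instance_id = arn_resource_id(instance["containerInstanceArn"])
remaining = summarize_resources(instance["remainingResources"])
print(instance_id, remaining)
```

The same arn_resource_id helper works for task ARNs, since they follow the same pattern with the ID after the last slash.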
Defining a task
A task is basically a JSON description of what should run on a cluster. A simple example would be a WordPress site with a MySQL database as its backend. With the register-task-definition command we can add a task to the cluster, in this case from the wordpress.json file:
$ aws ecs register-task-definition --family wordpress --container-definitions file://wordpress.json
{
    "taskDefinition": {
        "taskDefinitionArn": "arn:aws:ecs:us-east-1:aws_account_id:task-definition/wordpress:1",
        "containerDefinitions": [
            {
                "environment": [
                    { "name": "DB_USER", "value": "root" },
                    { "name": "DB_PASS", "value": "pass" }
                ],
                "name": "wordpress",
                "links": [ "db" ],
                "image": "tutum/wordpress-stackable",
                "essential": true,
                "portMappings": [
                    { "containerPort": 80, "hostPort": 80 }
                ],
                "entryPoint": [ "/bin/sh", "-c" ],
                "memory": 500,
                "cpu": 10
            },
            {
                "environment": [
                    { "name": "MYSQL_ROOT_PASSWORD", "value": "pass" }
                ],
                "name": "db",
                "image": "mysql",
                "cpu": 10,
                "portMappings": [],
                "entryPoint": [ "/entrypoint.sh" ],
                "memory": 500,
                "essential": true
            }
        ],
        "family": "wordpress",
        "revision": 1
    }
}
The information in the JSON file is similar to what is required for a docker run command, including which image to use, the name, port mappings, and the CPU and memory required. Specific to ECS is the essential: true pair, which marks the container as essential for the task: if it fails, the whole task is stopped. It is also possible to store multiple revisions of the same task; in this case it is revision 1. With the list-task-definitions command we can see the task definitions available:
$ aws ecs list-task-definitions
{
    "taskDefinitionArns": [
        "arn:aws:ecs:us-east-1:aws_account_id:task-definition/wordpress:1"
    ]
}
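The wordpress.json file itself is not reproduced in this article, so the following is only a sketch of how such a file could be generated. It assumes the file holds a JSON array of container definitions using the keys echoed back by register-task-definition above; the container helper is our own, and the entryPoint settings are left out to keep the example short.

```python
import json


def container(name, image, cpu, memory, environment=None, links=None, port_mappings=None):
    """Build one container definition using the keys seen in the output above."""
    definition = {
        "name": name,
        "image": image,
        "cpu": cpu,
        "memory": memory,
        "essential": True,  # a failure of this container stops the whole task
        "environment": [{"name": k, "value": v} for k, v in (environment or {}).items()],
        "portMappings": port_mappings or [],
    }
    if links:
        definition["links"] = links
    return definition


definitions = [
    container(
        "wordpress", "tutum/wordpress-stackable", cpu=10, memory=500,
        environment={"DB_USER": "root", "DB_PASS": "pass"},
        links=["db"],
        port_mappings=[{"containerPort": 80, "hostPort": 80}],
    ),
    container(
        "db", "mysql", cpu=10, memory=500,
        environment={"MYSQL_ROOT_PASSWORD": "pass"},
    ),
]

# Save this text as wordpress.json to pass it via file://wordpress.json.
wordpress_json = json.dumps(definitions, indent=2)
print(wordpress_json)
```

Generating the file this way makes it easier to keep the cpu and memory values of several containers consistent than editing the JSON by hand.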
The run-task command will then start the task in the cluster; with the count option we define how many instances of the task should be running:
$ aws ecs run-task --cluster HappyCloudSolutions --task-definition wordpress:1 --count 1
{
    "failures": [],
    "tasks": [
        {
            "taskArn": "arn:aws:ecs:us-east-1:aws_account_id:task/a4115c20-717e-444f-ba45-26933af9f191",
            "overrides": {
                "containerOverrides": [
                    { "name": "wordpress" },
                    { "name": "db" }
                ]
            },
            "lastStatus": "PENDING",
            "containerInstanceArn": "arn:aws:ecs:us-east-1:aws_account_id:container-instance/bad8b85e-637d-42be-9264-e56e77687133",
            "desiredStatus": "RUNNING",
            "taskDefinitionArn": "arn:aws:ecs:us-east-1:aws_account_id:task-definition/wordpress:1",
            "containers": [
                {
                    "containerArn": "arn:aws:ecs:us-east-1:aws_account_id:container/2d4e9016-4e2b-44d7-8df1-7912fb00e64b",
                    "taskArn": "arn:aws:ecs:us-east-1:aws_account_id:task/a4115c20-717e-444f-ba45-26933af9f191",
                    "lastStatus": "PENDING",
                    "name": "wordpress"
                },
                {
                    "containerArn": "arn:aws:ecs:us-east-1:aws_account_id:container/33633fbe-6b22-49aa-a2b0-43b3106ab76c",
                    "taskArn": "arn:aws:ecs:us-east-1:aws_account_id:task/a4115c20-717e-444f-ba45-26933af9f191",
                    "lastStatus": "PENDING",
                    "name": "db"
                }
            ]
        }
    ]
}
The lastStatus of PENDING shows that the containers are now starting up and the images are being pulled from the repository. The list-tasks command shows the active tasks on the cluster:
$ aws ecs list-tasks --cluster HappyCloudSolutions
{
    "taskArns": [
        "arn:aws:ecs:us-east-1:aws_account_id:task/a4115c20-717e-444f-ba45-26933af9f191"
    ]
}
Again this gives us the task ID, which we can use for the describe-tasks command:
$ aws ecs describe-tasks --cluster HappyCloudSolutions --task a4115c20-717e-444f-ba45-26933af9f191
{
    "failures": [],
    "tasks": [
        {
            "taskArn": "arn:aws:ecs:us-east-1:aws_account_id:task/a4115c20-717e-444f-ba45-26933af9f191",
            "overrides": {
                "containerOverrides": [
                    { "name": "wordpress" },
                    { "name": "db" }
                ]
            },
            "lastStatus": "RUNNING",
            "containerInstanceArn": "arn:aws:ecs:us-east-1:aws_account_id:container-instance/bad8b85e-637d-42be-9264-e56e77687133",
            "desiredStatus": "RUNNING",
            "taskDefinitionArn": "arn:aws:ecs:us-east-1:aws_account_id:task-definition/wordpress:1",
            "containers": [
                {
                    "containerArn": "arn:aws:ecs:us-east-1:aws_account_id:container/2d4e9016-4e2b-44d7-8df1-7912fb00e64b",
                    "taskArn": "arn:aws:ecs:us-east-1:aws_account_id:task/a4115c20-717e-444f-ba45-26933af9f191",
                    "lastStatus": "RUNNING",
                    "name": "wordpress",
                    "networkBindings": [
                        { "bindIP": "0.0.0.0", "containerPort": 80, "hostPort": 80 }
                    ]
                },
                {
                    "containerArn": "arn:aws:ecs:us-east-1:aws_account_id:container/33633fbe-6b22-49aa-a2b0-43b3106ab76c",
                    "taskArn": "arn:aws:ecs:us-east-1:aws_account_id:task/a4115c20-717e-444f-ba45-26933af9f191",
                    "lastStatus": "RUNNING",
                    "name": "db",
                    "networkBindings": []
                }
            ]
        }
    ]
}
The lastStatus has now switched to RUNNING and failures: [] is empty, which means everything is up and running. We can also use the docker ps -a command to show the running containers on the instance:
$ docker ps -a
CONTAINER ID   IMAGE                              COMMAND                CREATED          STATUS          PORTS                        NAMES
dd53717236d9   tutum/wordpress-stackable:latest   "/run-wordpress.sh"    58 seconds ago   Up 58 seconds   0.0.0.0:80->80/tcp           ecs-wordpress-1-wordpress-c69a958094d68789ac01
d52f7fcce1af   mysql:latest                       "/entrypoint.sh mysq   2 minutes ago    Up 2 minutes    3306/tcp                     ecs-wordpress-1-db-f0febc8687b6a0d37b00
ff80a2cd157a   amazon/amazon-ecs-agent:0dec3e3    "/agent"               16 minutes ago   Up 16 minutes   127.0.0.1:51678->51678/tcp   ecs-agent
As expected everything is running fine.
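The check we just did by eye (no failures, every lastStatus set to RUNNING) can also be scripted, for example when waiting for a task to come up. The sketch below works on an already parsed describe-tasks response; task_is_running is a name we made up and the two sample dictionaries are cut-down versions of the outputs above.

```python
def task_is_running(describe_output):
    """True when nothing failed and every task and container reports RUNNING."""
    if describe_output.get("failures"):
        return False
    tasks = describe_output.get("tasks", [])
    if not tasks:
        return False
    for task in tasks:
        if task.get("lastStatus") != "RUNNING":
            return False
        containers = task.get("containers", [])
        if any(c.get("lastStatus") != "RUNNING" for c in containers):
            return False
    return True


# Cut-down versions of the run-task (PENDING) and describe-tasks (RUNNING) outputs.
pending = {"failures": [], "tasks": [{"lastStatus": "PENDING", "containers": [
    {"name": "wordpress", "lastStatus": "PENDING"},
    {"name": "db", "lastStatus": "PENDING"},
]}]}
running = {"failures": [], "tasks": [{"lastStatus": "RUNNING", "containers": [
    {"name": "wordpress", "lastStatus": "RUNNING"},
    {"name": "db", "lastStatus": "RUNNING"},
]}]}

print(task_is_running(pending), task_is_running(running))
```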
What happens on a container failure?
To simulate a container failure we are simply going to remove the WordPress container with docker rm -f and run the describe-tasks command again:
$ aws ecs describe-tasks --cluster HappyCloudSolutions --task a4115c20-717e-444f-ba45-26933af9f191
{
    "failures": [],
    "tasks": [
        {
            "taskArn": "arn:aws:ecs:us-east-1:aws_account_id:task/a4115c20-717e-444f-ba45-26933af9f191",
            "overrides": {
                "containerOverrides": [
                    { "name": "wordpress" },
                    { "name": "db" }
                ]
            },
            "lastStatus": "STOPPED",
            "containerInstanceArn": "arn:aws:ecs:us-east-1:aws_account_id:container-instance/bad8b85e-637d-42be-9264-e56e77687133",
            "desiredStatus": "STOPPED",
            "taskDefinitionArn": "arn:aws:ecs:us-east-1:aws_account_id:task-definition/wordpress:1",
            "containers": [
                {
                    "containerArn": "arn:aws:ecs:us-east-1:aws_account_id:container/2d4e9016-4e2b-44d7-8df1-7912fb00e64b",
                    "taskArn": "arn:aws:ecs:us-east-1:aws_account_id:task/a4115c20-717e-444f-ba45-26933af9f191",
                    "name": "wordpress",
                    "networkBindings": [
                        { "bindIP": "0.0.0.0", "containerPort": 80, "hostPort": 80 }
                    ],
                    "lastStatus": "STOPPED",
                    "exitCode": -1
                },
                {
                    "containerArn": "arn:aws:ecs:us-east-1:aws_account_id:container/33633fbe-6b22-49aa-a2b0-43b3106ab76c",
                    "taskArn": "arn:aws:ecs:us-east-1:aws_account_id:task/a4115c20-717e-444f-ba45-26933af9f191",
                    "lastStatus": "STOPPED",
                    "name": "db",
                    "networkBindings": []
                }
            ]
        }
    ]
}
The lastStatus has now changed to STOPPED on both containers; this is because of the essential: true setting we defined earlier. A failure of an essential container terminates all other containers in the same task. The docker ps -a command shows the same:
$ docker ps -a
CONTAINER ID   IMAGE                             COMMAND                CREATED          STATUS                     PORTS                        NAMES
d52f7fcce1af   mysql:latest                      "/entrypoint.sh mysq   6 minutes ago    Exited (0) 8 seconds ago                                ecs-wordpress-1-db-f0febc8687b6a0d37b00
ff80a2cd157a   amazon/amazon-ecs-agent:0dec3e3   "/agent"               20 minutes ago   Up 20 minutes              127.0.0.1:51678->51678/tcp   ecs-agent
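The effect of the essential flag seen above can be pictured with a toy model. This is not how the ECS agent is implemented, just a sketch of the rule under the assumption that a non-essential container would stop on its own without affecting its siblings; statuses_after_exit is a hypothetical helper.

```python
def statuses_after_exit(containers, exited_name):
    """Return {name: status} for a task after the named container has exited.

    containers is a list of {"name": ..., "essential": ...} dictionaries.
    """
    exited = next(c for c in containers if c["name"] == exited_name)
    task_stops = exited["essential"]  # an essential exit takes the whole task down
    return {
        c["name"]: "STOPPED" if (task_stops or c["name"] == exited_name) else "RUNNING"
        for c in containers
    }


# Both containers are essential, as in the wordpress task definition.
TASK = [
    {"name": "wordpress", "essential": True},
    {"name": "db", "essential": True},
]
print(statuses_after_exit(TASK, "wordpress"))
```

With both containers marked essential, removing the WordPress container stops the database as well, which matches the describe-tasks output above.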
Unfortunately, failing containers don’t get restarted automatically at this point in time; most likely this feature will be added at a later stage of the preview. The task can, however, be restarted manually with the start-task command:
$ aws ecs start-task --cluster HappyCloudSolutions --task-definition wordpress:1 --container-instances bad8b85e-637d-42be-9264-e56e77687133
{
    "failures": [],
    "tasks": [
        {
            "taskArn": "arn:aws:ecs:us-east-1:aws_account_id:task/b299851d-e4f0-40ba-a651-2b930c311057",
            "overrides": {
                "containerOverrides": [
                    { "name": "wordpress" },
                    { "name": "db" }
                ]
            },
            "lastStatus": "PENDING",
            "containerInstanceArn": "arn:aws:ecs:us-east-1:aws_account_id:container-instance/bad8b85e-637d-42be-9264-e56e77687133",
            "desiredStatus": "RUNNING",
            "taskDefinitionArn": "arn:aws:ecs:us-east-1:aws_account_id:task-definition/wordpress:1",
            "containers": [
                {
                    "containerArn": "arn:aws:ecs:us-east-1:aws_account_id:container/eedbb198-166b-4e22-a1e7-a059d903b2e2",
                    "taskArn": "arn:aws:ecs:us-east-1:aws_account_id:task/b299851d-e4f0-40ba-a651-2b930c311057",
                    "lastStatus": "PENDING",
                    "name": "wordpress"
                },
                {
                    "containerArn": "arn:aws:ecs:us-east-1:aws_account_id:container/00a4163e-42d5-4d60-b598-26de9dd0850b",
                    "taskArn": "arn:aws:ecs:us-east-1:aws_account_id:task/b299851d-e4f0-40ba-a651-2b930c311057",
                    "lastStatus": "PENDING",
                    "name": "db"
                }
            ]
        }
    ]
}
Having a look with the docker ps -a command shows that two new containers are now running:
# docker ps -a
CONTAINER ID   IMAGE                              COMMAND                CREATED          STATUS                      PORTS                        NAMES
b6702a341610   tutum/wordpress-stackable:latest   "/run-wordpress.sh"    2 seconds ago    Up 1 seconds                0.0.0.0:80->80/tcp           ecs-wordpress-1-wordpress-b0b4f985cb9b8bf21f00
71d4464ed6f2   mysql:latest                       "/entrypoint.sh mysq   3 seconds ago    Up 3 seconds                3306/tcp                     ecs-wordpress-1-db-de829bece6e7ecf1b501
d52f7fcce1af   mysql:latest                       "/entrypoint.sh mysq   27 minutes ago   Exited (0) 20 minutes ago                                ecs-wordpress-1-db-f0febc8687b6a0d37b00
ff80a2cd157a   amazon/amazon-ecs-agent:0dec3e3    "/agent"               41 minutes ago   Up 41 minutes               127.0.0.1:51678->51678/tcp   ecs-agent
This also means that the existing container wasn’t restarted but simply left as is. Let’s have another look at the output of the describe-container-instances command:
$ aws ecs describe-container-instances --cluster HappyCloudSolutions --container-instances bad8b85e-637d-42be-9264-e56e77687133
{
    "failures": [],
    "containerInstances": [
        {
            "status": "ACTIVE",
            "registeredResources": [
                { "integerValue": 1024, "longValue": 0, "type": "INTEGER", "name": "CPU", "doubleValue": 0.0 },
                { "integerValue": 3764, "longValue": 0, "type": "INTEGER", "name": "MEMORY", "doubleValue": 0.0 },
                { "name": "PORTS", "longValue": 0, "doubleValue": 0.0, "stringSetValue": [ "2376", "22", "51678", "2375" ], "type": "STRINGSET", "integerValue": 0 }
            ],
            "ec2InstanceId": "i-0a2112f4",
            "agentConnected": true,
            "containerInstanceArn": "arn:aws:ecs:us-east-1:aws_account_id:container-instance/bad8b85e-637d-42be-9264-e56e77687133",
            "remainingResources": [
                { "integerValue": 1004, "longValue": 0, "type": "INTEGER", "name": "CPU", "doubleValue": 0.0 },
                { "integerValue": 2764, "longValue": 0, "type": "INTEGER", "name": "MEMORY", "doubleValue": 0.0 },
                { "name": "PORTS", "longValue": 0, "doubleValue": 0.0, "stringSetValue": [ "2376", "22", "80", "51678", "2375" ], "type": "STRINGSET", "integerValue": 0 }
            ]
        }
    ]
}
In the task definition at the beginning, the cpu requirement of each container was set to 10 and the memory requirement to 500. Now that one instance of each container is running, the available CPU is 1004 and the available memory 2764. This means that the Amazon Container Service keeps track of the resource requirements across cluster nodes.
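The arithmetic behind these numbers is simple enough to check in a few lines. The sketch below uses the registered resources and the per-container requirements from the outputs above; remaining_after is our own helper and only models CPU and memory, not ports.

```python
# Registered resources of the instance and the requirements of the wordpress task,
# both taken from the CLI output shown earlier.
REGISTERED = {"CPU": 1024, "MEMORY": 3764}
CONTAINERS = [
    {"name": "wordpress", "cpu": 10, "memory": 500},
    {"name": "db", "cpu": 10, "memory": 500},
]


def remaining_after(registered, containers, task_count=1):
    """Subtract the requirements of task_count copies of the task from the instance."""
    cpu_needed = sum(c["cpu"] for c in containers) * task_count
    memory_needed = sum(c["memory"] for c in containers) * task_count
    return {
        "CPU": registered["CPU"] - cpu_needed,
        "MEMORY": registered["MEMORY"] - memory_needed,
    }


remaining = remaining_after(REGISTERED, CONTAINERS)
print(remaining)
```

One running task reserves 2 x 10 CPU units and 2 x 500 MB, which gives 1024 - 20 = 1004 and 3764 - 1000 = 2764, the values reported by the service.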
Launch a second node
We simply repeated the instance launch described earlier, and a second node was added to the container service, as we can see with the list-container-instances command:
$ aws ecs list-container-instances --cluster HappyCloudSolutions
{
    "containerInstanceArns": [
        "arn:aws:ecs:us-east-1:aws_account_id:container-instance/bad8b85e-637d-42be-9264-e56e77687133",
        "arn:aws:ecs:us-east-1:aws_account_id:container-instance/f0b60750-1f4b-4e7e-a4c3-0257ad797bc7"
    ]
}
Then we started another 5 instances of the WordPress / MySQL task:
$ aws ecs run-task --cluster HappyCloudSolutions --task-definition wordpress:1 --count 5
{
    "failures": [
        {
            "reason": "RESOURCE:PORTS",
            "arn": "arn:aws:ecs:us-east-1:aws_account_id:container-instance/bad8b85e-637d-42be-9264-e56e77687133"
        },
        {
            "reason": "RESOURCE:PORTS",
            "arn": "arn:aws:ecs:us-east-1:aws_account_id:container-instance/f0b60750-1f4b-4e7e-a4c3-0257ad797bc7"
        }
    ],
    "tasks": [
        {
            "taskArn": "arn:aws:ecs:us-east-1:aws_account_id:task/6ba7c117-429c-481d-866b-dd43fb169434",
            "overrides": {
                "containerOverrides": [
                    { "name": "wordpress" },
                    { "name": "db" }
                ]
            },
            "lastStatus": "PENDING",
            "containerInstanceArn": "arn:aws:ecs:us-east-1:aws_account_id:container-instance/f0b60750-1f4b-4e7e-a4c3-0257ad797bc7",
            "desiredStatus": "RUNNING",
            "taskDefinitionArn": "arn:aws:ecs:us-east-1:aws_account_id:task-definition/wordpress:1",
            "containers": [
                {
                    "containerArn": "arn:aws:ecs:us-east-1:aws_account_id:container/9188e656-637e-40c1-9b3c-beb82700662d",
                    "taskArn": "arn:aws:ecs:us-east-1:aws_account_id:task/6ba7c117-429c-481d-866b-dd43fb169434",
                    "lastStatus": "PENDING",
                    "name": "wordpress"
                },
                {
                    "containerArn": "arn:aws:ecs:us-east-1:aws_account_id:container/7732b89e-1d17-4535-9953-b6ded23fb13f",
                    "taskArn": "arn:aws:ecs:us-east-1:aws_account_id:task/6ba7c117-429c-481d-866b-dd43fb169434",
                    "lastStatus": "PENDING",
                    "name": "db"
                }
            ]
        }
    ]
}
As you can see, the failures part is not empty and contains an error. In this case the reason simply states RESOURCE:PORTS. The cause lies in the port bindings of the WordPress container: it binds to host port 80, so as soon as one such container is running on a node, no additional one can be launched there. We then decided to use a simple busybox image that runs the sleep command on start, which allows us to start as many containers as we want.
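The port conflict can be pictured with a small placement model. We do not know how the ECS scheduler actually picks instances, so the greedy strategy below is purely an assumption for illustration; place_tasks and the node names are made up, and only the RESOURCE:PORTS reason string comes from the real output.

```python
def place_tasks(used_ports_by_instance, task_host_ports, count):
    """Greedy toy placement: each task goes to the first instance without a port clash.

    Returns (placed, failures). Once no instance can take another task, one
    failure per instance is recorded and placement stops.
    """
    used = {name: set(ports) for name, ports in used_ports_by_instance.items()}
    placed, failures = [], []
    for _ in range(count):
        target = next(
            (name for name, ports in used.items() if not ports & task_host_ports),
            None,
        )
        if target is None:
            failures = [{"reason": "RESOURCE:PORTS", "instance": name} for name in used]
            break
        used[target] |= task_host_ports  # the host ports are now taken on this node
        placed.append(target)
    return placed, failures


# node-a already runs the restarted WordPress task on port 80, node-b is fresh.
cluster = {"node-a": {22, 80}, "node-b": {22}}
placed, failures = place_tasks(cluster, {80}, count=5)
print(placed, failures)
```

In this model only one of the five requested tasks finds a free port 80, and both nodes end up in the failure list, which is the same shape as the response above. A task without host port mappings, such as the busybox one, never hits this limit.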
$ aws ecs run-task --cluster HappyCloudSolutions --task-definition sleep:1 --count 30

A client error (ClientException) occurred when calling the RunTask operation: Task count can not be greater than 10
So one of the limitations of the preview is that at present you cannot start more than 10 tasks of the same kind at once. Running the same command again with a count of 10 and having a look at docker ps -a shows that the container service has distributed the containers equally between the two cluster nodes:
$ docker ps -a
CONTAINER ID   IMAGE                              COMMAND                CREATED              STATUS              PORTS                        NAMES
aa0760773d49   busybox:latest                     "sleep 3600"           About a minute ago   Up About a minute                                ecs-sleep-1-sleep-f088ecc7bba3dfef0900
0f93ffed8801   busybox:latest                     "sleep 3600"           About a minute ago   Up About a minute                                ecs-sleep-1-sleep-a6a0f1cadafbc3ec6400
935fa01d93e3   busybox:latest                     "sleep 3600"           About a minute ago   Up About a minute                                ecs-sleep-1-sleep-ecd5f9a4c49be3c8a501
10a9b31b7bfe   busybox:latest                     "sleep 3600"           About a minute ago   Up About a minute                                ecs-sleep-1-sleep-eea2fa8ea1feb39a4b00
52f15d6d06f4   busybox:latest                     "sleep 3600"           About a minute ago   Up About a minute                                ecs-sleep-1-sleep-c49ddf93da98bacbf701
3ea1fbccb3cb   tutum/wordpress-stackable:latest   "/run-wordpress.sh"    8 minutes ago        Up 8 minutes        0.0.0.0:80->80/tcp           ecs-wordpress-1-wordpress-b2a083a09b809abcc001
72247957b1da   mysql:latest                       "/entrypoint.sh mysq   10 minutes ago       Up 10 minutes       3306/tcp                     ecs-wordpress-1-db-eacdb6d4aed4c0fdf901
511f3ef2a403   amazon/amazon-ecs-agent:0dec3e3    "/agent"               14 minutes ago       Up 14 minutes       127.0.0.1:51678->51678/tcp   ecs-agent
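Because of the "Task count can not be greater than 10" error, a larger number of tasks would have to be requested in several run-task calls. A minimal sketch of how a wrapper script could split the count is shown below; split_count is a hypothetical helper, and we have not verified whether the preview enforces any further limit on the total number of tasks.

```python
def split_count(total, limit=10):
    """Split a desired task count into per-call counts no larger than limit."""
    if total < 0 or limit < 1:
        raise ValueError("total must be >= 0 and limit must be >= 1")
    full_batches, rest = divmod(total, limit)
    return [limit] * full_batches + ([rest] if rest else [])


# The 30 sleep tasks we originally asked for would need three calls.
batches = split_count(30)
print(batches)
```

Each entry in the result would then be passed as the --count value of one run-task call.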
What happens on a node failure?
To simulate a node failure we terminated one of the cluster nodes from within the AWS dashboard. This is still a graceful shutdown, so the cluster agent should be able to notify the container service. But the list-tasks output still shows the 10 instances of the sleep task and the 2 WordPress ones:
$ aws ecs list-tasks --cluster HappyCloudSolutions
{
    "taskArns": [
        "arn:aws:ecs:us-east-1:aws_account_id:task/06f36a64-ad3f-4adb-972b-f8d9d9325738",
        "arn:aws:ecs:us-east-1:aws_account_id:task/102be95b-484d-4dd3-9509-9a4d044a6fd0",
        "arn:aws:ecs:us-east-1:aws_account_id:task/1f3e1046-b7d8-4880-bc85-baf43e0848a5",
        "arn:aws:ecs:us-east-1:aws_account_id:task/6ba7c117-429c-481d-866b-dd43fb169434",
        "arn:aws:ecs:us-east-1:aws_account_id:task/7db8b6d5-c662-4604-a8d7-ff4d854de6cb",
        "arn:aws:ecs:us-east-1:aws_account_id:task/7e67fc7a-c684-4325-9c56-667f9c4d2753",
        "arn:aws:ecs:us-east-1:aws_account_id:task/92d31536-0a4c-4dc5-af18-b74f13f99312",
        "arn:aws:ecs:us-east-1:aws_account_id:task/b299851d-e4f0-40ba-a651-2b930c311057",
        "arn:aws:ecs:us-east-1:aws_account_id:task/b5eacff5-6225-489a-ba8a-9d2963e3ec01",
        "arn:aws:ecs:us-east-1:aws_account_id:task/b899e843-8064-452f-8f67-aad016b2540a",
        "arn:aws:ecs:us-east-1:aws_account_id:task/e2c9201c-0b3e-42c7-a9c3-146cf88a2718",
        "arn:aws:ecs:us-east-1:aws_account_id:task/f9a4a24a-ed08-46f6-ad67-59d944a63061"
    ]
}
A look at the list-container-instances output also reveals that both instances are still listed, although one of them no longer exists:
$ aws ecs list-container-instances --cluster HappyCloudSolutions
{
    "containerInstanceArns": [
        "arn:aws:ecs:us-east-1:aws_account_id:container-instance/bad8b85e-637d-42be-9264-e56e77687133",
        "arn:aws:ecs:us-east-1:aws_account_id:container-instance/f0b60750-1f4b-4e7e-a4c3-0257ad797bc7"
    ]
}
This behaviour has already been acknowledged in the Amazon Container Service preview forum, and a shell script is provided as a workaround to clean up non-existent EC2 instances.
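We are not reproducing the forum script here. The idea behind such a cleanup can be sketched as follows: compare the EC2 instance ID of every registered container instance with the set of EC2 instances that still exist, and report the leftovers. The function name, the second EC2 instance ID and the choice of which node counts as terminated are all placeholders for illustration; only the ARNs and i-0a2112f4 come from the outputs above.

```python
def stale_container_instances(container_instances, live_ec2_ids):
    """Return the ARNs of container instances whose EC2 instance no longer exists."""
    live = set(live_ec2_ids)
    return [
        ci["containerInstanceArn"]
        for ci in container_instances
        if ci["ec2InstanceId"] not in live
    ]


# As it would be collected from describe-container-instances for both nodes.
REGISTERED = [
    {
        "ec2InstanceId": "i-0a2112f4",
        "containerInstanceArn": "arn:aws:ecs:us-east-1:aws_account_id:container-instance/bad8b85e-637d-42be-9264-e56e77687133",
    },
    {
        "ec2InstanceId": "i-placeholder",  # hypothetical ID, not shown in this article
        "containerInstanceArn": "arn:aws:ecs:us-east-1:aws_account_id:container-instance/f0b60750-1f4b-4e7e-a4c3-0257ad797bc7",
    },
]

# Pretend only the first EC2 instance is still alive.
stale = stale_container_instances(REGISTERED, live_ec2_ids={"i-0a2112f4"})
print(stale)
```

The resulting list would then be the input for whatever removes the stale registrations from the cluster.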
Conclusion
At the current stage of the preview, the Amazon Container Service allows containers defined in tasks to be launched across a number of nodes. It doesn’t yet handle container or node failures and is therefore not a highly failure-resilient platform, but we expect this to arrive in the final version of the service.