neveragain.de teletype

AWS: ECS ExecuteCommand: A Quick Look

2021-03-16

Amazon Elastic Container Service comes in two flavors: you can either run your own EC2 instances[1] that run an agent and connect to the ECS control plane, or you can use AWS Fargate to never worry about managing servers again and join the serverless revolution.

One of the last reasons to prefer EC2-backed ECS over Fargate-backed ECS was the ability to simply ssh to your EC2 instances and docker exec into any container. This was not possible with Fargate, as AWS is in control of the host that runs your Fargate micro-VM.

Yesterday, the new ECS Exec feature was announced, which allows interactive shell access to containers.

Some key points

It uses SSM Session Manager under the hood, therefore:

Configuration

To enable this:

  1. add SSM permissions to the task role. I used the AmazonSSMManagedInstanceCore policy, but for production, something stricter would be better

  2. configure the Service to support ExecuteCommand: aws ecs update-service --cluster foo --service exectest --enable-execute-command
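As a stricter alternative to AmazonSSMManagedInstanceCore in step 1, the ECS Exec documentation lists four ssmmessages actions as the permissions the task role needs. A minimal sketch: the role name exectest-task-role and the policy name ecs-exec-ssm are made up for illustration, and the aws call is left commented out so pasting the snippet changes nothing.

```shell
# Print a task-role policy limited to the SSM message channels used by ECS Exec.
ecs_exec_policy() {
  cat <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ssmmessages:CreateControlChannel",
        "ssmmessages:CreateDataChannel",
        "ssmmessages:OpenControlChannel",
        "ssmmessages:OpenDataChannel"
      ],
      "Resource": "*"
    }
  ]
}
EOF
}

# Attach it as an inline policy (role and policy names are hypothetical):
# aws iam put-role-policy \
#   --role-name exectest-task-role \
#   --policy-name ecs-exec-ssm \
#   --policy-document "$(ecs_exec_policy)"
```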

Note that re-configuring the Service does not apply to already-running tasks, so you’ll need to replace existing tasks.
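One way to replace the existing tasks is to force a new deployment of the service and then check that a fresh task reports the flag. This is a sketch, not the only way to do it: the function names are just illustrative, and the task ID is whatever your own cluster hands out.

```shell
# Roll the service so that replacement tasks start with ExecuteCommand enabled.
redeploy_service() {  # usage: redeploy_service <cluster> <service>
  aws ecs update-service --cluster "$1" --service "$2" --force-new-deployment
}

# Print the enableExecuteCommand flag of a single task.
exec_enabled() {  # usage: exec_enabled <cluster> <task-id>
  aws ecs describe-tasks --cluster "$1" --tasks "$2" \
    --query 'tasks[0].enableExecuteCommand' --output text
}
```

With the names used above, that would be redeploy_service foo exectest, followed by exec_enabled foo <task-id> once the new task is up.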

A Quick Look

You can connect using the AWS CLI ecs execute-command operation:

[ec2-user@ip-10-0-0-34 ~]$ aws ecs execute-command \
>         --cluster foo \
>         --task d62cc44c12264542b0fd71f19908d4ff \
>         --command /bin/bash \
>         --interactive
[...]
Starting session with SessionId: ecs-execute-command-06b31f227a644ad8e
root@ip-10-0-1-201:/# 
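Looking up the task ID by hand gets old quickly, so here is a small sketch that picks the first running task of a service and opens a shell in a named container. Assumptions: the function names and the container name are hypothetical, the Session Manager plugin for the AWS CLI is installed locally, and /bin/sh is used because not every image ships bash.

```shell
# Print the ARN of the first running task of a service.
first_task() {  # usage: first_task <cluster> <service>
  aws ecs list-tasks --cluster "$1" --service-name "$2" \
    --desired-status RUNNING --query 'taskArns[0]' --output text
}

# Open an interactive shell in one container of that task.
exec_shell() {  # usage: exec_shell <cluster> <service> <container>
  aws ecs execute-command \
    --cluster "$1" \
    --task "$(first_task "$1" "$2")" \
    --container "$3" \
    --command /bin/sh \
    --interactive
}
```

For example: exec_shell foo exectest nginx, where nginx stands in for whatever the container is called in your task definition. The --container flag matters once a task runs more than one container.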

This is how the agent is injected into the running container – it’s kinda crazy:

root@ip-10-0-1-201:/# ps axufw
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root        35  0.0  0.1 1245800 15236 ?       Ssl  12:40   0:00 /managed-agents/execute-command/amazon-ssm-agent
root        47  0.0  0.3 1412100 30804 ?       Sl   12:40   0:00  \_ /managed-agents/execute-command/ssm-agent-worker
root        57  0.5  0.3 1330612 27648 ?       Sl   12:42   0:00      \_ /managed-agents/execute-command/ssm-session-worker ecs-exec
root        67  0.0  0.0   5752  3560 pts/0    Ss   12:42   0:00          \_ /bin/bash
root       395  0.0  0.0   9392  3148 pts/0    R+   12:44   0:00              \_ ps axufw
root         1  0.0  0.0  10640  6140 ?        Ss   12:40   0:00 nginx: master process nginx -g daemon off;
nginx       34  0.0  0.0  11036  2584 ?        S    12:40   0:00 nginx: worker process

Note that this is a vanilla nginx container. No changes to the image required.

That /managed-agents directory is a read-only bind mount from the host box:

root@ip-10-0-1-201:/# mount | grep managed-ag
/dev/xvda1 on /managed-agents/execute-command type ext4 (ro,noatime,data=ordered)
root@ip-10-0-1-201:/# 
root@ip-10-0-1-201:/# find /managed-agents -ls
   933843      4 drwxr-xr-x   3 root     root         4096 Mar 17 12:40 /managed-agents
   131291      4 drwx------   4 root     root         4096 Mar 17 12:40 /managed-agents/execute-command
   131300      4 drwx------   2 root     root         4096 Mar 17 12:39 /managed-agents/execute-command/certs
   131301    216 -r--------   1 root     root       217924 Mar 17 12:39 /managed-agents/execute-command/certs/amazon-ssm-agent.crt
   131302  13292 -r-x--x--x   1 root     root     13610920 Mar 17 12:40 /managed-agents/execute-command/amazon-ssm-agent
   131308  24960 -r-x--x--x   1 root     root     25556488 Mar 17 12:40 /managed-agents/execute-command/ssm-agent-worker
   131292      4 drwx------   2 root     root         4096 Mar 17 12:39 /managed-agents/execute-command/configuration
   131293      4 -rw-r--r--   1 root     root          193 Mar 17 12:39 /managed-agents/execute-command/configuration/amazon-ssm-agent.json
   131294      4 -rw-r--r--   1 root     root          814 Mar 17 12:39 /managed-agents/execute-command/configuration/seelog.xml
   131309  19252 -r-x--x--x   1 root     root     19711400 Mar 17 12:40 /managed-agents/execute-command/ssm-session-worker

The announcement blog entry links to the original proposal on GitHub, if you’re interested in the details.

Wrapping Up

I had a few ServerException / Service Unavailable errors while testing. I can’t reproduce them now, so I’m not sure if this was launch day diarrhea or my own clumsiness.

If you get TargetNotConnectedException / internal error, you’re missing SSM connectivity or permissions.
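A quick way to narrow this down is to ask ECS what state the injected agent is in. describe-tasks lists the managed agents per container (the one for ECS Exec is named ExecuteCommandAgent), and a lastStatus other than RUNNING suggests the agent never connected to SSM. A sketch, with an illustrative function name:

```shell
# Print name and lastStatus of the managed agents in a task's containers.
agent_status() {  # usage: agent_status <cluster> <task-id>
  aws ecs describe-tasks --cluster "$1" --tasks "$2" \
    --query 'tasks[0].containers[].managedAgents[].[name,lastStatus]' \
    --output text
}
```

If this prints nothing at all, the task was most likely started before ExecuteCommand was enabled on the service.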

Overall it’s more involved than I’d like it to be – some assembly required, many moving parts… though Nathan Peck, the Developer Avocado 🥑 for Container Services at @awscloud, points out that the ECS “Copilot” tool makes this significantly easier.

Anyway – it’s very cool that this is possible now, as it was my last “con” point for Fargate.


Originally posted on Twitter


  1. Or even your own servers, with ECS Anywhere, a feature that was introduced recently