AWS: ECS ExecuteCommand: A Quick Look
2021-03-16

Amazon Elastic Container Service can be used in two flavors – you can either run your own EC2 instances¹ that run an agent and connect to the ECS control plane, or you can use AWS Fargate to never worry about managing servers again and join the serverless revolution.
One of the last reasons to prefer EC2-backed ECS over Fargate-backed ECS was the ability to simply `ssh` to your EC2 instances and `docker exec` into any container. This was not possible with Fargate, as AWS is in control of the host that runs your Fargate micro-VM.
Yesterday, the new ECS Exec feature was announced, which allows interactive shell access to containers.
Some key points:

- not supported in AWS CLI v2 yet – “in the coming weeks”
- requires Fargate platform version 1.4 (which has been `LATEST` for a while)
- an ECS Service needs to be explicitly configured to support this
- only interactive sessions – non-interactive will launch “in the near future”
It uses SSM Session Manager under the hood, therefore:
- the ECS task needs SSM permissions (a policy sketch follows below)
- the Fargate ENI needs access to the SSM API (`ssm` and `ssmmessages`)
- the client requires the `session-manager-plugin`
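For the task-role part, here is a minimal sketch of an inline policy granting just the SSM messaging-channel permissions; the role and policy names are placeholders:

# placeholder role/policy names – adjust to your own task role
aws iam put-role-policy \
  --role-name my-ecs-task-role \
  --policy-name ecs-exec-ssm-channel \
  --policy-document '{
    "Version": "2012-10-17",
    "Statement": [{
      "Effect": "Allow",
      "Action": [
        "ssmmessages:CreateControlChannel",
        "ssmmessages:CreateDataChannel",
        "ssmmessages:OpenControlChannel",
        "ssmmessages:OpenDataChannel"
      ],
      "Resource": "*"
    }]
  }'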
Configuration
To enable this:
- add SSM permissions to the task role – I used the `AmazonSSMManagedInstanceCore` policy, but for production, something more strict would be better
- configure the Service to support ExecuteCommand:

aws ecs update-service --cluster foo --service exectest --enable-execute-command
Note that re-configuring the Service does not apply to already-running tasks, so you’ll need to replace existing tasks.
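One way to do that for a Service is to force a new deployment, which replaces the running tasks with fresh ones that pick up the new setting:

aws ecs update-service --cluster foo --service exectest --force-new-deployment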
A Quick Look
You can connect using the AWS CLI `ecs execute-command` operation:
[ec2-user@ip-10-0-0-34 ~]$ aws ecs execute-command \
> --cluster foo \
> --task d62cc44c12264542b0fd71f19908d4ff \
> --command /bin/bash \
> --interactive
[...]
Starting session with SessionId: ecs-execute-command-06b31f227a644ad8e
root@ip-10-0-1-201:/#
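If a task runs more than one container, the `--container` flag selects the target; the container name here is just an assumed example:

aws ecs execute-command \
  --cluster foo \
  --task d62cc44c12264542b0fd71f19908d4ff \
  --container nginx \
  --command /bin/bash \
  --interactive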
This is how the agent is injected into the running container – it’s kinda crazy:
root@ip-10-0-1-201:/# ps axufw
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 35 0.0 0.1 1245800 15236 ? Ssl 12:40 0:00 /managed-agents/execute-command/amazon-ssm-agent
root 47 0.0 0.3 1412100 30804 ? Sl 12:40 0:00 \_ /managed-agents/execute-command/ssm-agent-worker
root 57 0.5 0.3 1330612 27648 ? Sl 12:42 0:00 \_ /managed-agents/execute-command/ssm-session-worker ecs-exec
root 67 0.0 0.0 5752 3560 pts/0 Ss 12:42 0:00 \_ /bin/bash
root 395 0.0 0.0 9392 3148 pts/0 R+ 12:44 0:00 \_ ps axufw
root 1 0.0 0.0 10640 6140 ? Ss 12:40 0:00 nginx: master process nginx -g daemon off;
nginx 34 0.0 0.0 11036 2584 ? S 12:40 0:00 nginx: worker process
Note that this is a vanilla nginx container. No changes to the image required.
That `/managed-agents` is a read-only bind-mount on the host box:
root@ip-10-0-1-201:/# mount | grep managed-ag
/dev/xvda1 on /managed-agents/execute-command type ext4 (ro,noatime,data=ordered)
root@ip-10-0-1-201:/#
root@ip-10-0-1-201:/# find /managed-agents -ls
933843 4 drwxr-xr-x 3 root root 4096 Mar 17 12:40 /managed-agents
131291 4 drwx------ 4 root root 4096 Mar 17 12:40 /managed-agents/execute-command
131300 4 drwx------ 2 root root 4096 Mar 17 12:39 /managed-agents/execute-command/certs
131301 216 -r-------- 1 root root 217924 Mar 17 12:39 /managed-agents/execute-command/certs/amazon-ssm-agent.crt
131302 13292 -r-x--x--x 1 root root 13610920 Mar 17 12:40 /managed-agents/execute-command/amazon-ssm-agent
131308 24960 -r-x--x--x 1 root root 25556488 Mar 17 12:40 /managed-agents/execute-command/ssm-agent-worker
131292 4 drwx------ 2 root root 4096 Mar 17 12:39 /managed-agents/execute-command/configuration
131293 4 -rw-r--r-- 1 root root 193 Mar 17 12:39 /managed-agents/execute-command/configuration/amazon-ssm-agent.json
131294 4 -rw-r--r-- 1 root root 814 Mar 17 12:39 /managed-agents/execute-command/configuration/seelog.xml
131309 19252 -r-x--x--x 1 root root 19711400 Mar 17 12:40 /managed-agents/execute-command/ssm-session-worker
The blog entry links to the original proposal on GitHub, if you’re interested in the details…
Wrapping Up
I had a few `ServerException` / `Service Unavailable` errors while testing. I can’t reproduce them now, so I’m not sure if this was launch day diarrhea or my own clumsiness. If you get `TargetNotConnectedException` / `internal error`, you’re missing SSM connectivity or permissions.
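A quick way to check whether a given task is wired up correctly is to look at `enableExecuteCommand` and the `ExecuteCommandAgent` status reported by `describe-tasks` – something along these lines:

aws ecs describe-tasks --cluster foo --tasks d62cc44c12264542b0fd71f19908d4ff \
  --query 'tasks[].{exec: enableExecuteCommand, agents: containers[].managedAgents}'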
Overall it’s more involved than I’d like it to be – some assembly required, many moving parts… though Nathan Peck, the Developer Avocado 🥑 for Container Services at @awscloud, points out that the ECS “Copilot” tool makes this significantly easier.
Anyway – it’s very cool that this is possible now, as it was my last “con” point for Fargate.
¹ Or even your own servers, with ECS Anywhere, a feature that was introduced recently.