neveragain.de teletype

A recent discussion in the ##aws IRC channel revolved around how people handle Docker Hub’s recently introduced rate limiting.

This post will explore a CodeBuild-based solution for periodic mirroring of images.

Background

Container images need to be pulled from somewhere. By default, that’s Docker Hub. So when I simply write FROM nginx in my Dockerfile, that actually resolves to docker.io/library/nginx:latest.
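To make that expansion concrete, here is a minimal sketch of how a short image name gets resolved. It is a simplified approximation of Docker's behaviour, not the actual implementation, and the function name normalize_image is made up for this post.

```python
def normalize_image(ref: str) -> str:
    """Expand a short image reference roughly the way Docker resolves it (simplified)."""
    # Split off a digest (everything after "@"), if there is one.
    name, has_digest, digest = ref.partition("@")

    # The first path component counts as a registry host only if it looks like one.
    first, slash, rest = name.partition("/")
    if slash and ("." in first or ":" in first or first == "localhost"):
        registry, path = first, rest
    else:
        registry, path = "docker.io", name

    # Official images on Docker Hub live in the "library" namespace.
    if registry == "docker.io" and "/" not in path:
        path = "library/" + path

    # Without a tag or digest, "latest" is implied.
    if not has_digest and ":" not in path:
        path += ":latest"

    return f"{registry}/{path}" + (f"@{digest}" if has_digest else "")


if __name__ == "__main__":
    for short in ("nginx", "nginx:stable", "bitnami/nginx"):
        print(short, "->", normalize_image(short))
```

The point is only that every unqualified name silently ends up at Docker Hub, which is why the rate limit hits so many setups.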

This is fine for an occasional build process.

But it’s quite problematic for automated use. For example, testing that runs at high frequency can cause a significant number of pulls. The same goes for frequent deployments or ECS tasks, or any trigger that’s based on a public Docker image. Especially problematic are misconfigured or failing services that perpetually try to restart, pulling the image each time.

Understandably, Docker Hub is unwilling to serve massive amounts of – basically unnecessary – pulls for free. So I can’t really blame them for introducing those rate limits in November 2020, and I think their rate limit for free usage is reasonable. Also keep in mind that many images are rather large – sizes beyond one gigabyte are common.

This rate limiting becomes a significant issue on AWS though: In almost all scenarios, the pull will be issued from an AWS IP address that’s effectively random, and anonymous pulls are counted per IP address. There’s no way of knowing in advance how many pulls have already been made from that address within the current rate-limit window. Therefore, any pull from AWS has a good chance of failing with this error: Error response from daemon: toomanyrequests: You have reached your pull rate limit.

Other Issues

Using a frequently-pulled image from an external source has more issues besides rate-limiting errors:

  • It’s wasteful: This is a lot of data transfer over the public internet – ingress traffic is free on AWS, but that’s no excuse for significant, completely avoidable traffic.

  • It’s slow: Pulling an image from the public internet takes significantly longer than using an image hosted in my AWS region.

  • It’s unreliable: As with any external service, there’s always an increased chance that it’s unreachable (for network or other issues – not to mention that there’s no SLA for a free service).

Generally speaking, not pulling the same image hundreds of times is just good manners.

Workarounds

AWS responded by publishing advice on the AWS blog.

Basically, the options are:

  • Paying for a Docker Hub account. This might seem like the right thing to do, but doesn’t address any of the other issues.

  • Using Amazon ECR Public. AWS managed to launch this incredibly fast after Docker’s announcement, and I believe it’s the best long-term solution. But as of today, many common images are not available there. I think a marketing push would be in order…

  • Manually copying the required images from Docker Hub to my AWS account’s ECR.

In most cases, using the account’s private ECR is the most reasonable path out of this mess. But now I have a classic Day 2 problem: How do I make sure my mirrored image is up to date?

As with anything in life, the answer is automation.

Well, once again I’m writing too much. So let’s get to it.

Overview

My solution will be based on these components:

  • CodeBuild provides the environment to pull an image from Docker Hub and then push it to my ECR

  • EventBridge to trigger CodeBuild on a fixed schedule

  • ECR to store the copied images

An important goal is to easily do this for multiple images, so the CodeBuild project has to accept the image name as a parameter.

I will use the public images of nginx and logstash as examples.

ECR

I need to create the repositories for the images to be mirrored.

  • “Create repository” in the ECR Console
  • Use nginx as repository name
  • For everything else, defaults are fine
  • Repeat for logstash and any others as necessary

Note the registry URL of the private ECR repositories – something like 123456789012.dkr.ecr.eu-central-1.amazonaws.com (without the trailing /nginx).
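For more than a handful of repositories, clicking through the console gets old. As a rough sketch, the same step could be scripted. The helper names are made up, the URL format assumes a standard (non-China, non-GovCloud) region, and ecr_client is assumed to be a boto3 ECR client, i.e. boto3.client("ecr").

```python
def registry_url(account_id: str, region: str) -> str:
    """Registry host of an account's private ECR in one region."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com"


def create_repositories(ecr_client, names):
    """Create one ECR repository per name and return the repository URIs.

    ecr_client is expected to behave like boto3.client("ecr").
    This sketch does not handle repositories that already exist.
    """
    uris = []
    for name in names:
        response = ecr_client.create_repository(repositoryName=name)
        uris.append(response["repository"]["repositoryUri"])
    return uris
```

With real credentials, create_repositories(boto3.client("ecr"), ["nginx", "logstash"]) would return the two repository URIs; stripping the trailing /nginx gives the registry URL needed later.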

CodeBuild Project

Then I’ll create the CodeBuild project that takes care of mirroring:

  • “Create build project” in the CodeBuild Console
  • Use some beautiful name like foo or ecr-mirror
  • As Source, I select No source
  • Under Environment,
    • I’ll use any next-best-latest Linux image; currently, that is
      • Operating system: Amazon Linux 2
      • Runtime: Standard
      • Image: aws/codebuild/amazonlinux2-x86_64-standard:3.0
    • Enable Privileged
    • Use a New service role (note the name!)
    • In Additional configuration,
      • add an Environment variable named REGISTRY, using the registry URL from the ECR step
      • leave everything else as it is
  • In Buildspec, I’ll Switch to editor, remove all suggested lines, and paste the build specification given below
  • Finally, Create build project

Here’s the build specification to use:

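version: 0.2

phases:
  build:
    commands:
      # Log in to the private ECR registry (REGISTRY is set on the project)
      - aws ecr get-login-password | docker login --username AWS --password-stdin "$REGISTRY"
      # PREFIX is not set anywhere in this setup; unset, it expands to an
      # empty string, so the image is pulled from Docker Hub
      - docker pull "$PREFIX$IMAGE"
      - docker tag "$PREFIX$IMAGE" "$REGISTRY/$IMAGE"
      - docker push "$REGISTRY/$IMAGE"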
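To see what the build actually runs for a given image, the variable substitution can be spelled out. This is just an illustrative sketch: mirror_commands is a made-up helper, and registry.example.com is a placeholder for a non-default source registry.

```python
import shlex


def mirror_commands(registry: str, image: str, prefix: str = "") -> list:
    """Return the shell commands of the build with all variables substituted."""
    source = f"{prefix}{image}"        # where the image is pulled from
    target = f"{registry}/{image}"     # where the image is pushed to
    return [
        "aws ecr get-login-password | docker login --username AWS "
        f"--password-stdin {shlex.quote(registry)}",
        f"docker pull {shlex.quote(source)}",
        f"docker tag {shlex.quote(source)} {shlex.quote(target)}",
        f"docker push {shlex.quote(target)}",
    ]


if __name__ == "__main__":
    registry = "123456789012.dkr.ecr.eu-central-1.amazonaws.com"
    for command in mirror_commands(registry, "nginx:stable"):
        print(command)
```

With an empty prefix the pull goes to Docker Hub; a non-empty prefix such as registry.example.com/ would pull from another registry while keeping the same repository name in ECR.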

Finally, hidden in the project’s Build details tab, I’ll note the Project ARN.

Adjust IAM Permissions for CodeBuild

CodeBuild created a basic IAM Role for my project, but this role is not allowed to access my ECR repositories yet. To fix that:

  • In the IAM Console, I locate the Role that has been created by CodeBuild (named as noted above; should be codebuild-ecr-mirror-service-role)
  • Attach policies: AmazonEC2ContainerRegistryPowerUser

EventBridge Rule

I’ll use EventBridge to periodically update the images.

  • In the EventBridge Console, Create rule
  • name it ecr-mirror
  • As pattern, I’ll select Schedule
  • I could use a fixed rate, but to have better control, I’ll use a Cron expression: 10 0 * * ? * (that’s 00:10 UTC every day)
  • As Target, I select a CodeBuild project and use the Project ARN from above
  • Expand Configure input, and there
    • select Constant (JSON text)
    • paste the JSON given below (for nginx)
  • Add target (for logstash)
    • using the same CodeBuild target and ARN
    • using the same JSON text, but replacing nginx:stable with logstash:latest
  • Leave the rest as is and Create
{
	"environmentVariablesOverride": [
		{
			"name": "IMAGE",
			"type": "PLAINTEXT",
			"value": "nginx:stable"
		}
	]
}

It will be a single line after pasting it – that’s fine.
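With more than two images, typing that JSON by hand invites typos. A small sketch that generates the single-line input per image, plus the schedule in the form the EventBridge API and CLI expect (the console takes only the part inside cron(...)). The function names are made up for illustration.

```python
import json


def target_input(image: str) -> str:
    """Single-line JSON input that overrides IMAGE for one EventBridge target."""
    payload = {
        "environmentVariablesOverride": [
            {"name": "IMAGE", "type": "PLAINTEXT", "value": image},
        ]
    }
    return json.dumps(payload, separators=(",", ":"))


def daily_schedule(hour: int, minute: int) -> str:
    """EventBridge cron expression for once a day at hour:minute UTC."""
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("hour must be 0-23 and minute must be 0-59")
    # Fields: minutes hours day-of-month month day-of-week year
    return f"cron({minute} {hour} * * ? *)"


if __name__ == "__main__":
    print(daily_schedule(0, 10))
    for image in ("nginx:stable", "logstash:latest"):
        print(target_input(image))
```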

Testing

Scheduled jobs are annoying to test. I could adjust the Cron specification to trigger a few minutes from now.

A faster (but less thorough) way to test is the Start build with overrides button in the CodeBuild project overview. There, under Environment variables override, I’ll add an environment variable named IMAGE with a value of, for example, logstash:7.11.2.
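The same test can be triggered without the console. A minimal sketch, assuming codebuild_client is a boto3 CodeBuild client (boto3.client("codebuild")) and the project is called ecr-mirror; the helper names are made up.

```python
def start_build_args(project_name: str, image: str) -> dict:
    """Arguments for CodeBuild's StartBuild call, overriding only IMAGE."""
    return {
        "projectName": project_name,
        "environmentVariablesOverride": [
            {"name": "IMAGE", "value": image, "type": "PLAINTEXT"},
        ],
    }


def start_mirror_build(codebuild_client, project_name: str, image: str) -> str:
    """Start one mirroring build and return its build ID.

    codebuild_client is expected to behave like boto3.client("codebuild").
    """
    response = codebuild_client.start_build(**start_build_args(project_name, image))
    return response["build"]["id"]
```

Calling start_mirror_build(boto3.client("codebuild"), "ecr-mirror", "logstash:7.11.2") should kick off the same build as the button does.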

In the resulting build process, I like to click Tail logs.

By the way, CodeBuild will log to CloudWatch by default (in the /aws/codebuild/ecr-mirror log group, unless changed).

Next Steps

I now have a mirror of the images I need, updated daily.

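What’s left is changing all references to the image, e.g. in my ECS task definitions and Dockerfiles. Instead of nginx:stable, I’ll change those to use 123456789012.dkr.ecr.eu-central-1.amazonaws.com/nginx:stable. The tag matters: only tags that have been mirrored exist in my ECR, so a bare nginx (which implies :latest) will fail unless nginx:latest is mirrored as well. Pulling requires a login to that registry first, of course. ECS handles this transparently, but it needs the appropriate IAM permissions. Same for EKS.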
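For Dockerfiles, that change is mechanical enough to script. The following is only a sketch: use_mirror and the mirrored set are made up for illustration, and the regular expression covers plain FROM lines (optionally with --platform and AS), not every Dockerfile construct.

```python
import re

# head: "FROM " plus an optional --platform flag, image: the reference, tail: e.g. " AS web"
FROM_LINE = re.compile(r"^(\s*FROM\s+(?:--platform=\S+\s+)?)(\S+)(.*)$", re.IGNORECASE)


def use_mirror(dockerfile: str, registry: str, mirrored: set) -> str:
    """Point FROM lines at the mirror registry for repositories listed in mirrored."""
    lines = []
    for line in dockerfile.splitlines():
        match = FROM_LINE.match(line)
        if match:
            head, image, tail = match.groups()
            repository = image.split(":", 1)[0]
            if repository in mirrored:
                line = f"{head}{registry}/{image}{tail}"
        lines.append(line)
    result = "\n".join(lines)
    # Keep a trailing newline if the input had one.
    return (result + "\n") if dockerfile.endswith("\n") else result


if __name__ == "__main__":
    registry = "123456789012.dkr.ecr.eu-central-1.amazonaws.com"
    print(use_mirror("FROM nginx:stable AS web\nCOPY . /srv\n", registry, {"nginx"}))
```

Images that are not in the mirrored set are left alone, so the result still builds even if only some base images have been copied.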

Further improvements could be made:

For one, narrowing down CodeBuild’s IAM permissions to the appropriate repositories might be a good idea.
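As a starting point, a narrower policy could look like the sketch below. The action list is the set ECR documents for pushing images; the function name and the idea of generating the policy from a list of repository names are my own illustration, so review the result before attaching it.

```python
import json

# Actions needed to push an image into an existing repository.
PUSH_ACTIONS = [
    "ecr:BatchCheckLayerAvailability",
    "ecr:InitiateLayerUpload",
    "ecr:UploadLayerPart",
    "ecr:CompleteLayerUpload",
    "ecr:PutImage",
]


def mirror_policy(account_id: str, region: str, repositories: list) -> dict:
    """IAM policy document that allows pushing to the listed repositories only."""
    arns = [
        f"arn:aws:ecr:{region}:{account_id}:repository/{name}"
        for name in repositories
    ]
    return {
        "Version": "2012-10-17",
        "Statement": [
            # GetAuthorizationToken cannot be scoped to a repository.
            {"Effect": "Allow", "Action": "ecr:GetAuthorizationToken", "Resource": "*"},
            {"Effect": "Allow", "Action": PUSH_ACTIONS, "Resource": arns},
        ],
    }


if __name__ == "__main__":
    policy = mirror_policy("123456789012", "eu-central-1", ["nginx", "logstash"])
    print(json.dumps(policy, indent=2))
```

The printed JSON could replace AmazonEC2ContainerRegistryPowerUser as an inline policy on the CodeBuild role.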

And for mirroring more than a few images, the CodeBuild job could be adjusted to retrieve the list of images from somewhere else, for example a StringList in the SSM Parameter Store.
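A rough sketch of that idea: a StringList parameter is returned as one comma-separated string, so it only needs splitting. The helper names are made up, the parameter name would be whatever I choose, and ssm_client is assumed to be a boto3 SSM client (boto3.client("ssm")).

```python
def parse_string_list(value: str) -> list:
    """Split the comma-separated value of a StringList parameter into image names."""
    return [item.strip() for item in value.split(",") if item.strip()]


def images_to_mirror(ssm_client, parameter_name: str) -> list:
    """Read the list of images from the SSM Parameter Store.

    ssm_client is expected to behave like boto3.client("ssm").
    """
    response = ssm_client.get_parameter(Name=parameter_name)
    return parse_string_list(response["Parameter"]["Value"])
```

The build would then loop over the returned list instead of relying on a single IMAGE variable, and the CodeBuild role would additionally need permission to read that parameter.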

The Docker in the Room

You might have noticed that this CodeBuild solution is plagued by the same problem initially described here: There’s a good chance that it will fail because of Docker Hub’s rate limiting.

Honestly, I don’t have a good idea for that. Most of the time it works. When it does not, a manual Retry build usually helps. So yes, a daily update could fail for one or two days – but usually, that’s not an issue. And if it is, you can always intervene manually and/or increase the mirroring frequency.
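The manual retry could at least be automated. Here is a minimal sketch of a retry loop that only retries on the rate-limit error. The attempt callable is a placeholder: it might wrap a docker pull, or start a fresh build, which may well land on a different IP address. Whether waiting inside the same build helps is an assumption I have not verified.

```python
import time


def retry_on_rate_limit(attempt, attempts: int = 3, delay: float = 60.0, sleep=time.sleep) -> bool:
    """Call attempt() until it succeeds, retrying only on Docker Hub's rate-limit error.

    attempt() must return a tuple (ok, output). The wait grows linearly
    with each failed try: delay, 2 * delay, and so on.
    """
    for number in range(1, attempts + 1):
        ok, output = attempt()
        if ok:
            return True
        # Give up on any other error, or when no tries are left.
        if "toomanyrequests" not in output or number == attempts:
            return False
        sleep(delay * number)
    return False
```

Any error other than toomanyrequests fails immediately, so a typo in the image name does not burn through the retries.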

All in all, the whole situation is a mess without a good solution. But this one is better than nothing / mirroring by hand.

Let me know how you handle images!

