Timeless Services

2021-06-29

This post is part of a blog series: Road to re:Web.

An Anti-Cloud Pattern

Idle resources are an anti-cloud pattern. As I see it, any service that charges by provisioned time (per second / hour / month) – regardless of actual workload – isn’t truly serverless. Resources are provisioned exclusively for a single tenant. From the provider’s point of view, they are fully allocated at all times. But most of the time, they are idle.

It’s like leaving my car running when I get home.

Not being time-based – not blocking resources that are not actively used – should be a property of what’s called serverless. But much of the world already has another definition (e.g. Fargate is considered serverless, just because there are no servers to manage).

Therefore I’ll be calling those services timeless. Until I come up with a better name.

This post will focus on AWS, but the principles are the same for all cloud providers.

Time-Based Services Hurt Everyone

As I laid out before, cloud cost is problematic for many users. With the typical services that are time-based, like EC2 or Fargate, costs are significantly higher for almost all usage patterns. This is especially true for any non-production environment, as well as individuals (students, developers, freelancers etc.) and many small or start-up organizations.

Also, I cannot “forget” to shut down a timeless service. Many people have received panic-inducing bills – simply because some resources were left running, but idle (here is one recent example of many). While this isn’t the only cloud billing trap, it’s a big one. And it certainly doesn’t just hit novices – larger companies in particular tend to have quite a few resources running 24x7 that they don’t need.

Most importantly: idle resources are waste, and unnecessary waste at that. Besides wasting money, they have a significant environmental impact. That impact is smaller “in the cloud” than in your grandma’s on-premises data center, because large cloud providers are really good at reducing overhead (e.g. from cooling and power units) – but overall, it’s not much better.

Ain’t Easy Bein’ Green

The sad thing is: It’s really not easy to architect exclusively using timeless services.

Often it’s due to the application’s requirements, for example when it needs a classic relational database like MySQL or PostgreSQL. Aurora Serverless v1 helps a bit, but isn’t quite there yet (v2 looks promising!) – and doesn’t fit all use cases.

And sometimes it feels like AWS is making this difficult on purpose – most notably with VPC design. Let’s take a closer look at that.

NAT Gateway

On AWS, the archenemy, of course, is the Managed NAT Gateway.

This is required¹ if I want to have a non-public network (a very basic security requirement in many scenarios) that can make outbound connections to the internet – like, you know, for downloading security updates, or using AWS APIs.

This costs almost $40 per month without having moved a single network packet. Traffic costs extra, of course. Not to mention that you need one of these fuckers for each availability zone if they need to be redundant. A laughable cost for most businesses – but absolutely out of the question for any individual.
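To put a number on that idle baseline, here is a minimal sketch in Python. It assumes the rounded monthly figures quoted in this post (about $40 for a NAT Gateway, about $3 for a self-managed NAT instance) rather than current AWS list prices. The function and constant names are made up for illustration, and traffic charges are left out entirely.

```python
# Monthly prices as quoted in this post (approximate, 2021).
# They are assumptions for illustration, not current AWS list prices.
NAT_GATEWAY_USD_PER_MONTH = 40.0   # managed NAT Gateway, idle
NAT_INSTANCE_USD_PER_MONTH = 3.0   # self-managed EC2 NAT instance (see footnote 1)


def idle_baseline(usd_per_month: float, availability_zones: int, months: int = 12) -> float:
    """Fixed cost of one always-on resource per AZ, before any traffic."""
    if availability_zones < 1 or months < 0:
        raise ValueError("need at least one AZ and a non-negative number of months")
    return usd_per_month * availability_zones * months


# Compare a year of doing nothing, for one to three availability zones.
for zones in (1, 2, 3):
    managed = idle_baseline(NAT_GATEWAY_USD_PER_MONTH, zones)
    diy = idle_baseline(NAT_INSTANCE_USD_PER_MONTH, zones)
    print(f"{zones} AZ(s): NAT Gateway ${managed:.0f}/year, NAT instance ${diy:.0f}/year")
```

The point of the sketch is only that the cost scales with availability zones and calendar time, not with anything the network actually does.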

Domino Effect

So a key desire is to not put myself in a situation where I need a NAT Gateway.

Take the Lambda / RDS pair as an example.

If both can exist on a private network and do not need outbound connectivity, everything is fine.

But as soon as my Lambda needs something from the internet or from an AWS API, it gets complicated:

I could remove the Lambda from my VPC. Then it will have a public IP address, and it will happily connect to the world. Unfortunately, it will no longer be able to talk to my private network, where RDS is living.
I could remove the Lambda from my VPC and place RDS in a public subnet, so it will have a public IP. The Lambda could connect to that. But a database with a public IP address is considered a rather strong red flag in IPv4-land, and I couldn’t even use a Security Group to limit access properly. Not to mention that while vanilla RDS can have a public IP address, Aurora databases cannot – annihilating any dreams about the timeless Aurora Serverless v2 right away.
I could move Lambda to a public subnet within my VPC. This would solve all my problems. It could get a public IP and could connect to any private subnet I have. Unfortunately, this isn’t supported: Lambda will not pick up a public IP address in a public subnet – probably because that would be too easy.
If I just need access to an AWS API, I could use PrivateLink to map endpoints into my private subnet – but those are time-based as well: $9 per month, per availability zone, per AWS service. Right.
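As a rough comparison of the two time-based options above, the following sketch computes the fixed monthly cost of a NAT Gateway versus PrivateLink endpoints. It assumes the rounded prices quoted in this post ($40 and $9) and ignores all per-gigabyte charges. The names are hypothetical, and the comparison only applies when AWS APIs are all that is needed, since endpoints do not provide general internet access.

```python
import math

# Rounded figures from this post, used as assumptions (not current list prices).
NAT_GATEWAY_USD = 40.0  # per month, per availability zone
ENDPOINT_USD = 9.0      # per month, per availability zone, per AWS service


def monthly_fixed_cost(availability_zones: int, aws_services: int) -> dict:
    """Fixed monthly cost of each option, before any data is transferred."""
    return {
        "nat_gateway": NAT_GATEWAY_USD * availability_zones,
        "privatelink": ENDPOINT_USD * availability_zones * aws_services,
    }


def max_services_cheaper_than_nat() -> int:
    """Largest number of AWS services for which endpoints still cost less.

    The number of availability zones cancels out, because both options
    are billed per zone.
    """
    return math.floor(NAT_GATEWAY_USD / ENDPOINT_USD)


print(monthly_fixed_cost(availability_zones=2, aws_services=3))
print("Endpoints stay cheaper up to", max_services_cheaper_than_nat(), "services")
```

Either way, the bill arrives whether or not the Lambda ever makes a call.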

There are other, similarly weird examples. To name one, the network mode awsvpc in ECS will happily grab a public IP address when used with ECS-Fargate. If I use the same awsvpc network mode with ECS-EC2 on a VPC’s public subnet, it will refuse to pick up a public IP address.

Yesterday’s Internet

Just to add insult to injury: All this drama completely goes away with IPv6. There is no NAT in IPv6, and the VPC would use a regular Internet Gateway, or to mimic the NAT Gateway’s outbound-only characteristic, an Egress-only Internet Gateway.

Both are free!
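Why NAT disappears with IPv6 can be illustrated with Python’s standard ipaddress module. This is a simplified sketch: the function name and the two sample addresses are made up for illustration, and real VPC routing involves more than whether an address is globally routable.

```python
import ipaddress


def needs_nat_for_internet(address: str) -> bool:
    """Return True if the address is not globally routable.

    A host with such an address cannot talk to the public internet
    directly, so its outbound traffic has to be translated.
    """
    return not ipaddress.ip_address(address).is_global


# Hypothetical example addresses, chosen only for illustration.
PRIVATE_IPV4_HOST = "10.0.1.5"     # RFC 1918 space, typical for a private subnet
GLOBAL_IPV6_HOST = "2600:1f18::1"  # an arbitrary global unicast IPv6 address

for host in (PRIVATE_IPV4_HOST, GLOBAL_IPV6_HOST):
    print(host, "needs NAT:", needs_nat_for_internet(host))
```

With globally routable addresses everywhere, “private” becomes a matter of routing and filtering (which is what an Egress-only Internet Gateway does) instead of address translation.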

Unfortunately, in 2021, it is still virtually impossible to use an IPv6-only VPC².

Defining Cloud

Everyone has their own definition of cloud. To me, it’s using abstractions as much as possible. This, in turn, means that I can focus on what is interesting to me. All the infrastructure, all the software stack and runtimes below, making that secure and highly available – those are solved problems. I don’t want to do that myself.

Following that route of abstraction, we end up at truly serverless – timeless – services.

To me, those are inconsequential configuration items. Take S3 buckets, for example. I can have one or ten of those. It’s just a thing that’s there. I pay for actual data storage – not for merely having an empty bucket around.

Here are some examples of time-based services:

NAT Gateway, oh yes
RDS and Aurora (at least in their non-Serverless flavors)
EC2
Fargate
Elastic Load Balancing
PrivateLink
EKS
VPN

Some wonderful examples of timeless services:

VPC itself (often overlooked!)
S3
Lambda
EFS
Timestream
Infinidash
API Gateway
DynamoDB
SQS

Noteworthy is the new App Runner service: it lands somewhere in between, because it uses a hybrid approach (CPU is timeless, RAM is not) – a very interesting step in the right direction.

Prototyping Without Regrets

Note that I can configure all of these timeless services and build an architecture that’s highly available across multiple availability zones, needs zero maintenance, scales automatically and costs literally nothing until I really use it. And even then it’s so much cheaper.

As an example: A while back, I built a backend for an app. The app never went anywhere, but this highly scalable timeless infrastructure has been available for more than a year, and still is – and if the app became an overnight success now, this architecture would scale with it, without me doing anything. I paid a few cents for all that, in total.

Try that kind of worry-free prototyping with time-based services…

Conclusion

Time-based services are an anti-cloud pattern. Yes, you could turn them on and off as needed, or have some pilot light with auto-scaling that would eventually react after a few minutes.

But that’s not what cloud is about. As cloud services evolve and time goes on, we’ll see more architectures that are fully built on timeless services, allowing everyone to worry less about costs and to focus on their actual work.

In a sense, re:Web is about building a bridge to timeless services: It makes it possible to run common web applications on Lambda instead of EC2 or Fargate (or App Runner).



¹ Technically, you could run your own EC2 box (then called a NAT instance) for like $3/month. Per availability zone. And then take care of its configuration, redundancy, security updates, monitoring and all the other management issues yourself. But the point of cloud is to do less of what AWS calls “undifferentiated heavy lifting”.
² Note that top-level IPv4 exhaustion occurred more than ten years ago.