AWS: VPC DNS Support
2021-09-07
There is No Way It’s DNS
If you’ve been around in networking for a while, you probably know this haiku:
It’s not DNS
There’s no way it’s DNS
It was DNS
Recently I had one more experience like that, where enabling “VPC Hostnames” on AWS caused an outage. I’d have sworn on my love of beer that it could not be related to DNS.
Time for a closer look at DNS support in VPCs.
Overview: The VPC DNS Resolver
Amazon provides standard DNS resolvers in each VPC, free of charge. You can find the resolver service at several IP addresses:
- IPv6: at the local address fd00:ec2::253
- Legacy IP:
  - Local address 169.254.169.253
  - VPC’s base network address plus 2 (e.g. 172.31.0.2 or 10.0.0.2), which is why it is sometimes called the “dot-2 resolver”
This resolver is announced as part of the default DHCP options.
It can resolve both public internet DNS names (like neveragain.de) and VPC-specific hostnames.
The documentation of DNS support for VPCs fits on a single page.
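The “base network address plus 2” rule above is simple enough to sketch in a few lines. This is only an illustration: the function name dot2_resolver and the example CIDR blocks are my own assumptions, chosen to match the example addresses given above.

```python
import ipaddress

# The two fixed local resolver addresses listed above.
LOCAL_RESOLVER_V4 = "169.254.169.253"
LOCAL_RESOLVER_V6 = "fd00:ec2::253"


def dot2_resolver(vpc_cidr):
    """Return the VPC's base network address plus 2 (hypothetical helper)."""
    network = ipaddress.ip_network(vpc_cidr)
    return str(network.network_address + 2)


# Assumed example CIDR blocks, matching the addresses mentioned in the text.
print(dot2_resolver("172.31.0.0/16"))
print(dot2_resolver("10.0.0.0/16"))
```

Any of these addresses can then be passed to dig with @, as in the examples further down.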
Overview: VPC-Specific DNS Hostnames
Each Elastic network interface (ENI) in a VPC can have DNS hostnames pointing to it.
For EC2 instances, this would be ec2-$IPADDRESS.$REGION.compute.amazonaws.com (legacy IP only).
It works the same way for other in-VPC AWS services: They create ENIs in the VPC and some corresponding DNS records.
With RDS, for example, this is something like mysql1234.cluster-oisdhosdahsj.eu-central-1.rds.amazonaws.com.
RDS demonstrates an important distinction:
- Hostnames of private-only databases always (globally!) resolve to the private IP address
- Hostnames of databases with public IPs:
  - resolve to the public IP address for queries from “outside”
  - resolve to the private IP address for queries from “inside”
Here, “inside” usually means “from within the same VPC”, but this line is blurred by VPC Peerings and other connections like Transit Gateway.
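To see which view a given machine gets, you can look at whether the answers are private or public addresses. Here is a minimal sketch using only the standard library. The helper names resolve and classify are made up for this example, and “private” here means whatever Python’s ipaddress module considers private, which is a bit broader than RFC 1918.

```python
import ipaddress
import socket


def resolve(hostname):
    """Return the distinct addresses the local resolver hands out for hostname."""
    infos = socket.getaddrinfo(hostname, None)
    return sorted({info[4][0] for info in infos})


def classify(addresses):
    """Label a list of answers as 'private', 'public', 'mixed' or 'no answer'."""
    flags = [ipaddress.ip_address(address).is_private for address in addresses]
    if not flags:
        return "no answer"
    if all(flags):
        return "private"
    if not any(flags):
        return "public"
    return "mixed"


# Usage, run once from inside the VPC and once from outside:
#   classify(resolve("mysql1234.cluster-oisdhosdahsj.eu-central-1.rds.amazonaws.com"))
```

Keep in mind that a “private” result alone does not prove you are “inside”: private-only databases resolve to their private address from everywhere.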
VPC-Specific DNS settings
There are two settings to control the DNS resolver’s behavior. In the documentation’s own words:
- enableDnsHostnames: Indicates whether instances with public IP addresses get corresponding public DNS hostnames [1]
- enableDnsSupport: Indicates whether the DNS resolution is supported
Confusingly, the default settings vary a bit: “By default, both attributes are set to true in a default VPC or a VPC created by the VPC wizard”, but any other VPC will have enableDnsHostnames disabled by default.
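Since the defaults depend on how the VPC was created, it is worth checking what a given VPC is actually set to. A small sketch, assuming a boto3 EC2 client is passed in (boto3 is a third-party package and not imported here). The function name vpc_dns_settings and the VPC ID in the usage comment are placeholders of mine.

```python
def vpc_dns_settings(ec2_client, vpc_id):
    """Return both DNS attributes of a VPC as a dict of booleans.

    ec2_client is expected to behave like boto3.client("ec2").
    """
    settings = {}
    for attribute in ("enableDnsSupport", "enableDnsHostnames"):
        response = ec2_client.describe_vpc_attribute(
            VpcId=vpc_id, Attribute=attribute
        )
        # The response key is the attribute name with a capital first letter,
        # e.g. "EnableDnsSupport": {"Value": True}.
        key = attribute[0].upper() + attribute[1:]
        settings[attribute] = response[key]["Value"]
    return settings


# Usage (needs boto3 and AWS credentials; the VPC ID is a placeholder):
#   import boto3
#   print(vpc_dns_settings(boto3.client("ec2"), "vpc-0123456789abcdef0"))
```

The API only accepts one attribute per call, hence the loop.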
After naïvely skimming the VPC DNS documentation, the following assumption seemed sound:
enableDnsHostnames just creates additional DNS entries for IP addresses (authoritative DNS), and enableDnsSupport controls whether the VPC’s resolver is enabled at all.
But, as we all know: Assumption is the mother of all fuck-ups.
enableDnsSupport
This works pretty much as expected: If this is enabled, the VPC resolver IP addresses (see above) respond to queries.
Disabling it immediately causes those resolvers to go dark, i.e. DNS queries will no longer be answered (you’ll see a timeout).
It is also required for the next setting to have any effect.
enableDnsHostnames
As expected, this creates additional DNS records. That’s why this option is also required if you want to have public IPs for RDS databases, for example.
Entirely unexpected, for me at least, is that it synthesizes forward and reverse responses for any private IP address. That’s right: Not just the private IP addresses within your VPC CIDR range, but any private IP address. Like, you know, those private IP addresses that you’re using on-premises.
This seems reasonable for VPC addresses:
[ec2-user@ip-10-0-0-34 ~]$ dig +noall +ans -x 10.0.0.34
34.0.0.10.in-addr.arpa. 285 IN PTR ip-10-0-0-34.eu-central-1.compute.internal.
But now your on-premises servers have a reverse lookup, too! From the VPC’s perspective, at least:
[ec2-user@ip-10-0-0-34 ~]$ dig +noall +ans -x 192.168.123.234
234.123.168.192.in-addr.arpa. 600 IN PTR ip-192-168-123-234.eu-central-1.compute.internal.
[ec2-user@ip-10-0-0-34 ~]$ dig +noall +ans -x 172.16.0.3
3.0.16.172.in-addr.arpa. 600 IN PTR ip-172-16-0-3.eu-central-1.compute.internal.
If this option is disabled, those queries return NXDOMAIN (entry does not exist), as they should.
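The naming pattern in the dig output above is regular enough to reproduce. The following sketch predicts the reverse query name and the synthesized PTR answer for an address. Two assumptions of mine: “private” is taken to mean the three RFC 1918 ranges, and the .compute.internal suffix is simply copied from the eu-central-1 output shown above, so other regions may look different.

```python
import ipaddress

# Assumption: "any private IP address" means the RFC 1918 ranges.
RFC1918 = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def reverse_query_name(ip):
    """Return the in-addr.arpa name that 'dig -x' would ask for."""
    return ipaddress.ip_address(ip).reverse_pointer + "."


def synthesized_ptr(ip, region="eu-central-1"):
    """Return the PTR name as seen in the output above, or None for non-private IPs."""
    address = ipaddress.ip_address(ip)
    if address.version != 4:
        return None
    if not any(address in network for network in RFC1918):
        return None
    return "ip-" + str(address).replace(".", "-") + "." + region + ".compute.internal."


print(reverse_query_name("192.168.123.234"), synthesized_ptr("192.168.123.234"))
```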
This is documented, but only in the fine print rather than prominently in the options’ descriptions.
Resolution (Pun Intended)
So, after all: It was DNS. The client in question performs a reverse lookup of the server’s IP address and then uses that information to see which credentials it should present. After enabling VPC DNS hostnames, that lookup was no longer answered with “no such entry” but with the synthesized ip-192-168-0-123.eu-central-1.compute.internal.
Subsequently, the credentials lookup failed to find a match.
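To make the failure mode concrete, here is a toy model of such a client. Everything in it is hypothetical: the credentials table, the fallback to the bare IP address when there is no PTR record, and all names are my own illustration, not the actual client’s code.

```python
# Hypothetical credentials table, keyed by whatever the reverse lookup yields.
CREDENTIALS = {"192.168.0.123": "on-prem-service-account"}


def lookup_key(ip, reverse_lookup):
    """Use the PTR name if there is one, otherwise fall back to the bare IP."""
    name = reverse_lookup(ip)
    return name if name is not None else ip


def pick_credentials(ip, reverse_lookup):
    """Return the matching credentials, or None if nothing matches."""
    return CREDENTIALS.get(lookup_key(ip, reverse_lookup))


def nxdomain(ip):
    """Stand-in resolver: behaves like enableDnsHostnames being disabled."""
    return None


def synthesized(ip):
    """Stand-in resolver: behaves like enableDnsHostnames being enabled."""
    return "ip-" + ip.replace(".", "-") + ".eu-central-1.compute.internal"


print(pick_credentials("192.168.0.123", nxdomain))     # on-prem-service-account
print(pick_credentials("192.168.0.123", synthesized))  # None
```

Nothing changed on the client or the server. Only the answer to the reverse lookup did.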
Additional Trivia
Version Query
This is something I’ve observed often on AWS: They seem to like easter eggs and/or proper implementations.
Back in ancient times, when ISC BIND was the de-facto authoritative DNS server software, you’d query a nameserver’s version information from the version.bind hostname in the Chaos (CH) class (as opposed to the default Internet class IN). The nameserver would answer with the software version, e.g. "9.11.0".
Most modern nameservers simply don’t answer this query. The VPC resolver, however, does:
[ec2-user@ip-10-0-0-34 ~]$ dig +noall +ans version.bind txt ch @10.0.0.2
version.bind. 0 CH TXT "EC2 DNS"
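If you’re curious what that query looks like on the wire: the only unusual part is the class field in the question section. The sketch below builds the packet by hand with the standard library. The function names are mine, and send_query just returns the raw reply without parsing it.

```python
import socket
import struct

TYPE_TXT = 16  # TXT record type
CLASS_CH = 3   # Chaos class (the usual Internet class IN is 1)


def build_query(name, qtype, qclass, query_id=0):
    """Build a minimal DNS query packet with a single question."""
    # Header: ID, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: length-prefixed labels, terminated by a zero byte.
    labels = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    )
    return header + labels + b"\x00" + struct.pack("!HH", qtype, qclass)


def send_query(packet, server, timeout=2.0):
    """Send the packet via UDP port 53 and return the raw reply bytes."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(packet, (server, 53))
        reply, _ = sock.recvfrom(512)
        return reply


# Usage from inside a VPC (resolver address as in the dig example above):
#   reply = send_query(build_query("version.bind", TYPE_TXT, CLASS_CH), "10.0.0.2")
```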
Caching
The resolver IP address seems to employ some caching of its own, apparently for a few minutes per record.
This is important when changing the VPC Hostnames setting: It will not take effect immediately, and not consistently!
Fleet
I can observe fluctuating time-to-live values returned by the resolver IP address, which leads me to believe that there’s actually a handful of different resolvers answering the queries:
[ec2-user@ip-10-0-0-34 ~]$ while true; do dig +noall +ans neveragain.de soa @10.0.0.2; sleep 1; done
neveragain.de. 243 IN SOA squigley.hq.neveragain.de. hostmaster.neveragain.de. 2021080901 86400 3600 2419200 10800
neveragain.de. 242 IN SOA squigley.hq.neveragain.de. hostmaster.neveragain.de. 2021080901 86400 3600 2419200 10800
neveragain.de. 241 IN SOA squigley.hq.neveragain.de. hostmaster.neveragain.de. 2021080901 86400 3600 2419200 10800
neveragain.de. 247 IN SOA squigley.hq.neveragain.de. hostmaster.neveragain.de. 2021080901 86400 3600 2419200 10800
neveragain.de. 245 IN SOA squigley.hq.neveragain.de. hostmaster.neveragain.de. 2021080901 86400 3600 2419200 10800
neveragain.de. 246 IN SOA squigley.hq.neveragain.de. hostmaster.neveragain.de. 2021080901 86400 3600 2419200 10800
[ ... and so on ...]
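The reasoning behind that guess: a single cache would count the TTL down by one per second, so sample time plus TTL (the record’s expiry time) would stay constant. A rough sketch that groups the samples above by expiry time follows. The function name and the tolerance parameter are my own additions, the latter because the loop does not tick at exactly one-second intervals.

```python
def estimate_caches(samples, tolerance=1):
    """Estimate the number of distinct caches from (seconds_elapsed, ttl) samples.

    Samples whose expiry times (elapsed + ttl) lie within `tolerance`
    seconds of their neighbour are counted as the same cache.
    """
    expiries = sorted(elapsed + ttl for elapsed, ttl in samples)
    clusters = 0
    previous = None
    for expiry in expiries:
        if previous is None or expiry - previous > tolerance:
            clusters += 1
        previous = expiry
    return clusters


# TTLs from the output above, assumed to be sampled one second apart.
ttls = [243, 242, 241, 247, 245, 246]
samples = list(enumerate(ttls))
print(estimate_caches(samples))               # 2
print(estimate_caches(samples, tolerance=0))  # 4
```

So even with generous rounding, these six answers came from at least two different caches.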
Additional Reading
For advanced DNS resolution topics, see also:
- Route53 Resolver Endpoints to forward DNS queries between external (e.g. on-premises) DNS and AWS (beware: while this sounds like three lines of BIND configuration, it actually starts at almost $190/month!)
- DNSSEC validation (not enabled by default, unfortunately)
- DNS Firewall, which is a corny name for hostname-specific override rules in the resolver (you can fake NODATA, NXDOMAIN and arbitrary CNAME responses)
- For VPC Peerings, there are separate settings for DNS
Conclusion
It was DNS.
[1] Minor gripe: What does that even mean, an instance gets a hostname? What exactly does it get, and how? Does it affect the DHCP-assigned hostname as well? A rather weak choice of words, I’d say, used throughout the document. Spoiler: It does not affect DHCP responses.