A 101 on Domain Fronting

Mon 11th Feb 19

Domain fronting has been around for years and I've always understood the concept but never actually looked at exactly how it works. That was until recently, when I did some work with Chris Truncer who had us set it up as part of a red team test. That was the point at which I had to sit down and understand the actual inner workings. Luckily Chris is a good teacher and the concept is fairly simple when it is broken down into pieces.

Before starting to explain fronting itself, a step back to look at how we retrieve a web page...

First there is the network request.

  • A user enters a URL, which contains a hostname, into a browser.
  • The operating system does a DNS lookup on the hostname.
  • A TCP connection, based on the IP addresses from the lookup, is made between the two machines.
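The steps above can be sketched in Python. This is purely illustrative; `resolve_and_connect` is a made-up name, not part of any library.

```python
import socket

def resolve_and_connect(hostname, port):
    """Sketch of the two network steps: a DNS lookup on the
    hostname, then a TCP connection to a resolved address."""
    # Step 1: the DNS lookup, mapping the hostname to IP addresses.
    addrinfo = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    ips = [entry[4][0] for entry in addrinfo]
    # Step 2: the TCP connection. create_connection resolves again
    # and tries each returned address until one accepts.
    sock = socket.create_connection((hostname, port), timeout=5)
    return ips, sock
```

Note that nothing in these two steps involves the URL's path or headers; the hostname is only used to find an IP address to connect to.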

Now that there is a network connection in place, the application layer kicks in and the HTTP request is sent. With HTTP/1.0, a web server could only serve one web site per IP address as it had no way to know which hostname was used to request the site. HTTP/1.1 introduced the "Host" header, which allows the server to host multiple virtual hosts selected by the host name provided, hence the term "named virtual hosts". The server checks the host name provided against the list of virtual hosts it knows about and picks the correct one to serve; if it doesn't recognise the specific host requested, it serves the default site.
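The vhost selection logic can be sketched as a simple lookup; the site contents and the second hostname here are invented for illustration.

```python
# A toy model of named virtual host selection, as a server might do it.
VHOSTS = {
    "digi.ninja": "<h1>digi.ninja content</h1>",
    "example.com": "<h1>example.com content</h1>",  # made-up second vhost
}
DEFAULT_SITE = "<h1>default site</h1>"

def select_vhost(host_header):
    """Pick the virtual host matching the Host header, or the default."""
    # Strip any port suffix, e.g. "digi.ninja:443" -> "digi.ninja"
    hostname = host_header.split(":")[0].lower()
    return VHOSTS.get(hostname, DEFAULT_SITE)
```

The key point is that the choice of site depends only on the Host header, not on the hostname used to make the network connection.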

At the most basic level, an HTTP/1.1 request looks like this:

GET / HTTP/1.1
Host: digi.ninja

In a normal request made by a browser, the hostname in the URL will match the one given in the Host header, but this does not have to be the case as there is nothing to enforce that link. We can demonstrate this using curl: the following request makes a network connection to bing.com but requests the site google.com through the Host header. As Bing doesn't have a virtual host called google.com, we get an error.

$ curl -H "Host: google.com" bing.com
<h2>Our services aren't available right now</h2><p>We're working to restore all services as soon as possible. Please check back soon.</p>06XZVXAAAAAD6lfm8665lRL+M0r8EuYmDTFRTRURHRTA1MTMARWRnZQ==

Let's try making a request for the Australian Google site while connecting via the UK Google domain.

$ curl -H "host: www.google.com.au" www.google.co.uk
<!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="en-GB"><head><meta content="text/html; charset=UTF-8" http-equiv="Content-Type"><meta content="/images/branding/googleg/1x/googleg_standard_color_128dp.png" itemprop="image"><title>Google</title>...

This request works as the UK Google servers know about all their different virtual hosts and so are able to route the request correctly and return the appropriate content.

Now that we've covered the basic concepts, let's see how this is used for domain fronting.

When a site is set up on a Content Delivery Network (CDN), such as Amazon Cloudfront, Cloudflare, Microsoft Azure CDN or Google Cloud CDN, a CNAME record for the domain is created to point at the CDN servers, and something similar to a named vhost is configured on the CDN web servers so they can respond to the request. The configuration is given an "origin" server, which is paired with the incoming domain so the CDN knows where to go to retrieve the actual content to serve.
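In other words, the CDN edge behaves roughly like this sketch; all the hostnames are invented for illustration, and real CDN routing is of course far more involved.

```python
# Hypothetical model of a CDN edge mapping an incoming Host header
# to its configured origin server. Hostnames are made up.
CDN_CONFIG = {
    "goodsite.example": "origin-a.example",
    "othersite.example": "origin-b.example",
}

def route_to_origin(host_header):
    """Pick the origin for a request. The hostname used for the TCP/TLS
    connection plays no part; only the Host header is consulted."""
    origin = CDN_CONFIG.get(host_header)
    if origin is None:
        return "default-error-page"  # unknown vhost
    return origin
```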

As we have already shown, the hostname used for the network connection does not have to match the site requested, so we can use the hostname of one of the sites hosted by the CDN for the connection but then specify a different one in the Host header.

So what about certificates, won't using mismatched hostnames and host headers mess things up and give certificate warnings or leak information? The short answer: no. Going back to the start of this post, the first thing that gets set up is the network connection, and this is where TLS comes in. Once the TCP connection is established, the TLS negotiation starts and is based entirely on the hostname used to start the connection (sent to the server in the SNI extension); the Host header is part of the application layer traffic and doesn't get a look in until all the lower layers are established. We can show this by going back to the Google / Bing example, making the request over HTTPS while asking curl for more information on the connection.

$ curl -v -H "Host: google.com" https://bing.com
* Rebuilt URL to: https://bing.com/
*   Trying 13.107.21.200...
* TCP_NODELAY set
* Connected to bing.com (13.107.21.200) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: /etc/ssl/certs
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server accepted to use h2
* Server certificate:
*  subject: CN=www.bing.com
*  start date: Jul 20 17:47:08 2017 GMT
*  expire date: Jul 10 17:47:08 2019 GMT
*  subjectAltName: host "bing.com" matched cert's "bing.com"
*  issuer: C=US; ST=Washington; L=Redmond; O=Microsoft Corporation; OU=Microsoft IT; CN=Microsoft IT TLS CA 5
*  SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x559a18d7f900)
> GET / HTTP/2
> Host: google.com
> User-Agent: curl/7.58.0
> Accept: */*
> 
* Connection state changed (MAX_CONCURRENT_STREAMS updated)!
< HTTP/2 400 
< x-msedge-ref: 0OPdbXAAAAACPvltTXmg8T6Ynwb1og0T8TE9OMDRFREdFMDQyMABFZGdl
< date: Thu, 07 Feb 2019 09:15:35 GMT
< 
* Connection #0 to host bing.com left intact
<h2>Our services aren't available right now</h2><p>We're working to restore all services as soon as possible. Please check back soon.</p>0OPdbXAAAAACPvltTXmg8T6Ynwb1og0T8TE9OMDRFREdFMDQyMABFZGdl

There is a lot of output, but the two key lines are:

  • subject: CN=www.bing.com
  • subjectAltName: host "bing.com" matched cert's "bing.com"

These show that the TLS connection is negotiated using Bing's certificate; there are no warnings because the certificate is valid, and there is no mention of Google until the Host header is sent in the HTTP request.

In the situation I was in with Chris, we were using fronting to get our HTTPS based command and control (C2) channel out of a company that used a web filtering solution which made decisions based on the hostname used to initiate the request. They were not doing SSL interception, which is important as that would have given them visibility into the HTTP traffic and allowed them to also make decisions based on the Host header. We picked a company hosted on Cloudfront which was known to have a good reputation and which we expected to be allowed through the filters. We set up our own C2 site on Cloudfront with its origin pointing at our real servers. The custom beacon software used the "good" hostname for the network connection and our "malicious" hostname in the Host header. As far as the company filters were concerned, we were talking to a nice, trustworthy site, but the CDN saw, and responded to, the requests for our C2 site, routing all the traffic directly to our server.
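Putting the pieces together, the whole trick reduces to the following toy model; the filter allow list, CDN configuration and every hostname here are invented purely to show the logic.

```python
# The web filter only ever sees the hostname used for DNS/TLS (SNI).
ALLOWED_HOSTNAMES = {"trusted-cdn-site.example"}

# The CDN, inside the TLS tunnel, routes purely on the Host header.
CDN_ORIGINS = {
    "trusted-cdn-site.example": "legit-origin.example",
    "c2.attacker.example": "attacker-origin.example",
}

def fronted_request(sni_hostname, host_header):
    """Model one request passing a hostname filter and a CDN edge."""
    if sni_hostname not in ALLOWED_HOSTNAMES:
        return "blocked by filter"
    return "routed to " + CDN_ORIGINS.get(host_header, "default site")
```

A beacon connecting to the "good" hostname but sending the "malicious" Host header gets past the filter and still reaches the attacker's origin.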

In this example I've covered hiding C2 traffic, but this technique can also be used to bypass censorship filters and other, similar restrictions. Bypassing censorship is done in the same way as discussed here, except the site to be viewed is fronted by a "trusted" site and a custom browser plugin, or local network proxy, does the work of swapping in the correct Host header value.
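The header-swapping step such a plugin or proxy performs is simple; this sketch rewrites a raw HTTP/1.1 request, and both hostnames are illustrative.

```python
def swap_host_header(raw_request, real_host):
    """Rewrite the Host header of a raw HTTP/1.1 request so the inner
    request names the real (censored) site, while the outer connection
    still uses the trusted front domain."""
    rewritten = []
    for line in raw_request.split("\r\n"):
        if line.lower().startswith("host:"):
            rewritten.append("Host: " + real_host)
        else:
            rewritten.append(line)
    return "\r\n".join(rewritten)
```

Everything else in the request, including the connection it travels over, is left untouched; only the application-layer Host header changes.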

If you manage a domain which is served through a CDN, you might be thinking, "What can I do to prevent my good reputation from being abused like this?" The answer, as far as I can tell, is nothing. The "vulnerability" is in the way HTTP works: the disconnect between the network connection and the application traffic. Once you host your site on a box which is shared with other sites, all the others become accessible once the network connection has been made. There won't even be any logs for you to check, as once the CDN sees the Host header, all the traffic is routed into the "bad" site and so any logs will be in its account. The only thing you may be able to look at is DNS logs, trying to tally lookups against site traffic; if you are seeing a lot of lookups but not much traffic, then something odd may be going on. Realistically though, that isn't going to work due to caching and the effort it would take to match the two sets of logs up.

If you would like to work through setting up your own cloud fronted domain, I've written an accompanying post "Domain Fronting with Cloudfront - A worked example".