Intro

CloudFront is a fully managed content delivery network that caches content in locations physically close to the users.

Distribution

A CloudFront Distribution is a collection of configuration settings that define the deployment.

Edge Location

Edge locations are the distributed points-of-presence that store content in a local cache and are closest to the users.

Regional Edge Cache

A Regional Edge Cache is a larger version of an Edge Location. Regional Edge Caches act as an intermediate caching layer between the Origin and the Edge Locations.

Important
CloudFront only caches downloadable content. It does not perform write (upload) caching; uploads are passed straight through to the origin.

Behaviours

Behaviours define which content is cached and also the caching parameters.

CloudFront Behaviours have the following characteristics.

  • CloudFront distributions are assigned a default (*) behaviour that matches everything.
  • Additional behaviours can be matched based on the content path, such as stuff/*.
  • Origins, Origin Groups, TTL, Protocol Policies, and access restrictions are configured via Behaviours.
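
As a rough illustration of how path-based behaviours fall through to the default, the sketch below matches a request path against an ordered list of patterns. The behaviour list, the origin names and the use of Python's fnmatch are assumptions made for the example; it is not how CloudFront is implemented internally.

```python
from fnmatch import fnmatchcase

# Hypothetical behaviours in precedence order; the default (*) comes last.
BEHAVIOURS = [
    ("stuff/*", "custom-origin"),
    ("*", "s3-origin"),
]


def match_behaviour(path, behaviours=BEHAVIOURS):
    """Return the (pattern, origin) pair of the first behaviour matching path."""
    for pattern, origin in behaviours:
        if fnmatchcase(path, pattern):
            return pattern, origin
    raise LookupError(f"no behaviour matched {path!r}")
```

With these assumed behaviours, a request for stuff/logo.png is routed to the custom origin, while anything else falls through to the default behaviour.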

Origin

An Origin is the source location of the content. An Origin can either be an S3 Origin or a Custom Origin.

Time-to-Live

The Time-to-Live value specifies how long content is cached before it is considered stale and needs to be refreshed.

The following points describe TTL.

  • When CloudFront revalidates a stale object and the cache already holds the latest version, the origin returns a 304 Not Modified status code.
  • If the cache does not hold the latest version, the origin returns a 200 OK status code along with the latest version of the file.
  • The default TTL value is 24 hours and is applied to any object that does not have a per-object TTL set.
  • The default, minimum and maximum TTL values can be set in a cache policy.
  • The Cache-Control or Expires response headers can be used to signal to the browser how long to keep an object in its local cache.
  • Cache-Control max-age and Cache-Control s-maxage define the validity in seconds.
  • Expires defines the date and time the validity expires.
  • When minimum and maximum TTL values are defined alongside the Cache-Control or Expires headers, CloudFront behaves as follows.
    • If the Cache-Control or Expires value is less than the minimum TTL, the minimum TTL is used.
    • If the Cache-Control or Expires value is greater than the maximum TTL, the maximum TTL is used.
    • A detailed breakdown of this behaviour can be found in the CloudFront documentation.
  • S3 origins can set the TTL values of an object via metadata.
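
The clamping rules above can be sketched as a small function. This is a simplified model, assuming the origin's Cache-Control or Expires header has already been reduced to a number of seconds (or None when no header is present); the function name is made up for the example, and the real service has additional edge cases that are covered in the documentation.

```python
def effective_ttl(origin_ttl, min_ttl=0, default_ttl=86400, max_ttl=31536000):
    """Return the number of seconds an object stays cached (simplified model).

    origin_ttl is the validity requested by the origin in seconds,
    or None if the origin sent no Cache-Control or Expires header.
    """
    if origin_ttl is None:
        # No per-object TTL: fall back to the default TTL (24 hours).
        return default_ttl
    # Clamp the origin's value into the [min_ttl, max_ttl] window.
    return max(min_ttl, min(origin_ttl, max_ttl))
```

For example, with a minimum TTL of 60 seconds, an object sent with max-age=10 would still be cached for 60 seconds under this model.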

Invalidations

Cache invalidations allow you to manually expire objects from CloudFront.

The following points describe Invalidations.

  • Invalidations are performed distribution-wide, at all edge locations.
  • String-matching patterns are used to determine which objects are invalidated.
  • Versioned filenames can be used instead of invalidations: a changed file is published under a new name, so the old cached copy is simply no longer requested.
  • There is no cost when invalidating files via versioned filenames.
  • Invalidating files with string matching does have an associated cost.
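
One common way to produce versioned filenames is to embed a short hash of the file content in the name. The sketch below is only one possible scheme; the helper name and the choice of an 8-character SHA-256 prefix are assumptions for illustration.

```python
import hashlib
from pathlib import PurePosixPath


def versioned_name(path, content):
    """Return path with a short content hash inserted before the extension."""
    digest = hashlib.sha256(content).hexdigest()[:8]
    p = PurePosixPath(path)
    # e.g. css/site.css -> css/site.<digest>.css
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))
```

Because the name changes whenever the content changes, pages that reference the new name never hit the stale cached object, and no paid invalidation is needed.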

Certificate Manager

The AWS Certificate Manager (ACM) service is a fully managed, Regionally resilient service that enables the use of Digital Certificates for web-based services. Digital Certificates allow for the use of Transport Layer Security (TLS) encryption over the HTTPS protocol.

ACM has the following characteristics.

  • Certificates can either be generated by ACM or imported from an external Certificate Authority (CA).
  • Certificates that are generated by ACM are automatically renewed.
  • Certificates that are imported MUST be renewed by the administrator.
  • ACM can only be used with supported services, such as CloudFront and ALBs.
  • Certificates can only be used for services in the same Region they are deployed into.

Important
CloudFront operates out of the us-east-1 Region. ACM Certificates used by a CloudFront distribution must be deployed into the us-east-1 Region.

SSL/TLS Certificates

CloudFront distributions support SSL by default using the *.cloudfront.net wildcard certificate that matches all CloudFront distributions.

Alternate domain names can be used by leveraging CNAMEs; however, domain ownership must be verified using a matching certificate.

Certificates can be generated by ACM or imported into ACM by an administrator.

Important
There are two TLS connections involved when using CloudFront: Viewer -> CloudFront and CloudFront -> Origin. Both must use valid public certificates. Self-signed certificates are not supported.

Server Name Indication

Server Name Indication (SNI) is an extension to the TLS protocol which allows multiple website domains to be hosted on a single IP address.

SNI was added as an extension to TLS in 2003; however, some older (ancient) browsers do not support SNI.

Note
CloudFront hosting using SNI is free. If support for older browsers that do not support SNI is required, CloudFront charges a fee to provide a dedicated IP address.

Origin Access Identity (OAI)

Origin Access Identity (OAI) is a legacy method of controlling access to S3 from CloudFront. It is used to prevent users from bypassing CloudFront and accessing S3 directly.

An OAI is an identity that can be associated with a CloudFront distribution. The OAI is then granted access to the S3 bucket via a Bucket Policy.

Origin Access Control (OAC)

Origin Access Control (OAC) is the current method of controlling access to S3 from CloudFront. OAC supports the following features, which are not available with OAI without a workaround.

  • Works with all Amazon S3 buckets in all AWS Regions, including opt-in Regions launched after December 2022.
  • Allows S3 server-side encryption with AWS KMS (SSE-KMS).
  • Supports dynamic requests (PUT and DELETE) to Amazon S3.

Important
Both OAI and OAC are ONLY available with an S3 origin. If your origin is an Amazon S3 bucket configured as a website endpoint, it is the same as using a non-S3 origin and you must set it up with CloudFront as a custom origin. That means you can't use OAI or OAC 😭
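
With OAC, the S3 bucket policy grants access to the CloudFront service principal and is scoped to a single distribution. The sketch below builds such a policy as a Python dictionary; the function name and the bucket, account and distribution values passed in are placeholders, and only read access (s3:GetObject) is shown.

```python
import json


def oac_bucket_policy(bucket, account_id, distribution_id):
    """Build a read-only bucket policy for one CloudFront distribution (OAC)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowCloudFrontServicePrincipalReadOnly",
                "Effect": "Allow",
                "Principal": {"Service": "cloudfront.amazonaws.com"},
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/*",
                # Only requests made on behalf of this distribution are allowed.
                "Condition": {
                    "StringEquals": {
                        "AWS:SourceArn": (
                            f"arn:aws:cloudfront::{account_id}"
                            f":distribution/{distribution_id}"
                        )
                    }
                },
            }
        ],
    }


def to_json(policy):
    """Serialise the policy for pasting into the bucket policy editor."""
    return json.dumps(policy, indent=2)
```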

Securing Custom Origins

To secure a Custom Origin there are a couple of options.

Custom Headers

CloudFront can be configured to add a custom header to the request. In the case of an S3 origin, a bucket policy can be used to restrict traffic to only requests that carry the custom header.
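
On a custom origin that you control, the same idea can be enforced in application code. The following is a minimal sketch, assuming a hypothetical header name (x-origin-verify) and a shared secret configured on both the distribution and the origin; neither name comes from CloudFront itself.

```python
import hmac

# Hypothetical header name; any name agreed between CloudFront and the origin works.
HEADER_NAME = "x-origin-verify"


def is_from_cloudfront(headers, expected_secret):
    """Return True only if the request carries the expected custom header value."""
    # HTTP header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    supplied = lowered.get(HEADER_NAME, "")
    # compare_digest avoids leaking the secret through timing differences.
    return hmac.compare_digest(supplied.encode(), expected_secret.encode())
```

Requests that arrive without the header, or with the wrong value, can then be rejected by the origin.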

NSG / Firewall

A Network Security Group (NSG) or Firewall can be used to restrict traffic to the Custom Origin based on the published IP address ranges of CloudFront.
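
AWS publishes its address ranges as a JSON document in which each prefix is tagged with a service name. The sketch below filters such a document for CloudFront prefixes and checks whether a source address falls inside them. The sample data uses documentation-only address ranges rather than real CloudFront ranges, and the function names are made up for the example.

```python
import ipaddress

# Sample data in the shape of the published ip-ranges document.
# These are documentation-only ranges, not real CloudFront addresses.
SAMPLE_RANGES = {
    "prefixes": [
        {"ip_prefix": "203.0.113.0/24", "service": "CLOUDFRONT"},
        {"ip_prefix": "198.51.100.0/24", "service": "EC2"},
    ]
}


def cloudfront_networks(ranges):
    """Return the IPv4 networks tagged with the CLOUDFRONT service."""
    return [
        ipaddress.ip_network(prefix["ip_prefix"])
        for prefix in ranges["prefixes"]
        if prefix["service"] == "CLOUDFRONT"
    ]


def is_cloudfront_ip(address, networks):
    """Return True if address belongs to any of the given networks."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in networks)
```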

Geo Restrictions

Geo Restrictions can be used to restrict access to content based on a user's location. There are two options for Geo Restrictions with CloudFront.

CloudFront Geo Restriction

CloudFront Geo Restrictions are applied globally to the distribution. The following points describe CloudFront Geo Restrictions.

  • Uses a Whitelist OR Blacklist model.
  • Can restrict based on Country ONLY.
  • Uses a 99.8% accurate GeoIP database to identify country of origin.
  • The GeoIP database maps to a two-letter country code.
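
The Whitelist and Blacklist models differ only in how a listed country is treated. The sketch below shows that decision, assuming the country has already been resolved to a two-letter code; the function name and mode strings are illustrative only.

```python
def is_allowed(country_code, mode, countries):
    """Decide whether a viewer's country may access the distribution.

    mode is "whitelist" (only listed countries allowed) or
    "blacklist" (listed countries denied).
    """
    listed = country_code.upper() in countries
    if mode == "whitelist":
        return listed
    if mode == "blacklist":
        return not listed
    raise ValueError(f"unknown mode: {mode!r}")
```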

3rd Party Geo Location

3rd Party Geo Location allows for Geo Restriction that is applied per behaviour. The following points describe 3rd Party Geo Location.

  • CloudFront is configured as a private distribution. This requires requests to use a signed URL or signed cookie.
  • Some form of compute is used to determine the user's location.
  • This architecture can ALSO be used to restrict access based on anything that can be determined by the compute, such as a license key or a user account.

Note
By default, Edge locations return a 403 Forbidden status code when Geo Restricted content is requested from a blocked location.

Object Visibility

CloudFront has two options for object visibility: Public or Private. Visibility can be configured with a single behaviour for the entire distribution (either Public or Private), or with multiple behaviours, each being Public or Private.

Public

Public objects are available to anyone with the URL. This is the default configuration for CloudFront distributions.

Private

Private objects are only available to users who have been granted access to the object. This is configured via a signed URL or signed cookie. The following points describe access to private objects.

  • Signed URLs and Signed Cookies require a Signer.
  • Legacy implementations use a CloudFront key account which is created by the account Root User. The key account is added as a Trusted Signer.
  • Modern implementations use Trusted Key Group(s).
  • Signed URLs provide access to a single object.
  • Signed URLs are a good option if the client doesn't support Signed Cookies.
  • Signed Cookies provide access to groups of objects.
  • Signed Cookies allow you to use a Custom URL.
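
A signed URL carries a policy statement describing what is being granted and until when. The sketch below builds a custom policy with an expiry time and applies the URL-safe base64 variant that CloudFront expects. The signing step itself (an RSA signature made with the signer's private key) is deliberately left out, and the function names are assumptions for the example.

```python
import base64
import json


def custom_policy(resource_url, expires_epoch):
    """Build a compact policy granting access to resource_url until expires_epoch."""
    policy = {
        "Statement": [
            {
                "Resource": resource_url,
                "Condition": {"DateLessThan": {"AWS:EpochTime": expires_epoch}},
            }
        ]
    }
    # The policy is serialised without whitespace before it is signed.
    return json.dumps(policy, separators=(",", ":"))


def cloudfront_b64(data):
    """Base64-encode bytes, swapping characters that are invalid in a URL."""
    encoded = base64.b64encode(data).decode("ascii")
    return encoded.replace("+", "-").replace("=", "_").replace("/", "~")
```

The encoded policy, the encoded signature and the public key identifier are then appended to the URL as query string parameters.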

Field Level Encryption

Field-Level Encryption allows for the encryption of specific fields of application data within a request. Field-Level Encryption uses a public/private key pair. Sensitive fields are encrypted using the public key at the Edge Location. The private key is used to decrypt the data at the Origin.

Lambda@Edge

Lambda@Edge is an extension of AWS Lambda that lets you deploy Python and Node.js functions at Amazon CloudFront edge locations. Lambda@Edge functions can be executed in the following 4 scenarios.

blog/cloud-notes-aws-cloudfront/aws-lambda-edge.png
  1. When CloudFront receives a request from a viewer (viewer request)
  2. Before CloudFront forwards a request to the origin (origin request)
  3. When CloudFront receives a response from the origin (origin response)
  4. Before CloudFront returns the response to the viewer (viewer response)
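
As a small illustration of the first scenario, the sketch below is a Python handler for a viewer request trigger that adds a header before the request continues. The header name and value are hypothetical; the event shape (Records, cf, request) follows the structure Lambda@Edge passes to the function.

```python
def handler(event, context):
    """Viewer request trigger: add a header and pass the request on."""
    request = event["Records"][0]["cf"]["request"]
    # Header keys are lowercase; each value is a list of key/value pairs.
    request["headers"]["x-example"] = [
        {"key": "X-Example", "value": "added-at-edge"}  # hypothetical header
    ]
    # Returning the request lets CloudFront continue processing it.
    return request
```

Returning the request object tells CloudFront to carry on with the (modified) request; a function can instead return a response object to answer directly from the edge.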

CloudFront Architecture

The following diagram shows an example CloudFront architecture.

blog/cloud-notes-aws-cloudfront/aws-cloudfront.png

The following points describe the above diagram.

  • A CloudFront distribution is configured with multiple Behaviours.
  • The path based behaviour: stuff/* is attached to a Custom Origin.
  • The default behaviour: * is attached to an S3 Origin.
  • A user requesting content is directed to their closest Edge Location.
  • If the content is already cached at the Edge Location (a cache hit), it is served to the user.
  • If the content is not already cached at the Edge Location (a cache miss), the content is requested from the Regional Edge Cache. If the content is not already cached at the Regional Edge Cache, it is requested from the Origin.
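
The lookup order described in the last three points can be modelled with plain dictionaries standing in for each cache layer. This is a toy simulation, not CloudFront's implementation; the function name and the returned labels are made up for the example.

```python
def fetch(path, edge, regional, origin):
    """Return (content, source) following edge -> regional -> origin order."""
    if path in edge:
        return edge[path], "edge hit"
    if path in regional:
        # Populate the edge cache on the way back to the user.
        edge[path] = regional[path]
        return edge[path], "regional hit"
    # Cache miss at both layers: go to the origin and fill both caches.
    content = origin[path]
    regional[path] = content
    edge[path] = content
    return content, "origin fetch"
```

Calling fetch twice for the same path shows the effect: the first call reaches the origin, and the second is served from the edge cache.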