Serverless Design for URL Shortening Service

1. Overview

1.1 What is a URL shortening Service?

It is a service that provides short aliases for long URLs. Instead of sharing a lengthy URL with your customers or peers, you generate and share a short URL. Clicking the short URL redirects the user to the actual URL.

Following is an example of a Short URL:

Given the long URL https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-setup-api-key-with-console.html as input, the URL shortening service returns the short URL https://tinyurl.com/5xcEdf56

Services that offer this functionality include TinyURL and Bitly.

2. Requirements:

2.1 Functional Requirements

  1. The service should create a short URL for a long URL.

  2. Using a short URL should redirect the user to the original long URL.

  3. The service should allow the users to generate custom short URLs.

  4. The service should allow the users to set up TTL for the short URL.

  5. The service should allow the user to delete a short URL.

2.2 Non Functional Requirements

  1. Service should be up and running all the time.

  2. Service should be fast and reliable.

  3. Service should expose REST APIs so that it can be integrated with third-party applications.

  4. Service should be able to track the usage on a client level and have the flexibility to enable tier-based subscriptions.

3. Assumptions

  1. For every short URL created, there can be 100 reads, i.e. we assume a 1:100 write-to-read ratio.

  2. The system generates 100 million URLs per day

    1. Write operations = 100 million / 24 hours / 3600 seconds ≈ 1160 writes per second.

    2. Read operations = 100 million * 100 / 24 hours / 3600 seconds ≈ 116,000 reads per second.

  3. Assuming the service runs for 100 years, the total number of records generated by the system would be 100 million * 365 days * 100 years = 3650 billion records.

  4. Assuming a record size of 100 bytes, the total storage requirement would be 3650 billion records * 100 bytes = 365 TB.
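
The estimates above can be reproduced with a few lines of arithmetic. This is only a sketch of the calculation under the stated assumptions; the constant names are illustrative and leap days are ignored.

```python
# Back-of-the-envelope capacity estimate from the assumptions above.
SECONDS_PER_DAY = 24 * 3600        # 86,400 seconds
URLS_PER_DAY = 100_000_000         # 100 million new short URLs per day
READS_PER_WRITE = 100              # 1:100 write-to-read ratio
YEARS = 100                        # assumed service lifetime
RECORD_BYTES = 100                 # assumed size of one record

writes_per_second = URLS_PER_DAY / SECONDS_PER_DAY
reads_per_second = writes_per_second * READS_PER_WRITE
total_records = URLS_PER_DAY * 365 * YEARS
total_storage_tb = total_records * RECORD_BYTES / 10**12  # 1 TB = 10^12 bytes

print(f"writes per second: {writes_per_second:,.0f}")
print(f"reads per second:  {reads_per_second:,.0f}")
print(f"total records:     {total_records:,}")
print(f"total storage:     {total_storage_tb:,.0f} TB")
```

The write rate comes out at roughly 1,160 per second and the read rate at roughly 116,000 per second, one hundred times the write rate.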

4. High-Level Design

4.1 High-level flow

The following diagram depicts the high-level flow of URL shortening service

  • 4.1.1 Shortening Request

    1. The user triggers the short URL creation request by providing the Long URL from the browser.

    2. The browser sends the request to the URL Shortening Service

    3. URL Shortening Service creates the short URL for the input URL and stores the data in the Datastore.

    4. URL Shortening Service returns the short URL to the User.

  • 4.1.2 Redirection Request

    1. The user enters the short URL in the browser.

    2. The browser sends the request to the URL Shortening Service.

    3. URL Shortening Service queries the Datastore to fetch the actual URL corresponding to the short URL.

    4. URL Shortening Service sends the redirect URL to the browser.

4.2 APIs

To expose the functionalities we will use the following REST APIs

  • Shortening the URL

  • Redirecting the URL

  • Deleting the URL

4.2.1 Shortening the URL

API request definition for short URL creation

createShortUrl(actualURL, customURL, ttl)

// actualURL - Required Field. URL which needs to be shortened.
// customURL - Optional Field. Custom short URL.
// ttl - Optional Field. Time in seconds for which the short URL needs to be active.

Return Value: shortURL
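
As an illustration of how the three parameters might be checked before a short URL is created, here is a minimal sketch. The function name validate_create_request and the specific rules (http/https only, ASCII alphanumeric custom alias, positive ttl) are assumptions for this example and are not part of the API definition above.

```python
from typing import Optional
from urllib.parse import urlparse


def validate_create_request(actual_url: str,
                            custom_url: Optional[str] = None,
                            ttl: Optional[int] = None) -> dict:
    """Validate createShortUrl parameters and return them as a dict.

    Hypothetical helper: the validation rules are assumptions for this sketch.
    """
    parsed = urlparse(actual_url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError("actualURL must be an absolute http(s) URL")
    # Optional custom alias: restrict it to the base62 character set.
    if custom_url is not None and not (custom_url.isascii() and custom_url.isalnum()):
        raise ValueError("customURL must contain only 0-9, a-z and A-Z")
    # Optional time to live, in seconds.
    if ttl is not None and ttl <= 0:
        raise ValueError("ttl must be a positive number of seconds")
    return {"actualURL": actual_url, "customURL": custom_url, "ttl": ttl}


request = validate_create_request("https://example.com/some/long/path", ttl=3600)
print(request)
```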

4.2.2 Redirecting the URL

API request definition for redirecting the URL

redirectURL(url)

// url - Required Field. Shortened URL for which the actual URL needs to be fetched from the Datastore.

Returns an HTTP redirect response with HTTP code 302.
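
The redirect response could be built as in the sketch below. It assumes the Lambda sits behind API Gateway with proxy integration, where the handler returns a dict with statusCode, headers and body. The helper name build_redirect_response and the 404 behaviour for unknown short URLs are assumptions for illustration.

```python
from typing import Optional


def build_redirect_response(long_url: Optional[str]) -> dict:
    """Build an API Gateway proxy-style response for a redirect lookup.

    long_url is the value found in the Datastore, or None if the short URL
    is unknown (returning 404 in that case is an assumption of this sketch).
    """
    if long_url is None:
        return {"statusCode": 404, "body": "Short URL not found"}
    # 302 tells the browser to follow the Location header to the actual URL.
    return {"statusCode": 302, "headers": {"Location": long_url}, "body": ""}


print(build_redirect_response("https://example.com/some/long/path"))
```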

4.2.3 Deleting the URL

API request definition for deleting the URL

deleteURL(url)
// url - Required Field. Shortened URL to be deleted from the Datastore.
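
A minimal sketch of the delete operation is shown below, with a plain dict standing in for the Datastore. The function name delete_short_url and the status codes (204 on success, 404 when the short URL does not exist) are assumptions, since the API definition does not specify a return value.

```python
def delete_short_url(store: dict, short_url: str) -> dict:
    """Remove the mapping for short_url and return an API-style response.

    store is an in-memory stand-in for the Datastore (assumption).
    """
    if store.pop(short_url, None) is None:
        return {"statusCode": 404, "body": "Short URL not found"}
    return {"statusCode": 204, "body": ""}


store = {"UASDXFr": "https://example.com/some/long/path"}
print(delete_short_url(store, "UASDXFr"))  # first call removes the mapping
print(delete_short_url(store, "UASDXFr"))  # second call finds nothing
```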

4.3 Design Diagram

4.3.1 Shortening URL request:

  1. The user client (e.g. a browser) calls the URL Shortening service API Gateway with the user’s API key to create the short URL.

  2. API Gateway triggers the ShortURLHandler Lambda to create the short URL for the actual URL in the request.

  3. ShortURLHandler Lambda stores the URL mapping, along with the other request parameters, in DynamoDB and returns the short URL to the client.

4.3.2 Redirect URL request:

  1. The browser calls the URL Shortening service API Gateway to get the actual URL.

  2. API Gateway triggers the RedirectURLHandler Lambda to get the actual URL for the short URL in the request.

  3. RedirectURLHandler Lambda queries DynamoDB with the short URL to fetch the actual URL and returns it to the client as an HTTP 302 redirect.

4.3.3 Delete URL request:

  1. The user client calls the URL Shortening service API Gateway with the user’s API key to delete the short URL.

  2. API Gateway triggers the DeleteURLHandler Lambda, which deletes the short URL from DynamoDB.

4.4 Components

4.4.1 Short URL generator

We use two techniques for creating the short URL: URL encoding and key generation.

  1. URL encoding

    1. Base62 (Preferred)

    2. MD5

  2. Key Generation

4.4.1.1 URL Encoding through base62

A base is the set of digits or characters used to represent a number. Base 10 uses the digits [0–9], which we use in everyday life, and base 62 uses the characters [0–9], [a–z] and [A–Z].
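
To make the idea concrete, here is a small sketch of base62 encoding and decoding. The ordering of the alphabet (digits, then uppercase, then lowercase) is an assumption; it is the ordering under which the timestamp 1713585359347 used in the Key Generation section encodes to UASDXFr. The function names are illustrative.

```python
import string

# 62 symbols: 0-9, A-Z, a-z (the ordering is an assumption of this sketch).
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase


def base62_encode(number: int) -> str:
    """Convert a non-negative integer to its base62 representation."""
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return ALPHABET[0]
    digits = []
    while number:
        number, remainder = divmod(number, 62)
        digits.append(ALPHABET[remainder])
    # Remainders come out least-significant first, so reverse them.
    return "".join(reversed(digits))


def base62_decode(text: str) -> int:
    """Convert a base62 string back to an integer."""
    number = 0
    for char in text:
        number = number * 62 + ALPHABET.index(char)
    return number


print(base62_encode(1713585359347))   # UASDXFr
print(base62_decode("UASDXFr"))       # 1713585359347
```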

As per our assumptions made for the system, it should be able to handle 3650 billion records which in turn means 3650 billion unique short URLs. Following are the unique URLs that can be generated for different URL lengths.

URL Length    Unique Records
5             ~916 million
6             ~56 billion
7             ~3500 billion

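A URL of length 7 gives about 3520 billion combinations, which roughly matches our estimate of 3650 billion unique short URLs, so we use a URL of length 7. Strictly speaking, 7 characters fall slightly short of the 100-year estimate, and an 8th character would be needed to cover it in full.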
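
The key space per URL length can be checked directly. The sketch below computes 62 to the power of the length and finds the shortest length that strictly covers the 3650 billion estimate; the helper names are illustrative. Note that 7 characters give about 3520 billion keys, slightly below the estimate, so the strict answer under the stated assumptions is 8.

```python
BASE = 62
TARGET_RECORDS = 3650 * 10**9  # 3650 billion, from the assumptions section


def keyspace(length: int) -> int:
    """Number of distinct base62 strings of exactly this length."""
    return BASE ** length


def min_length(target: int) -> int:
    """Shortest length whose key space is at least the target."""
    length = 1
    while keyspace(length) < target:
        length += 1
    return length


for length in (5, 6, 7, 8):
    print(f"{length} characters: {keyspace(length):,} keys")
print("shortest length covering the target:", min_length(TARGET_RECORDS))
```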

Possible options to create Base 62:

  1. Create Short URLs from random numbers: Generate a random number and convert it to base62. As more records get added to the Datastore the chances of collision increase.

  2. Using a counter or key generation technique: This is commonly used for server-based services. It uses a centralized key generation service or ZooKeeper to distribute keys to each server, thereby avoiding collisions. See more under the Key Generation section.

4.4.1.2 URL encoding through MD5

The MD5 message-digest algorithm is a widely used hash function that produces a 128-bit hash value (32 hexadecimal digits). We can use these 32 hexadecimal digits to generate 7-character tiny URLs. This approach has a higher chance of collision, because the first 7 characters of the generated hash can be the same for multiple URLs, which in turn increases the number of Datastore queries.

Steps

  1. Encode the long URL using MD5 and take the first 7 characters.

  2. The first 7 characters could be the same for different long URLs, so query the Datastore to check for a collision.

  3. Store the generated URL if it is not already present in the Datastore. If the short URL is already present, try the next 7 characters and so on.
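
The steps above can be sketched as follows. A dict stands in for the Datastore, and "the next 7 characters" is interpreted here as the next non-overlapping 7-character window of the hex digest; both are assumptions of this sketch, as is the function name md5_short_key.

```python
import hashlib


def md5_short_key(long_url: str, existing: dict, length: int = 7) -> str:
    """Pick a short key from the MD5 hex digest of long_url.

    existing maps short keys to long URLs and stands in for the Datastore.
    """
    digest = hashlib.md5(long_url.encode("utf-8")).hexdigest()  # 32 hex chars
    # Try characters 0-6, then 7-13, 14-20 and 21-27 of the digest.
    for start in range(0, len(digest) - length + 1, length):
        candidate = digest[start:start + length]
        owner = existing.get(candidate)
        # Free key, or the same long URL was already shortened to this key.
        if owner is None or owner == long_url:
            return candidate
    raise RuntimeError("all candidate keys collide; another strategy is needed")


datastore = {}
key = md5_short_key("https://example.com/some/long/path", datastore)
datastore[key] = "https://example.com/some/long/path"
print(key)
```

Because a hex digest only uses 16 distinct characters, 7 hex characters give 16^7 (about 268 million) possible keys, far fewer than 7 base62 characters, which is one more reason this option is not the preferred one.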

4.4.1.3 Key Generation

To create a unique short URL in the serverless system with minimal collisions, we will use timestamps as the key generator.

The epoch time (in milliseconds) 1713585359347 converts to the base 62 value UASDXFr, which can be used as the unique key for the short URL. In case of a conflict, we can append the first letter of the API Gateway request ID or of the user’s apiKey. For example, suppose two requests arrive at the same timestamp 1713585359347, with base62 value UASDXFr, and the user API key is AIzaSyDaGmWKa4JsXZ-HjGw7ISLn_3namBGewQe. To resolve the conflict while writing to DynamoDB, we append the first letter A from the API key, creating the short URL UASDXFrA.
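
The timestamp-based key generation described above might look like the sketch below. A set stands in for the keys already stored in DynamoDB, and the alphabet ordering (digits, uppercase, lowercase) is the one under which 1713585359347 encodes to UASDXFr. Appending further characters of the API key when one extra letter still conflicts is an assumption that goes beyond the text; the function names are illustrative.

```python
import string
import time
from typing import Optional

ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase


def base62_encode(number: int) -> str:
    """Convert a non-negative integer to base62."""
    if number == 0:
        return ALPHABET[0]
    digits = []
    while number:
        number, remainder = divmod(number, 62)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


def generate_short_key(existing_keys: set, api_key: str,
                       now_ms: Optional[int] = None) -> str:
    """Create a key from the epoch time in ms, resolving conflicts with
    successive characters of the caller's API key."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    key = base62_encode(now_ms)
    suffix_chars = iter(api_key)
    # In a real deployment this check would be a conditional write to the
    # data store rather than a set lookup (assumption of this sketch).
    while key in existing_keys:
        try:
            key += next(suffix_chars)
        except StopIteration:
            raise RuntimeError("could not resolve key conflict") from None
    return key


api_key = "AIzaSyDaGmWKa4JsXZ-HjGw7ISLn_3namBGewQe"
print(generate_short_key(set(), api_key, now_ms=1713585359347))        # UASDXFr
print(generate_short_key({"UASDXFr"}, api_key, now_ms=1713585359347))  # UASDXFrA
```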

4.4.2 Database

4.4.2.1 Data Access Patterns

  • Fetch the actual Long URL with the short URL.

We will go with DynamoDB as the data store. Following is the DB schema:

Field       Type            Description
shortURL    partitionKey    Shortened URL that maps to the actual long URL.
longURL     -               URL which was shortened.
ttl         -               Time in seconds the shortURL needs to be active. We can use the DynamoDB TTL functionality: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/TTL.html
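
As a sketch of how a record matching this schema could be assembled, the helper below builds the item as a plain dict. DynamoDB's TTL feature expects the TTL attribute to hold an expiry time as a Unix epoch timestamp in seconds, so the requested lifetime is added to the current time. The function name build_item is hypothetical, and the actual write to DynamoDB is left out.

```python
import time
from typing import Optional


def build_item(short_url: str, long_url: str,
               ttl_seconds: Optional[int] = None,
               now: Optional[int] = None) -> dict:
    """Build the record for one short URL.

    ttl_seconds is the lifetime requested by the client. The stored ttl
    attribute is the absolute expiry time in epoch seconds.
    """
    if now is None:
        now = int(time.time())
    item = {"shortURL": short_url, "longURL": long_url}
    if ttl_seconds is not None:
        item["ttl"] = now + ttl_seconds
    return item


print(build_item("UASDXFr", "https://example.com/some/long/path",
                 ttl_seconds=3600, now=1_700_000_000))
```

Expired items are removed by DynamoDB in the background rather than at the exact expiry instant, so the redirect handler may also want to compare the stored ttl with the current time before redirecting.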

4.5 Monitoring

The URL shortening service uses Amazon CloudWatch for monitoring system performance and business metrics.

Some of the success metrics to track

  1. Number of short URLs generated per day

  2. Number of redirections provided by the system per day

  3. Number of unique users using the Short URL service

4.6 Availability/Load balancer

Availability and load balancing are handled internally by AWS and do not require any infrastructure setup. According to the AWS SLA documentation, AWS commits to a monthly uptime of at least 99.95% for each AWS region.

4.7 Rate Limiting

Rate limiting or throttling can be set on AWS API Gateway. Following are some of the basic throttling settings:

  1. AWS throttling limits are applied across all accounts and clients in a region. These limits are set by AWS and clients cannot change them.

  2. Per-account limits are applied to all APIs in an account in a specified Region. The limits can be updated by contacting the AWS customer support.

  3. Per-API, per-stage throttling limits are applied at the API method level for a stage.

  4. Per-client throttling limits are applied to clients that use API keys associated with your usage plan as client identifier.

Reference: https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-request-throttling.html
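
API Gateway throttles requests using a token bucket algorithm, configured with a steady-state rate and a burst size. The sketch below only illustrates that algorithm in general terms; it is not how API Gateway is configured or implemented, and the class and parameter names are assumptions.

```python
class TokenBucket:
    """Minimal token bucket: rate tokens are added per second, up to burst."""

    def __init__(self, rate: float, burst: int) -> None:
        self.rate = rate            # tokens added per second
        self.capacity = burst       # maximum tokens the bucket can hold
        self.tokens = float(burst)  # start with a full bucket
        self.last = 0.0             # time of the previous request, in seconds

    def allow(self, now: float) -> bool:
        """Return True if a request arriving at time now is allowed."""
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1        # each request consumes one token
            return True
        return False                # bucket empty: request is throttled


bucket = TokenBucket(rate=1, burst=2)
# Three requests at t=0: the burst allows two, the third is throttled.
print([bucket.allow(0.0) for _ in range(3)])
# One second later a token has been refilled.
print(bucket.allow(1.0))
```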

Conclusion

The design can be easily extended to

  1. enable a tier-based subscription model in which users log in to the system and their requests are rate-limited based on their subscription.

  2. support Custom short URL.