0x1F URL Service Outline

Introduction

This document gives a brief outline of the 0x1F URL shortening & redirect service.

The service has two main functions:

  1. To replace any URL which is to be encoded in a QR tag with a shorter, constant-length URL, which can be encoded in a low-resolution tag.
  2. To redirect requests for a shortened URL to the original, full-length URL, recording and tracking all requests as they are made.

An example of a shortened URL is:

http://0x1f.ie/H83if

The service is hosted at the domain '0x1f.ie'. The URL path is a 5-digit, base-62 encoded number using the character set [a-zA-Z0-9]. This scheme allows a total of 62^5 (916,132,832) unique URLs to be hosted by the service. The total length of the shortened URL is 20 characters. This allows the URL to be encoded in the lowest resolution QR tag possible.
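
The path scheme described above can be illustrated with a small sketch. The outline only fixes the character set [a-zA-Z0-9] and the 5-digit length; the digit ordering used here (a-z, then A-Z, then 0-9) and the function names `encode_id`, `decode_id` and `short_url` are assumptions made for illustration.

```python
# Assumed digit order: a-z, A-Z, 0-9. The outline names the character set
# but does not say which character stands for which digit value.
ALPHABET = (
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "0123456789"
)
BASE = len(ALPHABET)           # 62
ID_LENGTH = 5                  # fixed path length
CAPACITY = BASE ** ID_LENGTH   # 62^5 = 916,132,832 distinct IDs


def encode_id(n):
    """Encode an integer as a fixed-width, 5-digit base-62 string."""
    if not 0 <= n < CAPACITY:
        raise ValueError("ID out of range for a 5-digit base-62 number")
    digits = []
    for _ in range(ID_LENGTH):
        n, remainder = divmod(n, BASE)
        digits.append(ALPHABET[remainder])
    # Digits were produced least significant first, so reverse them.
    return "".join(reversed(digits))


def decode_id(text):
    """Decode a 5-digit base-62 string back to its integer value."""
    if len(text) != ID_LENGTH:
        raise ValueError("ID must be exactly 5 characters long")
    n = 0
    for ch in text:
        # str.index raises ValueError for characters outside the set.
        n = n * BASE + ALPHABET.index(ch)
    return n


def short_url(n):
    """Build the full shortened URL for a numeric ID."""
    return "http://0x1f.ie/" + encode_id(n)
```

Padding every ID to five digits keeps the shortened URL at a constant 20 characters regardless of the numeric value.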

The service will be implemented using procjs backed by a couchdb database. Initially, procjs will only be required for the URL redirection part of the service. Creation and update of shortened URLs will be handled entirely by couchdb.

Creating shortened URLs

A shortened URL is created by inserting a document into the couchdb database. Each document has the following format:

    {
        "url":"http://innerfunction.com/wp/whelans/?p=14",
        "inactive":false
    }

Where the 'url' property is the full URL which the service will redirect to. The 'inactive' property is a flag specifying whether the URL is in active use. If this property is true then the service will not redirect to the URL when requested, returning a 404 error instead. The property has a default value of 'false'.

The document ID is the 5-digit base-62 number which is represented in the shortened URL's path.

The full URL associated with a shortened URL can be rewritten at any time by updating the document. This allows a QR tag encoding a shortened URL to be reused multiple times.

This part of the service is hosted entirely on couchdb using its HTTP API. The service will use standard couchdb access controls to prevent unauthorized update of the system (see http://wiki.apache.org/couchdb/Technical%20Overview#Security_and_Validation). The couchdb API is hosted at http://admin.0x1f.ie/ so that it won't overlap with the URL redirection service (see the next section).
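
As a rough sketch of what creating or updating a shortened URL might look like from a client, the following builds (but does not send) an HTTP PUT request for a document. CouchDB stores a document with `PUT /{db}/{docid}` and expects the current `_rev` in the body when an existing document is updated. The database name `urls` and the function name `build_put_request` are hypothetical, and authentication is left out.

```python
import json
import urllib.request

ADMIN_BASE = "http://admin.0x1f.ie"
DB_NAME = "urls"  # hypothetical database name, not given in this outline


def build_put_request(short_id, url, inactive=False, rev=None):
    """Build a PUT request that stores the document for one shortened URL.

    Pass rev (the document's current _rev) when rewriting an existing
    document; leave it as None when creating a new one.
    """
    doc = {"url": url, "inactive": inactive}
    if rev is not None:
        doc["_rev"] = rev
    return urllib.request.Request(
        "%s/%s/%s" % (ADMIN_BASE, DB_NAME, short_id),
        data=json.dumps(doc).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Example: the request that would create the document for /H83if.
request = build_put_request(
    "H83if", "http://innerfunction.com/wp/whelans/?p=14"
)
```

The request would be sent with `urllib.request.urlopen(request)` once credentials for the couchdb instance have been added.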

URL redirection

All HTTP requests to the domain 0x1f.ie are treated as redirection requests and handled as follows:

  1. If the request is anything other than a GET request then a 501 Not Implemented error is returned. (TODO: Should also handle HEAD requests).
  2. The request path is extracted from the request.
  3. The document whose ID matches the request path is read from the couchdb instance.
  4. If no document is found then a 404 error is returned.
  5. If a document is found but the URL is inactive then a 404 error is returned.
  6. Otherwise, a 302 redirect response is returned, with its Location header set to the URL in the document.
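
The steps above can be sketched as a single function. This is an illustration only, not the procjs implementation: the function name `handle_request` is hypothetical, and a plain dictionary keyed by document ID stands in for the couchdb lookup.

```python
def handle_request(method, path, documents):
    """Return (status_code, headers) for one redirection request.

    documents is a dict mapping document IDs to documents; it is a
    stand-in for reading the document from couchdb.
    """
    # Step 1: only GET is supported.
    if method != "GET":
        return 501, {}
    # Step 2: the request path, minus its leading slash, is the document ID.
    doc_id = path.lstrip("/")
    # Step 3: read the matching document.
    doc = documents.get(doc_id)
    # Step 4: unknown ID.
    if doc is None:
        return 404, {}
    # Step 5: known ID, but flagged inactive ('inactive' defaults to false).
    if doc.get("inactive", False):
        return 404, {}
    # Step 6: redirect to the full URL.
    return 302, {"Location": doc["url"]}


# Example data for trying the function out.
example_documents = {
    "H83if": {"url": "http://innerfunction.com/wp/whelans/?p=14"},
    "aaaab": {"url": "http://example.com/", "inactive": True},
}
```

Keeping the lookup behind a simple mapping makes each branch easy to exercise without a running database.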

Logging

All requests are logged. Each log entry is a space separated list of fields terminated by a newline character. Each log entry has the following standard fields:

  1. Timestamp: In yyyymmddhhmmss.s format.
  2. The HTTP client's IP address.
  3. The client tracking ID (see below).
  4. The client's User Agent string. The string is URL encoded so that no spaces appear within it.
  5. The request path.
  6. The response code. One of:
The following field is included in the log entry when the response code is OK:
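
The six standard fields listed above could be assembled into a log line roughly as follows. This is a sketch under assumptions: the function name `format_log_entry` is hypothetical, the timestamp is taken as given (the outline does not name a time zone), the '.s' part is read as tenths of a second, and only the six standard fields are covered.

```python
from datetime import datetime
from urllib.parse import quote


def format_timestamp(moment):
    """Format a datetime as yyyymmddhhmmss.s (tenths of a second assumed)."""
    tenths = moment.microsecond // 100000
    return "%s.%d" % (moment.strftime("%Y%m%d%H%M%S"), tenths)


def format_log_entry(moment, client_ip, tracking_id, user_agent, path, status):
    """Join the six standard fields with spaces and end with a newline."""
    fields = [
        format_timestamp(moment),
        client_ip,
        tracking_id,
        # Percent-encode everything outside the unreserved set so the
        # User Agent string contains no spaces.
        quote(user_agent, safe=""),
        path,
        str(status),
    ]
    return " ".join(fields) + "\n"


# Example with made-up values.
entry = format_log_entry(
    datetime(2011, 3, 4, 5, 6, 7, 800000),
    "10.0.0.1",
    "abc",
    "Mozilla/5.0 (X11; Linux)",
    "/H83if",
    302,
)
```

Because the User Agent is the only free-text field, encoding it is enough to keep each entry splittable on single spaces.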

Tracking

All clients are tracked using cookies. Each service response includes a cookie containing a unique identifier. The very first request to the service from any client won't contain a tracking cookie. The service will then generate a unique identifier by concatenating the following:

This unique identifier will then be returned to the client in a cookie contained in the HTTP response. Subsequent requests from the same client will contain the tracking cookie. Subsequent responses will echo the tracking cookie back to the client.
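
The cookie round trip described above might look like the following sketch. The cookie name `tid` and the function name `resolve_tracking_id` are hypothetical. The identifier generator is also a placeholder: a random UUID stands in for the service's own concatenation scheme, which is not reproduced here.

```python
import uuid
from http.cookies import SimpleCookie

COOKIE_NAME = "tid"  # hypothetical cookie name, not given in this outline


def resolve_tracking_id(cookie_header, make_id=lambda: uuid.uuid4().hex):
    """Return (tracking_id, set_cookie_value) for one request.

    cookie_header is the raw Cookie request header, or None if absent.
    make_id is a placeholder generator used only when the client sent
    no tracking cookie.
    """
    jar = SimpleCookie()
    if cookie_header:
        jar.load(cookie_header)
    if COOKIE_NAME in jar:
        # Returning client: echo the existing identifier back.
        tracking_id = jar[COOKIE_NAME].value
    else:
        # First request from this client: issue a new identifier.
        tracking_id = make_id()
    outgoing = SimpleCookie()
    outgoing[COOKIE_NAME] = tracking_id
    outgoing[COOKIE_NAME]["path"] = "/"
    # OutputString gives the value for a Set-Cookie response header.
    return tracking_id, outgoing[COOKIE_NAME].OutputString()
```

The returned identifier is the value that would be written to the tracking ID field of the request's log entry.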