Beyond API Gateway polling

A short while back, we wrote about building serverless APIs using AWS services. By using the AWS API Gateway, most of the heavy lifting is handled for you, so you can focus on the business logic that adds value to your products.

The API Gateway has made most REST APIs very easy to implement for quite some time now. It’s been a different story for calls triggering long-running processes on the server. A good example of this would be uploading a video and running some recognition job on it.

In the past, the recommended approach for such tasks was polling: the client regularly checks whether the job has completed.

Consumer regularly polls for updates
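
As a rough illustration of what polling looks like on the client side, here is a minimal sketch in Python. The `check_status` callable is a hypothetical stand-in for a web request to your API (something like "GET the status of job X"), and the interval and attempt limit are arbitrary values we picked for the example.

```python
import time
from typing import Callable, Optional


def poll_until_done(
    check_status: Callable[[], Optional[str]],
    interval_seconds: float = 5.0,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Call check_status repeatedly until it returns a result or attempts run out.

    check_status is a hypothetical callable standing in for a web request;
    it is assumed to return None while the job is still running.
    """
    for attempt in range(max_attempts):
        result = check_status()  # one web request per attempt
        if result is not None:
            return result
        if attempt < max_attempts - 1:
            # The client idles here, even if the job finishes in the meantime.
            sleep(interval_seconds)
    return None
```

Every pass through the loop costs a request, and a job that finishes right after a check is only noticed one full interval later.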

Polling, however, is a waste of resources: the client has to keep making web requests, and will usually wait longer than necessary because of the polling interval. But modern browsers have a way to simply ask the backend to send them the data when it is ready, without having to nag every few seconds: websockets. Since late 2018, the API Gateway has supported these natively! This means that it has become very easy to use websockets in your cloud-native serverless applications.

What are websockets

Websockets are a way to enable two-way communication between a client and a server. In particular, they are implemented by modern browsers to allow the server to actively push data to the browser.

Traditionally, a client requests data from the server, the server returns it, and the connection is closed. Before this can happen, a few round-trips are needed to set up the connection to the server. Unless the connection is reused, this has to happen for every request.

With websockets, a long-lived TCP connection is opened between two devices. For us, this will usually be a (web)server and a browser. We no longer need round-trips to set up each connection; instead, the client and the server can keep sending information to each other!

This is very useful for the use case where an API call triggers a process that takes any significant amount of time. After the process has completed, the server can send a message back to the client.

Service provider calls consumer back when done

An API Gateway setup using websockets

When using websockets with API Gateway, most of the connection logic is handled for you by the gateway. When a client requests a connection to your API, the special $connect route in your gateway is called. In the handler for this route, you have the option to reject the connection, for example when authentication fails or to throttle an IP address that makes too many requests.
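
To make the accept-or-reject decision concrete, here is a minimal sketch of a $connect handler as a Python Lambda function. It assumes a Lambda proxy integration and a client that passes a `token` query string parameter. The shared secret `EXPECTED_TOKEN` is purely hypothetical and only there for illustration; a real API would more likely validate a signed token or use an authorizer.

```python
from typing import Any, Dict

# Hypothetical shared secret, for illustration only.
EXPECTED_TOKEN = "demo-token"


def connect_handler(event: Dict[str, Any], context: Any) -> Dict[str, Any]:
    """Handle the $connect route: accept or reject a new websocket connection."""
    # queryStringParameters is missing or None when no query string was sent.
    params = event.get("queryStringParameters") or {}
    if params.get("token") != EXPECTED_TOKEN:
        # Returning a non-2xx status code makes the gateway refuse the connection.
        return {"statusCode": 401, "body": "Unauthorized"}

    connection_id = event["requestContext"]["connectionId"]
    print(f"Accepted connection {connection_id}")
    return {"statusCode": 200, "body": "Connected"}
```

A throttling check could follow the same pattern: look up the caller, and return a non-2xx status code when the connection should be refused.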

In this handler, you will also receive a connection ID for the connection. This ID can be used by the backend to send messages to the client, so make sure you store it in a persistent datastore.

When, for whatever reason, the client disconnects, the special $disconnect route is called. You could use this to mark the client as disconnected in your datastore, or to simply remove the connection. Maybe you want to cancel the job that was running, as there’s no longer a client to report back to.
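
As a sketch of how the connection bookkeeping could look, the handlers below store the connection ID on $connect and remove it again on $disconnect. We assume DynamoDB as the datastore, accessed through boto3. The table name, the `CONNECTIONS_TABLE` environment variable and the `connectionId` partition key are assumptions made for this example, not something the gateway prescribes. The optional `table` argument only exists so the logic can be tried out without AWS access.

```python
import os
from typing import Any, Dict, Optional

# Hypothetical table name; the table is assumed to use "connectionId"
# as its partition key.
TABLE_NAME = os.environ.get("CONNECTIONS_TABLE", "websocket-connections")


def _default_table() -> Any:
    # Imported lazily so the handlers can be exercised without boto3 installed.
    import boto3

    return boto3.resource("dynamodb").Table(TABLE_NAME)


def on_connect(event: Dict[str, Any], context: Any, table: Optional[Any] = None) -> Dict[str, Any]:
    """$connect: remember the connection ID so the backend can reach the client later."""
    table = table if table is not None else _default_table()
    connection_id = event["requestContext"]["connectionId"]
    table.put_item(Item={"connectionId": connection_id})
    return {"statusCode": 200}


def on_disconnect(event: Dict[str, Any], context: Any, table: Optional[Any] = None) -> Dict[str, Any]:
    """$disconnect: forget the connection ID, as there is no client left to report to."""
    table = table if table is not None else _default_table()
    connection_id = event["requestContext"]["connectionId"]
    table.delete_item(Key={"connectionId": connection_id})
    return {"statusCode": 200}
```

Instead of deleting the item, the $disconnect handler could also update it to mark the client as disconnected, or trigger cancellation of a running job.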

Code example

We have published a minimalistic demo of this setup on our Bitbucket, using CloudFormation to define the required infrastructure as code.

It features very simple $connect and $disconnect handlers, both of which just log something. In practice, you could store the connectionId they log in, for example, DynamoDB. This is the identifier you use to send messages to a client.

There is also a Lambda called ‘sendmessage’, which you can use to send something back to the connected client. You can call this Lambda with a connectionId and a message parameter, and it will send the message to the client that connected with the given connection ID.
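
For readers who want a feel for what such a function involves, here is a minimal sketch in Python. It is not the code from the demo. It assumes the Lambda is invoked with an event containing `connectionId` and `message` keys, and that a hypothetical `WEBSOCKET_ENDPOINT` environment variable holds the callback URL of your deployed API stage. The message itself is pushed with the `post_to_connection` call of the boto3 `apigatewaymanagementapi` client.

```python
import os
from typing import Any, Dict, Optional

# Hypothetical environment variable holding the callback URL of the deployed
# websocket API stage.
ENDPOINT_URL = os.environ.get("WEBSOCKET_ENDPOINT", "")


def _default_client() -> Any:
    # Imported lazily so the handler can be exercised without boto3 installed.
    import boto3

    return boto3.client("apigatewaymanagementapi", endpoint_url=ENDPOINT_URL)


def send_message_handler(
    event: Dict[str, Any], context: Any, client: Optional[Any] = None
) -> Dict[str, Any]:
    """Push a message to the client behind the given connection ID."""
    client = client if client is not None else _default_client()
    client.post_to_connection(
        ConnectionId=event["connectionId"],
        Data=event["message"].encode("utf-8"),  # the payload is sent as bytes
    )
    return {"statusCode": 200}
```

The optional `client` argument is only there to make the function easy to try out locally. A more complete version would also handle the case where the client has already disconnected, for instance by removing the stale connection ID from the datastore.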

There is a guide on deploying the demo in your own environment in the README file on Bitbucket, so you can easily experiment with it yourself!