A stateless token case study: Algolia search API

At work, we use Algolia to outsource the job of managing search infrastructure. One part of its API intrigued me. Algolia's server-side library lets us create "secured API keys" that we hand to our users' browsers. With one of these keys, a browser can search our Algolia data directly, but only through the filters we attached to the key.

For example, our Algolia account contains search data from Teams A, B and C. When a user from Team A logs in, our server generates an Algolia token for that user with a filter set to only show results from Team A's data.

The cool thing is, these secured tokens can be created without any calls to Algolia's servers, making them very lightweight and easy to use. I wanted to find out how Algolia were actually doing it! I had been reading a lot about JWTs and trying them out on some APIs, and was looking for a good use case for them. This seemed similar enough, but I could tell that what Algolia was creating were not actually JWTs.

The search

In our app, creating a "secured token" looks something like this:

$searchToken = SearchClient::generateSecuredApiKey($secret, [
    'filters' => 'team:' . $user->team_id,
]);

In the example, $secret is a server-side configuration value Algolia gives us, which we never share with clients. $searchToken gets sent to the client's browser on page load. Because creating a token doesn't require any API calls, we create new tokens on every page load, and could quickly refresh or modify them during a session if we needed to.

To work out what was actually contained in these tokens, I went digging in the source code of their PHP library. I found the relevant code here:

public static function generateSecuredApiKey($parentApiKey, $restrictions)
{
    $urlEncodedRestrictions = Helpers::buildQuery($restrictions);
    $content = hash_hmac('sha256', $urlEncodedRestrictions, $parentApiKey).$urlEncodedRestrictions;
    return base64_encode($content);
}

So the token that gets sent to a client will be structured something like this:

base64_encode(
    '8b02da15d77ee56bf593849cb4ca8494f2cff19403c8c0bd99fc362e91a5ec69'
    . 'filters=team%3A123'
)

The client could decode this and pull out the query parameter data if it wanted, but any change to that data would mean the HMAC at the start of the token no longer matches, and therefore Algolia wouldn't accept the token from the client.
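The verification side isn't part of the library code above, so what follows is only a guess at what such a check might look like, not Algolia's actual implementation. The function name verifySecuredApiKey and the example secret are made up, and http_build_query stands in for the library's Helpers::buildQuery. The sketch relies on the fact that a hex-encoded SHA-256 HMAC is always 64 characters long, so the token can be split at a fixed offset.

```php
<?php
// Hypothetical verifier: a sketch of the server-side check, not Algolia's code.
function verifySecuredApiKey(string $token, string $parentApiKey): ?array
{
    $content = base64_decode($token, true);
    // A SHA-256 HMAC in hex is always 64 characters long.
    if ($content === false || strlen($content) < 64) {
        return null;
    }
    $signature = substr($content, 0, 64);
    $query = substr($content, 64);

    // Recompute the HMAC over the query string and compare in constant time.
    $expected = hash_hmac('sha256', $query, $parentApiKey);
    if (!hash_equals($expected, $signature)) {
        return null;
    }

    // Signature is good, so the restrictions can be trusted.
    parse_str($query, $restrictions);
    return $restrictions;
}

// Build a token the same way generateSecuredApiKey does.
$secret = 'example-parent-api-key'; // made-up value
$query = http_build_query(['filters' => 'team:123']);
$token = base64_encode(hash_hmac('sha256', $query, $secret) . $query);

// Tamper with it: keep the old signature but ask for another team's data.
$decoded = base64_decode($token);
$forged = base64_encode(substr($decoded, 0, 64) . 'filters=team%3A456');

var_dump(verifySecuredApiKey($token, $secret));  // ['filters' => 'team:123']
var_dump(verifySecuredApiKey($forged, $secret)); // NULL
```

The forged token fails because the client doesn't know $secret, so it has no way to produce a matching HMAC for the new query string.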

If it quacks like a JWT

These Algolia tokens obviously don't include any JSON; they encode their payload data as a URL query string instead. But you could achieve a similar result using a JWT. Both are ways to send data between two trusted services via an untrusted intermediary. The data is unencrypted, so the client can inspect the data. But because of the cryptographic signature attached to the data, the client cannot modify the data without detection.
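For comparison, here is a rough sketch of the same restriction carried in a hand-rolled HS256 JWT. It is for illustration only: Algolia doesn't accept JWTs, the helper names are made up, and "filters" is a hypothetical claim name. In a real project a maintained JWT library would be a better choice than rolling your own.

```php
<?php
// Base64url without padding, which is the encoding JWTs use.
function base64UrlEncode(string $data): string
{
    return rtrim(strtr(base64_encode($data), '+/', '-_'), '=');
}

// Hypothetical helper: builds an HS256-signed JWT by hand.
function makeJwt(array $claims, string $secret): string
{
    $header = base64UrlEncode(json_encode(['alg' => 'HS256', 'typ' => 'JWT']));
    $payload = base64UrlEncode(json_encode($claims));
    // HS256 is HMAC-SHA256 over "header.payload", using the raw binary digest.
    $signature = hash_hmac('sha256', "$header.$payload", $secret, true);
    return "$header.$payload." . base64UrlEncode($signature);
}

$jwt = makeJwt(['filters' => 'team:123'], 'example-parent-api-key');
echo $jwt, "\n";
```

The shape is the same as Algolia's token: readable payload, plus an HMAC that only someone holding the secret can produce. The differences are in the packaging (JSON and dot-separated base64url segments rather than a query string appended to a hex digest).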

The general principle works like this:



  1. We copy the secret (the parent API key) from Algolia to our servers "manually" (or via config management software)
  2. Our server creates a secured token for a specific user when that user needs to search, with parameters specific to that user
  3. The token is shared with the client
  4. The client uses the token, as well as other identifying information, to make requests directly to Algolia
  5. Algolia checks that the token is correct (has not been tampered with), then extracts the parameters and performs the query the client requested

Beyond the initial sharing of the secret between Algolia and our own servers, we don't need to send requests to Algolia's API; the client can communicate with them directly when searching, which is great.

Is it really stateless?

There's an important subtlety to notice here. Our "shared secret" is only shared between Algolia and our company. It's different for every Algolia customer (and even every registered application belonging to the same customer). Most JWT tutorials sign every token with a single secret per service, which would be like every Algolia customer using the same shared secret. With a secret per application, the verifier has to work out which application a token claims to belong to before it can check the signature, so "stateless" doesn't mean quite the same thing here.

In step 5 of the list above, which we could write as validate(token, app), Algolia must look up the shared secret belonging to app in order to check that token's signature is valid. Depending on how this is implemented, it might require database lookups, etc., but that's for Algolia to optimise. From our perspective when creating tokens, no round-trips to Algolia are required.
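To make the lookup concrete, here is a small sketch of what validate(token, app) could look like. Everything in it is assumed: the APP_SECRETS array is a stand-in for whatever storage Algolia really uses, and the application IDs and secrets are invented.

```php
<?php
// Hypothetical in-memory stand-in for wherever the per-application secrets live.
const APP_SECRETS = [
    'app-a' => 'secret-for-app-a',
    'app-b' => 'secret-for-app-b',
];

function validate(string $token, string $app): bool
{
    // The one stateful step: find the secret that belongs to this application.
    $secret = APP_SECRETS[$app] ?? null;
    $content = base64_decode($token, true);
    if ($secret === null || $content === false || strlen($content) < 64) {
        return false;
    }
    // First 64 characters are the hex HMAC, the rest is the signed query string.
    $expected = hash_hmac('sha256', substr($content, 64), $secret);
    return hash_equals($expected, substr($content, 0, 64));
}

$query = 'filters=team%3A123';
$token = base64_encode(hash_hmac('sha256', $query, APP_SECRETS['app-a']) . $query);

var_dump(validate($token, 'app-a')); // bool(true)
var_dump(validate($token, 'app-b')); // bool(false)
```

The token itself needs no per-token storage, but the verifier still has to keep one secret per application, and a token signed for one application is useless against another.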

