
Lessons in object-oriented design

I will be re-reading the following quotes from Sandi Metz's Practical Object-Oriented Design every morning before I start coding:

Object-oriented design is about managing dependencies. It is a set of coding techniques that arrange dependencies such that objects can tolerate change. In the absence of design, unmanaged dependencies wreak havoc because objects know too much about one another. Changing one object forces change upon its collaborators, which in turn forces change upon its collaborators, ad infinitum. A seemingly insignificant enhancement can cause damage that radiates outward in overlapping concentric circles, ultimately leaving no code untouched.

This makes sense of why dependency injection containers exist, why the principle of inversion of control matters, and how to split classes up into the right responsibilities.

Asserting that code should be easy to change is akin to stating that children should be polite; the statement is impossible to disagree with, yet it in no way helps a parent raise an agreeable child. The idea of easy is too broad; you need concrete definitions of easiness and specific criteria by which to judge code. If you define easy to change as

  • changes have no unexpected side effects,
  • small changes in requirements require correspondingly small changes in code,
  • existing code is easy to reuse, and
  • the easiest way to make a change is to add code that in itself is easy to change,

then the code you write should have the following qualities. Code should be

  • Transparent The consequences of change should be obvious in the code that is changing and in distant code that relies upon it.
  • Reasonable The cost of any change should be proportional to the benefits the change achieves.
  • Usable Existing code should be usable in new and unexpected contexts.
  • Exemplary The code itself should encourage those who change it to perpetuate these qualities.

Code that is Transparent, Reasonable, Usable, and Exemplary (TRUE) not only meets today’s needs but can also be changed to meet the needs of the future. The first step in creating code that is TRUE is to ensure that each class has a single, well-defined responsibility.

Optimising for unknown future requirements seems like a great way to go about writing code that's usable today as well.

A stateless token case study: Algolia search API

At work, we use Algolia to outsource the job of managing search infrastructure. One part of its API intrigued me. Algolia's server-side library allows us to create "secured API keys" to give to our users (i.e., browsers), with which our users can perform searches over our Algolia data with filters.

For example, our Algolia account contains search data from Teams A, B and C. When a user from Team A logs in, our server generates an Algolia token for that user with a filter set to only show results from Team A's data.

The cool thing is, these secured tokens can be created without any calls to Algolia's servers, making them very lightweight and easy to use. I wanted to find out how Algolia was actually doing it! I had been reading a lot about JWTs, trying them out on some APIs, and looking for a good use-case for them. This seemed similar enough, but I could tell the tokens Algolia was creating were not actually JWTs.

The search

In our app, creating a "secured token" looks something like this:

use Algolia\AlgoliaSearch\SearchClient;

$searchToken = SearchClient::generateSecuredApiKey($secret, [
    'filters' => 'team:' . $user->team_id,
]);

In the example, $secret is a server-side configuration value Algolia gives us, which we never share with clients. $searchToken gets sent to the client's browser on page load. Because creating a token doesn't require any API calls, we create new tokens on every page load, and could quickly refresh or modify them during a session if we needed to.

To work out what was actually contained in these tokens, I went digging in the source code of their PHP library. I found the relevant code here:

public static function generateSecuredApiKey($parentApiKey, $restrictions)
{
    $urlEncodedRestrictions = Helpers::buildQuery($restrictions);
    $content = hash_hmac('sha256', $urlEncodedRestrictions, $parentApiKey).$urlEncodedRestrictions;
    return base64_encode($content);
}

So the token that gets sent to a client will be structured something like this:

base64_encode(
    '8b02da15d77ee56bf593849cb4ca8494f2cff19403c8c0bd99fc362e91a5ec69'
    . 'filters=team%3A123'
)

The client could decode this and pull out the query parameter data if it wanted, but any change to it would make the initial HMAC invalid, and therefore Algolia wouldn't accept the token from the client.
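
Algolia's servers must then reverse this process to validate incoming tokens. I don't know their actual implementation, but based on the client code above it presumably looks something like this sketch (the function name and structure are my guesses):

const crypto = require('crypto');

function validateToken(token, secret) {
    const content = Buffer.from(token, 'base64').toString();
    // a hex-encoded SHA-256 HMAC is always 64 characters long
    const signature = content.slice(0, 64);
    const restrictions = content.slice(64); // e.g. 'filters=team%3A123'
    const expected = crypto
        .createHmac('sha256', secret)
        .update(restrictions)
        .digest('hex');
    if (signature.length !== expected.length ||
        !crypto.timingSafeEqual(Buffer.from(signature), Buffer.from(expected))) {
        throw new Error('token has been tampered with');
    }
    return new URLSearchParams(restrictions);
}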

If it quacks like a JWT

These Algolia tokens obviously don't include any JSON; they encode their payload data as a URL query string instead. But you could achieve a similar result using a JWT. Both are ways to send data between two trusted services via an untrusted intermediary. The data is unencrypted, so the client can inspect the data. But because of the cryptographic signature attached to the data, the client cannot modify the data without detection.
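
For illustration, here's roughly how the same trick looks with the jsonwebtoken package from NPM (a sketch of the equivalent approach, not Algolia's actual mechanism):

const jwt = require('jsonwebtoken');

const secret = 'shared-secret-from-algolia'; // never sent to the client

// the trusted server signs the restrictions into the token...
const token = jwt.sign({ filters: 'team:123' }, secret);

// ...and the receiving service verifies it; verify() throws if the
// payload was modified in transit
const payload = jwt.verify(token, secret);
console.log(payload.filters); // 'team:123'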

The general principle works like this:



  1. We copy the secret token from Algolia to our servers "manually" (or via config management software)
  2. Our server creates a secured token for a specific user when that user needs to search, with parameters specific to that user
  3. The token is shared with the client
  4. The client uses the token, as well as other identifying information, to make requests directly to Algolia
  5. Algolia checks that the token is correct (has not been tampered with) - call this step validate(token, app) - then extracts the parameters and performs the query the client requested, as sketched below
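
In hypothetical pseudocode (all the names here are mine, not Algolia's):

// our server, once per page load; no network round-trips required
const token = generateSecuredApiKey(secret, { filters: 'team:123' });
sendToClient(token);

// Algolia's side, on each search request from the client
function validate(token, app) {
    const sharedSecret = lookupSharedSecretFor(app); // per-customer, per-application
    return validateToken(token, sharedSecret);       // as sketched earlier
}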

Beyond the initial sharing of the secret between Algolia and our own servers, we don't need to send requests to Algolia's API; the client can communicate with them directly when searching, which is great.

Is it really stateless?

There's an important subtlety to notice here. Our "shared secret" is only shared between Algolia and our company. It's different for every Algolia customer (and even for every registered application belonging to the same customer). Most JWT tutorials sign every token with a single secret belonging to the service - as if every Algolia customer were using the same shared secret. This complicates what "stateless" really means here.

In the step described as validate(token, app), Algolia must look up the shared secret belonging to app, in order to check that token's signature is valid. Depending on how this is implemented, it might require database lookups, etc., but that's for Algolia to optimise. From our perspective when creating tokens, no round-trips to Algolia are required.

JavaScript's ecosystem is uniquely paranoid

Another week, another NPM-related snafu. Why does this keep happening to the JavaScript ecosystem? The answer is paranoia. 😱


Many are quick to assert that JavaScript just has too low a barrier to entry and the n00bs are messing it up. Whenever anyone says "JavaScript is the new PHP!" this is probably what they mean. I don't feel the need to provide evidence against this claim; I think it comes from an understandable frustration, expressed through all-too-common tech elitism.

Others say we should blame resume-driven-development, and the ego boost of having published thousands of open-source modules. We must all suffer, the argument goes, because a few obsessive individuals want to be able to boast about how critical they personally are to the JavaScript ecosystem. While this is probably a real trend, why isn't it more prevalent in other open-source ecosystems?

Disclaimer before proceeding: I use JS every day, and I actually really like it. I'm not trying to criticise it, just to explore some unique problems it has. I hope this post doesn't come across as too harsh.

There are probably many contributing factors that have shaped NPM into what it is today. However, I assert that the underlying reason for the bizarre profusion of tiny, absurd-seeming one-liner packages on NPM is paranoia, caused by a unique combination of factors.

JavaScript makes you paranoid

Three factors have caused a widespread cultural paranoia among JavaScript developers. This has been inculcated over years. These factors are: JavaScript's weak dynamic type system; the diversity of runtimes JavaScript targets; and the fact of deploying software on the web.

1. Weak dynamic typing

It's well-known that JavaScript's "type system" leaves a lot to be desired. This famous talk is a humorous take on some of the many ways you can shoot yourself in the foot in JavaScript.

Unless your team (and every open-source package your team depends on) always uses ===, knows exactly when typeof is acceptable, is good at defensive programming, and designs APIs that have good type discipline*, you've probably been tripped up by a string that behaved like a number, a 0 that was skipped for being falsy, an undefined turning up somewhere surprising, typeof null === 'object', etcetera.
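
If you've somehow avoided all of these, here are a few of the classics:

'2' + 2           // '22' - string concatenation
'2' * 2           // 4    - implicit numeric coercion
'' == 0           // true
typeof null       // 'object'
0 || 'default'    // 'default', even when 0 was a legitimate value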

This isn't entirely unique to JavaScript - many languages have dynamic types, and many languages have weak types and implicit coercions. But I would argue JavaScript is quite a dire example. And this is still an important contributing factor, without which the second factor probably wouldn't be as significant.

*Or, you are TypeScript users. See Appendix 3.

2. Browser runtimes

It's not just the case that "JavaScript is missing a standard library". For example, there is a really easy and straightforward "standard" way to check if an object is an array: thing instanceof Array.

But wait! Enter the iframe! If the array came from a different context, this check will fail, because the iframe's Array constructor is a different object from the parent window's Array. Do you really know where that value came from?

Enter Array.isArray to save the day! But wait! What if your code needs to run in an older browser which doesn't support isArray? Is your transpilation+polyfill pipeline reliable enough to handle this? What do you mean you're not using babel-env-preset or whatever the package is called now? This is the downfall of many a well-intentioned addition to JavaScript's standard library (like String.padStart).
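
To make the iframe trap concrete (browser-only code):

const iframe = document.createElement('iframe');
document.body.appendChild(iframe);

// an array constructed in the iframe's realm
const arr = new iframe.contentWindow.Array(1, 2, 3);

arr instanceof Array; // false - the iframe has its own Array constructor
Array.isArray(arr);   // true  - works across realms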

Having to deal with an extreme diversity of runtimes seems unique to JavaScript among mainstream languages. This could be my bias showing (I'm primarily a web developer), but it's certainly true of the difference between web frontend code and web backend code. You just never know where your code is going to run - in Internet Explorer 8, on Opera for Android, or someone's old version of Safari on their iPhone 5 they're clinging to because it would be too expensive to upgrade.

This is bad enough for application developers, who can to some extent draw a line and decide not to support users in certain demographics. (Or, in Kogan's case, charge those users more.) But it's a nightmare for library developers, who want to make their code usable by as many other developers as possible.

3. Bundle size

Do you remember a few months ago when the internet joined in a collective hate-on for the is-buffer package? This package, as its name suggests, checks whether something is a Buffer.

Why would one need a package for that? Well, weak typing might make one want to check types like this; moving targets in the runtime might make one worry that one doesn't know how to check the type reliably - but still, why doesn't one just depend on the buffer package?

Enter the final triumvir of this unholy alliance: bundle size paranoia, which was ostensibly the reason the is-buffer package was created. Because JavaScript programs have to be downloaded frequently by users (even multiple times by the same user on the same day, if caching isn't used carefully), and because Google has convinced us that milliseconds of additional page load time will have dire consequences for our users and consequently for our bank accounts, and because bundlers and module systems have not provided adequate support for modularity, we web developers go to extreme lengths to avoid shipping unnecessary bytes to our users.
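
The entire package boils down to a duck-typing check along these lines (paraphrased, not the exact source):

// decide whether obj is a Buffer without ever importing the buffer package
function isBuffer(obj) {
    return obj != null && obj.constructor != null &&
        typeof obj.constructor.isBuffer === 'function' &&
        obj.constructor.isBuffer(obj);
}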

When the unit of modularity is "NPM package", rather than "file" or even "function", some will go to great lengths to split their code across NPM packages. (For more on this, see Appendix 1.) This works with old bundlers that can't tree-shake, and it can aid reuse - though as noted by the lodash project itself, they are thankfully moving away from this pattern because it may introduce more opportunities to duplicate code than to deduplicate it!

A huge amount of effort has been poured into not just minifying a source bundle, but producing the best possible bundle in the first place. The NPM ecosystem as it stands today has been shaped in part by these efforts.

Future proof

These three factors combine and interact in unexpected and awful ways.

Over the years there has been rapid evolution in both frontend frameworks and backend JavaScript, and high turnover in bundlers and best practices. This has metastasized into a culture of uncertainty, an air of paranoia, and an extreme profusion of small packages. Reinventing the wheel can sometimes be good - but would you really bother doing it if you had to learn all the arcane bullshit of browser evolution, IE8 compatibility, implementation bugs, etc. ad infinitum?

And it's not just that you don't understand how things work now, or how they used to work - but that they'll change in the future!

Whenever NPM's package culture is discussed, one of the benefits touted is that if one of your dependencies is ever updated, your own code will now be updated "for free"! Your application will remain correct, because it depends on an abstraction that will remain correct. (Abstractions are good, but see Appendix 2.)

This is a very reasonable expectation, and an important piece of progress in software development. But I believe the paranoia created by the three factors discussed above has led to the excesses we see in the current NPM ecosystem. This is why we have is-even and its whole ludicrous web of dependencies, and why we don't have is-even in Python.

"Surely," the rational developer exclaims, "there could be no future changes to the is-even package. The definition of even numbers isn't going to change any time soon!"

No, the definition of even numbers won't ever change. But sadly, my friend, this is JavaScript - and you can never really be sure.

Discuss this post on dev.to


Appendix 1. In praise of modules

My thoughts on this issue have been brewing for a while, but this comment by Sindre Sorhus, noted small-package developer, really put it all in focus for me.

Sindre makes a very good argument in favour of modules:

tl;dr You make small focused modules for reusability and to make it possible to build larger more advanced things that are easier to reason about.

However, this is not an argument in favour of NPM packages. All the benefits Sindre lists could be achieved by simply designing programs in a modular way. If another developer wants to avoid having to re-implement an interesting but not-entirely-trivial piece of functionality, they should be able to lift a well-defined module (ideally a single file) from one project to another.

A lot of the issues with NPM are caused by... well, NPM, not by some inherent property of small modules. This was the case with last week's is-promise debacle (which is what finally prompted me to write this blog post). Small NPM packages are the "problem", not small modules, and the problem, at its root, is caused by paranoia.

Appendix 2. The meaning of abstractions

What's wrong with this code?

const isPromise = require('is-promise');

if (isPromise(thing)) {
  thing.then(successCallback).catch(failureCallback);
}

(It's from a real application that uses is-promise, but I won't name names.)

Did you spot it? catch might be undefined. Why? is-promise implements the Promises/A+ spec, which only requires a then method. The specific meaning of "is thing a promise?" can actually change based on how you want to use the answer. The "promise" is not a reliable abstraction here, because JavaScript has so many versions of it, and because promises can be used in many ways.
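
To see the failure mode, all you need is a minimal Promises/A+ thenable:

const isPromise = require('is-promise');

// a perfectly valid thenable, as far as the A+ spec is concerned
const thenable = {
  then(onFulfilled) { onFulfilled(42); }
};

isPromise(thenable); // true
thenable.catch;      // undefined - so the code above throws a TypeError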

This is slightly tangential to the paranoia discussed above, but is an outcome of a "don't ask" approach to packages ("don't ask" because the details will horrify you), and probably not unique to JavaScript.

UPDATE: apparently even the package maintainer failed to notice the distinction between a Promise and something that has a then method. This stuff is not trivial.

The pattern of doing this kind of typecheck is all-too-prevalent in the JS ecosystem, which privileges APIs that seem "simple" because you can chuck anything you want into them, but pushes the burden of being compatible with every conceivable input onto the library. Which brings me to my next appendix...

Appendix 3. TypeScript

Is there a solution to all this? How can we stop the madness?

I don't believe TypeScript is a solution. If anything, it's a clear symptom of the problem. But I believe that TypeScript helps do something important: it makes poorly-typed code annoying to write.

Yes, you can design a method that accepts anything from a string to a thenable that will return an object containing a Float64Array, but writing the type of that method becomes ugly, and implementing it becomes a pain because TypeScript forces you to demonstrate to its satisfaction that you've done it correctly.

Fewer APIs that take and return a grab-bag of different types would make it less necessary to implement code like is-buffer, is-number, etcetera. Of course, browser compatibility and bundle size anxiety will still present problems. But maybe with an increase in JavaScript developers designing code with types, we'll see less demand for typecheck packages and the like.

Appendix 4. Deno

One of the reasons I'm excited for Deno's upcoming stable release is that it builds on a philosophy of fewer, better dependencies. But even in cases where you need a specific dependency, Deno's URL-based imports make it trivial to:

  • Import just a single file without downloading a whole package plus its tests and everything else. Refer back to Appendix 1 for why this is cool.

  • Pin each import to a commit hash or other stable identifier (see the example below).
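
For example (this module path and version are illustrative):

// one file, pinned to an explicit release of the standard library
import { serve } from "https://deno.land/std@0.50.0/http/server.ts";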

Yes, many people are concerned about the idea of importing code from URLs, for legitimate reasons. NPM is a more trusted place to host packages than some random website. But not even NPM can be 100% reliable indefinitely. Deno at least makes you stop and think... do I trust this source?