
Destroy All Dependencies

or, why structural typing is awesome


When designing software, we strive to ensure each module, class, or function has a single responsibility. We want to be able to reuse small, focused pieces of code, and we want our tests to be clean and understandable.

But a piece of software as a whole usually has more than one responsibility! Sindre Sorhus, noted developer of small NPM modules, puts it succinctly in this comment:

You make small focused modules for reusability and to make it possible to build larger more advanced things that are easier to reason about.

Our modules should be small and focused, but the end goal is to integrate them into a larger whole. This means the modules that make up our applications must have dependencies between each other; they cannot live in blissful isolation.

Dependencies, in other words, are a necessary evil in software development.

Because dependencies are a necessary evil, we try to keep control of them as much as possible. According to Sandi Metz in her excellent book:

Object-oriented design is about managing dependencies. It is a set of coding techniques that arrange dependencies such that objects can tolerate change ... In the absence of design, unmanaged dependencies wreak havoc because objects know too much about one another.

While Sandi is talking about object-oriented design in particular, the principle applies to all ways of designing programs. Carefully managing the dependencies of each module of code you write will lead to cleaner, less-tightly-coupled code that's easier to test and reuse.

The example in this article will use TypeScript in object-oriented style. We're going to look at a piece of code with some obvious dependencies and some subtle ones, and then we'll use a few well-known techniques to remove as many of those dependencies as we can. The concepts and tools we'll use are:

  • The single-responsibility principle
  • The dependency injection pattern
  • Duck typing
  • TypeScript's structural typing capabilities
  • Generics and the principle of parametricity

After our changes, we'll have made our module much more reusable and more robust to changes in the rest of the application it's part of.

A note about static types

Part of the inspiration for writing this article was the idea that dynamically-typed languages have it easier than statically-typed languages when it comes to reusability. Because types don't have to be declared, imported, and pinned down, the code is unencumbered to do what needs to be done now, and to change in the future.

Type systems have come a long way in the last decade. New statically-typed languages have emerged and gone mainstream. Existing languages have gained new features.

In this post we'll end up writing code that feels very dynamic, even though it's statically-typed and safe. This is, I think, a real gift of TypeScript's type system, as you'll see.

A disclaimer about premature design

Before we dive into the code, I want to point out that it's possible to do too much design up-front. Premature abstraction, code that is too DRY, and imagining future requirements can all create more problems than they're worth.

But, that said, it's possible to write code that retains more flexibility rather than less, even when all the requirements aren't yet known. Different ways of coding the same behaviour can create fewer or more dependencies, without changing the level of DRYness or abstraction.

The advice in this post has two purposes:

  1. To introduce you to some techniques which result in fewer dependencies, which you can apply habitually when writing new code.
  2. To help you learn how to apply these techniques when changing requirements force you to break apart dependencies.

With that in mind, let's get started.

Breadth-first search with lots of dependencies

Take a look at this (incomplete) snippet of TypeScript code which implements breadth-first search on a graph:

import {Graph} from "./graph";

export function breadthFirstSearch(
  nodes: Array<string>,
  edges: Array<[string, string]>,
  startingNode: string,
  nodeToFind: string
): Array<string> {
  let graph = new Graph(nodes, edges);

  let result: Array<string> = [];
  let visited: Array<string> = [];
  let queue: Array<string> = [startingNode];

  while (queue.length) {
    let node = queue.shift()!; // shift, not pop: FIFO order is what makes this breadth-first
    let nodesToVisit = graph.successors(node);

    // algorithm omitted for brevity
  }

  return result;
}

I've omitted the meat of the BFS algorithm, but you can see the important parts, which are:

  • Creating a Graph from the nodes and edges we're given
  • Calling the graph's successors method, which returns the names of the nodes connected to the given node

The first thing we should do when looking at this code is ask ourselves: what dependencies does this code have? Have a think before reading on.

There's one obvious dependency: importing the graph.ts module at the top of the file. However, there are many implicit dependencies in this code, which take a little bit of thought to discern:

  • What data to pass to the Graph constructor (nodes and edges)
  • Where to find the Graph constructor (in graph.ts)
  • What type of nodes the graph stores (each node is a string)
  • The names of particular methods the Graph class has (in this case, just successors)
  • How to call those Graph methods (e.g. successors takes a single string and returns an array of them)
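For concreteness, here's roughly what graph.ts might contain. The article never shows it, so this is only a sketch inferred from how breadthFirstSearch uses the class:

export class Graph {
  constructor(
    private nodeList: Array<string>,
    private edgeList: Array<[string, string]>
  ) {}

  // The one method our search actually calls: follow each edge leading
  // out of `node` and collect the nodes on the other end
  successors(node: string): Array<string> {
    return this.edgeList
      .filter(([from, _to]) => from === node)
      .map(([_from, to]) => to);
  }

  nodes(): Array<string> {
    return this.nodeList;
  }

  // ... predecessors, edges, and so on
}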

You might not be used to thinking of things like class names and method parameter lists as dependencies. But they represent knowledge about external code, just as much as a hardcoded file location does. These pieces of knowledge tie our code to that external code, and mean that if the external code changes, our code will have to change too.

Some of these are necessary dependencies, but others are incidental. Incidental dependencies can be caused by careless design, but can be remedied just as easily. Let's take the first step in reducing our code's dependency on the outside world: reducing the amount of responsibility it has.

The single-responsibility principle

The single responsibility principle, which is a staple of object-oriented programming but can be applied anywhere, encourages us to write code so that "a class should have only one reason to change."

What is the responsibility of our BFS function as it is currently written? If we were to describe it in one sentence, we might say it like this:

breadthFirstSearch creates a Graph from the given nodes and edges, and searches for a path between two nodes within it.

The presence of an and in that sentence indicates there is more than one responsibility. We need to find a way to remove the first responsibility. We'll find that in removing a responsibility, we also remove dependencies.

You might think that creating a Graph object is a mere implementation detail, of the same level of significance as creating the result array, for example. And you may certainly be right to think that! When looking at a snippet of code like this, outside the context of a wider application, it's hard to say. In every instance, you will need to think through what counts as a responsibility, what is a mere implementation detail, and where to draw the boundaries of your modules. My purpose here is to illustrate what may be, not to prescribe the optimal way to structure breadth-first search algorithms.

Let's use the dependency injection pattern to remove the responsibility for creating a graph. To do that, we'll change the code as follows:

import {Graph} from "./graph";

export function breadthFirstSearch(
  graph: Graph,
  startingNode: string,
  nodeToFind: string
): Array<string> {
  let result: Array<string> = [];
  // the rest remains unchanged
}

We removed the nodes and edges arguments, instead accepting a graph of type Graph. Instead of creating a dependent object, the function now accepts it as an argument. By doing that, it has refused to take responsibility for creating the graph, even though it still depends on a graph existing at some point.
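A hypothetical caller now looks something like this (the module path and sample data are my inventions):

import {Graph} from "./graph";
import {breadthFirstSearch} from "./breadth_first_search";

// The caller takes over the responsibility of constructing the graph
let graph = new Graph(["a", "b", "c"], [["a", "b"], ["b", "c"]]);
let path = breadthFirstSearch(graph, "a", "c");
console.log(path);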

Let's look back at our list of dependencies and see how it's different now:

  • What data to pass to the Graph constructor (crossed off!)
  • That graph is an instance of the Graph class (new)
  • Where to find the Graph constructor
  • What type of nodes the graph stores (each node is a string)
  • The names of particular methods the Graph class has (in this case, just successors)
  • How to call those Graph methods (e.g. successors takes a single string)

We seem to have introduced a new dependency, in the process of trying to remove one: now our code knows that the graph parameter is an instance of a class - the Graph class in particular.

This is a much smaller dependency. We've traded a large piece of knowledge - how to use the Graph constructor - for a smaller piece: merely that the Graph constructor exists.

But we would still like to remove this dependency entirely, rather than only shrinking it, if we can. And it turns out: we can.

Duck typing with interfaces

Let's think for a moment about this new dependency we've introduced. It may be smaller than the original dependency, but it still has some troubling implications.

In a statically-typed language, declaring that graph has type Graph (where Graph is a class constructor) means that we can never pass anything into this parameter which is not an instance of the Graph class or one of its subclasses.

This might seem reasonable, but it does reduce the flexibility of our code. Object-oriented inheritance can be useful, but we shouldn't force it on users of our code without good reason to. Languages like C++ have typically used inheritance heavily to enable polymorphism, because they do not have alternatives. But in TypeScript, as in many modern object-oriented languages, we can do better.

We can apply a principle known as duck typing to remove the implicit dependency on the class hierarchy here. Instead of depending on a specific class constructor, we'll depend on an interface. The difference is that interfaces are not tied to any specific class hierarchy.

In duck typing, an object's suitability is determined by the presence of certain methods and properties, rather than the type of the object itself.

Let's create a new file called graph_interface.ts to help with this. We'll declare all the capabilities of graphs which our application needs in one location:

graph_interface.ts:

export interface GraphInterface {
  successors(node: string): Array<string>;
  predecessors(node: string): Array<string>;
  nodes(): Array<string>;
  edges(): Array<[string, string]>;
  // ... other graph methods
}

And we'll modify our BFS module like so:

import {GraphInterface} from "./graph_interface";

export function breadthFirstSearch(
  graph: GraphInterface,
  startingNode: string,
  nodeToFind: string
): Array<string> {
  // the rest remains unchanged
}

Now, instead of depending on the Graph class itself, we depend on the new interface. Any type can implement this interface, regardless of a class's lineage. For example, we might end up creating new graph types that look like some of these:

// This class of graph has no parent class
class SparseGraph implements GraphInterface {
  // ...
}

// This class is the child of a class from the JavaScript standard library
class AdjacencyMatrix extends Uint8Array implements GraphInterface {
  // ...
}

// This class is a child of our original Graph class
class DirectedGraph extends Graph implements GraphInterface {
  // ...
}

We have strictly increased the reusability of our code, because any type can conform to the interface our code needs. This is the duck typing pattern in action.

Let's do another check of our dependency list:

  • That graph is an instance of the Graph class (crossed off!)
  • That graph conforms to the GraphInterface type (new)
  • Where to find the GraphInterface type
  • What type of nodes the graph stores (each node is a string)
  • The names of particular methods the GraphInterface type has (in this case, just successors)
  • How to call those GraphInterface methods (e.g. successors takes a single string)

We've removed the Graph class dependency for good, and have modified the remaining dependencies to refer to GraphInterface instead of Graph, but we've again introduced a new one! (Will this nightmare ever end? Are we making progress, or just painting over the cracks in our shoddy design?)

As with the previous change we made, we've swapped a larger piece of knowledge for a smaller piece. The Graph class implied an inheritance hierarchy, but the GraphInterface does not. While numerically our dependencies have stayed the same, we've reduced the amount of knowledge they contain. This makes them more manageable.

But now, thanks to an important feature of TypeScript's type system, and the changes we've made up to this point, we can make a dramatic improvement.

Duck typing with structural types

Astute readers will have noticed that in the last section, I gave some examples of classes that looked like this:

class DirectedGraph extends Graph implements GraphInterface {
  // ...
}

But in TypeScript, unlike most statically-typed languages, it's not necessary to declare implements GraphInterface. As long as a class is compatible with an interface at the point of use, TypeScript will be happy! We don't need to declare compatibility ahead of time.

This is known as structural typing. It's an extremely useful feature which we will now make use of. Structural typing contrasts with nominative typing, where the specific identity of each type is unique and significant. Read more about structural and nominative typing.

Because of structural typing, we don't have to depend on a shared interface defined elsewhere. Let's move the interface declaration right into our module, like this:

interface GraphInterface {
  successors(node: string): Array<string>;
  predecessors(node: string): Array<string>;
  nodes(): Array<string>;
  edges(): Array<[string, string]>;
  // ... other graph methods
}

export function breadthFirstSearch(
  graph: GraphInterface,
  startingNode: string,
  nodeToFind: string
): Array<string> {
  // ...

  while (...) {
    let nodesToVisit = graph.successors(node);
    // ...
  }

  // ...
}

Now we don't import GraphInterface, but declare it where it's needed. This is an important ability in languages like TypeScript and Go.
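To see how far this goes, note that even an ad-hoc object literal, belonging to no class at all, now satisfies the parameter purely by its shape (a contrived example):

let tinyGraph = {
  successors(node: string): Array<string> {
    return node === "start" ? ["end"] : [];
  },
  predecessors(node: string): Array<string> {
    return [];
  },
  nodes(): Array<string> {
    return ["start", "end"];
  },
  edges(): Array<[string, string]> {
    return [["start", "end"]];
  },
};

breadthFirstSearch(tinyGraph, "start", "end"); // compiles happily

Notice, though, that we had to write three methods breadthFirstSearch never calls, just to satisfy the type.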

Here, we can cross a dependency off our list and, thankfully, not introduce any new ones:

  • That graph conforms to the GraphInterface type
  • Where to find the GraphInterface type (crossed off!)
  • What type of nodes the graph stores (each node is a string)
  • The names of particular methods the GraphInterface type has (in this case, just successors)
  • How to call those GraphInterface methods (e.g. successors takes a single string)

We no longer depend on the interface being found anywhere except right here where it's used. We could rename every other file in our application, and this module would not even notice.

This is another big step forwards, but you might be getting a niggling feeling that something is wrong when looking at the code above. Let's take a moment to address that niggle.

Increasing flexibility using the interface segregation principle

The niggle is probably coming from the fact that our newly-moved GraphInterface declares a bunch of methods that breadthFirstSearch doesn't use. For example, predecessors is never used.

This is a violation of the interface segregation principle, which suggests that "no client should be forced to depend on methods it does not use."

This phrasing (from the Wikipedia page) is not a precise fit for our code example, but the principle definitely applies. Fortunately, we can easily remedy this, and in doing so increase the flexibility of our module.

Let's revise the GraphInterface like so:

interface GraphInterface {
  successors(node: string): Array<string>;
}

export function breadthFirstSearch(
  graph: GraphInterface,
  startingNode: string,
  nodeToFind: string
): Array<string> {
  // ...

  while (...) {
    let nodesToVisit = graph.successors(node);
    // ...
  }

  // ...
}

Now that the interface has shrunk to include precisely the functionality we need from it, it doesn't represent the entire concept of "graphness" any more, and we should probably rename it. (Luckily, this is safe to do, because the interface is not exported, so no external code could be depending on it!)

interface BreadthFirstSearchable {
  successors(node: string): Array<string>;
}

export function breadthFirstSearch(
  graph: BreadthFirstSearchable,
  // ...
}

The interface is now acting as a declaration of the types of things that can be searched with our breadthFirstSearch function. For more exploration of this pattern, have a read of this great blog post which uses examples in Go, another structurally-typed language.
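To make that concrete, a second consumer can declare its own minimal view of the very same graph objects. This hypothetical module compiles against anything our Graph class (or the tinyGraph literal above) already satisfies:

// depth_first_search.ts (hypothetical)
interface DepthFirstSearchable {
  successors(node: string): Array<string>;
}

export function depthFirstSearch(
  graph: DepthFirstSearchable,
  startingNode: string,
  nodeToFind: string
): Array<string> {
  // Same shape requirement, declared independently: no shared
  // interface file needed between the two search modules
  return []; // traversal omitted
}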

Let's take another look at our list of dependencies:

  • That graph conforms to the BreadthFirstSearchable type
  • What type of nodes the graph stores (each node is a string)
  • The names of particular methods the graph object has (in this case, just successors)
  • How to call those methods (e.g. successors takes a single string)

I have modified the final two dependencies slightly, and in doing so, have made the first dependency redundant. Because our BreadthFirstSearchable interface perfectly fits just the methods our code actually uses, the interface itself is insignificant; it's merely a convenient syntax to declare how we will use the graph object.

Take a breather

We've done extremely well so far, reducing our list of five weighty dependencies to three. Take a second to congratulate yourself!

If we stopped here, we could be proud of our work. We've successfully

  • excised a responsibility that our module shouldn't have had in the first place,
  • removed an import of an external file,
  • abolished the implicit dependency on a class hierarchy

and in doing so, have made the code clearer and more focused. A side-effect of declaring interfaces at the point of use is that we get explicit documentation of what our breadth-first search module really needs in order to work.

Notice that the remaining dependencies seem much more difficult to consider removing. After all, how could we make a BFS algorithm that doesn't depend on knowing the name of the successors method? Could we take this too far?

Of the dependencies that remain, I would argue the final two are very reasonable dependencies to keep. They express the core meaning of the code. Breadth-first search depends on searching the successors of nodes in a graph. Each node may have zero, one, or many successors. These are fundamental facts about graph theory that we can't really get around. So it's unlikely we'll be able to remove these dependencies, no matter how clever we are.

However, the dependency on the string type can be dispensed with. If you want to go the Full Monty, read on to find out how.

Using generics to reduce knowledge

Our code currently knows that each node in the graph is a string. This looks like a single piece of knowledge. But, like the Graph class implies an inheritance hierarchy, the string type implies other knowledge:

  • Nodes can be compared using not just == and ===, but <, >, localeCompare, etc.
  • We can ask each node for its .length
  • We can call node.substring
  • Etcetera...

Built-in types like string usually bring a lot of knowledge with them, because they are designed to be feature-ful! Usually it's not a problem to rely on this knowledge. Built-in types change so rarely (and almost never in breaking ways) that it's very safe to leave these dependencies in your code.

However, knowledge of concrete types can sometimes reduce flexibility and reusability. An immediate example that comes to mind is that some graphs might have numeric identifiers instead of string-based identifiers.

Bearing in mind my advice at the beginning about not doing too much premature design and not anticipating requirements, let's explore the way we can use generic types to make this code more flexible if appropriate.

First, let's make an alias of the string type, for clarity:

type Node = string;

interface BreadthFirstSearchable {
  successors(node: Node): Array<Node>;
}

export function breadthFirstSearch(
  graph: BreadthFirstSearchable,
  startingNode: Node,
  nodeToFind: Node
): Array<Node> {
  let result: Array<Node> = [];
  let visited: Array<Node> = [];
  let queue: Array<Node> = [startingNode];

  while (queue.length) {
    let node = queue.shift()!; // shift, not pop: FIFO keeps the search breadth-first
    let nodesToVisit = graph.successors(node);
    // ...
  }

  return result;
}

It's now easier to see where we must insert type parameters in order to remove the knowledge of the specific type. After removing the alias type Node and adding generic parameters, the code looks like this:

interface BreadthFirstSearchable<Node> {
  successors(node: Node): Array<Node>;
}

export function breadthFirstSearch<Node>(
  graph: BreadthFirstSearchable<Node>,
  startingNode: Node,
  nodeToFind: Node
): Array<Node> {
  // ...
}
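For instance, a graph with numeric node IDs now works without touching our module at all (NumericGraph is a made-up example):

// Nodes are numbers; no implements clause, no strings anywhere
class NumericGraph {
  successors(node: number): Array<number> {
    return node < 100 ? [node * 2, node * 2 + 1] : [];
  }
}

// Node is inferred as number at the call site
let path = breadthFirstSearch(new NumericGraph(), 1, 12);
console.log(path);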

We've successfully crossed off a piece of knowledge, and have made our code more flexible in the process. This is what our knowledge list looks like now:

  • What type of nodes the graph stores (crossed off! each node is now a generic Node)
  • The names of particular methods the graph object has (in this case, just successors)
  • How to call those methods (e.g. successors takes a single Node)

Parametricity is a cloud of unknowing

There's an esoteric concept known as parametricity which is typically talked about in functional programming. This article from Cambridge (PDF) puts it thus (with added emphasis):

Parametricity can be thought of as the dual to abstraction. Where abstraction hides details about an implementation from the outside world, parametricity hides details about the outside world from an implementation.

The use of a type parameter means our function knows less about the outside world, which is what we wanted. This has interesting implications.

Our code now can't use methods of the String class like substring, because Node could mean any type, depending on the caller of our code.

As described handily in this article using Scala for examples, this lack of knowledge limits the choices we can make when implementing code. We can no longer accidentally use node.length or any other specific methods or properties. We are prevented from peering beyond the cloud of unknowing created by the generic type parameter.

(Of course, because JavaScript has reflection, we can determine information about values with unknown types at runtime. However, typeof and instanceof are more likely to be asked about in code review than an innocuous node.length.)

But, significantly, the caller now also knows that our code is operating inside this cloud of unknowing. The caller knows that our code cannot rely on Node being any specific type. This gives the caller more freedom and flexibility.
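We can watch the cloud of unknowing at work in the compiler. In this sketch, each @ts-expect-error directive marks a line TypeScript rejects inside the generic context:

function demonstrateOpacity<Node>(node: Node): void {
  // @ts-expect-error Property 'length' does not exist on type 'Node'
  node.length;

  // @ts-expect-error Property 'substring' does not exist on type 'Node'
  node.substring(3);
}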

Recapping our journey

Here's the code we've ended up with:

interface BreadthFirstSearchable<Node> {
  successors(node: Node): Array<Node>;
}

export function breadthFirstSearch<Node>(
  graph: BreadthFirstSearchable<Node>,
  startingNode: Node,
  nodeToFind: Node
): Array<Node> {
  let result: Array<Node> = [];
  let visited: Array<Node> = [];
  let queue: Array<Node> = [startingNode];

  while (queue.length) {
    let node = queue.shift()!; // FIFO dequeue
    let nodesToVisit = graph.successors(node);
    // the rest of the algorithm
  }

  return result;
}
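The article leaves the loop body out, but purely as an illustration, here is one plausible completion. It assumes the intended return value is the discovered path, and tracks parent links to reconstruct it:

interface BreadthFirstSearchable<Node> {
  successors(node: Node): Array<Node>;
}

export function breadthFirstSearch<Node>(
  graph: BreadthFirstSearchable<Node>,
  startingNode: Node,
  nodeToFind: Node
): Array<Node> {
  let parents = new Map<Node, Node>();
  let visited = new Set<Node>([startingNode]);
  let queue: Array<Node> = [startingNode];

  while (queue.length) {
    let node = queue.shift()!; // FIFO: visit nodes in breadth-first order
    if (node === nodeToFind) {
      // Walk the parent links backwards to reconstruct the path
      let path: Array<Node> = [node];
      while (parents.has(node)) {
        node = parents.get(node)!;
        path.unshift(node);
      }
      return path;
    }
    for (let next of graph.successors(node)) {
      if (!visited.has(next)) {
        visited.add(next);
        parents.set(next, node);
        queue.push(next);
      }
    }
  }

  return []; // no path from startingNode to nodeToFind
}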

And here is our list of knowledge that this code has about external code:

  • The names of particular methods the graph object has (in this case, just successors)
  • How to call those methods (e.g. successors takes a single Node)

We've come a long way, and reduced our knowledge to the very minimum. The remaining dependencies reflect the core abstractions of our domain. This module should last unchanged for a long time, serenely weathering changes in the code surrounding it, without demanding to be changed in lockstep.

While it might seem like we looked into the future to imagine requirements this code might be fulfilling (e.g. new graph classes, or graphs with numeric nodes), the changes we've made were low-impact and broadly applicable to many situations. We didn't add new features, or introduce new abstractions: we systematically removed knowledge from the code, and in doing so made it less dependent on the code around it, and more reusable in unexpected future contexts.

I hope that next time you're writing new code, or refactoring a tangled web of dependencies, these techniques and ways of understanding dependency will help you cut through the chaos and discover clean, single-responsibility modules.

Lessons in object-oriented design

I will be re-reading the following quotes from Sandi Metz's Practical Object-Oriented Design every morning before I start coding:

Object-oriented design is about managing dependencies. It is a set of coding techniques that arrange dependencies such that objects can tolerate change. In the absence of design, unmanaged dependencies wreak havoc because objects know too much about one another. Changing one object forces change upon its collaborators, which in turn forces change upon its collaborators, ad infinitum. A seemingly insignificant enhancement can cause damage that radiates outward in overlapping concentric circles, ultimately leaving no code untouched.

This makes sense of why dependency injection containers exist, why the principle of inversion of control matters, and how to split up classes into the right responsibilities.

Asserting that code should be easy to change is akin to stating that children should be polite; the statement is impossible to disagree with, yet it in no way helps a parent raise an agreeable child. The idea of easy is too broad; you need concrete definitions of easiness and specific criteria by which to judge code. If you define easy to change as

  • changes have no unexpected side effects,
  • small changes in requirements require correspondingly small changes in code,
  • existing code is easy to reuse, and
  • the easiest way to make a change is to add code that in itself is easy to change,

then the code you write should have the following qualities. Code should be

  • Transparent The consequences of change should be obvious in the code that is changing and in distant code that relies upon it.
  • Reasonable The cost of any change should be proportional to the benefits the change achieves.
  • Usable Existing code should be usable in new and unexpected contexts.
  • Exemplary The code itself should encourage those who change it to perpetuate these qualities.

Code that is Transparent, Reasonable, Usable, and Exemplary (TRUE) not only meets today’s needs but can also be changed to meet the needs of the future. The first step in creating code that is TRUE is to ensure that each class has a single, well-defined responsibility.

Optimising for unknown future requirements seems like a great way to go about writing code that's usable today as well.

A stateless token case study: Algolia search API

At work, we use Algolia to outsource the job of managing search infrastructure. One part of its API intrigued me. Algolia's server-side library allows us to create "secured API keys" to give to our users (i.e., browsers), with which our users can perform searches over our Algolia data with filters.

For example, our Algolia account contains search data from Teams A, B and C. When a user from Team A logs in, our server generates an Algolia token for that user with a filter set to only show results from Team A's data.

The cool thing is, these secured tokens can be created without any calls to Algolia's servers, making them very lightweight and easy to use. I wanted to find out how Algolia were actually doing it! Having been reading a lot about JWTs and trying them out on some APIs, I wanted to discover a good use-case for them. This seemed similar enough, but I could tell that the tokens Algolia was creating were not actually JWTs.

The search

In our app, creating a "secured token" looks something like this:

$searchToken = SearchClient::generateSecuredApiKey($secret, [
    'filters' => 'team:' . $user->team_id,
]);

In the example, $secret is a server-side configuration value Algolia gives us, which we never share with clients. $searchToken gets sent to the client's browser on page load. Because creating a token doesn't require any API calls, we create new tokens on every page load, and could quickly refresh or modify them during a session if we needed to.

To work out what was actually contained in these tokens, I went digging in the source code of their PHP library. I found the relevant code here:

public static function generateSecuredApiKey($parentApiKey, $restrictions)
{
    $urlEncodedRestrictions = Helpers::buildQuery($restrictions);
    $content = hash_hmac('sha256', $urlEncodedRestrictions, $parentApiKey).$urlEncodedRestrictions;
    return base64_encode($content);
}
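Translated into TypeScript (a sketch using Node's crypto module to mirror the PHP above, not Algolia's actual code):

import {createHmac} from "crypto";

function generateSecuredApiKey(
  parentApiKey: string,
  restrictions: Record<string, string>
): string {
  // URL-encode the restrictions, e.g. "filters=team%3A123"
  const urlEncoded = new URLSearchParams(restrictions).toString();
  // Sign the payload with the shared secret (hex digest, like hash_hmac)
  const signature = createHmac("sha256", parentApiKey)
    .update(urlEncoded)
    .digest("hex");
  // The token is just signature + payload, base64-encoded
  return Buffer.from(signature + urlEncoded).toString("base64");
}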

So the token that gets sent to a client will be structured something like this:

base64_encode(
    '8b02da15d77ee56bf593849cb4ca8494f2cff19403c8c0bd99fc362e91a5ec69'
    . 'filters=team%3A123'
)

The client could decode this and pull out the query parameter data if it wanted, but any change to it would make the initial HMAC invalid, and therefore Algolia wouldn't accept the token from the client.

If it quacks like a JWT

These Algolia tokens obviously don't include any JSON; they encode their payload data as a URL query string instead. But you could achieve a similar result using a JWT. Both are ways to send data between two trusted services via an untrusted intermediary. The data is unencrypted, so the client can inspect the data. But because of the cryptographic signature attached to the data, the client cannot modify the data without detection.

The general principle works like this:

[Diagram: our server signs the search restrictions with the shared secret and hands the resulting token to the client; the client sends the token with its search requests; Algolia runs validate(token, app) and, if the signature checks out, performs the query.]
  1. We copy the secret token from Algolia to our servers "manually" (or via config management software)
  2. Our server creates a secured token for a specific user when that user needs to search, with parameters specific to that user
  3. The token is shared with the client
  4. The client uses the token, as well as other identifying information, to make requests directly to Algolia
  5. Algolia checks that the token is correct (has not been tampered with), then extracts the parameters and performs the query the client requested

Beyond the initial sharing of the secret between Algolia and our own servers, we don't need to send requests to Algolia's API; the client can communicate with them directly when searching, which is great.

Is it really stateless?

There's an important subtlety to notice here. Our "shared secret" is only shared between Algolia and our company. It's different for every Algolia customer (and even for every registered application belonging to the same customer). Most JWT tutorials sign the JWT with a single secret per service, as if every Algolia customer were using the same shared secret. This per-application secret complicates what "stateless" means here.

In the step described as validate(token, app), Algolia must look up the shared secret belonging to app, in order to check that token's signature is valid. Depending on how this is implemented, it might require database lookups, etc., but that's for Algolia to optimise. From our perspective when creating tokens, no round-trips to Algolia are required.
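Schematically, that check might look like the sketch below. This is guesswork on my part, not Algolia's implementation, and a production version would at least use a timing-safe comparison:

import {createHmac} from "crypto";

function validateSecuredApiKey(
  parentApiKey: string,
  token: string
): URLSearchParams | null {
  const decoded = Buffer.from(token, "base64").toString();
  const signature = decoded.slice(0, 64); // a hex SHA-256 digest is 64 chars
  const urlEncoded = decoded.slice(64);
  const expected = createHmac("sha256", parentApiKey)
    .update(urlEncoded)
    .digest("hex");
  return signature === expected ? new URLSearchParams(urlEncoded) : null;
}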

JavaScript's ecosystem is uniquely paranoid

Another week, another NPM-related snafu. Why does this keep happening to the JavaScript ecosystem? The answer is paranoia. 😱


Many are quick to assert that JavaScript just has too low a barrier to entry and the n00bs are messing it up. Whenever anyone says "JavaScript is the new PHP!" this is probably what they mean. I don't feel the need to provide evidence against this claim; I think it comes from an understandable frustration, expressed through all-too-common tech elitism.

Others say we should blame resume-driven-development, and the ego boost of having published thousands of open-source modules. We must all suffer, the argument goes, because a few obsessive individuals want to be able to boast about how critical they personally are to the JavaScript ecosystem. While this is probably a real trend, why isn't it more prevalent in other open-source ecosystems?

Disclaimer before proceeding: I use JS every day, and I actually really like it. I'm not trying to criticise it, just to explore some unique problems it has. I hope this post doesn't come across as too harsh.

There are probably many contributing factors that have shaped NPM into what it is today. However, I assert that the underlying reason for the bizarre profusion of tiny, absurd-seeming one-liner packages on NPM is paranoia, caused by a unique combination of factors.

JavaScript makes you paranoid

Three factors have caused a widespread cultural paranoia among JavaScript developers. This has been inculcated over years. These factors are: JavaScript's weak dynamic type system; the diversity of runtimes JavaScript targets; and the fact of deploying software on the web.

1. Weak dynamic typing

It's well-known that JavaScript's "type system" leaves a lot to be desired. This well-known talk is a humorous take on some of the many ways you can shoot yourself in the foot in JavaScript.

Unless your team (and every open-source package your team depends on) always uses ===, knows exactly when typeof is acceptable, is good at defensive programming, and designs APIs that have good type discipline*, you've probably been tripped up by a string that behaved like a number, a 0 that was skipped for being falsy, an undefined turning up somewhere surprising, typeof null === 'object', etcetera.
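To pick just one of those traps, here's the falsy-zero bug in miniature (an invented example):

function describeCount(count?: number): string {
  // Bug: a real count of 0 is falsy, so it's treated like "no value given"
  return "items: " + (count || "none");
}

describeCount(3); // "items: 3"
describeCount(0); // "items: none" - almost certainly not what we meant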

This isn't entirely unique to JavaScript - many languages have dynamic types, and many languages have weak types and implicit coercions. But I would argue JavaScript is quite a dire example. And this is still an important contributing factor, without which the second factor probably wouldn't be as significant.

*Or, you are TypeScript users. See Appendix 3.

2. Browser runtimes

It's not just the case that "JavaScript is missing a standard library". For example, there is a really easy and straightforward "standard" way to check if an object is an array: thing instanceof Array.

But wait! Enter the iframe! If the array came from a different context, this check will fail, because the iframe's Array constructor is a different object from the parent window's Array. Do you really know where that value came from?

Enter Array.isArray to save the day! But wait! What if your code needs to run in an older browser which doesn't support isArray? Is your transpilation+polyfill pipeline reliable enough to handle this? What do you mean you're not using babel-env-preset or whatever the package is called now? This is the downfall of many a well-intentioned addition to JavaScript's standard library (like String.padStart).
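The result is the kind of belt-and-braces check this environment breeds. This sketch is illustrative, not lifted from any particular package:

function isArray(value: unknown): value is Array<unknown> {
  // Array.isArray handles the cross-iframe case, but might not exist
  // in very old runtimes; the toString trick is the traditional fallback
  return typeof Array.isArray === "function"
    ? Array.isArray(value)
    : Object.prototype.toString.call(value) === "[object Array]";
}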

Having to deal with an extreme diversity of runtimes seems unique to JavaScript among mainstream languages. This could be my bias showing (I'm primarily a web developer), but it's certainly true of the difference between web frontend code and web backend code. You just never know where your code is going to run - in Internet Explorer 8, on Opera for Android, or someone's old version of Safari on their iPhone 5 they're clinging to because it would be too expensive to upgrade.

This is bad enough for application developers, who can to some extent draw a line and decide not to support users in certain demographics. (Or, in Kogan's case, charge those users more.) But it's a nightmare for library developers, who want to make their code usable by as many other developers as possible.

3. Bundle size

Do you remember a few months ago when the internet joined in a collective hate-on for the is-buffer package? This package, as its name suggests, checks whether something is a Buffer.

Why would one need a package for that? Well, weak typing might make one want to check types like this; moving targets in the runtime might make one worry that one doesn't know how to check the type reliably - but still, why doesn't one just depend on the buffer package?

Enter the final triumvir of this unholy alliance: bundle size paranoia, which was ostensibly the reason the is-buffer package was created. Because JavaScript programs have to be downloaded frequently by users (even multiple times by the same user on the same day, if caching isn't used carefully), and because Google has convinced us that milliseconds of additional page load time will have dire consequences for our users and consequently for our bank accounts, and because bundlers and module systems have not provided adequate support for modularity, we web developers go to extreme lengths to avoid shipping unnecessary bytes to our users.

When the unit of modularity is "NPM package", rather than "file" or even "function", some will go to great lengths to split their code across NPM packages. (For more on this, see Appendix 1.) This works around old bundlers that can't tree-shake, and it can aid reuse - though as noted by the lodash project itself, they are thankfully moving away from this pattern because it may introduce more opportunities to duplicate code than to deduplicate it!

A huge amount of effort has been poured into not just minifying a source bundle, but producing the best possible bundle in the first place. The NPM ecosystem as it stands today has been shaped in part by these efforts.

Future proof

These three factors combine and interact in unexpected and awful ways.

Over the years there has been rapid evolution in both frontend frameworks and backend JavaScript, and high turnover in bundlers and best practices. This has metastasized into a culture of uncertainty, an air of paranoia, and an extreme profusion of small packages. Reinventing the wheel can sometimes be good - but would you really bother doing it if you had to learn all the arcane bullshit of browser evolution, IE8 compatibility, implementation bugs, etc. ad infinitum?

And it's not just that you don't understand how things work now, or how they used to work - but that they'll change in the future!

Whenever NPM's package culture is discussed, one of the benefits touted is that if one of your dependencies is ever updated, your own code will now be updated "for free"! Your application will remain correct, because it depends on an abstraction that will remain correct. (Abstractions are good, but see Appendix 2.)

This is a very reasonable expectation, and an important piece of progress in software development. But I believe the paranoia created by the three factors I discussed above have led to the excesses we see in the current NPM ecosystem. This is why we have is-even and its whole ludicrous web of dependencies, and why we don't have is-even in Python.

"Surely," the rational developer exclaims, "there could be no future changes to the is-even package. The definition of even numbers isn't going to change any time soon!"

No, the definition of even numbers won't ever change. But sadly, my friend, this is JavaScript - and you can never really be sure.



Appendix 1. In praise of modules

My thoughts on this issue have been brewing for a while, but this comment by Sindre Sorhus, noted small-package developer, really put it all in focus for me.

Sindre makes a very good argument in favour of modules:

tl;dr You make small focused modules for reusability and to make it possible to build larger more advanced things that are easier to reason about.

However, this is not an argument in favour of NPM packages. All the benefits Sindre lists could be achieved by simply designing programs in a modular way. If another developer wants to avoid having to re-implement an interesting but not-entirely-trivial piece of functionality, they should be able to lift a well-defined module (ideally a single file) from one project to another.

A lot of the issues with NPM are caused by... well, NPM, not by some inherent property of small modules. This was the case for last week's is-promise debacle (which precipitated me actually writing this blog post). Small NPM packages are the "problem", not small modules, and the problem, at its root, is caused by paranoia.

Appendix 2. The meaning of abstractions

What's wrong with this code?

const isPromise = require('is-promise');

if (isPromise(thing)) {
  thing.then(successCallback).catch(failureCallback);
}

(It's from a real application that uses is-promise, but I won't name names.)

Did you spot it? catch might be undefined. Why? is-promise implements the Promises/A+ spec, which only requires a then method. The specific meaning of "is thing a promise?" can actually change based on how you want to use the answer. The "promise" is not a reliable abstraction here, because JavaScript has so many versions of it, and because promises can be used in many ways.
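One way to sidestep the whole question (my suggestion, not the library's): let the platform normalise any thenable into a real Promise before chaining. The declares below stand in for the snippet's surrounding code:

declare const thing: unknown;
declare function successCallback(value: unknown): void;
declare function failureCallback(error: unknown): void;

// Promise.resolve adopts any thenable into a full-featured Promise,
// so rejection handling is guaranteed to exist
Promise.resolve(thing).then(successCallback, failureCallback);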

This is slightly tangential to the paranoia discussed above, but is an outcome of a "don't ask" approach to packages ("don't ask" because the details will horrify you), and probably not unique to JavaScript.

UPDATE: apparently even the package maintainer failed to notice the distinction between a Promise and something that has a then method. This stuff is not trivial.

The pattern of doing this kind of typecheck is all-too-prevalent in the JS ecosystem, which privileges APIs that seem "simple" because you can chuck anything you want into them, but pushes the burden of being compatible with every conceivable input onto the library. Which brings me to my next appendix...

Appendix 3. TypeScript

Is there a solution to all this? How can we stop the madness?

I don't believe TypeScript is a solution. If anything, it's a clear symptom of the problem. But I believe that TypeScript helps do something important: it makes poorly-typed code annoying to write.

Yes, you can design a method that accepts anything from a string to a thenable that will return an object containing a Float64Array, but writing the type of that method becomes ugly, and implementing it becomes a pain because TypeScript forces you to demonstrate to its satisfaction that you've done it correctly.
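For example, merely spelling out such a signature makes its awkwardness impossible to ignore (an invented type, in the spirit of the description above):

// Accepts anything from a string to a thenable, and eventually
// yields an object containing a Float64Array
type EverythingMachine = (
  input: string | PromiseLike<string>
) => Promise<{ data: Float64Array }>;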

Fewer APIs that take and return different types make it less necessary to implement code like is-buffer, is-number, etcetera. Of course, browser compatibility and bundle size anxiety will still present problems. But maybe with an increase in JavaScript developers designing code with types, we'll see less demand for typecheck packages and the like.

Appendix 4. Deno

One of the reasons I'm excited for Deno's upcoming stable release is that it builds on a philosophy of fewer, better dependencies. But even in cases where you need a specific dependency, Deno's URL-based imports make it trivial to:

  • Import just a single file without downloading a whole package plus its tests and everything else. Refer back to Appendix 1 for why this is cool.

  • Pin each import to a commit hash or other stable identifier.

Yes, many people are concerned about the idea of importing URLs for many legitimate reasons. NPM is a more trusted place to host packages than some random website. But not even NPM can be 100% reliable indefinitely. Deno at least makes you stop and think... do I trust this source?