focus on the outliers

when i worked at a large tech company (years ago) we analyzed data about our web app

we looked at transaction times for requests made by users of the system

they fell pretty much into a bell curve (except certain really long transactions we called outliers)

usual requests took from 0.1 to 2 seconds—outliers took 100 seconds or more

the first thing we did was remove the outliers—we couldnt do analysis on these! (certainly they were meaningless)


through time we learned to look at the outliers—those problem cases tell you more about your system than all the usual cases combined

world—i think we need to look at the outliers instead of throwing them out

homeless people (for example) are not a little (unimportant) problem to be discarded from the data set before proceeding

theyre more essential to solving our collective problems than the usual cases

my case (for example)

why do i struggle with homelessness and its threat? why am i living on minimum wage, struggling to pay my hospital bills (for an illness i get disability for)? why is it too expensive for me to be sick? too expensive for me to live?

you cant just ignore me and hope ill go away

im not an inconvenient data point

i am the data point
