focus on the outliers
September 15, 2022
when i worked at a large tech company (years ago) we analyzed data about our web app
we looked at transaction times for requests made by users of the system
they fell pretty much into a bell curve (except certain really long transactions we called outliers)
usual requests took from 0.1 to 2 seconds—outliers took 100 seconds or more
the first thing we did was remove the outliers—we couldnt do analysis on these! (certainly they were meaningless)
right?
over time we learned to look at the outliers—those problem cases tell you more about your system than all the usual cases combined
world—i think we need to look at the outliers instead of throwing them out
homeless people (for example) are not a little (unimportant) problem to be discarded from the data set before proceeding
theyre more essential to solving our collective problems than the usual cases
my case (for example)
why do i struggle with homelessness and its threat? why am i living on minimum wage, struggling to pay my hospital bills (for an illness i get disability for)? why is it too expensive for me to be sick? too expensive for me to live?
you cant just ignore me and hope ill go away
im not an inconvenient data point
i am the data point