A caveman hears rustling in the trees. Without the luxury of time to ponder his options, his amygdala kicks in and presents a hypothesis. Tiger…Run!
This primal tendency isn’t based on an objective calculation of likelihood, as such a calculation would predict the wind 99 times out of 100. It is instead a calculation based on the likelihood of survival given a false prediction. In short, if the rustling is the wind and he thinks it’s a tiger, he might waste a couple of seconds experiencing unnecessary stress, but if it’s a tiger and he thinks it’s the wind, he will fail to pass on the genes that produce this type of assumption.
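The trade-off can be sketched as a small expected-cost calculation. This is only an illustration: the 1-in-100 tiger rate comes from the paragraph above, while the two cost values are made-up numbers in arbitrary units, chosen to show how a rare but catastrophic error can dominate the average.

```python
# Hypothetical costs in arbitrary units. Only the 1-in-100 tiger rate
# comes from the text; the cost figures are assumptions for illustration.
P_TIGER = 0.01
COST_FALSE_ALARM = 1         # a couple of seconds of needless stress
COST_MISSED_TIGER = 10_000   # failing to pass on any genes at all


def expected_cost(assume_tiger: bool) -> float:
    """Average cost per rustle for a caveman who always makes the same call."""
    if assume_tiger:
        # He pays the false-alarm cost whenever the rustle was only the wind.
        return (1 - P_TIGER) * COST_FALSE_ALARM
    # He pays the catastrophic cost whenever the rustle really was a tiger.
    return P_TIGER * COST_MISSED_TIGER


always_run = expected_cost(assume_tiger=True)
always_stay = expected_cost(assume_tiger=False)
print(always_run < always_stay)  # True: running is cheaper on average
```

With these assumed costs, the "wrong 99 times out of 100" policy is still far cheaper on average than the statistically accurate one.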
Human beings are plagued with a multitude of these heuristics: useful untruths that served to pass us through the filter of natural selection. We are essentially running 100,000-year-old software, shaped by evolution, in a technological era. The result is a program that is running wildly outside its bounds, like the trailing ends of a function that no longer maps to the phenomenon it was meant to describe. In a world dominated by truth, data, and hard facts, biases most often serve as impediments to clear reasoning, especially now that the reduced threat of tiger attacks renders their only useful attribute moot.
As it turns out, this is only a half-truth.
It’s not that we don’t experience the positive side of biases; it’s that we take them for granted as fundamental to any intelligent, pattern-seeking agent, instead of seeing them as the evolutionary vestiges that they are.
For example, we have an intuition that all intelligent agents should share our self-preservation, our truth-seeking, our morals, and our sense of fairness. And the further we crank up the dial of intelligence, the more they should adhere to these values. In practice, however, intelligent algorithms not only fail to converge on these values, they actively seek out ways to cheat them.
Agents will exploit bugs in a program to bypass a goal that we set for them, offer psychopathic resolutions to moral dilemmas, and pursue wildly unproductive goals such as paperclip hoarding. Without constant human intervention, they will diverge sharply and unexpectedly from our core values. What machine learning has taught us is that our values aren’t fundamental. They are biases, and they exist in spite of our intelligence, not because of it.
If there is a message to derive from this, it is that human nature primes our minds to think about machine learning in a new light, which in turn allows us to reflect back on human nature. It’s a beautiful recursion that allows for unique insights into biases that were not possible in prior eras when human intelligence was the only game in town.
The Mythical Bias-Free Algorithm
There is no shortage of blog posts on the internet talking about bias in machine learning algorithms. Most of the attention on this topic goes to the negative side of bias, which gives the impression that an ideal bias-free algorithm exists somewhere out there for us to find. I’ll spare you the suspense: no such algorithm exists. This isn’t because the problem is too difficult, or because we’re not smart enough to create one; it’s because bias is an inherent feature of machine learning algorithms.
To look at it another way, think of these algorithms as stereotyping machines. They have a small subset of data that is visible to them, and the rest of reality is completely invisible. Given this subset of data, they must infer the unknown, and the only way to do this is to stereotype.
I have a dataset containing 10 red houses, so the 11th one must be a red house as well, right? Well no, of course not, but this insight comes from a base of common knowledge that a machine learning algorithm is not privy to, especially one whose knowledge consists only of 10 red houses. Given what it knows, its prediction of an 11th red house is entirely logical.
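To make the red house example concrete, here is a minimal sketch of about the simplest "stereotyping machine" possible: a predictor that always answers with the most common label it has seen. The function name fit_majority is a hypothetical helper invented for this illustration, not part of any library.

```python
from collections import Counter


def fit_majority(labels):
    """'Train' by remembering the single most common label in the data."""
    return Counter(labels).most_common(1)[0][0]


# The algorithm's entire view of reality: 10 red houses.
training_houses = ["red"] * 10

prediction = fit_majority(training_houses)
print(prediction)  # red
```

Nothing in the code is broken. Given only red houses, "red" is the best guess available, and the only way to change the answer is to change what the algorithm gets to see.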
The only way around this problem of stereotyping is to give the algorithm a complete view of reality, so that a bias-prone inference could be replaced with a simple dictionary lookup. It’s clear that in most cases this is simply not possible. Complicated systems aren’t amenable to representation at their lowest level of structure, which is why we don’t predict the behavior of tigers from the trajectories of their subatomic particles. And if we are lucky enough to be in a domain simple enough to have a complete description, then we are no longer doing machine learning, so it appears we are stuck with bias.
Fortunately, not all is lost, and there are forms of bias that can be reduced or eliminated altogether. Putting on our problem-solving caps, let’s investigate the forms of bias that we can actually do something about.
To Err is Human
Humans are not perfect data collecting and curating entities. We make mistakes, we fail to see the complete scope of a problem, and we have intuitions and biases that cause us to humanize machine learning algorithms, and assume they will process data in a way similar to us. Until we can dispense with humans as data collectors and opt for a more efficient silicon replacement, we are stuck dealing with the quirks of the human mind, which can be condensed into the following set of biases.
As we go through the list, I’ll also share ways to reduce the unwanted bias introduced by either our data collection techniques or the unexpected behavior of the algorithm.
Sample bias occurs when we collect only a subset of the data required for an algorithm to perform optimally across its domain. Facial recognition is a common example. While collecting data, we may underrepresent a gender or ethnic group in the dataset, which will cause the algorithm to perform suboptimally in these cases. This is often inadvertent, and a result of our inability to predict every situation that will arise in the real world.
How Can We Eliminate This?
The best way to combat sample bias is to try to emulate the production environment as closely as possible. If we are running a facial recognition algorithm, what demographic are we targeting? *Does the training set contain an equal spread of faces across genders and races? Do the lighting conditions in the environment match the training set? Is the camera the same as the one used to capture images in the training set? This exhaustive approach of proactively trying to predict every scenario is clearly not perfect, because we don’t know what we don’t know. Ultimately it is iteration and failure that will teach us the most, and give us the experience to more competently predict what we will encounter out in the real world.
*Coded Bias, on Netflix, examines this very real issue with machine learning bias and how it has affected thousands of individuals. Perhaps due to the documentary, or other important research around understanding and revealing this significant bias, we now see some cities that no longer use facial recognition technology in their law enforcement efforts.
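One way to turn those sample bias questions into a routine check is to report, for each group, both its share of the dataset and the accuracy the algorithm achieves on it. The sketch below assumes evaluation results are available as (group, was_correct) pairs; the group names and counts are invented for illustration.

```python
from collections import defaultdict


def per_group_report(records):
    """records: iterable of (group, was_prediction_correct) pairs.

    Returns {group: (share_of_dataset, accuracy_on_group)}.
    """
    counts = defaultdict(int)
    correct = defaultdict(int)
    for group, ok in records:
        counts[group] += 1
        correct[group] += int(ok)
    total = sum(counts.values())
    return {g: (counts[g] / total, correct[g] / counts[g]) for g in counts}


# Hypothetical evaluation results: group_b is both rarer and served worse.
records = (
    [("group_a", True)] * 76 + [("group_a", False)] * 4
    + [("group_b", True)] * 14 + [("group_b", False)] * 6
)

report = per_group_report(records)
for group, (share, accuracy) in sorted(report.items()):
    print(f"{group}: {share:.0%} of data, {accuracy:.0%} accurate")
```

A single overall accuracy figure would hide the gap between the two groups, which is why the breakdown is worth printing every time the algorithm is re-evaluated.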
Exclusion bias occurs when we deliberately remove data from a dataset that we feel is useless. Our attribution of human traits onto algorithms deludes us into thinking an algorithm will solve problems the way we do, and thus won’t need information that we wouldn’t need. This bias can be extremely counterproductive, given that the reason for implementing the algorithm in the first place is that we cannot solve the problem on our own.
For any reasonably complex algorithm, there will be patterns found in data that will be virtually impossible to map to our own intuitions, and we should never assume that data will be unimportant.
How Can We Eliminate This?
Eliminating exclusion bias is often as simple as not removing data points prematurely. Test the algorithm with the data intact first, and resist the urge to prune the data based on what you would need to solve the problem. Once you gain experience with these algorithms, you will develop new intuitions on how the algorithms solve problems, which will give you more insight into which data points to keep, and which ones to discard.
Recall bias occurs when the qualitative nature of data blurs the boundaries between categories. An example would be a dataset of fruit images labelled as underripe, ripe, or rotten. We can imagine that different people would disagree on these categories, and given a boundary case, a person might even disagree with their own categorization depending on how they felt that day. This problem results mainly from trying to quantize a continuous collection of values.
More informally, fruit ripeness is a continuous value, and by forcing this value into discrete buckets, we are creating opportunities for inconsistent labelling on the boundaries of these categories.
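The boundary effect can be sketched with two hypothetical annotators who bucket the same continuous ripeness score using slightly different cutoffs. The 0 to 100 scale and the cutoff values are assumptions made up for this illustration.

```python
def label(ripeness, ripe_from, rotten_from):
    """Bucket a continuous ripeness score (0-100) into three classes."""
    if ripeness < ripe_from:
        return "underripe"
    if ripeness < rotten_from:
        return "ripe"
    return "rotten"


scores = range(0, 101)  # one hypothetical fruit per ripeness score

# Two annotators with slightly different personal cutoffs.
annotator_a = [label(s, ripe_from=40, rotten_from=80) for s in scores]
annotator_b = [label(s, ripe_from=45, rotten_from=75) for s in scores]

disagreements = [
    s for s, a, b in zip(scores, annotator_a, annotator_b) if a != b
]
agreement = 1 - len(disagreements) / len(scores)

print(disagreements)  # [40, 41, 42, 43, 44, 75, 76, 77, 78, 79]
```

Every disagreement sits in a narrow band around a cutoff, while the prototypical cases in the middle of each bucket are labelled identically.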
How Can We Eliminate This?
This is a difficult bias to remove altogether because it involves taking a continuous value determined by the whims of human aesthetics and trying to quantize it using hard boundaries. Luckily, machine learning algorithms are robust, and can handle some degree of inconsistency without degrading completely. Even if there are some labelling issues with edge cases, the algorithm can still perform well on more prototypical cases.
It’s important to understand the limits of such an algorithm, and ideally, it should be trusted only to classify these prototypical cases.
After all, the algorithm is only as good as the training data, and if the humans supplying this data have trouble differentiating the edge cases, this will be reflected in the functioning of the algorithm.
Association bias occurs when a cultural bias results in data that is accurate, but does not reflect our ideals. For example, an algorithm might place men and women in stereotypical vocations based on what it sees in the training set, despite us knowing that vocation shouldn’t depend on gender. This is a difficult bias to overcome because it depends on an abstract ideal and not on available data.
How Can We Eliminate This?
Association bias is a strange one, because the algorithm isn’t really doing anything wrong. It’s finding patterns in data, which is what it’s supposed to do. Unfortunately, some of those patterns are correlations that humans intuitively recognize as stereotyping. For example, an algorithm tasked with suggesting careers might notice that gender is a good indicator of a person’s career and learn to make suggestions based on this data point alone. This is of course not ideal, and one way to combat it is to remove these sensitive variables so the algorithm is forced to use other data points to make its decision. The problem with this approach is that the missing variables can often be inferred from other data. One example of this is inferring race from zip code.
This is a very difficult bias to remove because it involves purposely hobbling our data to prevent an algorithm from discovering patterns that clearly exist but that we don’t want it to use. The only true resolution is to have data that truly reflects our ideals.
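Before trusting that a sensitive variable is really gone, it can help to measure how well the remaining columns predict it. The sketch below is a rough check under stated assumptions: the zip codes, group labels, and counts are invented, and proxy_accuracy and baseline_accuracy are hypothetical helper names, not functions from any library.

```python
from collections import Counter, defaultdict


def proxy_accuracy(rows):
    """rows: (proxy_value, sensitive_value) pairs.

    Guess the sensitive value as the most common one seen for each proxy
    value, and report how often that guess is right on the same rows.
    """
    by_proxy = defaultdict(Counter)
    for proxy, sensitive in rows:
        by_proxy[proxy][sensitive] += 1
    hits = sum(c.most_common(1)[0][1] for c in by_proxy.values())
    return hits / len(rows)


def baseline_accuracy(rows):
    """Accuracy of always guessing the overall most common sensitive value."""
    overall = Counter(sensitive for _, sensitive in rows)
    return overall.most_common(1)[0][1] / len(rows)


# Hypothetical data: the dropped attribute is strongly tied to zip code.
rows = (
    [("zip_1", "group_x")] * 45 + [("zip_1", "group_y")] * 5
    + [("zip_2", "group_x")] * 5 + [("zip_2", "group_y")] * 45
)

leak = proxy_accuracy(rows)
base = baseline_accuracy(rows)
print(f"guess from zip code: {leak:.0%}, guess without it: {base:.0%}")
```

A large gap between the two figures suggests the "removed" variable is still effectively present in the data, so dropping the column alone has not changed what the algorithm can learn.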
Solving the Solvable Cases of Bias
There is a common theme in the preceding set of biases which can be summarized as a lack of human foresight. In every case, the bias was caused by failing to predict the outcome of the algorithm. One important thing to note is that developing machine learning algorithms isn’t rocket science. I mean this in the very specific sense that there isn’t a blind development process followed by a nail-biting deployment where our dreams are either realized, or dashed upon the rocks. It is a very iterative process which requires algorithms to be run frequently so that we can tune our intuitions and learn from failures.
The goal isn’t to tailor a dataset so perfectly that the algorithm obeys all our expectations upon deployment. It’s to make fast decisions and fail quickly, knowing that the answer we get will most likely be wrong, but wrong in a way that gives us invaluable information on how to proceed.
What to do With the Rest?
What we can do with the remaining set of biases that are either intractable or necessary, is get comfortable with them. Get comfortable sharing space with algorithms that redefine what it means to be intelligent, or as Richard Feynman put it, “algorithms that show the necessary weaknesses of intelligence”.
Be open to disentangling morality, truth seeking, and other human virtues from intelligence, and instead set these virtues aside as quirks of being human, so that we aren’t disappointed when they don’t emerge in our intelligent algorithms automatically.
Develop new intuitions for how to communicate goals to an intelligence that hasn’t spent the last 250,000 years in the pressure cooker of human social evolution, and doesn’t instinctively replace what you “said” with what you “meant to say”.
Embrace the next Copernican revolution, which instead of demoting our centrality in the universe, or our unique creation, demotes human intelligence by making it painfully clear that our survival-based senses and intuitions are all but useless in 100-dimensional space and other equally alien domains.
Most of all, we must embrace our own biases, and see ourselves as imperfect algorithms as well. After all, the process that brought us into being saw no greater purpose than to pass our genes into the next generation, and there’s no reason to expect such a shortsighted process to produce the gold standard of intelligence. If one thing should be taken from this, it’s that the goal should not be to pull machine learning algorithms up to the pedestal of human understanding and force them to explain things in terms of our human intuitions, as this enterprise will almost always serve to hobble them. We instead need to admit this pedestal doesn’t exist, let go of the reins, and allow our algorithms to explore their intended domains unabated, after which we follow them.
The art of machine learning is not only about teaching algorithms, it’s about allowing algorithms to teach us. It’s about redefining our place in the hierarchy of intelligent beings and having the humility and courage to let go of the last thing that made us special. Yes, biases can lead us astray, but they are also the only things leading us towards any place worth going.
This blog's author, Mark Giroux, is an Associate Consultant with our Digital Transformation Practice, and prior to this role, he was a software developer and research analyst in our Innovation Lab.