The Dangers and Consequences of Black Box Algorithms
Algorithm! What is it? You hear this word every day and everywhere you go. It is what YouTubers blame for their lack of success on YouTube. It is what social media influencers blame for their decreased audience reach on Facebook, Twitter, Instagram, and other social media platforms. It is what marketers blame for their lack of success with Google SEO while simultaneously thanking algorithms for their success with targeted digital advertising. It is what Russian operatives are reported to have exploited, through Facebook’s user and advertising platform, in an effort to influence the 2016 U.S. presidential election in favor of Donald Trump. It is what allowed AlphaZero, a system built by DeepMind, to teach itself chess, shogi, and Go through self-play, given only the rules of each game and no human examples. AlphaZero then went on to defeat a world-champion program in each of the three games.
Algorithms affect all aspects of our daily lives, from the phone alarm that wakes you up in the morning to the music and podcasts you listen to, from the smart devices in your home to your social media interactions, from your smart fridge to your web browsing activity, from ride sharing and food delivery services to sending and receiving text messages, phone calls, and emails. Algorithms are woven into the tapestry of our lives.
Algorithms are embedded in our everyday lives whether we like it or not. So, if algorithms are running the world we live in today, what exactly is an algorithm and, most importantly for data scientists, what is a BLACK BOX ALGORITHM?
To answer the question of what a black box algorithm is, we first have to get a general sense of what an algorithm is. So, in this blog post we will explore what an algorithm is, what a black box algorithm is, a real-world example of a black box algorithm that is both positively and negatively impacting our lives, how black box algorithms are created in the first place, and what the dangers and consequences of black box algorithms are. Let’s dive in.
To begin, What is an Algorithm?
The Oxford dictionary defines an algorithm as “a process or set of rules to be followed in calculations or other problem-solving operations, especially by a computer”.
Wikipedia defines an algorithm as an unambiguous specification for performing calculation, data processing, automated reasoning, and other tasks.
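To make these definitions concrete, here is a small sketch of a classic algorithm, Euclid’s method for finding the greatest common divisor of two numbers. It is chosen only as an illustration of “a set of rules to be followed”; the function name gcd is my own.

```python
def gcd(a: int, b: int) -> int:
    """Greatest common divisor of two non-negative integers (Euclid's algorithm)."""
    # One unambiguous rule, repeated until the remainder reaches zero:
    # replace (a, b) with (b, a mod b).
    while b != 0:
        a, b = b, a % b
    return a


print(gcd(48, 18))  # 6
```

Every step here can be read and followed by hand, which is exactly what a black box algorithm does not allow.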
Now that we have a basic understanding of what an algorithm is, let’s move on to black box algorithms.
What is a Black Box Algorithm?
A black box algorithm is a machine learning model where you know what goes in and what comes out, but you don’t know or understand the inner workings of the algorithm or how it produces its results. Black box algorithms are usually complex machine learning models, as opposed to simpler models like logistic regression. If black box algorithms are so difficult to explain, why would a data scientist choose one over a simpler algorithm? Well, the simple answer is accuracy. Typically, black box algorithms produce more accurate results than simple algorithms. Black box algorithms get their name from the mystery of exactly how they work and the difficulty of explaining why their results are what they are.
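To see why a model like logistic regression counts as the easy-to-explain kind, here is a minimal sketch. The feature names, weights, and input values are all hypothetical and invented for illustration; they do not come from any real system. The point is only that each feature’s contribution to the prediction can be read off directly.

```python
import math

# Hypothetical weights for a toy loan-approval model (invented for illustration).
WEIGHTS = {"income": 0.8, "debt": -1.2, "years_employed": 0.5}
BIAS = -0.3


def sigmoid(z: float) -> float:
    """Squash a raw score into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def explain(features: dict) -> tuple:
    """Return the predicted probability and each feature's additive contribution."""
    contributions = {name: WEIGHTS[name] * value for name, value in features.items()}
    score = BIAS + sum(contributions.values())
    return sigmoid(score), contributions


probability, contributions = explain(
    {"income": 1.5, "debt": 0.5, "years_employed": 2.0}
)
print(f"probability = {probability:.3f}")
# List the features from most to least influential for this one prediction.
for name, value in sorted(contributions.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name:>15}: {value:+.2f}")
```

A deep neural network with thousands of interacting weights offers no such per-feature breakdown, and that is what earns it the black box label.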
Example of a Black Box Algorithm
One example of a black box algorithm is COMPAS, whose behavior was examined in a ProPublica investigation. COMPAS is a risk assessment algorithm used in courts across the United States to estimate how likely a defendant is to commit another crime in the future. From the outside looking in, one might think such an algorithm is a blessing to local communities and society at large because it could help the justice system focus on the people most likely to re-offend. But the problem with COMPAS is that, according to ProPublica’s analysis, it has bias baked into it and, because it is a black box algorithm, we can’t easily trace this bias to its source and therefore can’t remove it from the algorithm.
So, the algorithm will continue to function with its bias, classifying people from one group who are not a threat to society as riskier and more dangerous than people from another group who are more likely to re-offend and do in fact re-offend. COMPAS seems like a good idea at first. But when COMPAS classifies low-risk people as high risk and high-risk re-offenders as low risk because of the bias baked into it, a human life can be derailed by a black box algorithm, and there is little we can do about it because we don’t understand the algorithm, how it works, or how it makes decisions.
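Even when the inside of an algorithm is closed to us, its outputs can still be compared against what actually happened. The sketch below shows one way to do that: compute, per group, how often people who did not re-offend were still flagged as high risk. The records are made up for illustration only and are not real COMPAS data; the group labels “A” and “B” and the function name are hypothetical.

```python
from collections import defaultdict

# Made-up records for illustration only: (group, predicted_high_risk, reoffended).
# These are NOT real COMPAS data.
RECORDS = [
    ("A", True, False), ("A", True, False), ("A", False, False), ("A", True, True),
    ("B", False, False), ("B", False, False), ("B", True, False), ("B", False, True),
]


def false_positive_rates(records):
    """Per group: share of people who did NOT re-offend but were flagged high risk."""
    flagged = defaultdict(int)
    total = defaultdict(int)
    for group, predicted_high_risk, reoffended in records:
        if not reoffended:
            total[group] += 1
            flagged[group] += int(predicted_high_risk)
    return {group: flagged[group] / total[group] for group in total}


rates = false_positive_rates(RECORDS)
for group, rate in sorted(rates.items()):
    print(f"group {group}: false positive rate = {rate:.2f}")
```

A large gap between groups is a warning sign that the model treats them differently. The same comparison can be made for people who did re-offend but were rated low risk. Note that a check like this can reveal that a bias exists, but it cannot say where inside the model the bias comes from.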
How Black Box Algorithms are Created!
This brings us to the question: how are black box algorithms created in the first place? According to a research paper in the Harvard Journal of Law and Technology, machine learning algorithms are responsible for black box algorithms. The paper goes on to name two types of algorithms that help explain the origins of black box algorithms. It says…
“The first algorithm…is the deep neural network which often involves the use of thousands of artificial neurons to learn from and process data. The complexity of these countless neurons and their interconnections makes it difficult, if not impossible, to determine precisely how decisions or predictions are being made.
The second algorithm is the support vector machine which is used to illustrate how shallow (i.e. less complex) algorithms can also create a black-box problem because they process and optimize numerous variables at once by finding geometric patterns in higher-dimensional, mathematically defined spaces. This high “dimensionality” prevents humans from visualizing how the AI relying on the support vector machine is making its decisions or from predicting how the AI will treat a new data.
One possible reason AI may be a black box to humans is that it relies on machine-learning algorithms that internalize data in ways that are not easily audited or understood by humans.
Here are two examples to illustrate this. First, a lack of transparency may arise from the complexity of the algorithm’s structure, such as with a deep neural network, which consists of thousands of artificial neurons working together in a diffuse way to solve a problem. This reason for AI being a black box is referred to as “complexity.” Second, the lack of transparency may arise because the AI is using a machine-learning algorithm that relies on geometric relationships that humans cannot visualize, such as with support vector machines. This reason for AI being a black box is referred to as “dimensionality.””
Based on this research in the Harvard Journal of Law and Technology, the black box problem can be traced to machine learning algorithms, with deep neural networks and support vector machines as the two illustrative examples. These systems present themselves as black boxes for two main reasons: the “complexity” and the “dimensionality” of the algorithms. Creating things we can’t understand is a problem, which is why we must find a solution to black box algorithms.
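Both reasons can be illustrated with a few lines of code. This is a rough sketch under my own assumptions, not anything taken from the paper: the layer sizes are hypothetical, and the lift function is a hand-picked mapping standing in for the kind of higher-dimensional space a support vector machine works in, not an actual support vector machine.

```python
def parameter_count(layer_sizes):
    """Weights plus biases in a fully connected network with the given layer sizes."""
    return sum(
        n_in * n_out + n_out for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


def lift(point):
    """Map a 2-D point (x, y) to 3-D by adding the product x * y as a third coordinate."""
    x, y = point
    return (x, y, x * y)


# "Complexity": even a modest, hypothetical network has a huge number of parameters.
print(parameter_count([784, 512, 512, 10]))

# "Dimensionality": XOR-style toy data that no single straight line separates in 2-D.
POINTS = {(1, 1): +1, (-1, -1): +1, (1, -1): -1, (-1, 1): -1}

# After lifting to 3-D, the sign of the third coordinate alone separates the classes.
separable = all((lift(p)[2] > 0) == (label > 0) for p, label in POINTS.items())
print(separable)
```

With three dimensions we can still picture what is going on. Real models may work with hundreds or thousands of dimensions, and that is where human visualization gives out.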
The Dangers and Consequences of Black Box Algorithms
Now you might be thinking, what’s the big deal with black box algorithms? Why are we humans so desperate to be able to explain them? Some people might say that as long as it works, there is no reason to explain it. So, why then do we seek to understand black box algorithms and how the gears turn to produce the outputs we see? There are several reasons why being able to explain black box algorithms is a big deal. Some of these reasons include:
- Being able to communicate with non-technical people is critical for a data scientist in a business environment. If a data scientist doesn’t understand the model they are working with, they can’t explain it to non-technical people. And if they can’t explain to business leaders how they got their results and reached their conclusions, those results might be rejected even if they are accurate. Humans have a natural tendency to reject anything they don’t understand. A data scientist who doesn’t understand the model also can’t explain it to fellow data scientists.
- You can’t document something you don’t understand. Proper documentation is important in programming. Documentation ensures consistency, helps you remember the work you did, and helps you explain your work to others. If you don’t understand the inner workings of your algorithm, then you can’t document it.
- If you don’t understand how the algorithm works, then you won’t know how to tweak it to improve it and thus improve your results.
- If you don’t understand the inner workings of the algorithm, the why and how behind the outcomes it produces, then you have no way of knowing whether the algorithm is being consistent. You won’t know for a fact whether the process that led to the last result is the same as the process that led to the current result.
- If you don’t understand the algorithm, then you won’t be able to check for, spot, and correct the biases that might be built into it. You won’t even know that the algorithm has biases you should compensate or correct for.
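Some of the concerns in this list can at least be checked from the outside, even if the inside stays closed. The sketch below nudges one input at a time to see which way the output moves, and calls the model repeatedly to check that the same input gives the same output. Everything in it is hypothetical: opaque_model is a stand-in whose formula we pretend not to know, and the input values are invented.

```python
import math


def opaque_model(features):
    """Hypothetical stand-in for a model whose internals we pretend not to know."""
    income, debt, age = features
    z = 0.6 * income - 0.9 * debt + 0.02 * age + 0.3 * income * debt
    return 1.0 / (1.0 + math.exp(-z))


def probe(model, baseline, step=0.1):
    """Nudge one input at a time and record how much the output changes."""
    base_output = model(baseline)
    effects = []
    for i in range(len(baseline)):
        nudged = list(baseline)
        nudged[i] += step
        effects.append(model(nudged) - base_output)
    return effects


def is_consistent(model, inputs, runs=3):
    """Check that repeated calls with the same input give the same output."""
    first = model(inputs)
    return all(model(inputs) == first for _ in range(runs))


baseline = [1.0, 1.0, 30.0]  # income, debt, age (made-up values)
effects = probe(opaque_model, baseline)
for name, effect in zip(["income", "debt", "age"], effects):
    print(f"{name:>6}: {effect:+.4f}")
print(is_consistent(opaque_model, baseline))
```

A probe like this only describes how the model behaves around one input. It does not explain why the model behaves that way, which is the gap the list above is about.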
At the end of the day, it is no secret that black box algorithms are powerful, but the mystery of how they work still nags at the human mind. I have no doubt that one day humans will be able to work out how every black box algorithm works. Until then, we will continue to love their results while simultaneously despising their mysterious nature.