The Black Box Fallacy
João Pires da Cruz, Closer Consulting
"Is it explainable?!" Each time someone decides to use an Artificial Intelligence process to tackle an operational challenge, the question pops up immediately. Regulators and auditors pressure the top management of companies not to adopt AI processes that cannot be explained directly from the resulting mathematical model. In other words, AI should not be too "I". We will argue that genuinely intelligent processes need to take direct explainability out of their goals, and that it usually is out of the goals when we speak about human intelligence.
Never in sci-fi movies is our planet invaded by machines less intelligent than we are. We have never seen a film about an invasion of toasters or refrigerators. The invaders are always more intelligent than we are, which is essential for the hero, because fighting a toaster would not be very glamorous.
The phenomenon reflects a dream/nightmare we have carried since someone created the first mechanism: a machine that not only thinks but thinks better than we do. Perhaps for that reason, the "fashion" of Artificial Intelligence has oscillated over the years, alternating euphoric phases with disappointed ones. Nevertheless, progress goes on, and the latest euphoria has produced a remarkable achievement at the computer-science level: the building of deep-learning systems. Deep learning is not exactly a new concept, but it was a challenging one, given the computing resources needed to reach a practical application that would make it economically feasible.
The idea behind deep learning is easier to understand if we start from the "shallow" one. Let us imagine that some variable y depends on two other variables, x1 and x2, in the form y = w1*x1 + w2*x2. We can build a neuron, which is, in simple terms, a mathematical unit that takes empirical values of y, x1 and x2 and, by minimizing the error between the observed y and w1*x1 + w2*x2, finds the values of w1 and w2. In other words, we statistically build a representation of y in the space of x1 and x2.
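The shallow neuron above fits in a few lines of code. A minimal sketch, with toy data invented purely for illustration (NumPy, plain gradient descent on the squared error):

```python
# A "shallow" neuron: given observations of y, x1 and x2, find the
# weights w1, w2 that minimize the error between y and w1*x1 + w2*x2.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(100, 2))        # empirical values of (x1, x2)
true_w = np.array([2.0, -3.0])       # the relationship we hope to recover
y = x @ true_w                       # observed y (noise-free for clarity)

w = np.zeros(2)                      # unknown weights, start at zero
for _ in range(300):                 # gradient descent on the squared error
    error = x @ w - y
    w -= 0.1 * x.T @ error / len(y)

print(w)                             # w ends up close to [2.0, -3.0]
```

The loop is nothing more than the "minimizing the error" step of the text: each iteration nudges w1 and w2 against the gradient of the squared error until the statistical representation of y is recovered.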
The neuron becomes interesting when x1 and x2 are not real numbers but generic features, like blond or brunette, and y is, for example, being pretty. It still works! We can represent "being pretty" in the space of "color of hair." This is called "distributed representation," and it is the most powerful concept in machine learning. However, in this example, we are still handing the machine the set of features we want it to learn from.
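The same machinery handles generic features once we encode them as coordinates. A hypothetical sketch of that distributed representation, with hair colors and "pretty" labels invented purely for illustration:

```python
# Represent "being pretty" in the space of "color of hair":
# each categorical value becomes one coordinate (a one-hot vector),
# and the neuron learns one weight per feature.
import numpy as np

hair = ["blond", "brunette", "blond", "brunette"]   # toy observations
pretty = np.array([1.0, 0.0, 1.0, 0.0])             # hypothetical labels

colors = sorted(set(hair))                          # the feature space
onehot = np.array([[c == h for c in colors] for h in hair], dtype=float)

# Least-squares fit of one weight per hair color
w, *_ = np.linalg.lstsq(onehot, pretty, rcond=None)
print(dict(zip(colors, w)))                         # weight per color
```

Nothing about the neuron changed; only the meaning of the axes did, which is the whole point of a distributed representation.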
We can go even deeper: we can ask the machine to teach itself the features with which it can represent the result y. That is the immense power of deep learning: starting from data, building several feature spaces in the middle, and representing the result in the final one. In our example: reading the pixels, extracting the edges and colors, detecting hair and, finally, representing the value of "being pretty." As one can imagine, if a single neuron is a computationally heavy process, a deep-learning process plays in another championship of computer consumption, and that is one of the reasons we spent so many years waiting for it.
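A hidden layer makes that difference concrete. The classic XOR function cannot be written as w1*x1 + w2*x2, but a network that builds its own intermediate feature space learns it easily. A minimal sketch with hand-written backpropagation (architecture, seed, and data chosen purely for illustration):

```python
# A small "deep" step: one hidden layer that builds its own feature
# space. XOR is not representable as w1*x1 + w2*x2 alone, but it is
# representable in a learned hidden feature space.
import numpy as np

rng = np.random.default_rng(1)
x = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [0.]])          # XOR: not linear in x1, x2

w1 = rng.normal(size=(2, 8)); b1 = np.zeros(8)  # input -> hidden features
w2 = rng.normal(size=(8, 1)); b2 = np.zeros(1)  # hidden features -> result

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for _ in range(10000):                          # plain gradient descent
    h = sigmoid(x @ w1 + b1)                    # the learned feature space
    out = sigmoid(h @ w2 + b2)
    g_out = (out - y) * out * (1 - out)         # backpropagate the error
    g_h = g_out @ w2.T * h * (1 - h)
    w2 -= h.T @ g_out; b2 -= g_out.sum(axis=0)
    w1 -= x.T @ g_h;   b1 -= g_h.sum(axis=0)

print(out.round().ravel())                      # typically recovers 0 1 1 0
```

The columns of w1 are exactly the "hidden" features the article talks about: nobody hands them to the machine, and nobody names them afterwards; they exist only because they make the final representation possible.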
That "internal" construction of the feature space, which we call "hidden," is a direct result of the data; but because it stays between the data and the result, it is also a problem. A problem called "explainability." Humans like simple rules, rules that say y = w1*x1 + w2*x2 and map data to the result unequivocally. And some applications of machine learning are sensitive, like credit applications, where some reason must be extracted from a decision, even when a computer makes that decision. So regulators and auditors oppose the use of machine-learning processes because, for them, these processes lack explainability. They argue that businesses cannot use a black box. To counter this argument, we will need a bit more math.
Mathematically speaking, deep learning brought us much more than a proper way of representing the result in a feature space. It brought us a tool for handling correlated objects with statistics. When we decide to do statistics over something, say a coin toss, we always assume that heads and tails are independent. Even if we admit that the coin can land on its edge, we believe that outcome is independent of getting heads or tails. Interestingly, if everything were independent of everything else, we would not have a universe; but at least we could solve every scientific problem using statistics. However, the scientific problems left to solve are precisely those where the objects under study are not independent. On the contrary, their nature hangs on the correlations between them, like economic agents, words in a text, or entangled particles in a quantum system.
That last example may look like an exaggeration, but it is there for a reason. An essential problem of physics and an essential problem of Artificial Intelligence have been following a convergent path over the last decade. Physicists call it the "n-body problem": the physics of particles so correlated that we cannot explain the phenomena without finding a way to represent each particle in the space of the other particles, which is precisely to build a distributed representation of each particle.
Why is this important? Because physicists do not believe in explainability. Either we can explain something or we cannot; there is no such thing as an intrinsically unexplainable phenomenon. Physicists know that if they do not bring some "physics" to the problem, it is not the computer that will explain it. The convergence between n-body physics and Artificial Intelligence is the most promising signal for the evolution of a thinking machine, mostly because it brings several insights into opening what people believe to be black boxes.
Managers in a company understand this perfectly. If we think about our own decision processes, most of the time we cannot draw a straight line between the data and a management decision. In fact, if we could do it, we could hand the management tasks over to computers. We cannot, because the decision process relies on correlations between the data and several factors present in the manager's mind, not in the data.
If one asks a computer to decide on a credit application just by reading financial data, one probably deserves an unexplainable model. When humans made that decision, nobody asked them to base it only on numeric features; if that were plausible, we could have exchanged rating agencies for computers long ago. Our intelligent brains can build representations of those numbers and combine them with other, less intelligible features, many of them pretty fuzzy, to turn the whole set into a decision. Our brain seems even darker than the black box of the deep-learning model.
To make the model explainable, we need to follow a path similar to that of theoretical physicists: bring knowledge of the fundamental mechanisms into the model's hidden features. Even if those mechanisms are fuzzy, as in quantum mechanics, where nothing is certain except the results, among which we can count the machine on which one reads this text. What people believe to be unexplainable is the product of correlation between the objects, which is unavoidable in systems that live on correlation. The human brain can easily map things it has never seen onto things it already knows; it can build connections between objects and events that we cannot easily state with simple statistics. We need to include those relationships in the features.
What people call a black box is the normal functioning of a brain or, at least, the closest thing to one we have found so far. We only need a bit more math and physics to follow the same path taken in the quantum n-body problem. It seems evident that trying to simplify a system of correlated objects just to compensate for the regulators' shortage of knowledge does not make sense. Perhaps we should exchange the regulators for computers; at least computers would understand the role of the hidden features.
With all this, we do not mean that we are close to substituting human brains. What we said about quantum physicists implies the opposite: we still need the person who maps the hidden features onto quantum states or economic agents. The important thing is that what people believe to be a "black box" is only how nature reveals itself. There are no such things as black boxes, only things we need to study further in order to understand.
Managers getting into Artificial Intelligence should not surrender to simple explainability, because it means limiting their goals to elementary tasks. The value of an AI application lies, naturally, beyond those simple tasks: it lies in building a representation of our business environment. Implementing AI means developing and improving the company's capability to understand those representations, because that leads to an indirect but solid Artificial Intelligence. And, with that, to the dream of intelligent machines that perform better than toasters.
Do you want to know more? Schedule a meeting with us here.
AI is not about machines. It is about humans. We will be glad to share our experience and assist you on that journey.