As the technology advances, we might soon cross some threshold beyond which using AI requires a leap of faith. Sure, we humans can’t always truly explain our thought processes either—but we find ways to intuitively trust and gauge people. Will that also be possible with machines that think and make decisions differently from the way a human would? We’ve never before built machines that operate in ways their creators don’t understand. How well can we expect to communicate—and get along with—intelligent machines that could be unpredictable and inscrutable?
The result is that modern machine learning offers a choice among oracles: Would we like to know what will happen with high accuracy, or why something will happen, at the expense of accuracy? The “why” helps us strategize, adapt, and know when our model is about to break. The “what” helps us act appropriately in the immediate future.
It can be a difficult choice to make. But some researchers hope to eliminate the need to choose—to allow us to have our many-layered cake, and understand it, too. Surprisingly, some of the most promising avenues of research treat neural networks as experimental objects—after the fashion of the biological science that inspired them to begin with—rather than analytical, purely mathematical objects.
And yet the rise of machine learning makes it more difficult for us to carve out a special place for ourselves. If you believe, with Searle, that there is something special about human “insight,” you can draw a clear line that separates the human from the automated. If you agree with Searle’s antagonists, you can’t. It is understandable why so many people cling fast to the former view. At a 2015 M.I.T. conference about the roots of artificial intelligence, Noam Chomsky was asked what he thought of machine learning. He pooh-poohed the whole enterprise as mere statistical prediction, a glorified weather forecast. Even if neural translation attained perfect functionality, it would reveal nothing profound about the underlying nature of language. It could never tell you if a pronoun took the dative or the accusative case. This kind of prediction makes for a good tool to accomplish our ends, but it doesn’t succeed by the standards of furthering our understanding of why things happen the way they do. A machine can already detect tumors in medical scans better than human radiologists, but the machine can’t tell you what’s causing the cancer.
Given that thoughts are a jumble of fragments and pieces, it occurred to me that a recorded transcript of those jumbled pieces actually might not be very illuminating. It might not even be intelligible. Meanwhile the (admittedly much more arduous) process of writing down my thoughts had been surprisingly enlightening. In one swoop, my brain was capable of detecting the patchy notions swirling in my mind, filling in their gaps to make them whole—that is, adding the stripes—and then evaluating them for their credibility and value, or lack thereof.
In other words, my own brain was a brain decoder. It required a lot more effort than merely using a digital recorder as I’d imagined, but it was also a whole lot more sophisticated—say, a trillion times more—than anything scientists have conceived of inventing.
In science, the question of when to believe is a deep and ancient problem. There is no universal answer, and evaluating the merits of any potential discovery always includes considering the prior beliefs of the people involved. There is no way around this.
• • •
This was the genius of the fake signal injection: Whatever the prior belief of an individual scientist might be, it gave him or her reason to doubt it. A scientist who believed that the current generation of instruments was simply not up to the task would have to allow for the possibility that it was. A scientist tempted to elevate a signal because of the benefits of a real detection would have to temper his or her enthusiasm to avoid making a false claim. The fake injection bugaboo forced us to keep an open mind, apply skepticism and reason, and examine the evidence at face value.
At a dinner I attended some years ago, the distinguished differential geometer Eugenio Calabi volunteered to me his tongue-in-cheek distinction between pure and applied mathematicians. A pure mathematician, when stuck on the problem under study, often decides to narrow the problem further and so avoid the obstruction. An applied mathematician interprets being stuck as an indication that it is time to learn more mathematics and find better tools.
I have always loved this point of view; it explains how applied mathematicians will always need to make use of the new concepts and structures that are constantly being developed in more foundational mathematics. This is particularly evident today in the ongoing effort to understand “big data” — data sets that are too large or complex to be understood using traditional data-processing techniques.
Our current mathematical understanding of many techniques that are central to the ongoing big-data revolution is inadequate, at best. Consider the simplest case, that of supervised learning, which has been used by companies such as Google, Facebook and Apple to create voice- or image-recognition technologies with a near-human level of accuracy. These systems start with a massive corpus of training samples — millions or billions of images or voice recordings — which are used to train a deep neural network to spot statistical regularities. As in other areas of machine learning, the hope is that computers can churn through enough data to “learn” the task: Instead of being programmed with the detailed steps necessary for the decision process, the computers follow algorithms that gradually lead them to focus on the relevant patterns.
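To make the idea of "learning" from labeled samples concrete, here is a minimal sketch in Python. It is not how any of the companies mentioned build their systems: it swaps the deep neural network for a single logistic unit and the massive corpus for six invented data points, and the names `train_logistic` and `predict` are hypothetical. What it does share with the real thing is the shape of the process: nobody writes down the decision rule, and the weights are nudged step by step until they reflect a regularity in the examples.

```python
import math
import random


def train_logistic(samples, labels, epochs=2000, lr=0.1, seed=0):
    """Fit a single logistic unit by stochastic gradient descent.

    Illustrative stand-in for a deep network: same principle
    (adjust weights to reduce error on labeled examples), far smaller scale.
    """
    rng = random.Random(seed)
    weights = [rng.uniform(-0.01, 0.01) for _ in range(len(samples[0]))]
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            z = sum(w * xi for w, xi in zip(weights, x)) + bias
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of label 1
            err = p - y                      # gradient of the log loss w.r.t. z
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def predict(weights, bias, x):
    """Return 1 if the learned score is positive, else 0."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if z > 0 else 0


# Invented toy data (an assumption for illustration):
# the label is 1 when the first feature is larger than the second.
samples = [(0.9, 0.1), (0.8, 0.3), (0.7, 0.2),
           (0.2, 0.9), (0.1, 0.6), (0.3, 0.8)]
labels = [1, 1, 1, 0, 0, 0]

weights, bias = train_logistic(samples, labels)
predictions = [predict(weights, bias, x) for x in samples]
```

The rule "first feature larger than second" never appears in the training code; it is recovered from the examples alone. In a model this small the learned weights can still be read off and interpreted, which is exactly what stops being true as the number of weights grows into the millions.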
In Shakespeare’s Julius Caesar, a soothsayer warns Caesar to “beware the ides of March.” The recommendation was perfectly clear: Caesar had better watch out. Yet at the same time it was completely incomprehensible. Watch out for what? Why? Caesar, frustrated with the mysterious message, dismissed the soothsayer, declaring, “He is a dreamer; let us leave him.” Indeed, the ides of March turned out to be a bad day for the ruler. The problem was that the soothsayer provided incomplete information. And there was no clue as to what was missing or how important that information was.
Like Shakespeare’s soothsayer, algorithms often can predict the future with great accuracy but tell you neither what will cause an event nor why. An algorithm can read through every New York Times article and tell you which is most likely to be shared on Twitter without necessarily explaining why people will be moved to tweet about it. An algorithm can tell you which employees are most likely to succeed without identifying which attributes are most important for success.
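A small sketch can show what a prediction without an explanation looks like in practice. The example below uses a nearest-neighbor vote, chosen here only because it is short; the text does not say which algorithms are meant. The employee records, the two attributes, and the function name `knn_predict` are all invented for illustration.

```python
from collections import Counter


def knn_predict(train, query, k=3):
    """Predict a label by majority vote among the k closest training records.

    The function returns only the winning label. Nothing in it says which
    attribute drove the outcome, or why.
    """
    def squared_distance(record):
        features, _ = record
        return sum((a - b) ** 2 for a, b in zip(features, query))

    nearest = sorted(train, key=squared_distance)[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical records: (years of experience, training score) -> outcome.
train = [
    ((1.0, 2.0), "struggles"),
    ((1.5, 1.0), "struggles"),
    ((2.0, 2.5), "struggles"),
    ((7.0, 8.0), "succeeds"),
    ((8.0, 7.5), "succeeds"),
    ((9.0, 9.0), "succeeds"),
]

strong_candidate = knn_predict(train, (8.0, 8.0))
weak_candidate = knn_predict(train, (1.0, 1.0))
```

The answer comes back as a bare label, much like the soothsayer's warning. Was it the experience or the training score that mattered? The prediction is the same either way, and the method offers no way to tell.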