The Human-AI Alignment Problem

We’re now deep into the AI era, where every week brings another feature or task that AI can accomplish. But given how far down the road we already are, it’s all the more essential to zoom out and ask bigger questions about where we’re headed, how to get the best out of this technology as it evolves, and, indeed, how to get the best out of ourselves as we co-evolve.

There was a revealing moment recently when Sam Altman appeared on Tucker Carlson’s podcast. Carlson pressed Altman on the moral foundations of ChatGPT, making the case that the technology has a kind of baseline religious or spiritual component, since we assume it’s more powerful than humans and we look to it for guidance. Altman replied that to him there’s nothing spiritual about it. “So if it’s nothing more than a machine and just the product of its inputs,” said Carlson, “then the two obvious questions are: What are the inputs? What’s the moral framework that’s been put into the technology?”

Altman then referred to the “model spec,” the set of instructions an AI model is given that governs its behavior. For ChatGPT, he said, that means training it on the “collective experience, knowledge, learnings of humanity.” But, he added, “then we do have to align it to behave one way or another.”

And that, of course, leads us to the famous alignment problem: the idea that to guard against the existential risk of AI taking over, we need to align AI with human values. The concept goes back to 1960 and the cybernetics pioneer Norbert Wiener, who described it this way: “If we use, to achieve our purposes, a mechanical agency with whose operation we cannot efficiently interfere… we had better be quite sure that the purpose put into the machine is the purpose which we really desire.”

But there’s actually a larger alignment problem, one that goes much farther back than 1960.
To align AI with human values, we ourselves need to be clear about the universal values we subscribe to. What are our inputs? What’s our model spec? What are we training ourselves on to be able to lead meaningful lives?

These are the questions we need to answer before we decide what inputs we want AI to draw on. Even if we could perfectly align AI with where humanity is right now, the result would be suboptimal. So now is the time to clarify our values, before we build a technology meant to incorporate and reflect them.

Because right now we’re experiencing a profound misalignment. In our modern world, we’ve lost our connection to the spiritual foundation that our civilizations, both Western and Eastern, were built on. We’ve been living in its afterglow for centuries, but now even the afterglow has dimmed, and we’re unmoored and untethered.

That foundation began to slowly crumble with the Enlightenment and the Industrial Revolution, but we kept drawing on its eternal truths. We voiced them, and even believed in them, less than before, but we were still guided by them. Now the connection has been severed. To train AI to align with human values, we first need to excavate those values and reconnect with them.

In his new book, Against the Machine: On the Unmaking of Humanity, Paul Kingsnorth explores how every culture is built on a sacred order. “This does not, of course, need to be a Christian order,” he writes. “It could be Islamic, Hindu, or Taoist.” The Enlightenment disconnected us from that sacred order, but, as Kingsnorth puts it, “society did not see it because the monuments to the old sacred order were still standing, like Roman statues after the Empire’s fall.” What we do see is the price societies pay when that order falls: “upheaval at every level of society, from the level of politics right down to the level of the soul.”

Isn’t this exactly what is happening right now?
“It would explain the strange, tense, shattering, and frustrating tenor of the times,” Kingsnorth writes.

In his conversation with Carlson, Altman talked about AI being trained on the “collective experience” of humanity. But are we ourselves actually accessing the full collective experience of being human?

As we build a transformational technology that’s going to change everything about our lives, we need to ensure that we train it on the fundamental and unchanging values that define us as humans.