Last fall I was at the AI conference hosted by the CPG in Bangkok, sitting next to Erik Vermeulen of Tilburg University and Philips Electronics, when a speaker brought up Nick Bostrom’s paperclip maximizer argument. In short, Bostrom has suggested that if we build a really smart machine without carefully ensuring that its values align with ours, it could run amok even if its goals seem harmless. In this case, what if an AI designed to maximize the manufacture of paperclips began devising ways to turn all of the world, including humanity, into paperclips (while simultaneously thwarting anything that might interfere with that goal, such as being turned off)? To Bostrom’s credit, his real interest is an important issue: he argues that we need top-level friendly-to-human-beings programming in AI. The paperclip maximizer, however, is a poor entry point to that position. I suspect he did not realize quite how widespread it would become in academic conversations about AI. Alas, the conference presentation was neither the first nor the most recent time I’ve been faced with the paperclip maximizer, which is routinely used to describe the potential horrors of an AI future (I just ran across it again this week).
Erik and I immediately realized that while the Paperclip Singularity may not offer a particularly valuable analysis, there could be a future graphic novel in it. Here is my concept art:
The real problem isn’t, as Bostrom seems to think, that we might get value alignment wrong. The problem, as Charlie Stross illustrates in his fantastically imaginative novel Accelerando, is that we might get value alignment right. The majority of AI funding comes from corporate overlords and the military. In a world where the machines know how to violently dominate people and squeeze out short-term profits (ignoring deferred costs), we’ll be in real trouble.
Compared to that, being turned into a paperclip starts sounding pretty good.