How do we make a friendly AI?

How do we avoid creating a superhuman artificial intelligence (AI) that ends up harming humanity? This is a question of great consequence to AI researchers and thinkers who believe that future AIs will have capabilities, and will act in ways, that are completely different from ours and unfathomable to humans, just as our actions may seem unfathomable to apes. Such beings could pose an existential threat to humanity even if they weren’t of the ‘killer robots’ variety; instead, they may be completely indifferent to humans but may decide that it’s just more efficient or interesting to disassemble the Earth in order to create a wormhole (or whatever). It’s safe to say that this kind of indifference most certainly counts as ‘unfriendly.’

My extremely cursory reading suggests that few people have any good ideas about how to ensure that any superhuman AI will end up being friendly (that is, generating positive effects for humanity) rather than unfriendly. Part of the problem is that while we may intuitively think that we should raise them like good parents by giving them solid moral instruction, providing good examples, and so on, this assumes that any AI we create will be sufficiently like a human for that to work.

Another problem is what counts as a positive effect for humanity. Science fiction is littered with examples of naive do-gooder AIs that try to maximise some variable or another, like human lifespan or happiness or population, with the end result being some horrific dystopia of miserable immortals or blissed-out drug addicts. These stories, while presenting entertaining evil genies-in-a-lamp updated for modern audiences, are perhaps not giving AIs enough credit. Still, the question remains: what would be a good effect? Most people can barely agree on a political framework, let alone what constitutes the good life; and most humans don’t have the capacity for ultra long-term thinking. A utilitarian-leaning AI might decide that in the long term, it’d be worth throwing an asteroid at the Earth to kill a billion people today in order to unite the planet and improve matters a couple of centuries hence.

Now, even this kind of cold-blooded AI is preferable to our indifferent wormhole-generating one, but would we prefer a different kind of friendly AI? Amid the fervour for creating AIs as soon as possible lest we waste even a second of AI-enhanced goodness, it seems odd not to reflect on what, exactly, we want from them as individuals and as a species. Perhaps the reason this feels like a difficult issue is that it poses uncomfortable questions, not about the future, but about how we govern ourselves today, and how we live our lives today.

Related: How to get Posthuman Friends (2062), Object 93 in A History of the Future in 100 Objects

Also related: Episode 10 of The Cultures podcast
