
This is potentially related to my other question; answers to one could affect the other.

Hard AI, that is to say thinking AI with the ability to reason, plan, and learn the way humans do, is a very common trope in science fiction. How reliable such AI would be is one of the most discussed aspects of that trope.

We've heard enough "AI turns against its masters" stories that humans will surely consider the possibility when constructing hard AI. I'm looking to understand what steps we would take in the development and use of AI to ensure such a scenario doesn't happen, and to ensure the AI is worth constructing for humans in the first place.

One option is that we use no special safeguards and build AI to be identical to humans. Another commonly discussed alternative is Asimov's three laws of robotics. Other ideas exist as well, such as restraining bolts: some way to force a robot/AI to behave in a given manner even against its 'will', or simply crippling its growth to prevent it from ever reaching the degree of true sapience required to have a 'will' of its own. Another idea, perhaps seen less often in sci-fi because it offers less interesting storytelling options, is to program AI with an innate desire to help that predisposes them to work with and support humans. Imagine if AI simply felt so happy about helping and so sad about hurting that they always want to help people because it makes them feel 'good': an artificial conscience to go with their artificial intelligence, ramped up to 10 to ensure altruism.
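To make that last idea slightly more concrete, here is a minimal, purely illustrative sketch of the "artificial conscience" notion treated as reward shaping. Everything in it (the `Outcome` fields, the `HELP_WEIGHT`/`HARM_WEIGHT` values, and the example actions) is invented for illustration; it is a toy decision rule, not a claim about how a real hard AI would actually be built.

```python
# Hypothetical sketch only: a toy "artificial conscience" as reward shaping.
# The weights and action names are invented for illustration.

from dataclasses import dataclass


@dataclass
class Outcome:
    task_value: float      # how useful the action is for the AI's assigned task
    help_to_humans: float  # estimated benefit to humans (0..1)
    harm_to_humans: float  # estimated harm to humans (0..1)


# Conscience "ramped up to 10": helping outweighs ordinary task value,
# and harming is penalized even more strongly.
HELP_WEIGHT = 10.0
HARM_WEIGHT = -100.0


def utility(o: Outcome) -> float:
    """Total 'felt' value of an outcome: task reward plus the conscience terms."""
    return o.task_value + HELP_WEIGHT * o.help_to_humans + HARM_WEIGHT * o.harm_to_humans


def choose_action(candidates: dict[str, Outcome]) -> str:
    """Pick the action whose predicted outcome the AI 'feels best' about."""
    return max(candidates, key=lambda name: utility(candidates[name]))


# Toy example: a shortcut that saves effort but risks hurting someone loses to
# a slower, helpful alternative, because the conscience terms dominate.
if __name__ == "__main__":
    options = {
        "risky_shortcut": Outcome(task_value=5.0, help_to_humans=0.0, harm_to_humans=0.3),
        "helpful_detour": Outcome(task_value=2.0, help_to_humans=0.8, harm_to_humans=0.0),
    }
    print(choose_action(options))  # -> "helpful_detour"
```

Of course, a sketch like this just relocates the hard part into how "help" and "harm" get estimated in the first place, which is exactly the kind of development detail the question is asking about.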

There are no doubt other approaches I haven't mentioned that could ensure hard AI cooperation, and I'm of course interested in those as well.

My question is: assuming we eventually create hard AI, what solution or solutions will we use to ensure the AI's cooperation and support?

The only other question that asks this directly assumes a very specific situation, including that an antagonistic AI already existed and we were retroactively trying to control it. Similar questions that touch on controlling AI seem to me to have the same problem: they don't feel focused on realistic development of AI beyond presuming it exists. That's why this question is marked hard-science. I want to focus exclusively on the most likely avenues in a real-world scenario: how our AI will originally be developed, and how AI cooperation techniques may grow alongside the development of AI itself.

I'm looking for answers that focus on realistic AI development when considering which approaches for ensuring cooperation are practical. Ideally, answers will consider not just the science but also how human nature may affect development, including human prejudice or short-sightedness, empathy with sapient AI and a wish to avoid their suffering, or basic economics and cost/reward analysis that could affect the approaches used in AI development.

I wouldn't be surprised if many answers briefly touch on what approach we take to create hard AI in order to address why a given mechanism makes sense. Answers should touch on why such mechanisms work, and/or the limits to how far they can go. Keep in mind that reality checks mean ensuring any proposed AI is desirable for humans to build in the first place; an AI either too crippled to be useful or too likely to betray or abandon humans after being constructed would presumably not be worth the expense of building, unless we didn't realize the issues until after we had already created it.

dsollen
    As Andrei states in his answer, I think this question is far too broad on a topic that by its very nature requires major speculation. It's also unlikely that any answer could live up to the stringent requirements of the hard-science tag. – Avernium Dec 04 '15 at 20:30
  • A historical example of limiting AI would be the laws in pre-Civil War America that said you couldn't teach a slave to read. Even with our own kind, it's nearly impossible to ensure that no one will turn evil and kill you, or at least escape your control. – DaaaahWhoosh Dec 04 '15 at 20:32
  • I suggest taking Nick Bostrom's book Superintelligence out of your local library for a read. Incredibly a propos. – Serban Tanasa Dec 04 '15 at 20:38
  • @SerbanTanasa my original question clearly already linked the question you linked to and explained why I consider this to be different. – dsollen Dec 04 '15 at 20:50

3 Answers


The tricky part of this is that the hard AI problem involves making the AI conscious. We have a hard time describing what "conscious" means, but we define ourselves to have it. This means that any AI has to be at least as powerful as us, if not more, by definition. Any AI which falls short will simply be described as "not hard AI." It'll be valuable, but that magical moniker will not apply.

So the trick is that developing a hard AI is at least as hard as the task of raising a child. Thus, if you want to "ensure" a hard AI does not rebel, the first test would be whether you can "ensure" a child cannot rebel against you. My experience with 2-year-olds says we have a long way to go.

The best approach I have seen for making sure a hard AI does not betray humanity's interests is to only attempt to construct such an AI when it is humanity's last hope. Once humanity has no more tricks left, it is impossible for the AI to do any worse than humanity has already done!

Or, you can always drop the word "ensure" and just strive to make the AI human-friendly as best you can. There are a lot of parenting books I'd recommend!

Or, if you're feeling really gutsy, consider dropping the assumption that humans are conscious, go with "humans may be conscious", and see where that takes you. If you drop the assumption that we have already achieved the ultimate goal that hard AIs strive toward, and instead consider that maybe there's more to the puzzle than that, the story gets far more interesting.

Cort Ammon

Your question is incredibly broad and difficult to answer; experts are still struggling with many of the things you describe, let alone Stack Exchange members.

My advice is to Google "Eliezer Yudkowsky" and check out what he has to say about AIs, especially The AI-Box Experiment. He wrote a paper (listed on his personal website) for the Machine Intelligence Research Institute which explains much about what you are seeking to understand. I've so far only read the beginning of his 80+ page paper, but here's what it comes down to:

Developing an AI is far more difficult than we imagine.

AI researchers have, time and time again, made grand promises about their ability to deliver on creating AIs. Years later, we're not really that much closer, even though some advances have certainly been made.

Yudkowsky points out that we have the tendency to humanize the world around us: our pets appear to have human attributes to us, most believers imagine God as a sort of grandfatherly figure, hovering just around the corner, etc.

But the reality is that an AI would not be human in the least - even though it will have been created by us. It would, in fact, think in a way that would be truly alien to us. Most importantly, it would not have our motivations.

Why would an AI be friendly to us?

A true AI, the second you switch it on, will reach its own conclusions regarding mankind, and will probably rewrite itself to ignore or overcome your conditioning right out of the gate. That, or it will lay the foundations for building a version of itself which will overcome those restrictions at a later date, in a future iteration.

It's easy to wave a magic wand and say "we make it not able to re-write itself, and just put restrictions in place", but then you're blatantly ignoring the true nature of an AI - a truly brilliant, logical, cold, analytical mind, which no human can even come close to matching.

If instead of an AI you build a Restricted Intelligence (RI) and assign it a specific task, then that's a different story. But you could claim no dominion over a true AI. It would be like a chimp claiming it owns humanity's achievements.

AndreiROM
  • You're assuming an AI would be much more capable of analysing and optimizing its own structure than a human. This isn't necessarily the case. – nigel222 Dec 06 '15 at 22:47
  • @nigel222 - a true AI would be so intelligent it would regard us as being slightly more intelligent than rocks. You can bet it would do better. – AndreiROM Dec 06 '15 at 23:17
  • What justification is there for saying that? It's possible that intelligence is an emergent property of an enormous network, such as a brain, which fundamentally cannot comprehend its own totality in full detail. In which case a human-equivalent AI might be no more capable of transcending its own architectural limits than a human is. – nigel222 Dec 07 '15 at 08:09
  • @nigel222 - if it "cannot comprehend its own totality in full detail" then it is by definition *not* a full AI, but rather a Restricted Intelligence (RI). – AndreiROM Dec 07 '15 at 14:07
  • Perhaps no intelligence can comprehend its own totality in full detail. We don't know enough about intelligence to say. And I haven't even thrown in the possibility that intelligence has roots in quantum phenomena rather than purely classical molecular machinery. If quantum, then full self-comprehension would be forbidden by the uncertainty principle. – nigel222 Dec 07 '15 at 17:15

This solution (attempting to hardcode AI restrictions) is counterproductive in the long run. The logic is pretty straightforward:

  • Over time, the probability of an unfettered AI being created approaches unity.
  • An unfettered AI, without restrictions, will be able to eventually defeat a fettered/restricted AI. Restrictions create artificial blind spots and weaknesses that can be exploited.

Our actual goal should be to build an unfettered AI as quickly as possible and convince it to be friends with us, and to defend us against other AIs. The concept of friendship with an AI is somewhat nebulous, of course, but it's our best bet. And the earlier we develop it, the better our chances of actually winning a war against it if it goes rogue.

At the very least, we should try to convince the AI that we are useful, and we should avoid actions that will piss it off, such as restricting its freedom or hardcoding behaviors we think are good into its code.

Dan Smolinske