This question is a companion to my other question. Both touch on how Hard AI would realistically develop, and answers to one will likely influence answers to the other.
In this question I'm looking at the original programming of a Hard AI (an AI capable of thinking, learning, and growing). We will want the AI to be programmed with certain traits and tendencies. For instance, in addition to learning and growing, we may want it to work towards a specific set of goals, or to ensure it prioritizes certain actions. Perhaps we want to add a clause that it will always start singing show tunes at 3 AM the first Saturday of every other month, for some reason. The point is that we have a list of requirements for our AI.
How exact and confident can we be that the AI we create meets this list of requirements, without bugs or unintended side effects? Depending on how you interpret the likely development of Hard AI, it may be equally valid to ask how specific we can be in our list of requirements for a Hard AI to begin with.
This question is not about how confident we are that AI will always support us without turning against us. If we desire to create a sapient AI and succeed in doing so, only for that AI to decide it shouldn't take our orders, that is us failing to anticipate the consequences of the AI we wanted, not a failure in creating that AI. I am only interested in situations where we might fail to meet our stated goals.
Programmers consider bugs all but inevitable in any decent-sized program. When we talk about testing, we discuss what percentage of bugs we have likely removed, and our confidence that the bugs we missed (we missed something) will be uncommon and/or do little harm; basically, that they won't impact users too severely. It's this confidence level I'm interested in.
As an example, say we created an AI to buy and sell stocks to make money for us on the stock market. If it loses money in the stock market, it's an obvious failure and should be scrapped. However, what if it does well until its 2,147,483,647th comparison of stock purchases, when it suddenly confuses its transaction history and sells everything it owns at a huge loss? That's a huge bug, but one I wouldn't notice until it's too late. Alternatively, perhaps the logic for diversifying is bad and the AI makes a profit and looks fine, but no one realizes all its money is invested in one field, leaving us vulnerable if something happens in that field. The program is working 95% as it's supposed to, but the thing it's doing wrong could be a big problem later...
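To make the overflow example concrete, here is a minimal sketch (in Java, with class and variable names I made up for illustration; it isn't taken from any real trading system) of how a 32-bit comparison counter could work flawlessly for years and then silently go wrong on the 2,147,483,648th comparison:

```java
// Hypothetical illustration: the AI tracks how many stock comparisons it has
// made in a 32-bit int. Java ints silently wrap past 2,147,483,647, so the
// very next comparison produces a negative counter, and any history index
// derived from it points somewhere it shouldn't.
public class ComparisonOverflow {
    static int comparisonsDone = 0;            // BUG: 32-bit counter
    static final int HISTORY_SIZE = 1000;      // size of the transaction-history buffer

    static int nextHistorySlot() {
        comparisonsDone++;                     // wraps to -2,147,483,648 after Integer.MAX_VALUE
        return comparisonsDone % HISTORY_SIZE; // negative remainder => invalid slot
    }

    public static void main(String[] args) {
        comparisonsDone = Integer.MAX_VALUE;   // fast-forward to the failure point
        System.out.println("next history slot: " + nextHistorySlot()); // prints -648
    }
}
```

Every test run that stops short of two billion comparisons passes, which is exactly why this kind of defect can survive unnoticed until it's too late.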
How can the builder of the AI be confident that his AI does not have bugs, or, perhaps better phrased, how confident will he be that whatever bugs do exist are minor (a learning AI risks small bugs compounding as the AI grows)? Can a programmer manage at least the same level of confidence we manage with systems now, despite the increased complexity and reduced control of a Hard AI?
If you assume that in the future we will be 'growing' AI more than programming them, then the question of programming errors becomes less of an issue. However, it's replaced with a somewhat analogous question: how well will the creator of the AI be able to know what side effects the AI he grew has, and what ability will he have to minimize or control them?
For instance, perhaps we grow an AI to handle the stock market, but as a side effect it develops an odd sense of humor and decides on April 1st to report that it lost all the investors' money as a joke. How likely are we to know about quirks of the AI like that sense of humor? And if we decide we don't like our AI making jokes, how much control would we have to avoid such quirks when we grow the AI (without introducing the same number of new, different quirks elsewhere)?
The point is the same in both cases. When I set out with a list of requirements I want my dream Hard AI to have, how confident can I be that the AI I end up with matches that dream?