This question is a companion to my other question. Both touch on how Hard AI would realistically develop, and answers to one will likely influence answers to the other.
In this question I'm looking at the original programming of a Hard AI (an AI capable of thinking, learning, and growing). We will want the AI to be programmed with certain traits and tendencies. For instance, in addition to learning and growing, we may want it to work towards a specific set of goals, or to ensure it prioritizes certain actions. Perhaps we want to add a clause that it will always start singing show tunes at 3 AM the first Saturday of every other month, for some reason. The point is that we have a list of requirements for our AI.
How exact and confident can we be that the AI we create meets this list of requirements, without bugs or unintended side effects? Depending on how you interpret the likely development of Hard AI, it may be equally valid to ask how specific we can be in our list of requirements for a Hard AI to begin with.
This question is not about how confident we are that AI will always support us without turning against us. If we desire to create a sapient AI and succeed in doing so, only for that AI to decide it shouldn't take our orders, that is us failing to anticipate the consequences of the AI we wanted, not a failure in creating that AI. I am only interested in situations where we might fail to meet our stated goals.
Programmers consider bugs all but inevitable in any decent-sized program. When we talk about testing, we discuss what percentage of bugs we have likely removed, and our confidence that the bugs we missed (we missed something) will be uncommon and/or do little harm; basically, that they won't impact users too severely. It's this confidence level I'm interested in.
As an example, say we created an AI to buy and sell stocks to make money for us on the stock market. If it loses money in the stock market, it's an obvious failure and should be scrapped. However, what if it does well until its 2,147,483,647th comparison of stock purchases, when it suddenly confuses its transaction history and sells everything it owns at a huge loss? That's a huge bug, but one I wouldn't notice until it's too late. Alternatively, perhaps the logic for diversifying is bad and the AI makes a profit and looks fine, but no one realizes all its money is invested in one field, leaving us vulnerable if something happens in that field. The program is working 95% as it's supposed to, but the thing it's doing wrong could be a big problem later...
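To make the overflow example concrete, here is a minimal sketch (in Java, with class and variable names I made up for illustration; it isn't taken from any real trading system) of how a 32-bit comparison counter could work flawlessly for years and then silently go wrong on the 2,147,483,648th comparison:

```java
// Hypothetical illustration: the AI tracks how many stock comparisons it has
// made in a 32-bit int. Java ints silently wrap past 2,147,483,647, so the
// very next comparison produces a negative counter, and any history index
// derived from it points somewhere it shouldn't.
public class ComparisonOverflow {
    static int comparisonsDone = 0;            // BUG: 32-bit counter
    static final int HISTORY_SIZE = 1000;      // size of the transaction-history buffer

    static int nextHistorySlot() {
        comparisonsDone++;                     // wraps to -2,147,483,648 after Integer.MAX_VALUE
        return comparisonsDone % HISTORY_SIZE; // negative remainder => invalid slot
    }

    public static void main(String[] args) {
        comparisonsDone = Integer.MAX_VALUE;   // fast-forward to the failure point
        System.out.println("next history slot: " + nextHistorySlot()); // prints -648
    }
}
```

Every test run that stops short of two billion comparisons passes, which is exactly why this kind of defect can survive unnoticed until it's too late.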
How can the builder of the AI be confident that his AI does not have bugs, or, perhaps better phrased, how confident will he be that whatever bugs do exist are minor (a learning AI risks small bugs compounding as the AI grows)? Can a programmer manage at least the same level of confidence we manage with systems now, despite the increased complexity and reduced control of a Hard AI?
If you assume that in the future we will be 'growing' AI more than programming them, then the question of programming errors becomes less of an issue. However, it's replaced with a somewhat analogous question: how well will the creator of the AI be able to know what side effects the AI he grew has, and what ability will he have to minimize or control them?
For instance, perhaps we grow an AI to handle the stock market, but as a side effect it develops an odd sense of humor and decides on April 1st to report that it lost all the investors' money as a joke. How likely are we to know about quirks of the AI like that sense of humor? And if we decide we don't like our AI making jokes, how much control would we have to avoid such quirks when we grow the AI (without introducing the same number of new, different quirks elsewhere)?
The point is the same in both cases. When I set out with a list of requirements I want my dream Hard AI to have, how confident can I be that the AI I end up with matches that dream?