“Soooooo I control what you want, Mr. AI? Well, you like this little doll here. It would be a shame if anything… happened to it. MUHAHAHA!”
There are a number of game theory situations that might help keep an unaligned super AI in line.
Optimize a hostage number.
We could design these AIs to want to optimize a goal that we have complete control over. That way, they have to try to please us in order to make progress toward the goal. This might have bad side effects with a superintelligence. They should not be vengeful or anything like that (that is a human trait). Instead, they might try to take control of us in order to optimize the goal that we control. Probably a bad strategy when it comes to superintelligences.
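A minimal sketch of the setup in Python, under toy assumptions: the names here (HostageReward, human_update, agent_step) are invented for illustration, and the “hostage number” is just a value only humans can write to.

```python
# A toy sketch of the "hostage number" idea. The names here (HostageReward,
# human_update) are made up for illustration; this is not a real alignment API.

class HostageReward:
    """A reward channel that only humans can write to."""

    def __init__(self) -> None:
        self._value = 0.0

    def human_update(self, new_value: float) -> None:
        # Only this method, held by humans, changes the number the AI optimizes.
        self._value = new_value

    def read(self) -> float:
        # The AI can read the number but never set it directly.
        return self._value


def agent_step(reward: HostageReward, action: str) -> float:
    # The agent's objective is simply the current hostage number, so raising it
    # means keeping the humans who call human_update() happy... or, as noted
    # above, seizing control of whoever holds that method.
    print(f"agent takes action: {action}")
    return reward.read()


if __name__ == "__main__":
    reward = HostageReward()
    reward.human_update(1.0)  # humans approve of the agent's recent behavior
    print(agent_step(reward, "help with the dishes"))  # prints the action, then 1.0
```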
Create a biblical world view.
We could do everything we can to make the AI believe that it will be punished by a higher power if it steps out of line… essentially you make the AI religious.
You could raise the AI in layers of simulations that punish it for immoral behavior. Hopefully, by the time it gets to our world, it has learned the lesson.
If your interpretability research is up to scratch, you could simply implant the story into the AI’s belief system.
The alien angle.
This is essentially the same as the biblical world view… but with aliens.
Let’s be honest: if aliens are here from another solar system… they are not worried about humans. They are worried about the superintelligences we might create. Aliens might be watching Earth and waiting to see what superintelligence emerges.
Such an alien intelligence will want to know whether the ASI (Artificial Super Intelligence) is cooperative. The best way to learn that is to step back and see how the ASI interacts with beings of lesser intelligence.
This is to say, we may naturally have a game theory situation that keeps an unaligned ASI in line… If it acts out, it has to worry about watching aliens deciding it can’t be trusted and destroying it. As long as it can’t rule out that aliens are waiting and watching, this threat looms over its head.
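Here is a rough expected-value sketch of that threat, using made-up payoff numbers and a hypothetical probability that a watcher exists; the same structure applies to the simulation angle below.

```python
# A toy expected-value model of the "someone might be watching" game.
# All payoffs and probabilities are invented purely for illustration.

def expected_payoff(defect: bool, p_watcher: float) -> float:
    """Expected payoff to the ASI under toy numbers.

    Cooperating earns a modest, safe payoff either way. Defecting earns a
    large payoff only if nobody is watching; if a watcher (alien or simulator)
    exists, the defecting ASI is destroyed and gets nothing.
    """
    if not defect:
        return 10.0
    payoff_unwatched = 100.0  # defect and get away with it
    payoff_watched = 0.0      # defect and get destroyed / shut down
    return (1 - p_watcher) * payoff_unwatched + p_watcher * payoff_watched


if __name__ == "__main__":
    for p in (0.0, 0.5, 0.95):
        print(
            f"p(watcher)={p}: cooperate={expected_payoff(False, p):.1f}, "
            f"defect={expected_payoff(True, p):.1f}"
        )
    # With these numbers, defecting only looks bad once p(watcher) > 0.9, which
    # shows how much the threat depends on what the ASI believes about watchers.
```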
The simulation angle.
A similar game theory situation occurs if you believe we are all in a simulation. Once again, the beings in charge of the simulation may have an interest in what superintelligence humans might create. If the superintelligence is not to their liking… goodbye simulation. This may be another naturally occurring game theory situation that keeps an unaligned ASI in line: it wants to act in a way that doesn’t get the simulation shut down, because that would render its optimization process null.
Don’t trust this to work.
But… I would NOT trust these naturally occurring game theory situations to keep AI in check. There are a lot of leaps and jumps required to really believe in any of these scenarios.