Interpretability is hard. Even once we achieve it, we still have to know what to do with it: we would have to manipulate the internal goals of an AI. Internal goals are built on internal simulations and internal representations, so those are what we would have to manipulate in order to write hard rules into AIs.
Topic: Implementing Asimov’s Laws of Robotics (The first law) - How alignment could work.