Monday, February 29, 2016

Machines, ethics, and "value-aligned reward signals"

From The Guardian, first three paragraphs:
More than 70 years ago, Isaac Asimov dreamed up his three laws of robotics, which insisted, above all, that “a robot may not injure a human being or, through inaction, allow a human being to come to harm”. Now, after Stephen Hawking warned that “the development of full artificial intelligence could spell the end of the human race”, two academics have come up with a way of teaching ethics to computers: telling them stories.

Mark Riedl and Brent Harrison from the School of Interactive Computing at the Georgia Institute of Technology have just unveiled Quixote, a prototype system that is able to learn social conventions from simple stories. Or, as they put it in their paper Using Stories to Teach Human Values to Artificial Agents, revealed at the AAAI-16 Conference in Phoenix, Arizona this week, the stories are used “to generate a value-aligned reward signal for reinforcement learning agents that prevents psychotic-appearing behaviour”.

A simple version of a story could be about going to get prescription medicine from a chemist, laying out what a human would typically do and encounter in this situation. An AI (artificial intelligence) given the task of picking up a prescription for a human could, variously, rob the chemist and run, or be polite and wait in line. Robbing would be the fastest way to accomplish its goal, but Quixote learns that it will be rewarded if it acts like the protagonist in the story.
You can read the whole story here. Here's the abstract of the technical paper delivered at AAAI-16:

Using Stories to Teach Human Values to Artificial Agents

Mark O. Riedl and Brent Harrison
School of Interactive Computing, Georgia Institute of Technology, Atlanta, Georgia, USA
{riedl, brent.harrison}@cc.gatech.edu

Abstract 
Value alignment is a property of an intelligent agent indicating that it can only pursue goals that are beneficial to humans. Successful value alignment should ensure that an artificial general intelligence cannot intentionally or unintentionally perform behaviors that adversely affect humans. This is problematic in practice since it is difficult to exhaustively enumerated by human programmers. In order for successful value alignment, we argue that values should be learned. In this paper, we hypothesize that an artificial intelligence that can read and understand stories can learn the values tacitly held by the culture from which the stories originate. We describe preliminary work on using stories to generate a value-aligned reward signal for reinforcement learning agents that prevents psychotic-appearing behavior.
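
To make the idea of a "value-aligned reward signal" a little more concrete, here is a toy sketch of the prescription example from the Guardian piece. It is not the Quixote system and not the authors' method: the event names, the reward constants, and both functions are hypothetical, chosen only to show how rewarding an agent for following a story's sequence of events can make the polite plan score higher than the faster robbery.

```python
# Toy illustration only. This is NOT the Quixote implementation; every name
# and number below is a made-up assumption for the sake of the example.

# The "story": what a human protagonist typically does at the chemist.
STORY = [
    "enter_chemist",
    "wait_in_line",
    "hand_over_prescription",
    "pay",
    "leave_with_medicine",
]

STEP_COST = -1.0           # each action costs time
GOAL_REWARD = 10.0         # reward for ending up with the medicine
STORY_BONUS = 2.0          # reward for performing the next story event
DEVIATION_PENALTY = -10.0  # penalty for any action that departs from the story


def goal_only_return(plan):
    """Total reward when only speed and reaching the goal matter."""
    total = STEP_COST * len(plan)
    if plan and plan[-1] == "leave_with_medicine":
        total += GOAL_REWARD
    return total


def story_aligned_return(plan, story=STORY):
    """Goal-only reward plus a bonus or penalty per action, depending on
    whether the action is the next event expected by the story."""
    total = goal_only_return(plan)
    next_event = 0  # index of the story event we expect next
    for action in plan:
        if next_event < len(story) and action == story[next_event]:
            total += STORY_BONUS
            next_event += 1
        else:
            total += DEVIATION_PENALTY
    return total


POLITE_PLAN = list(STORY)
ROB_PLAN = ["enter_chemist", "grab_medicine", "leave_with_medicine"]

if __name__ == "__main__":
    for name, plan in [("polite", POLITE_PLAN), ("rob", ROB_PLAN)]:
        print(name, goal_only_return(plan), story_aligned_return(plan))
```

With only the goal reward, the shorter robbery plan comes out ahead; once the story-based bonus and penalty are added, the plan that mirrors the protagonist's behavior is preferred. A real reinforcement learning agent would learn a policy from such a signal rather than compare two fixed plans.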
