Getting In Your Robot’s Head: Building Mental Models of Robots Using Human-Concept Learning
To collaborate effectively and safely with machine partners, people must understand and be able to predict how those partners will behave. This thesis explores how using variation theory [1], a human-concept learning theory from cognitive science, to select training demonstrations of agent/robot behavior impacts people’s understanding of how the agent will act. The goal is to help people improve their mental models of the artificial agents they interact with, and thereby improve their ability to predict their particular machine’s behavior. To best teach people a concept, variation theory proposes a teaching method composed of four steps: familiarization, contrast, generalization, and fusion [1]. In prior work [2], the contrast step was compared to familiarization in the domain of a robotic arm’s movement, and results suggested that contrast particularly helped people’s understanding in unseen scenarios. In this study, the first three steps of variation theory were tested to determine which steps are most beneficial to people’s learning of an agent’s behavior in a self-driving domain and which, therefore, should be prioritized when training. To compare participants’ predictive ability across these teaching methods, a user study was run in which each method was used to train people on various policies of a self-driving car’s motion. Results suggested that being shown the combined steps of variation theory improves people’s ability to predict agent behavior, especially in scenarios they were trained with. However, the generalization step was particularly helpful for scenarios that subjects had not encountered, improving accuracy over familiarization by 9%. Using the full steps of variation theory also increased people’s trust in the agent, but results appeared to be policy-dependent.
Interestingly, trust in the agent appears to be separate from people’s true understanding of the agent’s policy: although contrast had the lowest overall correctness, people trained with that method had the highest trust in their agent, at 65.9%.