Title: Engaging Human-Robot Interaction with Batch Reinforcement Learning
Speaker: Nusrah Hussain
Time: September 08, 2020, 18:00
Place: This thesis defense will be held online. You can join the presentation through the link below at the date and time given above.
Meeting ID: 924 2064 7424
Thesis Committee Members:
Prof. Engin Erzin (Advisor, Koç University)
Prof. Yücel Yemez (Co-Advisor, Koç University)
Assoc. Prof. Metin Sezgin (Koç University)
Prof. Murat Tekalp (Koç University)
Prof. Ali Albert Salah (Utrecht University)
Asst. Prof. Anca Dragan (University of California, Berkeley)
A common issue in the field of social robotics is the need to maintain user engagement during human-robot interaction (HRI). Engagement is a typical metric for gauging the success of HRI, and hence it is regarded as a universal goal in the design of social robots. In this thesis, we train a model that generates non-verbal backchannel behaviors, namely smiles and nods, for a robot to engage humans during HRI. We propose a novel batch reinforcement learning (batch-RL) formulation for the task, in which we take advantage of recorded human-to-human interaction data to learn a policy offline. The formulation treats user engagement as the reward and constructs a backchannel policy that maximizes it. We propose three value-based off-policy batch-RL algorithms to address the problem, which differ in how they manipulate the samples in the dataset to make the gradient updates. To evaluate the policies trained with these algorithms, we use offline evaluation methods such as off-policy policy evaluation (OPE) and the Bellman residual. The final work presented in the thesis is the design and execution of a user study on HRI with a backchanneling robot. The interaction is designed with an expressive 3D robotic head in a story-shaping interaction scenario, where the learned backchannel policy controls the nod and smile behaviors. Subjective questionnaires and engagement values extracted from users' social signals are used to assess the impact of the robot's social behavior on the participants. Statistically significant differences in the evaluation scores indicate that the RL policy is more acceptable than a baseline policy. The research presented in this thesis addresses only one class of robot behavior on the way towards socially engaging robots. As a pioneering work, it paves the way for automating numerous other desired robot behaviors that target other metrics used in the design of human-robot interaction systems.
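For readers unfamiliar with the batch-RL setting, the sketch below illustrates the general idea of learning a value-based policy from a fixed dataset and checking it with the Bellman residual. It is not one of the three algorithms proposed in the thesis: it swaps in plain tabular Q-learning, and the state encoding, the action coding (0 = no backchannel, 1 = nod, 2 = smile), the engagement rewards, and the toy transitions are all hypothetical values chosen for illustration.

```python
import numpy as np

def batch_q_learning(transitions, n_states, n_actions,
                     gamma=0.9, lr=0.1, epochs=200):
    """Tabular Q-learning over a fixed batch of transitions.

    No new interaction is collected: the same recorded samples are
    replayed every epoch, which is what makes the setting "batch".
    """
    q = np.zeros((n_states, n_actions))
    for _ in range(epochs):
        for s, a, r, s_next, done in transitions:
            # Off-policy target: greedy value of the next state.
            target = r if done else r + gamma * q[s_next].max()
            q[s, a] += lr * (target - q[s, a])
    return q

def mean_squared_bellman_residual(q, transitions, gamma=0.9):
    """Average squared gap between Q(s, a) and its one-step target."""
    residuals = [
        (r if done else r + gamma * q[s_next].max()) - q[s, a]
        for s, a, r, s_next, done in transitions
    ]
    return float(np.mean(np.square(residuals)))

# Hypothetical recorded data: (state, action, engagement reward,
# next state, episode finished). Values are made up for illustration.
transitions = [
    (0, 0, 0.1, 1, False),  # no backchannel, low engagement
    (0, 1, 0.6, 1, False),  # nod, higher engagement
    (1, 2, 0.8, 0, True),   # smile, high engagement
    (1, 0, 0.2, 0, True),   # no backchannel, low engagement
]

q = batch_q_learning(transitions, n_states=2, n_actions=3)
policy = q.argmax(axis=1)  # greedy backchannel choice per state
msbr = mean_squared_bellman_residual(q, transitions)
print("greedy policy:", policy.tolist())
print("mean squared Bellman residual:", msbr)
```

On this toy batch the greedy policy picks the action with the highest recorded engagement in each state, and the residual shrinks towards zero as the value estimates become self-consistent. With real interaction data and function approximation, a small residual alone would not guarantee a good policy, which is one reason to pair it with other offline measures such as OPE.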