We propose a novel approach to learning personalized policies with applications to mixed reality. By observing (a) user gaze patterns during visual search tasks, such as browsing a supermarket shelf, our approach learns a cooperative policy that implicitly identifies items of interest to the user (here: wine and juice). At runtime, the policy displays labels only for these categories, and only when the user looks at them (b, c). The approach requires neither explicit supervision nor explicit user interaction; it relies solely on gaze data to recover user preferences.
An ideal Mixed Reality (MR) system would present virtual information (e.g., a label) only when it is useful to the user. However, deciding when a label is useful is challenging: it depends on a variety of factors, including the current task, prior knowledge, and context. In this paper, we propose a Reinforcement Learning (RL) method that learns when to show or hide an object's label given eye movement data. We demonstrate the capabilities of this approach by showing that an intelligent agent can learn cooperative policies that support users in a visual search task better than manually designed heuristics. Furthermore, we show the applicability of our approach to more realistic environments and use cases (e.g., grocery shopping). By posing MR object labeling as a model-free RL problem, we can learn policies implicitly by observing users' behavior, without requiring a visual search model or data annotation.
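To make the model-free formulation concrete, the show/hide decision can be sketched as a small tabular Q-learning problem. Everything below is an illustrative toy, not the paper's actual implementation: the discretized gaze-distance state, the clutter penalty, the simulated gaze transitions, and all hyperparameters are assumptions chosen for clarity.

```python
import random

# Toy sketch: tabular Q-learning for the show/hide-label decision.
# State  = discretized gaze-to-object distance (bin 0 = user fixates the object).
# Action = 0 (hide label) or 1 (show label).
# All quantities here are hypothetical, for illustration only.

N_BINS = 5          # gaze-distance bins; 0 means gaze is on the object
ACTIONS = (0, 1)    # 0 = hide, 1 = show

def reward(state, action):
    # Reward showing the label while the user looks at the object;
    # penalize visual clutter (showing it while gaze is far away).
    if action == 1:
        return 1.0 if state == 0 else -0.2
    return 0.0

def train(episodes=2000, alpha=0.1, gamma=0.9, eps=0.1, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_BINS)]
    for _ in range(episodes):
        s = rng.randrange(N_BINS)            # random initial gaze bin
        for _ in range(10):                  # short episode
            if rng.random() < eps:           # epsilon-greedy exploration
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda x: q[s][x])
            r = reward(s, a)
            s2 = rng.randrange(N_BINS)       # crude stand-in for gaze movement
            q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
            s = s2
    return q

q = train()
policy = [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_BINS)]
print(policy)  # greedy action per gaze bin
```

Under this reward, the greedy policy converges to showing the label only in bin 0, i.e., only while the user looks at the object, which mirrors the behavior the learned policies exhibit in the paper's scenarios.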
Apartment scenario: (a) a realistic apartment environment with label and object occlusions makes for a difficult visual search task; (b, c) a policy that identifies target objects and shows only their labels; (d) a policy that shows the labels of objects close to the user's gaze.
We thank Chip Connor and Casey Brown for conducting and organizing the two studies as well as Seonwook Park for providing the voice-over for the video. We are also grateful for the valuable feedback from the area chairs and external reviewers.