[Overview diagram]

Many computer vision tasks can benefit from the disentanglement of factors. While some of the factors are task-relevant and labeled (e.g. gaze direction), others are extraneous and unknown a priori (e.g. lighting condition). We propose the Self-Transforming Encoder-Decoder (ST-ED), which learns to discover and disentangle extraneous factors in a self-supervised manner. We show that explicitly disentangling extraneous factors results in more accurate modelling of task-relevant factors. Furthermore, given limited amounts of labeled training data, our method allows for improvements in downstream tasks via data augmentation. Please check our overview video for the general approach.

[Teaser GIF]

Given an input image, our method can separately control the gaze direction and head orientation with high fidelity. Please check our supplementary video for more qualitative comparisons.
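This page does not spell out how the two factors are controlled independently. As a rough illustration only, the sketch below assumes each factor (gaze, head) is held as its own 3 x k embedding that is rotated from a source pitch/yaw to a target pitch/yaw. The names `rotation_matrix` and `redirect`, the embedding shape, and the axis convention are all hypothetical choices for this sketch and are not taken from the released code.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(m):
    """Transpose a matrix given as nested lists."""
    return [list(row) for row in zip(*m)]


def rotation_matrix(pitch, yaw):
    """3x3 rotation from pitch (about x) and yaw (about y), in radians.

    The axis convention is an assumption made for this sketch.
    """
    cp, sp = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(yaw), math.sin(yaw)
    rot_x = [[1.0, 0.0, 0.0], [0.0, cp, -sp], [0.0, sp, cp]]
    rot_y = [[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]]
    return matmul(rot_y, rot_x)


def redirect(embedding, source_angles, target_angles):
    """Rotate a 3 x k factor embedding from source to target angles.

    The source rotation is undone (transpose of a rotation is its inverse),
    then the target rotation is applied.
    """
    r_src = rotation_matrix(*source_angles)
    r_tgt = rotation_matrix(*target_angles)
    relative = matmul(r_tgt, transpose(r_src))
    return matmul(relative, embedding)


# Hypothetical embeddings: one per factor, each 3 x 2 here for brevity.
gaze_embedding = [[0.2, -0.1], [0.5, 0.3], [0.8, 0.9]]
head_embedding = [[0.1, 0.4], [-0.3, 0.2], [0.9, 0.7]]

# Change only the gaze; the head embedding is left untouched.
new_gaze = redirect(gaze_embedding, (0.1, -0.2), (-0.3, 0.4))
new_head = head_embedding
```

Because each factor has its own embedding in this sketch, rotating one leaves the other unchanged, which is the sense in which the two can be controlled separately. A decoder (not shown) would then map the embeddings back to an image.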


Overview Video



Supplementary Video



Acknowledgments

This project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme, grant agreement No. StG-2016-717054.


Downloads


Published at

Neural Information Processing Systems (NeurIPS), 2020

Project Links

Bibtex

@inproceedings{Zheng2020NeurIPS,
  author    = {Yufeng Zheng and Seonwook Park and Xucong Zhang and Shalini De Mello and Otmar Hilliges},
  title     = {Self-Learning Transformations for Improving Gaze and Head Redirection},
  year      = {2020},
  booktitle = {Neural Information Processing Systems (NeurIPS)}
}