AIT Lab

Research Overview

The AIT Lab conducts research at the forefront of human-centric computer vision. Our core research interests are algorithms and methods for the spatio-temporal understanding of how humans move within and interact with the physical world. We develop learning-based algorithms, methods and representations for human- and interaction-centric understanding of our world from videos, images and other sensor data. Application domains of interest include Augmented and Virtual Reality and Human-Robot Interaction, among others. Please refer to our publications for more information.

The AIT Lab, led by Prof. Dr. Otmar Hilliges, is part of the Institute for Intelligent Interactive Systems (IIS), in the Department of Computer Science at ETH Zurich.

Latest News

02.07.25

Yufeng Zheng passed the doctoral exam. Congratulations!

18.06.25

Gengyan Li passed the doctoral exam. Congratulations!

08.04.25

Muhammed Kocabas passed the doctoral exam. Congratulations!

26.02.25

We have 4 papers accepted at CVPR 2025.

22.01.25

We have 2 papers accepted at ICLR 2025.

Latest Projects

2025

HyperGaussians: High-Dimensional Gaussian Splatting for High-Fidelity Animatable Face Avatars

Gent Serifi, Marcel C. Buehler

arXiv, 2025

Paper

Vid2Avatar-Pro: Authentic Avatar from Videos in the Wild via Universal Prior

Chen Guo*, Junxuan Li*, Yash Kant, Yaser Sheikh, Shunsuke Saito, Chen Cao (* Equal contribution)

Computer Vision and Pattern Recognition (CVPR), 2025

Video

ODHSR: Online Dense 3D Reconstruction of Humans and Scenes from Monocular Videos

Zetong Zhang, Manuel Kaufmann, Lixin Xue, Jie Song, Martin Oswald

Computer Vision and Pattern Recognition (CVPR), 2025

GauSTAR: Gaussian Surface Tracking and Reconstruction

Chengwei Zheng, Lixin Xue, Juan Zarate, Jie Song

Computer Vision and Pattern Recognition (CVPR), 2025

Video

ChatGarment: Garment Estimation, Generation and Editing via Large Language Models

Siyuan Bian, Chenghao Xu, Yuliang Xiu, Artur Grigorev, Zhen Liu, Cewu Lu, Michael J. Black, Yao Feng

Computer Vision and Pattern Recognition (CVPR), 2025

Video

Paper

No Training, No Problem: Rethinking Classifier-Free Guidance for Diffusion Models

Seyedmorteza Sadat, Manuel Kansy, Otmar Hilliges, Romann M. Weber

International Conference on Learning Representations (ICLR), 2025

Eliminating Oversaturation and Artifacts of High Guidance Scales in Diffusion Models

Seyedmorteza Sadat, Otmar Hilliges, Romann M. Weber

International Conference on Learning Representations (ICLR), 2025

Gaussian Garments: Reconstructing Simulation-ready Clothing with Photorealistic Appearance from Multi-view Video

Boxiang Rong*, Artur Grigorev*, Wenbo Wang, Michael J. Black, Bernhard Thomaszewski, Christina Tsalicoglou, Otmar Hilliges (* Equal contribution)

International Conference on 3D Vision (3DV), 2025

Video

Paper

2024

Grasping Diverse Objects with Simulated Humanoids

Zhengyi Luo*, Jinkun Cao*, Sammy Christen, Alexander Winkler, Kris Kitani, Weipeng Xu (* Equal contribution)

Annual Conference on Neural Information Processing Systems (NeurIPS), 2024

Paper

LiteVAE: Lightweight and Efficient Variational Autoencoders for Latent Diffusion Models

Seyedmorteza Sadat, Jakob Buhmann, Derek Bradley, Otmar Hilliges, Romann M. Weber

Annual Conference on Neural Information Processing Systems (NeurIPS), 2024

Video

VOODOO XP: Expressive One-Shot Head Reenactment for VR Telepresence

Phong Tran, Egor Zakharov, Long-Nhat Ho, Liwen Hu, Adilbek Karmanov, Aviral Agarwal, McLean Goldwhite, Ariana Bermudez Venegas, Anh Tuan Tran, Hao Li

SIGGRAPH Asia (TOG), 2024

Video

Paper

EgoHDM: An Online Egocentric-Inertial Human Motion Capture, Localization, and Dense Mapping System

Bonan Liu*, Handi Yin*, Manuel Kaufmann, Jinhao He, Sammy Christen, Jie Song, Pan Hui (* Equal contribution)

SIGGRAPH Asia (TOG), 2024

Video

Paper

DiffH2O: Diffusion-Based Synthesis of Hand-Object Interactions from Textual Descriptions

Sammy Christen, Shreyas Hampali, Fadime Sener, Edoardo Remelli, Tomas Hodan, Eric Sauser, Shugao Ma, Bugra Tekin

Conference paper at SIGGRAPH Asia, 2024

Video

Cafca: High-quality Novel View Synthesis of Expressive Faces from Casual Few-shot Captures

Marcel C. Buehler, Gengyan Li, Erroll Wood, Leonhard Helminger, Xu Chen, Tanmay Shah, Daoye Wang, Stephan Garbin, Sergio Orts-Escolano, Otmar Hilliges, Dmitry Lagun, Jérémy Riviere, Paulo Gotardo, Thabo Beeler, Abhimitra Meka, Kripasindhu Sarkar

Conference paper at SIGGRAPH Asia, 2024

Video

Talk

Paper

Benchmarks and Challenges in Pose Estimation for Egocentric Hand Interactions with Objects

Zicong Fan*, Takehiko Ohkawa*, Linlin Yang*, Nie Lin, Zhishan Zhou, Shihao Zhou, Jiajun Liang, Zhong Gao, Xuanyang Zhang, Xue Zhang, Fei Li, Liu Zheng, Feng Lu, Karim Abou Zeid, Bastian Leibe, Jeongwan On, Seungryul Baek, Aditya Prakash, Saurabh Gupta, Kun He, Yoichi Sato, Otmar Hilliges, Hyung Jin Chang, Angela Yao

European Conference on Computer Vision (ECCV), 2024

ReLoo: Reconstructing Humans Dressed in Loose Garments from Monocular Video in the Wild

Chen Guo*, Tianjian Jiang*, Manuel Kaufmann, Chengwei Zheng, Julien Valentin, Jie Song, Otmar Hilliges (* Equal contribution)

European Conference on Computer Vision (ECCV), 2024

Video

Paper

WorldPose: A World Cup Dataset for Global 3D Human Pose Estimation

Tianjian Jiang, Johsan Billingham, Sebastian Müksch, Juan Zarate, Nicolas Evans, Martin Oswald, Marc Pollefeys, Otmar Hilliges, Manuel Kaufmann, Jie Song

European Conference on Computer Vision (ECCV), 2024

Video

Paper

HSR: Holistic 3D Human-Scene Reconstruction from Monocular Videos

Lixin Xue, Chen Guo, Chengwei Zheng, Fangjinhua Wang, Tianjian Jiang, Hsuan-I Ho, Manuel Kaufmann, Jie Song, Otmar Hilliges

European Conference on Computer Vision (ECCV), 2024

Video

Paper

GraspXL: Generating Grasping Motions for Diverse Objects at Scale

Hui Zhang, Sammy Christen, Zicong Fan, Otmar Hilliges, Jie Song

European Conference on Computer Vision (ECCV), 2024

Video

Paper

Data

Human Hair Reconstruction with Strand-Aligned 3D Gaussians

Egor Zakharov, Vanessa Sklyarova, Michael J. Black, Giljoo Nam, Justus Thies, Otmar Hilliges

European Conference on Computer Vision (ECCV), 2024

Video

Paper

AvatarPose: Avatar-guided 3D Pose Estimation of Close Human Interaction from Sparse Multi-view Videos

Feichi Lu, Zijian Dong, Jie Song, Otmar Hilliges

European Conference on Computer Vision (ECCV), 2024

Video

Paper

PALM: Predicting Actions through Language Models

Sanghwan Kim, Daoji Huang, Yongqin Xian, Otmar Hilliges, Luc Van Gool, Xi Wang

European Conference on Computer Vision (ECCV), 2024

ROMEO: Revisiting Optimization Methods for Reconstructing 3D Human-Object Interaction Models From Images

Alexey Gavryushin, Yifei Liu, Daoji Huang, Yen-Ling Kuo, Julien Valentin, Luc Van Gool, Otmar Hilliges, Xi Wang

European Conference on Computer Vision Workshops (ECCVW), 2024

Best Paper Award at the T-CAP workshop

Paper

I-Design: Personalized LLM Interior Designer

Ata Çelen, Guo Han, Konrad Schindler, Luc Van Gool, Iro Armeni, Anton Obukhov, Xi Wang

European Conference on Computer Vision Workshops (ECCVW), 2024

Video

Paper

ContourCraft: Learning to Resolve Intersections in Neural Multi-Garment Simulations

Artur Grigorev, Giorgio Becherini, Michael Black, Otmar Hilliges, Bernhard Thomaszewski

ACM SIGGRAPH, 2024

Video

Paper

MARLUI: Multi-Agent Reinforcement Learning for Adaptive Point-and-Click UIs

Thomas Langerak, Sammy Christen, Mert Albaba, Christoph Gebhardt, Christian Holz, Otmar Hilliges

Proc. ACM Human-Computer Interaction 8, EICS, 2024

Video

Utilizing Synthetic Data in Supervised Learning for Robust 5-DoF Magnetic Marker Localization

Mengfan Wu*, Thomas Langerak*, Otmar Hilliges, Juan Zarate (* Equal contribution)

IEEE Transactions on Magnetics, 2024

Video

WANDR: Intention-guided Human Motion Generation

Markos Diomataris, Nikos Athanasiou, Omid Taheri, Xi Wang, Otmar Hilliges, Michael J. Black

Computer Vision and Pattern Recognition (CVPR), 2024

Video

Paper

VOODOO 3D: Volumetric Portrait Disentanglement for One-Shot 3D Head Reenactment

Phong Tran, Egor Zakharov, Long-Nhat Ho, Anh Tuan Tran, Liwen Hu, Hao Li

Computer Vision and Pattern Recognition (CVPR), 2024

Video

Paper

Summarize the Past to Predict the Future: Natural Language Descriptions of Context Boost Multimodal Object Interaction Anticipation

Razvan-George Pasca, Alexey Gavryushin, Muhammad Hamza, Yen-Ling Kuo, Kaichun Mo, Luc Van Gool, Otmar Hilliges, Xi Wang

Computer Vision and Pattern Recognition (CVPR), 2024

Video

Paper