Thin-Slicing Network: A Deep Structured Model for Pose Estimation in Videos

Authors: J. Song, L. Wang, L. Van Gool, O. Hilliges
Publication: In Proceedings of CVPR, Hawaii, USA, 2017


In summary, our main contributions are: (A) A structured model that captures the inherent consistency of human poses in video sequences, based on a loopy spatio-temporal graph. (B) An efficient and flexible inference layer that performs message passing along the spatial and temporal graph edges and significantly reduces joint position uncertainty. (C) An architecture that integrates ConvNet-based joint regressors and a high-level structured inference model in a unified framework, which can be optimized in an end-to-end manner. Code and models will be available at GitHub.
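To make contribution (B) more concrete, the following is a minimal sketch of message passing on a small loopy spatio-temporal graph. It is not the authors' inference layer: the function name `pass_messages`, the 1-D candidate positions, the distance-based pairwise term, and the simplified "flooding" max-sum update (which does not exclude the reverse message, as exact belief propagation would) are all assumptions made for illustration.

```python
import numpy as np

def pass_messages(unary, edges, pairwise, n_iters=2):
    """Simplified max-sum message passing (illustrative assumption, not the paper's layer).

    unary:    dict mapping node -> (K,) log-scores over K candidate positions
    edges:    list of directed (src, dst) pairs, spatial and temporal
    pairwise: (K, K) log-compatibility; pairwise[i, j] scores src at i, dst at j
    """
    beliefs = {n: u.copy() for n, u in unary.items()}
    for _ in range(n_iters):
        updated = {n: u.copy() for n, u in unary.items()}
        for src, dst in edges:
            # For each dst position j, take the best-scoring src position i.
            msg = np.max(beliefs[src][:, None] + pairwise, axis=0)
            updated[dst] += msg
        beliefs = updated
    return beliefs

# Nodes are (joint, frame) pairs: 2 joints x 2 frames form a 4-cycle (loopy).
K = 5
nodes = [(j, t) for j in range(2) for t in range(2)]
spatial = [((0, t), (1, t)) for t in range(2)]    # joint-to-joint within a frame
temporal = [((j, 0), (j, 1)) for j in range(2)]   # same joint across frames
edges = spatial + temporal
edges = edges + [(b, a) for a, b in edges]        # make every edge bidirectional

pos = np.arange(K)
pairwise = -np.abs(pos[:, None] - pos[None, :]).astype(float)  # prefer nearby positions

peaked = -2.0 * np.abs(pos - 2)                   # confident detection at position 2
unary = {n: peaked.copy() for n in nodes}
unary[(1, 1)] = np.zeros(K)                       # ambiguous joint, e.g. occluded

refined = pass_messages(unary, edges, pairwise)
print({n: int(np.argmax(b)) for n, b in refined.items()})
```

In this toy setup, the node with a flat (uninformative) unary score receives messages from its spatial and temporal neighbors, which is the mechanism by which such a layer can reduce uncertainty for joints that are hard to localize from appearance alone.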


Deep ConvNets have been shown to be effective for the task of human pose estimation from single images. However, several challenging issues arise in the video-based case such as self-occlusion, motion blur, and uncommon poses with few or no examples in the training data. Temporal information can provide additional cues about the location of body joints and help to alleviate these issues. In this paper, we propose a deep structured model to estimate a sequence of human poses in unconstrained videos. This model can be efficiently trained in an end-to-end manner and is capable of representing the appearance of body joints and their spatio-temporal relationships simultaneously. Domain knowledge about the human body is explicitly incorporated into the network, providing effective priors to regularize the skeletal structure and to enforce temporal consistency. The proposed end-to-end architecture is evaluated on two widely used benchmarks for video-based pose estimation (Penn Action and JHMDB datasets). Our approach outperforms several state-of-the-art methods.



	author = {Song, Jie and Wang, Limin and Van Gool, Luc and Hilliges, Otmar},
	title = {Thin-Slicing Network: A Deep Structured Model for Pose Estimation in Videos},
	booktitle = {CVPR},
	year = {2017},
	location = {Hawaii, USA},