Gaussian Processes for Style-Content Separation

Abstract

We introduce models for density estimation with multiple, hidden, continuous "factors". In particular, we propose a generalization of multilinear models using nonlinear basis functions. By marginalizing over the weights, we obtain a multifactor form of the Gaussian process latent variable model. In this model, each factor is kernelized independently, allowing nonlinear mappings from any particular factor to the data. We learn models for human locomotion data, in which each pose is generated by factors representing the person's identity, gait, and the current state of motion. We demonstrate our approach using time-series prediction, and by synthesizing novel animation from the model.
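As a rough illustration of what "each factor is kernelized independently" can look like, the sketch below builds a covariance between two poses as a product of one kernel per factor. It assumes linear kernels on the identity and gait vectors and an RBF kernel on the state, plus a noise term on the diagonal. The function names, the `(identity, gait, state)` tuple layout, and the hyperparameters `gamma` and `beta` are hypothetical and are not taken from the MGP code.

```python
import math

def linear_kernel(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def rbf_kernel(a, b, gamma=1.0):
    """exp(-gamma/2 * ||a - b||^2); gamma is an assumed inverse width."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-0.5 * gamma * sq_dist)

def multifactor_kernel(p, q, gamma=1.0):
    """Product of per-factor kernels for inputs p, q = (identity, gait, state)."""
    s, g, x = p
    s2, g2, x2 = q
    return linear_kernel(s, s2) * linear_kernel(g, g2) * rbf_kernel(x, x2, gamma)

def gram_matrix(inputs, gamma=1.0, beta=100.0):
    """Covariance matrix over all inputs, with noise variance 1/beta on the diagonal."""
    n = len(inputs)
    return [
        [
            multifactor_kernel(inputs[i], inputs[j], gamma)
            + (1.0 / beta if i == j else 0.0)
            for j in range(n)
        ]
        for i in range(n)
    ]
```

Because the kernel is a product, two poses are correlated only to the extent that all of their factors are similar; swapping the kernel for one factor leaves the others untouched.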

People

Jack M. Wang
David J. Fleet
Aaron Hertzmann

Paper

Wang, J. M., Fleet, D. J., Hertzmann, A. Multifactor Gaussian Process Models for Style-Content Separation. In Proc. ICML 2007, Corvallis, OR.

Slides and Videos

ICML talk slides, containing videos.
Training data, six clips in total.
Compare subject 1's walk (left) with subject 1's synthesized stride (right); note the similarity in arm-swing style.
Compare subject 3's stride (left) with subject 1's synthesized stride (right); note that the synthesized stride differs from the stride in the training set.
Compare subject 2's walk (left) with subject 2's synthesized stride (right); note the similarity in the conservative style.
Transitions, from walk to run to stride.
Random style variations, generated by fitting a Gaussian to the style spaces and sampling from them.
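The random style variations described above can be sketched as follows: estimate a mean and covariance from the learned style vectors, then draw new vectors from that Gaussian. This is a minimal standard-library sketch, not the code used for the videos. The function names, the two-dimensional example vectors in the usage note, and the small diagonal `jitter` are assumptions made for illustration.

```python
import math
import random

def fit_gaussian(vectors):
    """Return the sample mean and unbiased sample covariance of the vectors."""
    n, d = len(vectors), len(vectors[0])
    mean = [sum(v[i] for v in vectors) / n for i in range(d)]
    cov = [
        [
            sum((v[i] - mean[i]) * (v[j] - mean[j]) for v in vectors) / (n - 1)
            for j in range(d)
        ]
        for i in range(d)
    ]
    return mean, cov

def cholesky(a, jitter=1e-9):
    """Lower-triangular L with L L^T = a + jitter * I (jitter is an assumption
    that helps when the covariance is close to singular)."""
    d = len(a)
    L = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(a[i][i] + jitter - s)
            else:
                L[i][j] = (a[i][j] - s) / L[j][j]
    return L

def sample_style(mean, cov, rng):
    """Draw one vector from N(mean, cov) using mean + L z with z ~ N(0, I)."""
    L = cholesky(cov)
    z = [rng.gauss(0.0, 1.0) for _ in mean]
    return [m + sum(L[i][k] * z[k] for k in range(i + 1)) for i, m in enumerate(mean)]
```

For example, `mean, cov = fit_gaussian([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])` followed by `sample_style(mean, cov, random.Random(0))` returns one new two-dimensional style vector. With only a few training styles the covariance estimate is poorly conditioned, so samples should be treated as exploratory.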

Software

Here's the latest version of the MGP code; see mgp/trainmgp.m and mgp/testmgp.m to get an idea of how to use it. You will need to download the mocap data from CMU yourself and put it under the gpdm/mocaps directory. This is research code; use at your own risk!

Acknowledgements

This project was funded in part by the Alfred P. Sloan Foundation, the Canadian Institute for Advanced Research, Canada Foundation for Innovation, Microsoft Research, NSERC, and the Ontario Ministry of Research and Innovation. The data used in this project was obtained from mocap.cs.cmu.edu, which was created with funding from NSF EIA-0196217.