• Hi!
    I'm Yang Zhou

    I'm a 5th-year CS PhD student in the Computer Graphics Research Group at UMass Amherst, advised by Prof. Evangelos Kalogerakis. I work in the areas of computer graphics and machine learning. In particular, I am interested in using deep learning techniques to help artists, stylists, and animators make better designs. I obtained my master's degree from Georgia Institute of Technology and my master's and bachelor's degrees from Shanghai Jiao Tong University, advised by Prof. Weiyao Lin.

    Download CV



[NEW!] [Sept. 2020] I'm looking for job/post-doc positions (starting from around May 2021)!

[NEW!] [Aug. 2020] Our paper MakeItTalk conditionally accepted by SIGGRAPH ASIA 2020.

[NEW!] [Apr. 2020] Our paper RigNet accepted by SIGGRAPH 2020. [Video]

► [Nov. 2019] Our summer intern project #SweetTalk was presented at Adobe MAX 2019 (Sneak Peek). [Youtube Link] [Press]

► [Aug. 2019] Our paper on Animation Skeleton Prediction accepted by 3DV 2019.

► [Jul. 2019] Our paper SceneGraphNet accepted by ICCV 2019.

► [Jun. 2019] Joined Adobe CIL (Seattle) as a summer intern.

► [Jun. 2018] Joined Wayfair Next Research as a summer intern and fall co-op intern.

► [Apr. 2018] Our paper VisemeNet accepted by SIGGRAPH 2018. [Video]



MakeItTalk: Speaker-Aware Talking Head Animation 2019-2020

Yang Zhou, D. Li, X. Han, E. Kalogerakis, E. Shechtman, J. Echevarria

We present a method that generates expressive talking heads from a single facial image with audio as the only input. In contrast to previous approaches that attempt to learn direct mappings from audio to raw pixels or points for creating talking faces, our method first disentangles the content and speaker information in the input audio signal. The audio content robustly controls the motion of lips and nearby facial regions, while the speaker information determines the specifics of facial expressions and the rest of the talking head dynamics. Another key component of our method is the prediction of facial landmarks reflecting speaker-aware dynamics. Based on this intermediate representation, our method is able to synthesize photorealistic videos of entire talking heads with a full range of motion and also animate artistic paintings, sketches, 2D cartoon characters, Japanese manga, and stylized caricatures in a single unified framework.

[Project Page][Paper][Video]

RigNet: Neural Rigging for Articulated Characters 2018-2019

Z. Xu, Yang Zhou, E. Kalogerakis, C. Landreth, K. Singh

We present RigNet, an end-to-end automated method for producing animation rigs from input character models. Given an input 3D model representing an articulated character, RigNet predicts a skeleton that matches animator expectations in joint placement and topology. It also estimates surface skin weights based on the predicted skeleton. Our method is based on a deep architecture that operates directly on the mesh representation without making assumptions about shape class and structure.

[Project Page] [Video] [Code] [Paper]

SceneGraphNet: Neural Message Passing for 3D Indoor Scene Augmentation 2018-2019

Yang Zhou, Z. While, E. Kalogerakis
International Conference on Computer Vision (ICCV), 2019

We propose a neural message passing approach to augment an input 3D indoor scene with new objects matching their surroundings. Given an input, potentially incomplete, 3D scene and a query location, our method predicts a probability distribution over object types that fit well in that location. Our distribution is predicted through passing learned messages in a dense graph whose nodes represent objects in the input scene and whose edges represent spatial and structural relationships.

[Project Page] [Paper] [Code]

Predicting Animation Skeletons for 3D Articulated Models via Volumetric Nets 2018-2019

Z. Xu, Yang Zhou, E. Kalogerakis, K. Singh
International Conference on 3D Vision (3DV), 2019

We present a learning method for predicting animation skeletons for input 3D models of articulated characters. In contrast to previous approaches that fit pre-defined skeleton templates or predict fixed sets of joints, our method produces an animation skeleton tailored to the structure and geometry of the input 3D model.

[Project Page] [Code]

Large-Scale 3D Shape Reconstruction and Segmentation from ShapeNet Core55 2017

L. Yi, L. Shao, M. Savva, H. Huang, Yang Zhou, et al.
International Conference on Computer Vision Workshop (ICCVW), 2017

ShapeNet is an ongoing effort to establish a richly annotated, large-scale dataset of 3D shapes. We collaborated with the ShapeNet team to help build the training and testing datasets for “Large-Scale 3D Shape Reconstruction and Segmentation from ShapeNet Core55”. In particular, we helped check for geometry duplicates in the ShapeNet Core dataset.

[3D Shape Reconstruction and Segmentation Task Page] [Paper] [ShapeNet Duplicate Check]

A Tube-and-Droplet-based Approach for Representing and Analyzing Motion Trajectories 2014-2016

W. Lin, Yang Zhou, H. Xu, J. Yan, M. Xu, J. Wu, Z. Liu
IEEE Trans. on Pattern Analysis and Machine Intelligence (TPAMI), 39(8), pp. 1489-1503, 2017

We address the problem of representing motion trajectories in a highly informative way and using that representation to analyze trajectories. We apply our tube-and-droplet representation to trajectory analysis applications including trajectory clustering, trajectory classification and abnormality detection, and 3D action recognition.

[Project Page] [Paper] [Dataset] [Code]

Unsupervised Trajectory Clustering via Adaptive Multi-Kernel-based Shrinkage 2014-2015

H. Xu, Yang Zhou, W. Lin, H. Zha
International Conference on Computer Vision (ICCV), pp. 4328-4336, 2015

We introduce an adaptive multi-kernel-based estimation process to estimate the 'shrunk' positions and speeds of trajectories' points. This kernel-based estimation effectively leverages both the multiple kinds of structural information within a trajectory and the local motion patterns across multiple trajectories, so that the discriminability of the shrunk points is properly increased.


Representing and recognizing motion trajectories: a tube and droplet approach 2013-2014

Yang Zhou, W. Lin, H. Su, J. Wu, J. Wang, Y. Zhou
ACM Intl. Conf. on Multimedia (MM), pp. 1077-1080, 2014

This paper addresses the problem of representing and recognizing motion trajectories. We propose a 3D tube that effectively embeds both the motion and the scene-related information of a motion trajectory, and a droplet-based method that captures the characteristics of the 3D tube for activity recognition.




Adobe, Inc | Computer Vision Lab

June, 2020 | Research Intern

Collaborate with researchers on 3D facial/skeleton animations based on deep learning approaches.

Adobe, Inc | Creative Intelligence Lab

June, 2019 | Research Intern

Collaborate with researchers on audio-driven cartoon and real human facial animations and lip-sync technologies based on deep learning approaches.

Our intern project #SweetTalk was presented at Adobe MAX 2019 (Sneak Peek).

[Youtube Link] [Press]

Wayfair, Inc | Wayfair Next Research

June, 2018 | Research Intern

Worked on 3D scene synthesis based on deep learning approaches.

NetEase Game, Inc

June, 2015 | Management Trainee

Worked on mobile game design, especially profit models and user experience.

Contact Me

The best way to reach me is by email.