Unconstrained Pose Invariant Face Recognition using 3D Generic Elastic Models

Utsav Prabhu, Jingu Heo and Marios Savvides


Classical face recognition techniques have been successful at operating in well-controlled conditions; however, they have difficulty performing recognition robustly in uncontrolled real-world scenarios where variations in pose, illumination, and expression are encountered. In this paper we propose a new method for unconstrained pose-invariant face recognition. We first construct a 3D model for each subject in our database from a single image using the 3D Generic Elastic Model (3D-GEM) approach. These 3D-GEM models act as our intermediate search database, which we use to generate novel-pose 2D views for matching. An initial estimate of the pose of the test query is obtained using a linear regression approach based on automatically fitting a facial landmark-based shape model. Each 3D-GEM model is then rendered at different poses within a limited search space about the estimated pose, and the resulting images are matched against the test query using a normalized correlation matcher. We present results on challenging datasets demonstrating high recognition accuracy under both controlled and uncontrolled scenarios using a fast implementation.
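The final matching step described above can be sketched with a zero-mean normalized correlation score. The function below is an illustrative implementation of such a matcher, not the exact one used in the paper:

```python
import numpy as np

def normalized_correlation(probe, rendered):
    """Zero-mean normalized correlation between two equal-size images.

    Returns a score in [-1, 1]; higher means a better match."""
    a = np.asarray(probe, dtype=np.float64).ravel()
    b = np.asarray(rendered, dtype=np.float64).ravel()
    a -= a.mean()
    b -= b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        return 0.0  # a flat image carries no correlation information
    return float(a @ b / denom)
```

Because the mean is subtracted and the result is normalized, the score is invariant to global brightness and contrast changes between the probe and the rendered view.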


Multi-PIE dataset

We conducted "constrained" experiments on a subset of the Multi-PIE database. Since the data for these experiments were captured in carefully controlled environments, with minimal illumination and expression variation, the results are optimistically biased. Moreover, only pose variations in yaw and roll are observed in these cases.
Three experiments were conducted in all:
  • A brute-force search which selects the best match across all possible views of each model
  • A search guided by a linear-regression-based estimate of the pose of the test image
  • A search restricted to a limited range of poses about the estimated pose
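The linear-regression pose estimate used in the second and third experiments can be illustrated as a least-squares map from a flattened landmark shape vector to a yaw angle. The functions below are a hypothetical stand-in for the paper's fitted shape model, not its actual implementation:

```python
import numpy as np

def fit_pose_regressor(shapes, yaws):
    """Least-squares linear regressor from landmark shapes to yaw.

    shapes: (N, 2L) array, each row the flattened (x, y) landmark
            coordinates of one training face.
    yaws:   (N,) array of corresponding yaw angles in degrees.
    Returns the weight vector, including a trailing bias term."""
    X = np.hstack([shapes, np.ones((shapes.shape[0], 1))])
    w, *_ = np.linalg.lstsq(X, yaws, rcond=None)
    return w

def estimate_yaw(w, shape):
    """Apply the fitted regressor to one flattened landmark shape."""
    return float(np.append(shape, 1.0) @ w)
```

In practice the training shapes would come from the automatically fitted facial landmark model, with the ground-truth yaw labels provided by the capture setup.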
ROC Curves:
Complete brute-force search
Using linear regression pose estimation
Constrained search within range of estimated pose
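The constrained search of the third experiment can be sketched as below. Here `render_view` and `match_score` are caller-supplied placeholders for the 3D-GEM renderer and the normalized correlation matcher, and the degree range and step size are assumed values for illustration:

```python
def constrained_search(model, probe, est_yaw, render_view, match_score,
                       search_range=15, step=5):
    """Render the subject's 3D model only at yaws within
    +/- search_range degrees of the estimated pose, and return the
    best (score, yaw) pair against the probe image."""
    best_score, best_yaw = float("-inf"), None
    for yaw in range(est_yaw - search_range, est_yaw + search_range + 1, step):
        view = render_view(model, yaw)       # 2D view from the 3D-GEM model
        score = match_score(view, probe)     # e.g. normalized correlation
        if score > best_score:
            best_score, best_yaw = score, yaw
    return best_score, best_yaw
```

Restricting the search this way replaces the exhaustive brute-force pass over all views with a handful of renders per gallery model, which is what makes the fast implementation possible.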

Real-World datasets

The real-world experiments feature more realistic "unconstrained" scenarios taken from tracking databases, Hollywood movies, and TV shows. The subjects in these experiments undergo large pose variations in pitch, yaw, and roll. In these experiments, the eye locations of the subject in the video were hand-clicked to avoid registration errors due to erroneous face detection. We show a few representative frames from each video and a graph plotting the GEM matching scores (solid line) against the matching score with the template used to generate these models (dotted line).

"David Indoor" (from Ross et al.)

The results from this video clearly show the accuracy of the models: we see a nearly constant matching score in spite of the large pose and illumination variation. The video is available here.

"Dudek" (from Ross et al.)

Another tracking video, taken from the same database, shows the variation with out-of-plane rotation. The video is available here.

"Groundhog Day" (1993) - Reporting scene

This clip was taken from the 1993 movie "Groundhog Day" and features Bill Murray demonstrating large pose variations (profile to frontal). The video clip with results is available here.

"Dressed To Kill" (1946) - Dr. Watson and the Music Box

This clip was taken from the 1946 Sherlock Holmes movie "Dressed to Kill" (public domain, available here). In this scene, Nigel Bruce (playing Dr. Watson) examines a music box, demonstrating very large pose variations in both yaw and pitch. The results demonstrate the accuracy of the 3D GEM models in matching vastly off-angle poses. The video clip with our results is available here.

"Seinfeld" - S8 E13 "The Comeback"

In this experiment, we constructed 3D models from single frontal images of the four lead characters of the TV sitcom "Seinfeld" and ran a simple brute-force matching scheme on a video clip from the episode "The Comeback". Download the video clip with results here.