Unconstrained Pose Invariant Face Recognition using 3D Generic Elastic Models
Abstract
Classical face recognition techniques have been successful at operating in well-controlled conditions; however, they have difficulty in robustly performing recognition in uncontrolled real-world scenarios where variations in pose, illumination and expression are encountered. In this paper we propose a new method for unconstrained pose invariant face recognition. We first construct a 3D model for each subject in our database from a single image using the 3D Generic Elastic Model (3D-GEM) approach. These 3D-GEM models act as our intermediate search database, from which we generate novel 2D views at new poses for matching. An initial estimate of the pose of the test query is attained using a linear regression approach based on automatically fitting a facial landmark-based shape model. Each 3D-GEM model is then rendered at different poses within a limited search space about the estimated pose, and the resulting images are matched against the test query using a normalized correlation matcher. We present results on challenging datasets demonstrating high recognition accuracy under controlled as well as uncontrolled scenarios using a fast implementation.
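The matching step described in the abstract uses normalized correlation between a rendered view and the test query. A minimal sketch of a zero-mean normalized correlation score is given below; the function name and array shapes are our own illustration, not the paper's implementation:

```python
import numpy as np

def normalized_correlation(a, b):
    """Zero-mean normalized correlation between two equal-size images.

    Returns a score in [-1, 1]; 1 indicates a perfect match up to
    affine changes in intensity (gain and offset).
    """
    a = a.astype(np.float64).ravel()
    b = b.astype(np.float64).ravel()
    a -= a.mean()
    b -= b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:  # one image is constant; no correlation defined
        return 0.0
    return float(np.dot(a, b) / denom)
```

Because the score is invariant to per-image brightness offset and contrast scaling, it tolerates the illumination differences between rendered model views and real queries better than a raw sum-of-squared-differences match would.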
Multi-PIE dataset
We conducted "constrained" experiments on a subset of the Multi-PIE database. Since the data for these experiments were captured under carefully controlled environments, with minimal illumination and expression variation, the results are optimistically biased. Moreover, only pose variations in yaw and roll are observed in these cases.
Three experiments were conducted in all:
- A brute-force search that selects the best match against all possible views of each model
- A linear-regression-based pose estimation that estimates the pose of the test image
- A limited search restricted to a range about the estimated pose
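The third experiment's limited search can be sketched as follows. Here `render_fn` stands in for rendering a subject's 3D-GEM model at a given yaw angle; the function names, the single-angle (yaw-only) search, and the degree ranges are our own assumptions for illustration:

```python
import numpy as np

def match_score(a, b):
    """Zero-mean normalized correlation between two equal-size images."""
    a = a.ravel() - a.mean()
    b = b.ravel() - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def limited_pose_search(query, render_fn, yaw_est, half_range=15, step=5):
    """Score renderings only within +/- half_range degrees of the
    estimated yaw, returning the best (score, yaw) pair.

    Compared with brute force over all views, this evaluates only
    2 * half_range / step + 1 renderings per model.
    """
    best = (-np.inf, None)
    for yaw in np.arange(yaw_est - half_range, yaw_est + half_range + step, step):
        best = max(best, (match_score(query, render_fn(yaw)), yaw))
    return best
```

Restricting the search to a window about the pose estimate is what makes the limited-search experiment faster than the brute-force baseline, at the cost of missing the true pose when the initial estimate is badly off.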
Real-World datasets
The real-world experiments feature more realistic "unconstrained" scenarios taken from tracking databases, Hollywood movies and TV shows. The subjects in these experiments undergo large pose variations in pitch, yaw and roll. In these experiments, the eye locations of the subject in the video were hand-clicked to avoid registration errors due to erroneous face detection. We show a few representative frames from each video and a graph demonstrating the GEM matching scores (in solid green) against the matching score with the template used to generate these models (in