The human visual system can recognize a familiar face across wide variations in viewpoint, illumination, expression, and appearance. This remarkable computational feat is accomplished by large-scale networks of neurons. We will test a face space theory of the representations that emerge at the top layer of deep convolutional neural networks (DCNNs) as a model of human visual representations of faces. Computer-based face recognition has improved in recent years because of DCNNs and the easy availability of labeled training data (faces and identities) from the web. Inspired by the primate visual system, DCNNs are feedforward artificial neural networks that can map images of faces into representations that support recognition over widely variable images. Although the calculations executed by the simulated neurons are simple, enormous numbers of computations are used to convert an image into a representation. The end result of this processing is a highly compact representation of a face that retains image detail in an invariant, identity-specific face code. This code is fundamentally different from any representation of faces considered in vision science. The theory we test combines key components of previous face space models (similarity, learning history) with new features (imaging conditions, personal face history) in a unitary space that represents both identity and facial appearance across variable images. We will test whether this model can account for human recognition of familiar faces, which is highly robust to image variability (pose, illumination, expression). The model will also be applied to understanding the long-standing difficulties that humans (and machines) have with faces of other races.
We aim to bridge critical gaps in our knowledge of how DCNNs work, linking psychological, neural, and computational perspectives. A fundamentally new theory of face representation will alter the questions we ask about face representations in all three fields. A new focus on understanding how we (or neural networks) "perceive" a single familiar identity in widely variable images will give rise to a search for representations that gracefully merge the properties of faces with the real-world image conditions in which they are experienced. This project presents a unique opportunity to study, manipulate, and learn from these representations, and to apply the findings to broader questions about high-level vision from neural and perceptual perspectives.
Human recognition of familiar faces is highly robust to image variability (pose, illumination, expression), a skill that is likely due to the quality and quantity of experience we have with the faces of people we know well. Deep convolutional neural networks are modeled after the primate visual system and have recently made impressive gains on the problem of robust face recognition. Understanding the visual nature of the face "feature" codes that emerge in these networks can give insight into long-standing questions about how the human visual system can, but does not always, represent a face in a way that generalizes across images that vary widely.