The dynamic, data-driven estimation of 3D human shape and motion from optical sensors (cameras) is a fundamental problem with many applications, such as non-invasive security and monitoring systems and human-computer interaction. The difficulty of the problem stems from a) the shape complexity and the many degrees of freedom due to the high articulation and deformations of the human body, b) the noise introduced by the sensors, c) the dynamically changing appearance of the human body in an image sequence in terms of shape and intensity, and d) the unknown distributions of the visual cues (e.g., edges and optical flow) and the lack of a principled methodology for combining them. Lagrange dynamics-based 3D deformable models have the potential to be successful in analyzing the shape and motion of non-rigid or articulated data such as the face and hands, since they can adapt to the shape and motion variations across individuals. This proposal aims to develop a deformable model-based framework for human shape and motion estimation which can cope with the dynamic changes of the input visual data and the resulting need for the dynamic integration of visual cues extracted from the input data. Our proposed approach should be able to evaluate automatically and dynamically the "trustworthiness" of each visual cue and subsequently integrate the cues in a manner that reflects their importance. The methods proposed here are general and are applicable not only to face and hand tracking but also to whole-body tracking.
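To make the dynamic cue integration concrete, the following minimal Python/NumPy sketch illustrates one simple way such a scheme could work: each cue (e.g., edges, optical flow) contributes a generalized force on the deformable model's parameters, and the forces are combined with weights derived from per-frame reliability estimates. All function and variable names here are hypothetical illustrations, not part of the proposal itself, and the normalized-weight rule is just one plausible instance of confidence-weighted fusion.

```python
import numpy as np

def fuse_cue_forces(cue_forces, cue_confidences):
    """Combine per-cue generalized forces using normalized confidence weights.

    cue_forces: dict mapping cue name -> force vector acting on the model's
        generalized coordinates (e.g., derived from edges or optical flow).
    cue_confidences: dict mapping cue name -> nonnegative reliability score,
        re-estimated at every frame as the input data change.
    """
    total = sum(cue_confidences.values())
    if total == 0.0:
        # No cue is trusted this frame: apply no data-driven force.
        return np.zeros_like(next(iter(cue_forces.values())))
    # Weight each cue in proportion to its estimated trustworthiness.
    return sum((cue_confidences[name] / total) * force
               for name, force in cue_forces.items())

# Hypothetical usage: two cues acting on a 3-parameter model.
f_edges = np.array([0.8, -0.1, 0.0])   # force from edge matching
f_flow  = np.array([0.5,  0.2, 0.3])   # force from optical flow
f = fuse_cue_forces({"edges": f_edges, "flow": f_flow},
                    {"edges": 0.9, "flow": 0.4})
```

In a Lagrangian deformable-model setting, the fused force would then drive the update of the model's generalized coordinates at each time step, so a cue that becomes unreliable (e.g., optical flow under fast motion) automatically loses influence.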