This research is focused on the creation of new techniques and algorithms to support comprehensive analysis of Android applications. We have developed formally grounded techniques for extracting accurate models of smartphone applications from installation images. The recovery formalization is based on TyDe, a typed meta-representation of Dalvik bytecode (the code structure used by the Android smartphone operating system). In developing TyDe, we are formalizing the TyDe type inferencing, ill-formed bytecode structure management, and creating a generalized Dalvik-to-Java retargeting logic based on bytecode "instruction templates".
TyDe and the models they represent are being used to perform deep analysis of application structure to infer potential application behaviors that may harm users, their data, or the cellular or Internet infrastructure. In particular, these analyses support whole program analysis, reflection, and smartphone specific data flow analysis. Such analyses provide a means for evaluating an applications adherence to best security practices or organizational requirements by inspecting permission structures, component interfaces, and source code and library origins for signals of malicious behavior. The analysis techniques are being evaluated on a large corpus of real-world applications extracted from real application markets.
In the broadest view, this work is providing new avenues for researchers, industry, and consumers to assess potential dangers presented by applications retrieved from smartphone application markets, an advancing the state of the art in application program analysis.