The field of natural language processing (NLP) has, to date, focused largely on technology for English, even though English is a typological outlier and most of the world's people do not speak it. This project aims to develop statistical natural language analysis tools that disambiguate the morphological and syntactic structure of non-English text. Specifically, the objective of the pilot study is to design, train, implement, and disseminate statistical morpho-syntactic parsing models for Arabic and Hebrew. The project starts with a straightforward formalism (statistical head automaton grammars) and uses novel discriminative learning methods to build models that can be easily ported to new datasets. Previous work has simplified the problem by assuming perfect morphological disambiguation prior to parsing, but for most languages accurate morphological disambiguation is not yet available; this project therefore aims to integrate morphological disambiguation into the parsing algorithm for better accuracy on both tasks.

Impact: This project will improve global access to information by directly advancing core language processing technology in languages spoken by more than half a billion people and, because of the language-portability principle, by facilitating future work on many more languages. The project is expected to improve the state of the art in parsing accuracy for the languages under consideration, and the models and algorithms developed will be made freely available for research purposes. These tools are expected to aid researchers working on applied technologies such as machine translation and multilingual information extraction.