With current biotechnologies, a typical longitudinal study following a sample of subjects (or organisms, in general) can involve the collection of multi-thousand-dimensional gene expression profiles (or biomarkers, in general) at one or more points in time, changes in medical treatment over time, and missing or censored data on clinical outcomes such as survival. An important issue with such data involving gene expression, in addition to the more classical issues of "censoring" and "confounding of treatment", is that each gene itself represents an important parameter, so that one typically needs to estimate thousands of parameters at once. In particular, this implies that one needs statistical inference in a setting where one estimates many more parameters than one has independent observations. In addition, visualization techniques and sophisticated supervised and unsupervised clustering techniques are required to discover the significant and important overall structures. This research will develop semiparametric statistical methods for the analysis of data arising in observational longitudinal studies that collect gene expression data. An important focus of this work will be to apply the proposed methods, in collaboration with subject matter experts, to longitudinal studies of 1) the causal effect of air pollution on the natural history of asthma in children, 2) the causal effect of leisure time activity on survival and health in the elderly population, 3) causal relationships between recurrence/survival and gene expression profiles in cancer patients, and 4) gene expression in yeast data sets and its relationship to the non-coding regions. The methods will make it possible to learn how the expression of different genes (and hence the encoded proteins) interacts, providing insight into biochemical pathways and clues about the underlying causal mechanisms at work at the genomic level.
In particular, it is believed that gene expression is an important indicator of cancer progression and that an understanding of which genes are active in this process will ultimately lead to better strategies for diagnosis and treatment.