Statistical models for genetics data are often surprisingly challenging and frequently require advanced or new statistical methods. This project continues to investigate a number of such areas, including, for example, a global analysis of X-chromosome dosage compensation. Drosophila has a special dosage compensation complex that upregulates the male X chromosome about two-fold relative to the autosomes, thus maintaining X-versus-autosome genic balance. However, this complex is present only in the soma, not in the germline. Nevertheless, germline tissues also display striking two-fold upregulation of genes on the male X chromosome, as revealed by careful measurements of gene expression using microarrays (Gupta, Malley, Oliver, et al., 2006). Analysis of published data from mouse and worm expression arrays reveals a similar balance between X and autosomal genes. Taken together, these results (which indicate that multiple means have evolved to achieve the same end) emphasize our fundamental ignorance of the underlying transcription-linked process that is being regulated. We note that this paper by Gupta, Malley, Oliver, et al. (J. Biology, Feb. 2006) was accessed more than 8,500 times in the year following its appearance in February 2006, and was the third most accessed paper in the journal over that period. More recently, we have undertaken the study of genome-wide associations and of how statistical learning machines can be applied to such ultra-large data (500K or 1,000K SNPs), with the aim of locating the most predictive genes or SNPs among the available features and understanding how linkage disequilibrium compromises or assists these detection methods. As discussed above, we are also rapidly expanding our search and fusion program for analyzing ultra-large-scale genetic data sets. We routinely derive fully validated error rates and top lists of the most important predictors from two million SNPs per subject.
Reproducible error rates and congruent lists of predictors are now obtainable rather easily, using learning machines implemented on the NIH Biowulf cluster.