The goals of the statistical and programming core (Core B) are to provide statistical guidance and programming support for all analyses undertaken in each of the projects. The statistical analysis group is led by Dr. Donna Spiegelman, who has served as the senior statistician for the Health Professionals' Follow-up Study since 1990 and as the senior statistician for the Pooling Project since its inception. In addition, she has been the Leader of this Core since it began in 2000. She is supported by four junior statisticians, Ellen Hertzmark, Ruifeng Li, Sherry Yuan, and Lydia Liu, who have master's degrees in mathematics, biostatistics, and statistics. These four junior statisticians represent a tremendous human resource, with nearly 40 cumulative years of experience working with our group, and each is an expert statistical programmer, data analyst, and data manager. Along with statistical programmers Christine Rivera and Al Wing, this team will conduct all database management and analyses required by Projects 1-3 under the direction of Dr. Spiegelman and project investigators. Programming support is provided for the identification of cases and controls in the analysis of biomarkers and genetic materials, as required by Projects 1-2. Dr. Spiegelman is joined by Dr. Kraft, who specializes in statistical genetics. Dr. Kraft will oversee the analyses of genetic data from Projects 1 and 2 and will be responsible for the development of statistical methods for multiple genes and pathway analysis. Core B provides programming assistance to all investigators and develops user-friendly, professionally documented SAS macros as required by the Projects. Core B is responsible for quality control of all analyses and documentation of data in all reports arising from Projects 1-3. Core B also develops new statistical methods as required by the Projects.
Methods will be developed to correct for bias due to misclassification and measurement error in estimates of the effects of exposures that change over time, with cumulatively averaged diet the primary focus of our attention. Our studies are among the few prospective cohort studies of diet and health in the world with repeated measures of diet available on the vast majority of participants every four years. This unique feature requires the development of new methods for valid and efficient measurement error correction, along with the software to translate the theory into accessible practice. We will develop new methods that address problems of sparse data in the context of candidate gene association studies involving gene-gene and gene-environment interaction, a major focus of two of the three projects in this proposal. By using a priori biological knowledge to model disease risk as a function of multiple variants in a pathway, these methods can also alleviate the multiple testing problem. Each of these areas represents an innovative avenue of statistical investigation at the forefront of contemporary cancer epidemiology.