Autism spectrum disorders (ASD) are a group of complex neurodevelopmental diseases with enormous social, emotional, and economic impact. Decades of research have demonstrated a strong genetic contribution to disease etiology. Many of the high-confidence autism genes are localized to the postsynaptic density (PSD), a complex, protein-dense structure typically located in the dendritic spines of excitatory synapses. It comprises a diverse panel of proteins including master scaffolds, neurotransmitter receptors, and cytoskeleton regulators. Even though studies have shown that disruption of the PSD is a central mechanism of autism, it remains unclear how these genes aggregate into protein pathways and disrupt synaptic function. To better understand the molecular mechanisms of disease in the synapse, the proposed study will construct a comprehensive data-driven model of the PSD to decipher critical pathways in autism and prioritize novel disease candidates.
In Aim 1, a random forest model will be trained to predict novel PSD genes, which will be validated through in vitro experiments. The machine learning model will integrate a broad spectrum of data types to identify PSD genes based on biological properties such as expression profile and protein structure. The predicted PSD genes will be validated through immunocytochemistry and Western blot analysis in human induced pluripotent stem cell (hiPSC)-derived neurons.
Aim 2 will organize the identified PSD network into a hierarchical ontology to enable pathway analysis of disease genes. Logistic regression and gene enrichment analysis will be applied to the novel PSD hierarchy to determine key pathways in autism pathogenesis.
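As an illustration of the gene enrichment step, the following minimal sketch tests whether disease genes are over-represented in one pathway of the hierarchy using a one-sided hypergeometric test. It is one common form of enrichment analysis, not necessarily the procedure the study will use; the function name `enrichment_p_value` and the placeholder gene identifiers (`g0`, `g1`, ...) are hypothetical.

```python
from math import comb


def enrichment_p_value(pathway_genes, disease_genes, background_genes):
    """One-sided hypergeometric test: P(overlap >= observed overlap).

    All three arguments are iterables of gene identifiers. Genes outside
    the background set are ignored.
    """
    background = set(background_genes)
    pathway = set(pathway_genes) & background
    disease = set(disease_genes) & background

    n_background = len(background)       # all genes under consideration
    n_pathway = len(pathway)             # genes annotated to the pathway
    n_disease = len(disease)             # disease genes in the background
    n_overlap = len(pathway & disease)   # disease genes inside the pathway

    # Number of ways to draw n_disease genes from the background.
    total = comb(n_background, n_disease)

    # Sum the probability of every overlap at least as large as observed.
    upper = min(n_pathway, n_disease)
    tail = sum(
        comb(n_pathway, i) * comb(n_background - n_pathway, n_disease - i)
        for i in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical example: 20 background genes, a 5-gene pathway,
# and 4 disease genes, 3 of which fall inside the pathway.
background = [f"g{i}" for i in range(20)]
pathway = ["g0", "g1", "g2", "g3", "g4"]
disease = ["g0", "g1", "g2", "g10"]

p = enrichment_p_value(pathway, disease, background)
print(f"enrichment p-value: {p:.4f}")
```

In practice the test would be repeated for every term in the hierarchy, so the resulting p-values would need a multiple-testing correction before any pathway is called enriched.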
Aim 3 will leverage the PSD ontology to predict the protein neighbors most likely to be disrupted by ASD genes. To validate the predictions, the CRISPR/Cas9 genome editing system will be used to delete high-confidence ASD genes in hiPSC-derived neurons; quantitative polymerase chain reaction (qPCR) and biochemical analysis will then be used to characterize the predicted protein neighbors. Collectively, these aims will reveal the critical synaptic pathways in ASD pathogenesis and provide an integrative map of how seemingly disparate disease genes can lead to the same disease phenotypes. These multidisciplinary studies will be the first of their kind in the synapse and will enable the development of novel therapeutic strategies for ASD. The proposed studies will be completed in Dr. Trey Ideker's lab at UCSD, which is equipped with state-of-the-art instruments to enable the computational and experimental work described. The proposed training plan focuses on gaining expertise in integrative studies, bioinformatics, neuroscience, mentorship, leadership, and communication. Completion of these aims will provide significant experience in all of these domains and facilitate the transition to academic independence.
Even though many genes associated with autism spectrum disorders (ASD) have been localized to the postsynaptic density (PSD), the exact molecular mechanisms involved in pathogenesis remain unclear. To address this, the proposed study will construct a data-driven map of the PSD that hierarchically organizes genes into functional pathways. By applying an integrative approach of machine learning and experimental validation, the study will identify novel disease genes and key synaptic mechanisms in ASD.