Biomolecular condensates formed via liquid-liquid phase separation (LLPS) are important organizers of cellular biochemistry. Protein LLPS is often mediated by contacts between intrinsically disordered proteins or regions (IDPRs) within larger proteins, frequently in cooperation with multivalent folded interaction domains. The space of IDPR sequences that support LLPS has been only sparsely sampled, owing to the low-throughput techniques currently used. Thus, a general framework describing the connections between IDPR primary sequence, conformational ensemble, and LLPS remains elusive. By developing and applying novel high-throughput methods, the work I propose aims to break the bottlenecks of classical biochemistry. I will design a ~10,000-sequence DNA library encoding GFP-tagged IDPRs, varying amino acid composition and residue patterning in the primary sequence. Library sequences will be constructed by high-throughput gene synthesis and individually expressed in microfluidic droplets using in vitro transcription/translation. I will then use fluorescence-activated droplet sorting to select droplets bearing sequences that support LLPS, which will be identified by high-throughput sequencing of DNA from sorted droplets. Computational simulations and biochemical/biophysical characterization will demonstrate how primary sequence and conformational ensemble are connected in IDPR phase separation. Further applying this approach, I will identify IDPR features that modulate LLPS by multivalent folded domains. Finally, I will identify IDPR features that govern rates of condensate solidification, a process that is accelerated in disease-associated mutants of condensate-forming IDPRs. The proposed work will provide a set of rules governing phase separation by IDPRs, backed up by an empirical dataset of unprecedented scale.
This work will be carried out in the lab of Dr. Michael Rosen, a pioneer in the field of biomolecular condensates, at the University of Texas Southwestern Medical Center, a leading biomedical research institute with excellent facilities and investigators in a wide array of fields. As part of this research training plan, I will take on a robust series of career development and training activities, including coursework in the responsible conduct of research, grant writing, laboratory management, and educational techniques. I will also have numerous opportunities to attend seminars and conferences, where I will be able to present my work in poster sessions and talks, and opportunities to publish my work in peer-reviewed journals. I also plan to develop a teaching portfolio and gain mentorship experience by training graduate students, rotation students, and visiting undergraduates. In summary, the research training plan proposed here will provide me with the skills, knowledge, and experience necessary to fulfill my long-term goal of becoming an independent investigator.

Public Health Relevance

Biomolecular condensates are membraneless cellular structures that concentrate or exclude proteins and nucleic acids to achieve a wide variety of functions in normal cellular physiology, while aberrant condensate function underlies the pathophysiology of diseases including several cancers and neurodegenerative diseases. Using a novel high-throughput approach, this study will characterize the features of intrinsically disordered protein sequences that govern their ability to form condensates via liquid-liquid phase separation. The proposed experiments will provide empirically validated models describing the connections between intrinsically disordered protein sequences, the conformational ensembles they adopt, and their ability to form biomolecular condensates.

Agency
National Institutes of Health (NIH)
Institute
National Institute of General Medical Sciences (NIGMS)
Type
Postdoctoral Individual National Research Service Award (F32)
Project #
1F32GM136058-01
Application #
9909965
Study Section
Special Emphasis Panel (ZRG1)
Program Officer
Sakalian, Michael
Project Start
2020-02-01
Project End
2022-01-31
Budget Start
2020-02-01
Budget End
2021-01-31
Support Year
1
Fiscal Year
2020
Total Cost
Indirect Cost
Name
University of Texas Sw Medical Center Dallas
Department
Physiology
Type
Schools of Medicine
DUNS #
800771545
City
Dallas
State
TX
Country
United States
Zip Code
75390