Next generation genome scale sequencing of patients is now becoming routine for two classes of disease: rare Mendelian traits and cancer. In favorable cases, these data allow identification of relevant mutations and thus aid diagnosis and therapy. In both classes of disease, the most common type of mutation is missense: single base changes that result in an amino acid substitution in a protein. Uncertainty as to the impact of these mutations on in vivo protein activity has resulted in a very conservative approach to their interpretation in the clinic, so causing many missed opportunities for targeted treatment. The goal of this project is to use a combination of three strategies to make the interpretation of these mutations much more applicable in the clinic. There are already a large number of computational methods that attempt to determine the impact of missense mutations on function, and there is substantial evidence that these have useful accuracy. The primary difficulty is that the accuracy in any particular case is not reliably calibrated. Therefore, our first aim is to use a combination of these methods to develop an approach focused on more reliable estimates for the probability of high impact on protein function (i.e. more confident P values).
The second aim is to maximize the utilization of three-dimensional structural information, largely ignored by most computational methods. A large fraction of missense mutations in these classes of disease act by destabilizing protein structure and knowledge of structure allows these to be identified with much higher reliability. Also, structure provides a framework for detailed annotation and comprehension of function. To facilitate the utilization of structure, we will implement a modeling platform that leverages available experimental information to maximize the structural data available for analyzing mutation impact. An important aspect of the platform is incorporation of methods for evaluating the reliability of the structural features relevant to analysis of each mutation.
In the third aim we will build specific functional models for each protein of interest, integrating information from current databases, the literature, and community input, so as to provide the richest possible background against which to judge the impact of mutations. Proteopedia, a well established media wiki for proteins, will be used to provide an integrated view of text, data, and structure. A key component of the information resource will be contributions from curators, who will provide annotation and also solicit input from other experts. This aspect of the project builds on experience with other crowd sourcing endeavors, including CASP, CAGI and Proteopedia. There will be three primary outcomes from the project: First, improved reliability for the interpretation of missense mutations. Second, a prototype mutation annotation procedure suitable for use in a clinical setting. Third, the resource will provide information of benefit to a range of other scientists, thus facilitating the analysis of disease related mutations.

Public Health Relevance

Genome scale DNA sequencing is now contributing to diagnosis and therapy in cases of rare human disease and cancer. Full exploitation of these data is currently hampered by inadequate understanding of which DNA changes affect protein function so as to contribute to disease. This project aims to develop the methods and tools needed to remove that obstacle.

Agency
National Institutes of Health (NIH)
Institute
National Institute of General Medical Sciences (NIGMS)
Type
Research Project (R01)
Project #
5R01GM120364-03
Application #
9504498
Study Section
Macromolecular Structure and Function D Study Section (MSFD)
Program Officer
Lyster, Peter
Project Start
2016-09-26
Project End
2019-06-30
Budget Start
2018-07-01
Budget End
2019-06-30
Support Year
3
Fiscal Year
2018
Total Cost
Indirect Cost
Name
University of Maryland College Park
Department
Miscellaneous
Type
University-Wide
DUNS #
790934285
City
College Park
State
MD
Country
United States
Zip Code
20742