The wealth and variety of data generated in modern medical and health-care settings present tremendous research challenges as well as opportunities in artificial intelligence and machine learning. Extensive electronic medical records - with thousands of fields recording patient conditions, diagnostic tests, treatments, outcomes, as well as narrative about the patients and care delivery - provide an unprecedented source of information. Tapping into this data source can bring clues leading to improvements in a wide range of health-care applications, such as disease modeling and early detection, chronic disease management, and efficient design of clinical trials.

Intellectual Merit: The Workshop on Machine Learning for Clinical Data Analysis (http://sites.google.com/site/mlclinicaldata) will be held during the International Conference on Machine Learning (ICML), 2012, Edinburgh, Scotland on June 30-July 1, 2012. The workshop aims to bring together machine learning and informatics researchers interested in problems and applications in the clinical domain, with the goal of bridging the gap between the theory of machine learning and the needs of the health informatics applications. The award provides funds to cover the travel costs of invited speakers and graduate students. The Ph.D. student participants will be able to present their work, interact with their peers from other universities as well as hundreds of leading researchers in machine learning from around the world. In addition to attending the workshop, they will attend the technical sessions, plenary talks, and tutorials of their choice at the conference. The invited speakers will present talks covering state-of-the-art research as well as open machine learning research challenges in building predictive models from clinical data. The workshop aims to educate the machine learning research community regarding machine learning research opportunities and challenges in health care applications, especially in connection with recent electronic health record initiatives; identify new machine learning problems not previously addressed by the community; and help build a community of researchers who can advance machine learning informed by the challenges and opportunities presented by clinical data analytics.

Broader Impacts. Machine learning is playing an increasingly important role in many emerging data-rich sciences and application domains, such as bioinformatics, computational biology, health informatics, and security informatics. Participation in the workshop and the ICML and COLT conferences will enrich the education and training of student researchers at early stages in their careers. The travel awards will help broaden the participation of women and members of underrepresented minority groups within the machine learning and health informatics research communities.

Project Report

The Workshop on Machine Learning for Clinical Data Analysis was held during the International Conference on Machine Learning (ICML), 2012 in Edinburgh, Scotland. The workshop was co-chaired by Noemie Elhadad from Columbia University and Milos Hauskrecht from the University of Pittsburgh, the PI of the NSF award. A broad program committee (PC) with expertise in machine learning, data mining, and biomedical informatics was recruited to support the paper reviews.

Workshop Aims. Our aim was broad participation of established researchers and graduate students working on machine learning problems and machine learning solutions for clinical and health care data, with the goal of exchanging ideas and perspectives on medical applications, research bottlenecks, and the needs of the healthcare community.

Submissions and review. We solicited twenty-two paper and extended-abstract submissions by advertising the workshop on the mailing lists of major related conferences and by posting announcements on the relevant email lists and online groups focused on health informatics, machine learning, data mining, and natural language processing. Each contribution was reviewed by three PC members; based on their recommendations, we accepted ten full-length papers and ten extended abstracts. Full-length papers were presented orally and extended abstracts as posters.

Workshop. The two-day workshop (June 30-July 1, 2012) consisted of a sequence of oral paper presentations, invited talks, a poster session, and a panel on challenges for Machine Learning on Clinical Data Analysis. The workshop program and final versions of all accepted papers are available at http://sites.google.com/site/mlclinicaldata/, a website built and maintained by Dr. Hauskrecht, the PI of the NSF award. We estimate that approximately 40 researchers and students attended the workshop over the course of the two days.

Invited speakers. We invited four senior speakers in the field to the workshop.
The speakers were identified on the basis of long-standing contributions to research and teaching activities related to clinical data analysis. The invited speakers for the workshop were: (1) Gregory Cooper, MD, PhD, a Professor of Biomedical Informatics and of Intelligent Systems at the University of Pittsburgh; (2) Shahram Ebadollahi, PhD, IBM Healthcare Systems and Analytics Research; (3) John Holmes, MD, PhD, University of Pennsylvania; and (4) George Hripcsak, MD, PhD, a Professor and Chair of Columbia University’s Department of Biomedical Informatics and Director of Medical Informatics Services for NewYork-Presbyterian Hospital.

A panel on challenges for clinical data analysis was held at the end of the workshop. The panelists were Dr. Zoran Obradovic, Dr. Hagit Shatkay, Dr. Gregory Cooper, and Dr. John Holmes, and many other workshop attendees also actively participated in the discussion. The topics discussed were the problems, data characteristics, challenges, and outcome measures that are specific to the analysis of and learning from clinical data, and that distinguish them from the problems and data routinely analyzed by the machine learning and data mining community. Initial plans for a possible paper summarizing these specifics were laid out and discussed.

Inclusion of underrepresented groups. Clinical data analysis provides an appealing application domain that attracts a broad range of participants with diverse backgrounds. Especially important is the inclusion of students and groups that are commonly underrepresented in machine learning conferences. The ICML conference has traditionally encouraged students to participate in the main conference and workshops, and has supported them by offering student scholarships that, in exchange for conference volunteering, helped to pay the conference registration costs and some of the travel expenses.
However, 2012 was different: only very few student fellowships were given to conference participants, and especially few to workshop participants. Thanks to the NSF award, we were able to support the travel of four graduate students from US universities, who were able to come, present their results, and receive feedback from senior researchers in the field. The four student participants who applied for and received the student travel support were Charmgil Hong, Zitao Liu, Jenna Wiens, and Marzyeh Ghassemi. We believe that without the NSF support the students would not have been able to attend.

Summary. We believe the 'Machine Learning for Clinical Data Analysis' workshop was a success and that participants left with new insights and inspiration for further work in the area. Thanks to the NSF funding, which supported in part the travel of invited speakers and graduate students, we were able to put together a high-quality technical program and create an opportunity to share knowledge and give feedback on early research work. It is our hope that this workshop, together with the previous ICML workshops (ML for Health Care Applications in 2008 and Learning from Unstructured Clinical Text Data in 2011), will pave the way to increased integration and sharing of results and methods among these communities, and to a permanent forum or conference on this topic.

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Type: Standard Grant (Standard)
Application #: 1243409
Program Officer: Sylvia Spengler
Budget Start: 2012-06-15
Budget End: 2013-05-31
Fiscal Year: 2012
Total Cost: $18,000

Name: University of Pittsburgh
City: Pittsburgh
State: PA
Country: United States
Zip Code: 15260