A workshop as part of the Pacific Symposium on Biocomputing 2024

Risk prediction: Methods, Challenges, and Opportunities

The primary efforts of disease and epidemiological research can be divided into two areas: identifying causal mechanisms and utilizing important variables for risk prediction. The latter is generally perceived as the more attainable goal because of the vast number of readily available tools and the faster pace of obtaining results. However, the lower barrier to entry means that it is easy to make predictions, yet incredibly difficult to make sound ones. As an ever-growing amount of data is generated, developing risk prediction models and turning them into clinically actionable findings is a crucial next step. Sizable gaps remain, however, before risk prediction models can be implemented clinically. While clinicians are eager to embrace new ways to improve patient care, they are overwhelmed by a plethora of prediction methods. Thus, the next generation of prediction models will need to shift from making simple predictions toward interpretable, equitable, explainable, and ultimately causal predictions. The purpose of this workshop is to introduce and discuss the current state and future of risk prediction in the context of disease and epidemiological research. We will discuss pressing topics ranging from data sources to model implementation.


9:00 - 9:05 Workshop Introduction

9:05 - 9:35 Building Trust in AI for Improving Health Outcomes - Dr. Randi Foraker (Washington University in St. Louis)

9:35 - 10:05 Knowledge extraction from clinical notes for AI-enhanced risk prediction: the promise of generative NLP approaches and large language models - Dr. Graciela Gonzalez-Hernandez (Cedars-Sinai Medical Center)

10:05 - 10:35 TBD - Dr. Bogdan Pasaniuc (UCLA)

10:35 - 10:45 Break

10:45 - 11:15 Improving risk prediction by leveraging biomarker genetics - Dr. John Witte (Stanford University)

11:15 - 11:45 Shortening odysseys: AI for rare disease diagnosis and therapeutic innovation - Dr. Marinka Zitnik (Harvard University)

11:45 - 12:00 Discussion

Workshop Organizers

Ruowang Li, Ph.D. is an Assistant Professor in the Department of Computational Biomedicine at Cedars-Sinai Medical Center. His lab focuses on developing computational methods to extract knowledge from large-scale population-level data, such as biobank-linked electronic health record data. His areas of research include multi-omics data integration, federated learning of patient data, genetic risk prediction, and genome-phenome associations.

Rui Duan, Ph.D. is an Assistant Professor of Biostatistics at the Harvard T.H. Chan School of Public Health. Her research interests focus on developing statistical and machine learning methods for effective use of biomedical data, in order to generate reliable evidence and knowledge that enable precise and accurate diagnostics, support clinical decision making, and optimize individualized treatments. Specifically, her lab focuses on predictive models based on electronic health records (EHR) and EHR-linked biobanks, federated learning and meta-analysis methods for effective evidence synthesis and data integration, and methods to account for suboptimality of real-world data, including missing data and measurement errors.

Lifang He, Ph.D. is an Assistant Professor in the Department of Computer Science and Engineering at Lehigh University. Her group focuses on developing advanced computational methods for biomedical research, including the understanding of disease mechanisms, diagnosis, prognosis, disease biomarkers, and disease pathways. Her research interests broadly include machine learning, computational medical imaging, AI for health, tensor computing, and multimodal analysis.

Jason H. Moore, Ph.D. is a biomedical informatician and founding Chair of the Department of Computational Biomedicine at Cedars-Sinai Medical Center in Los Angeles. His research on artificial intelligence methods for the analysis of biomedical data has been continuously funded by the NIH for more than 20 years. He has been a pioneer in the development of automated machine learning methods for risk prediction in population-based studies and samples derived from electronic health records. He is an elected fellow of the American College of Medical Informatics, the International Academy of Health Sciences Informatics, the American Statistical Association, the International Statistical Institute, and the American Association for the Advancement of Science. He is Editor-in-Chief of the open-access journal BioData Mining.