Machine learning for discovery: deciphering RNA splicing logic

Oded Regev, New York University
Fine Hall 214

Recent advances in machine learning, such as deep learning, have led to powerful tools for modeling complex data with high predictive accuracy. However, the resulting models are typically black boxes, limiting their usefulness in scientific discovery. Here we show that an "interpretable-by-design" machine learning model captures a fundamental cellular process known as RNA splicing. Our model provides a systematic understanding of RNA splicing logic, recapitulating and extending existing domain knowledge. It also allowed us to discover and experimentally validate novel splicing features. This study highlights how interpretable machine learning can advance scientific discovery. The talk will not assume any prior biological knowledge.

Based on joint work with Susan E. Liao and Mukund Sudarshan.