Invited Talks - AI4SC'24

It's the Data, Stupid….

Speaker: Prof. Carl Kesselman

Abstract: Recent advances in machine learning and large language models have produced extraordinary results. For example, AlphaFold has solved a grand challenge in biology that has been open for over fifty years. However, if you look carefully, you will find that AlphaFold was possible only because of high-quality, well-curated data, specifically the Protein Data Bank. More generally, realizations like this have led to the emergence of what is called data-centric AI, an approach that focuses on creating high-quality data rather than on creating new models. In this talk, I will describe our efforts in creating high-quality FAIR data for high-quality reproducible machine learning applications. I will also present examples from the FaceBase data repository and smaller-scale collaborations in glaucoma detection.

Speaker Bio:

Carl Kesselman is the William M. Keck Professor of Engineering in the USC Viterbi School of Engineering. He is a professor in the Daniel J. Epstein Department of Industrial and Systems Engineering and holds positions in the Department of Computer Science, the Department of Population and Public Health Sciences in the Keck School of Medicine, and Biomedical Sciences in the Ostrow School of Dentistry. Dr. Kesselman is a USC Information Sciences Institute Fellow, where he directs the Informatics Systems Research Division, and is the Director of the Center of Excellence for Discovery Informatics in the Michelson Center for Convergent Biosciences. He has been the PI on collaboration and data management and analysis infrastructure for numerous large-scale NIH-funded initiatives in areas such as craniofacial development, kidney reconstruction, synaptic mapping, and genito-urinary tract development.

Dr. Kesselman has received numerous honors for his pioneering research, including the Lovelace Medal from the British Computer Society, the Goode Memorial Award from the IEEE Computer Society, and the IEEE Internet Award. He is a Fellow of the British Computer Society, the IEEE, and the Association for Computing Machinery (ACM).

AI and Scientific Discovery

Speaker: Dr. Shounak Datta

Abstract: The integration of advanced Artificial Intelligence (AI), particularly foundation models in language and vision, is revolutionizing scientific discovery. AI methods are transforming research by facilitating hypothesis generation, experiment design, data analysis, and interpretation, enabling breakthroughs across diverse fields such as genetics and climate modeling.

This talk explores recent applications of AI tools and models that are reshaping the scientific process in fields ranging from genetics to climate modeling. We discuss AI methods that span self-supervised learning, which allows models to be trained on vast amounts of unlabelled data, and generative methods that can synthesize hypotheses by analyzing diverse data modalities, including images and sequences. While these advances offer significant benefits, we will also address ongoing challenges and discuss the algorithmic innovations necessary to further accelerate scientific progress with AI.

Speaker Bio:

Dr. Shounak Datta is currently an ML Researcher at Arm Inc., working on making large AI models more efficient. He holds a Ph.D. in Computer Science from the Indian Statistical Institute in Kolkata, India, an M.E. in Electronics and Telecommunication Engineering from Jadavpur University in Kolkata, and a B.Tech. in Electronics and Communication Engineering from the West Bengal University of Technology in Kolkata.

Before joining Arm, Dr. Datta worked as an Applied ML Scientist at Amazon and as a Postdoctoral Associate in the Department of Electrical and Computer Engineering at Duke University. His research interests include large foundation models, few-shot learning, and class imbalance.

Speaker's Webpage: https://shounak-d.github.io/