Future Developers Meeting
Organizers: Olga Vitek and Michael Shortreed
Description
This workshop brings together developers (and aspiring developers) of computational and statistical tools for mass spectrometry, proteomics and metabolomics. The program will include tutorials by leading developers of tools, invited oral presentations, oral and poster presentations selected from submitted abstracts, a panel discussion, as well as ample opportunities for informal interactions.
We invite everyone to submit abstracts describing tools for mass spectrometry, proteomics and metabolomics. Of particular interest are topics such as treatment of match-between-runs, normalization of multi-batch experiments, and treatment of missing values; however, all topics are welcome. The abstracts should focus less on specific functionality and more on the design of the tool and its strategy for enabling reproducible research, sustainability, and interoperability with other computational tools.
A subset of the abstracts will be selected for oral or poster presentations. All accepted presenters will receive free admission to this part of the program.
Target audience
Anyone interested in developing computational and statistical tools for mass spectrometry, proteomics and metabolomics.
Meeting Agenda
Below is the draft of the 2025 meeting agenda. More updates and additions will be posted here soon.
Location: Northeastern University Main Campus, Boston MA (Room TBA)
Saturday, May 10
May 10, 9:00am – 12:30pm | Tutorial: Knowledge-driven reconstruction of context-specific molecular networks using proteomics. This session combines theoretical insights and practical approaches to context-specific biological network inference, leveraging prior knowledge from OmniPath. We will explore the foundational concepts and methodologies, including the various levels of information provided by prior knowledge, the trade-off between coverage and confidence, and both basic and advanced techniques for single- and multi-condition analyses using tools from NetworkCommons and CORNETO. The session will also feature an open discussion on the major limitations in the field and the strategies being developed to address these challenges. For the practical part of the session, attendees should have a basic understanding of Python programming and a Google account to run notebooks in Google Colab. Speaker: Martin Garrido |
May 10, 12:30pm – 1:30pm | Lunch break and posters |
May 10, 1:30pm – 5:00pm | Tutorial: Challenges in scalable computing for biological data. Languages like R and Python provide an accessible introduction to programming for analyzing biological data through communities like Bioconductor and Biopython. However, scaling workflows to analyze massive datasets efficiently can be very difficult and often requires significant programming expertise. We will discuss common challenges in parallel processing and distributed computing for biological datasets, current techniques for dealing with these challenges, and future directions the community can take to improve the state of the art. Participants should be comfortable programming in R or Python, with an openness to learning concepts from other languages. Speaker: Kylie Bemis |
May 10, 6:00pm – 9:00pm | Future Developers Meeting dinner and scientific panel. Location: TBA. All participants are invited. |
Sunday, May 11
Additional speakers TBA
May 11, 9:00am – 10:30am | Tutorial: Missingness-informed protein quantification and differential expression (DE) analysis powered by limma and limpa. In this lecture, we will introduce the suite of limpa and limma methods for differential expression (DE) analysis in mass spectrometry (MS)-based proteomics data. The new pipelines are based on the detection probability curve (DPC; Li and Smyth, 2023), which is a probabilistic model for non-ignorable missingness in MS-based data. DPC-based protein quantification takes missing values into account, which significantly improves the DE analysis. We will also briefly discuss best practices for exploratory data analysis, batch correction and normalization of MS data. Attendees should have basic to intermediate R coding skills. Speaker: Mengbo Li |
May 11, 10:30am – 11:00am | Break and refreshments |
May 11, 11:00am – 12:30pm | Invited and contributed talks. Invited speaker: Brian Searle, Ohio State University (title TBA). Additional speakers TBA. |
May 11, 12:30pm – 1:30pm | Lunch break and posters |
May 11, 1:30pm – | Invited and contributed talks (open-ended). Additional speakers TBA. |
Practical Details
Application
You may apply to attend the Future Developers Meeting via the May Institute application form. Deadlines are the same as for May Institute.
Talk/Poster Requirements
Submissions must be your original research. Talks should be 15 minutes or less. Posters should be roughly A0 in size. Please include an abstract of at most one page with your application, or email it to mayinstitute@ccs.neu.edu by the May Institute application deadline.
Featured Speakers
Martin Garrido, EMBL, Julio Saez-Rodriguez lab |
Originally from Córdoba, Spain, Martin is a biochemist with over 10 years of experience analyzing omics data. He earned his Bachelor’s degree in Biochemistry from the University of Córdoba, followed by two Master’s degrees: an MSc in Biomedical Research from the University of Córdoba and an MSc in Computational Biology and Bioinformatics from the National Health School in Madrid. He developed his PhD research at the Maimonides Biomedical Research Institute of Córdoba (IMIBIC) and the Clinical Bioinformatics Area in Sevilla. In 2021, Martin began his postdoctoral journey at Heidelberg University and EMBL. His current work focuses on advancing our understanding of cellular signaling processes. Martin specializes in conceptualizing and modeling these processes on a large scale, particularly through the functional analysis and interpretation of transcriptomics and functional proteomics data. To achieve this, he leverages computational network approaches and, as datasets grow larger, interpretable machine learning methods. | |
Kylie Bemis, Northeastern University |
Kylie is an Assistant Teaching Professor in the Khoury College of Computer Sciences at Northeastern University. She holds a B.S. degree in Statistics and Mathematics, an M.S. degree in Applied Statistics, and a Ph.D. in Statistics from Purdue University. In 2013, she interned at the Canary Center at Stanford for Cancer Early Detection, where she developed the Cardinal software package for statistical analysis of mass spectrometry imaging experiments. In 2015, she was awarded the John M. Chambers Statistical Software Award by the American Statistical Association for her work on Cardinal. In 2016, she joined the Olga Vitek lab for Statistical Methods for Studies of Biomolecular Systems at Northeastern University as a postdoctoral fellow. In 2019, she joined Northeastern as faculty, where she now teaches data science and develops curriculum for the M.S. in Data Science program. Her research interests include machine learning and large-scale statistical computing for bioinformatics. | |
Mengbo Li, WEHI, Gordon Smyth lab |
Mengbo Li is a Research Officer in the Bioinformatics Division at WEHI (Walter and Eliza Hall Institute of Medical Research), Australia, where she completed her PhD on the development of statistical methods for mass spectrometry-based proteomics and imaging-based spatial transcriptomics data. She has been a member of the Community Advisory Board for Bioconductor since 2023. She is an author of, or contributor to, several of the most downloaded packages on Bioconductor, including limma (vooma pipelines for proteomics data), edgeR (voomLmFit pipelines for RNA-seq) and scider (for single-cell resolution spatial transcriptomics data). Her research interests focus on the development of statistical and computational methods for the analysis of proteomics and spatial technologies. |