Event: Knot or Not? Sequence-Based Identification of Knotted Proteins With Machine Learning
Date: Friday 17 November 2023
Time: 15:00 - 16:00
Venue: BM401B, Centre for Molecular Medicine & Biobanking, Biomedical Sciences Building, UM Msida Campus
Project BioGeMT is organising the first seminar of a bimonthly Bioinformatics Seminar Series, detailed below. Visit the News & Events > Seminar Series page on their website to stay up to date with the schedule.
Speaker: Joanna Sulkowska – Interdisciplinary Laboratory of Biological Systems Modelling – University of Warsaw
Presentation:
Knotted proteins, although scarce, are crucial structural components of certain protein families, and their roles remain a topic of intense research. Capitalising on the vast collection of protein structure predictions offered by AlphaFold, this study computationally examines the entire UniProt database to create a robust dataset of knotted and unknotted proteins.
Utilising this dataset, we develop a machine learning model capable of accurately predicting the presence of knots in protein structures solely from their amino acid sequences, with our best-performing model demonstrating a 98.5% overall accuracy. Unveiling the sequence factors that contribute to knot formation, we discover that proteins from known knotted families that are predicted to be unknotted are typically non-functional fragments missing a significant portion of the knot core. The study further explores the significance of the substrate binding site in knot formation, particularly within the SPOUT protein family. Our findings spotlight the potential of machine learning in enhancing our understanding of protein topology and propose further investigation into the role of knotted structures across other protein families.
