Extended references for "AI told you so: navigating protein structure prediction in the era of machine learning"

In addition to the few references in our article (https://doi.org/10.1042/bio_2024_118), we have compiled a reference list of the many papers, websites and videos we discussed, as well as several that have been released since we submitted our final draft.
 
Machine-learning structure prediction algorithms
AlphaFold2: Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).
AlphaFold-Multimer: Evans, R. et al. Protein complex prediction with AlphaFold-Multimer. bioRxiv 2021.10.04.463034 (2021) doi:10.1101/2021.10.04.463034.
AlphaFold 3: Abramson, J., Adler, J., Dunger, J. et al. Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature (2024). doi:10.1038/s41586-024-07487-w
RoseTTAFold: Baek, M. et al. Accurate prediction of protein structures and interactions using a three-track neural network. Science 373, 871–876 (2021).
RoseTTAFoldNA: Baek, M. et al. Accurate prediction of protein–nucleic acid complexes using RoseTTAFoldNA. Nature Methods 21, 117–121 (2023).
RoseTTAFold All-Atom: Krishna, R. et al. Generalized biomolecular modeling and design with RoseTTAFold All-Atom. Science (2024) doi:10.1126/science.adl2528.
ColabFold: Mirdita, M. et al. ColabFold: making protein folding accessible to all. Nature Methods 19, 679–682 (2022).
ESMFold: Lin, Z. et al. Evolutionary-scale prediction of atomic-level protein structure with a language model. Science 379, 1123–1130 (2023).
RGN2: Chowdhury, R. et al. Single-sequence protein structure prediction using a language model and deep learning. Nature Biotechnology 40, 1617–1623 (2022).
EMBER3D: Weissenow, K., Heinzinger, M., Steinegger, M. & Rost, B. Ultra-fast protein structure prediction to capture effects of sequence variation in mutation movies. bioRxiv 2022.11.14.516473 (2022) doi:10.1101/2022.11.14.516473.
OmegaFold: Wu, R. et al. High-resolution de novo structure prediction from primary sequence. bioRxiv 2022.07.21.500999 (2022) doi:10.1101/2022.07.21.500999.
 
Antibody-specific references
IgFold: Ruffolo, J. A., Chu, L. S., Mahajan, S. P. & Gray, J. J. Fast, accurate antibody structure prediction from deep learning on massive set of natural antibodies. Nature Communications 14, 1–13 (2023).
ABlooper: Abanades, B., Georges, G., Bujotzek, A. & Deane, C. M. ABlooper: fast accurate antibody CDR loop structure prediction with accuracy estimation. Bioinformatics 38, 1877–1880 (2022).
Predicting antibodies using Rosetta: Bennett, N. R. et al. Atomically accurate de novo design of single-domain antibodies. bioRxiv 2024.03.14.585103 (2024) doi:10.1101/2024.03.14.585103.
 
Methods to evaluate protein complex predictions
PI-score: Malhotra, S., Joseph, A. P., Thiyagalingam, J. & Topf, M. Assessment of protein–protein interfaces in cryo-EM derived assemblies. Nature Communications 12, 1–12 (2021).
DockQ: Basu, S. & Wallner, B. DockQ: A Quality Measure for Protein-Protein Docking Models. PLoS One 11, e0161879 (2016).
mpDockQ: Bryant, P. et al. Predicting the structure of large protein complexes using AlphaFold and Monte Carlo tree search. Nature Communications 13, 1–14 (2022).
pDockQ2: Zhu, W., Shenoy, A., Kundrotas, P. & Elofsson, A. Evaluation of AlphaFold-Multimer prediction on multi-chain protein complexes. Bioinformatics 39, (2023).
Local Interaction Score (LIS): Kim, A.-R. et al. Enhanced Protein-Protein Interaction Discovery via AlphaFold-Multimer. bioRxiv 2024.02.19.580970 (2024) doi:10.1101/2024.02.19.580970.
PPIscreenML: Mischley, V., Maier, J., Chen, J. & Karanicolas, J. PPIscreenML: Structure-based screening for protein-protein interactions using AlphaFold. bioRxiv 2024.03.16.585347 (2024) doi:10.1101/2024.03.16.585347.
 
Online resources: Databases, Tools and Tutorials
AlphaFold Protein Structure Database: https://alphafold.ebi.ac.uk/
ESM Metagenomic Atlas: https://esmatlas.com/
EMBL-EBI tutorial course for AlphaFold (Magana Gomez, P.G. and Kovalevskiy, O.): https://www.ebi.ac.uk/training/online/courses/alphafold/
An excellent video guide to ColabFold and AlphaFold (Ovchinnikov, S. and Steinegger, M., hosted by Bahl, C.): https://www.youtube.com/watch?v=Rfw7thgGTwI
UniProt: Bateman, A. et al. UniProt: the Universal Protein Knowledgebase in 2023. Nucleic Acids Res 51, D523–D531 (2023).
          Website: https://www.uniprot.org/
BioGRID: Oughtred, R. et al. The BioGRID database: A comprehensive biomedical resource of curated protein, genetic, and chemical interactions. Protein Sci 30, 187 (2021).
          Website: https://thebiogrid.org/
Dali: Holm, L. Dali server: structural unification of protein families. Nucleic Acids Res 50, W210–W215 (2022).
          Website: http://ekhidna2.biocenter.helsinki.fi/dali/
Foldseek: van Kempen, M. et al. Fast and accurate protein structure search with Foldseek. Nature Biotechnology 42, 243–246 (2023).
          Website: https://search.foldseek.com
Progres: Greener, J. G. & Jamali, K. Fast protein structure searching using structure graph embeddings. bioRxiv 2022.11.28.518224 (2024) doi:10.1101/2022.11.28.518224.
          Website: https://github.com/greener-group/progres
Predictomes: Schmid, E. W. & Walter, J. C. Predictomes: A classifier-curated database of AlphaFold-modeled protein-protein interactions. bioRxiv 2024.04.09.588596 (2024) doi:10.1101/2024.04.09.588596.
PAE Viewer: Elfmann, C. & Stülke, J. PAE viewer: a webserver for the interactive visualization of the predicted aligned error for multimer structure predictions and crosslinks. Nucleic Acids Res 51, W404–W410 (2023).
 
Other related software and algorithms
ChimeraX: Meng, E. C. et al. UCSF ChimeraX: Tools for structure building and analysis. Protein Science 32, e4792 (2023).
AlphaPulldown: Yu, D., Chojnowski, G., Rosenthal, M. & Kosinski, J. AlphaPulldown—a python package for protein–protein interaction screens using AlphaFold-Multimer. Bioinformatics 39, (2023).
AF-Cluster: Wayment-Steele, H. K. et al. Predicting multiple conformations via sequence clustering and AlphaFold2. Nature 625, 832–839 (2023).
ACE: Schafer, J. W. & Porter, L. L. Evolutionary selection of proteins with two folds. Nature Communications 14, 1–13 (2023).
AlphaFill: Hekkelman, M. L., de Vries, I., Joosten, R. P. & Perrakis, A. AlphaFill: enriching AlphaFold models with ligands and cofactors. Nature Methods 20, 205–213 (2022).
AlphaFold2-RAVE: Gu, X., Aranganathan, A. & Tiwary, P. Empowering AlphaFold2 for protein conformation selective drug discovery with AlphaFold2-RAVE. (2024).
AF2BIND: Gazizov, A., Lian, A., Goverde, C., Ovchinnikov, S. & Polizzi, N. F. AF2BIND: Predicting ligand-binding sites using the pair representation of AlphaFold2. bioRxiv 2023.10.15.562410 (2023) doi:10.1101/2023.10.15.562410.
EMNGly: Hou, X., Wang, Y., Bu, D., Wang, Y. & Sun, S. EMNGly: predicting N-linked glycosylation sites using the language models for feature extraction. Bioinformatics 39, (2023).
CombFold: Shor, B., Schneidman-Duhovny, D., Rachel, T. & Benin, S. CombFold: predicting structures of large protein assemblies using a combinatorial assembly algorithm and AlphaFold2. Nature Methods 21, 477–487 (2024).
AlphaLink2: Stahl, K., Brock, O. & Rappsilber, J. Modelling protein complexes with crosslinking mass spectrometry and deep learning. bioRxiv 2023.06.07.544059 (2023) doi:10.1101/2023.06.07.544059.
Predicting diverse proteins: Wheeler, R. J. A resource for improved predictions of Trypanosoma and Leishmania protein three-dimensional structure. PLoS One 16, e0259871 (2021).
The Encyclopedia of Domains: Lau, A. M. et al. Exploring structural diversity across the protein universe with The Encyclopedia of Domains. bioRxiv 2024.03.18.585509 (2024) doi:10.1101/2024.03.18.585509.
DPAM: Zhang, J., Schaeffer, R. D., Durham, J., Cong, Q. & Grishin, N. V. DPAM: A domain parser for AlphaFold models. Protein Science 32, e4548 (2023).
MMseqs2: Steinegger, M. & Söding, J. MMseqs2 enables sensitive protein sequence searching for the analysis of massive data sets. Nat Biotechnol 35, 1026–1028 (2017).
PTM-Mamba: Peng, Z., Schussheim, B. & Chatterjee, P. PTM-Mamba: A PTM-Aware Protein Language Model with Bidirectional Gated Mamba Blocks. bioRxiv 2024.02.28.581983 (2024) doi:10.1101/2024.02.28.581983.
 
Use cases and other references
Related to the false positive example: Celestino, R. et al. JIP3 interacts with dynein and kinesin-1 to regulate bidirectional organelle transport. Journal of Cell Biology 221, (2022).
Molecular replacement for X-ray crystallography: Evans, P. & McCoy, A. An introduction to molecular replacement. Acta Crystallogr D Biol Crystallogr 64, 1–10 (2008).
Molecular replacement and AlphaFold2: McCoy, A. J., Sammito, M. D. & Read, R. J. Implications of AlphaFold2 for crystallographic phasing by molecular replacement. Acta Crystallogr D Struct Biol 78, 1–13 (2022).
An example of AlphaFold2 usage in crystallography: Cooper, B. F. et al. An octameric PqiC toroid stabilises the outer-membrane interaction of the PqiABC transport system. EMBO Rep 25, 82–101 (2024).
AlphaFold modelling the LIS1-dynactin interaction: Singh, K. et al. Molecular mechanism of dynein-dynactin complex assembly by LIS1. Science 383, 1431–1448 (2024).
AlphaFold used to help build an entire nuclear pore: Fontana, P. et al. Structure of cytoplasmic ring of nuclear pore complex by integrative cryo-EM and AlphaFold. Science 376, (2022).
Protein identification in cryoET maps using AlphaFold: Chen, Z. et al. De novo protein identification in mammalian sperm using in situ cryoelectron tomography and AlphaFold2 docking. Cell 186, 5041-5053.e19 (2023).
A screen for protein complexes in yeast using RoseTTAFold and AlphaFold: Humphreys, I. et al. Computed structures of core eukaryotic protein complexes. Science 374, (2021).