Team of PhD Researchers Unveils AI-powered Platform to Open Source COVID-19 Vaccine Development

Powered by ArangoDB, epitopes.world lets vaccine developers leverage a neural network to identify targets for SARS-CoV-2 vaccine testing.

A team of machine learning, immunology, and bioinformatics researchers today unveiled Epitopes.world, an AI-powered, open source, interactive web platform to help accelerate vaccine development for COVID-19. Led by Tariq Daouda, PhD, who is currently a postdoctoral researcher at Harvard Medical School, the team behind epitopes.world consists of volunteers who have doctoral degrees in machine learning and immunobiology, as well as bioinformaticians and web developers.

Developing a vaccine is typically lengthy and costly because of the number of virus-infected cells that need to be analyzed, and such cells tend to be rare, fragile, and precious. While obtaining his doctorate at the Institute for Research in Immunology and Cancer at the Université de Montréal, Dr. Daouda developed an AI algorithm that predicts which parts of a virus, called epitopes, are more likely to be exposed at the surface of infected cells. Researchers can use these predictions to generate a significantly shorter list of potential targets to test when creating a vaccine, reducing a process that typically takes weeks or months to hours.

Today, this neural network, called CAMAP, has been made available for any researcher to use on Epitopes.world. Along with generating predictions for potential vaccine targets, epitopes.world will also contain interactive visualizations that will allow researchers to plot their results and use them for further research. In the future, the site will continue to grow with more tools that allow researchers to dive deeper into the data.

“COVID-19 stresses the need to accelerate the design of vaccines and therapies to reduce the human and economic impact of global pandemics,” said Tariq Daouda, PhD, team lead of Epitopes.world. “People infected with COVID-19 tend to have a significant decrease in circulating immune cells during the acute infection phase, making it difficult to systematically isolate enough immune cells to study appropriately in a lab. Through the results now available on Epitopes.world, the team is utilizing open source technologies to connect machine learning to biomedicine to help accelerate learnings and findings. The hope is that the scientific community will be able to leverage these results to help prioritize ongoing experimental work towards developing effective vaccination strategies.”

Dr. Daouda and his team built epitopes.world using open source technologies, which not only accelerated the development of the project but also allow researchers who use the platform to collaborate and share results with each other more easily. The platform also has a public API, and the code is available on GitHub under an open-source license.

ArangoDB, the leading open source multi-model graph database, serves as the backend of the portal, storing over 182,000 epitopes and their metadata: approximately 39,000 from SARS-CoV-2, 39,000 from SARS-CoV-1, and 104,000 from normal human sequences for comparison. As an open source, multi-model database, ArangoDB gives epitopes.world a streamlined deployment stack and flexible data access as the project evolves.

Jörg Schad, PhD, is the Head of Engineering and Machine Learning at ArangoDB and a core member of the epitopes.world team, responsible for the cloud infrastructure and maintenance. In order to support the project, ArangoDB is providing free cloud resources in the form of its fully-managed cloud offering, ArangoDB Oasis.

“The current situation with the novel coronavirus pandemic requires the combined expertise from a wide range of experts and diverse backgrounds,” said Jörg Schad, PhD, Head of Engineering and Machine Learning at ArangoDB. “We are thrilled to be able to collaborate and contribute our knowledge and platform for scalable and high-performance data, and especially graph, processing to this project.”

About epitopes.world 
Epitopes.world is made and maintained by a multidisciplinary team of researchers and developers who are committed to sharing science openly and developing innovative ways of instantly disseminating research results when timing is crucial. Learn more at epitopes.world.

About ArangoDB 
One engine. One query language. Multiple data models. With more than 7 million downloads and over 9,000 stargazers on GitHub, ArangoDB is the leading open source multi-model graph database. It combines the power of graphs with JSON documents, a key-value store, and a full-text search engine, enabling developers to access and combine all of these data models with a single, elegant, declarative query language.

Simplifying complexity and increasing productivity is the mission of ArangoDB Inc., the company behind the project. Founded in 2014, ArangoDB Inc. is a privately held company backed by Bow Capital and Target Partners. It is headquartered in San Francisco and Cologne, Germany, with offices and employees around the world. Learn more at http://www.arangodb.com.

Epitopes derived from SARS-CoV-2 proteins are presented at the cell surface by MHC molecules. CD8 T cells then recognize these epitopes, become activated, and proceed to eliminate the virus-infected cells.