New, scientist-run virus database vows to be transparently run and simple to use
Pathoplexus is starting with sequences for Ebola strains and two other risky viruses
29 Aug 2024, 3:35 PM ET
By Jon Cohen
Ebola virus isolated in November 2014 from patient blood samples obtained in Mali.
A new database will share sequences for public health threats such as Ebola virus. NIAID/Wikimedia Commons
During the COVID-19 pandemic, researchers had sharp complaints about the main database for sharing SARS-CoV-2 sequences and its overseer, a controlling, secretive businessman with no formal scientific training and a checkered past. Now, a small band of scientists has launched an open-source database for some of the world’s most dangerous viruses that they say will be run by the very researchers who sequence the pathogens and analyze their genomes.
Called Pathoplexus, the database launched this week and will focus at first on the Sudan and Zaire strains of Ebola virus, as well as Crimean-Congo hemorrhagic fever virus and West Nile virus. Like similar databases, it hopes to help communities derail outbreaks before they grow, and, if that fails, better respond to epidemics and pandemics. “We believe that building trust through transparency is essential for encouraging broader participation in data sharing,” says Pathoplexus co-founder Anderson Fernandes de Brito, a computational biologist at the All for Health Institute.
But Pathoplexus aims to stand apart in other ways—especially compared with the Global Initiative on Sharing All Influenza Data (GISAID) database, which has become a central repository of sequences for the viruses that cause COVID-19, influenza, mpox, pneumonia, chikungunya, dengue, and Zika. GISAID has been harshly criticized for concealing its finances and governance, and several scientists have complained about its founder, erstwhile businessman Peter Bogner, and his representatives reprimanding them for how they use the database and even cutting off access during disputes.
Pathoplexus will be run by an executive board of sequencing scientists from five continents. “Our governance is guided by what we’ve tried to put at the heart of what we call the Pathoplexus values and are supposed to live on beyond any single person,” says Swiss Tropical and Public Health Institute molecular biologist Emma Hodcroft, another co-founder of the project. Those values include assuring that people who generate pathogen sequence data receive recognition for their work and share in benefits from it, such as diagnostics, treatments, and vaccines—issues that proved to be a key sticking point in an attempt by the World Health Organization member states to agree on a pandemic treaty in June.
Like GISAID, Pathoplexus will allow scientists depositing virus sequences to choose whether to make their data fully open or to restrict their use. But unlike GISAID, the new database plans to share all unrestricted data it receives with the world’s three main government-funded genome databases (GenBank, EMBL-EBI, and the DNA Data Bank of Japan), as required by many journals publishing analyses of viral sequences. It will also limit how long submitters can restrict access to their data, a protection meant to keep them from getting scooped on publications. GISAID never fulfilled a promise to do this.
“GISAID was set up to address these issues and protect submitters, but has suffered a major breakdown in community trust over the last couple of years,” says evolutionary biologist Edward Holmes of the University of Sydney, who is not involved with creating Pathoplexus and hopes it quickly catches on and expands. “So, there is a clear need for an alternative.”
Pathoplexus currently holds fewer than 15,000 sequences, as compared with GISAID’s nearly 18 million. The more sequences scientists submit, and the more quickly they openly share those data, the more valuable Pathoplexus will be, researchers say. “I am not sure it will be able to solve all the issues, but it is an attempt in the right direction,” says virologist Gustavo Palacios of the Icahn School of Medicine at Mount Sinai, who has studied the Ebola and Crimean-Congo hemorrhagic fever viruses.
ScienceInsider spoke with Hodcroft in this lightly edited interview.
Q: Why are you doing this?
A: The idea behind Pathoplexus is to try and create another option for people to help encourage pathogen data sharing. Sometimes people genuinely just don’t feel ready to share yet, but want to upload data. But it was too complicated, so they didn’t have the time and resources to figure out how to get that data online. Maybe the biggest thing is we’ve tried to just really lower the bar for uploading.
Q: Do GISAID’s governance issues have anything to do with it?
A: When you have millions of sequences that have been generated through the combined efforts of scientists around the world, it’s really important to be very sure that we are happy with the governance and the transparency. We did try and speak to people in the community about what they felt they would want if they could design a database from the ground up.
Q: How does Pathoplexus differ?
A: We want it to be clear how decisions are made before they’re made. Can you appeal? Can you ask for clarification? We’re making summaries of our meetings public so that people know what we are deciding and can give us feedback. Hopefully, it can be a reliable, open-source database that’s there to serve the community, no matter who’s on the board and who the members are.
Q: So there’s no Peter Bogner.
A: The day-to-day running of Pathoplexus is a five-person executive board that is elected and held accountable by the members. It’s not one person who is in charge.
Q: Where is your funding coming from?
A: By and large, this has really been a huge volunteer effort. We are actively looking for funding now. It’s wonderful to build this on rainbows and unicorns, but we can’t run that way forever.
Q: Why did you choose these four viruses?
A: We decided to focus on pathogens of public health concern that didn’t seem to have a good sharing solution at the moment. We also had connections to these communities. We’re already in discussions about what might be useful to add next.
Q: Is Pathoplexus going to mitigate issues about access to pathogen sequences and the benefits of sharing them?
A: This is something the biggest minds in the world have been working on for the past few months and have not been able to solve, so are we, as a small database, going to be the ones to come up with the answer? I think truthfully the answer is no. But we want to be part of that solution when there is a solution.
Q: How do you build trust?
A: There is friction between data sharing and data use in developed and developing countries in the Global North and the Global South, and we didn’t want to ignore that. So we had diverse input as we developed this, and then we have a really diverse board. As wonderful as it is to have really prestigious, big-name scientists on the board, they’re often at a point in their careers where they are not sitting down and mining the data anymore. And now I think we have to show the community that we really do want to hear from them, and that this is more than just a pretty, day one picture.
doi: 10.1126/science.zy6b10m
Jon Cohen
Science