The VODAN Implementation Network is one of the joint activities carried out by CODATA, RDA, WDS, and GO FAIR (Link to the Data Together Statement).
Read the full statement on Data Together COVID-19 Appeal and Actions.
Active GO FAIR Implementation Network
The spread of the virus causing the COVID-19 outbreak is far from over. During this epidemic and on earlier occasions, we have seen severely suboptimal data management and data reuse. Moreover, access to the immensely valuable data of past and current epidemics is not always equal for different affected populations and countries. For instance, the data from past Ebola epidemics are very difficult to find and to access, and where they are accessible, they are not interoperable, let alone reusable. Given the urgent need to harness machine-learning and future AI approaches to discover meaningful patterns in epidemic outbreaks, we need to do better and ensure that data are FAIR (in this sense also meaning Federated, AI-Ready).
Purpose of the Implementation Network
This time, we can do better. We now have the technical ability, as well as the commitment from experts in a series of affected countries, to make the SARS-CoV-2 virus data FAIR during this epidemic of COVID-19, meaning that they are Findable, Accessible, Interoperable and thus Reusable by both humans and machines. The technical components that make this possible can remain in place, ready for potential future infectious disease outbreaks.
We started this Implementation Network with a very narrow focus, based on seed funding from the co-founding partners ZonMw and the Philips Foundation (see the manifesto): to make source data FAIR and available for reuse in a distributed manner. These initial projects under the VODAN IN are coordinated by the GO FAIR Foundation. With a sense of urgency driven by the rapid developments around COVID-19, we came together to launch a GO FAIR Implementation Network to address the immediate challenges.
For this epidemic, unfortunately, we have to ‘FAIRify’ COVID-19 data ‘after the fact’, using electronic (or even hand-written) health records in Chinese, Dutch, Swedish, English, and other languages to create proper FAIR data. The FAIRification will initially focus on the Clinical Research Form (CRF) model following the WHO standards. As a first step, multiple IN partners will create input forms that make it easy for local caregivers to create FAIR-CRF data in real time. As a second step, we will jointly develop (via online work sessions) localized FAIR Data Points (FDPs).
An FDP is a FAIR data repository with ‘docking’ capabilities: it acts as a ‘station’ for ‘trains’ (virtual machines, VMs) that come to ‘visit’ the data locally with a specific question to ask. The local data custodian (frequently a hospital or a centre for disease control and prevention) grants VMs permission to ask the question or run analyses. Because the personal data of patients never leave the underlying database of the local institution, GDPR issues are largely accommodated. In this way data can be ‘shared’, or rather ‘visited’, without violating any patient rights, and, in the case of a disease outbreak, access remains governed by the laws and policies of the individual jurisdictions in which the outbreak manifests.
Trains (VMs) can visit multiple local FAIR Data Points to get their questions answered. For more information on the underlying technological approach please visit the Personal Health Train IN pages. The data stewardship aspects of FAIR data will be addressed wherever possible with the Data Stewardship Competence Centres IN.
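The ‘train visits station’ idea can be illustrated with a minimal Python sketch. It is not the Personal Health Train or FAIR Data Point software: the class names, the `grant` and `visit` methods, the record fields and the example question are all hypothetical, chosen only to show that a question travels to the data, that the custodian decides whether it may run, and that only aggregate answers leave the station.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set

Record = Dict[str, object]


@dataclass
class Train:
    """A question that travels to the data (hypothetical model)."""
    train_id: str
    question: Callable[[List[Record]], Dict[str, int]]


@dataclass
class FairDataPoint:
    """A 'station' holding local records that never leave it (hypothetical)."""
    name: str
    records: List[Record] = field(default_factory=list, repr=False)
    approved: Set[str] = field(default_factory=set, repr=False)

    def grant(self, train_id: str) -> None:
        # The local data custodian approves a specific train.
        self.approved.add(train_id)

    def visit(self, train: Train) -> Dict[str, int]:
        # Run the question locally; only its aggregate answer is returned.
        if train.train_id not in self.approved:
            raise PermissionError(f"{train.train_id} not approved at {self.name}")
        return train.question(self.records)


def count_severe(records: List[Record]) -> Dict[str, int]:
    """Example question: how many patients, and how many severe cases?"""
    return {
        "patients": len(records),
        "severe": sum(1 for r in records if r.get("severe")),
    }


def run_train(train: Train, stations: List[FairDataPoint]) -> Dict[str, int]:
    """Visit each station and sum the answers; unapproved stations are skipped."""
    totals: Dict[str, int] = {}
    for station in stations:
        try:
            answer = station.visit(train)
        except PermissionError:
            continue
        for key, value in answer.items():
            totals[key] = totals.get(key, 0) + value
    return totals


# Invented example data: two stations, only one has approved the train.
hospital_a = FairDataPoint("hospital-a", records=[
    {"age": 71, "severe": True},
    {"age": 34, "severe": False},
])
hospital_b = FairDataPoint("hospital-b", records=[{"age": 58, "severe": True}])

train = Train("severity-count-v1", count_severe)
hospital_a.grant(train.train_id)
totals = run_train(train, [hospital_a, hospital_b])
```

In this sketch the second station contributes nothing until its custodian calls `grant`, which mirrors the point that each local institution keeps control over which questions may be asked of its data.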
The VODAN IN consortium is a lightweight public-private partnership (members listed under the VODAN clusters) that will jointly address, in a stepwise fashion, the following issues:
- Ensure that the WHO-CRF(s) and other input forms for Corona data (and later viral outbreaks in general) are properly mapped to a machine-readable (RDF) format, so that any stakeholder can create input forms whose resulting data are machine-actionable (FAIR) digital objects.
- Create multiple user-friendly input systems (e.g., web forms) that create interoperable (FAIR) data ‘upon save’. Castor, one of the IN members, has already taken a first step.
- Assist partners in affected regions, together with local experts (domain experts, EHR experts and Semantic Data experts), in using traditional source files to create FAIR versions of selected data available in their country, mainly through online collaboration sessions.
- Install, jointly with the local partners, a local FAIR Data Point (FDP), or multiple points in case the data are not centrally collected in the country. Again, this is done in remote working sessions, as an FDP is a server application that can be installed in partnership with local institutions, according to the local specifications and in compliance with regional laws.
- Deliver, with the partners in the participating countries, a series of FDPs that can be ‘visited’ under well-defined conditions by VMs (trains) to answer questions and discover patterns in these ‘real world observation data’. (Note: this will not yet be a fully automated distributed learning environment.)
- Demonstrate the value of this approach to WHO and other national and international authorities and initiatives (such as GLOPID-R), and seek certification and WHO approval for FAIR-compliant CRFs, FDPs and access protocols that do not violate personal privacy and/or national legislation.
- Advanced stage: Develop a FAIR conceptual model for viruses and viral outbreaks originally based on CRFs and the sort of data CDC/RIVM-type institutions typically collect.
- Offer the ‘Real World Data’ FDPs under agreed conditions to qualified research groups, institutions and private companies to use the data (by controlled querying, not downloading) to answer questions that may lead to the discovery of patterns in real world and established knowledge data, which in turn may lead to new prevention and intervention options.
- As a final step, the consortium will document and publish, in Open Access and under a CC-BY license, all specifications that allow this process to be repeated for new epidemics, which will undoubtedly confront society with similar challenges. The IN will then dissolve and hand over its assets to either standing organisations or a new, larger IN (currently in a preparatory phase) that takes a wider approach to real world data FAIRness, including registries of vaccines, drugs, side effects, etc.
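To make the first item in the list above more concrete, the following Python sketch shows what mapping CRF answers to a machine-readable RDF serialization (N-Triples) could look like. It is an illustration under stated assumptions only: the `https://example.org/crf/` namespace, the field names and the predicate names are invented placeholders, not the WHO CRF vocabulary or the mapping used by the IN partners.

```python
from typing import Dict, List

XSD = "http://www.w3.org/2001/XMLSchema#"
BASE = "https://example.org/crf/"  # hypothetical namespace, not a real vocabulary

# Hypothetical mapping from form field names to RDF predicates.
FIELD_TO_PREDICATE = {
    "age_years": BASE + "ageInYears",
    "fever": BASE + "hasFever",
    "outcome": BASE + "outcome",
}


def literal(value: object) -> str:
    """Serialize a Python value as an N-Triples literal."""
    # bool must be checked before int, because bool is a subclass of int.
    if isinstance(value, bool):
        return f'"{str(value).lower()}"^^<{XSD}boolean>'
    if isinstance(value, int):
        return f'"{value}"^^<{XSD}integer>'
    escaped = (
        str(value)
        .replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )
    return f'"{escaped}"'


def crf_to_ntriples(record_id: str, answers: Dict[str, object]) -> List[str]:
    """Turn one filled-in form into a list of N-Triples lines."""
    subject = f"<{BASE}record/{record_id}>"
    lines = []
    for field_name, value in answers.items():
        predicate = FIELD_TO_PREDICATE.get(field_name)
        if predicate is None:
            continue  # fields without a mapping are skipped in this sketch
        lines.append(f"{subject} <{predicate}> {literal(value)} .")
    return lines


# Invented example form; "free_text" has no mapping and is therefore dropped.
triples = crf_to_ntriples(
    "r1", {"age_years": 71, "fever": True, "free_text": "not mapped"}
)
```

A web form that runs a mapping of this kind when the user saves would produce data that are machine-readable ‘upon save’; a real implementation would use agreed ontologies and identifiers rather than the placeholder namespace used here.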
Expectation management statement
This IN will NOT include the actual research projects that may make use of the FAIR Data Point network created. Third parties (which may or may not include partners also participating in the VODAN IN) will be able to gain access to the data under the conditions set by the data custodians and/or WHO. FAIR data form a substrate for machine-assisted research and are not a solution or a goal in themselves. In addition, the IN itself will not be able to sustain or expand the service beyond its initial activity and funding cycle. It is therefore crucial that the partners develop a scalability and sustainability model for the services from ‘day one’.
More detailed information on all VODAN IN clusters including lists of cluster members and contact of cluster leads:
Link to manifesto in PDF
Note that due to the significant interest in VODAN IN, responses of the GO FAIR Office and Foundation might be slower than usual.