A massive research effort has been mobilized to address the COVID-19 pandemic. Timing is critical to this investigative work given the transmission speed and detrimental impact of the disease. Coordinating research efforts and building upon previous work are key factors for shrinking the therapeutic discovery window. To assist in that process, this article presents the data definition of an international COVID-19 registry (see here). Initially starting with sites in Italy and China, the registry has been adopted by the American Society of Anesthesia Research committee on Critical Care Medicine and deployed in multiple medical centers across the United States.
It is hoped others will leverage this information to minimize redundant work and facilitate the aggregation of data for analysis. Those interested in collaborating are asked to send their contact information to firstname.lastname@example.org. An article on techniques used to maximize registry data collection can be found in a separate blog post here.
As with all registries, the data definition is both over-inclusive and under-inclusive depending on the aims. Thus, a brief description of the international registry is in order to help appropriately frame the data definition and its potential utility. The registry had administrative aims related to case reporting and care planning, as well as empirical aims directed toward preliminary hypothesis testing of rescue strategies and guidance for subsequent clinical trials. The clinical response to nitric oxide gas (iNO) was of particular interest. Note also that the registry began in multiple Italian medical centers during a period of intensive time demands on medical staff, so data entry burden was a key concern.
Taken together, these aims constrain the scope of the data definition. Although the data collection domains are generally applicable to nearly all COVID-19 clinical studies, modifications are expected to accommodate the specific needs of each study. To be clear, the data definition presented here is not claimed to be broad enough for the bulk of COVID-19 clinical research projects.
The registry protocol included daily data collection for up to 28 days. The domains assessed included the following:
Data Definition Spreadsheet
A spreadsheet containing detailed information about the data definition can be downloaded below.
The spreadsheet is structured as one row per variable with the columns defined as:
Description of form
Checkbox (0 = not checked, 1 = checked)
Picklist or radio button (coded value and label, e.g., sex: 0 = female, 1 = male)
Text / memo
Code (i.e., unique variable name, fewer than 30 characters)
Data entry prompt
Other columns, as applicable
Minimum value (numeric fields only)
Maximum value (numeric fields only)
Length (text fields only)
Required for data entry
Numeric coded values
Separated by commas into two components
The numeric value
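To illustrate how a one-row-per-variable definition like this might be used, the following is a minimal Python sketch that checks a single entered value against one row of the data definition. It is not the registry's actual implementation. The class name `VariableDef`, the function names, the field-type strings ("checkbox", "picklist", "numeric", "text"), and the error messages are all hypothetical, and the parser assumes each coded-value entry is written as the numeric value, a comma, and then the label.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class VariableDef:
    """One row of the data definition (attribute names are hypothetical)."""
    code: str                        # unique variable name
    field_type: str                  # assumed: "checkbox", "picklist", "numeric", "text"
    prompt: str                      # data entry prompt
    minimum: Optional[float] = None  # numeric fields only
    maximum: Optional[float] = None  # numeric fields only
    length: Optional[int] = None     # text fields only
    required: bool = False
    coded_values: Dict[int, str] = field(default_factory=dict)


def parse_coded_values(entries: List[str]) -> Dict[int, str]:
    """Turn entries such as '0, female' into {0: 'female'}."""
    coded = {}
    for entry in entries:
        # Split on the first comma only, so labels may contain commas.
        value, label = entry.split(",", 1)
        coded[int(value.strip())] = label.strip()
    return coded


def validate(var: VariableDef, value) -> List[str]:
    """Return a list of problems with the entered value (empty if none)."""
    problems = []
    if value is None or value == "":
        if var.required:
            problems.append("value is required")
        return problems
    if var.field_type == "checkbox":
        if value not in (0, 1):
            problems.append("checkbox must be 0 or 1")
    elif var.field_type == "picklist":
        if value not in var.coded_values:
            problems.append("value is not a defined code")
    elif var.field_type == "numeric":
        if var.minimum is not None and value < var.minimum:
            problems.append("value is below the minimum")
        if var.maximum is not None and value > var.maximum:
            problems.append("value is above the maximum")
    elif var.field_type == "text":
        if var.length is not None and len(str(value)) > var.length:
            problems.append("text exceeds the allowed length")
    return problems


# Usage with the sex example from the column description above.
sex = VariableDef(
    code="sex",
    field_type="picklist",
    prompt="Sex",
    required=True,
    coded_values=parse_coded_values(["0, female", "1, male"]),
)
print(validate(sex, 1))  # prints []
print(validate(sex, 2))  # prints ['value is not a defined code']
```

Returning a list of problems rather than raising on the first one is a design choice that suits data entry, where it is usually more helpful to show every issue with a field at once.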
Below are screenshots of all the forms. Note that validation was turned off for these screenshots so that all the fields would appear. The ‘live’ forms contain skip logic to hide or show fields as appropriate based on data entry.