Data Deja Vu: Haven't I entered this before?

John Putzke
Aug 18, 2022

Updated: May 16, 2024

Summary

The powerful methodological advantages of within-subject designs and the inherently repeating nature of many clinical events (e.g., tumors) underscore the critical need for effective setup and management of repeating forms in research projects. Several user-interface and design-related factors complicate data entry. This article examines the requirements and related functionality associated with solving these data entry challenges. Examples are provided using a tumor registry.

Riding The Wave

A wave of new technologies, devices, treatments, and information systems is creating an exciting pipeline of clinical research projects. Automated data collection (e.g., devices) and extraction (e.g., electronic medical record) are becoming increasingly common. However, many research projects still include a manual data entry component due to the high costs of incomplete, missing or inaccurate data.

When there is a manual component, the goal is to minimize the data entry burden and ensure accurate, timely and complete data. Unfortunately, numerous user-interface (UI) and design-related factors complicate data entry of repeating events, particularly when designing systems applicable across multiple diseases. This article highlights some of the main design challenges. Less emphasis is placed on UI challenges since exceptional UI design is immediately obvious when seen rather than described. Other data entry issues are discussed elsewhere (e.g., Form Design Best Practices; Utilizing Data In Clinical Decisions; Patient Data Entry Incentives).

Similar Puzzle Pieces

Tumors are one example from a broader collection of clinical research event-types that share many data entry UI and design-related challenges. Other examples include diagnoses, adverse events, seizures, and irritable bowel flares. There are also interventions associated with these events that have similar repeating data entry factors (e.g., medications, surgery). There is considerable variation across this collection, but the main overlapping requirements are:

When and how many?
- Unpredictable event timing and number
Over what time period?
- A time range when the event is applicable
What's its name?
- Variable values are used to identify events

The various event-types are distinguished by requirement specifics and workflow assumptions which, in turn, can be used to simplify the UI and validate data entry. This is done by developing core infrastructure to address overlapping requirements, and then adding configuration options to accommodate specific event-type needs. To give the reader a high-level sense of what’s involved and what can be done, summary notes on some of the main requirements, configuration options and related functionality are provided below. The more generic term “Event” is used, as opposed to a specific event-type (e.g., tumor), to emphasize the applicability across disease populations and research projects.

Before describing the functional requirements, a brief detour is warranted to mention one important UI-related issue and the general importance of engaging participants in the research process.

Seeing Is Entering

Counterintuitively, one of the most important components of manual data entry has nothing to do with fingers striking the keyboard, but instead with the eyes. That is, being able to visualize clinical events over time is a key component of ensuring accurate data entry. A visual display provides a quick and easy way to determine whether all relevant events have been captured and over what time period. There are a number of different ways to visualize data; various examples have been discussed in other articles (see here). Moreover, visualizing the data provides a powerful incentive to enter data in a timely fashion (see here). In the example image above, notice it is easy to appreciate the timing of various clinical events. In this case, bolded endpoints indicate events with an effective range, as opposed to an onset date only. The user can zoom in and out of the visual display to set the time range.

Engaging Patients

Patient-reported outcomes (PROs) are a common part of clinical studies and registries. One part of solving the data entry puzzle is to allow patients to enter data themselves. However, if the ONLY thing patients can do is enter data, then incomplete data and high dropout rates surely follow, particularly for long-term follow-up. Engaging patients in the process involves a broad range of functionality that has been described in other articles (e.g., “Patient Disease Surveys: Turning Giants Into Windmills”) and thus is only briefly listed here.

Simple, intuitive data entry.
Incentives for completing forms (e.g., earn points, exchange points for gift cards).
Secure messaging with staff to address questions, comments, etc.
Personalized content (e.g., multimedia disease and study information based on patient's own data).

The Requirements

The main system requirements for capture of repeating clinical events are as follows.

Create Multiple Events

The most obvious requirement is the ability to create any number of events, each with an associated start date. When applicable, a minimum / maximum event number and / or a specified date range within which events are applicable are helpful configuration options.
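
As a rough illustration of these configuration options, the sketch below checks a new event against an optional maximum count and an optional applicable date range. The class and field names (`EventTypeConfig`, `EventLog`, `min_events`, and so on) are hypothetical and are not taken from any particular system.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class EventTypeConfig:
    """Hypothetical per-event-type settings; all names are illustrative."""
    name: str
    min_events: int = 0
    max_events: Optional[int] = None        # None means no upper limit
    earliest_start: Optional[date] = None   # applicable date range, if any
    latest_start: Optional[date] = None


@dataclass
class EventLog:
    """Start dates of the events entered so far for one participant."""
    config: EventTypeConfig
    start_dates: List[date] = field(default_factory=list)

    def add_event(self, start: date) -> None:
        cfg = self.config
        if cfg.max_events is not None and len(self.start_dates) >= cfg.max_events:
            raise ValueError(f"at most {cfg.max_events} {cfg.name} events allowed")
        if cfg.earliest_start is not None and start < cfg.earliest_start:
            raise ValueError("start date is before the applicable range")
        if cfg.latest_start is not None and start > cfg.latest_start:
            raise ValueError("start date is after the applicable range")
        self.start_dates.append(start)

    def meets_minimum(self) -> bool:
        """True once the configured minimum number of events has been entered."""
        return len(self.start_dates) >= self.config.min_events
```

Leaving `max_events` and the date bounds as `None` by default keeps the unconstrained case (any number of events, any start date) as the baseline.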

Prevent Duplicate Events

Since multiple events can be created, a mechanism is needed to ensure all entered events are unique. The related concept in database design is the "Primary Key" variable, which forces each row in a table to have a unique variable value. Various setup options could be used to ensure uniqueness, but typically users select which variable(s) determine whether an event is unique (e.g., seizure type).
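
One way to picture the uniqueness check is a small in-memory registry keyed on the user-selected variables. This is only a sketch: `EventRegistry`, `unique_on`, and the case-insensitive string comparison are assumptions made for illustration, and a real system would more likely enforce this with a database constraint.

```python
from typing import Dict, Sequence, Tuple


class DuplicateEventError(ValueError):
    """Raised when a new event matches an existing one on the unique variables."""


class EventRegistry:
    def __init__(self, unique_on: Sequence[str]) -> None:
        # Names of the variables the user chose to define uniqueness.
        self.unique_on = tuple(unique_on)
        self._events: Dict[Tuple, dict] = {}

    def _key(self, event: dict) -> Tuple:
        # Assumption: text values are compared ignoring case and outer whitespace.
        values = (event[name] for name in self.unique_on)
        return tuple(v.strip().lower() if isinstance(v, str) else v for v in values)

    def add(self, event: dict) -> None:
        key = self._key(event)
        if key in self._events:
            raise DuplicateEventError(f"event already exists for {self.unique_on} = {key}")
        self._events[key] = event

    def __len__(self) -> int:
        return len(self._events)
```

For example, with `unique_on=["seizure_type"]`, a second entry of the same seizure type is rejected regardless of its other values.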

Identify Correct Event

Since multiple events may exist, the UI requires a straightforward mechanism to accurately identify and distinguish between events. Said in a manner that highlights the key issue, data entry staff face a perpetual decision: work with an existing event or create a new one. Although the identification mechanism often overlaps with the "Primary Key", it is a distinct concept. For example, tumor ID may be the primary key, but location and onset date are used to identify the correct tumor.
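
The distinction between the key and the identifying attributes can be sketched as follows. The `Tumor` class, its `label` method, and the `candidates` helper are hypothetical; the idea is simply that staff see location and onset date, while the tumor ID guarantees uniqueness behind the scenes.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass(frozen=True)
class Tumor:
    tumor_id: str    # primary key: guarantees uniqueness
    location: str    # identifying attribute shown to staff
    onset: date      # identifying attribute shown to staff

    def label(self) -> str:
        """Human-readable label built from the identifying attributes."""
        return f"{self.location} (onset {self.onset.isoformat()})"


def candidates(existing: List[Tumor], location: str) -> List[Tumor]:
    """Existing tumors at the same location, oldest first.

    Intended to be shown before a new event is created, to support the
    "existing or new?" decision.
    """
    wanted = location.strip().lower()
    matches = [t for t in existing if t.location.strip().lower() == wanted]
    return sorted(matches, key=lambda t: t.onset)
```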

Workflow Display Characteristics

To the extent data can be used in day-to-day operations (e.g., clinical decisions), staff are much more likely to enter data and keep it up to date (e.g., see blog on clinical reports here). One way to integrate the use of event data into operations is through workflow display characteristics. Distinct from primary key and identifying attributes, the idea is to create display sets of variable values that benefit a specific workflow endpoint. Thus, the requirement is to allow users to define variable sets and layout options in a manner that best fits the task within a clinical setting. For example, when tasked with pulling tumor samples, tumor ID, body location, and date of onset are less helpful than a list of freezer storage locations.
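
A display set can be thought of as a named, ordered list of variables that is applied to the same underlying events. In the sketch below, the workflow names, variable names, and the `rows_for` helper are all invented for illustration.

```python
from typing import Dict, List

# Hypothetical display sets: workflow name -> variables to show, in order.
DISPLAY_SETS: Dict[str, List[str]] = {
    "data_entry": ["tumor_id", "body_location", "onset_date"],
    "sample_pull": ["tumor_id", "freezer_location"],
}


def rows_for(workflow: str, events: List[Dict[str, str]]) -> List[List[str]]:
    """Project each event onto the variables configured for the workflow."""
    columns = DISPLAY_SETS[workflow]
    # Missing values are shown as empty strings rather than raising an error.
    return [[event.get(col, "") for col in columns] for event in events]
```

The same event records feed both views; only the configured column list changes with the task.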

Global vs Targeted Data

When entering data, the UI should make it intuitive and easy to distinguish between data that applies across multiple events and data associated with a single event. In other words, some categories of data are global and others specific with regard to their relationship to repeating events, and the UI should guide users to appropriate data entry. For example, a chemotherapy agent often targets multiple tumors, whereas a surgical procedure typically targets a specific tumor.
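
A minimal way to model the global versus targeted distinction is to let each intervention carry an optional set of event IDs. The `Intervention` class and the convention that an empty set means "global" are assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Intervention:
    name: str
    # Assumed convention: an empty set means the intervention is global
    # (applies to every event); otherwise it lists the targeted event IDs.
    target_event_ids: FrozenSet[str] = frozenset()

    @property
    def is_global(self) -> bool:
        return not self.target_event_ids

    def applies_to(self, event_id: str) -> bool:
        return self.is_global or event_id in self.target_event_ids


def interventions_for(event_id: str, interventions: List[Intervention]) -> List[str]:
    """Names of all interventions relevant to one event, in input order."""
    return [i.name for i in interventions if i.applies_to(event_id)]
```

With a global chemotherapy record and a surgery targeting tumor `T1`, the view for `T1` lists both, while the view for any other tumor lists only the chemotherapy.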

Time Range

On the surface, an event’s time range is a straightforward concept: just set the begin and end time. Beneath this layer, however, are numerous differences in event-type requirements and assumptions about workflow. Moreover, there is no guarantee data entry happens in a serial, time-based manner. Even with just these two factors, managing all the issues surrounding an event’s time range and validating data entry quickly becomes complex. To get a sense of some of the issues, consider the following:

There is a difference between the end of an event and ending data capture about the event.
Events may not have an end date (e.g., epilepsy seizure type).
Begin / end date may or may not be associated with the study / clinic visit date (e.g., pathological diagnosis date vs. medication start date).
Is the date entered inclusive or exclusive?
Future dates are possible (e.g., a planned dose change 2 weeks after the clinic visit).
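
Several of these points (open-ended events, inclusive versus exclusive end dates, future dates) can be made explicit in a small range type. The `EventRange` class below is a sketch under those assumptions, not a description of an existing implementation.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class EventRange:
    begin: date
    end: Optional[date] = None    # None: the event has no end date
    end_inclusive: bool = True    # does the entered end date itself count?

    def __post_init__(self) -> None:
        if self.end is not None and self.end < self.begin:
            raise ValueError("end date precedes begin date")

    def contains(self, day: date) -> bool:
        """True if `day` falls within the range; future dates are allowed."""
        if day < self.begin:
            return False
        if self.end is None:
            return True
        return day <= self.end if self.end_inclusive else day < self.end
```

Storing the inclusive or exclusive choice alongside the dates avoids relying on each form or report to remember which convention applies.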

The remaining requirements and functionality discussed below are related to appropriate configuration of the event time range.

Open / Start

In addition to a configurable name (e.g., open, start, begin), users need to accommodate two event scenarios.

Linked To Visit Date – The event begin date is not entered directly; instead, it is linked to the clinic / study visit date.
Independent Of Visit Date – The event begin date is entered directly and is not tied to the clinic / study visit date (e.g., medication begin date).
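
The two start scenarios amount to a per-event-type setting that decides where the begin date comes from. A minimal sketch, with hypothetical names `StartMode` and `resolve_start`:

```python
from datetime import date
from enum import Enum
from typing import Optional


class StartMode(Enum):
    LINKED_TO_VISIT = "linked"     # begin date is taken from the visit
    INDEPENDENT = "independent"    # begin date is entered by the user


def resolve_start(mode: StartMode, visit_date: date,
                  entered: Optional[date] = None) -> date:
    """Return the event begin date according to the configured mode."""
    if mode is StartMode.LINKED_TO_VISIT:
        if entered is not None:
            raise ValueError("begin date comes from the visit; none should be entered")
        return visit_date
    if entered is None:
        raise ValueError("a begin date must be entered for this event type")
    return entered
```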

Close / End

In addition to a configurable name (e.g., close, end), users need to accommodate three event scenarios.

Data Capture End – The date after which data will no longer be entered about the event.
Event End – Date the event ends (e.g., medication no longer used).
No End – Event has no end date (e.g., epilepsy seizure type).
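
Because the end of an event and the end of data capture are separate facts, one option is to store them as two independent optional dates, with "no end" represented by leaving both empty. The `EventClose` class below is an illustrative assumption, and it treats both dates as inclusive.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class EventClose:
    event_end: Optional[date] = None     # when the event itself ended
    capture_end: Optional[date] = None   # when data entry about it stopped

    def is_ongoing(self, day: date) -> bool:
        """True if the event has not ended as of `day` (end date inclusive)."""
        return self.event_end is None or day <= self.event_end

    def accepts_entry(self, day: date) -> bool:
        """True if data may still be entered for `day` (end date inclusive)."""
        return self.capture_end is None or day <= self.capture_end
```

A medication that was stopped has an `event_end`, an event that is no longer tracked has a `capture_end`, and an epilepsy seizure type would typically leave both as `None`.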

Manage Change vs. Repeat Data Entry

Event-identifying variable values are ALWAYS entered only once. Configuration options accommodate three different data entry scenarios from one visit to the next. Event-related forms are:

Blank at each clinic / study visit and entered anew (e.g., an imaging event such as an MRI).
Pre-populated using the previous visit, but considered INDEPENDENT from the previous visit for data validation. In such cases, a change to historical data does not affect subsequent visit data (e.g., seizure severity).
Reconciled at each clinic / study visit (i.e., only changes are entered), and are DEPENDENT for data validation on the previous visit (e.g., if historical data is edited, current data must be reconciled again). For example, medication use over a year-long period whereby data entry staff reconcile the list at each visit for accuracy. In such cases, a change in historical data would require re-validating medication use at all subsequent clinic visits.
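
The three scenarios differ in two ways: what the form starts with at a new visit, and whether an edit to an earlier visit invalidates later ones. The sketch below captures both; `EntryMode`, `initial_form`, and `visits_to_revalidate` are hypothetical names, and visits are identified by their position in chronological order.

```python
from enum import Enum
from typing import Dict, List, Optional


class EntryMode(Enum):
    BLANK = "blank"                 # entered anew at each visit
    PREPOPULATED = "prepopulated"   # copied forward, validated independently
    RECONCILED = "reconciled"       # copied forward, validation depends on history


def initial_form(mode: EntryMode,
                 previous: Optional[Dict[str, object]]) -> Dict[str, object]:
    """Values the form starts with at a new visit."""
    if mode is EntryMode.BLANK or previous is None:
        return {}
    return dict(previous)  # copy, so edits do not alter the earlier visit


def visits_to_revalidate(mode: EntryMode, edited_visit: int,
                         visit_count: int) -> List[int]:
    """Indices of later visits needing review after `edited_visit` is edited."""
    if mode is not EntryMode.RECONCILED:
        return []  # blank and pre-populated visits stand on their own
    return list(range(edited_visit + 1, visit_count))
```

For instance, editing the second of four medication visits (index 1) in `RECONCILED` mode flags visits 2 and 3 for re-validation, whereas the same edit to seizure severity in `PREPOPULATED` mode flags none.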

Data Reconciliation

Medication use and seizure severity ratings are two examples whereby data may be carried forward across visits and ONLY variable value changes are entered. In such cases, a mechanism is required to indicate the data have been reviewed at each visit. This data reconciliation process can be accomplished through a variety of UI methods (e.g., checkbox, signature) depending on the circumstances.
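
One possible shape for carried-forward data is to store only the changes at each visit, plus a marker recording that someone reviewed the result. In this sketch, `VisitRecord`, `reviewed_by`, and the helper functions are assumptions; the review marker stands in for whatever UI method (checkbox, signature) is used.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Optional


@dataclass
class VisitRecord:
    visit_date: date
    # Only the variable values that changed at this visit.
    changes: Dict[str, object] = field(default_factory=dict)
    # Set when staff confirm the carried-forward data; None means not yet reviewed.
    reviewed_by: Optional[str] = None

    @property
    def is_reconciled(self) -> bool:
        return self.reviewed_by is not None


def current_values(visits: List[VisitRecord]) -> Dict[str, object]:
    """Apply changes in visit-date order to get the values now in effect."""
    values: Dict[str, object] = {}
    for visit in sorted(visits, key=lambda v: v.visit_date):
        values.update(visit.changes)
    return values


def unreconciled(visits: List[VisitRecord]) -> List[date]:
    """Dates of visits whose carried-forward data has not been reviewed."""
    return [v.visit_date for v in visits if not v.is_reconciled]
```

A visit with no changes still needs its own review marker, which is what distinguishes "nothing changed" from "nobody checked".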

The above are some of the main requirements for repeating form data entry; there are others related to specific event types (e.g., medication use). Look for more articles on this subject coming soon.