Data collection mistakes: a cost driver for poorly performing clinical sites

Posted by Jim Lane on Jul 16, 2018 2:16:19 PM

Clinical trial data needs to be correctly collected, reported, and recorded so that accurate conclusions can be reached about a treatment’s safety and efficacy. Everyone makes mistakes; it is human nature. However, overlooking a procedure, forgetting to centrifuge a blood sample on time, or missing a communication between study management and clinic departments can mean that patient data must be discounted.

Clinical trial information that is gathered incorrectly, or not gathered at all, necessitates the use of data clarification forms. A small number of data errors at one site may not significantly affect data integrity, but the cumulative effect of data problems across many sites can have an extensive financial impact. It is estimated that resolving each data query costs $130 and that a typical 70-site, 300-patient study generates more than 4,200 queries. That is a cost of over $546,000 for a typical trial.
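To make the arithmetic behind that figure explicit, here is a minimal sketch in Python. The two input numbers are the estimates quoted above; the function and constant names are our own, chosen purely for illustration.

```python
# Estimates quoted in the paragraph above; the names are illustrative only.
COST_PER_QUERY_USD = 130
QUERIES_PER_TRIAL = 4_200  # the text says "more than 4,200", so treat this as a lower bound
SITES = 70
PATIENTS = 300


def query_resolution_cost(queries: int, cost_per_query: int = COST_PER_QUERY_USD) -> int:
    """Total cost of resolving a given number of data queries."""
    return queries * cost_per_query


total_cost = query_resolution_cost(QUERIES_PER_TRIAL)
queries_per_site = QUERIES_PER_TRIAL / SITES
queries_per_patient = QUERIES_PER_TRIAL / PATIENTS

print(f"Total query cost: ${total_cost:,}")
print(f"Queries per site: {queries_per_site:.0f}")
print(f"Queries per patient: {queries_per_patient:.0f}")
```

On these assumptions the trial total comes to $546,000, which works out to roughly 60 queries per site and 14 per patient. Because the query count is a lower bound, the real cost would be at least this much.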


Examples and effects of bad data

Data collection mistakes, such as missing data points or multiple documents showing different data for the same subject, can be generated by site, sponsor, and CRO personnel for a variety of reasons. Some examples include:


Site personnel

  • Forgotten, misremembered, or assumed data
  • Misunderstanding of questions
  • Errors in transcription from the source document to the case report form (CRF/eCRF)
  • Incorrectly performed procedures
  • Mismeasurement or miscalculation of a value


Sponsor and CRO personnel

  • Errors in data processing, such as:
      - Misinterpretation of handwritten values
      - Errors in keying in data
      - Miscalculation of a derived item
      - Use of leading language to suggest 'better' values
  • Errors in the database (e.g. data stored in the wrong place)


"We must not tolerate subjecting patients to months of visits and procedures only to discard their data due to protocol deviations."


Data queries slow down clinical trials

An analysis of a Phase IIIb trial of over 2,700 subjects showed that the time between a query being submitted and resolved ranged from one day to 22.8 weeks (about 160 days), with a mean of 51.9 days. Although the delay did not ultimately affect the trial results, timelines were extended and the process was resource-intensive for the sponsor, CRO, and site staff. Where data queries are not resolved promptly, data points may be lost or, even worse, the subject’s records may be discarded.


What not to do

When designing the study protocol, it’s tempting for sponsors and CROs to ask for as much data collection as possible. But more data can magnify the risk of human error and result in more queries, so it's important to collect only data that is directly related to the outcome of the trial.


It might seem logical to increase site monitoring and source data verification (SDV). Although SDV can detect transcription mistakes, the process itself is subject to human error and does not prevent data problems within the source document. Indeed, evidence suggests that 100% SDV does not significantly improve data quality.


Improving data collection

What can be done to mitigate the risk of human error and optimize clinical data? Nimita Limaye, CEO of Nymro Clinical Consultancy Services, sums up the need to address potential issues as early as possible: "It is also necessary to incorporate quality at the grassroots level, when defining the edit checks, when building the database, while designing the CRF and the CRF completion guidelines, while training the investigator, when framing the queries. Quality issues at this level can result in a cascade of quality issues."
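As a rough illustration of what an edit check might look like, the sketch below validates a single visit record and returns the queries it would raise. Everything in it is an assumption made for the example: the field names, the blood pressure limits, and the function itself are hypothetical and do not come from any particular EDC system or study protocol.

```python
from __future__ import annotations

from datetime import date

# Hypothetical required fields for one visit record (illustrative only).
REQUIRED_FIELDS = ("subject_id", "consent_date", "visit_date", "systolic_bp", "diastolic_bp")

# Illustrative plausibility limits in mmHg; a real study would define its own.
SYSTOLIC_RANGE = (60, 260)


def run_edit_checks(record: dict) -> list[str]:
    """Return a list of query messages for one record; an empty list means no issues."""
    queries = []

    # Completeness check: every required field must have a value.
    for field in REQUIRED_FIELDS:
        if record.get(field) is None:
            queries.append(f"{field}: value is missing")

    systolic = record.get("systolic_bp")
    diastolic = record.get("diastolic_bp")
    consent = record.get("consent_date")
    visit = record.get("visit_date")

    # Range check: catches keying errors such as 12 entered instead of 120.
    low, high = SYSTOLIC_RANGE
    if systolic is not None and not low <= systolic <= high:
        queries.append(f"systolic_bp: {systolic} is outside the range {low}-{high}")

    # Cross-field check: diastolic pressure should be lower than systolic.
    if systolic is not None and diastolic is not None and diastolic >= systolic:
        queries.append("diastolic_bp: must be lower than systolic_bp")

    # Date logic check: a visit cannot precede informed consent.
    if consent is not None and visit is not None and visit < consent:
        queries.append("visit_date: precedes consent_date")

    return queries


clean_record = {
    "subject_id": "S-001",
    "consent_date": date(2018, 1, 5),
    "visit_date": date(2018, 1, 12),
    "systolic_bp": 120,
    "diastolic_bp": 80,
}
keying_error = dict(clean_record, systolic_bp=12)

print(run_edit_checks(clean_record))
print(run_edit_checks(keying_error))
```

Running checks like these at the point of entry means a mistake surfaces while the site can still correct it, rather than weeks later as a formal query.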


Electronic data capture and mobile solutions can help both site staff and patients enter data correctly, and devices that capture biometric data automatically are becoming more widespread. During and after data collection, risk-based monitoring and data centralization improve data reliability.


“It is not enough to do your best; you must know what to do, and then do your best.”

W. Edwards Deming


Better data will save time and money

Online training and support tools have been shown to reduce queries by up to 50%. In the typical trial described above, a 20-50% reduction in queries would yield cost savings of $109k-$273k (source: ICON).
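The savings range can be reproduced with a short calculation. This is a sketch under the same assumptions as the typical-trial estimate ($130 per query, 4,200 queries); the function name is ours, and the reduction percentages are simply the two ends of the range quoted above.

```python
# Baseline from the typical-trial estimate: $130 per query x 4,200 queries.
BASELINE_QUERY_COST_USD = 130 * 4_200


def savings_from_reduction(baseline_cost: int, reduction_percent: int) -> float:
    """Cost saved when the number of queries falls by the given percentage."""
    if not 0 <= reduction_percent <= 100:
        raise ValueError("reduction_percent must be between 0 and 100")
    return baseline_cost * reduction_percent / 100


for percent in (20, 50):
    saved = savings_from_reduction(BASELINE_QUERY_COST_USD, percent)
    print(f"{percent}% fewer queries saves about ${saved:,.0f}")
```

A 20% reduction gives $109,200 and a 50% reduction gives $273,000, matching the rounded $109k-$273k range. The calculation assumes the cost per query stays constant as the query count falls.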


Albert Einstein is quoted as saying “Anyone who has never made a mistake has never tried anything new.” At Longboat, we help you to minimize mistakes through our study-specific training for site staff and study teams, as well as our range of support tools to assist sites and patients in implementing your trial. Given that the quality of data collected during a trial is a predictive measure of a study’s success, taking proactive measures to reduce bad data collection will set well-performing studies apart.


Topics: Clinical Trial Results, Clinical Research Conduct, monitoring, Clinical Trial Compliance, clinical site support
