
Five areas in life sciences where data integrity is vital

Anca Ciobanu, Strategic Theme Lead at the Pistoia Alliance, discusses the need to ensure data integrity in life sciences. She explains how data integrity can reduce costs and increase patient safety across five key areas: reproducibility, Identification of Medicinal Products (IDMP), the microbiome, semantic enrichment, and artificial intelligence and machine learning.

Much of our progress in life sciences R&D hinges on data integrity. Bad data costs organisations time and resources – IBM estimates it accounts for $3.1 trillion in losses every year.1 True data integrity goes beyond quality alone; it requires that data is attributable, legible, contemporaneous, original, enduring and available across the lifecycle of a research project. By building data integrity into the early phases of R&D and maintaining it throughout the data lifecycle, life sciences organisations can better steer research – and save both time and money. Here are five key areas that stronger data integrity can transform.

The reproducibility crisis

More than 70 percent of researchers have tried and failed to reproduce their peers' experiments, and more than half have failed to reproduce their own.2 Unless we can replicate results with a high degree of reliability, research findings cannot be regarded as scientific knowledge. A lack of data integrity plays a major part in the reproducibility crisis. The clearest example lies in how most experimental methods are recorded and shared today: when organisations come to reproduce a study, they often cannot access the full details of a method, which might be written by hand, saved locally, or locked in an electronic lab notebook (ELN). This lack of transparency increases the risk of human error and makes it extremely difficult for a clinical research organisation (CRO) to re-establish and validate the method within the same parameters.