Research misconduct – and other integrity issues – are poised to become a major problem for research publications, eventually forcing significant changes in the way laboratory data is summarised, reviewed, and retained. Greater transparency and data lifecycle management will no longer be merely a best practice, but a mandate for researchers who want to continue to publish original research.
Allow me a personal moment. By background, I am a scientist whose career has been largely spent in quality control laboratories and lab informatics teams, so I have experienced the lab from both scientific and data management views. For the past several years, I have been deeply involved in data integrity (misconduct) identification and remediation for pharmaceutical labs and manufacturing, so I have a view of this topic from a GMP (good manufacturing practice) perspective. As I read articles about misconduct in research, I see parallels to the numerous data integrity citations given to pharmaceutical and clinical research firms by regulators around the world. These parallels form the basis of my opinion here.
The issue is far larger than believed
There are several reasons to believe this issue will be big: (1) the issue of misconduct was more prevalent in GMP labs and manufacturing than anyone would have believed; (2) there is already indirect evidence that misconduct is widespread in research labs; (3) systematic controls to prevent misconduct are less rigorous in research than in GMP-regulated QC labs; (4) research labs are less likely to be inspected than their GMP counterparts; and (5) research labs have similar motives to GMP-regulated QC labs and manufacturers.
Data integrity among GMP manufacturers
In the past three years, about 130 US FDA Warning Letters have been given to pharmaceutical manufacturers or clinical research organisations for data integrity-related infractions. These are serious infractions that cost companies millions (or even billions) of dollars, and typically two years or more to remediate. So, what caused the spike in these infractions? Regulators learned to do data forensic auditing. Once they understood how to look for data discrepancies, and the ways that data could be manipulated to create a desired outcome, they were able to increasingly find bad data practices that were missed in the past.
They stopped looking at procedures in a conference room and started looking at data in the company’s systems. And they found data manipulation: re-running samples to get a desired (passing) test result, keeping ‘official’ and ‘unofficial’ batch records for manufactured pharmaceutical products, deleting unfavourable data or storing it in other places to hide it from inspectors – these are but a few examples.
The shift in focus to inspecting original data directly in the electronic system exposed an industry-wide issue that will require several more years of efforts to improve.
Indirect evidence in research
Gupta [1] lists two relevant statistics in discussing the matter: nearly 40 per cent of researchers were aware of misconduct and did not report it, and 17 per cent of surveyed authors of clinical drug trials reported that they personally knew of fabrication in research occurring over the previous 10 years.
Systematic controls
By ‘systematic controls’, we mean processes or procedures that are routinely used to assure that reported data values are complete and accurate. GMP regulations require companies to use equipment that is calibrated periodically, use test reference standards, formally train personnel, retain all testing data (even if not used to report a result), and have all testing data reviewed by another scientist before it is used to make any product decisions.
Computer systems must not allow people to share accounts, and must limit key activities (enhanced access, such as administrator) to a limited set of people who have no conflict of interest in the work they perform.
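The two account rules above lend themselves to an automated check. The following is a minimal sketch in Python, not a description of any real system: the function name `audit_access`, the record fields (`name`, `role`, `performs_testing`) and the login log format are all assumptions made for illustration.

```python
from collections import defaultdict


def audit_access(accounts, logins):
    """Flag shared accounts and administrators with a conflict of interest.

    accounts: list of dicts with hypothetical keys 'name', 'role',
              'performs_testing'.
    logins:   list of (account_name, person) pairs, i.e. which person
              actually used which account (hypothetical log format).
    """
    findings = []

    # Rule 1: an account used by more than one person is a shared account.
    people_per_account = defaultdict(set)
    for account_name, person in logins:
        people_per_account[account_name].add(person)
    for account_name, people in sorted(people_per_account.items()):
        if len(people) > 1:
            findings.append(
                f"shared account: {account_name} used by {sorted(people)}"
            )

    # Rule 2: administrators should not also generate the data they control.
    for account in accounts:
        if account["role"] == "admin" and account["performs_testing"]:
            findings.append(
                f"conflict of interest: {account['name']} is an "
                "administrator and also generates test data"
            )
    return findings


if __name__ == "__main__":
    example_accounts = [
        {"name": "alice", "role": "admin", "performs_testing": True},
        {"name": "bob", "role": "analyst", "performs_testing": True},
    ]
    example_logins = [("lab_shared", "alice"), ("lab_shared", "bob")]
    for finding in audit_access(example_accounts, example_logins):
        print(finding)
```

A real system would draw this information from its own user directory and audit trail; the point of the sketch is only that both rules are simple enough to check routinely rather than by occasional manual review.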
Manufacturers typically use standard reports to look for product issues or potential data integrity issues. They are required to record and track any unplanned events, and to determine the root cause of each event, so that its recurrence is unlikely.
In contrast, research labs will calibrate instruments, provide basic science training for personnel and use reference standards and controls for tests. But they have no requirements for individual user accounts. Sometimes accounts are shared to save licence fees, and many labs allow everyone in the lab to be a system administrator and make system changes as needed. There is no requirement to investigate unplanned events.
While original data exists in computer systems, that is usually not the data reviewed by another scientist; rather, the summarised test data is reviewed before the test result is accepted for use. Data is retained for future use, but it is the summary data, rather than the complete set of original data values collected at the time of testing. Articles submitted for publication are peer reviewed, but the host lab provides only the summary data to a peer reviewer. And there is no requirement to save all data, including the results that were not summarised for publication.
The lack of original data review, and the lack of a requirement to retain all data – even data not included in a summary – collectively permit a scientist to test and retest until a desired result is obtained, then to ignore or delete all other data values and report the desired one.
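The gap described above is exactly what a review of original data is meant to close. As a rough illustration, the Python sketch below compares every run recorded in a system against the runs that were actually reported, and lists samples that have runs nobody reported. The function name `find_unreported_runs` and the record layout (`sample_id`, `run_id`, `value`) are hypothetical and do not correspond to any particular instrument or LIMS.

```python
from collections import defaultdict


def find_unreported_runs(raw_runs, reported):
    """Return {sample_id: [run_id, ...]} for runs that exist in the raw
    data but were not the run reported for that sample.

    raw_runs: list of dicts with hypothetical keys 'sample_id', 'run_id',
              'value', one entry per run found in the system.
    reported: dict mapping sample_id -> run_id of the reported result.
    """
    # Group every recorded run by the sample it belongs to.
    runs_by_sample = defaultdict(list)
    for run in raw_runs:
        runs_by_sample[run["sample_id"]].append(run)

    # Any run other than the reported one needs an explanation.
    flagged = {}
    for sample_id, runs in runs_by_sample.items():
        unreported = [
            run["run_id"]
            for run in runs
            if run["run_id"] != reported.get(sample_id)
        ]
        if unreported:
            flagged[sample_id] = unreported
    return flagged


if __name__ == "__main__":
    example_runs = [
        {"sample_id": "S1", "run_id": 1, "value": 88.0},
        {"sample_id": "S1", "run_id": 2, "value": 99.5},
        {"sample_id": "S2", "run_id": 1, "value": 100.1},
    ]
    example_reported = {"S1": 2, "S2": 1}
    print(find_unreported_runs(example_runs, example_reported))
```

A flagged run is not proof of misconduct, since retests can be legitimate; it is a prompt to ask why the earlier result was set aside. The check only works if all original data is retained, which is the requirement research labs currently lack.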
Inspection
GMP manufacturers can be inspected by regulators at any time, with no notice (some notice is required for foreign inspections). Once in the facility, inspectors can look at anything and interview anyone involved in manufacturing. Forensic data inspections can be conducted, and regulatory bodies (such as FDA, MHRA, EMA) have experts they bring to these inspections.
The FDA attempts to inspect firms every two years, although requests to market a new drug nearly always result in an inspection prior to an approval to market. It is not uncommon for large pharmaceutical companies to be inspected a dozen or more times by different global regulators within a year.
In contrast, research labs might be inspected for safety by university personnel, by a local committee, or perhaps by a certifying authority (if certified at all), but no inspection authority looks at their operation in detail, unless a ‘for cause’ inspection is initiated by a local committee. This lack of direct, independent, detailed oversight provides an environment where misconduct can continue for extended periods without discovery.
Similar motives
Research labs and QC labs that test pharmaceuticals have similar motives, which means that they have similar reasons to manipulate results in their own favour.
QC labs perform mostly routine tests, using written procedures and analytical equipment often configured to efficiently do one job.
By contrast, research labs reconfigure their lab equipment for each experiment, and seldom use written procedures. So how could they be the same? Motive is the answer, and the problem. People will manipulate data when they are rewarded (or, not punished) for doing it.
A lab scientist can be pushed to misconduct when a deadline looms and the ‘right’ result is needed, or someone’s pay/position will suffer. If the ‘wrong’ result is reported, the material must be discarded (manufacturing), or the experimental thesis abandoned and/or revised (research). For both manufacturer and researcher, money and time are lost. The manufacturer must have new batches of medicines to sell for profit, while the researcher needs data to stay ahead – the ‘publish or perish’ challenge of research. Since both manufacturer and researcher share similar motivations to achieve desirable data on a schedule, data integrity issues in one (manufacturing) make it likely that similar issues exist to a similar degree in the other (research).
Driving the engine of change
Given the indirect evidence of widespread research misconduct, the lack of data forensic inspections and independent oversight of research labs, and the lack of requirements for strict security and access controls on data management systems, research labs appear to be more at risk than GMP manufacturers for misconduct, manipulation and hiding of data.
So what will cause the issue to be exposed, and how will the improvements in data integrity be pushed into research? Will it come from governments, NIH, publishers or the universities themselves? Will it be voluntary, tied to standards or external accreditations, or codified as law? Misconduct also raises questions about all those studies that are contradictory: coffee is bad for you, it is good for you, etc. Is the problem a lack of statistical power in the data, or is it data selection in an attempt to support an unsupportable hypothesis?
Reference
[1] Gupta, Ashwaria. Perspectives in Clinical Research. 2013 Apr-Jun; 4(2): 144–147. www.ncbi.nlm.nih.gov/pmc/articles/PMC3700330/