That sunlight is the best disinfectant has become a truism in recent years, in science as much as in professional life in general. As concern has risen about the so-called reproducibility crisis in biomedicine, publication of raw data has been touted as the obvious solution.
Such an “open science” approach makes a lot of sense. At the very least, it makes selective reporting of results easier to detect, ending a practice that has, in some cases, distorted science’s evidence base and impeded progress. Other harmful behaviours, such as hypothesising after results are known or even flat-out research fraud, may also be uncovered earlier.
Open science is not, however, the panacea that some imagine, and it comes with its own challenges.
Biomedical publications encompass a wide spectrum of research activities. At one end, studies using cell lines and animals present few ethical impediments to providing the raw data underlying figures, or the original images of, for example, western blots. At the other end, however, are epidemiological studies and clinical trials. These draw on data from healthy individuals and patients, who expect medical confidentiality to be respected.
In a recent letter in The Lancet, a group of European researchers observed that sharing even apparently innocuous biological data, such as serum values, requires “extreme prudence”. In combination with other patient data that may also be necessary for replication of the original analysis, such as age, gender and geographic location, such data may permit the identification of individuals, violating data protection rules and ethical codes. “The larger the number of variables, the greater the risk with modern technology,” the authors note.
The risk is particularly acute regarding clinical trials that contain data at the level of individual participants. Yet pressure to open up even these datasets continues to increase. Late last year, for instance, the US National Institutes of Health released a draft policy on data-sharing that will require that all funded investigators make their datasets available to colleagues.
The US Environmental Protection Agency is also proposing to ignore any research for major environmental regulations unless scientists disclose all of their raw data, including information obtained from individual medical records. Meanwhile, in the UK, a new “” for the reporting of clinical trials has recently been unveiled by the NHS’ Health Research Authority, following a recommendation by the Commons Science and Technology Select Committee.
Another concern about open science is that shared data might be used to distort the evidence. For instance, data from a vaccine trial might be misused by an antivaxxer organisation, or a study into the use of new electronic nicotine delivery devices might be cherrypicked by a tobacco company. However, a November event on open science convened by the US National Academy of Sciences – and attended by journal editors, funders, patients and administrators of data-sharing platforms – heard repeatedly that such concerns had not yet been realised.
This may be due to the gatekeeping of open clinical data applied by the sharing platforms. One such platform, for example, requires applicants to submit “a quality research proposal”, which must contain “all the information needed for clinical trial sponsors and independent review panels to make a decision about the request”.
Moreover, as suggested in the Lancet letter, it would not be too difficult to pseudonymise individual data in order to make it legally open and accessible. Yes, researchers would need to obtain consent from participants and explain the risk of backwards identification, but this is something that many participants could probably live with. At the NAS event, held in Washington DC, a patient representative was relatively sanguine about the use of his experimental data, pointing out that many patients entered trials for altruistic reasons and would expect data to be used in a way that has the maximum possible beneficial impact.
This underlines the faultiness of the current default position regarding clinical trials and epidemiological studies, to the effect that making the raw data widely available is dangerous. Instead, opening up these participant-derived data should be regarded as a worthwhile process that needs to be supported and managed carefully.
In the absence of explicit, advance consent from patients, making anonymised clinically derived data openly available is probably a step too far for most institutions. However, they should actively facilitate the anonymisation and pseudonymisation of patient and participant data underlying their published outputs. And they should establish clear operating procedures on how and with whom these data can be shared.
Moreover, when data are not derived from human subjects, researchers should be routinely expected to upload their underlying raw data and metadata into a database as soon as their publication is listed on their institution’s repository. This should be done in accordance with the “FAIR” principles of being findable, accessible, interoperable and reusable.
Such an approach would allow institutions to audit data published in their name and enable rapid screening for research integrity issues.
You may not read it in bacteriology textbooks, but sunlight really does have the potential to eradicate most pathological behaviour – if only we fully open science’s curtains.
Jonathan Grigg is professor of paediatric respiratory and environmental medicine at Queen Mary University of London’s School of Medicine and Dentistry, where he is also deputy dean for research integrity.