Accumulating evidence indicates high risk of bias in preclinical animal research, questioning …
Accumulating evidence indicates high risk of bias in preclinical animal research, questioning the scientific validity and reproducibility of published research findings. Systematic reviews found low rates of reporting of measures against risks of bias in the published literature (e.g., randomization, blinding, sample size calculation) and a correlation between low reporting rates and inflated treatment effects. That most animal research undergoes peer review or ethical review would offer the possibility to detect risks of bias at an earlier stage, before the research has been conducted. For example, in Switzerland, animal experiments are licensed based on a detailed description of the study protocol and a harm–benefit analysis. We therefore screened applications for animal experiments submitted to Swiss authorities (n = 1,277) for the rates at which the use of seven basic measures against bias (allocation concealment, blinding, randomization, sample size calculation, inclusion/exclusion criteria, primary outcome variable, and statistical analysis plan) were described and compared them with the reporting rates of the same measures in a representative sub-sample of publications (n = 50) resulting from studies described in these applications. Measures against bias were described at very low rates, ranging on average from 2.4% for statistical analysis plan to 19% for primary outcome variable in applications for animal experiments, and from 0.0% for sample size calculation to 34% for statistical analysis plan in publications from these experiments. Calculating an internal validity score (IVS) based on the proportion of the seven measures against bias, we found a weak positive correlation between the IVS of applications and that of publications (Spearman’s rho = 0.34, p = 0.014), indicating that the rates of description of these measures in applications partly predict their rates of reporting in publications. These results indicate that the authorities licensing animal experiments are lacking important information about experimental conduct that determines the scientific validity of the findings, which may be critical for the weight attributed to the benefit of the research in the harm–benefit analysis. Similar to manuscripts getting accepted for publication despite poor reporting of measures against bias, applications for animal experiments may often be approved based on implicit confidence rather than explicit evidence of scientific rigor. Our findings shed serious doubt on the current authorization procedure for animal experiments, as well as the peer-review process for scientific publications, which in the long run may undermine the credibility of research. Developing existing authorization procedures that are already in place in many countries towards a preregistration system for animal research is one promising way to reform the system. This would not only benefit the scientific validity of findings from animal experiments but also help to avoid unnecessary harm to animals for inconclusive research.
Beginning January 2014, Psychological Science gave authors the opportunity to signal open …
Beginning January 2014, Psychological Science gave authors the opportunity to signal open data and materials if they qualified for badges that accompanied published articles. Before badges, less than 3% of Psychological Science articles reported open data. After badges, 23% reported open data, with an accelerating trend; 39% reported open data in the first half of 2015, an increase of more than an order of magnitude from baseline. There was no change over time in the low rates of data sharing among comparison journals. Moreover, reporting openness does not guarantee openness. When badges were earned, reportedly available data were more likely to be actually available, correct, usable, and complete than when badges were not earned. Open materials also increased to a weaker degree, and there was more variability among comparison journals. Badges are simple, effective signals to promote open practices and improve preservation of data and materials by using independent repositories.
We can regard the wider incentive structures that operate across science, such …
We can regard the wider incentive structures that operate across science, such as the priority given to novel findings, as an ecosystem within which scientists strive to maximise their fitness (i.e., publication record and career success). Here, we develop an optimality model that predicts the most rational research strategy, in terms of the proportion of research effort spent on seeking novel results rather than on confirmatory studies, and the amount of research effort per exploratory study. We show that, for parameter values derived from the scientific literature, researchers acting to maximise their fitness should spend most of their effort seeking novel results and conduct small studies that have only 10%–40% statistical power. As a result, half of the studies they publish will report erroneous conclusions. Current incentive structures are in conflict with maximising the scientific value of research; we suggest ways that the scientific ecosystem could be improved.
Low reproducibility rates within life science research undermine cumulative knowledge production and …
Low reproducibility rates within life science research undermine cumulative knowledge production and contribute to both delays and costs of therapeutic drug development. An analysis of past studies indicates that the cumulative (total) prevalence of irreproducible preclinical research exceeds 50%, resulting in approximately US$28,000,000,000 (US$28B)/year spent on preclinical research that is not reproducible—in the United States alone. We outline a framework for solutions and a plan for long-term improvements in reproducibility rates that will help to accelerate the discovery of life-saving therapies and cures.
We have empirically assessed the distribution of published effect sizes and estimated …
We have empirically assessed the distribution of published effect sizes and estimated power by analyzing 26,841 statistical records from 3,801 cognitive neuroscience and psychology papers published recently. The reported median effect size was D = 0.93 (interquartile range: 0.64–1.46) for nominally statistically significant results and D = 0.24 (0.11–0.42) for nonsignificant results. Median power to detect small, medium, and large effects was 0.12, 0.44, and 0.73, reflecting no improvement through the past half-century. This is so because sample sizes have remained small. Assuming similar true effect sizes in both disciplines, power was lower in cognitive neuroscience than in psychology. Journal impact factors negatively correlated with power. Assuming a realistic range of prior probabilities for null hypotheses, false report probability is likely to exceed 50% for the whole literature. In light of our findings, the recently reported low replication success in psychology is realistic, and worse performance may be expected for cognitive neuroscience.
A focus on novel, confirmatory, and statistically significant results leads to substantial …
A focus on novel, confirmatory, and statistically significant results leads to substantial bias in the scientific literature. One type of bias, known as “p-hacking,” occurs when researchers collect or select data or statistical analyses until nonsignificant results become significant. Here, we use text-mining to demonstrate that p-hacking is widespread throughout science. We then illustrate how one can test for p-hacking when performing a meta-analysis and show that, while p-hacking is probably common, its effect seems to be weak relative to the real effect sizes being measured. This result suggests that p-hacking probably does not drastically alter scientific consensuses drawn from meta-analyses.
Computers are now essential in all branches of science, but most researchers …
Computers are now essential in all branches of science, but most researchers are never taught the equivalent of basic lab skills for research computing. As a result, data can get lost, analyses can take much longer than necessary, and researchers are limited in how effectively they can work with software and data. Computing workflows need to follow the same practices as lab projects and notebooks, with organized data, documented steps, and the project structured for reproducibility, but researchers new to computing often don't know where to start. This paper presents a set of good computing practices that every researcher can adopt, regardless of their current level of computational skill. These practices, which encompass data management, programming, collaborating with colleagues, organizing projects, tracking work, and writing manuscripts, are drawn from a wide variety of published sources from our daily lives and from our work with volunteer organizations that have delivered workshops to over 11,000 people since 2010.
Despite the potential benefits of sequential designs, studies evaluating treatments or experimental …
Despite the potential benefits of sequential designs, studies evaluating treatments or experimental manipulations in preclinical experimental biomedicine almost exclusively use classical block designs. Our aim with this article is to bring the existing methodology of group sequential designs to the attention of researchers in the preclinical field and to clearly illustrate its potential utility. Group sequential designs can offer higher efficiency than traditional methods and are increasingly used in clinical trials. Using simulation of data, we demonstrate that group sequential designs have the potential to improve the efficiency of experimental studies, even when sample sizes are very small, as is currently prevalent in preclinical experimental biomedicine. When simulating data with a large effect size of d = 1 and a sample size of n = 18 per group, sequential frequentist analysis consumes in the long run only around 80% of the planned number of experimental units. In larger trials (n = 36 per group), additional stopping rules for futility lead to the saving of resources of up to 30% compared to block designs. We argue that these savings should be invested to increase sample sizes and hence power, since the currently underpowered experiments in preclinical biomedicine are a major threat to the value and predictiveness in this research domain.
There is a scarcity of novel treatments to address many unmet medical …
There is a scarcity of novel treatments to address many unmet medical needs. Industry and academia are finally coming to terms with the fact that the prevalent models and incentives for innovation in early stage drug discovery are failing to promote progress quickly enough. Here we will examine how an open model of precompetitive public–private research partnership is enabling efficient derisking and acceleration in the early stages of drug discovery, whilst also widening the range of communities participating in the process, such as patient and disease foundations.
The movement towards open science is a consequence of seemingly pervasive failures …
The movement towards open science is a consequence of seemingly pervasive failures to replicate previous research. This transition comes with great benefits but also significant challenges that are likely to affect those who carry out the research, usually early career researchers (ECRs). Here, we describe key benefits, including reputational gains, increased chances of publication, and a broader increase in the reliability of research. The increased chances of publication are supported by exploratory analyses indicating null findings are substantially more likely to be published via open registered reports in comparison to more conventional methods. These benefits are balanced by challenges that we have encountered and that involve increased costs in terms of flexibility, time, and issues with the current incentive structure, all of which seem to affect ECRs acutely. Although there are major obstacles to the early adoption of open science, overall open science practices should benefit both the ECR and improve the quality of research. We review 3 benefits and 3 challenges and provide suggestions from the perspective of ECRs for moving towards open science practices, which we believe scientists and institutions at all levels would do well to consider.
Policies that mandate public data archiving (PDA) successfully increase accessibility to data …
Policies that mandate public data archiving (PDA) successfully increase accessibility to data underlying scientific publications. However, is the data quality sufficient to allow reuse and reanalysis? We surveyed 100 datasets associated with nonmolecular studies in journals that commonly publish ecological and evolutionary research and have a strong PDA policy. Out of these datasets, 56% were incomplete, and 64% were archived in a way that partially or entirely prevented reuse. We suggest that cultural shifts facilitating clearer benefits to authors are necessary to achieve high-quality PDA and highlight key guidelines to help authors increase their data’s reuse potential and compliance with journal data policies.
Currently, there is a growing interest in ensuring the transparency and reproducibility …
Currently, there is a growing interest in ensuring the transparency and reproducibility of the published scientific literature. According to a previous evaluation of 441 biomedical journals articles published in 2000–2014, the biomedical literature largely lacked transparency in important dimensions. Here, we surveyed a random sample of 149 biomedical articles published between 2015 and 2017 and determined the proportion reporting sources of public and/or private funding and conflicts of interests, sharing protocols and raw data, and undergoing rigorous independent replication and reproducibility checks. We also investigated what can be learned about reproducibility and transparency indicators from open access data provided on PubMed. The majority of the 149 studies disclosed some information regarding funding (103, 69.1% [95% confidence interval, 61.0% to 76.3%]) or conflicts of interest (97, 65.1% [56.8% to 72.6%]). Among the 104 articles with empirical data in which protocols or data sharing would be pertinent, 19 (18.3% [11.6% to 27.3%]) discussed publicly available data; only one (1.0% [0.1% to 6.0%]) included a link to a full study protocol. Among the 97 articles in which replication in studies with different data would be pertinent, there were five replication efforts (5.2% [1.9% to 12.2%]). Although clinical trial identification numbers and funding details were often provided on PubMed, only two of the articles without a full text article in PubMed Central that discussed publicly available data at the full text level also contained information related to data sharing on PubMed; none had a conflicts of interest statement on PubMed. Our evaluation suggests that although there have been improvements over the last few years in certain key indicators of reproducibility and transparency, opportunities exist to improve reproducible research practices across the biomedical literature and to make features related to reproducibility more readily visible in PubMed.
The reliability of experimental findings depends on the rigour of experimental design. …
The reliability of experimental findings depends on the rigour of experimental design. Here we show limited reporting of measures to reduce the risk of bias in a random sample of life sciences publications, significantly lower reporting of randomisation in work published in journals of high impact, and very limited reporting of measures to reduce the risk of bias in publications from leading United Kingdom institutions. Ascertainment of differences between institutions might serve both as a measure of research quality and as a tool for institutional efforts to improve research quality.
Replication is the cornerstone of a cumulative science. However, new tools and …
Replication is the cornerstone of a cumulative science. However, new tools and technologies, massive amounts of data, interdisciplinary approaches, and the complexity of the questions being asked are complicating replication efforts, as are increased pressures on scientists to advance their research. As full replication of studies on independently collected data is often not feasible, there has recently been a call for reproducible research as an attainable minimum standard for assessing the value of scientific claims. This requires that papers in experimental science describe the results and provide a sufficiently clear protocol to allow successful repetition and extension of analyses based on original data. The importance of replication and reproducibility has recently been exemplified through studies showing that scientific papers commonly leave out experimental details essential for reproduction, studies showing difficulties with replicating published experimental results, an increase in retracted papers, and through a high number of failing clinical trials. This has led to discussions on how individual researchers, institutions, funding bodies, and journals can establish routines that increase transparency and reproducibility. In order to foster such aspects, it has been suggested that the scientific community needs to develop a “culture of reproducibility” for computational science, and to require it for published claims. We want to emphasize that reproducibility is not only a moral responsibility with respect to the scientific field, but that a lack of reproducibility can also be a burden for you as an individual researcher. As an example, a good practice of reproducibility is necessary in order to allow previously developed methodology to be effectively applied on new data, or to allow reuse of code and results for new projects. In other words, good habits of reproducibility may actually turn out to be a time-saver in the longer run. We further note that reproducibility is just as much about the habits that ensure reproducible research as the technologies that can make these processes efficient and realistic. Each of the following ten rules captures a specific aspect of reproducibility, and discusses what is needed in terms of information handling and tracking of procedures. If you are taking a bare-bones approach to bioinformatics analysis, i.e., running various custom scripts from the command line, you will probably need to handle each rule explicitly. If you are instead performing your analyses through an integrated framework (such as GenePattern, Galaxy, LONI pipeline, or Taverna), the system may already provide full or partial support for most of the rules. What is needed on your part is then merely the knowledge of how to exploit these existing possibilities.
A study by David Baker and colleagues reveals poor quality of reporting …
A study by David Baker and colleagues reveals poor quality of reporting in pre-clinical animal research and a failure of journals to implement the ARRIVE guidelines. There is growing concern that poor experimental design and lack of transparent reporting contribute to the frequent failure of pre-clinical animal studies to translate into treatments for human disease. In 2010, the Animal Research: Reporting of In Vivo Experiments (ARRIVE) guidelines were introduced to help improve reporting standards. They were published in PLOS Biology and endorsed by funding agencies and publishers and their journals, including PLOS, Nature research journals, and other top-tier journals. Yet our analysis of papers published in PLOS and Nature journals indicates that there has been very little improvement in reporting standards since then. This suggests that authors, referees, and editors generally are ignoring guidelines, and the editorial endorsement is yet to be effectively implemented.
Open data is a vital pillar of open science and a key …
Open data is a vital pillar of open science and a key enabler for reproducibility, data reuse, and novel discoveries. Enforcement of open-data policies, however, largely relies on manual efforts, which invariably lag behind the increasingly automated generation of biological data. To address this problem, we developed a general approach to automatically identify datasets overdue for public release by applying text mining to identify dataset references in published articles and parse query results from repositories to determine if the datasets remain private. We demonstrate the effectiveness of this approach on 2 popular National Center for Biotechnology Information (NCBI) repositories: Gene Expression Omnibus (GEO) and Sequence Read Archive (SRA). Our Wide-Open system identified a large number of overdue datasets, which spurred administrators to respond directly by releasing 400 datasets in one week.
Science advances through rich, scholarly discussion. More than ever before, digital tools …
Science advances through rich, scholarly discussion. More than ever before, digital tools allow us to take that dialogue online. To chart a new future for open publishing, we must consider alternatives to the core features of the legacy print publishing system, such as an access paywall and editorial selection before publication. Although journals have their strengths, the traditional approach of selecting articles before publication (“curate first, publish second”) forces a focus on “getting into the right journals,” which can delay dissemination of scientific work, create opportunity costs for pushing science forward, and promote undesirable behaviors among scientists and the institutions that evaluate them. We believe that a “publish first, curate second” approach with the following features would be a strong alternative: authors decide when and what to publish; peer review reports are published, either anonymously or with attribution; and curation occurs after publication, incorporating community feedback and expert judgment to select articles for target audiences and to evaluate whether scientific work has stood the test of time. These proposed changes could optimize publishing practices for the digital age, emphasizing transparency, peer-mediated improvement, and post-publication appraisal of scientific articles.
No restrictions on your remixing, redistributing, or making derivative works. Give credit to the author, as required.
Your remixing, redistributing, or making derivatives works comes with some restrictions, including how it is shared.
Your redistributing comes with some restrictions. Do not remix or make derivative works.
Most restrictive license type. Prohibits most uses, sharing, and any changes.
Copyrighted materials, available under Fair Use and the TEACH Act for US-based educators, or other custom arrangements. Go to the resource provider to see their individual restrictions.