UPDATE: Both Ecology Letters and the British Ecological Society journals now allow preprints. Thanks to both groups for listening to the community and supporting the rapid and open exchange of scientific ideas.
Dear Ecology Letters and the British Ecological Society,
I am writing to ask that you support the scientific good by allowing the submission of papers that have been posted as preprints. I or my colleagues have reached out to you before without success, but I have heard through various grapevines that both of you are discussing this possibility and I want to encourage you to move forward with allowing this important practice.
The benefits of preprints to science are substantial. They include:
- More rapid communication and discussion of important scientific results
- Improved quality of published research by allowing for more extensive pre-publication peer review
- A fair mechanism for establishing precedence that is not contingent on the idiosyncrasies of formal peer review
- A way for early-career scientists to demonstrate productivity and impact on a time scale that matches their need to apply for postdoctoral fellowships and jobs
I am writing to you specifically because your journals represent the major stumbling block for those of us interested in improving science by posting preprints. Your journals either explicitly do not allow the submission of papers that have preprints posted online or lack explicit statements that it is OK to do so. This means that if there is any possibility of eventually submitting a paper to one of these journals then researchers must avoid posting preprints.
The standard justification that journals give for not allowing preprints is that they constitute “prior publication”. However, this is not an issue for two reasons. First, preprints are not peer reviewed. They are the equivalent of a long established practice in biology of sending manuscripts to colleagues for friendly review and to make them aware of cutting edge work. They simply take advantage of the internet to scale this to larger numbers of colleagues. Second, the vast majority of publication outlets do not believe that preprints represent prior publication, and therefore the publication ethics of the broader field of academic publishing clearly allows this. In particular Science, Nature, PNAS, the Ecological Society of America, the Royal Society, Springer, and Elsevier all generally allow the posting of preprints. Nature even wrote about this policy nearly a decade ago stating that:
Nature never wishes to stand in the way of communication between researchers. We seek rather to add value for authors and the community at large in our peer review, selection and editing… Communication between researchers includes not only conferences but also preprint servers… As first stated in an editorial in 1997, and since then in our Guide to Authors, if scientists wish to display drafts of their research papers on an established preprint server before or during submission to Nature or any Nature journal, that’s fine by us.
If you’d like to learn more about the value of preprints, and see explanations of why some of the other common concerns about preprints are unjustified, some colleagues and I have published a paper on The Case for Open Preprints in Biology.
So, I am asking that, for the good of science and to bring your journals in line with widely accepted publication practices, you please move quickly to explicitly allow the submission of papers that have been posted as preprints.
Slides and script from Ethan White’s Ignite talk on Big Data in Ecology from Sandra Chung and Jacquelyn Gill’s excellent ESA 2013 session on Sharing Makes Science Better. Slides are also archived on figshare.
1. I’m here to talk to you about the use of big data in ecology and to help motivate a lot of the great tools and approaches that other folks will talk about later in the session.
2. The definition of big is of course relative, and so when we talk about big data in ecology we typically mean big relative to our standard approaches based on observations and experiments conducted by single investigators or small teams.
3. And for those of you who prefer a more precise definition, my friend Michael Weiser defines big data and ecoinformatics as involving anything that can’t be successfully opened in Microsoft Excel.
4. Data can be unusually large in two ways. It can be inherently large, like citizen science efforts such as the Breeding Bird Survey, where large amounts of data are collected in a consistent manner.
5. Or it can be large because it’s composed of a large number of small datasets that are compiled from sources like Dryad, figshare, and Ecological Archives to form useful compilation datasets for analysis.
6. We have increasing amounts of both kinds of data in ecology as a result of both major data collection efforts and an increased emphasis on sharing data.
7-8. But what does this kind of data buy us? First, big data allows us to work at scales beyond those at which traditional approaches are typically feasible. This is critical because many of the most pressing issues in ecology including climate change, biodiversity, and invasive species operate at broad spatial and long temporal scales.
9-10. Second, big data allows us to answer questions in general ways, so that we get the answer today instead of waiting a decade to gradually compile enough results to reach consensus. We can do this by testing theories using large amounts of data from across ecosystems and taxonomic groups, so that we know that our results are general, and not specific to a single system (e.g., White et al. 2012).
11. This is the promise of big data in ecology, but realizing this potential is difficult because working with either truly big data or data compilations is inherently challenging, and we still lack sufficient data to answer many important questions.
12. This means that if we are going to take full advantage of big data in ecology we need three things: training in computational methods for ecologists, tools to make it easier to work with existing data, and more data.
13. We need to train ecologists in the computational tools needed for working with big data, and there are an increasing number of efforts to do this including Software Carpentry (which I’m actively involved in) as well as training initiatives at many of the data and synthesis centers.
14. We need systems for storing, distributing, and searching data like DataONE, Dryad, and NEON’s data portal, as well as the standardized metadata and associated tools that make finding data to answer a particular research question easier.
15. We need crowd-sourced systems like the Ecological Data Wiki to allow us to work together on improving insufficient metadata and understanding what kinds of analyses are appropriate for different datasets and how to conduct them rigorously.
16. We need tools for quickly and easily accessing data like rOpenSci and the EcoData Retriever so that we can spend our time thinking and analyzing data rather than figuring out how to access it and restructure it.
17. We also need systems that help turn small data into big data compilations, whether it be through centralized standardized databases like GBIF or tools that pull data together from disparate sources like Map of Life.
18. And finally we need to continue to share more and more data and share it in useful ways, with the good formats, standardized metadata, and open licenses that make it easy to work with.
19. And so, what I would like to leave you with is that we live in an exciting time in ecology thanks to the generation of large amounts of data by citizen science projects, exciting federal efforts like NEON, and a shift in scientific culture towards sharing data openly.
20. If we can train ecologists to work with and combine existing tools in interesting ways, it will let us combine datasets spanning the surface of the globe and diversity of life to make meaningful predictions about ecological systems.
Over at Dynamic Ecology this morning Jeremy Fox has a post giving advice on how to decide where to submit a paper. It’s the same basic advice that I received when I started grad school almost 15 years ago and as a result I don’t think it considers some rather significant changes that have happened in academic publishing over the last decade and a half. So, I thought it would be constructive for folks to see an alternative viewpoint. Since this is really a response to Jeremy’s post, not a description of my process, I’m going to use his categories in the same order as the original post and offer my more… youthful… perspective.
- Aim as high as you reasonably can. The crux of Jeremy’s point is “if you’d prefer for more people to read and think highly of your paper, you should aim to publish it in a selective, internationally-leading journal.” From a practical perspective journal reputation used to be quite important. In the days before easy electronic access, good search algorithms, and social networking, most folks found papers by reading the table of contents of individual journals. In addition, before there was easy access to paper-level citation data and alt-metrics, if you needed to make a quick judgment on the quality of someone’s science the journal name was a decent starting point. But none of those things are true anymore. I use searches, filtered RSS feeds, Google Scholar’s recommendations, and social media to identify papers I want to read. I do still subscribe to tables of contents via RSS, but I watch PLOS ONE and PeerJ just as closely as Science and Nature. If I’m evaluating a CV as a member of a search committee or a tenure committee I’m interested in the response to your work, not where it is published, so in addition to looking at some of your papers I use citation data and alt-metrics related to your papers. To be sure, there are lots of folks like Jeremy that focus on where you publish to find papers and evaluate CVs, but it’s certainly not all of us.
- Don’t just go by journal prestige; consider “fit”. Again, this used to matter more before there were better ways to find papers of interest.
- How much will it cost? Definitely a valid concern, though my experience has been that waivers are typically easy to obtain. This is certainly true for PLOS ONE.
- How likely is the journal to send your paper out for external review? This is a strong tradeoff against Jeremy’s point about aiming high since “high impact” journals also typically have high pre-review rejection rates. I agree with Jeremy that wasting time in the review process is something to be avoided, but I’ll go into more detail on that below.
- Is the journal open access? I won’t get into the arguments for open access here, but it’s worth noting that increasing numbers of us value open access and think that it is important for science. We value open access publications so if you want us to “think highly of your paper” then putting it where it is OA helps. Open access can also be important if you “prefer for more people to read… your paper” because it makes it easier to actually do so. In contrast to Jeremy, I am more likely to read your paper if it is open access than if it is published in a “top” journal, and here’s why: I can do it easily. Yes, my university has access to all of the top journals in my field, but I often don’t read papers while I’m at work. I typically read papers in little bits of spare time while I’m at home in the morning or evenings, or on my phone or tablet while traveling or waiting for a meeting to start. If I click on a link to your paper and I hit a paywall then I have to decide whether it’s worth the extra effort to go to my library’s website, log in, and then find the paper again through that system. At this point unless the paper is obviously really important to my research the activation energy typically becomes too great (or I simply don’t have that extra couple of minutes) and I stop. This is one reason that my group publishes a lot using Reports in Ecology. It’s a nice compromise between being open access and still being in a well regarded journal.
- Does the journal evaluate papers only on technical soundness? The reason that many of us think this approach has some value is simple: it reduces the amount of time and energy spent trying to get perfectly good research published in the most highly ranked journal possible. This can actually be really important for younger researchers in terms of how many papers they produce at certain critical points in the career process. For example, I would estimate that the average amount of time that my group spends getting a paper into a high profile journal is over a year. This is a combination of submitting to multiple, often equivalent caliber, journals until you get the right roll of the dice on reviewers, and the typically extended rounds of review that are necessary not only to satisfy the reviewers about what you’ve done, but also to address requests for additional analyses that often aren’t critical and to change how things are described so that they sit better with reviewers. If you are finishing your PhD then having two or three papers published in a PLOS ONE style journal vs. in review at a journal that filters on “importance” can make a big difference in the prospect of obtaining a postdoc. Having these same papers out for an extra year accumulating citations can make a big difference when applying for faculty positions or going up for tenure if folks who value paper-level metrics over journal name are involved in evaluating your packet.
- Is the journal part of a review cascade? I don’t actually know a lot of journals that do this, but I think it’s a good compromise between aiming high and not wasting a lot of time in review. This is why we think that ESA should have a review cascade to Ecosphere.
- Is it a society journal? I agree that this has value and it’s one of the reasons we continue to support American Naturalist and Ecology even though they aren’t quite as open as I would personally prefer.
- Have you had good experiences with the journal in the past? Sure.
- Is there anyone on the editorial board who’d be a good person to handle your paper? Having a sympathetic editor can certainly increase your chances of acceptance, so if you’re aiming high then having a well matched editor or two to recommend is definitely a benefit.
To be clear, there are still plenty of folks out there who approach the literature in exactly the way Jeremy does and I’m not suggesting that you ignore his advice. In fact, when advising my own students about these things I often actively consider and present Jeremy’s perspective. However, there are also an increasing number of folks who think like I do and who have a very different set of perspectives on these sorts of things. That makes life more difficult when strategizing over where to submit, but the truth is that the most important thing is to do the best science possible and publish it somewhere for the world to see. So, go forth, do interesting things, and don’t worry so much about the details.
ESA has just announced that it has changed its policy on preprints and will now allow articles that have been posted on major preprint servers, like arXiv, to be considered for publication in its journals.
I am very excited about this change for two reasons. First, as nicely laid out in an INNGE blog post by Philippe Desjardins-Proulx, there are many positive benefits to science of the preprint culture. Preprints make science more accessible, allow researchers to get feedback from the community prior to peer review, and speed up the scientific process by making ideas available to others as quickly as possible. We should take this opportunity as a community to start developing the kind of vibrant preprint culture that has benefited so many other disciplines. Second, I am encouraged by the rapid response of ESA to the concerns expressed by myself and other members of the community, and take it as a sign that my favorite society is open to making the kinds of changes that are necessary to best facilitate science in the modern era. More work is clearly necessary, but this is a very encouraging start.
UPDATE: Carl Boettiger has posted his very nice letter to Don Strong that played a critical role in taking this discussion from a bunch of folks talking over social media to something that effected meaningful change.
UPDATE: If you’re looking for publicly available grants go check out our new Open Grants website at https://www.ogrants.org/. It has way more grants and is searchable so that you can quickly find the grants most useful to you.
Recently a bunch of folks in the biological sciences have started sharing their grant proposals openly. Their reasons for doing so are varied (see the links next to their names below), but part of the common justification is a general interest in opening up science so that all stages of the process can benefit from better interaction and communication, and part of it is to provide examples for younger scientists writing grants. To help accomplish both of these goals I’m going to do what Titus Brown suggested and compile a list of all of the available open proposals in the biological sciences (if you’re looking for math proposals they have a list too). Given the limited number of proposals available at the moment I’m just going to maintain the list here, sorted alphabetically by PI. Another way to find proposals is to look at the ‘grant’ and ‘proposal’ tags on figshare, where several of us have been posting proposals. If you know of more proposals, decide to post some yourself, or have corrections to proposals in the list, just let me know in the comments and I’ll keep the list updated. Enjoy!
- 2014 / Postdoctoral fellowships (several) – Making sense of cancer data: Implications for personalized therapy and cancer biology
- 2008 / Tools and Resources Development Fund Application – pubmed2ensembl: a resource for linking biological literature to genome sequences (BBSRC) *funded
- 2008 / New Investigator Grant Application (NERC) *funded
- 2008 / EMBO Young Investigator Programme Application (EMBO)
- 2007 / Responsive Mode Grant Application (BBSRC)
- 2014 / Children’s Foundation Research Institute – Prediction of Weight Gain in Inbred Mouse Strains *funded
- 2012 / NSF Office of Cyberinfrastructure proposal, Materials and Workshops for Cyberinfrastructure Education in Biology supplement to BEACON. *funded
- 2012 / NSF CAREER proposal, Assembling Extremely Large Metagenomes
- 2012 / NSF BIGDATA proposal, Low-memory Streaming Prefilters for Biological Sequencing Data
- 2012 / Moore Foundation proposal on marine metagenomics
- 2011 / NSF CAREER proposal: “Scaling and Improving de Bruijn graph assembly”
- 2010 / Next-gen course (NIH R25) *funded
- 2009 / Web tools for next-gen sequence analysis (USDA) *funded
- 2007 / Cartwheel
- Kathryn Fuller Doctoral Fellowship application (WWF)
- 2010 / Prairie Biotic Research proposal *funded
- 2009 / Ecological and evolutionary impacts of pollinator sharing between cultivated and wild sunflowers (Norman Hackerman Advanced Research Program)
- 2009 / Lewis and Clark grant proposal (American Philosophical Society)
- Doctoral Dissertation Improvement Grant proposal (NSF)
- Forest Shreeve Award proposal
- Ariel Appleton Research Fellowship Proposal – Ecological Networks
- How do crop-mediated changes in mutualist and antagonist communities affect selection on floral and defense traits?
- 2015 / Marie Skłodowska-Curie Individual Fellowship *funded
- 2011 / “Automated and community synthesis of the tree of life” (NSF AVATOL) *funded
- 2010 / “Towards a comprehensive, community-owned and sustainable repository of reusable phylogenetic knowledge” w/Hilmar Lapp (NSF ABI)
- 2009 / “A network for enabling community-driven standards to link evolution into the global web of data (EvoIO)” w/Hilmar Lapp (NSF INTEROP)
- 2009 / NSF Plant Genome *funded
- 2013 / “Reversing long-term experiments to understand regime shifts” (NSF DEB preproposal) *funded
- 2012 / Understanding range shift model error: The influence of generation time and rate of adaptation on species distribution model predictions. w/Scott Chamberlain (NCEAS proposal).
- 2008 / Evolution under simulated climate change in response to trophic shifts. (NSF DDIG) *funded
- 2010 / Protein Design Using Quantum Mechanics (Danish Center for Supercomputing) *funded
- 2008 / Computational Design of Stable Enzymes (Danish National Science Foundation, DSF-NABIIT) *funded
- 2006 / Modeling pH-Dependence in Drug Design (EU Marie Curie Program) *funded
- 2006 / Computational Prediction and Validation of Protein Structure and Function in Protein Engineering and Rational Drug Design (Danish National Science Foundation, FNU) *funded
- 2006 / Prediction and Interpretation of Protein pKa’s Using QM/MM (US National Science Foundation – MCB; rescinded when I moved to Denmark) *funded
- 2002 / The Prediction and Interpretation of Protein pKa’s Using QM/MM (US National Science Foundation – MCB) *funded
- 2010 / Ontology-enabled reasoning across phenotypes from evolution and model organisms w/Todd Vision (NSF) *funded
- 2013 / NSF Postdoctoral Research Fellowship in Biology
- 2010 / Leakey Foundation General Research Grant
- 2009 / US Student Fulbright
- 2009 / NSF Dissertation Improvement Grant
Heather Piwowar (@researchremix) & Jason Priem (@jasonpriem) (read their thoughts on sharing proposals)
- Uptake proposal (CIHR)
- 2007 / Sxy proposal (CIHR) *funded
- 2001 / CIHR proposal *funded
- 1999 / NIH proposal *funded
- 2009 / USDA/NIFA: “Scanning for yield: high-throughput discovery of candidate agronomic loci for marker-assisted selection in maize” *funded
- 2015 / NSF Plant Genome Research Program: “The genetics of highland adaptation in maize” *funded
- Netherlands organization for scientific research postdoc fellowship
- Netherlands organization for scientific research PhD fellowship
- Netherlands organization for scientific research PhD fellowship
- Netherlands organization for scientific research PhD fellowship *funded
- Netherlands organization for scientific research postdoc fellowship *funded
- 2010 / NSF Graduate Research Fellowship *funded
- 2016 / NSF Postdoctoral Fellowship *funded (associated reviews)
- 2014 / Gene Wiki: Expanding the ecosystem of community intelligence resources (R01 GM089820)
- 2017 / Gene Wiki renewal
- 2014 / Dynamic macroecology: Globally assessing body size diversity response to environmental change (NSF Postdoctoral Fellowship) *funded
- 2012 / Data Management and Computational Skills Training for LTER Scientists w/Ethan White & Greg Wilson (LTER Training Working Groups Proposal)
- 2011 / Fuelwood, Savannas, and Climate Change: Integrating Modeling, Field Experimentation, and Optical and Radar Remote Sensing (NASA Predoctoral Graduate Fellowship) *funded
- 2014 / NSF Postdoctoral Fellowship – Diversity-stability relationships and coexistence: new theory and empirical tests *funded
- 2012 / Genomic tools to study coral reef resilience (University of Melbourne)
- 2012 / Plastid endosymbiosis: a detailed study of genome dynamics (Australian Research Council)
- 2012 / Evolutionary dynamics of the algae: Understanding adaptive potential under environmental change (Australian Research Council) *funded
- Probing key innovations with next generation sequencing
- 2009 / Macroevolutionary dynamics of marine algae
- 2012 / Sustainable and Scalable Infrastructure for the Publication of Data (NSF) *funded
- 2008 / A Digital Repository for Preservation and Sharing of Data Underlying Published Works in Evolutionary Biology (NSF) *funded
- 2013 / ERC AdG IMMUNENESIS *funded
- 2014 / Moore Investigator in Data Driven Discovery proposal *funded
- 2010 / CAREER: Advancing Macroecology Using Informatics and Entropy Maximization (NSF CAREER Award) *funded
- 2005 / Broad-scale patterns of the distribution of body sizes of individuals in ecological communities (NSF Postdoc Fellowship) *funded
- 2008 / Understanding multimodality in animal size distributions (NSF Research Starter Grant) *funded
As I announced on Twitter about a week ago, I am now making all of my grant proposals open access. To start with I’m doing this for all of my sole-PI proposals, because I don’t have to convince my collaborators to participate in this rather aggressively open style of science. At the moment this includes three funded proposals: my NSF Postdoctoral Fellowship proposal, an associated Research Starter Grant proposal, and my NSF CAREER award.
So, why am I doing this, especially with the CAREER award that still has several years left on it and some cool ideas that we haven’t worked on yet? I’m doing it for a few reasons. First, I think that openness is inherently good for science. While there may be benefits for me in keeping my ideas secret until they are published, this certainly doesn’t benefit science more broadly. By sharing our proposals the cutting edge of scientific thought will no longer be hidden from view for several years and that will allow us to make more rapid progress. Second, I think having examples of grants available to young scientists has the potential to help them learn how to write good proposals (and other folks seem to agree) and therefore decrease the importance of grantsmanship relative to cool science in the awarding of limited funds. Finally, I just think that folks deserve to be able to see what their tax dollars are paying for, and to be able to compare what I’ve said I will do to what I actually accomplish. I’ve been influenced in my thinking about this by posts by several of the big open science folks out there including Titus Brown, Heather Piwowar, and Rod Page.
To make my grants open access I chose to use figshare for several reasons.
- Credit. Figshare assigns a DOI to all of its public objects, which means that you can easily cite them in scientific papers. If someone gets an idea out of one of my proposals and works on it before I do, this lets them acknowledge that fact. Stats are also available for views, shares, and (soon) citations, making it easier to track the impact of your larger corpus of research outputs.
- Open Access. All public written material is licensed under CC-BY (basically just cite the original work) allowing folks to do cool things without asking.
- Permanence. I can’t just change my mind and delete the proposal and I also expect that figshare will be around for a long time.
- Version control. For proposals that are not funded, revised, not funded, revised, etc. figshare allows me to post multiple versions of the proposal while maintaining the previous versions for posterity/citation.
During this process I’ve come across several other folks doing similar things and even inspired others to post their proposals, so I’m in the process of compiling a list of all of the publicly available biology proposals that I’m aware of and will post a list with links soon. It’s my hope that this will serve as a valuable resource for young and old researchers alike and will help to lead the way forward to a more open scientific dialogue.