Preprints are rapidly becoming popular in biology as a way to speed up the process of science, get feedback on manuscripts prior to publication, and establish precedence (Desjardins-Proulx et al. 2013). Since biologists are still learning about preprints I regularly get asked which of the available preprint servers to use. Here’s the long-form version of my response.
The good news is that you can’t go wrong right now. Posting a preprint and telling people about it is far more important than the particular preprint server you choose. All of the major preprint servers are good choices. Of course you still need to pick one, and the best way to do that is to think about the differences between the available options. Here’s my take on four of the major preprint servers: arXiv, bioRxiv, PeerJ, and figshare.
arXiv is the oldest of the science preprint servers. As a result it is the most well established, it is well respected, more people have heard of it than any of the other preprint servers, and there is no risk of it disappearing any time soon. The downside to having been around for a long time is that arXiv is currently missing some features that are increasingly valued on the modern web. In particular there is currently no ability to comment on preprints (though they are working on this) and there are no altmetrics (things like download counts that can indicate how popular a preprint is). The other thing to consider is that arXiv’s focus is on the quantitative sciences, which can be both a pro and a con. If you do math, physics, computer science, etc., this is the preprint server for you. If you do biology it depends on the kind of research you do. If your work is quantitative then your research may be seen by folks outside of your discipline working on related quantitative problems. If your work isn’t particularly quantitative it won’t fit in as well. arXiv allows an array of licenses that can either allow or restrict reuse. In my experience it can take about a week for a preprint to go up on arXiv and the submission process is probably the most difficult of the available options (but it’s still far easier than submitting a paper to a journal).
bioRxiv is the new kid on the block having launched less than a year ago. It has both commenting and altmetrics, but whether it will become as established as arXiv and stick around for a long time remains to be seen. It is explicitly biology focused and accepts research of any kind in the biological sciences. If you’re a biologist, this means that you’re less likely to reach people outside of biology, but it may be more likely that biology folks come across your work. bioRxiv allows an array of licenses that can either allow or restrict reuse. However, they explicitly override the less open licenses for text mining purposes, so all preprints there can be text-mined. In my experience it can take about a week for a preprint to go up on bioRxiv.
PeerJ Preprints is another new preprint server that is focused on biology and accepts research from across the biological sciences. Like bioRxiv it has commenting and altmetrics. It is the fastest of the preprint servers, with less than 24 hours from submission to posting in my experience. PeerJ has a strong commitment to open access, so all of its preprints are licensed with the Creative Commons Attribution License. PeerJ also publishes an open access journal, but you can post preprints to PeerJ Preprints without submitting them to the journal (and this is very common). If you do decide to submit your manuscript to the PeerJ journal after posting it as a preprint you can do this with a single click and, should it be published, the preprint will be linked to the paper. PeerJ has the most modern infrastructure of any of the preprint servers, which makes for really pleasant submission, reading, and commenting experiences. You can also earn PeerJ reputation points for posting preprints and engaging in discussions about them. PeerJ is the only major preprint server run by a for-profit company. This is only an issue if you plan to submit your paper to a journal that only allows the posting of non-commercial preprints. I know of only one journal with this restriction, but it is American Naturalist, which can be an important journal in some areas of biology.
figshare is a place to put any kind of research output including data, figures, slides, and preprints. The benefit of this general approach to archiving research outputs is that you can use figshare to store all kinds of research outputs in the same place. The downside is that because it doesn’t focus on preprints people may be less likely to find your manuscript among all of the other research objects. One of the things I like about this broad approach to archiving anything is that I feel comfortable posting things that aren’t really manuscripts. For example, I post grant proposals there. figshare accepts research from any branch of science and has commenting and altmetrics. There is no delay from submission to posting. Like PeerJ, figshare is a for-profit company and any document posted there will be licensed with the Creative Commons Attribution License.
Those are my thoughts. I have preprints on all three preprint servers + figshare and I’ve been happy with all three experiences. As I said at the beginning, the most important thing is to help speed up the scientific process by posting your work as preprints. Everything else is just details.
UPDATE: It looks like, due to a hiccup with scheduling this post, an early version went out to some folks without the figshare section.
UPDATE: In the comments Richard Sever notes that bioRxiv’s preprints are typically posted within 48 hours of submission and that their interpretation of the text mining clause is that this is covered by fair use. See our discussion in the comments for more details.
A couple of points about bioRxiv. Posting is typically in 48 hours. And as a nonprofit backed by Cold Spring Harbor Laboratory, the intention is most certainly to stick around for a long time.
Also a clarification on text mining: under US law, this should be permitted under ‘fair use’ regardless of license. So it is not a case of ‘overriding’ the license but just making clear text mining is allowed however the author wishes to share the work on bioRxiv.
Great roundup, Ethan!
A bit of background about arXiv and altmetrics, for those who are interested: it’s not that they don’t collect download and pageview stats, it’s that they’re philosophically opposed to making them public. You can read their thoughts here: http://arxiv.org/help/faq/statfaq.
@Richard – Thanks for the clarification of the average turnaround time. I’m glad to hear that it’s a little faster than I thought. I personally don’t doubt that you’ll be around for the long haul, but having seen other efforts in this area fail I think it’s a reasonable thing for folks to consider.
Regarding the text mining issue, my impression is that this interpretation of fair use as fully covering text mining is a gray area. Do you have any good references for this in the case of NC & ND licenses?
@ethan I don’t have a specific reference other than the definition of ‘fair use’ under US law. See also this funnier take.
The problem with invoking a CC BY license (or a variant like NC, ND or NC/ND) as the thing that allows text mining is that all of these licenses require ‘attribution’, and most text miners would not want to have to cite every single paper they mined*. It is therefore much better to invoke ‘fair use’ as the justification, as this does not require attribution.
@richard – Thanks for the clarification. I certainly agree that the fair use argument is reasonable (and good for science), but in the absence of established legal opinions on this the existence of that statement definitely makes me feel more comfortable mining bioRxiv’s collection. I’ve added an update at the end of the post to make sure that folks see this discussion.
* Note that this is the main reason data sets intended for broad re-use are released under a CC 0 license, which does not require attribution.
In my experience, arXiv seems to post papers online within about 24 to 48 hours.
Also worth noting are the fairly new scientific publishing platforms ScienceOpen (scienceopen.com) and The Winnower (thewinnower.com). These can be seen as preprint servers but there is more to them than that. Both include open review, effectively making them journals in my opinion.
@thomas Apparently I just have bad luck (or a poor perception of time) when it comes to submitting papers to preprint servers 🙂 Faculty of 1000 Research would also be on that list of alternatives, but I agree with you that they really go beyond the standard approach to preprints in that most of them really represent an end point as well as a starting point.
There is another repository to consider for preprints and datasets called zenodo.org. It has GitHub and Dropbox integration, flexible licensing, and altmetrics.
Thanks for pointing out Zenodo, Leonardo. Definitely another option that deserves a look.
Hi Ethan. I would like to suggest that you deposit this post (as a PDF) in figshare, in order to provide a stable link to this opinion. I want to write a short didactic text discussing the importance of depositing raw field data in figshare, and your post (along with any further updates to the text that you upload there) would be an interesting source of information on the subject.
All the best,
And/or you could publish the post at http://thewinnower.com which also assigns a DOI. Check it out – it is a very interesting science communication initiative.
@Marcelo – Done! http://dx.doi.org/10.6084/m9.figshare.1157184
@Thomas – I went with figshare out of familiarity and limited time to get this up, but I’m definitely interested in The Winnower and plan to check it out when I have a little more time on my hands.
A preprint was already posted on bioRxiv. Can we post a revision (a minor change) to PeerJ Preprints?
I’d recommend sticking with a single preprint server, but you can post revisions directly to bioRxiv. This keeps the history of the document together in one place. By default readers will see the most recent version, but they can look back at the original submission if they want to.