Jabberwocky Ecology

ESA journals do not allow papers with preprints

Over the weekend I saw this great tweet:

by Philippe Desjardins-Proulx and was pleased to see yet another actively open young scientist. Then I saw his follow up tweet:

At first I was confused. I thought ESA’s policy was that preprints were allowed based on the following text on their website (emphasis mine; still available in Google’s Cache):

A posting of a manuscript or thesis on a personal or institutional homepage or ftp site will generally be considered as a preprint; this will not be grounds for viewing the manuscript as published. Similarly, posting of manuscripts in public preprint archives or in an institution’s public archive of unpublished theses will not be considered grounds for declaring a manuscript published. If a manuscript is available as part of a digital publication such as a journal, technical series or some other entity to which a library can subscribe (especially if that publication has an ISSN or ISBN), we will consider that the manuscript has been published and is thus not eligible for consideration by our journals. A partial test for prior publication is whether the manuscript has appeared in some entity with archival value so that it is permanently available to reasonably diligent scholars. A necessary test for prior publication is whether the author can legally transfer copyright to ESA.

So I asked Philippe to explain his tweet:

This got me a little riled up so I broadcast my displeasure:

And then Jarrett Byrnes questioned where this was coming from given the stated policy:

So I emailed ESA to check and, sure enough, preprints on arXiv and similar preprint servers are considered prior publication and therefore cannot be submitted to ESA journals, despite the fact that this isn’t a problem for a few journals you may have heard of including Science, Nature, PNAS, and PLoS Biology. ESA (to their credit) has now clarified this point on their website (emphasis mine; thanks to Jaime Ashander for the heads up):

A posting of a manuscript or thesis on an author’s personal or home institution’s website or ftp site generally will not be considered previous publication. Similarly posting of a “working paper” in an institutional repository is allowed so long as at least one of the authors is affiliated with that institution. However, if a manuscript is available as part of a digital publication such as a journal, technical series, or some other entity to which a library can subscribe (especially if that publication has an ISSN or ISBN), we will consider that the manuscript has been published and is thus not eligible for consideration by our journals. Likewise, if a manuscript is posted in a citable public archive outside the author’s home institution, then we consider the paper to be self-published and ineligible for submission to ESA journals. Finally, a necessary test for prior publication is whether the author can legally transfer copyright to ESA.

In my opinion the idea that a preprint is “self-published” and therefore represents prior publication is poorly justified* and not in the best interests of science, and I’m not the only one:

So now I’m hoping that Jarrett is right:

and that things might change (and hopefully soon). If you know someone on the ESA board, please point them in the direction of this post.

UPDATE: Just as I was finishing working on this post ESA responded to the tweet stream from the last few days:

I’m very excited that ESA is reviewing their policies in this area. As I should have said in the original post, I have, up until this year, been quite impressed with ESA’s generally open, and certainly pro-science policies. This last year or so has been a bad one, but I’m hoping that’s just a lag in adjusting to the new era in scientific publishing.

UPDATE 2: ESA has announced that they have changed their policy and will now consider articles with preprints.

———————————————————————————————————————————————————————–

*I asked ESA if they wanted to clarify their justification for this policy and haven’t heard back (though it has been less than 2 days). If they get back to me I’ll update or add a new post.
   

Three ways to improve impact factors

It’s that time of year again when the new Impact Factor values are released. This is such a big deal to a lot of folks that it’s pretty hard to avoid hearing about it. We’re not the sort of folks that object to the use of impact factors in general – we are scientists after all and part of being a scientist is quantifying things. However, if we’re going to quantify things it is incumbent upon us to try to do it well and there are several things that we need to address if we are going to have faith in our measures of journal quality.

1. Stop using the impact factor; use Eigenfactor-based metrics instead

The impact factor simply determines the number of papers that cite another paper and calculates the average. This might have been a decent approach when the IF was first invented, but it’s a terrible approach now. The problem is that according to network theory, and some important applications thereof (e.g., Google), it is also important to take into account the importance of the papers/journals that are doing the citing. Fortunately we now have metrics that do this properly: the Eigenfactor and associated Article Influence Score. These are even reported by ISI right next to the IF.

Here’s a quick way to think about this. You have two papers, one that has been cited 30 times by papers that are never cited, and one that has been cited 30 times by papers that are themselves each cited 30 times. If you think the two papers are equally important, then please continue using the impact factor based metrics. If you think that the second paper is more important then please never mention the words “impact factor” again and start focusing on better approaches for quantifying the influence of nodes in a network.
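If you want to see the two-paper thought experiment in numbers, here is a rough sketch. It is not the actual Eigenfactor calculation; it is a simplified PageRank-style iteration on a toy citation network, and the paper names, the damping value of 0.85, and the `pagerank` helper are all assumptions made up for illustration.

```python
def pagerank(citations, damping=0.85, iterations=100):
    """Toy PageRank-style score. `citations` maps each paper to the papers it cites."""
    papers = sorted(set(citations) | {p for refs in citations.values() for p in refs})
    n = len(papers)
    score = {p: 1.0 / n for p in papers}
    for _ in range(iterations):
        # Papers that cite nothing spread their score evenly over every paper.
        dangling = sum(score[p] for p in papers if not citations.get(p))
        new = {p: (1 - damping) / n + damping * dangling / n for p in papers}
        # Every citing paper splits its score among the papers it cites.
        for citing, refs in citations.items():
            for cited in refs:
                new[cited] += damping * score[citing] / len(refs)
        score = new
    return score


# Hypothetical network: paper "A" is cited by 30 papers that nobody cites,
# paper "B" is cited by 30 papers that are each cited 30 times themselves.
citations = {}
for i in range(30):
    citations[f"a{i}"] = ["A"]
    citations[f"b{i}"] = ["B"]
    for j in range(30):
        citations[f"c{i}_{j}"] = [f"b{i}"]

# Raw citation counts, which is all an impact-factor-style average looks at.
raw_counts = {"A": 0, "B": 0}
for refs in citations.values():
    for cited in refs:
        if cited in raw_counts:
            raw_counts[cited] += 1

scores = pagerank(citations)
print("raw citation counts:", raw_counts)
print("network score, A:", scores["A"])
print("network score, B:", scores["B"])
```

Both papers have the same raw count, but the network score for B comes out higher because the papers citing it are themselves cited.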

2. Separate reviews (and maybe methods) from original research

We’ve known pretty much forever that reviews are cited more than original research papers, so it doesn’t make sense to compare review journals to non-review journals. While it’s easy to just say that TREE and Ecology are apples and oranges, the real problem is journals that mix reviews and original research. Since reviews are more highly cited, just changing the mix of these two article types can manipulate the impact factor. Sarah Supp and I have a paper on this if you’re interested in seeing some science and further commentary on the issue. The answer is easy: separate the analyses for review papers. It has also been suggested that methods papers have higher citation rates as well, but as I admit in my back and forth with Bob O’Hara (the relevant part of which is still awaiting moderation as I’m posting) there doesn’t seem to be any actual research on this to back it up.

3. Solve the problem of metrics that are strongly influenced by the number of papers

In the citation analysis of individual scientists there has always been the problem of how to deal with the number of papers. The total number of citations isn’t great since one way to get a large number of citations is to write a lot of not particularly valuable papers. The average number of citations per paper is probably even worse because no one would argue that a scientist who writes a single important paper and then stops publishing is contributing maximally to the progress of science.

In journal level citation analyses these two end points have up until recently been all we had, with ISI choosing to focus on the average number of citations per paper and Eigenfactor the total number of citations [1]. The problem is that these approaches encourage gaming by journals to publish either the most or fewest papers possible. Since the issues with publishing too many papers are obvious I’ll focus on the issue of publishing too few. Assuming that journals have the ability to predict the impact of individual papers [2], the best way to maximize per article measures like the impact factor is to publish as few papers as possible. Adding additional papers simply dilutes the average citation rate. The problem is that by doing so the journal is choosing to have less influence on the field (influence it could gain by adding more papers of largely equivalent quality) in favor of having a higher perceived impact. Think about it this way. Is a journal that publishes a total of 100 papers that are cited 5 times each really more important than a journal that publishes 200 papers, 100 of which are cited 5 times each and 100 that are cited 4 times each? I think that the second journal is more important, and that’s why I’m glad to see that Google Scholar is focusing on the kinds of integrative metrics (like the h-index) that we use to evaluate individual researchers.
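To put numbers on the two hypothetical journals, here is a quick sketch of the per-paper average next to the total. The `summarize` helper and the variable names are just illustrative, not anyone's official metric.

```python
journal_a = [5] * 100                # 100 papers, each cited 5 times
journal_b = [5] * 100 + [4] * 100    # the same 100 papers plus 100 cited 4 times each


def summarize(citation_counts):
    """Return the number of papers, total citations, and citations per paper."""
    total = sum(citation_counts)
    return {
        "papers": len(citation_counts),
        "total": total,
        "mean": total / len(citation_counts),
    }


summary_a = summarize(journal_a)
summary_b = summarize(journal_b)
print(summary_a)
print(summary_b)
```

The per-paper average favors the first journal (5 versus 4.5) even though the second journal contains everything the first one does plus another 400 citations' worth of papers (900 total versus 500).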

Moving forward

The good news is that we do have better metrics that are available right now. The first thing that we should do is start promoting those instead of the metric that shall not be named. We should also think about improving these metrics further. If they’re worth talking about, they are worth improving. I’d love to see a combination of the network approaches in Eigenfactor with the approaches to solving the number of publications problem taken by Google. Of course, more broadly, we are already in the process of moving away from journal level metrics and focusing more on the impact of individual papers. I personally prefer this approach and think that it’s good for science, but I’ll leave my thoughts on that for another day.

UPDATE: Point 3 relates to two great pieces in Ideas in Ecology and Evolution, one by Lonnie Aarssen and one by David Wardle.

UPDATE 2: Fixed the broken link to the “Why Eigenfactor?” page.

———————————————————————————————————————————

[1] Both sets of metrics include both approaches, with total citations available from ISI and the Article Influence Score serving as the per paper equivalent of the Eigenfactor; it’s just that they don’t seem to get as much… um… attention.

[2] And if they didn’t then all we’re measuring is how well different journals game the system plus some positive feedback where journals that are known to be highly cited garner more readers and therefore more future citations.

Why computer labs should never be controlled by individual colleges/departments

Some time ago in academia we realized that it didn’t make sense for individual scientists or even entire departments to maintain their own high performance computing resources. Use of these resources by an individual is intensive but sporadic, and maintenance of the resources is expensive [1]. So universities soon realized they were better off having centralized high performance computing centers: computing resources were available when needed, and the averaging effect of having large numbers of individuals using the same computers meant that the machines didn’t spend much time sitting idle. This was obviously a smart decision.

So, why haven’t universities been smart enough to centralize an even more valuable computational resource, their computer labs?

As any student of Software Carpentry will tell you, it is far more important to be able to program well than it is to have access to a really large high performance computing center. This means that the most important computational resource a university has is the classes that teach their students how to program, and the computer labs on which they rely.

At my university [2] all of the computer labs on campus are controlled by either individual departments or individual colleges. This means that if you want to teach a class in one of them you can’t request it as a room through the normal scheduling process, you have to ask the cognizant university fiefdom for permission. This wouldn’t be a huge issue, except that in my experience the answer is typically a resounding no. And it’s not a “no, we’re really sorry but the classroom is booked solid with our own classes,” it’s “no, that computer lab is ours, good luck” [3].

And this means that we end up wasting a lot of expensive university resources. For example, last year I taught in a computer lab “owned” by another college [4]. I taught in the second class slot of a four slot afternoon. In the slot before my class there was a class that used the room about four times during the semester (out of 48 class periods). There were no classes in the other two afternoon slots [5]. That means that classes were being taught in the lab only 27% of the time or 2% of the time if I hadn’t been granted an exception to use the lab [6].
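For anyone who wants to check the arithmetic, here is a small sketch. It assumes each of the four afternoon slots had 48 class periods over the semester and that my class met in every one of its 48 periods; the variable names are just for illustration.

```python
slots_per_afternoon = 4
periods_per_slot = 48                      # assumed to be the same for every slot
available_periods = slots_per_afternoon * periods_per_slot

other_class_periods = 4                    # the class that met about four times
my_class_periods = 48                      # assumes my class used every period in its slot

with_my_class = (other_class_periods + my_class_periods) / available_periods
without_my_class = other_class_periods / available_periods

print(f"with my class: {with_my_class:.0%}")        # 27%
print(f"without my class: {without_my_class:.0%}")  # 2%
```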

Since computing skills are increasingly critical to many areas of science (and everything else for that matter) this territoriality with respect to computer labs means that they proliferate across campus. The departments/colleges of Computer Science, Engineering, Social Sciences, Natural Resources and Biology [7] all end up creating and maintaining their own computer labs, and those labs end up sitting empty (or being used by students to send email) most of the time. This is horrifyingly inefficient in an era where funds for higher education are increasingly hard to come by and where technology turns over at an ever increasing rate. Which [8] brings me to the title of this post. The solution to this problem is for universities to stop allowing computer labs to be controlled by individual colleges/departments in exactly the same way that most classrooms are not controlled by colleges/departments. Most universities have a central unit that schedules classrooms and classes are fit into the available spaces. There is of course a highly justified bias to putting classes in the buildings of the cognizant department, but large classes in particular may very well not be in the department’s building. It works this way because if it didn’t then the university would be wasting huge amounts of space having one or more lecture halls in every department, even if they were only needed a few hours a week. The same issue applies to computer labs, only they are also packed full of expensive electronics. So please universities, for the love of all that is good and right and simply fiscally sound in the world, start treating computer labs like what they are: really valuable and expensive classrooms.

—————————————————-

[1] Think of a single scientist who keeps 10 expensive computers, only uses them a total of 1-2 months per year, but when he does the 10 computers aren’t really enough so he has to wait a long time to finish the analysis.

[2] And I think the point I’m about to make is generally true; at least it has been at several other universities I’ve worked at over the years.

[3] Or in some cases something more like “Frak you. You fraking biologists have no fraking right to teach anyone a fraking thing about fraking computers.” Needless to say, the individual in question wasn’t actually saying frak, but this is a family blog.

[4] As a result of a personal favor done for one administrator by another administrator.

[5] I know because I took advantage of this to hold my office hours in the computer lab following class.

[6] To be fair it should be noted that this and other computer labs are often used by students for doing homework (along with other less educationally oriented activities) when classes are not using the rooms, but in this case the classroom was a small part of a much larger lab and since I never witnessed the non-classroom portion of the lab being filled to capacity, the argument stands.

[7] etc., etc., etc.

[8] finally…

Putting the first back in first author

UPDATE: As of April 2012 Wiley has now changed their feeds to include the full list of authors. Thanks to Brady Allred for letting us know.

An open letter to John Wiley & Sons Inc.

Dear Wiley,

I like a lot of things that you do, but a few months ago you quietly changed your RSS feeds in a way that is both disrespectful and frankly not good for your business. You started including only the last author’s name in the RSS feed. This is a bad idea for three reasons:

  1. It shows a complete lack of respect for (or understanding of) a number of scientific disciplines that do not have a strong last author tradition (including ecology; a field in which you publish a large proportion of the journals). If you do this for a paper from my field then most of the time you are publishing the name of the least significant contributor.
  2. Even in disciplines (or labs) where there is a last author tradition, not including the name of the (often junior) person who did most of the work is just disrespectful. Yes, maybe you’ll attract more click-throughs with a more senior name, but the goal in scientific fields has always been to provide credit where credit is due and you are failing to honor that tradition.
  3. Finally (and worst of all from your perspective), you are costing yourself readers. One of the things I consider when deciding whether to read a paper is who the authors are. At least in fields like mine I will rarely see a name associated with a paper that is meaningful to me since the last author may well be an undergraduate or a tech.

In case you think this is just one person’s opinion, we took a quick informal poll a little while ago. Of 37 respondents, 100% agreed that if you are going to list a single author’s name with a paper it should be the first author.

So please, either switch back to using the first author’s name or, better yet, actually list the entire author line. Seeing someone’s name whose work we respect will encourage us to click-through to the paper regardless of where that name occurs in the author line.

Regards,

Ethan White (and the readers of Jabberwocky Ecology)

P.S. Also, we know that the RSS feed includes the abstract. We don’t need it in large, bold, capitalized letters at the top of every feed.

Some days…

Some days I really wonder whether the bureaucratic infrastructure at institutions of higher education has any idea whatsoever that their job is to support the research and teaching missions of the university.

A message to journal editors/managers about RSS feeds

  1. If you don’t have an easily accessible RSS feed available (and by easily accessible I mean in the browser’s address bar on your journal’s main page) for your journal’s Table of Contents (TOCs), there is a certain class of readers who will not keep track of your TOCs. This is because receiving this information via email is outdated and inefficient and if you are in the business of content delivery it is, at this point, incompetent for you to not have this option (it’s kind of like not having a website 10 years ago).
  2. If, for some technophobic reason, you refuse to have an RSS feed, then please, pretty please with sugar on top, don’t hide the ability to subscribe to the TOCs behind a username/password wall. All you need is a box for people to add their email addresses to for subscribing and a prominent unsubscribe link in the emails (if you are really paranoid you can add a confirmation email with a link that needs to be followed to confirm the subscription).
  3. Most importantly. Please, for the love of all that is good and right in the world, DO NOT START AN RSS FEED AND THEN STOP UPDATING IT. Those individuals who track a large number of feeds in their feed readers will not notice that you stopped updating your feed for quite some time. You are losing readers when you do this.
  4. If you have an RSS feed that is easily accessible (congratulations, you’re ahead of many Elsevier journals) please try to maximize the amount of information it provides. There are three critical pieces of information that should be included in every TOCs feed:
    1. The title (you all manage to do this one OK)
    2. All of the authors’ names. Not just the first author. Not just the first and last author. All of the authors. Seriously, part of the decision making process when it comes to choosing whether or not to take a closer look at a paper is who the authors are. So, if you want to maximize the readership of papers, include all of the authors’ names in the RSS feed.
    3. The abstract. I cannot fathom why you would exclude the abstract from your feed, other than to generate click throughs to your website. Since those of you doing this (yes, Ecology, I’m talking about you) aren’t running advertising, this isn’t a good reason, since you can communicate the information just as well in the feed (and if you’re using website visits as some kind of metric, don’t worry, you can easily track how many people are subscribed to your feed as well).

If this seems a bit harsh, whiny, etc., then keep this in mind. In the last month I had over 1000 new publications come through my feed reader and another 100 or so in email tables of contents. This is an incredible amount of material just to process, let alone read. If journals want readers to pay attention to their papers it is incumbent upon them to make it as easy as possible to sort through this deluge of information and allow their readership to quickly and easily identify papers of interest. Journals that don’t do this are hurting themselves as well as their readers.

Experimental ecology is dead, long live experimental ecology!

I read a handful of experimental ecology papers the other day. I liked some of them and didn’t like some of them. It wasn’t that there was anything inherently wrong with the ones I didn’t like, they just didn’t fit in with my world view.

Yeah, this doesn’t make any sense to me either, but apparently that’s how we’re using this phrase these days.

P.S. I was going to let this one go until Ecotone used the original post to question “the reality (or not) of macroecology as its own discipline.” There’s nothing wrong with creative titles (we enjoy them here at Jabberwocky), but when contrasted with EEB & Flow’s other posts from ESA it’s not surprising that Ecotone took this as being a passive-aggressive critique of the state of the field. My main concern is that EEB & Flow seems to conflate an important methodological approach with particular interpretations of ecological process resulting from an application of that approach. Just because I disagree with a particular paper using an experiment doesn’t lead me to have “an unsure feeling about this field.” I mean really.