Monthly Archives: July 2011

The war over academic publishing has officially begun

The last week has been an interesting one for academic publishing. First, a 24-year-old programmer named Aaron Swartz was arrested for allegedly breaking into MIT’s network and downloading 5 million articles from JSTOR. Given his background, it has been surmised that he planned on making the documents publicly available. He faces up to 35 years in federal prison.

In response to the arrest, Gregory Maxwell, a “technologist” and hobbyist scientist, uploaded nearly 20,000 JSTOR [1] articles from the Philosophical Transactions of the Royal Society to The Pirate Bay, a BitTorrent file-sharing site infamous for facilitating the illegal sharing of music and movies. As explanation for the upload, Maxwell posted a scathing, and generally trenchant, critique of the current academic publishing system that I am going to reproduce here in its entirety so that those uncomfortable with [2], or blocked from, visiting The Pirate Bay can read it [3]. In it he notes that since all of the articles he posted were published prior to 1923, they are all in the public domain.

This archive contains 18,592 scientific publications totaling
33GiB, all from Philosophical Transactions of the Royal Society
and which should be  available to everyone at no cost, but most
have previously only been made available at high prices through
paywall gatekeepers like JSTOR.

Limited access to the  documents here is typically sold for $19
USD per article, though some of the older ones are available as
cheaply as $8. Purchasing access to this collection one article
at a time would cost hundreds of thousands of dollars.

Also included is the basic factual metadata allowing you to
locate works by title, author, or publication date, and a
checksum file to allow you to check for corruption.

I've had these files for a long time, but I've been afraid that if I
published them I would be subject to unjust legal harassment by those who
profit from controlling access to these works.

I now feel that I've been making the wrong decision.

On July 19th 2011, Aaron Swartz was criminally charged by the US Attorney
General's office for, effectively, downloading too many academic papers
from JSTOR.

Academic publishing is an odd system - the authors are not paid for their
writing, nor are the peer reviewers (they're just more unpaid academics),
and in some fields even the journal editors are unpaid. Sometimes the
authors must even pay the publishers.

And yet scientific publications are some of the most outrageously
expensive pieces of literature you can buy. In the past, the high access
fees supported the costly mechanical reproduction of niche paper journals,
but online distribution has mostly made this function obsolete.

As far as I can tell, the money paid for access today serves little
significant purpose except to perpetuate dead business models. The
"publish or perish" pressure in academia gives the authors an impossibly
weak negotiating position, and the existing system has enormous inertia.

Those with the most power to change the system--the long-tenured luminary
scholars whose works give legitimacy and prestige to the journals, rather
than the other way around--are the least impacted by its failures. They
are supported by institutions who invisibly provide access to all of the
resources they need. And as the journals depend on them, they may ask
for alterations to the standard contract without risking their career on
the loss of a publication offer. Many don't even realize the extent to
which academic work is inaccessible to the general public, nor do they
realize what sort of work is being done outside universities that would
benefit by it.

Large publishers are now able to purchase the political clout needed
to abuse the narrow commercial scope of copyright protection, extending
it to completely inapplicable areas: slavish reproductions of historic
documents and art, for example, and exploiting the labors of unpaid
scientists. They're even able to make the taxpayers pay for their
attacks on free society by pursuing criminal prosecution (copyright has
classically been a civil matter) and by burdening public institutions
with outrageous subscription fees.

Copyright is a legal fiction representing a narrow compromise: we give
up some of our natural right to exchange information in exchange for
creating an economic incentive to author, so that we may all enjoy more
works. When publishers abuse the system to prop up their existence,
when they misrepresent the extent of copyright coverage, when they use
threats of frivolous litigation to suppress the dissemination of publicly
owned works, they are stealing from everyone else.

Several years ago I came into possession, through rather boring and
lawful means, of a large collection of JSTOR documents.

These particular documents are the historic back archives of the
Philosophical Transactions of the Royal Society - a prestigious scientific
journal with a history extending back to the 1600s.

The portion of the collection included in this archive, ones published
prior to 1923 and therefore obviously in the public domain, total some
18,592 papers and 33 gigabytes of data.

The documents are part of the shared heritage of all mankind,
and are rightfully in the public domain, but they are not available
freely. Instead the articles are available at $19 each--for one month's
viewing, by one person, on one computer. It's a steal. From you.

When I received these documents I had grand plans of uploading them to
Wikipedia's sister site for reference works, Wikisource - where they
could be tightly interlinked with Wikipedia, providing interesting
historical context to the encyclopedia articles. For example, Uranus
was discovered in 1781 by William Herschel; why not take a look at
the paper where he originally disclosed his discovery? (Or one of the
several follow on publications about its satellites, or the dozens of
other papers he authored?)

But I soon found the reality of the situation to be less than appealing:
publishing the documents freely was likely to bring frivolous litigation
from the publishers.

As in many other cases, I could expect them to claim that their slavish
reproduction - scanning the documents - created a new copyright
interest. Or that distributing the documents complete with the trivial
watermarks they added constituted unlawful copying of that mark. They
might even pursue strawman criminal charges claiming that whoever obtained
the files must have violated some kind of anti-hacking laws.

In my discreet inquiry, I was unable to find anyone willing to cover
the potentially unbounded legal costs I risked, even though the only
unlawful action here is the fraudulent misuse of copyright by JSTOR and
the Royal Society to withhold access from the public to that which is
legally and morally everyone's property.

In the meantime, and to great fanfare as part of their 350th anniversary,
the RSOL opened up "free" access to their historic archives - but "free"
only meant "with many odious terms", and access was limited to about
100 articles.

All too often journals, galleries, and museums are becoming not
disseminators of knowledge - as their lofty mission statements
suggest - but censors of knowledge, because censoring is the one thing
they do better than the Internet does. Stewardship and curation are
valuable functions, but their value is negative when there is only one
steward and one curator, whose judgment reigns supreme as the final word
on what everyone else sees and knows. If their recommendations have value
they can be heeded without the coercive abuse of copyright to silence
competition.

The liberal dissemination of knowledge is essential to scientific
inquiry. More than in any other area, the application of restrictive
copyright is inappropriate for academic works: there is no sticky question
of how to pay authors or reviewers, as the publishers are already not
paying them. And unlike 'mere' works of entertainment, liberal access
to scientific work impacts the well-being of all mankind. Our continued
survival may even depend on it.

If I can remove even one dollar of ill-gained income from a poisonous
industry which acts to suppress scientific and historic understanding,
then whatever personal cost I suffer will be justified--it will be one
less dollar spent in the war against knowledge. One less dollar spent
lobbying for laws that make downloading too many scientific papers
a crime.

I had considered releasing this collection anonymously, but others pointed
out that the obviously overzealous prosecutors of Aaron Swartz would
probably accuse him of it and add it to their growing list of ridiculous
charges. This didn't sit well with my conscience, and I generally believe
that anything worth doing is worth attaching your name to.

I'm interested in hearing about any enjoyable discoveries or even useful
applications which come of this archive.

----
Greg Maxwell - July 20th 2011
gmaxwell@gmail.com  Bitcoin: 14csFEJHk3SYbkBmajyJ3ktpsd2TmwDEBb

These stories have been covered widely and the discussion has been heavy on Twitter and in the blogosphere. The important part of this discussion for academic publishing is that it has brought many of the absurdities of the current academic publishing system into the public eye, and a lot of people are shocked and unhappy [4]. This is all happening at the same time that Britain is finally standing up to the big publishing companies as their profits [5] and business models increasingly hamper rather than benefit the scientific process, and serious questions are raised about whether we should be publishing in peer-reviewed journals at all. I suspect that we will look back on 2011 as the tipping point year when academic publishing changed forever.

————————————————————————————————————————————————————————————

[1] In an interview with Wired Campus, JSTOR claimed that these aren’t technically their articles: even though JSTOR did digitize these files, and each file includes an indication of JSTOR’s involvement, the files lack JSTOR’s cover page, so they’re not really JSTOR’s files, they’re the Royal Society’s files. Which first made me think “Wow, that’s about the lamest duck and cover excuse I’ve ever heard” and then “Hey, so if I just delete the cover page off a JSTOR file then apparently they surrender all claim to it. Nice!”

[2] In addition to the questionable legality of the site, some of the advertising there isn’t exactly workplace appropriate.

[3] I think that given the context he would be fine with us reprinting the entire statement. I’ve done some very minor cleaning up of some junk codes for readability. The original is available here.

[4] But also conflicted about the behavior of the individuals in question.

[5] ~$120 million/year for Wiley and ~$1 billion/year for Reed Elsevier (source LibraryJournal.com).

A Plea for Pluralism

As you may have seen earlier either on Jabberwocky, EEB and Flow, or over at Oikos’ new blog, the most recent piece about how some branch of ecology is ruining ecology has caused some discussion in the blogosphere. Every time one of these comes out, I tell myself I’m going to write a blog post, but then I think, “that’s just one cranky person,” and I get distracted doing science that is killing ecology (given the plethora of opinions about what is ruining our field, odds are you too are killing ecology, regardless of what type of science you do). But as these opinion pieces keep emerging, I have increasingly come to feel that these debates on the ‘best’ approach reflect a very limited view of the scientific endeavor.

Every approach (field ecology, microcosms, theory, meta-analysis, macroecology, insert your favorite approach that I’ve missed here) is fundamentally limited in its scope, focus, and ability to divine answers from nature, yet has unique strengths in what it allows us to do. Theory is abstracted from nature, but can also provide a concrete set of expectations and processes for empiricists to work with. Microcosms, while similarly critiqued for their abstraction from reality, can also give the clearest indication about whether ideas and theories work (or don’t) under the most ideal scenarios. Field ecology (particularly experimental manipulation) has been considered the gold standard for its ability to show cause and effect in ‘real’ ecosystems, but it is also messy, expensive, and time-consuming (I say this thinking of my own field site; perhaps yours is less so), and in a natural setting it is impossible to have control over all of the important (and potentially confounding) variables. Macroecology and meta-analysis allow us to step back from individual systems and taxa to ask whether patterns and processes are general across nature, general within certain subsets of systems, or unpredictably important (and unimportant). However, they lack the ability to manipulate nature directly to tease out cause and effect more definitively.

Because all approaches have limitations, the exclusive use of any one approach is guaranteed to give us a limited and possibly flawed view of reality. In the scientific utopia that lives in my head, these different approaches to addressing scientific questions live together harmoniously: results from one approach generate questions best addressed with another approach, and the cumulative evidence from all approaches gives us a more complete understanding of nature. When I read opinion pieces that advocate for a particular approach above all others, I worry that this utopia only exists in my head. After all, those opinion pieces never seem to be balanced by a counterargument for plurality. But then sometimes I read things – often on the internet – and I think: it may be in my head, but maybe my head is not the only one that dream resides in.

Bridging, not building, divides in ecology [Things you should read]

There is an excellent post over at EEB & Flow on the empirical divide, inspired by an editorial by David Lindenmayer and Gene Likens in the most recent ESA Bulletin, titled “Losing the Culture of Ecology”. It was great to see some thoughtful and data-driven consideration of the idea that we should choose to emphasize one broad area of ecology over another. I really like their conclusion that these “divides” are really driven by other things:

The tensions between “indoor ecology” and field ecology have been conflated with changes in the philosophy of modern ecology, in the difficulties of obtaining funding and publishing as a modern ecologist, and some degree of thinking the “grass is always greener” in the other field. In fact, the empirical divide may not be as wide as is often suggested.

This post motivated some discussion in the comments and on Twitter, and a nice follow-up post by Jeremy Fox at the Oikos blog.

It’s all pretty short and well worth the read.

The Ecological Data Wiki

Here at Weecology we’re really into open science and that’s why we’re excited to announce our first serious attempt to facilitate open science beyond the confines of our own research – The Ecological Data Wiki.

The idea behind this project is simple. There is a large and rapidly increasing amount of ecology-related data available thanks to initiatives sponsoring the collection of large-scale data and efforts to increase the publication of already collected datasets. As a result, progress in ecology is increasingly limited by the speed at which we can find and use existing data. The Ecological Data Wiki is intended to serve as a central source for identifying datasets that are useful to the study of ecology and quickly figuring out the best ways to use them. The idea is to use the knowledge and effort of the entire ecological community to compile this information rather than relying on each scientist to contribute information for their own studies. Just think of it as the Wikipedia of ecology data.

We’re just getting things off the ground, but we’d love it if you’d come by, take a look around, and, if you think you can be of help, sign up, learn how to get started, and contribute. We’re currently in private beta, but you can generally expect to have an account activated within about 24 hours.

Let us know what you think about the site and any suggestions you have in the comments. If you’d like to chat about the wiki (or anything else) in person, Ethan will be presenting on this during the Wednesday poster session at ESA.

Michael Nielsen on the importance and value of Open Science

We are pretty excited about what modern technology can do for science and in particular the potential for increasingly rapid sharing of, and collaboration on, data and ideas. It’s the big picture that explains why we like to blog, tweet, publish data and code, and we’ve benefited greatly from others who do the same. So, when we saw this great talk by Michael Nielsen about Open Science, we just had to share.

(via, appropriately enough, @gvwilson and @TEDxWaterloo on Twitter)
