The last week has been an interesting one for academic publishing. First, a 24-year-old programmer named Aaron Swartz was arrested for allegedly breaking into MIT’s network and downloading 5 million articles from JSTOR. Given his background, it has been surmised that he planned on making the documents publicly available. He faces up to 35 years in federal prison.

In response to the arrest, Gregory Maxwell, a “technologist” and hobbyist scientist, uploaded nearly 20,000 JSTOR [1] articles from the Philosophical Transactions of the Royal Society to The Pirate Bay, a BitTorrent file-sharing site infamous for facilitating the illegal sharing of music and movies. As explanation for the upload, Maxwell posted a scathing, and generally trenchant, critique of the current academic publishing system that I am going to reproduce here in its entirety so that those uncomfortable with [2], or blocked from, visiting The Pirate Bay can read it [3]. In it he notes that since all of the articles he posted were published prior to 1923 they are all in the public domain.

This archive contains 18,592 scientific publications totaling
33GiB, all from Philosophical Transactions of the Royal Society
and which should be available to everyone at no cost, but most
have previously only been made available at high prices through
paywall gatekeepers like JSTOR.

Limited access to the documents here is typically sold for $19
USD per article, though some of the older ones are available as
cheaply as $8. Purchasing access to this collection one article
at a time would cost hundreds of thousands of dollars.

Also included is the basic factual metadata allowing you to
locate works by title, author, or publication date, and a
checksum file to allow you to check for corruption.

I've had these files for a long time, but I've been afraid that if I
published them I would be subject to unjust legal harassment by those who
profit from controlling access to these works.

I now feel that I've been making the wrong decision.

On July 19th 2011, Aaron Swartz was criminally charged by the US Attorney
General's office for, effectively, downloading too many academic papers
from JSTOR.

Academic publishing is an odd system - the authors are not paid for their
writing, nor are the peer reviewers (they're just more unpaid academics),
and in some fields even the journal editors are unpaid. Sometimes the
authors must even pay the publishers.

And yet scientific publications are some of the most outrageously
expensive pieces of literature you can buy. In the past, the high access
fees supported the costly mechanical reproduction of niche paper journals,
but online distribution has mostly made this function obsolete.

As far as I can tell, the money paid for access today serves little
significant purpose except to perpetuate dead business models. The
"publish or perish" pressure in academia gives the authors an impossibly
weak negotiating position, and the existing system has enormous inertia.

Those with the most power to change the system--the long-tenured luminary
scholars whose works give legitimacy and prestige to the journals, rather
than the other way around--are the least impacted by its failures. They
are supported by institutions who invisibly provide access to all of the
resources they need. And as the journals depend on them, they may ask
for alterations to the standard contract without risking their career on
the loss of a publication offer. Many don't even realize the extent to
which academic work is inaccessible to the general public, nor do they
realize what sort of work is being done outside universities that would
benefit by it.

Large publishers are now able to purchase the political clout needed
to abuse the narrow commercial scope of copyright protection, extending
it to completely inapplicable areas: slavish reproductions of historic
documents and art, for example, and exploiting the labors of unpaid
scientists. They're even able to make the taxpayers pay for their
attacks on free society by pursuing criminal prosecution (copyright has
classically been a civil matter) and by burdening public institutions
with outrageous subscription fees.

Copyright is a legal fiction representing a narrow compromise: we give
up some of our natural right to exchange information in exchange for
creating an economic incentive to author, so that we may all enjoy more
works. When publishers abuse the system to prop up their existence,
when they misrepresent the extent of copyright coverage, when they use
threats of frivolous litigation to suppress the dissemination of publicly
owned works, they are stealing from everyone else.

Several years ago I came into possession, through rather boring and
lawful means, of a large collection of JSTOR documents.

These particular documents are the historic back archives of the
Philosophical Transactions of the Royal Society - a prestigious scientific
journal with a history extending back to the 1600s.

The portion of the collection included in this archive, ones published
prior to 1923 and therefore obviously in the public domain, total some
18,592 papers and 33 gigabytes of data.

The documents are part of the shared heritage of all mankind,
and are rightfully in the public domain, but they are not available
freely. Instead the articles are available at $19 each--for one month's
viewing, by one person, on one computer. It's a steal. From you.

When I received these documents I had grand plans of uploading them to
Wikipedia's sister site for reference works, Wikisource - where they
could be tightly interlinked with Wikipedia, providing interesting
historical context to the encyclopedia articles. For example, Uranus
was discovered in 1781 by William Herschel; why not take a look at
the paper where he originally disclosed his discovery? (Or one of the
several follow on publications about its satellites, or the dozens of
other papers he authored?)

But I soon found the reality of the situation to be less than appealing:
publishing the documents freely was likely to bring frivolous litigation
from the publishers.

As in many other cases, I could expect them to claim that their slavish
reproduction - scanning the documents - created a new copyright
interest. Or that distributing the documents complete with the trivial
watermarks they added constituted unlawful copying of that mark. They
might even pursue strawman criminal charges claiming that whoever obtained
the files must have violated some kind of anti-hacking laws.

In my discreet inquiry, I was unable to find anyone willing to cover
the potentially unbounded legal costs I risked, even though the only
unlawful action here is the fraudulent misuse of copyright by JSTOR and
the Royal Society to withhold access from the public to that which is
legally and morally everyone's property.

In the meantime, and to great fanfare as part of their 350th anniversary,
the RSOL opened up "free" access to their historic archives - but "free"
only meant "with many odious terms", and access was limited to about
100 articles.

All too often journals, galleries, and museums are becoming not
disseminators of knowledge - as their lofty mission statements
suggest - but censors of knowledge, because censoring is the one thing
they do better than the Internet does. Stewardship and curation are
valuable functions, but their value is negative when there is only one
steward and one curator, whose judgment reigns supreme as the final word
on what everyone else sees and knows. If their recommendations have value
they can be heeded without the coercive abuse of copyright to silence

The liberal dissemination of knowledge is essential to scientific
inquiry. More than in any other area, the application of restrictive
copyright is inappropriate for academic works: there is no sticky question
of how to pay authors or reviewers, as the publishers are already not
paying them. And unlike 'mere' works of entertainment, liberal access
to scientific work impacts the well-being of all mankind. Our continued
survival may even depend on it.

If I can remove even one dollar of ill-gained income from a poisonous
industry which acts to suppress scientific and historic understanding,
then whatever personal cost I suffer will be justified--it will be one
less dollar spent in the war against knowledge. One less dollar spent
lobbying for laws that make downloading too many scientific papers
a crime.

I had considered releasing this collection anonymously, but others pointed
out that the obviously overzealous prosecutors of Aaron Swartz would
probably accuse him of it and add it to their growing list of ridiculous
charges. This didn't sit well with my conscience, and I generally believe
that anything worth doing is worth attaching your name to.

I'm interested in hearing about any enjoyable discoveries or even useful
applications which come of this archive.

- ----
Greg Maxwell - July 20th 2011  Bitcoin: 14csFEJHk3SYbkBmajyJ3ktpsd2TmwDEBb

These stories have been covered widely and the discussion has been heavy on Twitter and in the blogosphere. The important part of this discussion for academic publishing is that it has brought many of the absurdities of the current academic publishing system into the public eye, and a lot of people are shocked and unhappy [4]. This is all happening at the same time that Britain is finally standing up to the big publishing companies as their profits [5] and business models increasingly hamper rather than benefit the scientific process, and serious questions are raised about whether we should be publishing in peer-reviewed journals at all. I suspect that we will look back on 2011 as the tipping point year when academic publishing changed forever.


[1] In an interview with Wired Campus, JSTOR claimed that these aren’t technically their articles: even though JSTOR did digitize these files, and each file includes an indication of JSTOR’s involvement, the files lack JSTOR’s cover page, so they’re not really JSTOR’s files, they’re the Royal Society’s files. Which first made me think “Wow, that’s about the lamest duck and cover excuse I’ve ever heard” and then “Hey, so if I just delete the cover page off a JSTOR file then apparently they surrender all claim to it. Nice!”

[2] In addition to the questionable legality of the site, some of the advertising there isn’t exactly workplace appropriate.

[3] I think that given the context he would be fine with us reprinting the entire statement. I’ve done some very minor cleaning up of some junk codes for readability. The original is available here.

[4] But also conflicted about the behavior of the individuals in question.

[5] ~$120 million/year for Wiley and ~$1 billion/year for Reed Elsevier (source)

16 Comments on “The war over academic publishing has officially begun”

  1. Hi Ethan,

    I feel very strongly that discussions of the appropriate funding model for academic publishing–because *somebody* has to pay for *someone* to perform the publishing functions that we want performed–ought to be divorced from discussion of Swartz’s rather ridiculous stunt, or the ridiculous follow-up stunts of his defenders.

    First of all, why the hell was JSTOR, a *non-profit*, the target? Yes, JSTOR charges money for articles, and subscriptions–because they have costs. Scanning costs money. Paying copyright fees costs money (JSTOR does not own the copyright on most of the articles they host). Maintaining and developing their services costs money. Paying their employees costs money. And as far as I’ve seen, none of Swartz’s defenders have access to JSTOR’s audited financials, so none of them have any standing to claim (as they appear to be doing) that JSTOR is overstating its costs in order to rip users off. And if you honestly believe that JSTOR really is a non-profit but should nevertheless still be lumped into the same boat with Elsevier (as Maxwell apparently does), you have lost all sense of judgment and proportion.

    Second of all, Swartz was employed at Harvard and had access to JSTOR through the Harvard library. He had no connection to MIT. And he chooses to *break into a closet* at MIT and (as a predictable side effect of his actions) take down *their* JSTOR access? I don’t care what injustice you’re protesting or what political point you’re trying to make, you don’t break into your neighbor’s house and hack into your neighbor’s network to make it.

    Third of all, Swartz’s own organization put out a press release immediately after his arrest saying that he’d been arrested for the equivalent of “checking too many books out of the library”. You don’t have to like our current publishing system at all to see that that analogy is totally disingenuous to the point of laughability. I don’t believe in making heroes or martyrs out of people who can’t be bothered to defend themselves honestly.

    Fourth of all, if this was supposed to be some kind of civil disobedience-type publicity stunt, it was the crappiest one in history, and not just because of the bizarre choice of target. Swartz worked hard to try to hide his actions and avoid arrest–the opposite of what someone truly engaged in principled civil disobedience would do–and he also downloaded far more stuff (with collateral damage to MIT) than necessary to make his point. About the only thing Maxwell has going for him is that he’s prepared to take responsibility for his actions.

    Fifth of all, nobody mandates that publishers make their stuff available through JSTOR. If publishers see JSTOR as vulnerable, there’s a risk they pull out, which reduces rather than increases people’s access to the material JSTOR hosts.

    Sixth of all, Maxwell apparently has forgotten what the word “censorship” means, and his usage of that term is not justifiably strong rhetoric, it’s merely an insult to everyone who ever has been truly censored. Again, whatever you think of the current funding model for academic publishing, if you think it amounts to “censorship”, you have lost all sense of judgment and proportion, to the point where you can no longer remember the meaning of common English words.

    Seventh of all, if these stupid stunts have the side effect of causing publishers to pull out of JSTOR, then the actual effect of these stunts will have been to reduce rather than increase access to the scientific literature.

    Eighth of all, countries like Britain seem to be making progress in changing the system without any help from hackers. So someone explain to me again why Swartz’s and Maxwell’s stunts are necessary, sufficient, or even likely to achieve, or even likely to help achieve, what can’t be achieved in other ways?

    None of this is to say I think Swartz should spend 35 years in jail (and there’s no chance he will; prosecutors always aim high in order to give themselves room to negotiate down in plea bargaining), or that I think our current way of funding publishing is perfect or even good. In fact, I might have even sympathized with Swartz and Maxwell if they were inclined to put in the organizational effort required to build a ground-up protest movement against the current system. But instead, all they can seemingly be bothered to do is pull pointless, anarchist, lone-wolf hacking stunts against the wrong targets, and then put out press releases stuffed with transparently bad analogies in an attempt to evade responsibility for their actions (Swartz) or sow confusion and misinformation about how academic publishing works and how it could be changed (Maxwell).

    Until I get good answers to each of the points made above, I will have no sympathy for these turkeys.

  2. Whoops, seventh point duplicates the fifth. Sorry.

  3. Ethan,
    Thank you for providing Maxwell’s response.

    Our highly connected and networked society is good for many things, including improving communication. One thing it excels at, however, is knee-jerk responses, which often have the capability of being broadcast to everyone and archived forever. Sadly, this is what I classify Maxwell’s response as, along with silly. I believe I agree with Jeremy’s points. Furthermore, it is clear that Maxwell’s response was not to advance the ‘dissemination of knowledge’ but to ‘remove even one dollar of ill-gained income’ from publishers (‘the poisonous industry’). The tone of his statements makes it appear that he did it out of anger, spite, or even perhaps feelings of jealousy that he wasn’t the first ‘martyr.’

    If this is a war, as your title suggests, then I am not sure which side I wish to be on. Illegal actions in warfare can be justified, but I think this one might have been premature and may not accomplish much. A publicity stunt will only take you so far. As Jeremy said, why didn’t Swartz and Maxwell put in some effort and organize something fruitful? Or propose an alternative solution to the status quo? At least a compelling argument to publish in open access journals?

    And because I can’t resist, this blog is more ‘censored’ than anything JSTOR does (censored being used very loosely; moderated is more appropriate). Furthermore, if it is hosted by WORDPRESS, they have complete censoring power, as they can yank anything they choose. And they are making money off of it. A poisonous industry?? 🙂


  4. Wow, I guess I touched a nerve.

    First to respond to Brady’s last paragraph:
    1. We only moderate to keep out spam, which is why we only moderate first-time commenters (as you presumably noticed when your comment went through immediately). We never have and never will moderate a real comment (even when people call us jerks or insinuate that we censor our blog).
    2. We control the domain, and the content (which is backed up regularly), and could be back up again in a matter of hours if for some bizarre reason they decided to pull the plug.
    3. We pay to be ad free, so the only people they are making money off of are the two of us (and regardless, that revenue model doesn’t prevent access to information).
    4. Yes, I do think about these things a lot. They’re important. Thanks for asking.

    Second, nowhere in this post did I defend the actions of any of the individuals involved. In what circumstances illegal acts are justified is a complicated issue and in this situation I certainly don’t feel like I’m anywhere close to understanding the situation well enough to pass judgement on them. The point is that (whether one likes it or not) events like these tend to serve as flashpoints that cause issues that have been building to blow up. I am arguing, rightly or wrongly, that that is what is going to happen here.

    Third, the “war” that I am referring to is the battle to ensure that scientific publishing acts in such a way as to first and foremost maximize the distribution of knowledge. Of course this costs money and I don’t have a problem with publishers covering those costs (and even making a little profit on the side). However, a corporation’s goal is to maximize profit, and that pretty much by definition means that they aren’t in the business of maximizing the distribution of knowledge (check out some of Mike Rosenzweig’s excellent pieces on EER’s Citizen’s Page). If you don’t think this is an issue try checking out the price differences between independent society journals and the big corporate journals for library subscriptions sometime. I think you’ll find it enlightening. We have been failing to exercise control over our own system for knowledge distribution and in response to these “turkeys” we’re starting to be called out on this. I hope that we respond appropriately, with the good of science and the broader public distribution of knowledge in mind.

  5. As I was finishing my comment I had a sudden sense of deja vu. Ah, debating the finer points of academic publishing with Jeremy Fox right before ESA. Good times, good times.

  6. Hi Ethan,

    I had seen that you were careful not to actually defend Maxwell’s or Swartz’s actions, and I’m glad you didn’t. But posting Maxwell’s entire statement could easily give a different impression. In my view, people who care passionately about these issues, as you do (and as I do, though probably not as passionately), ought to be working hard to dissociate themselves from Maxwell and Swartz, whose actions are likely to inhibit rather than accelerate the change to a different model.

    Yes, I am well aware of the price differences between society and for-profit journals, and yes, it bothers me. But I am not entirely convinced that, because publishers are out to make money, by definition they’re not out to maximize the distribution of knowledge. Kind of depends what you mean by “maximize”. For instance, the scientific publishing industry was actually one of the first industries to move distribution online in a big way, because their customers wanted it (and are computer-savvy and have fast internet connections). Would that have happened as fast if, say, all scientific publishing was run on a non-profit basis by scientific societies, or if publishers were somehow limited by law to only taking the modest profits you are comfortable with? I don’t actually know the answer, but I don’t think the answer is obvious.

  7. Ethan,

    My comments were specifically concerning Maxwell’s response; I realized you weren’t prosecuting or defending him. My comments about censorship were to simply reiterate Jeremy’s point (the insidious claim that JSTOR or any other publisher/archiver of scientific journals is censoring). I apologize if offense was taken; it was never my intent to do so. I know you think about this a lot.

    I agree with you; this is a complicated issue and deserves more attention. I have seen the subscription costs for society journals vs bigger corporate journals, and I’ve even paid twice this year to publish in open access journals. Yes, it is a big difference, and yes, there are advantages and disadvantages to this. I just don’t think Maxwell’s response was worth anything. What good does it do? Swartz already got the attention, and as you said the flashpoint has perhaps been defined.

    Off to take the kiddos on a walk and enjoy some neighborhood ecology.


  8. @Jeremy – Yes, these things are definitely complicated. Whenever I get into one of these conversations I always hold up Ecology Letters as an example of a for-profit journal that had a hugely positive impact on the field. I’m old enough to remember when 6 months was a reasonable time for a round of review. Ecology Letters changed all of that and we’re much better for it. My point is that the economics of the thing (see Rosenzweig’s papers) put Science’s goals and the for-profit journals’ goals at odds, and so maintaining a (in my opinion) stronger influence as a scientific community is important to make sure that things don’t veer too far in the direction of profit margins (which I believe, at the moment, they have).

    @Brady – No hard feelings, I just felt that the implication that we might be filtering discussion was serious enough that it required a clear and immediate response (regardless of the number of smiley faces).

    And now I too am off to enjoy some family time – at the farmer’s market.

  10. Brady,

    While I was out walking the dogs this afternoon I finally realized that I had really misread the intent of the last paragraph of your first comment and I apologize for taking it the wrong way. You were saying things that you viewed as ridiculous to highlight the extreme language in Maxwell’s piece. Having spent a lot of time thinking about how to make sure that our blog was a free, open, and robust mechanism for communication I read it as mean spirited and uninformed. I should have given you more credit.

    This sort of humor is always difficult to convey on the internet and I was reading it late at night (and then early in the morning), but I think there’s something more fundamental here that comes back to the point at hand. Folks have different views on how critical the openness of knowledge and discourse is. I think that you, Jeremy and I are probably all very close to one another on this, and yet we were far enough apart that I couldn’t even understand that you were making a joke. Folks like Swartz and (presumably) Maxwell are much further away, but that doesn’t mean they are inherently wrong. There is a well-established culture and ethics in corners of the computing community (one that I don’t even pretend to really understand) that places extremely high value on freedom of information and knowledge. We should all be careful about too quickly judging the actions or statements of individuals that we may not be readily able to understand.


  11. Thanks for your comments Ethan, I agree we’re not that far apart, and that we’re much further from Swartz and Maxwell. Which is kind of the point: I’m not sure it makes sense to talk about the “ethics” of a small portion of the computing community, the implication (presumably) being that that ethic merits some respect and consideration even if we don’t agree with it. Yes, some hackers put a high value on “free” information (meaning “information I personally didn’t have to pay for”)–but I don’t see why that value is owed any particular respect. Indeed, the fact that (in Swartz’s case) they’re willing to commit crimes like trespassing, breaking and entering, and evading arrest in the name of that value makes me less rather than more inclined to respect their values. I have no problem with someone who wants to refrain from judging them because all the facts have yet to be aired in court. But it sounds like you might be inclined to refrain from judging them even after the facts are aired in court?

    Sam Fleischacker’s (unfortunately out of print) book, The Ethics of Culture, is a thoughtful account of what a “culture” is and why its “ethics” might be worth taking seriously by members of different cultures. “Hackers” lack pretty much all of the elements that one might plausibly consider as marking out a separate “culture” worthy of moral consideration by non-members. They’re not a separate “culture”; they’re just people who are mostly like you and me, except they have different opinions about who should pay for information distribution.

  12. I’ll save my ‘ethics’ comments for another time, when my thoughts are better articulated in my own mind and well enough to broadcast to the public.

    Ethan’s remark concerning Ecology Letters brings up an excellent point: a problem in the ecological sciences was identified (lengthy review times), a solution was proposed (create a high quality journal with rapid review) and acted upon (first issue in July 1998), and the severity of the problem lessened (?; actual data on this would be useful, I’m sure it is out there somewhere). Contrasting this response to Swartz and Maxwell, specifically Maxwell’s knee-jerk torrent file upload, shows some long term commitment by scientists wishing to solve specific problems and improve their discipline, versus short term action-reaction publicity stunts (at least in my mind). But, the Ecology Letters example can help us a little more, I believe.

    What exactly is the problem/issue that we are struggling with?

    Toward the end of his response, Maxwell states that the ‘liberal dissemination of knowledge is essential to scientific inquiry’ and that ‘liberal access to scientific work impacts the well-being of all mankind. Our continued survival may even depend on it.’ I can’t argue too much with that, but I don’t think dissemination of knowledge is the issue here. My experience is limited, but at the three universities I’ve spent time at (all in the United States), I have had complete access to anything and everything I’ve needed. Of course things will vary from place to place, but I think it would be safe to say that at most major research institutions, access is not an issue; those that are involved in ‘our continued survival’ probably have access to just about every bit of scientific information that they need. But, it is important to remember that it is paid for.

    Is the issue a financial one? Judging by our comments, I think we agree that we don’t want this responsibility ourselves, and that paying someone to catalog, print, distribute, archive, etc. our results and conclusions is a good thing. Is it outrageously priced? Perhaps, and this would be an issue if money is restricted, or for smaller universities or companies. Data showing increases/decreases in library subscriptions with overall budget allocations would be interesting, particularly within the last couple years. If access decreases with budget cuts, then we do have a problem, albeit a financial one. On the other hand, library subscriptions are just a drop in the bucket compared to other university expenses (e.g. landscaping).

    If it isn’t a financial issue, is it the principle that we have to pay to access information? This would parallel the thinking of ‘open science’ and ‘free and open source’ software development; productivity and creativity are increased when information, data, ideas, etc. are appropriately shared. I tend to think that this is what many of these arguments are truly centered on, whether specifically stated or not. There are some that don’t like the idea of handing the copyright of their hard work over to a publisher, which then restricts access and makes money, even though they and probably the vast majority of their colleagues have access to it. A current solution for this is open access journals or articles. Is it the best solution? I don’t know, but in my view it is more productive than a likely illegal torrent file upload.

    I’m sure we’ve all thought of this before, but writing it out seemed to simplify it for me. So perhaps we just ought to identify the true problem (if there is one) and come up with some real world working solutions. And maybe, as Ethan initially suggested, the recent actions of others will prompt us to do so.

  13. Great questions Brady. My take on this needs a full post on its own, but ever so briefly:

    1. It’s definitely an access issue. Folks working at high powered research universities aren’t the only ones who need/want access to the literature and it is not an uncommon complaint that taxpayers are footing the bill for the research but don’t have access to the results. Even if we disregard the general public (which I think would be a mistake) lots of teaching focused colleges have access to only a tiny fraction of the literature because…

    2. It’s definitely a financial issue. There has been a lot written about this (I’ll run down links for a full post, but SPARC’s site and the previously mentioned Rosenzweig articles should get you started; or try chatting with your collections person sometime, they tend to have a lot to say about this sort of thing). Here at Utah State we have an annual exercise where the library asks if there are any new journals that we need access to and then asks us to help them figure out what to cut to free up the funds. With the number of journals expanding rapidly and the library budgets flat or declining this definitely adds up to access issues. Right now we don’t have access to key emerging journals like Methods in Ecology and Evolution because we don’t have the money and haven’t figured out what to cut yet. We also regularly run into specialized journals that we don’t have access to when trying to compile data from the literature. And (to re-emphasize my point about smaller colleges) one of my students is constantly in awe of how great our journal access is.

    3. There are certainly some principle issues related to copyright and open science. I particularly don’t like restrictions being placed on my ability to distribute my own work, and lots of academics apparently feel the same way since they post final versions of their papers to their own websites in clear violation of publication agreements. But, for me anyway, these are minor issues when distribution isn’t being excessively hampered by profit. This all costs money and open access just moves the filter to the other end of the pipe. We just need to find some balance to help address issues 1 and 2. I think that will come from scientists actively negotiating with the journals that they publish in, and volunteer for, so that Science maintains its seat at the table.

