Author Archives: Ethan White
Over at Dynamic Ecology this morning Jeremy Fox has a post giving advice on how to decide where to submit a paper. It’s the same basic advice that I received when I started grad school almost 15 years ago and as a result I don’t think it considers some rather significant changes that have happened in academic publishing over the last decade and a half. So, I thought it would be constructive for folks to see an alternative viewpoint. Since this is really a response to Jeremy’s post, not a description of my process, I’m going to use his categories in the same order as the original post and offer my more… youthful… perspective.
- Aim as high as you reasonably can. The crux of Jeremy’s point is “if you’d prefer for more people to read and think highly of your paper, you should aim to publish it in a selective, internationally-leading journal.” From a practical perspective journal reputation used to be quite important. In the days before easy electronic access, good search algorithms, and social networking, most folks found papers by reading the table of contents of individual journals. In addition, before there was easy access to paper level citation data, and alt-metrics, if you needed to make a quick judgment on the quality of someones science the journal name was a decent starting point. But none of those things are true anymore. I use searches, filtered RSS feeds, Google Scholar’s recommendations, and social media to identify papers I want to read. I do still subscribe to tables of contents via RSS, but I watch PLOS ONE and PeerJ just as closely as Science and Nature. If I’m evaluating a CV as a member of a search committee or a tenure committee I’m interested in the response to your work, not where it is published, so in addition to looking at some of your papers I use citation data and alt-metrics related to your paper. To be sure, there are lots of folks like Jeremy that focus on where you publish to find papers and evaluate CVs, but it’s certainly not all of us.
- Don’t just go by journal prestige; consider “fit”. Again, this used to mater more before there were better ways to find papers of interest.
- How much will it cost? Definitely a valid concern, though my experience has been that waivers are typically easy to obtain. This is certainly true for PLOS ONE.
- How likely is the journal to send your paper out for external review? This is a strong tradeoff against Jeremy’s point about aiming high since “high impact” journals also typically have high pre-review rejection rates. I agree with Jeremy that wasting time in the review process is something to be avoided, but I’ll go into more detail on that below.
- Is the journal open access? I won’t get into the arguments for open access here, but it’s worth noting that increasing numbers of us value open access and think that it is important for science. We value open access publications so if you want us to “think highly of your paper” then putting it where it is OA helps. Open access can also be important if you “prefer for more people to read… your paper” because it makes it easier to actually do so. In contrast to Jeremy, I am more likely to read your paper if it is open access than if it is published in a “top” journal, and here’s why: I can do it easily. Yes, my university has access to all of the top journals in my field, but I often don’t read papers while I’m at work. I typically read papers in little bits of spare time while I’m at home in the morning or evenings, or on my phone or tablet while traveling or waiting for a meeting to start. If I click on a link to your paper and I hit a paywall then I have to decide whether it’s worth the extra effort to go to my library’s website, log in, and then find the paper again through that system. At this point unless the paper is obviously really important to my research the activation energy typically becomes too great (or I simply don’t have that extra couple of minutes) and I stop. This is one reason that my group publishes a lot using Reports in Ecology. It’s a nice compromise between being open access and still being in a well regarded journal.
- Does the journal evaluate papers only on technical soundness? The reason that many of us think this approach has some value is simple, it reduces the amount of time and energy spent trying to get perfectly good research published in the most highly ranked journal possible. This can actually be really important for younger researchers in terms of how many papers they produce at certain critical points in the career process. For example, I would estimate that the average amount of time that my group spends getting a paper into a high profile journal is over a year. This is a combination of submitting to multiple, often equivalent caliber, journals until you get the right roll of the dice on reviewers, and the typically extended rounds of review that are necessary to satisfy the reviewers about not only what you’ve done, but satisfying requests for additional analyses that often aren’t critical, and changing how one has described things so that it sits better with reviewers. If you are finishing your PhD then having two or three papers published in a PLOS ONE style journal vs. in review at a journal that filters on “importance” can make a big difference in the prospect of obtaining a postdoc. Having these same papers out for an extra year accumulating citations can make a big difference when applying for faculty positions or going up for tenure if folks who value paper level metrics over journal name are involved in evaluating your packet.
- Is the journal part of a review cascade? I don’t actually know a lot of journals that do this, but I think it’s a good compromise between aiming high and not wasting a lot of time in review. This is why we think that ESA should have a review cascade to Ecosphere.
- Is it a society journal? I agree that this has value and it’s one of the reasons we continue to support American Naturalist and Ecology even though they aren’t quite as open as I would personally prefer.
- Have you had good experiences with the journal in the past? Sure.
- Is there anyone on the editorial board who’d be a good person to handle your paper? Having a sympathetic editor can certainly increase your chances of acceptance, so if you’re aiming high then having a well matched editor or two to recommend is definitely a benefit.
To be clear, there are still plenty of folks out there who approach the literature in exactly the way Jeremy does and I’m not suggesting that you ignore his advice. In fact, when advising my own students about these things I often actively consider and present Jeremy’s perspective. However, there are also an increasing number of folks who think like I do and who have a very different set of perspectives on these sorts of things. That makes life more difficult when strategizing over where to submit, but the truth is that the most important thing is to do the best science possible and publish it somewhere for the world to see. So, go forth, do interesting things, and don’t worry so much about the details.
It’s that time of year again when we’re all busy working on preproposals for the National Science Foundation, and just like last year it’s more difficult than you would think to track down the official guidelines using Google. So, for all of you who don’t have a minute to spare, here they are:
Also, remember that Biographical Sketches are different than for full proposals:
Biographical Sketches (2-page limit for each) should be included for each person listed on the Personnel page. It should include the individual’s expertise as related to the proposed research, professional preparation, professional appointments, five relevant publications, five additional publications, and up to five synergistic activities. Advisors, advisees, and collaborators should not be listed on this document, but in a separate table (see below).
And that there is a big stack of things that should not be included at this stage in the process:
Budget, Budget Justification, Facilities, Equipment and Other Resources, Current and Pending Support, Letters of Collaboration, Data Management Plan, Postdoctoral Mentoring Plan, RUI Impact Statement, Certification of RUI Eligibility, or any other Supplementary Documents.
UPDATE: Included separate links for DEB and IOS.
We’re looking for a new student to join our interdisciplinary research group. The opening is in Ethan’s lab, but the faculty, students, and postdocs in Weecology interact seamlessly among groups. If you’re interested in macroecology, community ecology, or just about anything with a computational/quantitative component to it, we’d love to hear from you. The formal ad is included below (and yes, we did include links to our blog, twitter, and our GitHub repositories in the ad). Please forward this to any students who you think might be a good fit, and let us know if you have any questions.
GRADUATE STUDENT OPENING
The White Lab at Utah State University has an opening for a graduate student with interests in Macroecology, Community Ecology, or Ecological Theory/Modeling. Active areas of research in the White lab include broad scale patterns related to biodiversity, abundance and body size, ecological dynamics, and the use of sensor networks for studying ecological systems. We use computational, mathematical, and advanced statistical methods in much of our work, so students with an interest in these kinds of methods are encouraged to apply. Background in these quantitative techniques is not necessary, only an interest in learning and applying them. While students interested in one of the general areas listed above are preferred, students are encouraged to develop their own research projects related to their interests. The White Lab is part of an interdisciplinary ecology research group (http://weecology.org) whose goal is to facilitate the broad training of ecologists in areas from field work to quantitative methods. Students with broad interests are jointly trained in an interdisciplinary setting. We are looking for students who want a supportive environment in which to pursue their own ideas. Graduate students are funded through a combination of research assistantships, teaching assistantships, and fellowships. Students interested in pursuing a PhD are preferred. Utah State University has an excellent graduate program in ecology with over 50 faculty and 80+ graduate students across campus affiliated with the USU Ecology Center (http://www.usu.edu/ecology/).
Additional information about the position and Utah State University is available at:
Interested students can find more information about our group by checking out:
Our websites: http://whitelab.weecology.org, http://weecology.org
Our code repositories: http://github.com/weecology
Our blog: http://jabberwocky.weecology.org
And Twitter: http://twitter.com/ethanwhite
Interested students should contact Dr. Ethan White (firstname.lastname@example.org) by December 1st, 2012 with their CV, GPA, GRE scores (if available), and a brief statement of research interests.
ESA has just announced that it has changed its policy on preprints and will now allow articles that have been posted on major preprint servers, like arXiv, to be considered for publication in its journals.
All ESA journals accept submissions of ms that have previously been posted to a preprint service such as arXiv! A great way to get feedback!—
Scott L Collins (@ESA_Prez2013) September 05, 2012
I am very excited about this change for two reasons. First, as nicely laid out in INNGE blog post by Philippe Desjardins-Proulx*, there are many positive benefits to science of the preprint culture. They make science more accessible, allow researchers to get feedback from the community prior to peer review, and speed up the scientific process by making ideas available to others as quickly as possible. We should take this opportunity as a community to start developing the kind of vibrant preprint culture that has benefited so many other disciplines. Second, I am encouraged by the rapid response of ESA to the concerns expressed by myself and other members of the community, and take it as a sign that my favorite society is open to making the kinds of changes that are necessary to best facilitate science in the modern era. More work is clearly necessary, but this is a very encouraging start.
UPDATE: Carl Boettiger has posted his very nice letter to Don Strong that played an critical roll in taking this discussion from a bunch of folks talking over social media to something that effected meaningful change.
Recently a bunch of folks in the biological sciences have started sharing their grant proposals openly. Their reasons for doing so are varied (see the links next to their names below), but part of the common justification is a general interest in opening up science so that all stages of the process can benefit from better interaction and communication, and part of it is to provide examples for younger scientists writing grants. To help accomplish both of these goals I’m going to do what Titus Brown suggested and compile a list of all of the available open proposals in the biological sciences (if you’re looking for math proposals they have a list too). Given the limited number of proposals available at the moment I’m just going to maintain the list here, sorted alphabetically by PI. Another way to find proposals is to look at the ‘grant’ and ‘proposal’ tags on figshare, where several of us have been posting proposals. If you know of more proposals, decide to post some yourself, or have corrections to proposal in the list, just let me know in the comments and I’ll keep the list updated. Enjoy!
- 2008 / New Investigator Grant Application (NERC) *funded
- 2008 / EMBO Young Investigator Programme Application (EMBO)
- 2007 / Responsive Mode Grant Application (BBSRC)
- 2012 / NSF Office of Cyberinfrastructure proposal, Materials and Workshops for Cyberinfrastructure Education in Biology supplement to BEACON. *funded
- 2012 / NSF CAREER proposal, Assembling Extremely Large Metagenomes
- 2012 / NSF BIGDATA proposal, Low-memory Streaming Prefilters for Biological Sequencing Data
- 2012 / Moore Foundation proposal on marine metagenomics
- 2011 / NSF CAREER proposal: “Scaling and Improving de Bruijn graph assembly”
- 2010 / Next-gen course (NIH R25) *funded
- 2009 / Web tools for next-gen sequence analysis (USDA) *funded
- 2007 / Cartwheel
- Kathryn Fuller Doctoral Fellowship application (WWF)
- 2010 / Prairie Biotic Research proposal *funded
- 2009 / Ecological and evolutionary impacts of pollinator sharing between cultivated and wild sunflowers (Norman Hackerman Advanced Research Program)
- 2009 / Lewis and Clark grant proposal (American Philosophical Society)
- Doctoral Dissertation Improvement Grant proposal (NSF)
- Forest Shreeve Award proposal
- Ariel Appleton Research Fellowship Proposal – Ecological Networks
- How do crop-mediated changes in mutualist and antagonist communities affect selection on floral and defense traits?
- 2011 / “Automated and community synthesis of the tree of life” (NSF AVATOL) *funded
- 2010 / “Towards a comprehensive, community-owned and sustainable repository of reusable phylogenetic knowledge” w/Hilmar Lapp (NSF ABI)
- 2009 / “A network for enabling community-driven standards to link evolution into the global web of data (EvoIO)” w/Hilmar Lapp (NSF INTEROP)
- 2012 / Understanding range shift model error: The inﬂuence of generation time and rate of adaptation on species distribution model predictions. w/Scott Chamberlain (NCEAS proposal).
- 2008 / Evolution under simulated climate change in response to trophic shifts. (NSF DDIG) *funded
- 2010 / Protein Design Using Quantum Mechanics (Danish Center for Supercomputing) *funded
- 2008 / Computational Design of Stable Enzymes (Danish National Science Foundation, DSF-NABIIT) *funded
- 2006 / Modeling pH-Dependence in Drug Design (EU Marie Curie Program) *funded
- 2006 / Computational Prediction and Validation of Protein Structure and Function in Protein Engineering and Rational Drug Design (Danish National Science Foundation, FNU) *funded
- 2006 / Prediction and Interpretation of Protein pKa’s Using QM/MM (US National Science Foundation – MCB; rescinded when I moved to Denmark) *funded
- 2002 / The Prediction and Interpretation of Protein pKa’s Using QM/MM (US National Science Foundation – MCB) *funded
- 2010 / Ontology-enabled reasoning across phenotypes from evolution and model organisms w/Todd Vision (NSF) *funded
Heather Piwowar (@researchremix) & Jason Priem (@jasonpriem) (read their thoughts on sharing proposals)
- Uptake proposal (CIHR)
- 2007 / Sxy proposal (CIHR) *funded
- 2001 / CIHR proposal *funded
- 1999 / NIH proposal *funded
- 2012 / Data Management and Computational Skills Training for LTER Scientists w/Ethan White & Greg Wilson (LTER Training Working Groups Proposal)
- 2011 / Fuelwood, Savannas, and Climate Change: Integrating Modeling, Field Experimentation, and Optical and Radar Remote Sensing (NASA Predoctoral Graduate Fellowship) *funded
- 2012 / Genomic tools to study coral reef resilience (University of Melbourne)
- 2012 / Plastid endosymbiosis: a detailed study of genome dynamics (Australian Research Council)
- 2012 / Evolutionary dynamics of the algae: Understanding adaptive potential under environmental change (Australian Research Council) *funded
- Probing key innovations with next generation sequencing
- 2009 / Macroevolutionary dynamics of marine algae
- 2012 / Sustainable and Scalable Infrastructure for the Publication of Data (NSF) *funded
- 2008 / A Digital Repository for Preservation and Sharing of Data Underlying Published Works in Evolutionary Biology (NSF) *funded
- 2010 / CAREER: Advancing Macroecology Using Informatics and Entropy Maximization (NSF CAREER Award) *funded
- 2005 / Broad-scale patterns of the distribution of body sizes of individuals in ecological communities (NSF Postdoc Fellowship) *funded
- 2008 / Understanding multimodality in animal size distributions (NSF Research Starter Grant) *funded
As I announced on Twitter about a week ago, I am now making all of my grant proposals open access. To start with I’m doing this for all of my sole-PI proposals, because I don’t have to convince my collaborators to participate in this rather aggressively open style of science. At the moment this includes three funded proposals: my NSF Postdoctoral Fellowship proposal, an associated Research Starter Grant proposal, and my NSF CAREER award.
So, why am I doing this, especially with the CAREER award that still has several years left on it and some cool ideas that we haven’t worked on yet. I’m doing it for a few reasons. First, I think that openness is inherently good for science. While there may be benefits for me in keeping my ideas secret until they are published, this certainly doesn’t benefit science more broadly. By sharing our proposals the cutting edge of scientific thought will no longer be hidden from view for several years and that will allow us to make more rapid progress. Second, I think having examples of grants available to young scientists has the potential to help them learn how to write good proposals (and other folks seem to agree) and therefore decrease the importance of grantsmanship relative to cool science in the awarding of limited funds. Finally, I just think that folks deserve to be able to see what their tax dollars are paying for, and to be able to compare what I’ve said I will do to what I actually accomplish. I’ve been influenced in my thinking about this by posts by several of the big open science folks out there including Titus Brown, Heather Piwowar, and Rod Page.
To make my grants open access I chose to use figshare for several reasons.
- Credit. Figshare assigns a DOI to all of its public objects, which means that you can easily cite them in scientific papers. If someone gets an idea out of one of my proposals and works on it before I do, this let’s them acknowledge that fact. Stats are also available for views, shares, and (soon) citations, making it easier to track the impact of your larger corpus of research outputs.
- Open Access. All public written material is licensed under CC-BY (basically just cite the original work) allowing folks to do cool things without asking.
- Permanence. I can’t just change my mind and delete the proposal and I also expect that figshare will be around for a long time.
- Version control. For proposals that are not funded, revised, not funded, revised, etc. figshare allows me to post multiple versions of the proposal while maintaining the previous versions for posterity/citation.
During this process I’ve come across several other folks doing similar things and even inspired others to post their proposals, so I’m in the process of compiling a list of all of the publicly available biology proposals that I’m aware of and will post a list with links soon. It’s my hope that this will serve as a valuable resource for young and old researchers alike and will help to lead the way forward to a more open scientific dialogue.
Over the weekend I saw this great tweet:
by Philippe Desjardins-Proulx and was pleased to see yet another actively open young scientist. Then I saw his follow up tweet:
…hopefully Ecology Letters and the Ecological Society of America will reverse their conservative anti-arXiv policy soon :P—
P Desjardins-Proulx (@phdpqc) July 14, 2012
At first I was confused. I thought ESA’s policy was that preprints were allowed based on the following text on their website (emphasis mine: still available in Google’s Cache):
A posting of a manuscript or thesis on a personal or institutional homepage or ftp site will generally be considered as a preprint; this will not be grounds for viewing the manuscript as published. Similarly, posting of manuscripts in public preprint archives or in an institution’s public archive of unpublished theses will not be considered grounds for declaring a manuscript published. If a manuscript is available as part of a digital publication such as a journal, technical series or some other entity to which a library can subscribe (especially if that publication has an ISSN or ISBN), we will consider that the manuscript has been published and is thus not eligible for consideration by our journals. A partial test for prior publication is whether the manuscript has appeared in some entity with archival value so that it is permanently available to reasonably diligent scholars. A necessary test for prior publication is whether the author can legally transfer copyright to ESA.
So I asked Philippe to explain his tweet:
This got me a little riled up so I broadcast my displeasure:
And then Jarrett Byrnes questioned where this was coming from given the stated policy:
So I emailed ESA to check and, sure enough, preprints on arXiv and similar preprint servers are considered prior publication and therefore cannot be submitted to ESA journals, despite the fact that this isn’t a problem for a few journals you may have heard of including Science, Nature, PNAS, and PLoS Biology. ESA (to their credit) has now clarified this point on their website (emphasis mine; thanks to Jaime Ashander for the heads up):
A posting of a manuscript or thesis on an author’s personal or home institution’s website or ftp site generally will not be considered previous publication. Similarly posting of a “working paper” in an institutional repository is allowed so long as at least one of the authors is affiliated with that institution. However, if a manuscript is available as part of a digital publication such as a journal, technical series, or some other entity to which a library can subscribe (especially if that publication has an ISSN or ISBN), we will consider that the manuscript has been published and is thus not eligible for consideration by our journals. Likewise, if a manuscript is posted in a citable public archive outside the author’s home institution, then we consider the paper to be self-published and ineligible for submission to ESA journals. Finally, a necessary test for prior publication is whether the author can legally transfer copyright to ESA.
In my opinion the idea that a preprint is “self-published” and therefore represents prior publication is poorly justified* and not in the best interests of science, and I’m not the only one:
So now I’m hoping that Jarrett is right:
and that things might change (and hopefully soon). If you know someone on the ESA board, please point them in the direction of this post.
UPDATE: Just as I was finishing working on this post ESA responded to the tweet stream from the last few days:
I’m very excited that ESA is reviewing their policies in this area. As I should have said in the original post, I have, up until this year, been quite impressed with ESA’s generally open, and certainly pro-science policies. This last year or so has been a bad one, but I’m hoping that’s just a lag in adjusting to the new era in scientific publishing.
UPDATE 2: ESA has announced that they have changed their policy and will now consider articles with preprints.
———————————————————————————————————————————————————————–*I asked ESA if they wanted to clarify their justification for this policy and haven’t heard back (though it has been less than 2 days). If they get back to me I’ll update or add a new post.
It’s that time of year again when the new Impact Factor values are released. This is such a big deal to a lot of folks that it’s pretty hard to avoid hearing about it. We’re not the sort of folks that object to the use of impact factors in general – we are scientists after all and part of being a scientist is quantifying things. However, if we’re going to quantify things it is incumbent upon us to try do it well and there are several things that we need to address if we are going to have faith in our measures of journal quality.
1. Stop using impact factor use Eigenfactor based metrics instead
The impact factor simply determines the number of papers that cite another paper and calculates the average. This might have been a decent approach when the IF was first invented, but it’s a terrible approach now. The problem is that according to network theory, and some important applications thereof (e.g., Google), it is also important to take into account the importance of the papers/journals that are doing the citing. Fortunately we now have metrics that do this properly: the Eigenfactor and associated Article Influence Score. These are even report by ISI right next to the IF.
Here’s a quick way to think about this. You have two papers, one that has been cited 30 times by papers that are never cited, and one that has been cited 30 times by papers that are themselves each cited 30 times. If you think the two papers are equally important, then please continue using the impact factor based metrics. If you think that the second paper is more important then please never mention the words “impact factor” again and start focusing on better approaches for quantifying the influence of nodes in a network.
2. Separate reviews (and maybe methods) from original research
We’ve known pretty much forever that reviews are cited more than original research papers, so it doesn’t make sense to compare review journals to non-review journals. While it’s easy to just say that TREE and Ecology are apples and oranges, the real problem is journals that mix reviews and original research. Since reviews are more highly cited, just changing the mix of these two article types can manipulate the impact factor. Sarah Supp and I have a paper on this is you’re interested in seeing some science and further commentary on the issue. The answer is easy, separate the analyses for review papers. It has also been suggested that methods papers have higher citation rates as well, but as I admit in my back and forth with Bob O’Hara (the relevant part of which is still awaiting moderation as I’m posting) there doesn’t seem to be any actual research on this to back it up.
3. Solve the problem of metrics that are strongly influenced by the number of papers
In the citation analysis of individual scientists there has always been the problem of how to deal with the number of papers. The total number of citations isn’t great since one way to get a large number of citations is to write a lot of not particularly valuable papers. The average number of citations per paper is probably even worse because no one would argue that a scientist who writes a single important paper and then stops publishing is contributing maximally to the progress of science.
In journal level citation analyses these two end points have up until recently been all we had, with ISI choosing to focus on the average number of citations per paper and Eigenfactor the total number of citations . The problem is that these approaches encourage gaming by journals to publish either the most or fewest papers possible. Since the issues with publishing too many papers are obvious I’ll focus on the issue of publishing too few. Assuming that journals have the ability to predict the impact of individual papers , the best way to maximize per article measures like the impact factor is to publish as few papers as possible. Adding additional papers simply dilutes the average citation rate. The problem is that by doing so the journal is choosing to have less influence on the field (by adding more, largely equivalent quality, papers) in favor of having a higher perceived impact. Think about it this way. Is a journal that publishes a total of 100 papers that are cited 5 times each, really more important than a journal that publishes 200 papers, 100 of which are cited 5 times each and 100 that are cited 4 times each? I think that the second journal is more important, and that’s why I’m glad to see that Google Scholar is focusing on the kinds of integrative metrics (like the h-index) that we use to evaluate individual researchers.
The good news is that we do have better metrics, that are available right now. The first thing that we should do is start promoting those instead of the metric that shall not be named. We should also think about improving these metrics further. If they’re worth talking about, they are worth improving. I’d love to see a combination of the network approaches in Eigenfactor with the approaches to solving the number of publications problem taken by Google. Of course, more broadly, we are already in the progress of moving away from journal level metrics and focusing more on the impact of individual papers. I personally prefer this approach and think that it’s good for science, but I’ll leave my thoughts on that for another day.
UPDATE 2: Fixed the broken link to the “Why Eigenfactor?” page.
 Both sets of metrics include both approaches with total citations from ISI and Article Influence Score, which is the per paper equivalent of the Eigen Factor, it’s just that they don’t seem to get as much… um… attention.
 And if they didn’t then all we’re measuring is how well different journals game the system plus some positive feedback where journals that are known to be highly cited garner more readers and therefore more future citations.
People find blog posts in different ways. Some visit the website regularly, some subscribe to email updates, and some subscribe using the blog’s feed. Feeds can be a huge time saver for processing the ever increasing amount of information that science generates, by placing much of that information in a single place in a simple, standardized, format. It also lets you consume one piece of information at a time and keeps your inbox relatively free of clutter (for more about why using a feed reader is awesome see this post).
When setting up their feeds bloggers can choose to either provide the entire content of the post, or just a small teaser that contains just the first few sentences of the post. In this post I am going to argue that science bloggers should choose to provide full posts.
The core reason is that we are are doing this to facilitate scientific dialog, and we are all very busy. In addition to the usual academic work load of teaching, doing research, and helping our departments and universities function, we are now dealing with keeping up with a rapidly expanding literature plus a bloom of scientific blogs, tweets, and status updates (and oh yeah, some of us even have personal lives). This means that we are consuming a massive amount of information on a daily basis and we need to be able to do so quickly. I squeeze this in during small windows of time (bus rides home, gaps between meetings, while I’m running my toddler’s bath) and often on a mobile device.
I can do this easily if I have full feeds. I open my feed reader, open the first item, read it, move on to the next one. My brain knows exactly what format to expect, cognitive load is low, and the information is instantly available. If instead I encounter a teaser, I first have to make a conscious decision about whether or not I want to click through to the actual post, then I have to hit the link, wait for the page to load (which can still be a fairly long time on a phone), adjust to a format that varies widely across blogs, often adjust the zoom and rotate my screen (if I’m reading on my phone), read the item, and then return to my reader. This might not seem like a huge deal for a handful of items, but multiply the lost time by a few hundred or a few thousand items a week and it adds up in a hurry. On top of that I store and tag full-text, searchable, copies of posts for all of the blogs I follow in my feed reader so that I can find posts again. This is handy when I remember there is a post I want to either share with someone or link to, but can’t remember who wrote it.
So, if your blog doesn’t provide full feeds this means three things. First, I am less likely to read a post if it’s a teaser. It costs me extra time, so the threshold for how interesting it needs to be goes up. Second, if I do read it I now have less time to do other things. Third, if I want to find your post again to recommend it to someone or link to it, the chances of my doing so successfully are decreased. So, if your goal is science communication, or even just not being disrespectful of your readers’ time, full feeds are the way to go.
This all goes for journal tables of contents as well. As I’ve mentioned before, if the journal feed doesn’t include the abstracts and the full author line, it is just costing the papers readers, and the journal’s readers time, and therefore making the scientific process run more slowly than it could.
So, bloggers and journal editors, for your readers sake, for sciences sake, please turn on full feeds. It will only take you two minutes. It will save science hundreds of hours. It will probably be this most productive thing you do for science all week.
Characterizing the species-abundance distribution with only information on richness and total abundance [Research Summary]
This is the first of a new category of posts here at Jabberwocky Ecology called Research Summaries. We like the idea of communicating our research more broadly than to the small number of folks who have the time, energy, and interest to read through entire papers. So, for every paper that we publish we will (hopefully) also do a blog post communicating the basic idea in a manner targeted towards a more general audience. As a result these posts will intentionally skip over a lot of detail (technical and otherwise), and will intentionally use language that is less precise, in order to communicate more broadly. We suspect that it will take us quite a while to figure out how to do this well. Feedback is certainly welcome.
This is a Research Summary of: White, E.P., K.M. Thibault, and X. Xiao. 2012. Characterizing species-abundance distributions across taxa and ecosystems using a simple maximum entropy model. Ecology. http://dx.doi.org/10.1890/11-2177.1*
The species-abundance distribution describes the number of species with different numbers of individuals. It is well known that within an ecological community most species are relatively rare and only a few species are common, and understanding the detailed form of this distribution of individuals among species has been of interest in ecology for decades. This distribution is considered interesting both because it is a complete characterization of the commonness and rarity of species and because the distribution can be used to test and parameterize ecological models.
Numerous mathematical descriptions of this distribution have been proposed and much of the research into this pattern has focused on trying to figure out which of these descriptions is “the best” for a particular group of species at a small number of sites. We took an alternative approach to this pattern and asked: Can we explain broad scale, cross-taxonomic patterns in the general shape of the abundance distribution using a simple model that requires only knowledge of the species richness and total abundance (summed across all species) at a site?
To do this we used a model that basically describes the most likely form of the distribution if the average number of individuals in a species is fixed (which turns out to be a slightly modified version of the classic log-series distribution; see the paper or John Harte’s new book for details). As a result this model involves no detailed biological processes and if we know richness and total abundance we can predicted the abundance of each species in the community (i.e., the abundance of the most common species, second most common species… rarest species).
Since we wanted to know how well this works in general (not how well it works for birds in Utah or trees in Panama) we put together a a dataset of more than 15,000 communities. We did this by combining 6 major datasets that are either citizen science, big government efforts, or compilations from the literature. This compilation includes data on birds, trees, mammals, and butterflies. So, while we’re missing the microbes and aquatic species, I think that we can be pretty confident that we have an idea of the general pattern.
In general, we can do an excellent job of predicting the abundance of each rank of species (most abundant, second most abundant…) at each site using only information on the species richness and total abundance at the site. Here is a plot of the observed number of individuals in a given rank at a given site against the number predicted. The plot is for Breeding Bird Survey data, but the rest of the datasets produce similar results.
The model isn’t perfect of course (they never are and we highlight some of its failures in the paper), but it means that if we know the richness and total abundance of a site then we can capture over 90% of the variation in the form of the species-abundance distribution across ecosystems and taxonomic groups.
This result is interesting for two reasons:
First, it suggests that the species-abundance distribution, on its own, doesn’t tell us much about the detailed biological processes structuring a community. Ecologists have know that it wasn’t fully sufficient for distinguishing between different models for a while (though we didn’t always act like it), but our results suggest that in fact there is very little additional information in the distribution beyond knowing the species richness and total abundance. As such, any model that yields reasonable richness and total abundance values will probably produce a reasonable species-abundance distribution.
Second, this means that we can potentially predict the full distribution of commonness and rarity even at locations we have never visited. This is possible because richness and total abundance can, at least sometimes, be well predicted using remotely sensed data. These predictions could then be combined with this model of the species-abundance distribution to make predictions for things like the number of rare species at a site. In general, we’re interested in figuring out how much ecological pattern and process can be effectively characterized and predicted at large spatial scales, and this research helps expand that ability.
So, that’s the end of our first Research Summary. I hope it’s a useful thing that folks get something out of. In addition to the science in this paper, I’m also really excited about the process that we used to accomplish this research and to make it as reproducible as possible. So, stay tuned for some follow up posts on big data in ecology, collaborative code development, and making ecological research more reproducible.
*The paper will be Open Access once it is officially published but ,for reasons that don’t make a lot of sense to me, it is behind a paywall until it comes out in print.