Category Archives: things you should use

A new database for mammalian community ecology and macroecology

There are a number of great datasets available for doing macroecology and community ecology at broad spatial scales. These include data on birds (Breeding Bird Survey, Christmas Bird Count), plants (Forest Inventory & Analysis, Gentry’s transects), and insects (North American Butterfly Association Counts). However, if you wanted to do work that relied on knowing the presence or abundance of individuals at particular sites (i.e., you’re looking for something other than range maps) there has never been a decent dataset to work with for mammals.

Announcing the Mammal Community Database (MCDB)

Over the past couple of years we’ve been working to fill that gap as best we could. Since coordinated continental scale surveys of mammals don’t yet exist [1] we dug into the extensive mammalogy literature and compiled a database of 1000 globally distributed communities. Thanks to Kate Thibault’s leadership and the hard work of Sarah Supp and Mikaelle Giffen, we are happy to announce that this data is now freely available as a data paper on Ecological Archives.

In addition to containing species lists for 1000 locales, there is abundance data for 940 of the locations, some site level body size data (~50 sites) and a handful of reasonably long (> 10 yr) time-series as well. Most of the data is restricted to the particular mode of sampling that an individual mammalogist uses and as a result much of the data is for small mammals captured in Sherman traps.

Working with data compilations like this is always difficult because the differences in sampling intensity and approaches between studies can make it very difficult to compare data across sites. We’ve put together a detailed table of information on how sampling was conducted to help folks break the data into comparable subsets and/or attempt to control for the influence of sampling differences in their statistical models.
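As a rough illustration of what "breaking the data into comparable subsets" might look like in practice, here is a minimal Python sketch. The field names (`site_id`, `trap_type`, `trap_nights`), the example records, and the effort threshold are all invented for illustration; they are not the MCDB's actual column names or data, so you would need to adapt them to the real sampling table.

```python
from collections import defaultdict

# Hypothetical records: the field names and values below are invented for
# illustration and are not the MCDB's actual column names or data.
sampling_info = [
    {"site_id": "A", "trap_type": "Sherman", "trap_nights": 1200},
    {"site_id": "B", "trap_type": "Sherman", "trap_nights": 300},
    {"site_id": "C", "trap_type": "pitfall", "trap_nights": 900},
    {"site_id": "D", "trap_type": "Sherman", "trap_nights": 2500},
]


def comparable_subsets(records, min_trap_nights):
    """Group site IDs by trap type, keeping only sites with enough effort."""
    subsets = defaultdict(list)
    for record in records:
        if record["trap_nights"] >= min_trap_nights:
            subsets[record["trap_type"]].append(record["site_id"])
    return dict(subsets)


subsets = comparable_subsets(sampling_info, min_trap_nights=500)
print(subsets)  # {'Sherman': ['A', 'D'], 'pitfall': ['C']}
```

Grouping by method and dropping low-effort sites is only one way to get comparable subsets; the other option mentioned above is to keep all sites and include sampling effort as a covariate in the statistical model.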

The joys of Open Science

We’ve been gradually working on making the science that we do at Weecology more and more open, and the MCDB is an example of that. We submitted the database to Ecological Archives before we had actually done much of anything with it ourselves [2], because the main point of collecting the data was to provide a broadly useful resource to the ecological community, not to answer a specific question. We were really excited to see that as soon as we announced it on Twitter folks started picking it up and doing cool things with it [3]. We hope that folks will find all sorts of uses for it going forward.

Going forward

We know that there is tons more data out there on mammal communities. Some of it is unpublished, or not published in enough detail for us to include. Some of it has licenses that mean that we can’t add it to the MCDB without special permission (e.g., there is a lot of great LTER mammal data out there). Lots of it we just didn’t find while searching through the literature.

If folks know of more data we’d love to hear about it. If you can give us permission to add data that has more restrictive licensing then we’d love to do so [4]. If you’re interested in collaborating on growing the database let us know. If there’s enough interest we can invest some time in developing a public portal.

The footnotes [5]

[1] We are anxiously awaiting NEON’s upcoming surveys, headed up by former Weecology postdoc Kate Thibault.

[2] We have a single paper that is currently in review that uses the data.

[3] Thanks to Scott Chamberlain and Markus Gesmann. You guys are awesome!

[4] To be clear, we haven’t been asking for permission yet, so no one has turned us down. We wanted to get the first round of data collection done first to show that this was a serious effort.

[5] Because anything that David Foster Wallace loved has to be a good thing.

Learning to program like a professional using Software Carpentry

An increasingly large number of folks doing research in ecology and other biological disciplines spend a substantial portion of their time writing computer programs to analyze data and simulate the outcomes of biological models. However, most ecologists have little formal training in software development¹. A recent survey suggests that we are not alone, with 96% of scientists reporting that they are mostly self-taught when it comes to writing code. This makes sense because there are only so many hours in the day, and scientists are typically more interested in answering important questions in their field than in sitting through a bachelor’s degree’s worth of computer science classes. But it also means that we spend longer than necessary writing our software, it contains more bugs, and it is less useful to other scientists than it could be².

Software Carpentry to the Rescue

Fortunately you don’t need to go back to college and get another degree to substantially improve your knowledge and abilities when it comes to scientific programming, because with a few weeks of hard work Software Carpentry will whip you into shape. Software Carpentry was started back in 1997 to teach scientists “the concepts, skills, and tools they need to use and build software more productively” and it does a great job. The newest version of the course is composed of a combination of video lectures and exercises, and provides quick and to the point information on such critical things as:

along with lots of treatment of best practices for writing code that is clear and easy to read both for other people and for yourself a year from now when you sit down and try to figure out exactly what you did³.

The great thing about Software Carpentry is that it skips over all of the theory and detail that you’d get when taking the relevant courses in computer science and gets straight to the crux: how to use the available tools most effectively to conduct scientific research. This means that in about 40 hours of lecture and 100-200 hours of practice you can be a much, much better programmer who writes code more quickly, with fewer bugs, that can be easily reused. I think of it as boot camp for scientific software development. You won’t be an expert marksman or a black belt in Jiu-Jitsu when you’re finished, but you will know how to fire a gun and throw a punch.

I can say without hesitation that taking this course is one of the most important things I’ve done in terms of tool development in my entire scientific career. If you are going to write more than 100 lines of code per year for your research then you need to either take this course or find someone to offer something equivalent at your university. Watch the lectures, do the exercises, and it will save you time and energy on programming, giving you more of both to dedicate to asking and answering important scientific questions.

______________________________________________________

¹I took 3 computer science courses in college and I get the impression that that is about 2-3 more courses than most ecologists have taken.

²I don’t know of any data on this, but my impression is that over 90% of code written by ecologists is written by a single individual and never read or used by anyone else. This is in part because we have no culture of writing code in such a way that other people can understand what we’ve done and therefore modify it for their own use.

³I know that I’ve decided that it was easier to “just start from scratch” rather than reusing my own code on more than one occasion. That won’t be happening to me again thanks to Software Carpentry.

RStudio [Things you should use]

If you use R (and it seems like everybody does these days) then you should check out RStudio – an easy to install, cross-platform IDE for R. Basically it’s a seamless integration of all of the aspects of R (including scripts, the console, figures, help, etc.) into a single easy to use package. For those of you who are familiar with Matlab, it’s a very similar interface. It’s not a full-blown IDE yet (no debugger; no lint) but what this actually means is that it’s simple and easy to use. If you use R I can’t imagine that you won’t love this new (and open source!) tool.

UPDATE: Check out another nice article on RStudio over at i’m a chordata! urochordata!
