New journals that are changing the way we publish

Academic publishing is in a dynamic state these days with large numbers of new journals popping up on a regular basis. Some of these new journals are actively experimenting with changing traditional approaches to publication and peer review in potentially important ways. So, I thought I’d provide a quick introduction to some of the new kids on the block that I think have the potential to change our approach to academic publishing.

PeerJ

PeerJ is in some ways a fairly standard PLOS One style open access journal. Like PLOS One they only publish primary research (no reviews or opinion pieces) and that research is evaluated only on the quality of the science, not on its potential impact. However, what makes PeerJ different (and the reason that I’m volunteering my time as an associate editor for them) is their philosophy that in the era of the modern web it should be both cheap and easy to publish scientific papers:

We aim to drive the costs of publishing down, while improving the overall publishing experience, and providing authors with a publication venue suitable for the 21st Century.

The pricing model is really interesting. Instead of a flat fee per paper, PeerJ uses lifetime author memberships. For $99 (total for life) you can publish 1 paper/year. For $199 you can publish 2 papers/year and for $299 you can publish unlimited papers for life. Every author has to have a membership, so for a group of 5 authors publishing in PeerJ for the first time it would cost $495, but that’s still about 1/3 of what you’d pay at PLOS One and 1/6 of what you’d pay to make a paper open access at a Wiley journal. And that same group of authors can publish again next year for free. How they can publish for so much less than anyone else (and whether it is sustainable) is a bit of an open question, but they have clearly spent a lot of time (and serious publishing experience) thinking about how to automate and scale publication in an affordable manner, both technically and in terms of things like typesetting (since single column text with no attempt to wrap text around tables and figures is presumably much easier to typeset). If you “follow the money” as Brian McGill suggests then the path may well lead you to PeerJ.
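The arithmetic behind that comparison can be sketched in a few lines of Python. The plan prices are the ones quoted above; the function name and the assumption that every author buys the same plan and none already holds a membership are purely illustrative.

```python
# Lifetime membership prices quoted above, keyed by papers allowed per year.
# None stands for the unlimited plan.
PLAN_PRICES = {1: 99, 2: 199, None: 299}


def first_paper_cost(n_authors, papers_per_year=1):
    """Total one-time cost when every author buys the same lifetime plan.

    Illustrative assumption: no author already holds a membership.
    """
    return n_authors * PLAN_PRICES[papers_per_year]


print(first_paper_cost(5))        # 5 authors on the $99 plan -> 495
print(first_paper_cost(5, None))  # 5 authors on the unlimited plan -> 1495
```

Because the memberships are for life, the marginal cost of a later paper by the same authors (within their plan's yearly limit) is zero.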

Other cool things about PeerJ:

  • Optional open review (authors decide whether reviews are posted with accepted manuscripts, reviewers decide whether to sign reviews)
  • Ability to comment on manuscripts with points being given for good comments.
  • A focus on making life easy for authors, reviewers, and editors, including a website that is an absolute joy to interact with and a lack of rigid formatting guidelines that have to be satisfied for a paper to be reviewed.

We want authors spending their time doing science, not formatting. We include reference formatting as a guide to make it easier for editors, reviewers, and PrePrint readers, but will not strictly enforce the specific formatting rules as long as the full citation is clear. Styles will be normalized by us if your manuscript is accepted.

Now there’s a definable piece of added value.

Faculty of 1000 Research

Faculty of 1000 Research’s novelty comes from a focus on post-publication peer review. Like PLOS One & PeerJ it reviews based on quality rather than potential impact, and it has a standard per paper pricing model. However, when you submit a paper to F1000 it is immediately posted publicly online, as a preprint of sorts. They then contact reviewers to review the manuscript. Reviews are posted publicly with the reviewers’ names. Each review includes a status designation of “Approved” (similar to Accept or Minor Revisions), “Approved with Reservations” (similar to Major Revisions), or “Not Approved” (similar to Reject). Authors can upload new versions of the paper to satisfy reviewers’ comments (along with a summary/explanation of the changes made), and reviewers can provide new reviews and new ratings. If an article receives two “Approved” ratings or one “Approved” and two “Approved with Reservations” ratings then it is considered accepted. It is then identified on the site as having passed peer review, and is indexed in standard journal databases. The peer review process is also open to anyone, so if you want to write a review of a paper you can, no invite required.
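The acceptance rule described above is simple enough to write down as a small Python check. This is only a sketch of the rule as stated in this post, not F1000's actual system: the function name is hypothetical, and reading the thresholds as "at least" (so extra positive reviews never hurt) is an assumption.

```python
from collections import Counter

APPROVED = "Approved"
RESERVATIONS = "Approved with Reservations"
NOT_APPROVED = "Not Approved"


def passes_peer_review(statuses):
    """Return True if the current review statuses meet the rule described above.

    Accepted with at least two "Approved" ratings, or at least one "Approved"
    plus at least two "Approved with Reservations" ratings. How "Not Approved"
    ratings weigh against these is not covered by this sketch.
    """
    counts = Counter(statuses)
    return counts[APPROVED] >= 2 or (
        counts[APPROVED] >= 1 and counts[RESERVATIONS] >= 2
    )


print(passes_peer_review([APPROVED, APPROVED]))                    # True
print(passes_peer_review([APPROVED, RESERVATIONS, RESERVATIONS]))  # True
print(passes_peer_review([APPROVED, RESERVATIONS]))                # False
```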

It’s important to note that the individuals who are invited to review the paper are recommended by the authors. They are checked to make sure that they don’t have conflicts of interest and are reasonably qualified before being invited, but there isn’t a significant editorial hand in selecting reviewers. This could be seen as resulting in biased reviews, since one is likely to select reviewers that may be biased towards liking your work. However, this is tempered by the fact that the reviewer’s name and review are publicly attached to the paper, and therefore they are putting their scientific reputation on the line when they support a paper (as argued more extensively by Aarssen & Lortie 2011).

In effect, F1000 is modeling a system of exclusively post-publication peer review, with a slight twist of not considering something “published/accepted” until a minimum number of positive reviews are received. This is a bold move since many scientists are not comfortable with this model of peer review, but it has the potential to vastly speed up the rate of scientific communication in the same way that preprints do. So, I for one think this is an experiment worth conducting, which is why I recently reviewed a paper there.

Oh, and ecologists can currently publish there for free (until the end of the year).

Frontiers in X

I have the least personal experience with the Frontiers journals (including the soon-to-launch Frontiers in Ecology & Evolution). Like F1000Research, the groundbreaking nature of Frontiers is in peer review, but instead of moving towards a focus on post-publication peer review they are attempting to change how pre-publication review works. They are trying to make review a more collaborative effort between reviewers and authors to improve the quality of the paper.

As with PeerJ and F1000Research, Frontiers is open access and has a review process that focuses on “the accuracy and validity of articles, not on evaluating their significance”. What makes Frontiers different is their two-step review process. The first step appears to be a fairly standard pre-publication peer review, where “review editors” provide independent assessments of the paper. The second step (the “Interactive Review phase”) is where the collaboration comes in. Using an “Interactive Review Forum” the authors and all of the reviewers (and if desirable the associate editor and even the editor in chief for the subdiscipline) work collaboratively to improve the paper to the point that the reviewers support its publication. If disagreements arise the associate editor is tasked with acting as a mediator in the conversation. If a paper is eventually accepted then the reviewers’ names are included with the paper and taken as indicating that they sign off on the quality of the paper (see Aarssen & Lortie 2011 for more discussion of this idea; reviewers can withdraw from the process at any point, in which case their names are not included).

I think this is an interesting approach because it attempts to make the review process a friendlier and more interactive process that focuses on quickly converging through conversation on acceptable solutions rather than slow long-form exchanges through multiple rounds of conventional peer review that can often end up focusing as much on judging as improving. While I don’t have any personal experiences with this system I’ve seen a number of associate editors talk very positively about the process at Frontiers.

Conclusions

This post isn’t intended to advocate for any of these particular journals or approaches. These are definitely experimental and we may find that some of them have serious limitations. What I do advocate for is that we conduct these kinds of experiments with academic publishing and support the folks who are taking the lead by developing and test driving these systems to see how they work. To do anything else strikes me as accepting that current academic publishing practices are at their global optimum. That seems fairly unlikely to me, which makes the scientist in me want to explore different approaches so that we can find out how to best evaluate and improve scientific research.

UPDATE: Fixed link to the Faculty of 1000 Research paper that I reviewed. Thanks Jeremy!

UPDATE 2: Added a missing link to Faculty of 1000 Research’s main site.

UPDATE 3: Fixed the missing link to Frontiers in Ecology & Evolution. Apparently I was seriously linking challenged this morning.

The best way to not get a job: don’t apply

It’s job season. It’s that time of year again when our young scientists pore over a wide variety of job ads and ask themselves that critically important question: do I apply?

In some ways, this is the most critical step in the entire job application process. Yes, your job packet is important. There’s the goldilocks problem of conveying your awesome in the cover letter but worrying about sounding conceited. There’s writing your CV. There’s thinking about the institution your application is going to and what it values. Yadda yadda yadda. There are lots of great resources to get advice on these things. No, I’m here to talk about one of the lesser discussed issues of the job packet: Choosing to send an application in.

I’m going to give my advice through a little story. It’s a story only one other person knows in its complete form.

Many years ago, I was a young post-doc desperately applying for jobs. I had interviews, but no offers. My postdoc funding was running out (again) and I was pretty demoralized. I saw the following job ad:

ASSISTANT PROFESSOR — SPATIAL ECOLOGY

Department of Biology and Ecology Center

Utah State University

 The Department of Biology (http://www.biology.usu.edu) and the Ecology Center (http://www.usu.edu/ecology) at Utah State University seek a tenure-track assistant professor in spatial ecology. Candidates must have a Ph.D. or equivalent in biology, ecology, or a related field; show evidence of the ability to sustain an extramurally funded research program; and be able to teach effectively at the undergraduate and graduate levels. Postdoctoral experience is preferred. We seek an ecologist investigating the effects of global change on the patterns, processes, and mechanisms of the spatial distributions of populations and communities. The research must complement current ecological and evolutionary research at USU. We prefer a person that can collaborate with one or more projects in landscape ecology, conservation biology, pollination biology, invasion ecology, and ecosystem ecology and modeling. Applicants with the ability to integrate mechanisms at the organismal level with patterns and predictions of range shifts at the regional to global scales will be given favorable consideration. The teaching assignment is open depending on research specialty. Deadline: Feb 1, 2004.

The description had little overlap with what I did. I didn’t (and still don’t) do spatial distributions of anything. I didn’t do (on my own or in collaboration) any work on landscape ecology, applied conservation biology, pollination biology, invasion ecology, or ecosystem ecology or modeling. I didn’t do range shifts at any scale. My work was in the realm of understanding how global change impacts communities and I do compare community structure and dynamics across space and time. I did feel like I could make an argument that I integrated organismal level mechanisms with higher levels of biological organization. I also felt like my research was well suited for collaborating broadly with people doing landscape ecology, conservation biology, etc. In short, this job ad was clearly not a perfect match for me but I felt like I could make an argument that I fit pieces of what they were looking for.  I decided to go ahead and apply. My references sent in my letters of recommendation. But as the deadline approached, I had a serious case of imposter syndrome. I wasn’t a perfect fit for that job ad and all the rejections were really damaging my limited self-esteem. Why apply for something that was pretty much guaranteed to give me yet another rejection? So I didn’t send in my application. You heard me, my letters went in, but I did not apply.

A little while later, one of my references contacted me. Someone at USU had contacted them because my application was missing and they were worried it had gotten lost in transit. I muttered something about that being strange and assured my letter writer that I would get on that. I was too embarrassed to admit I hadn’t sent it (this is the part that only one other person knew). So I sent it immediately so I could say I did.

How did it end?  I am now a tenured faculty member at Utah State University.

There are a couple of morals to this tale:

1)      By not sending in my application, I was rejecting myself for that job. Plain and simple. End of story. By rejecting myself, I almost caused myself to lose a job.

2)      The job ad doesn’t have to be a perfect match for you to be a good match for the department. You’d think the job ad accurately described what a department was looking for, but a department is not a monolithic entity. Ever leave a committee meeting frustrated by the conflicting advice you received on your proposal/coursework plan/thesis? Imagine writing something that incorporated the impassioned feedback from 20+ committee members. That’s the job ad. Since the job ad is imperfect, this means your perfect (or imperfect) fit with it is an unreliable indicator of whether or not you should apply. Do you see something in the ad that reflects what you do? Is it a job you’re interested in? Then apply.

When I give this advice to students, I often hear, “But won’t they get mad/irritated/think poorly of me for wasting their time?” Maybe, but they won’t remember you. I have been on a number of search committees. There are always applications that have no relevance to the job description. I’ve seen medical cellular-molecular types applying for organismal evolutionary positions and landscape hydrologists applying for wildlife animal population ecology positions. I can’t remember the names of any of them. The search committee may giggle, but they’ll never remember it was you.

And, just to be clear, I’m not advocating completely ignoring the job ad. Even though no one would classify me as a spatial ecologist, there were definitely aspects of that job ad that fit me. I’m just saying: don’t be scared off by an imperfect fit.

The “2+n Body” Problem: Sabbatical planning with kids

If you follow Ethan (@ethanwhite) or me (@skmorgane) on Twitter, you are probably aware that we are on sabbatical right now out at the University of North Carolina Chapel Hill (go, tarheels!) along with our pre-school aged daughter. Anyway, a tweet of mine recently elicited a post request:

 

 

So, as requested, here are some things to think about when you have 2 academics plus child(ren). This is just my experience, so hopefully others will pop up with their own.

1)      Check institutional rules: I received tenure a couple years before Ethan. So for us to take sabbatical together, I needed to delay my sabbatical until he was eligible. I have heard that not all institutions allow you to do this. I also know that there are some departments that don’t guarantee that academic spouses can take sabbatical at the same time. Neither of these applied to us, but you should check your faculty code and talk to your Department Head. Do this well in advance.

2)      Picking a spot: This is the chance for both of you to recharge, explore new ideas, and gear up for another productive 7 years. It’s important that you find someplace that will work for both of you. We initially planned on going to Europe for sabbatical, but places that were awesome for one of us had fewer opportunities for the other. That’s part of the reason we ended up in the Research Triangle. We both have a colleague we really like working with here and there are lots of great people in both of our areas of interest.

3)      Finding a Daycare: Many universities have websites to help their faculty find quality daycare. Website quality varies, so check out the daycare websites for all the higher ed institutions in your sabbatical area. I used Duke’s very helpful website that told me about the North Carolina accrediting system and provided basic information about daycares in the area that they had partnered with. I used it and the State of North Carolina’s childcare facility search site to generate a list of daycares that fit our search criteria (type of daycare, location, accreditation, costs). I also pored over reviews in Yelp and Angie’s List and scoured daycare websites to generate a list of about 10 places. We applied to all of them. Most of them either never got back to us or told us we had little chance of getting in. But we did have 3 that offered us a slot and a choice is a nice thing to have.

4)      Daycare timing – contacting: Honestly, we probably waited too long on this. For an August arrival, we started contacting daycares in June. We’re happy with the daycare we got into, but I probably should have started the process earlier so that we were further up in the waiting lists.

5)      Daycare timing – start: The other thing to consider is when your child will start relative to when you will be arriving. We had two and a half weeks between when we arrived and the earliest date our daughter could start. Because we were juggling a variety of things (renters moving into our house, going to ESA, grant deadlines), we didn’t have a lot of ability to either shorten or lengthen that. But I think it worked well. Our daughter had time to get comfortable with her new home before also being tossed into a new daycare situation. Besides, it gave us time to actually visit the new daycare before she started to make sure all of us were happy with it.

6)      Housing: Colleagues at UNC forwarded us some sabbatical home ads, but we eventually found one we liked on SabbaticalHomes.com.  Many of the homes are furnished and being rented by other academics. We had our good friend (and sabbatical host) check out our home to make sure it wasn’t next door to a crack den. If you aren’t fortunate enough to have someone to ask, then there are some resources on the web that can help you figure out the crime rate of your prospective neighborhood, though you may need to subscribe to get access to fine scale info (e.g., neighborhoodscout.com).

So, in < 1000 words, that’s what I could think to convey. We also spent a lot of time selling the positives of the “really big trip” and making sure she understood that we would be coming home at the end. I suspect there are additional complexities when kids are K-12, but hopefully someone will comment with advice. Finally, this post is aimed at people moving for their sabbatical. In this modern world, moving to a different location may not be feasible for a long list of reasons. If someone has advice for making the most of conducting sabbatical at your home institution, I suspect it would find an eager audience.

Happy planning and feel free to leave a comment if I didn’t cover something you wanted to hear about or you have stuff to add!

EcoBloggers: The ecology blog aggregator

Screenshot of EcoBloggers website

EcoBloggers is a relatively new blog aggregator started by the awesome International Network of Next-Generation Ecologists (INNGE). Blog aggregators pull together posts from a number of related blogs to provide a one stop shop for folks interested in that topic. The most famous example of a blog aggregator in science is probably Research Blogging. I’m a big fan of EcoBloggers for three related reasons.

  1. It provides easy access to the conversations going on in the ecology blogosphere for folks who don’t have a well organized system for keeping up with blogs. If your only approach to keeping up with blogs is to check them yourself via your browser when you have a few spare minutes (or when you’re procrastinating on writing that next paper or grant) it really helps if you don’t have to remember to check a dozen or more sites, especially since some of those sites won’t post particularly frequently. Just checking EcoBloggers can quickly let you see what everyone’s been talking about over the last few days or weeks. Of course, I would really recommend using a feed reader both for tracking blogs and journal tables of contents, but lots of folks aren’t going to do that and blog aggregators are the next best thing.
  2. EcoBloggers helps new blogs, blogs with smaller audiences, and those that don’t post frequently, reach the broader community of ecologists. This is important for building a strong ecological blogging community by keeping lots of bloggers engaged and participating in the conversation.
  3. It helps expose readers to the breadth of conversations happening across ecology. This helps us remember that not everyone thinks like us or is interested in exactly the same things.

The site is also nicely implemented so that it respects the original sources of the content:

  1. It’s opt-in
  2. Each post lists the name of the originating blog and the original author
  3. All links take you to the original source
  4. It aggregates using RSS feeds, so you can set your site so that only partial articles show up on EcoBloggers (of course this requires you to ignore my advice on providing full feeds)

Are there any downsides to having your blog on EcoBloggers? I don’t think so. The one issue that might be raised is that if someone reads your article on EcoBloggers, then they may not actually visit your site and your stats could end up being lower than they would have otherwise. If any of the ecology blogs were making a lot of money off of advertising I could see this being an issue, but they aren’t. We’re presumably all here to engage in scientific dialogue and to communicate our ideas as broadly as possible. This is only aided by participating in an aggregator because your writing will reach more people than it would otherwise.

So, check out EcoBloggers, use it to keep up with what’s going on in the ecology blogosphere, and sign up your blog today.

UPDATE: According to a short chat on Twitter, EcoBloggers will soon be automatically shortening the posts on their site even if your blog is providing full feeds. This means that if you didn’t buy my arguments above and were worried about losing page views, there’s nothing to worry about. If the first paragraph or so of your post is interesting enough to get people hooked they’ll have to come over to your blog to read the rest.

An open letter to Ecology Letters and the British Ecological Society about preprints

Dear Ecology Letters and the British Ecological Society,

I am writing to ask that you support the scientific good by allowing the submission of papers that have been posted as preprints. I or my colleagues have reached out to you before without success, but I have heard through various grapevines that both of you are discussing this possibility and I want to encourage you to move forward with allowing this important practice.

The benefits of preprints to science are substantial. They include:

  1. More rapid communication and discussion of important scientific results
  2. Improved quality of published research by allowing for more extensive pre-publication peer review
  3. A fair mechanism for establishing precedence that is not contingent on the idiosyncrasies of formal peer review
  4. A way for early-career scientists to demonstrate productivity and impact on a time scale that matches their need to apply for postdoctoral fellowships and jobs

I am writing to you specifically because your journals represent the major stumbling block for those of us interested in improving science by posting preprints. Your journals either explicitly do not allow the submission of papers that have preprints posted online or lack explicit statements that it is OK to do so. This means that if there is any possibility of eventually submitting a paper to one of these journals then researchers must avoid posting preprints.

The standard justification that journals give for not allowing preprints is that they constitute “prior publication”. However, this is not an issue for two reasons. First, preprints are not peer reviewed. They are the equivalent of a long established practice in biology of sending manuscripts to colleagues for friendly review and to make them aware of cutting edge work. They simply take advantage of the internet to scale this to larger numbers of colleagues. Second, the vast majority of publication outlets do not believe that preprints represent prior publication, and therefore the publication ethics of the broader field of academic publishing clearly allows this. In particular Science, Nature, PNAS, the Ecological Society of America, the Royal Society, Springer, and Elsevier all generally allow the posting of preprints. Nature even wrote about this policy nearly a decade ago stating that:

Nature never wishes to stand in the way of communication between researchers. We seek rather to add value for authors and the community at large in our peer review, selection and editing… Communication between researchers includes not only conferences but also preprint servers… As first stated in an editorial in 1997, and since then in our Guide to Authors, if scientists wish to display drafts of their research papers on an established preprint server before or during submission to Nature or any Nature journal, that’s fine by us.

If you’d like to learn more about the value of preprints, and see explanations of why some of the other common concerns about preprints are unjustified, some colleagues and I have published a paper on The Case for Open Preprints in Biology.

So, I am asking that for the good of science, and to bring your journals in line with widely accepted publication practices, that you please move quickly to explicitly allow the submission of papers that have been posted as preprints.

Regards,
Ethan White

Going big with data in ecology

A friend of mine once joked that doing ecological informatics meant working with data that was big enough that you couldn’t open it in an Excel spreadsheet. At the time (~6 years ago) that meant a little over 64,000 rows in a table. Times have changed a bit since then. We now talk about “big data” instead of “informatics”, Excel can open a table with a little over 1,000,000 rows of data, and most importantly there is an ever increasing amount of publicly available ecological, evolutionary, and environmental data that we can use for tackling ecological questions.

I’ve been into using relatively big data since I entered graduate school in the late 1990s. My dissertation combined analyses of the Breeding Bird Survey of North America (several thousand sites) with the assembly of hundreds of other databases to understand how patterns varied across ecosystems and taxonomic groups.

One of the reasons that I like using large amounts of data is that it has the potential to give us general answers to ecological questions quickly. The typical development of an ecological idea over the last few decades can generally be characterized as:

  1. Come up with an idea
  2. Test it with one or a few populations, communities, etc.
  3. Publish (a few years ago this would often come even before Step 2)
  4. In a year or two test it again with a few more populations, communities, etc.
  5. Either find agreement with the original study or find a difference
  6. Debate generality vs. specificity
  7. Lather, rinse, repeat

After a few rounds of this, taking roughly a decade, we gradually started to have a rough idea of whether the initial result was general and if not how it varied among ecosystems, taxonomic groups, regions, etc.

This is fine, and in cases where new data must be generated to address the question this is pretty much what we have to do, but wouldn’t it be better if we could ask and answer the question more definitively with the first paper? This would allow us to make more rapid progress as a science because instead of repeatedly testing and reevaluating the original analysis we would be moving forward and building on the known results. And even if it still takes time to get to this stage, as with meta-analyses that build on decades of individual tests, using all of the available data still provides us with a general answer that is clearer and more (or at least differently) informative than simply reading the results of dozens of similar papers.

So, to put it simply, one of the benefits of using “big data” is to get the most general answer possible to the question of interest.

Now, it’s clear that this idea doesn’t sit well with some folks. Common responses to the use of large datasets (or compilations of small ones) include concerns about the quality of large datasets or the ability of individuals who haven’t collected the data to fully understand it. My impression is that these concerns stem from a tendency to associate “best” with “most precise”. My personal take is that being precise is only half of the problem. If I collect the best dataset imaginable for characterizing pattern/process X, but it only provides me with information on a single taxonomic group at a single site, then, while I can have a lot of confidence in my results, I have no idea whether or not my results apply beyond my particular system. So, precision is great, but so is getting generalizable results, and these two things trade off against one another.

Which leads me to what I increasingly consider to be the ideal scenario for areas of ecological research where some large datasets (either inherently large or assembled from lots of small datasets) can be applied to the question of interest. I think the ideal scenario is a combination of “high quality” and “big” data. By analyzing these two sets of data separately, and determining if the results are consistent we can have the maximum confidence in our understanding of the pattern/process. This is of course not trivial to do. First it requires a clear idea of what is high quality for a particular question and what isn’t. In my experience folks rarely agree on this (which is why I built the Ecological Data Wiki). Second, it further increases the amount of time, effort, and knowledge that goes into the ideal study, and finding the resources to identify and combine these two kinds of data will not be easy. But, if we can do this (and I think I remember seeing it done well in some recent ecological meta-analyses that I can’t seem to find at the moment) then we will have the best possible answer to an ecological question.
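As a very rough sketch of what "analyze the two sets of data separately and see if the results agree" could look like in practice, here is a toy Python example. Everything in it is an assumption for illustration: the numbers are made up, the relationship of interest is reduced to a single regression slope, and comparing slopes against a fixed tolerance is a crude stand-in for a proper statistical comparison.

```python
def ols_slope(xs, ys):
    """Ordinary least squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def results_consistent(high_quality, big, tolerance=0.1):
    """Fit each dataset separately and report whether the slopes agree.

    Each dataset is an (xs, ys) pair. The tolerance is an arbitrary,
    illustrative threshold, not a recommended value.
    """
    slope_hq = ols_slope(*high_quality)
    slope_big = ols_slope(*big)
    return abs(slope_hq - slope_big) <= tolerance, slope_hq, slope_big


# Made-up toy data: a small, precise dataset and a larger, noisier one.
high_quality = ([1, 2, 3, 4], [2, 4, 6, 8])
big = ([1, 2, 3, 4, 5, 6], [2.1, 3.9, 6.2, 7.8, 10.1, 11.9])

agree, slope_hq, slope_big = results_consistent(high_quality, big)
print(agree, slope_hq, slope_big)
```

If the two fits disagree, that is a signal to look harder at either the quality of the big data or the generality of the high quality data before drawing conclusions.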

Four basic skill areas for a macroecologist [Guest post]

This is a guest post by Elita Baldridge (@elitabaldridge), a graduate student in Ethan White’s lab in the Ecology Center at Utah State University.

As a budding macroecologist, I have thought a lot about what skills I need to acquire during my Ph.D. This is my model of the four basic attributes for a macroecologist, although I think it is more generally applicable to many ecologists as well:

  • Data
  • Statistics
  • Math
  • Programming

Data:

  • Knowledge of SQL
  • Dealing with proper database format and structure
  • Finding data
  • Appropriate treatments of data
  • Understanding what good data are

Statistics:

  • Bayesian
  • Monte Carlo methods
  • Maximum likelihood methods
  • Power analysis
  • etc.

Math:

  • Higher level calculus
  • Should be able to derive analytical solutions for problems
  • Modelling

Programming:

  • Should be able to write programs for analysis, not just simple statistics and simple graphs.
  • Able to use version control
  • Once you can program in one language, you should be able to program in other languages without much effort, but should be fluent in at least one language.

General recommendations:

Achieve expertise in at least 2 out of the 4 basic areas, but be able to communicate with people who have skills in the other areas.  However, if you are good at collaboration and come up with really good questions, you can make up for skill deficiencies by collaborating with others who possess those skills.  Start with smaller collaborations with the people in your lab, then expand outside your lab or increase the number of collaborators as your collaboration skills improve.

Gaining skills:

Achieving proficiency in an area is best done by using it for a project that you are interested in.  The more you struggle with something, the better you understand it eventually, so working on a project is a better way to learn than trying to learn by completing exercises.

The attribute should be generalizable to other problems:  For example, if you need to learn maximum likelihood for your project, you should understand how to apply it to other questions.  If you need to run an SQL query to get data from one database, you should understand how to write an SQL query to get data from a different database.
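As a minimal sketch of what generalizing an SQL query can look like, the Python example below builds two small in-memory SQLite databases and runs the same kind of aggregate query against each. The table names, column names, and values (`surveys`, `counts`, and so on) are hypothetical and are not taken from any real dataset. Only the names change between the two databases; the structure of the query stays the same.

```python
import sqlite3


def total_by_species(conn, table, species_col, count_col):
    """Sum a count column per species.

    Table and column names are interpolated directly into the query,
    so they must come from trusted code and never from user input.
    """
    query = (
        f"SELECT {species_col}, SUM({count_col}) "
        f"FROM {table} GROUP BY {species_col} ORDER BY {species_col}"
    )
    return conn.execute(query).fetchall()


# Hypothetical database 1: a rodent survey with one row per trapping event.
rodents = sqlite3.connect(":memory:")
rodents.execute("CREATE TABLE surveys (species TEXT, n_caught INTEGER)")
rodents.executemany(
    "INSERT INTO surveys VALUES (?, ?)",
    [("DM", 3), ("PP", 5), ("DM", 2)],
)

# Hypothetical database 2: a bird survey with different table and column names.
birds = sqlite3.connect(":memory:")
birds.execute("CREATE TABLE counts (aou_code TEXT, abundance INTEGER)")
birds.executemany(
    "INSERT INTO counts VALUES (?, ?)",
    [("AMRO", 10), ("NOCA", 4), ("AMRO", 1)],
)

rodent_totals = total_by_species(rodents, "surveys", "species", "n_caught")
bird_totals = total_by_species(birds, "counts", "aou_code", "abundance")
print(rodent_totals)
print(bird_totals)
```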

In graduate school:

Someone who wants to compile their own data or work with existing data sets needs to develop a good intuitive feel for data; even if they cannot write SQL code, they need to understand what good and bad databases look like and develop a good sense for questionable data, and how known issues with data could affect the appropriateness of data for a given question. The data skill is also useful if a student is collecting field data, because a little bit of thought before data collection goes a long way toward preventing problems later on.

A student who is getting a terminal master’s and is planning on using pre-existing data should probably focus on the data skill (because working with data is a highly marketable skill, and understanding data prevents major mistakes).  If the data are not coming from a central database like the Breeding Bird Survey (BBS), where the quality of the data is known, additional time will be needed to compile the data, clean them, fill holes in them, and figure out whether they can be used responsibly.

Master’s students who want to go on for a Ph.D. should decide what questions they are interested in and try to pick a project that focuses on learning a skill that will give them a head start.  Which skill depends on the direction of the work: more empirical (programming or stats), more theoretical (math), or more applied (math, e.g., for developing models; stats, e.g., for applying and evaluating pre-existing models; or programming, e.g., for making tools for people to use).

Ph.D. students need to figure out what types of questions they are interested in, and learn those skills that will allow them to answer those questions.  Don’t learn a skill because it is trendy or you think it will help you get a job later if you don’t actually want to use that skill.  Conversely, don’t shy away from learning a skill if it is essential for you to pursue the questions you are interested in.

Right now, as a Ph.D. student, I am specializing in data and programming.  I speak enough math and stats that I can communicate with other scientists and learn the specific analytical techniques I need for a given project.  For my interests (testing questions with large datasets), I think that by the time I am done with my Ph.D., I will have the skills I need to be fairly independent with my research.

Open talks and posters from Weecology at #ESA2013

We had a great time at ESA this year and enjoyed getting to interact with lots of both old and new friends and colleagues. Since we’re pretty into open science here at Weecology, it’s probably not surprising that we have a lot of slides (and even scripts) from our many and varied talks and posters posted online, and we thought it might be helpful to aggregate them all in one place. Enjoy.

Thanks to Dan McGlinn for help assembling the links.

Ignite Talk: Why constraint based approaches to ecology?

Slides and script from Morgan Ernest’s Ignite talk on Why constraint based approaches to ecology from Elita Baldridge and Ethan White’s thought provoking ESA 2013 session on Constraints in Ecology. Slides are also archived on figshare.

Slide 1-3:

As this coral reef food web so aptly demonstrates, nature is complex. There is a dizzying array of diversity across species, across ecosystems and even across individuals within species and ecosystems.

To grapple with this complexity, ecology has a long tradition of using a reductionist approach. The hallmark of this approach is the belief that if we can understand the dynamics of each of the pieces of this complicated machinery, then we can understand how the machine as a whole works.

But as we delve into these complex systems, we have generally found that this is easier said than done. This food web only documents the direct trophic interactions.

Slide 4:

If we wanted to completely model this ecosystem, then on top of this feeding network, we would need to add population regulation for each of the species, competitive interactions, predator/prey relationships, mutualisms, indirect interactions, abiotic dynamics….

Slide 5:

But say we do this. Say we completely model all the biotic interactions in an ecosystem. What then? Well, we generally have to start all over again if we want to study a completely different coral reef, much less a desert or grassland or tropical forest.

Slide 6:

What if, instead of modelling all the complexity, it was possible to distill a complicated system down to a few core principles? Principles that constrain the possible states that an ecosystem can even exhibit? What if those constraints not only limit the possible diversity we have to think about, but actually help us better understand how and why a system can seem to break those constraints?

Slide 7:

So let’s talk briefly about how a constraint works. In biology, a common constraint arises because something important is finite. The trick with something finite is that it sets a constraint on how much is available to be used.

Slide 8:

There are lots of things in biology that are finite. Unless you have a time machine or an ability to defy the laws of thermodynamics, typically time and resources are finite. Amount constraints are important because they often limit what an individual can do or the productivity of ecosystems.

Slide 9:

If there is an amount constraint it will often also give rise to a partitioning constraint. As your finite amount of stuff gets allocated, it’s not available for other uses. To allocate more time to knitting, I’d have to take time away from other activities.

Slide 10:

The only way to increase investment in one activity w/o negatively impacting another is by increasing the size of the pie. Sadly, my time machine has been acting up lately.

Slide 11:

Biologically, this holds at both the individual and the ecosystem levels. For an organism, investment in reproduction may reduce resources available for other functions. At the ecosystem level, resources used by one species aren’t available for other species to use.

Slide 12-13:

So, how can thinking about a constraint change your view of your favorite system? Here’s an example from my favorite system involving desert rodents. In 2003, Ethan White, Kate Thibault and I began to wonder why our long-term field site was always setting new records for rodent abundance. When we plotted the data it was clear that the number of rodents caught in a year has been increasing since the study started in 1977.

Slide 14:

But if we try to estimate the amount of resources being used by the community, by summing an index of metabolic rate across all individuals, it hasn’t been increasing at all. This suggests that somehow the community is violating what I just told you. They are supporting more individuals on the same amount of resources.

Slide 15-16:

The answer goes back to the pie. When you divide up the same pie into smaller pieces, the pie can support more slices. In our case, the community has shifted from large to small species. Small individuals require fewer resources per individual than large individuals…hence more individuals on the same amount of resource.
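The arithmetic behind the pie can be sketched in a few lines of Python. This is an illustration, not the analysis from the study: the body masses and numbers of individuals are made up, and the index of metabolic rate is assumed to scale as body mass to the 3/4 power, a common allometric approximation that the talk itself does not specify.

```python
def metabolic_index(mass_g, exponent=0.75):
    """Index of individual metabolic rate.

    The 3/4 exponent is an assumed allometric scaling, not a value from the talk.
    """
    return mass_g ** exponent


def community_energy_use(masses_g):
    """Sum the metabolic index across all individuals in the community."""
    return sum(metabolic_index(m) for m in masses_g)


# Made-up communities: a few large individuals versus many small ones.
large_bodied = [81.0] * 8     # eight individuals of 81 g each
small_bodied = [16.0] * 27    # twenty-seven individuals of 16 g each

energy_large = community_energy_use(large_bodied)
energy_small = community_energy_use(small_bodied)

print("large-bodied community:", len(large_bodied), "individuals, energy index", energy_large)
print("small-bodied community:", len(small_bodied), "individuals, energy index", energy_small)
```

Under these assumptions the two communities use essentially the same total energy, but the small-bodied one holds more than three times as many individuals.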

Slide 17:

One of the cool things about constraints is that in biology they’re kinda like the pirate code in Pirates of the Caribbean: they are guidelines that evolution cleverly tries to get around. Need more nitrogen? Make friends w/ a microbe that can fix atmospheric nitrogen for you. Need more resources to devote to reproduction? Convince your relatives to help out.

Slide 18:

And that’s why I really love thinking about constraints in biology. They can really help us do two things: take a bewildering array of complexity and provide an ordered expectation of how the world should look.

Slide 19:

And by understanding how the world should look, it helps us better understand and examine those individuals, species, or ecosystems that seem to be doing things a little differently. It is those who do things differently that can provide us with the best insights into cool biology.

Slide 20:

The talks that come after me will explain different types of constraints and the cool questions that understanding those constraints has allowed the speakers to ask. And hopefully by the end of this session we will have convinced you to start looking at your system through constraint-based eyes and see if cool new questions pop out at you too!

Ignite Talk: Big Data in Ecology

Slides and script from Ethan White’s Ignite talk on Big Data in Ecology from Sandra Chung and Jacquelyn Gill’s excellent ESA 2013 session on Sharing Makes Science Better. Slides are also archived on figshare.

Title slide

1.  I’m here to talk to you about the use of big data in ecology and to help motivate a lot of the great tools and approaches that other folks will talk about later in the session.

Photos of field work

2.  The definition of big is of course relative, and so when we talk about big data in ecology we typically mean big relative to our standard approaches based on observations and experiments conducted by single investigators or small teams.

Image of Microsoft Excel

3.  And for those of you who prefer a more precise definition, my friend Michael Weiser defines big data and ecoinformatics as involving anything that can’t be successfully opened in Microsoft Excel.

Map of Breeding Bird Survey

4.  Data can be of unusually large size in two ways. It can be inherently large, like citizen science efforts such as the Breeding Bird Survey, where large amounts of data are collected in a consistent manner.

Images of Dryad, figshare, and Ecological Archives

5.  Or it can be large because it’s composed of a large number of small datasets that are compiled from sources like Dryad, figshare, and Ecological Archives to form useful compilation datasets for analysis.
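As a rough sketch of what compiling small datasets involves, the Python example below merges three tiny made-up tables into one compilation with a shared set of column names. The dataset names, column names, and values are all hypothetical stand-ins for files that might be downloaded from data repositories, and the column mapping is written by hand, which is often where much of the real effort goes.

```python
import csv
import io

# Three made-up small datasets, each using its own column names
# for the same kinds of information.
SMALL_DATASETS = {
    "study_a": "site,species,count\nA1,DM,3\nA1,PP,5\n",
    "study_b": "plot,taxon,abundance\nB7,DM,2\n",
    "study_c": "location,sp,n\nC2,PB,9\nC3,DM,1\n",
}

# Hypothetical hand-written mapping from each dataset's columns to a shared schema.
COLUMN_MAPS = {
    "study_a": {"site": "site", "species": "species", "count": "count"},
    "study_b": {"plot": "site", "taxon": "species", "abundance": "count"},
    "study_c": {"location": "site", "sp": "species", "n": "count"},
}


def compile_datasets(raw_texts, column_maps):
    """Combine small CSV datasets into one list of records with shared column names."""
    compiled = []
    for source, text in raw_texts.items():
        mapping = column_maps[source]
        for row in csv.DictReader(io.StringIO(text)):
            record = {mapping[old]: value for old, value in row.items()}
            record["count"] = int(record["count"])
            record["source"] = source  # keep track of where each record came from
            compiled.append(record)
    return compiled


compiled = compile_datasets(SMALL_DATASETS, COLUMN_MAPS)
for record in compiled:
    print(record)
```

Keeping a `source` field on every record means any result from the compilation can be traced back to the dataset it came from.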

Dataset logos

6.  We have increasing amounts of both kinds of data in ecology as a result of both major data collection efforts and an increased emphasis on sharing data.

Maps and quote about large scale ecology from NEON

7-8.  But what does this kind of data buy us? First, big data allows us to work at scales beyond those at which traditional approaches are typically feasible. This is critical because many of the most pressing issues in ecology including climate change, biodiversity, and invasive species operate at broad spatial and long temporal scales.

Map and results of general analysis

9-10.  Second, big data allows us to answer questions in general ways, so that we get the answer today instead of waiting a decade to gradually compile enough results to reach consensus. We can do this by testing theories using large amounts of data from across ecosystems and taxonomic groups, so that we know that our results are general, and not specific to a single system (e.g., White et al. 2012).

The most interesting man in the world says: I don't always analyze data, but when I do, I prefer a lot of it

11. This is the promise of big data in ecology, but realizing this potential is difficult because working with either truly big data or data compilations is inherently challenging, and we still lack sufficient data to answer many important questions.

Bullet points: 1. Training, 2. Tools, 3. More data.

12. This means that if we are going to take full advantage of big data in ecology we need 3 things: training in computational methods for ecologists, tools to make it easier to work with existing data, and more data.

Logos of groups running training initiatives

13. We need to train ecologists in the computational tools needed for working with big data, and there are an increasing number of efforts to do this including Software Carpentry (which I’m actively involved in) as well as training initiatives at many of the data and synthesis centers.

Logos for DataONE, Dryad, NEON, Morpho, and DataUP

14. We need systems for storing, distributing, and searching data like DataONE, Dryad, NEON’s data portal, as well as the standardized metadata and associated tools that make finding data to answer a particular research question easier.

Screenshot of Ecological Data Wiki

15. We need crowd-sourced systems like the Ecological Data Wiki to allow us to work together on improving insufficient metadata and understanding what kinds of analyses are appropriate for different datasets and how to conduct them rigorously.

rOpenSci and EcoData Retriever logos

16. We need tools for quickly and easily accessing data like rOpenSci and the EcoData Retriever so that we can spend our time thinking and analyzing data rather than figuring out how to access it and restructure it.

Map of Life, GBIF, and EcoData Retriever logos

17. We also need systems that help turn small data into big data compilations, whether it be through centralized standardized databases like GBIF or tools that pull data together from disparate sources like Map of Life.

Screen shot of preprint, and Morpho, DataUP, and CC0 logos

18. And finally we need to continue to share more and more data and share it in useful ways, with the good formats, standardized metadata, and open licenses that make it easy to work with.

Dataset logos

19. And so, what I would like to leave you with is that we live in an exciting time in ecology thanks to the generation of large amounts of data by citizen science projects, exciting federal efforts like NEON, and a shift in scientific culture towards sharing data openly.

River Ernest-White saying "Aw Dad, Big Data is such a buzz word"

20. If we can train ecologists to work with and combine existing tools in interesting ways, it will let us combine datasets spanning the surface of the globe and diversity of life to make meaningful predictions about ecological systems.
