EcoBloggers: The ecology blog aggregator
EcoBloggers is a relatively new blog aggregator started by the awesome International Network of Next-Generation Ecologists (INNGE). Blog aggregators pull together posts from a number of related blogs to provide a one stop shop for folks interested in that topic. The most famous example of a blog aggregator in science is probably Research Blogging. I’m a big fan of EcoBloggers for three related reasons.
- It provides easy access to the conversations going on in the ecology blogosphere for folks who don’t have a well organized system for keeping up with blogs. If your only approach to keeping up with blogs is to check them yourself via your browser when you have a few spare minutes (or when you’re procrastinating on writing that next paper or grant) it really helps if you don’t have to remember to check a dozen or more sites, especially since some of those sites won’t post particularly frequently. Just checking EcoBloggers can quickly let you see what everyone’s been talking about over the last few days or weeks. Of course, I would really recommend using a feed reader both for tracking blogs and journal tables of contents, but lots of folks aren’t going to do that and blog aggregators are the next best thing.
- EcoBloggers helps new blogs, blogs with smaller audiences, and those that don’t post frequently, reach the broader community of ecologists. This is important for building a strong ecological blogging community by keeping lots of bloggers engaged and participating in the conversation.
- It helps expose readers to the breadth of conversations happening across ecology. This helps us remember that not everyone thinks like us or is interested in exactly the same things.
The site is also nicely implemented so that it respects the original sources of the content:
- It’s opt-in
- Each post lists the name of the originating blog and the original author
- All links take you to the original source
- It aggregates using RSS feeds, so you can set your site so that only partial articles show up on EcoBloggers (of course this requires you to ignore my advice on providing full feeds)
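For the programmers in the audience, the four points above can be sketched in a few lines of Python. This is only an illustration of the general idea, not how EcoBloggers is actually built: the `Post` fields, the `aggregate` and `excerpt` functions, and the 50 word cutoff are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Post:
    blog: str        # name of the originating blog
    author: str      # original author
    url: str         # link back to the original post
    published: date
    body: str


def excerpt(body: str, max_words: int = 50) -> str:
    """Return the first max_words words, marking any truncation with ' ...'."""
    words = body.split()
    if len(words) <= max_words:
        return body
    return " ".join(words[:max_words]) + " ..."


def aggregate(feeds: dict, opted_in: set, max_words: int = 50) -> list:
    """Merge posts from opted-in blogs, newest first, keeping attribution."""
    # Opt-in: blogs that have not signed up are skipped entirely.
    posts = [p for blog, items in feeds.items() if blog in opted_in for p in items]
    posts.sort(key=lambda p: p.published, reverse=True)
    # Every entry keeps the blog name, the author, and a link to the source.
    return [
        {
            "blog": p.blog,
            "author": p.author,
            "link": p.url,
            "summary": excerpt(p.body, max_words),
        }
        for p in posts
    ]
```

Because each entry carries only a summary and a link, anyone who wants the whole post still ends up at the original blog.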
Are there any downsides to having your blog on EcoBloggers? I don’t think so. The one issue that might be raised is that if someone reads your article on EcoBloggers, then they may not actually visit your site and your stats could end up being lower than they would have otherwise. If any of the ecology blogs were making a lot of money off of advertising I could see this being an issue, but they aren’t. We’re presumably all here to engage in scientific dialogue and to communicate our ideas as broadly as possible. This is only aided by participating in an aggregator because your writing will reach more people than it would otherwise.
So, check out EcoBloggers, use it to keep up with what’s going on in the ecology blogosphere, and sign up your blog today.
UPDATE: According to a short chat on Twitter, EcoBloggers will soon be automatically shortening the posts on their site even if your blog is providing full feeds. This means that if you didn’t buy my arguments above and were worried about losing page views, there’s nothing to worry about. If the first paragraph or so of your post is interesting enough to get people hooked they’ll have to come over to your blog to read the rest.
Some meandering thoughts on the difference between EcologicalData.org and DataONE
In the comments of my post on the Ecological Data Wiki Jarrett Byrnes asked an excellent question:
Very cool. I’m curious, how do you think this will compare/contrast/fight with the Data One project – https://www.dataone.org/ – or is this a different beast altogether?
As I started to answer it I realized that my thoughts on the matter were better served by a full post, both because they are a bit lengthy and because I don’t actually know much about DataONE and would love to have some of their folks come by, correct my mistaken impressions, and just chat about this stuff in general.
To begin with I should say that I’m still trying to figure this out myself, both because I’m still figuring out exactly what DataONE is going to be, and because EcologicalData is still evolving. I think that both projects’ goals could be largely defined as “Organizing Ecology’s Data,” but that’s a pretty difficult task, involving a lot of components and a lot of different ways to tackle them. So, my general perspective is that the more folks we have trying the merrier. I suspect there will be plenty of room for multiple related projects, but I’d be just as happy (even happier probably) if we could eventually find a single centralized location for handling all of this. All I want is a solution to the challenge.
But, to get to the question at hand, here are the differences I see based on my current understanding of DataONE:
1. Approach. There are currently two major paradigms for organizing large amounts of information. The first is to figure out a way to tell computers how to do it for us (e.g., Google), the second is to crowdsource its development and curation (e.g., Wikipedia). DataONE is taking the computer based approach. It’s heavy on metadata, ontologies, etc. The goal is to manage the complexities of ecological data by providing the computer with very detailed descriptions of the data that it can understand. We’re taking the human approach, keeping things simple and trying to leverage the collective knowledge and effort of the field. As part of this difference in approach I suspect that EcologicalData will be much more interactive and community driven (the goal is for the community to actually run the site, just like Wikipedia) whereas DataONE will tend to be more centralized and hierarchical. I honestly couldn’t tell you which will turn out better (perhaps the two approaches will each turn out to be better for different things) but I’m really glad that we’re trying both at the same time to figure out what will work and where their relative strengths might be.
2. Actually serving data. DataONE will do this; we won’t. This is part of the difference in approach. If the computer can handle all of the thinking with respect to the data then you want it to do that and just spit out what you want. Centralizing the distribution of heterogeneous data is a really complicated task and I’m excited the folks at DataONE are tackling the challenge.
a. One of the other challenges for serving data is that you have to get all of the folks who “own” the data to let you provide it. This is one of the reasons I came up with the Data Wiki idea. By serving as a portal it helps circumvent the challenges of getting all of the individual stakeholders to agree to participate.
b. We do provide a tool for data acquisition, the EcoData Retriever, that likewise focuses on circumventing the need to negotiate with data providers by allowing each individual investigator to automatically download the data from the source. But, this just sets up each dataset independently, whereas I’m presuming that DataONE will let you just run one big query of all the data (which I’m totally looking forward to by the way).
3. Focus. The primary motivation behind the Data Wiki goes beyond identifying datasets and really focuses on how you should use them. Having worked with other folks’ data for a number of years I can say that the biggest challenge (for me anyway) is actually figuring out all of the details of when and how the dataset should be used. This isn’t just a question of reading metadata either. It’s a question of integrating thoughts and approaches from across the literature. What I would like to see develop on the Data Wiki pages is a set of concise descriptions of how to go about using these datasets in the best way possible. This is a very difficult task to automate and one where I think a crowdsourced solution is likely the most effective. We haven’t done a great job of this yet, but Allen Hurlbert and I have some plans to develop a couple of good examples early in the fall to help demonstrate the idea.
4. We’re open for business. Ha ha, eat our dust DataONE. But seriously, we’ve taken a super simple approach which means we can get up and running quickly. DataONE is doing something much more complicated and so things may take some time to roll out. I’m hoping to get a better idea of what their time lines look like at ESA. I’m sure their tools will be well worth the wait.
5. Oh, and their budget is a little over $2,000,000/year, which is just slightly larger than our budget of around $5,000/year.
So, there is my lengthy and meandering response to Jarrett’s question. I’m looking forward to chatting with DataONE folks at ESA to find out more about what they are up to, and I’d love to have them stop by here to chat and clear up my presumably numerous misconceptions.
 Though we do have some ideas for managing something somewhat similar, so stay tuned for EcoData Retriever 2.0. Hopefully coming to an internet near you sometime this spring.
Learning to program like a professional using Software Carpentry
An increasingly large number of folks doing research in ecology and other biological disciplines spend a substantial portion of their time writing computer programs to analyze data and simulate the outcomes of biological models. However, most ecologists have little formal training in software development¹. A recent survey suggests that we are not alone, with 96% of scientists reporting that they are mostly self-taught when it comes to writing code. This makes sense because there are only so many hours in the day, and scientists are typically more interested in answering important questions in their field than in sitting through a bachelor’s degree worth of computer science classes. But, it also means that we spend longer than necessary writing our software, it contains more bugs, and it is less useful to other scientists than it could be².
Software Carpentry to the Rescue
Fortunately you don’t need to go back to college and get another degree to substantially improve your knowledge and abilities when it comes to scientific programming, because with a few weeks of hard work Software Carpentry will whip you into shape. Software Carpentry was started back in 1997 to teach scientists “the concepts, skills, and tools they need to use and build software more productively” and it does a great job. The newest version of the course is composed of a combination of video lectures and exercises, and provides quick and to the point information on such critical things as:
along with lots of treatment of best practices for writing code that is clear and easy to read both for other people and for yourself a year from now when you sit down and try to figure out exactly what you did³.
The great thing about Software Carpentry is that it skips over all of the theory and detail that you’d get when taking the relevant courses in computer science and gets straight to the crux – how to use the available tools most effectively to conduct scientific research. This means that in about 40 hours of lecture and 100-200 hours of practice you can be a much, much better programmer who writes code more quickly, with fewer bugs, that can be easily reused. I think of it as boot camp for scientific software development. You won’t be an expert marksman or a black belt in Jiu-Jitsu when you’re finished, but you will know how to fire a gun and throw a punch.
I can say without hesitation that taking this course is one of the most important things I’ve done in terms of tool development in my entire scientific career. If you are going to write more than 100 lines of code per year for your research then you need to either take this course or find someone to offer something equivalent at your university. Watch the lectures, do the exercises, and it will save you time and energy on programming; giving you more of both to dedicate to asking and answering important scientific questions.
¹I took 3 computer science courses in college and I get the impression that that is about 2-3 more courses than most ecologists have taken.
²I don’t know of any data on this, but my impression is that over 90% of code written by ecologists is written by a single individual and never read or used by anyone else. This is in part because we have no culture of writing code in such a way that other people can understand what we’ve done and therefore modify it for their own use.
³I know that I’ve decided that it was easier to “just start from scratch” rather than reusing my own code on more than one occasion. That won’t be happening to me again thanks to Software Carpentry.
Ecology on the Web [Things you should read]
Jarrett Byrnes‘ first turn at the helm of the ESA Bulletin’s Ecology on the Web feature is now up. It’s definitely worth a look. I learned about Scratchpads, a really cool looking project which automatically sets up an “easy to use, social networking application that enable communities of researchers to manage, share and publish taxonomic data online.”
If you’d like to contribute information about your web-based efforts to further the field of ecology, check out Jarrett’s blog post on how to contribute. Thanks Jarrett.
UPDATE: Corrected the link to Ecology on the Web
Thoughts on developing a digital presence
A while ago there was a bit of discussion around the academic blogosphere regarding the importance of developing a digital presence and what the best form of that presence might be. Recently, as I’ve been looking around at academics’ websites as part of the faculty, postdoc and graduate student searches going on in my department/lab, I’ve been reminded of the importance of having a digital presence.
It seems pretty clear to me that the web is the primary source of information acquisition for most academics, at least up through the young associate professors. There are no doubt some senior folk who would still rather have a paper copy of a journal sent to them via snail mail and who rarely open their currently installed copy of Internet Explorer 6, but I would be very surprised if most folks who are evaluating graduate student, postdoctoral and faculty job candidates aren’t dropping the name of the applicant into their favorite search engines and seeing what comes up. They aren’t looking around for dirt like in all those scary news stories that were meant to stop college students from posting drunken photos of themselves on social networking sites. They’re just procrastinating, or rather looking for more information to get a clearer picture of you as a scientist/academic. I also do a quick web search when I meet someone interesting at a conference, get a paper/grant to review with authors I haven’t heard of before, read an interesting study by someone I don’t know, etc. Many folks who apply to join my lab for graduate school find me through the web.
When folks go looking around for you on the web you want them to find something (not finding anything is the digital equivalent of “being a nobody”), and better yet you want them to find something that puts your best foot forward. But what should this be? Should you Tweet, Buzz, be LinkedIn, start a Blog, have a Wiki*, or maybe just get freaked out by all of this technology and move to the wilderness somewhere and never speak to anyone ever again?
I think the answer here is simple: start with a website. This is the simplest way to present yourself to the outside world and you can (and should) start one as soon as you begin graduate school. The website can be very simple. All you need is a homepage of some kind, a page providing more detailed descriptions of your research interests, a CV, a page listing your publications†, and a page with your contact information. Keep this updated and looking decent and you’ll have as good an online presence as most academics.
While putting together your own website might seem a little intimidating it’s actually very easy these days. The simplest approach is to use one of the really easy hosted solutions out there. These include things like Google Sites, which are specifically designed to let you make websites; or you can easily turn a hosted blogging system into a website (WordPress.com is often used for this). There are lots of other good options out there (let us know about your favorites in the comments). In addition many universities have some sort of system set up for letting you easily make websites, just ask around. Alternatively, you can get a static .html based template and then add your own content to it. Open Source Web Design is the best place I’ve found for templates. You can either open up the actual html files or you can use a WYSIWYG editor to replace the sample text with your own content. SeaMonkey is a good option for a WYSIWYG editor. Just ask your IT folks how to get these files up on the web when you’re done.
So, setting up a website is easy, but should you be doing other things as well, and if so what? At the moment I would say that if you’re interested in trying out a new mode of academic communication then you should pick one that sounds like fun to you and give it a try; but this is by no means a necessity as an academic at the moment. If you do try to do some of these other things, then do them in moderation. It’s easy to get caught up in the rapid rewards of finishing a blog post or posting a tweet on Twitter, not to mention keeping up with others’ blogs and tweets, but this stuff can rapidly eat up your day and for the foreseeable future you won’t be getting a job based on your awesome stream of 140 character or less insights.
*Yep, that’s right, it’s a link to the Wikipedia page on wikis.
†And links to copies of them if you are comfortable flouting the absurd copyright/licensing policies of many of the academic publishers (or if you only published in open access journals).
Getting your own domain name: a recommendation, justification, and brief tutorial
I have been very encouraged of late to see more and more ecologists embracing the potential of the web for communication and interaction. I’ve recently blogrolled some graduate student blogs and in the last few weeks I’ve come across American Naturalist’s trial run of a forum system, Ecological Monographs’ blog, and a blog soliciting feedback on a new initiative to digitize existing biological collections.
Journal Article 2.0
Cell Press has recently announced what I consider to be the most interesting advance in journal publishing since articles started being posted online. Basically they have started to harness the power of the web to aggregate the information present in articles in more useful and efficient ways. For example, there is a Data tab for each article that provides an overview of all figures, and large amounts of information on the selected figure including both its caption and the actual context for its citation from the text. Raw data files are also readily accessible from this same screen. References are dynamically expandable to show their context in the text (without refreshing, which is awesome), filterable by year or author, and linked directly to the original publication. You’ll also notice a comments tab where editor moderated comments related to the paper will be posted (showing the kind of integrated commenting system that I expect we will see everywhere eventually).
I have seen a lot of discussion of how the web is going to revolutionize publishing, but to quote one of my favorite movies “Talking ain’t doing.” Cell Press is actually doing.
I’d strongly encourage you to check out their blog post and video and then go play around with one of the articles in the new format. This is really exciting stuff.
Why you should use a feed reader to monitor journal table of contents
A couple of weeks ago we made it possible for folks to subscribe to JE using email. We did this because we realized that many scientists, even those who are otherwise computationally savvy, really haven’t embraced feed readers as a method of tracking information. When I wrote that post I promised to return with an argument for why you should start using a feed reader instead – so here it is. If anyone is interested in a more instructional post about how to do this then let us know in the comments.
The main argument
I’m going to base my argument on something that pretty much all practicing scientists do – keeping track of the current scientific literature by reading Tables of Contents (TOCs). Back in the dark ages the only way to get these TOCs was to either have a personal subscription to the journal or to leave the office and walk the two blocks to the library (I wonder if anyone has done a study on scientists getting fatter now that they don’t have to go to the library anymore). About a decade ago (I’m not really sure when, but this seems like it’s in the right ballpark) journals started offering email subscriptions to their TOCs. Every time a new issue was published you’d receive an email that included the titles and authors of each contribution and links to the papers (once the journal actually had the papers online of course). This made it much easier to keep track of the papers being published in a wide variety of journals by speeding up the process of determining if there was anything of interest in a given issue. While the increase in convenience of using a feed reader may not be on quite the same scale as that generated by the email TOCs, it is still fairly substantial.
The nice thing about feed readers is that they operate one item at a time. So, instead of receiving one email with 10-100 articles in it, you receive 10-100 items in your feed reader. This leads to the largest single advantage of feeds over email for tracking TOCs. You only need to process one article at a time. Just think about the last time you had 5 minutes before lunch and you decided to try to clear an email or two out of your inbox. You probably opened up a TOC email and started going through it top to bottom. If you were really lucky then maybe there were only a dozen papers and none of them were of interest and you could finish going through the email and delete it. Most of the time however there are either too many articles or you want to look at at least one so you go to the website, read the abstract, maybe download the paper, and the next thing you know it’s time for lunch and you haven’t finished going through the table so it continues to sit in your inbox. Then, of course, by the time you get back to it you probably don’t even remember where you left off and you basically have to start back at the beginning again. I don’t know about you but this process typically resulted in my having dozens of emailed TOCs lying around my inbox at any one time.
With a feed reader it’s totally different. If you have five minutes you start going through the posts for individual articles one at a time. In that time you can often clear out 5 or 10 articles (or even 50 if the feed is well tagged like PNAS’s feed), which means that you can use your small chunks of free time much more effectively for keeping up with the literature. In addition, all major feed readers allow you to ‘star’ posts – in other words you can mark them in such a way that you can go back to them later and look at them in more detail. So, instead of the old system where if you were interested in looking at a paper you had to stop going through the table of contents, go to the website, decide from the abstract if you wanted to actually look at the paper, and then either download or print a copy of the paper to look at later, with a feed reader you achieve the same thing with a one second click. This means that you can often go through a fairly large TOC in less than 10 minutes.
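If it helps to see the difference spelled out, here is a toy sketch in Python of the one-item-at-a-time workflow. It is not how any particular feed reader is implemented; `triage`, `is_interesting`, and `budget` are names invented for the example.

```python
from collections import deque


def triage(unread, is_interesting, budget):
    """Process at most `budget` items; return (starred, still_unread)."""
    queue = deque(unread)
    starred = []
    for _ in range(min(budget, len(queue))):
        item = queue.popleft()        # each item is dealt with exactly once
        if is_interesting(item):
            starred.append(item)      # 'star' it to look at in detail later
    # Whatever is left is still in order, so the next session simply
    # picks up where this one stopped.
    return starred, list(queue)


# Five minutes before lunch: get through three of the five new articles.
articles = ["paper A", "paper B", "macroecology paper", "paper C", "paper D"]
starred, remaining = triage(articles, lambda title: "macroecology" in title, budget=3)
```

The contrast with an emailed TOC is that nothing is lost when you stop partway through: the unread items are still waiting and the starred ones are already set aside.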
Of course much of this utility depends on the journals actually providing feeds that include all of the relevant information.
Keeping your TOCs and other feeds outside of your email allows for greater separation of different aspects of online communication. If you monitor your email fairly continuously, the last thing you need is to receive multiple TOC emails each day that could distract you from actually getting work done. Having a separate feed reader lets you actually decide when you want to look at this information (like in those 5 minute gaps before lunch or at the end of the day when you’re too brain dead to do anything else).
Now that journals post many of their articles online as soon as the proofs stage is complete, it can be advantageous to know about these articles as soon as they are available. Most journal feeds do exactly this, posting a few papers at a time as they are uploaded to the online-early site.
Sharing – want to tell your friends about a cool paper you just read? You could copy the link, open a new email, paste the link and then send it on to them. Or, you could accomplish this with a single click (NB: this technology is still developing and varies among feed readers).
And then of course there are blogs
I’ve attempted to appeal to our non-feedreader-readers by focusing on a topic that they can clearly identify with. That said, the world of academic communication is rapidly expanding beyond the walls of the journal article. Blogs play an increasingly important role in scientific discourse and if you’re going to follow blogs you really need a feed reader. Why? Because while some blogs update daily (e.g., most of the blogs over at ScienceBlogs) many good blogs update at an average rate of once a week, or once a month. You don’t want to have to check the webpage of one of these blogs every day just to see if something new has been posted, so subscribe to its feed, kick back, and let the computer tell you what’s going on in the world.
A message to journal editors/managers about RSS feeds
- If you don’t have an easily accessible RSS feed available (and by easily accessible I mean in the browser’s address bar on your journal’s main page) for your journal’s Table of Contents (TOCs), there is a certain class of readers who will not keep track of your TOCs. This is because receiving this information via email is outdated and inefficient and if you are in the business of content delivery it is, at this point, incompetent for you to not have this option (it’s kind of like not having a website 10 years ago).
- If, for some technophobic reason, you refuse to have an RSS feed, then please, pretty please with sugar on top, don’t hide the ability to subscribe to the TOCs behind a username/password wall. All you need is a box for people to add their email addresses to for subscribing and a prominent unsubscribe link in the emails (if you are really paranoid you can add a confirmation email with a link that needs to be followed to confirm the subscription).
- Most importantly. Please, for the love of all that is good and right in the world, DO NOT START AN RSS FEED AND THEN STOP UPDATING IT. Those individuals who track a large number of feeds in their feed readers will not notice that you stopped updating your feed for quite some time. You are losing readers when you do this.
- If you have an RSS feed that is easily accessible (congratulations, you’re ahead of many Elsevier journals) please try to maximize the amount of information it provides. There are three critical pieces of information that should be included in every TOCs feed:
- The title (you all manage to do this one OK)
- All of the authors’ names. Not just the first author. Not just the first and last author. All of the authors. Seriously, part of the decision making process when it comes to choosing whether or not to take a closer look at a paper is who the authors are. So, if you want to maximize the readership of papers, include all of the authors’ names in the RSS feed.
- The abstract. I cannot fathom why you would exclude the abstract from your feed, other than to generate click throughs to your website. Since those of you doing this (yes, Ecology, I’m talking about you) aren’t running advertising, this isn’t a good reason, since you can communicate the information just as well in the feed (and if you’re using website visits as some kind of metric, don’t worry, you can easily track how many people are subscribed to your feed as well).
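To make the three requirements concrete, here is a rough Python sketch that checks each item in a TOC feed for a title, a full author list, and an abstract. The sample feed, the journal name, and the `check_items` function are all made up for illustration. It also assumes authors are given as repeated Dublin Core `dc:creator` elements and the abstract as the item's `description`, which is one common convention but not something every journal follows.

```python
import xml.etree.ElementTree as ET

# Dublin Core namespace, assumed here to be how the feed lists authors.
DC = "{http://purl.org/dc/elements/1.1/}"

# A made-up feed: the first item has everything, the second is title-only.
SAMPLE = """<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Hypothetical Journal of Ecology</title>
    <item>
      <title>An example paper title</title>
      <link>https://example.org/papers/1</link>
      <dc:creator>A. Author</dc:creator>
      <dc:creator>B. Author</dc:creator>
      <dc:creator>C. Author</dc:creator>
      <description>The abstract text goes here.</description>
    </item>
    <item>
      <title>A paper with a sparse feed entry</title>
      <link>https://example.org/papers/2</link>
    </item>
  </channel>
</rss>"""


def check_items(rss_text):
    """Report, for each item, which of title/authors/abstract are missing."""
    root = ET.fromstring(rss_text)
    report = []
    for item in root.iter("item"):
        title = (item.findtext("title") or "").strip()
        creators = [(c.text or "").strip() for c in item.findall(DC + "creator")]
        authors = [name for name in creators if name]
        abstract = (item.findtext("description") or "").strip()
        fields = (("title", title), ("authors", authors), ("abstract", abstract))
        missing = [name for name, value in fields if not value]
        report.append({"title": title, "authors": authors, "missing": missing})
    return report


report = check_items(SAMPLE)
```

A feed whose items all come back with an empty `missing` list gives readers everything they need to decide, from inside the feed reader, whether a paper is worth a closer look.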
If this seems a bit harsh, whiny, etc., then keep this in mind. In the last month I had over 1000 new publications come through my feed reader and another 100 or so in email tables of contents. This is an incredible amount of material just to process, let alone read. If journals want readers to pay attention to their papers it is incumbent upon them to make it as easy as possible to sort through this deluge of information and allow their readership to quickly and easily identify papers of interest. Journals that don’t do this are hurting themselves as well as their readers.