Advertisements for three exciting postdoctoral positions came out in the last week.
Interface between ecology, evolution and mathematics
The first is with Hélène Morlon’s group in Paris. Hélène and I were postdocs in Jessica Green’s lab at the same time. She is both very smart and extremely nice, oh, and did I mention, her lab is in PARIS. Here’s the ad. If it’s a good fit then you couldn’t go wrong with this postdoc.
A postdoctoral position is available in my new lab at the Ecole Polytechnique and/or at the Museum of Natural History in Paris to work at the interface between ecology-evolution and mathematics. Candidates with a background in biology and a strong interest in modeling, or with a theoretical background and a strong interest in biology, are encouraged to apply. More information is available here. Potential candidates should feel free to contact me. The deadline for application is May 8th.
The other two postdocs are associated with Tim Keitt’s lab (which I consider to be one of the top quantitative ecology groups out there).
Mechanistic niche modeling and climate change impacts
A postdoctoral position is anticipated as part of a collaborative project to develop and evaluate mechanistic niche models that incorporate geographic variation in physiological traits. The post doc will be based in Michael Angilletta’s laboratory at Arizona State University, but will interact with members of Lauren Buckley’s lab at the University of North Carolina in Chapel Hill and Tim Keitt’s lab at the University of Texas in Austin. The post doc will be expected to engage in modeling activities and coordinate lab studies of thermal physiology. Experience with mathematical modeling in C++, MATLAB, Python or R is beneficial and familiarity with environmental data and biophysical ecology is beneficial. More here.
Ecological forecasting or statistical landscape genetics
The Keitt Lab at the University of Texas at Austin seeks a postdoctoral investigator to join an interdisciplinary NSF-funded project linking ecophysiology, genomics and climate change. The position requires excellent modeling skills and the ability to engage in multidisciplinary research. Research areas of interest include either ecological forecasting or statistical landscape genomics. More here.
So, if you’re looking for a job go check out these great opportunities.
An increasingly large number of folks doing research in ecology and other biological disciplines spend a substantial portion of their time writing computer programs to analyze data and simulate the outcomes of biological models. However, most ecologists have little formal training in software development¹. A recent survey suggests that we are not alone, with 96% of scientists reporting that they are mostly self-taught when it comes to writing code. This makes sense because there are only so many hours in the day, and scientists are typically more interested in answering important questions in their field than in sitting through a bachelor’s degree worth of computer science classes. But it also means that we spend longer than necessary writing our software, it contains more bugs, and it is less useful to other scientists than it could be².
Software Carpentry to the Rescue
Fortunately you don’t need to go back to college and get another degree to substantially improve your knowledge and abilities when it comes to scientific programming, because with a few weeks of hard work Software Carpentry will whip you into shape. Software Carpentry was started back in 1997 to teach scientists “the concepts, skills, and tools they need to use and build software more productively” and it does a great job. The newest version of the course is composed of a combination of video lectures and exercises, and provides quick, to-the-point information on such critical things as:
along with lots of treatment of best practices for writing code that is clear and easy to read both for other people and for yourself a year from now when you sit down and try to figure out exactly what you did³.
The great thing about Software Carpentry is that it skips over all of the theory and detail that you’d get when taking the relevant courses in computer science and gets straight to the crux – how to use the available tools most effectively to conduct scientific research. This means that in about 40 hours of lecture and 100-200 hours of practice you can be a much, much better programmer who writes code more quickly, with fewer bugs, that can be easily reused. I think of it as boot camp for scientific software development. You won’t be an expert marksman or a black belt in Jiu-Jitsu when you’re finished, but you will know how to fire a gun and throw a punch.
I can say without hesitation that taking this course is one of the most important things I’ve done in terms of tool development in my entire scientific career. If you are going to write more than 100 lines of code per year for your research then you need to either take this course or find someone to offer something equivalent at your university. Watch the lectures, do the exercises, and it will save you time and energy on programming, giving you more of both to dedicate to asking and answering important scientific questions.
¹I took 3 computer science courses in college and I get the impression that that is about 2-3 more courses than most ecologists have taken.
²I don’t know of any data on this, but my impression is that over 90% of code written by ecologists is written by a single individual and never read or used by anyone else. This is in part because we have no culture of writing code in such a way that other people can understand what we’ve done and therefore modify it for their own use.
³I know that I’ve decided that it was easier to “just start from scratch” rather than reusing my own code on more than one occasion. That won’t be happening to me again thanks to Software Carpentry.
If you use R (and it seems like everybody does these days) then you should check out RStudio – an easy to install, cross-platform IDE for R. Basically it’s a seamless integration of all of the aspects of R (including scripts, the console, figures, help, etc.) into a single easy to use package. For those of you who are familiar with MATLAB, it’s a very similar interface. It’s not a full blown IDE yet (no debugger; no lint) but what this actually means is that it’s simple and easy to use. If you use R I can’t imagine that you won’t love this new (and open source!) tool.
A while ago there was a bit of discussion around the academic blogosphere regarding the importance of developing a digital presence and what the best form of that presence might be. Recently, as I’ve been looking around at academics’ websites as part of faculty, postdoc and graduate student searches going on in my department/lab, I’ve been reminded of the importance of having a digital presence.
It seems pretty clear to me that the web is the primary source of information acquisition for most academics, at least up through the young associate professors. There are no doubt some senior folk who would still rather have a paper copy of a journal sent to them via snail mail and who rarely open their currently installed copy of Internet Explorer 6, but I would be very surprised if most folks who are evaluating graduate student, postdoctoral and faculty job candidates aren’t dropping the name of the applicant into their favorite search engines and seeing what comes up. They aren’t looking around for dirt like all those scary news stories that were meant to stop college students from posting drunken photos of themselves on social networking sites. They’re just procrastinating, I mean, looking for more information to get a clearer picture of you as a scientist/academic. I also do a quick web search when I meet someone interesting at a conference, get a paper/grant to review with authors I haven’t heard of before, read an interesting study by someone I don’t know, etc. Many folks who apply to join my lab for graduate school find me through the web.
When folks go looking around for you on the web you want them to find something (not finding anything is the digital equivalent of “being a nobody”), and better yet you want them to find something that puts your best foot forward. But what should this be? Should you Tweet, Buzz, be LinkedIn, start a Blog, have a Wiki*, or maybe just get freaked out by all of this technology and move to the wilderness somewhere and never speak to anyone ever again?
I think the answer here is simple: start with a website. This is the simplest way to present yourself to the outside world and you can (and should) start one as soon as you begin graduate school. The website can be very simple. All you need is a homepage of some kind, a page providing more detailed descriptions of your research interests, a CV, a page listing your publications†, and a page with your contact information. Keep this updated and looking decent and you’ll have as good an online presence as most academics.
While putting together your own website might seem a little intimidating, it’s actually very easy these days. The simplest approach is to use one of the really easy hosted solutions out there. These include things like Google Sites, which is specifically designed to let you make websites, or you can easily turn a hosted blogging system into a website (WordPress.com is often used for this). There are lots of other good options out there (let us know about your favorites in the comments). In addition, many universities have some sort of system set up for letting you easily make websites; just ask around. Alternatively, you can get a static .html based template and then add your own content to it. Open Source Web Design is the best place I’ve found for templates. You can either open up the actual html files or you can use a WYSIWYG editor to replace the sample text with your own content. SeaMonkey is a good option for a WYSIWYG editor. Just ask your IT folks how to get these files up on the web when you’re done.
So, setting up a website is easy, but should you be doing other things as well, and if so, what? At the moment I would say that if you’re interested in trying out a new mode of academic communication then you should pick one that sounds like fun to you and give it a try, but this is by no means a necessity as an academic at the moment. If you do try to do some of these other things, then do them in moderation. It’s easy to get caught up in the rapid rewards of finishing a blog post or posting a tweet on Twitter, not to mention keeping up with others’ blogs and tweets, but this stuff can rapidly eat up your day and for the foreseeable future you won’t be getting a job based on your awesome stream of 140 character or less insights.
*Yep, that’s right, it’s a link to the Wikipedia page on wikis.
†And links to copies of them if you are comfortable flouting the absurd copyright/licensing policies of many of the academic publishers (or if you only published in open access journals).
The Ecological Database Toolkit
Large amounts of ecological and environmental data are becoming increasingly available due to initiatives sponsoring the collection of large-scale data and efforts to increase the publication of already collected datasets. As a result, ecology is entering an era where progress will be increasingly limited by the speed at which we can organize and analyze data. To help improve ecologists’ ability to quickly access and analyze data we have been developing software that designs database structures for ecological datasets and then downloads the data, processes it, and installs it into several major database management systems (at the moment we support Microsoft Access, MySQL, PostgreSQL, and SQLite). The database toolkit system can substantially reduce hurdles to scientists using new databases, and save time and reduce import errors for more experienced users.
The database toolkit can download and install small datasets in seconds and large datasets in minutes. Imagine being able to download and import the newest version of the Breeding Bird Survey of North America (a database with 4 major tables and over 5 million records in the main table) in less than five minutes. Instead of spending an afternoon setting up the newest version of the dataset and checking your import for errors you could spend that afternoon working on your research. This is possible right now and we are working on making this possible for as many major public/semi-public ecological databases as possible. The automation of this process reduces the time for a user to get most large datasets up and running by hours, and in some cases days. We hope that this will make it much more likely that scientists will use multiple datasets in their analyses, allowing them to gain more rapid insight into the generality of the pattern/process they are studying.
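To make the idea concrete, here is a minimal Python sketch of the kind of step being automated: working out a table structure from a delimited file and loading it into SQLite. This is not the Database Toolkit’s actual code. The helper name `install_csv`, the table name, and the sample records are all made up for illustration, and a real import would start from a downloaded file rather than a string.

```python
import csv
import io
import sqlite3


def install_csv(conn, table, csv_text):
    """Create `table` from CSV text (header row required) and load every row.

    Hypothetical helper for illustration; not part of the Database Toolkit.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, records = rows[0], rows[1:]

    def column_type(index):
        # Use INTEGER when every value in the column parses as an int, else TEXT.
        try:
            for record in records:
                int(record[index])
            return "INTEGER"
        except ValueError:
            return "TEXT"

    columns = ", ".join(
        f'"{name}" {column_type(i)}' for i, name in enumerate(header)
    )
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    placeholders = ", ".join("?" for _ in header)
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', records)
    conn.commit()
    return len(records)


# Made-up sample data standing in for a downloaded dataset.
SAMPLE = """route,year,species,abundance
101,2009,Turdus migratorius,14
101,2009,Cardinalis cardinalis,6
102,2009,Turdus migratorius,9
"""

conn = sqlite3.connect(":memory:")
n_loaded = install_csv(conn, "counts", SAMPLE)
total_robins = conn.execute(
    "SELECT SUM(abundance) FROM counts WHERE species = ?",
    ("Turdus migratorius",),
).fetchone()[0]
```

Once the table exists, the data can be queried like any other database, which is the point of automating the import: the afternoon goes to the query rather than the setup.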
We need your help
We have done quite a bit of testing on this system including building in automated tests based on manual imports of most of the currently available databases, but there are always bugs and imperfections in code that cannot be identified until the software is used in real world situations. That’s why we’re looking for folks to come try out the Database Toolkit and let us know what works and what doesn’t, what they’d like to see added or taken away, and if/when the system fails to work properly. So if you’ve got a few minutes to have half a dozen ecological databases automatically installed on your computer for you, stop by the Database Toolkit page at EcologicalData.org, give it a try, and let us know what you think.
We have a postdoc position available for someone interested in the general areas of macroecology, quantitative ecology, and ecoinformatics. Here’s the short ad with links to the full job description:
Ethan White’s lab at Utah State University is looking for a postdoc to collaborate on research studying approaches for unifying macroecological patterns (e.g., species abundance distributions and species-area relationships) and predicting variation in these patterns using ecological and environmental variables. The project aims to 1) evaluate the performance of models that link ecological patterns by using broad scale data on at least three major taxonomic groups (birds, plants, and mammals); and 2) combine models with ecological and environmental factors to explain continental scale variation in community structure. Models to be explored include maximum entropy models, neutral models, fractal based models, and statistical models. The postdoc will also be involved in an ecoinformatics initiative developing tools to facilitate the use of existing ecological data. There will be ample opportunity for independent and collaborative research in related areas of macroecology, community ecology, theoretical ecology, and ecoinformatics. The postdoc will benefit from interactions with researchers in Dr. White’s lab, the Weecology Interdisciplinary Research Group, and with Dr. John Harte’s lab at the University of California Berkeley. Applicants from a variety of backgrounds including ecology, mathematics, statistics, physics and computer science are encouraged to apply. The position is available for 1 year with the possibility for renewal depending on performance, and could begin as early as September 2010 and no later than May 2011. Applications will begin to be considered starting on September 1, 2010. Go to the USU job page to see the full advertisement and to apply.
If you’re interested in the position and are planning to be at ESA please leave a comment or drop me an email (email@example.com) and we can try to set up a time to talk while we’re in Pittsburgh. Questions about the position and expressions of interest are also welcome.
UPDATE: This position has been filled.
I have been very encouraged of late to see more and more ecologists embracing the potential of the web for communication and interaction. I’ve recently blogrolled some graduate student blogs and in the last few weeks I’ve come across American Naturalist’s trial run of a forum system, Ecological Monographs’ blog, and a blog soliciting feedback on a new initiative to digitize existing biological collections.
Cell Press has recently announced what I consider to be the most interesting advance in journal publishing since articles started being posted online. Basically they have started to harness the power of the web to aggregate the information present in articles in more useful and efficient ways. For example, there is a Data tab for each article that provides an overview of all figures, and large amounts of information on the selected figure including both its caption and the actual context for its citation from the text. Raw data files are also readily accessible from this same screen. References are dynamically expandable to show their context in the text (without refreshing, which is awesome), filterable by year or author, and linked directly to the original publication. You’ll also notice a comments tab where editor moderated comments related to the paper will be posted (showing the kind of integrated commenting system that I expect we will see everywhere eventually).
I have seen a lot of discussion of how the web is going to revolutionize publishing, but to quote one of my favorite movies “Talking ain’t doing.” Cell Press is actually doing.
I’ve recently started reading two scientific programming blogs that I think are well worth paying attention to, so I’m blogrolling them and offering a brief introduction here.
Serendipity is Steve Easterbrook’s blog about the interface between software engineering and climate science. Steve has a realistic and balanced viewpoint regarding the reality of programming in scientific disciplines. The blog is well written, insightful, etc., but I think the thing that really won me over was his sharp-witted responses to the periodically asinine comments he receives. For example:
I’d care a lot less about seeing all the source and data if I could just ignore climate scientists and shop elsewhere. But since I’m expected to hand over $$$ and change my lifestyle because of this research, your arguments ring hollow…
[You can shop elsewhere – there are thousands of climate scientists across the world. If you don’t like the CRU folks, go to any one of a large number of climate science labs elsewhere (start here: http://www.realclimate.org/index.php/data-sources/). An analogy: Imagine your doctor told you that you have to change your eating habits, or your heart is unlikely to last out the year. You would go and get a second opinion from another doctor. And maybe a third. But when every qualified doctor tells you the same thing, do you finally accept their advice, or do you go around claiming that all doctors are corrupt? – Steve]
Software Carpentry is the sister blog to an excellent online (and occasionally in person) course on basic software development for scientists. I strongly recommend the course to anyone who is interested in getting more serious about their programming and the blog is a nice complement pointing readers to other resources and discussions related to scientific programming.
A couple of weeks ago we made it possible for folks to subscribe to JE using email. We did this because we realized that many scientists, even those who are otherwise computationally savvy, really haven’t embraced feed readers as a method of tracking information. When I wrote that post I promised to return with an argument for why you should start using a feed reader instead – so here it is. If anyone is interested in a more instructional post about how to do this then let us know in the comments.
The main argument
I’m going to base my argument on something that pretty much all practicing scientists do – keeping track of the current scientific literature by reading Tables of Contents (TOCs). Back in the dark ages the only way to get these TOCs was to either have a personal subscription to the journal or to leave the office and walk the two blocks to the library (I wonder if anyone has done a study on scientists getting fatter now that they don’t have to go to the library anymore). About a decade ago (I’m not really sure when, but this seems like it’s in the right ballpark) journals started offering email subscriptions to their TOCs. Every time a new issue was published you’d receive an email that included the titles and authors of each contribution and links to the papers (once the journal actually had the papers online of course). This made it much easier to keep track of the papers being published in a wide variety of journals by speeding up the process of determining if there was anything of interest in a given issue. While the increase in convenience of using a feed reader may not be on quite the same scale as that generated by the email TOCs, it is still fairly substantial.
The nice thing about feed readers is that they operate one item at a time. So, instead of receiving one email with 10-100 articles in it, you receive 10-100 items in your feed reader. This leads to the largest single advantage of feeds over email for tracking TOCs. You only need to process one article at a time. Just think about the last time you had 5 minutes before lunch and you decided to try to clear an email or two out of your inbox. You probably opened up a TOC email and started going through it top to bottom. If you were really lucky then maybe there were only a dozen papers and none of them were of interest and you could finish going through the email and delete it. Most of the time however there are either too many articles or you want to look at at least one so you go to the website, read the abstract, maybe download the paper, and the next thing you know it’s time for lunch and you haven’t finished going through the table so it continues to sit in your inbox. Then, of course, by the time you get back to it you probably don’t even remember where you left off and you basically have to start back at the beginning again. I don’t know about you but this process typically resulted in my having dozens of emailed TOCs lying around my inbox at any one time.
With a feed reader it’s totally different. If you have five minutes you start going through the posts for individual articles one at a time. In that time you can often clear out 5 or 10 articles (or even 50 if the feed is well tagged like PNAS’s feed), which means that you can use your small chunks of free time much more effectively for keeping up with the literature. In addition, all major feed readers allow you to ‘star’ posts – in other words you can mark them in such a way that you can go back to them later and look at them in more detail. So, instead of the old system where if you were interested in looking at a paper you had to stop going through the table of contents, go to the website, decide from the abstract if you wanted to actually look at the paper, and then either download or print a copy of the paper to look at later, with a feed reader you achieve the same thing with a one second click. This means that you can often go through a fairly large TOC in less than 10 minutes.
Of course much of this utility depends on the journals actually providing feeds that include all of the relevant information.
Keeping your TOCs and other feeds outside of your email allows for greater separation of different aspects of online communication. If you monitor your email fairly continuously, the last thing you need is to receive multiple TOC emails each day that could distract you from actually getting work done. Having a separate feed reader lets you actually decide when you want to look at this information (like in those 5-minute gaps before lunch or at the end of the day when you’re too brain dead to do anything else).
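For the programmers in the audience, the one-item-at-a-time workflow can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular feed reader works: the feed below is a made-up RSS 2.0 snippet, the function names are hypothetical, and the keyword match is a crude stand-in for a person deciding what to star.

```python
import xml.etree.ElementTree as ET

# Made-up RSS 2.0 snippet standing in for a journal's TOC feed.
FEED = """<rss version="2.0"><channel>
<title>Example Journal</title>
<item><title>Species-area relationships revisited</title><link>http://example.org/a1</link></item>
<item><title>A note on soil chemistry</title><link>http://example.org/a2</link></item>
<item><title>Neutral models and species abundance</title><link>http://example.org/a3</link></item>
</channel></rss>"""


def read_items(feed_xml):
    """Yield (title, link) for each article, one item at a time."""
    root = ET.fromstring(feed_xml)
    for item in root.iter("item"):
        yield item.findtext("title"), item.findtext("link")


def triage(feed_xml, keywords):
    """Return links to 'starred' articles whose titles mention any keyword."""
    starred = []
    for title, link in read_items(feed_xml):
        if any(word in title.lower() for word in keywords):
            starred.append(link)
    return starred


starred = triage(FEED, ["species"])
```

Because each article is its own item, you can stop after any one of them and pick up where you left off, which is exactly what a single TOC email doesn’t let you do.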
Now that journals post many of their articles online as soon as the proofs stage is complete, it can be advantageous to know about these articles as soon as they are available. Most journal feeds do exactly this, posting a few papers at a time as they are uploaded to the online-early site.
Sharing – want to tell your friends about a cool paper you just read? You could copy the link, open a new email, paste the link and then send it on to them. Or, you could accomplish this with a single click (NB: this technology is still developing and varies among feed readers).
And then of course there are blogs
I’ve attempted to appeal to our non-feedreader-readers by focusing on a topic that they can clearly identify with. That said, the world of academic communication is rapidly expanding beyond the walls of the journal article. Blogs play an increasingly important role in scientific discourse and if you’re going to follow blogs you really need a feed reader. Why? Because while some blogs update daily (e.g., most of the blogs over at ScienceBlogs) many good blogs update at an average rate of once a week, or once a month. You don’t want to have to check the webpage of one of these blogs every day just to see if something new has been posted, so subscribe to its feed, kick back, and let the computer tell you what’s going on in the world.