The Ecological Database Toolkit
Large amounts of ecological and environmental data are becoming increasingly available due to initiatives sponsoring the collection of large-scale data and efforts to increase the publication of already collected datasets. As a result, ecology is entering an era where progress will be increasingly limited by the speed at which we can organize and analyze data. To help improve ecologists’ ability to quickly access and analyze data we have been developing software that designs database structures for ecological datasets and then downloads the data, processes it, and installs it into several major database management systems (at the moment we support Microsoft Access, MySQL, PostgreSQL, and SQLite). The database toolkit system can substantially reduce hurdles to scientists using new databases, and save time and reduce import errors for more experienced users.
The database toolkit can download and install small datasets in seconds and large datasets in minutes. Imagine being able to download and import the newest version of the Breeding Bird Survey of North America (a database with 4 major tables and over 5 million records in the main table) in less than five minutes. Instead of spending an afternoon setting up the newest version of the dataset and checking your import for errors you could spend that afternoon working on your research. This is possible right now and we are working on making this possible for as many major public/semi-public ecological databases as possible. The automation of this process reduces the time for a user to get most large datasets up and running by hours, and in some cases days. We hope that this will make it much more likely that scientists will use multiple datasets in their analyses; allowing them to gain more rapid insight into the generality of the pattern/process they are studying.
We need your help
We have done quite a bit of testing on this system including building in automated tests based on manual imports of most of the currently available databases, but there are always bugs and imperfections in code that cannot be identified until the software is used in real world situations. That’s why we’re looking for folks to come try out the Database Toolkit and let us know what works and what doesn’t, what they’d like to see added or taken away, and if/when the system fails to work properly. So if you’ve got a few minutes to have half a dozen ecological databases automatically installed on your computer for you stop by the Database Toolkit page at EcologicalData.org, give it a try, and let us know what you think.
We have a postdoc position available for someone interested in the general areas of macroecology, quantitative ecology, and ecoinformatics. Here’s the short ad with links to the full job description:
Ethan White’s lab at Utah State University is looking for a postdoc to collaborate on research studying approaches for unifying macroecological patterns (e.g., species abundance distributions and species-area relationships) and predicting variation in these patterns using ecological and environmental variables. The project aims to 1) evaluate the performance of models that link ecological patterns by using broad scale data on at least three major taxonomic groups (birds, plants, and mammals); and 2) combine models with ecological and environmental factors to explain continental scale variation in community structure. Models to be explored include maximum entropy models, neutral models, fractal based models, and statistical models. The postdoc will also be involved in an ecoinformatics initiative developing tools to facilitate the use of existing ecological data. There will be ample opportunity for independent and collaborative research in related areas of macroecology, community ecology, theoretical ecology, and ecoinformatics. The postdoc will benefit from interactions with researchers in Dr. White’s lab, the Weecology Interdisciplinary Research Group, and with Dr. John Harte’s lab at the University of California Berkeley. Applicants from a variety of backgrounds including ecology, mathematics, statistics, physics and computer science are encouraged to apply. The position is available for 1 year with the possibility for renewal depending on performance, and could begin as early as September 2010 and no later than May 2011. Applications will begin to be considered starting on September 1, 2010. Go to the USU job page to see the full advertisement and to apply.
If you’re interested in the position and are planning to be at ESA please leave a comment or drop me an email (firstname.lastname@example.org) and we can try to set up a time to talk while we’re in Pittsburgh. Questions about the position and expressions of interest are also welcome.
UPDATE: This position has been filled.
I’ve recently started reading two scientific programming blogs that I think are well worth paying attention to, so I’m blogrolling them and offering a brief introduction here.
Serendipity is Steve Easterbrook’s blog about the interface between software engineering and climate science. Steve has a realistic and balanced viewpoint regarding the reality of programming in scientific disciplines. The blog is well written, insightful, etc., but I think the thing that really won me over were his sharp witted responses to the periodically asinine comments he receives. For example:
I’d care a lot less about seeing all the source and data if I could just ignore climate scientists and shop elsewhere. But since I’m expected to hand over $$$ and change my lifestyle because of this research, your arguments ring hollow…
[You can shop elsewhere – there are thousands of climate scientists across the world. If you don’t like the CRU folks, go to any one of a large number of climate science labs elsewhere (start here: http://www.realclimate.org/index.php/data-sources/). An analogy: Imagine your doctor told you that you have to change your eating habits, or your heart is unlikely to last out the year. You would go and get a second opinion from another doctor. And maybe a third. But when every qualified doctor tells you the same thing, do you finally accept their advice, or do you go around claiming that all doctors are corrupt? – Steve]
Software Carpentry is the sister blog to an excellent online (and occasionally in person) course on basic software development for scientists. I strongly recommend the course to anyone who is interested in getting more serious about their programming and the blog is a nice complement pointing readers to other resources and discussions related to scientific programming.