Monthly Archives: June 2009
We just read this great piece from the Huffington Post by Todd Palmer and Rob Pringle on why including funds for NSF and NIH in the stimulus bill was a good idea (thanks to Ecotone for pointing us to the article). The great thing about the piece is that it makes a cogent argument not just for the stimulus funds, but for why funding basic science is economically beneficial in general. Probably the high point of the article was this little gem:
Truthfully, the return on our relatively modest investment in basic research over the last half-century is so astronomical that it’s impossible to calculate. Science hasn’t just stimulated the economy; it has revolutionized the economy, and our lives along with it.
which seems like it must be hyperbole, but at least from our perspective it certainly is not. However, if we had to pick our favorite moment in the article it would definitely be the paraphrase of Paul Baskin’s concern about the utility of this funding:
Aren’t we just subsidizing a bunch of nerds who already have cushy academic jobs and buy fancy Japanese-made instruments? No.
This is definitely one of the clearest, best, and funniest explanations of why funding basic science is critical to the economy and to society in general. Go check it out.
Dealing with frequency distribution data is something that we as ecologists haven’t typically done in a very sophisticated way. This isn’t really our fault. Proper methods aren’t typically taught in undergraduate statistics courses or in the graduate level classes targeted at biologists. That said, as ecology becomes a more quantitative science it becomes increasingly important to analyze data carefully so that we can understand its precise quantitative structure and its relationship to theoretical predictions.
Frequency distribution data is basically any data that you would think about making a histogram out of. Any time you have a single value that you (or someone else) have measured, for example the size or abundance of a species, and you are interested in how the number of occurrences changes as a function of that value (for example, are there more small species than large species, or more small patches than large patches?), you are talking about a frequency distribution. Technically, what we’re often interested in is the probability distribution underlying the data, and you will often have more luck using this term when looking for information. Many major ecological patterns are probability/frequency distributions, including the species-abundance distribution, species size distribution (also known as the body size distribution), individual size distribution (also known as the size spectrum), Lévy flights, and many others.
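To make the idea concrete, here is a minimal Python sketch of turning a list of measured values into a frequency distribution, and then into a rough estimate of the underlying probability density. The body masses are made up for illustration, the function names `frequency_distribution` and `density` are hypothetical, and the order-of-magnitude bins are just one convenient choice rather than a recommended method.

```python
import math
from collections import Counter

# Hypothetical body masses (g) for 12 species; made-up values for illustration.
masses = [3, 5, 8, 12, 15, 22, 40, 75, 130, 400, 900, 2500]


def frequency_distribution(values):
    """Count values in order-of-magnitude bins [10^k, 10^(k+1))."""
    bins = Counter(math.floor(math.log10(v)) for v in values)
    return {(10 ** k, 10 ** (k + 1)): bins[k] for k in sorted(bins)}


def density(freq):
    """Convert counts to a density estimate: count / (n * bin width)."""
    n = sum(freq.values())
    return {b: c / (n * (b[1] - b[0])) for b, c in freq.items()}


freq = frequency_distribution(masses)
dens = density(freq)

# One line per bin: the raw count and the width-normalized density.
for (lo, hi), count in freq.items():
    print(f"{lo:>5}-{hi:<6} g  count={count}  density={dens[(lo, hi)]:.2e}")
```

The counts are the frequency distribution. Dividing by the sample size and the bin width is what lets you compare bins of different widths, and it is the step that links the histogram to a probability distribution.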
Last year I wrote a paper with Jessica Green and Brian Enquist on one of the problems that can result from the approaches to this kind of data typically employed by ecologists and the more sophisticated methods available for addressing the question. As a result I’ve been receiving a fair bit of email recently about related problems; enough that I thought it might be worth a couple of posts to lay out some of the basic ideas regarding the analysis of frequency distribution data. Over the next week or so I’ll try to cover what I’ve learned about basic data visualization, parameter estimation, and comparing the fits of different models to the data. Along the way I may have a couple of things to say about some recently published papers that have the potential to cause confusion with respect to these subjects.
Please keep in mind that I am not a professionally trained statistician and that this is not intended to be an authoritative treatment of the subject. I’m just hoping to provide folks with an entryway into thinking about what to do with this kind of data and I’ll try to point to useful references to help take you further if you’re interested.
Nathan over at Flowing Data just posted an interesting piece on the emergence of a new class of scientists whose work focuses on the manipulation, analysis and presentation of data. The take-home message is that, in order to fully understand and communicate patterns in large quantities of data, one needs to have some ability in:
- Computer science – for acquiring, managing and manipulating data
- Mathematics and Statistics – for mining and analyzing data
- Graphic design and Interactive interface design – to present the results of analyses in an easy to understand manner and encourage interaction and additional analysis by less technical users
His point is that while one could get together a group of people (one with each of these skills) to undertake this kind of task, the challenges of cross-disciplinary collaboration can slow down progress (or even prevent it entirely). As such, there is a need for individuals who have at least some experience in several of these fields to help facilitate the process. I think this is a good model for this kind of work in ecology, though given the already extensive multidisciplinarity required in the field I view this role as one occupied by only a fairly small fraction of folks.
The other thing that I really liked about this post (and about Flowing Data’s broader message) is the focus on the end user. The goal is to make ideas and tools available to the broadest possible audience, and often the more technical folks in the biological sciences seem to forget that their goal should be to make things easy to understand and simple for non-technical users to use. This is undoubtedly a challenging task, but one that we should work to accomplish whenever possible.