Jabberwocky Ecology

Sharing in Science: my full reply to Eli Kintisch

A couple of weeks ago Eli Kintisch (@elikint) interviewed me for what turned out to be a great article on “Sharing in Science” for Science Careers. He also interviewed Titus Brown (@ctitusbrown) who has since posted the full text of his reply, so I thought I’d do the same thing.

How has sharing code, data, R methods helped you with your scientific research?

Definitely. Sharing code and data helps the scientific community make more rapid progress by avoiding duplicated effort and by facilitating more reproducible research. Working together in this way helps us tackle the big scientific questions and that’s why I got into science in the first place. More directly, sharing benefits my group’s research in a number of ways:

  1. Sharing code and data results in the community being more aware of the research you are doing and more appreciative of the contributions you are making to the field as a whole. This results in new collaborations, invitations to give seminars and write papers, and access to excellent students and postdocs who might not have heard about my lab otherwise.
  2. Developing code and data so that it can be shared saves us a lot of time. We reuse each other’s code and data within the lab for different projects, and when a reviewer requests a small change in an analysis we can make a small change in our code and then regenerate the results and figures for the project by running a single program. This also makes our research more reproducible and allows me to quickly answer questions about analyses years after they’ve been conducted when the student or postdoc leading the project is no longer in the lab. We invest a little more time up front, but it saves us a lot of time in the long run. Getting folks to work this way is difficult unless they know they are going to be sharing things publicly.
  3. One of the biggest benefits of sharing code and data is in competing for grants. Funding agencies want to know how the money they spend will benefit science as a whole, and being able to make a compelling case that you share your code and data, and that it is used by others in the community, is important for satisfying this goal of the funders. Most major funding agencies have now codified this requirement in the form of data management plans that describe how the data and code will be managed and when and how they will be shared. Having a well-established track record in sharing makes a compelling argument that you will benefit science beyond your own publications, and I have definitely benefited from that in the grant review process.

What barriers exist in your mind to more people doing so?

There is a lot of fear about openly sharing data and code. People believe that making their work public will result in being scooped or that their efforts will be criticized because they are too messy. There is a strong perception that sharing code and data takes a lot of extra time and effort. So the biggest barriers are sociological at the moment.

To address these barriers we need to do a better job of providing credit to scientists for sharing good data and code. We also need to do a better job of educating folks about the benefits of doing so. For example, in my experience, the time and effort dedicated to developing and documenting code and data as if you plan to share it actually ends up saving the individual researcher time in the long run. This happens because when you return to a project a few months or years after the original data collection or code development, it is much easier if the code and data are in a form that makes them easy to work with.

How has twitter helped your research efforts?

Twitter has been great for finding out about exciting new research, spreading the word about our research, getting feedback from a broad array of folks in the science and tech community, and developing new collaborations. A recent paper that I co-authored in PLOS Biology actually started as a conversation on twitter.

How has R Open Science helped you with your work, or why is it important or not?

rOpenSci is making it easier for scientists to acquire and analyze the large amounts of scientific data that are available on the web. They have been wrapping many of the major science related APIs in R, which makes these rich data sources available to large numbers of scientists who don’t even know what an API is. It also makes it easier for scientists with more developed computational skills to get research done. Instead of spending time figuring out the APIs for potentially dozens of different data sources, they can simply access rOpenSci’s suite of packages to quickly and easily download the data they need and get back to doing science. My research group has used some of their packages to access data in this way and we are in the process of developing a package with them that makes one of our Python tools for acquiring ecological data (the EcoData Retriever) easy to use in R.

Any practical tips you’d share on making sharing easier?

We actually wrote a paper on this, focused on data, last year: Nine simple ways to make it easier to (re)use your data

One of the things I think is most important when sharing both code and data is to use standard licenses. Scientists have a habit of thinking they are lawyers and writing their own licenses and data use agreements that govern how the data and code can be used. This leads to a lot of ambiguity and difficulty in using data and code from multiple sources. Using standard open source and open data licenses vastly simplifies the process of making your work available and will allow science to benefit the most from your efforts.

And do you think sharing data/methods will help you get tenure? Evidence it has helped others?

I have tenure and I certainly emphasized my open science efforts in my packet. One of the big emphases in tenure packets is demonstrating the impact of your research, and showing that other people are using your data and code is a strong way to do this. Whether or not this directly impacted the decision to give me tenure I don’t know. Sharing data and code is definitely beneficial to competing for grants (as I described above) and increasingly to publishing papers, as many journals now require the inclusion of data and code for replication. It also benefits your reputation (as I described above). Since tenure at most research universities is largely a combination of papers, grants, and reputation, I think that sharing at least increases one’s chances of getting tenure indirectly.

UPDATE: Added missing link to Titus Brown’s post: http://ivory.idyll.org/blog/2014-eli-conversation.html

A list of publicly available grant proposals in the biological sciences

Recently a bunch of folks in the biological sciences have started sharing their grant proposals openly. Their reasons for doing so are varied (see the links next to their names below), but part of the common justification is a general interest in opening up science so that all stages of the process can benefit from better interaction and communication, and part of it is to provide examples for younger scientists writing grants. To help accomplish both of these goals I’m going to do what Titus Brown suggested and compile a list of all of the available open proposals in the biological sciences (if you’re looking for math proposals they have a list too). Given the limited number of proposals available at the moment I’m just going to maintain the list here, sorted alphabetically by PI. Another way to find proposals is to look at the ‘grant’ and ‘proposal’ tags on figshare, where several of us have been posting proposals. If you know of more proposals, decide to post some yourself, or have corrections to proposals in the list, just let me know in the comments and I’ll keep the list updated. Enjoy!

B. Arman Aksoy (@armish)

Casey Bergman (@caseybergman)

Dave Bridges

Titus Brown (@ctitusbrown; read Titus’ thoughts on sharing proposals)

Scott Chamberlain (@recology_)

Endymion D. Cooper (@EndymionCooper)

Karen Cranston (@kcranstn)

Kelly Dawe

Morgan Ernest (@skmorgane)

Edmund (Ted) Harte (@DistribEcology)

Jan Jensen (@janhjensen; read Jan’s thoughts on sharing proposals)

Paula Mabee

Rod Page (@rdmpage; read Rod’s thoughts on sharing proposals)

David Pappano (@djpappano)

Heather Piwowar (@researchremix) & Jason Priem (@jasonpriem) (read their thoughts on sharing proposals)

Rosie Redfield (@RosieRedfield)

Andrey Revyakin

Jeff Ross-Ibarra

Menno Schilthuizen (@schilthuizen)

Delia S. Shelton

Andrew Su (@andrewsu)

Sarah Supp (@srsupp)

Tracy Teal (@tracykteal)

Andrew Tredennick (@ATredennick)

Heroen Verbruggen

Todd Vision (@tjvision)

Ethan White (@ethanwhite; read Ethan’s thoughts on sharing proposals)

On making my grant proposals open access

As I announced on Twitter about a week ago, I am now making all of my grant proposals open access. To start with I’m doing this for all of my sole-PI proposals, because I don’t have to convince my collaborators to participate in this rather aggressively open style of science. At the moment this includes three funded proposals: my NSF Postdoctoral Fellowship proposal, an associated Research Starter Grant proposal, and my NSF CAREER award.

So, why am I doing this, especially with the CAREER award that still has several years left on it and some cool ideas that we haven’t worked on yet? I’m doing it for a few reasons. First, I think that openness is inherently good for science. While there may be benefits for me in keeping my ideas secret until they are published, this certainly doesn’t benefit science more broadly. By sharing our proposals the cutting edge of scientific thought will no longer be hidden from view for several years and that will allow us to make more rapid progress. Second, I think having examples of grants available to young scientists has the potential to help them learn how to write good proposals (and other folks seem to agree) and therefore decrease the importance of grantsmanship relative to cool science in the awarding of limited funds. Finally, I just think that folks deserve to be able to see what their tax dollars are paying for, and to be able to compare what I’ve said I will do to what I actually accomplish. I’ve been influenced in my thinking about this by posts by several of the big open science folks out there including Titus Brown, Heather Piwowar, and Rod Page.

To make my grants open access I chose to use figshare for several reasons.

  1. Credit. Figshare assigns a DOI to all of its public objects, which means that you can easily cite them in scientific papers. If someone gets an idea out of one of my proposals and works on it before I do, this lets them acknowledge that fact. Stats are also available for views, shares, and (soon) citations, making it easier to track the impact of your larger corpus of research outputs.
  2. Open Access. All public written material is licensed under CC-BY (basically just cite the original work) allowing folks to do cool things without asking.
  3. Permanence. I can’t just change my mind and delete the proposal and I also expect that figshare will be around for a long time.
  4. Version control. For proposals that are not funded, revised, not funded, revised, etc. figshare allows me to post multiple versions of the proposal while maintaining the previous versions for posterity/citation.

During this process I’ve come across several other folks doing similar things and even inspired others to post their proposals, so I’m in the process of compiling a list of all of the publicly available biology proposals that I’m aware of and will post a list with links soon. It’s my hope that this will serve as a valuable resource for young and old researchers alike and will help to lead the way forward to a more open scientific dialogue.

The NSF Preproposal Process: Pt 2. A promising start.

When last we left our intrepid scientists, they were starting to ponder the changes that might result from the new pre-proposal process. In general, we really like the new system because it helps reviewers focus on the value of big picture thinking and potentially reduces the overall workload of both grant writing and grant reviewing. Of course academics are generally nervous about the major shift in the proposal process (and, let’s face it, change in general). Below we’ll talk about: 1) things we like about the new process; 2) concerns that we’ve heard expressed by colleagues and our thoughts on those issues; and 3) modifications to the system that we think are worth considering.

An emphasis on big picture thinking: As discussed in part 1, the 4-page proposal seems to shift the focus of the reader from the details of the project to the overall goals of the study. We are excited by this. The combined pre-proposal/full proposal process – with their different strengths and weaknesses – can potentially generate a strong synergy: the pre-proposal panel assesses which proposals could yield important enough results to warrant further scrutiny and the full-proposal panel assesses whether the research plan is sound enough to yield a reasonable chance of success. In the current reality of limited funding, it seems logical to increase the probability that funds go towards research that is both conceptually important and scientifically sound. Since many of us are more comfortable critiquing work based on specific methodological issues than on ‘general interest’, having a phase in the review that helps focus on the importance of the research seems valuable. However, if reviewers still focus primarily on methodological details (as seemed to be the case on Prof-like substance’s panel) then the new system could end up putting even less emphasis on big ideas, because the 4 pages will be entirely filled up with methods. Based on our experience this wasn’t a major concern, but it is definitely a possibility that NSF needs to be aware of.

Reduced reviewer workload: This was the primary motivation for the new system. We feel like we probably spent about as much time pre-panel reading and reviewing proposals, but we enjoyed it more because it involved more thinking about big questions and looking around in the literature and less slogging through 10 pages of methodological details. More importantly, there were no ad hoc reviewers for the pre-proposals, which greatly reduces the overall reviewer burden. The full-proposals will have ad hocs, but because there are fewer of them we should all end up getting fewer requests from NSF.

Reduced grant writer workload: One common concern about the new system is that people who write a successful pre-proposal will then have to also write a 15-page proposal, thus increasing the workload to roughly 20 pages spread across two separate submissions (pre-proposal + proposal). Folks argue that this results in more time grant writing and less time doing science. Our perspective is that while not perfect, the new system is much better than the old system where many people we knew were putting in 1-2 (or even more) 15-page proposals per deadline (i.e., 2-4 proposals/year) with only a 5-10% funding rate (vs. 20-30% for full proposals under the new system). That’s a lot more wasted effort, especially when you consider that much of the prose from the pre-proposal will presumably be used in the full proposal. As grant writers we also really liked that we didn’t need to generate dozens of pages of time-consuming supplemental documents (budgets, postdoc mentoring plans, etc.) until we knew there was at least a reasonable chance of the proposal being funded. The scientific community should definitely have a discussion about how to streamline the process further to optimize the ratio of effort in proposal writing and review to quality of science being funded, but the current system is definitely a step forward in our opinion. If you’re interested in some of the mechanisms for how the PI proposal writing workload could be modified, both Prof-Like Substance’s and Jack’s posts contain some interesting ideas.

New investigators: Everyone, everyone, everyone is concerned about the untenured people. Given the culture among universities that grants = tenure, untenured faculty don’t have the luxury of time, and the big concern is that only having 1 deadline/year gives untenured people fewer chances to get funding before tenure decisions. Since the number of proposals NSF is funding isn’t changing, this isn’t quite as bad as it seems. However, if it takes a new investigator a couple of rounds to make it past the preproposal stage then they may not have very many tries to figure out how to write a successful full proposal. The counterarguments are that the once-yearly deadline gives investigators more time to refine ideas, digest feedback, and obtain friendly reviews from colleagues, and therefore (hopefully) submit stronger proposals as a result. It also (potentially) restricts the amount of time that untenured folks spend writing grants, therefore freeing up more time to focus on scholarly publications, mentoring students, and creating strong learning environments in our classrooms, which (theoretically) also are important for tenure. We love the ideas behind the counterarguments and if things really play out that way it would be to the betterment of science, but we do worry about how this ideal fares against the grants=tenure mentality.

Collaboration: One of our big concerns (and that of others as well) is the potential impact of the 2 proposal limit on interdisciplinary collaboration. Much of science is now highly interdisciplinary and collaborative and if team size is limited because of proposal limits this will make both justifying and accomplishing major projects more difficult. We have already run into this problem both in having former co-PIs remove themselves from existing proposals and in having to turn down potential collaborations. We have no problem with a limit on the number of lead-PI proposals; in a lot of ways we think it will help improve the balance between proposing science and actually doing it. But the limit on collaboration is a major concern.

In general, we think that the new system is a definite improvement over the old system, but there are clearly still things to be discussed and fine-tuned. Possible changes to consider include:

  • Find a way to allow full proposals that do well to skip the pre-proposal stage the next year. This will reduce stochasticity and frustration. These proposals could still count towards any limit on the number of proposals.
  • Clearly and repeatedly communicate to the pre-proposal panels (let’s face it, faculty don’t tend to listen very well) the desired difference in emphasis between evaluating preliminary proposals and full proposals. This will help maintain the emphasis on interesting ideas and might also help alleviate the angst some panelists felt about what to do about proposals that were missing important details but not obviously flawed.
  • Consider making the proposal limit apply only to the number of proposals on which someone will be the lead PI. This still discourages excessive submissions without hurting the collaborative, interdisciplinary approach to science that we’ve all been working hard to foster.

So there it is. Our 2-part opinion piece on the new NSF-process. If you were hoping for a pre-proposal magic template, we’re sorry to disappoint, but hopefully you found a lot to think about here while you were looking for it!

UPDATE: If you were hoping for a pre-proposal magic template, check out the nice post over at Sociobiology.

The NSF Pre-Proposal Process: Pt 1. Judging Preproposals

Before we start, this post refers to posts already written on this topic. To make sure no one gets lost, please follow the sequence of operations below:

Step 1: Do you know about the new pre-proposal process at NSF?

  • If Yes: Continue to Step 2.
  • If No: please read one of these posts and then proceed to Step 2.

Step 2: Have you read Jack Williams’ most excellent post (posted on Jacquelyn Gill’s most excellent blog) about a preproposal panelist’s perspective on the new process?

Step 3: Have you read Prof-like Substance’s post about his experience on a pre-proposal panel? (What? You haven’t read Prof-Like Substance’s blog before?! Go check him out.)

  • If Yes: Continue to Step 4.
  • If No: go to The Spandrel Shop, read Prof-like Substance’s post, and return.

Step 4: Read our post! Like Jack and Prof-Like Substance, we also have experience with the new pre-proposal panels. The nuts and bolts of our experiences were similar to theirs (i.e., number of proposals read, assigning pre-proposals to one of three categories, etc). The main differences are really in our perceptions of the experience and the implications for the broader field. Please remember, there were a TON of pre-proposal panels this spring in both IOS and DEB. Differences from other panelists may reflect idiosyncratic differences in panels or differences in disciplines or just different takes on the same thing – because of NSF confidentiality rules, we can’t identify anything specific about our experiences – so don’t ask. And, speaking of rules: [start legalese] all opinions expressed within this post (including our comments, but not the comments of others) reflect only the aggregated opinions of Ethan & Morgan – henceforth referred to as Weecology – and do not represent official opinions by any entity other than Morgan & Ethan (even our daughter does not claim affiliation with our opinion…though to be honest, she’s two and she disagrees with everything we say anyway). [end legalese]

1) The Importance of Big Ideas. Our perspective on what made for a successful pre-proposal jibes largely with Jack’s. The scope of the question being asked was really important. The panelists had to believe that the research would be a strong and important contribution to the field as a whole – not just to a specific system or taxon. Not only did the question being proposed need to be one that would have broad relevance to the program’s mission, it needed a logical framework for accomplishing that goal. In our experience, disconnects between what you propose to address and what you’re actually doing become glaringly obvious in 4 pages.

2) Judging Methods. The limited space for methods was tricky for both reviewers and writers. Sometimes the methods are just bad – if a design is flawed in 4 pages, it’ll still be flawed in 40 pages. The challenge was how to judge proposals where nothing was obviously wrong, but important details were missing. After reviewing full-proposals where you are trying to decide whether a proposal should be funded as is, this was a rough transition to make because all the details can’t reasonably be fit into 4 pages. While the panel was cognizant of this, it is still hard to jettison old habits. Sometimes proposals were nixed because of those missing details and sometimes not. We honestly don’t have a good feel for why, but it might reflect a complex algorithm involving: a) how cool the idea was, b) the abilities of the research team – i.e. is there a PI with demonstrated experience related to the unclear area, and c) just how important did those missing details really seem to a panelist.

3) Methods vs. Ideas. Our impression is that the 4-page format seems to alter the focus of the reviewer. In 15 pages, so much of the proposal is the methods – the details of questions, designs, data collection, analyses. It’s only natural for the reader to focus on what takes up most of the proposal. In contrast, the structure of the pre-proposal really shifts the focus of the reviewer to the idea. Discussions with our fellow panelists suggest we weren’t the only ones to perceive this, though it’s important to note that not everyone feels this way – Prof-Like Substance’s post and comments flesh out an alternative to our experience.

4) Reviewers spend more time thinking about your proposal. This was an interesting and unexpected outcome of the short proposals. We both spent more time reading the literature to better understand the relevance of a pre-proposal for the field, looking up techniques, cited literature, etc. There was also a general feeling that panelists were more likely to reread pre-proposals. In our experience, most panelists felt like they spent about as much time reviewing each preproposal as they would a 15-pager, but more of this time was spent reading the literature and thinking about the proposal.

In general, like Jack, we came away with a positive feeling about the ability of the panel to assess the pre-proposals. A common refrain among panelists was that we were generally surprised by how well assessing a 4-page proposal actually worked. However, the differences in how a 4-pager is evaluated could have some interesting implications for the type of science funded – something we will speculate on in our next blog post (yes, this is as close as an academic blog gets to a cliff-hanger…).

NSF Pre-proposal guidelines/instructions

UPDATE: If you’re looking for the information for 2014, check out the DEBrief post for links.

UPDATE: If you’re looking for the information for 2013, here’s an updated post.

Since I have now spent far too much time on multiple occasions trying to track down the instructions for the new pre-proposals for NSF DEB and IOS grants, I’m going to post the link here on the assumption that other folks will be looking for this information as well (and also finding it difficult to track down).

http://www.nsf.gov/pubs/2011/nsf11573/nsf11573.htm#prep

Happy post-holiday grant writing to all.

UPDATE 1: Also note that the Biosketches are different for the pre-proposals (changes noted in bold-italics)

Biographical Sketches (2-page limit for each) should be included for each person listed on the Personnel page. It should include the individual’s expertise as related to the proposed research, professional preparation, professional appointments, five relevant publications, five additional publications, and up to five synergistic activities. Advisors, advisees, and collaborators should not be listed on this document, but in a separate table (see below).

UPDATE 2: Though it is not explicitly clear from the link above, Current & Pending Support should NOT be included in pre-proposals (thanks to Alan Tessier for clearing this up).