A couple of months ago Micah J. Marty and I had a Twitter conversation and subsequent email exchange about how citations worked with preprints. I asked Micah if I could share our email discussion since I thought it would be useful to others and he kindly said yes. What follows are Micah’s questions followed by my responses.
Right now, I am finishing up a multi-chapter Master’s thesis and I plan to publish a few papers from my work. I may want to submit a preprint of one manuscript but before I propose this avenue to my advisor, I want to understand it fully myself. And I have remaining questions about the syntax of citing works when preprints come into play. What happens to a citation of a preprint after the manuscript is later published in a peer reviewed venue?
At the level of the journal nothing happens. So, if you cite a preprint in a published ms, and that preprint is later published as a paper, then the citation is still to the preprint. However, some of the services indexing citations recognize the relationship between the preprint and the paper and aggregate the citations. Specifically, Google Scholar treats the preprint and the published paper as the same for citation analysis purposes. See the citation record for our paper on Best practices for scientific computing which has been cited 49 times, but the vast majority of those are citations to the preprints.
Here’s an example with names we can play with: Manuscript 1 (M1) may require some extra analysis, but it presents some important unexpected results that I would like to get out on the table as soon as possible. M1 is submitted to PeerJ Preprints and accepted (i.e., published online as a preprint with a DOI). M2 is submitted to Marine Ecology Progress Series (MEPS) for peer review, and M2 cites the PeerJ Preprint M1.
Just a point related to vocabulary: I wouldn’t typically think of the preprint as being “accepted”. Any checking prior to posting is just a quick glance to make sure that it isn’t embarrassingly bad, so as long as it’s reasonably written and doesn’t have a title like “E is not equal to mc squared” it will be posted almost immediately (within 48 hours on most preprint servers).
1) Are preprints considered “grey literature”? That is, is it illegitimate for M2 to cite a work that has not been peer reviewed?
Yes, in the sense that they haven’t been formally peer reviewed prior to posting they are similar to “grey literature”. Whether or not they can be cited depends on the journal. Some journals are happy to allow citing of preprints. For example, this recent paper in TREE cites a preprint of ours on arXiv. Their paper was published before ours was accepted, so if it wasn’t for the preprint it couldn’t have been cited.
2) Is there a problem if M1 is eventually published in a peer reviewed journal but the published article of M2 cites only the PeerJ Preprint of M1?
I would say no for two reasons. First, assuming that M2 is published before M1 then the choice is between having a citation to something that people can read, science can benefit from, and that can potentially be indexed (giving you citation credit) vs. a citation to “Marty et al. unpublished data”, which basically does nothing. Second, all preprint servers provide a mechanism for linking to the final version, so if someone finds the preprint via a citation in M2 then that link will point them in the direction of the final version that they can then read/cite/etc.
In short, I think as long as you aren’t planning on submitting to a behind-the-times journal that doesn’t allow the submission of papers that have been posted as preprints (and the list of journals with this policy is shrinking rapidly) then there is no downside to posting preprints. In the best-case scenario it can lead to more people reading your research and citing it. The worst-case scenario is exactly the same as if you didn’t post a preprint.
I am currently the remotely working member of Weecology. After developing a chronic illness and finally getting diagnosed with fibromyalgia, I am finishing up my PhD in the lower elevation and better air of Kansas, while the rest of my colleagues are still in Utah. The relocation is actually working out really well. I’m in better shape because I’m not having to fight the air too, and I’m finally making real progress toward finishing my dissertation again.
I ruthlessly culled everything that wasn’t directly working on my dissertation. I was going to attend the Gordon Conference this year, as I had heard fantastic things about it for years but had not been ready to go yet, but I had to drop that because I wasn’t physically able to travel. I did not go to ESA, because I couldn’t travel. There are working groups and workshops galore, all involving travel, which I cannot do. Right now, the closest thing that we have to bringing absent scientists to an event is live tweeting, which is not nearly as good as hearing a speaker for yourself, and is pretty heartbreaking if you had to cancel your plans to attend an event because you were too infirm to go.
The tools that I’m using to do science remotely are not just for increasing accessibility for a single chronically ill macroecologist. They are good tools for science in general. I’m using GitHub to version control my code, and Dropbox to share data and figures. Ethan can see what I’m working on as I’m doing it, and I’ve got a clear record of what I was doing and what decisions I made. While my cognitive dysfunction may be a bit more extreme of a problem, I know that we’ve all stayed up too late coding and broken something we shouldn’t have, and the ability to wave the magic Git wand and make any poor decisions that I made while my brain was out to lunch go away is priceless.
Open access? Having open access to papers is really important when you are going to be faced shortly with probably not having any institutional access anymore. Also, important for everyone else who isn’t at a major university with very expensive subscriptions to all the journals. Having open access to data and code is crucial when you can’t collect your own data and are going to be doing research from your home computer on the cheap because you can’t rely on your body to work reliably at any given point in time.
Video conferencing is working well for me to meet with the lab, but could also be great for attending conferences and workshops. This would not only be good for a certain macroecologist, but would also be good for including people from smaller universities, etc. who would like to participate in these types of things too, but can’t otherwise due to the travel. I did my master’s degree at Fort Hays State University, and I still love it dearly. This type of increased accessibility would have been great for me while I was a perfectly healthy master’s student. Fort Hays is a primarily undergraduate institution in the middle of Kansas, about four hours away from any major city, and it does not have some of the resources that a larger university would have. No seminar series, no workshops, not much travel money to go to workshops or conferences, which doesn’t mean that good science can’t still be happening.
Many of my labmates are looking for postdocs, or are already in postdoc positions at this point. I’m very excited for all of them, and await eagerly all the stories of the exciting new things they are doing. Having a chronic illness limits what I am capable of doing physically. I am not going to be able to move across the country for a postdoc. That does not mean that I do not want to play science too. I’ve got my home base set up, and I can reach pretty far from here. I still want to be a part of living science; I don’t want to have to get to the party after everyone else has gone home.
And I wonder, why can I not do these things? Is it not the future? Do we not have the internet, with video chat? I get to meet with Ethan and talk science at our weekly meetings. I go to lab meetings with video chat, and get to see what my labmates are doing, and crack jokes, and laugh at other people’s jokes. It wouldn’t be hard to get me to conferences and working groups either.
With technology, I get to be a part of living, breathing science, and it is a beautiful thing.
I’m looking for one or more graduate students to join my group next fall. In addition to the official ad (below) I’d like to add a few extra thoughts. As Morgan Ernest noted in her recent ad, we have a relatively unique setup at Weecology in that we interact actively with members of the Ernest Lab. We share space, have joint lab meetings, and generally maintain a very close intellectual relationship. We do this with the goal of breaking down the barriers between the quantitative side of ecology and the field/lab side of ecology. Our goal is to train scientists who span these barriers in a way that allows them to tackle interesting and important questions.
I also believe it’s important to train students for multiple potential career paths. Members of my lab have gone on to faculty positions, postdocs, and jobs in both science non-profits and the software industry.
Scientists in my group regularly both write papers (e.g., these recent papers from dissertation chapters: Locey & White 2013, Xiao et al. 2014) and develop or contribute to software (e.g., EcoData Retriever, ecoretriever, rpartitions & pypartitions) even if they’ve never coded before they joined my lab.
My group generally works on problems at the population, community, and ecosystem levels of ecology. You can find out more about what we’ve been up to by checking out our website. If you’re interested in learning more about where the lab is headed I recommend reading my recently funded Moore Investigator in Data-Driven Discovery proposal.
PH.D STUDENT OPENINGS IN QUANTITATIVE, COMPUTATIONAL, AND MACRO- ECOLOGY
The White Lab at the University of Florida has openings for one or more PhD students in quantitative, computational, and/or macro- ecology to start fall 2015. The student(s) will be supported as graduate research assistants from a combination of NSF, Moore Foundation, and University of Florida sources depending on their research interests.
The White Lab uses computational, mathematical, and advanced statistical/machine learning methods to understand and make predictions/forecasts for ecological systems using large amounts of data. Background in quantitative and computational techniques is not necessary, only an interest in learning and applying them. Students are encouraged to develop their own research projects related to their interests.
The White Lab is currently at Utah State University, but is moving to the Department of Wildlife Ecology and Conservation at the University of Florida starting summer 2015.
Interested students should contact Dr. Ethan White (email@example.com) by Nov 15th, 2014 with their CV, GRE scores, and a brief statement of research interests.
UPDATE: Added a note that we work at population, community, and ecosystem levels.
We are very excited to announce the newest release of our EcoData Retriever software and the first release of a supporting R package, ecoretriever. If you’re not familiar with the EcoData Retriever you can read more here.
The biggest improvement to the Retriever in this set of releases is the ability to run it directly from R. Dan McGlinn did a great job leading the development of this package and we got a ton of fantastic help from the folks at rOpenSci (most notably Scott Chamberlain, Gavin Simpson, and Karthik Ram). Now, once you install the main EcoData Retriever, you can run it from inside R by doing things like:
    install.packages('ecoretriever')
    library(ecoretriever)

    # List the datasets available via the Retriever
    ecoretriever::datasets()

    # Install the Gentry dataset into csv files in your working directory
    ecoretriever::install('Gentry', 'csv')

    # Download the raw Gentry dataset files, without any processing,
    # to the subdirectory named data
    ecoretriever::download('Gentry', './data/')

    # Install and load a dataset as a list
    Gentry = ecoretriever::fetch('Gentry')
    names(Gentry)
    head(Gentry$counts)
The other big advance in this release is the ability to have the Retriever directly download files instead of processing them. This allows us to support data that doesn’t come in standard tabular forms. So, we can now include things like environmental data in GIS formats and phylogenetic data such as supertrees. We’ve used this new capability to allow the automatic downloading of the Bioclim data, one of the most widely used climate datasets in ecology, and the supertree for mammals from Fritz et al. 2009.
I am incredibly excited to announce that I am the recipient of one of the Moore Foundation’s Investigators in Data-Driven Discovery awards.
To quote Chris Mentzel, the Program Director of the Data-Driven Discovery Initiative:
Science is generating data at unprecedented volume, variety and velocity, but many areas of science don’t reward the kind of expertise needed to capitalize on this explosion of information. We are proud to recognize these outstanding scientists, and we hope these awards will help cultivate a new type of researcher and accelerate the use of interdisciplinary, data-driven science in academia.
I feel truly honored to have been selected. All the finalists that I met at the Moore Foundation in July were amazing as were all of the semi-finalists that I knew. I did not envy the folks making the final decisions.
So what will we be doing with this generous support from the Moore Foundation?
- Doing data-intensive prediction and forecasting in ecological systems: We’ll be focusing on population and community level forecasting as well as ecosystem level work where it interfaces with community level approaches. We’ll be using both process-based ecological approaches and machine learning, with an emphasis on developing testable predictions and evaluating them with independent (out-of-sample) data. As part of this effort we’ll be making publicly available forecasts for large ecological datasets prior to the collection of the next round of data, following Brian McGill’s 6th P of Good Prediction (in fact we’ll be trying to follow all of his P’s as much as possible). There’s a lot of good work in this area and we’ll be building on it rather than reinventing any wheels.
- Increasing the emphasis on testable prediction and forecasting in ecology more broadly: Industry and other areas of science have improved their prediction/forecasting through competitions that provide data with held-out values and challenge folks to see who can do the best job of predicting those values (most notably in Kaggle competitions). We’ll be helping put together something like this for ecology and hopefully integrating that with our advanced predictions to allow other folks to easily make predictions from their models public and have them evaluated automatically as new data is released.
- Tools for making data-intensive approaches to ecology easier: We’ll be continuing our efforts to make acquiring and working with ecological data easier. Our next big step is to make combining numerous ecological and environmental datasets easy so that researchers can focus on doing science rather than assembling data.
- Training: We’ll be helping build and grow Data Carpentry, a new training effort that is a sister project to Software Carpentry with a focus on data management, manipulation and analysis.
I’m very excited to be joined in this honor by my open science/computational training/data-intensive partner in crime C. Titus Brown (@ctitusbrown). I was also particularly thrilled to find out that I wasn’t the only investigator studying ecological systems. Laurel Larsen is in the Geography department at Berkeley and I can’t wait to interact with her more as we both leverage large amounts of ecological data to improve our understanding of ecological systems and our ability to forecast their states in the future. We are joined by astronomers, statisticians, computer scientists, and more. Check out the entire amazing group at the official Moore Foundation Investigators site and see the full press release for additional details about the program.
The award is being run through the University of Florida since we are in the process of relocating there, but I owe a huge debt of gratitude to the Biology Department and the Ecology Center at Utah State University for always supporting me while I spent time developing software, working on computational training initiatives, and generally building a data-intensive ecology program. Without their support I have no doubt that I wouldn’t be writing this blog post today.
So here it is, the first of the positions we’ll be advertising as part of our move to the University of Florida. The official ad is below, but a few comments first. The position is for a student to work with me, but for those who aren’t really familiar with our groups, it’s important to note that my group works closely with Ethan White’s lab (we provide desk space that mixes the labs together, we have a single group lab meeting, etc). My group tends to attract people who like to do field work. Ethan’s tends to attract people who are more quantitatively or computationally inclined. We mix our groups because we believe that the divide that exists between quantitative and field-based approaches to ecology is bad for our science and that we need more people trained to serve as bridges between the quantitative and field-oriented worlds of ecology.
Here are some links to the papers my students have published from their dissertations to get a feel for what my students have intellectually gotten out of this environment:
PH.D STUDENT OPENING IN COMMUNITY ECOLOGY
The Ernest Lab at the University of Florida has an opening for a Ph.D student in the area of Community Ecology to start fall 2015. The student will be supported as a graduate research assistant as part of an NSF-funded project at a long-term research site (portalproject.weecology.org) in southeastern Arizona to study regime shifts (rapid shifts in ecosystem structure and function). This position will participate in data collection efforts in Arizona on rodents and plants.
The Ernest lab is interested in general questions about the processes that structure communities, with a particular focus on understanding how ecological communities change through time. Students are free to develop their own research projects depending on their interests.
The Ernest Lab is currently at Utah State University, but is moving to the Department of Wildlife Ecology and Conservation at the University of Florida starting summer 2015.
More information about the lab is available at: http://ernestlab.weecology.org
Interested students should contact Dr. Morgan Ernest (firstname.lastname@example.org) by Nov 15th, 2014 with their CV, GRE scores, and a brief statement of research interests.
We are excited to announce that Weecology will be moving to the University of Florida next summer. We were recruited as part of the UF Rising Preeminence Plan, a major hiring campaign to bring together researchers in a number of focal areas including Big Data and Biodiversity. We will both be joining the Wildlife Ecology and Conservation department; Ethan will be part of UF’s new Informatics Institute, and Morgan will be part of UF’s new Biodiversity Initiative.
As excited as we are about the opportunities at Florida, we are also incredibly sad to be saying goodbye to Utah State University. Leaving was not an easy decision. We have amazing colleagues and friends here in Utah that we will greatly miss. We have also felt extremely well treated by Utah State. They were very supportive while we were getting our programs up and running, including helping us solve the two-body problem. They allowed us to take risks in both research and the classroom. They have been incredibly supportive of our desires for work-life balance, and were very accommodating following the birth of our daughter. It was a fantastic place to spend nearly a decade and we will miss it and the amazing people who made it home.
So why are we leaving? It was a many-faceted decision, but at its core was the realization that the scale of the investment and recruiting of talented folks in both of our areas of interest was something we were unlikely to see again in our careers. The University of Florida has always had a strong ecology group, but between the new folks who have already accepted positions and those we know who are being considered, it is going to be such a talented and exciting group that we just had to be part of it!
As part of the move we’ll be hiring for a number of different positions, so stay tuned!
As announced by Noam Ross on Twitter (and confirmed by the Editor in Chief of Ecology Letters), Ecology Letters will now allow the submission of manuscripts that have been posted as preprints. Details will be published in an editorial in Ecology Letters. I want to say a heartfelt thanks to Marcel Holyoak and the entire Ecology Letters editorial board for listening to the ecological community and modifying their policies. Science is working a little better today than it was yesterday thanks to their efforts.
For those of you who are new to the concept of preprints, they are manuscripts that have not yet been published in peer reviewed journals and are posted to websites like arXiv, PeerJ, and bioRxiv. This process allows for more rapid communication of scientific results and improved quality of published papers through more expansive pre-publication peer review. If you’d like to read more check out our paper on The Case for Open Preprints in Biology.
The fact that Ecology Letters now allows preprints is a big deal for ecology because they were the last of the major ecology journals to make the transition. The ESA journals began allowing preprints just over two years ago and the BES journals made the switch about 9 months ago. In addition, Science, Nature, PNAS, PLOS Biology, and a number of other ecology journals (e.g., Biotropica) all support preprints. This means that all of the top ecology journals, and all of the top general science journals that most ecologists publish in, allow the posting of preprints. As such, there is no longer a reason to avoid posting preprints based on the possibility of not being able to publish in a preferred journal. This can potentially shave months to years off of the time between discovery and initial communication of results in ecology.
It also means that other ecology journals that still do not allow the posting of preprints are under significant pressure to change their policies. With all of the big journals allowing preprints they have no reasonable excuse for not modernizing their policies, and they risk losing out on papers that are initially submitted to higher profile journals and are posted as preprints.
It’s a good day for science. Celebrate by posting your next manuscript as a preprint.
If you want more info, you should email one of the people who signed the email below (I’ve linked to their websites). I’m not an organizer, just a messenger!
We are delighted to announce an upcoming joint meeting of the BES Macroecology Group, the GfO Macroecology Group, and the Center for Macroecology, Evolution & Climate (CMEC). The meeting will be hosted by CMEC in Copenhagen, Denmark during June 2015. We are sure it will provide an exciting opportunity for the members of these groups to share their latest research and ideas, and to initiate new collaborations in the relatively informal atmosphere consistent with the society group meetings.
To help us find the best dates, length of meeting and a good estimate of participant numbers, we would appreciate it if you could spare a couple of minutes to fill out this very short survey: https://www.surveymonkey.com/s/KKFBMHY
Thanks very much
The organising committee
BES Macroecology: http://macroecologyuk.weebly.com/
GfO Macroecology: http://www.gfoe.org/en/gfoe-specialist-groups/macroecology.html
UPDATE: Fixed lots of broken links and a couple of typos
There is a lot of discussion on the internet about highly skewed speaker lists at symposia and conferences. For the past year, I’ve been co-organizing a small conference (~110 people) with Michael Angilletta where we’ve been practicing some of the approaches I developed and blogged about earlier for organizing a seminar series. However, in ecology we know that what works at small scales may not apply to larger scales. So, do I still think organizing a conference that is both strong on research and gender diversity is very doable? Read and find out.
But before I give you my thoughts, first, some stats and background. For this conference, we had a lot of moving pieces: Discussion Leaders who helped organize their sessions, Invited Speakers – both for long and short talks, a Mentoring Program for Young Scientists which involved selecting both mentors and mentees. In the end (I hope, I’m writing this about a week before the conference, so hopefully things don’t change drastically), we ended up with the following numbers for each of these parts.
Discussion Leaders: 5 men, 4 women
Invited Long-Talk Speakers (40 min talks): 9 men, 9 women
Invited Short-Talk Speakers (12 min talks): 9 men, 7 women
Students/Postdocs in Mentoring Program: 10 men, 10 women
Professors (all ranks) in Mentoring Program: 11 men, 9 women
So over all the slots that we invited people to fill, we have 53% men.
What did we do? Just like I talked about in my post about the seminar series, we generated a large pool of names. We started by making a big list of people to lead the various sessions at the conference and developed an invite list that was balanced. We then used our balanced group of Discussion Leaders to brainstorm potential speakers. Each Discussion Leader provided a list of people they thought would be excellent for their session. They were given detailed instructions about how to generate their list – diverse perspectives on their topic, diversity of taxa/ecosystems, including domestic and international scientists, and a reminder to be aware of the gender ratios of their list.
From those lists, Mike and I sat down and constructed our dream team speaker list – balancing research areas, topics and taxa/ecosystems, career stages, making sure we had some international representation, and keeping an eye on gender balance in the process. Then we set out to convince these people to come speak at the conference.
For the Mentoring Program, we ran an application process. We advertised on every social media outlet and listserv we could think of. Our pool of applicants was very gender balanced (23 women, 21 men). We selected 20 young scientists, split equally between men and women, again balancing across various dimensions of research & people diversity.
The Mentors and Short Talk Speakers were harder. Most of our Short Talk Speakers are students from the mentoring program but we had some slots leftover to fill. Both the mentor and short talk speakers needed to fill specific topic requirements either for the program or to overlap with already chosen mentees.
So what lessons did I learn?
Gender balancing a conference is hard, but not in the ways I generally heard about before I started. It is not harder to find female speakers – as long as you don’t restrict yourself to only senior female professors. There are lots of kickass women out there, but you need to embrace the fact that they are scattered across career stages. Women were not more likely to say no. I don’t know why we didn’t experience this commonly reported problem. Maybe it was because I was sending the invites. Are women more likely to say yes to another woman asking? I also spent a lot of time sending personalized emails communicating why we thought they in particular would be a good fit for the conference and why we wanted them there (I did this for the men too). What was hard about it was spending the extra time sending personalized emails to communicate clearly why I was inviting them. Did those efforts make a difference? I really don’t know. You’ll have to ask the amazing scientists who said yes to our invites.
Developing at the get-go a diverse pool of people you would like involved is critical. This is another time intensive step. Crowd-sourcing this to our Discussion Leaders helped a lot. Many of them knew speakers (men and women) we hadn’t thought of. When we pooled all those suggestions, we had 123 suggestions for 16 speaking slots. That gave us a ton of flexibility when thinking about the program we wanted to create. It was also really handy when someone said no because all the brainstorming work had already been done. We could sit down with our list and come to consensus quickly on the next invite to send. We often saved up rejections to fill as a group, thus allowing us to manage the diversity better.
The more restricted the slot you’re trying to fill, the harder it is to get gender balance. If your need is 2 kickass people who work in general area X, then gender balance is easy. The more criteria you place or the fewer the number of slots (or both) the harder it gets. Need a senior researcher studying organism X on specific subtopic Y and need another senior researcher studying organisms Z on specific subtopic A in ecosystem Q? Yeah, both those slots are probably going to end up being men, just because of the numbers game. View your program creatively. Be willing to think about different ways people can fit into the program given the diversity of research you’re trying to cover and the multiple facets that everyone has in their research programs.
So my final thoughts on the matter? Making a gender balanced conference is not easy and because of the strong gender skew at the senior levels, it doesn’t just magically happen. It takes work, planning, creativity, and a great team of people helping you brainstorm names. But an 80:20 split in invited speakers is far from the grim ‘reality’ that some might think.