May 14-15, 2015
9:00 am - 5:00 pm
Instructors: Mine Çetinkaya-Rundel (Dept. of Statistical Science), Karen Cranston (NESCent), Hilmar Lapp (GCB), François Michonneau (University of Florida, iDigBio)
Helpers: Dan Leehr (GCB)
Making science more reproducible has the potential to advance scientific research and make researchers' work more effective and productive. This is particularly true for computational and data-intensive research, which is increasingly pervasive across the sciences, yet reproducibility there is often seen as difficult to achieve. In this 2-day bootcamp-style hands-on workshop, we will teach a number of tools, resources, and practices that can be employed today to make one's computational science more reproducible.
The course is co-run by Duke's Center for Genomic and Computational Biology (GCB), and is the result of the Reproducible Science Curriculum Hackathon that was held December 8-11, 2014, at the National Evolutionary Synthesis Center (NESCent) in Durham. The hackathon and instructor travel are supported by the National Science Foundation (NSF).
Who: The course is aimed at graduate students, postdocs, and other researchers who perform computational analysis or work. The material on automation uses basic R for teaching and illustrating the key concepts. Advanced knowledge of R is not needed.
Where: Bostock Library, 411 Chapel Dr, Durham, NC 27708. Get directions with OpenStreetMap or Google Maps.
Requirements: Participants must bring a laptop with a few specific software packages installed (listed below). They are also required to abide by our Code of Conduct, which we have adopted from Software Carpentry.
The course is free but requires registration. As a courtesy to others, if you register and later find that you cannot attend, please cancel as early as possible.
Contact: Please email hlapp@duke.edu for more information.
Day 1 (May 14)
09:00 | Introduction to Reproducible Research
10:30 | Coffee
12:30 | Lunch break
13:30 | Organizing your project to facilitate Reproducible Research
15:00 | Coffee
17:00 | Wrap-up
Day 2 (May 15)
09:00 | Automating your workflows
10:30 | Coffee
12:30 | Lunch break
13:30 | Sharing and publishing your research workflow
15:00 | Coffee
16:30 | Wrap-up
Etherpad: https://etherpad.mozilla.org/ZL70mnoRpS.
We will use this Etherpad for chatting, taking notes, and sharing URLs and bits of code.
R is a programming language that is especially powerful for data exploration, visualization, and statistical analysis. To interact with R, we use RStudio.
If you installed R and/or RStudio previously, please update to the latest versions. For this workshop we will use R 3.2.0 and RStudio 0.98.1103.
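To see whether an existing R installation is recent enough, something like the following can be run at the R console. This is only a minimal sketch: the helper name meets_minimum is hypothetical and not part of the workshop materials, and it checks R itself, not RStudio.

```r
# Hypothetical helper (not from the workshop materials): returns TRUE when the
# running R is at least the given version. getRversion() is part of base R.
meets_minimum <- function(required = "3.2.0") {
  getRversion() >= required
}

# Print the full version string of the running R, then the check result.
print(R.version.string)
print(meets_minimum("3.2.0"))
```

If the second line prints FALSE, update R before the workshop.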
Windows: Install R by downloading and running this .exe file from CRAN. Also, please install the RStudio IDE. You can watch this video that demonstrates the process.
Mac OS X: Install R by downloading and running this .pkg file from CRAN. Also, please install the RStudio IDE. You can watch this video that demonstrates the process.
Linux: You can download the binary files for your distribution from CRAN, or you can use your package manager. For Debian/Ubuntu, run
sudo apt-get install r-base
and for Fedora, run
sudo yum install R
Also, please install the RStudio IDE.
Start RStudio, and type (or copy and paste) at the console:
install.packages(c("knitr", "rmarkdown", "ggplot2", "dplyr"))
Please do this even if you think you have these packages already.
Older versions of a package sometimes lack dependencies that newer versions
install, and our materials assume those dependencies are present.
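After the installation finishes, one way to confirm that the four packages are available is sketched below. The helper name missing_packages is hypothetical and not part of the workshop materials. It relies only on requireNamespace() from base R, which reports whether a package can be loaded without attaching it.

```r
# Hypothetical helper (not from the workshop materials): returns the names of
# the packages in `pkgs` that cannot be loaded in this R installation.
missing_packages <- function(pkgs) {
  available <- vapply(
    pkgs,
    function(p) requireNamespace(p, quietly = TRUE),
    logical(1)
  )
  unname(pkgs[!available])
}

# The packages listed in the install.packages() call for this workshop.
workshop_pkgs <- c("knitr", "rmarkdown", "ggplot2", "dplyr")
print(missing_packages(workshop_pkgs))
```

An empty result (character(0)) suggests all four packages can be loaded. Any names that are printed can be passed to install.packages() again.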