This repository has been archived by the owner on Feb 6, 2020. It is now read-only.
---
title: "Documenting R Code"
author: "Kelly Sovacool"
date: "5/21/2019"
output:
  ioslides_presentation:
    widescreen: true
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(eval = FALSE)
```
## Why document your code?
- Reproducibility
- [FAIR](https://doi.org/10.1038/sdata.2016.18) data principles: findable, accessible, **interoperable**, **reusable**
- Make it usable for other scientists -- including future you!
See papers listed at the [Riffomonas Tutorial](http://www.riffomonas.org/reproducible_research/) for further reading on reproducibility.
## Ten simple rules for documenting scientific software
### Excerpted from [Lee et al. 2018](https://doi.org/10.1371/journal.pcbi.1006561):
- Write comments as you code.
- Include a README file with basic information.
- Version control your documentation.
- Use automated documentation tools.
- Write error messages that provide solutions or point to your documentation.
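
The last rule can be sketched in a few lines of base R. The function name `read_metadata` and the file path are hypothetical, chosen only for illustration:

```{r}
# Hypothetical helper: read a metadata file, failing with an actionable message.
read_metadata <- function(path) {
  if (!file.exists(path)) {
    stop(
      "Metadata file not found: '", path, "'.\n",
      "Check the path, or see ?read_metadata for the expected file layout.",
      call. = FALSE  # omit the call from the message to keep it readable
    )
  }
  read.csv(path, stringsAsFactors = FALSE)
}
```

The message says what went wrong, suggests a fix, and points to the help page.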
## What not to document
- "Document Design and Purpose, Not Mechanics" - [Wilson et al. 2014](https://doi.org/10.1371/journal.pbio.1001745)
- In other words, "Why, not how".
- Don't do this:
```{r}
i <- i + 1 # increment `i` by 1
```
- Try to write code that is self-documenting.
- Use descriptive variable names. If you have to write a comment to describe your variable, you probably need to rename it. Examples:
    - `counter` instead of `i`
    - `patient_metadata` instead of `df`
    - `data_cleaned` instead of `df2`
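
To make "why, not how" concrete, here is a small before-and-after sketch. The data values and column names (`weight_kg`, `height_m`, `bmi`) are made up for illustration and do not come from any dataset in this repository:

```{r}
# Less clear: the names say nothing and the comment repeats the code.
df <- data.frame(w = c(70, 95), h = c(1.75, 1.70))
df$b <- df$w / df$h^2  # divide w by h squared

# Clearer: descriptive names carry the "how"; the comment explains the "why".
patient_metadata <- data.frame(weight_kg = c(70, 95), height_m = c(1.75, 1.70))
# BMI is used downstream to group patients into weight categories.
patient_metadata$bmi <- patient_metadata$weight_kg / patient_metadata$height_m^2
```

Both versions compute the same numbers; only the second can be read without guessing.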
## What to document
- No code is fully self-documenting.
- You should document:
    - packages
    - datasets
    - functions
    - classes
    - any tricky lines of code
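
As one possible illustration of a "tricky line" that deserves a comment, consider looping over a vector that might be empty. The function `sum_non_missing` is a hypothetical example, not part of this repository:

```{r}
# Sum the non-missing values, tolerating an empty input vector.
sum_non_missing <- function(values) {
  total <- 0
  # seq_along() is deliberate: 1:length(values) would yield c(1, 0) for an
  # empty vector, so the loop body would run on indices that do not exist.
  for (index in seq_along(values)) {
    if (!is.na(values[index])) {
      total <- total + values[index]
    }
  }
  total
}
```

The comment records a decision that a reader could not recover from the code alone.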
## How to document
### The bare minimum:
Include a comment at the top of your R script to briefly describe what it does at a high level.
```{r}
# Generate plots from mothur sensspec files for comparing clustering algorithms.
library(ggplot2)
```
## How (specifics for R)
### Best practices:
- For any project with R scripts:
    - Write comments alongside your code in `roxygen2` syntax to generate the `man/` help files.
    - Make your project a package with `usethis` & `devtools`.
- For packages you release into the wild:
    - Write a vignette with `R Markdown`.
    - Create a website with `pkgdown`.
## `roxygen2` syntax
[Document functions](http://r-pkgs.had.co.nz/man.html#man-functions) in `R/*.R` files
```{r}
#' Add together two numbers.
#'
#' @param x A number.
#' @param y A number.
#' @return The sum of \code{x} and \code{y}.
#' @examples
#' add(1, 1)
#' add(10, 1)
add <- function(x, y) {
  x + y
}
```
## `roxygen2` syntax
[Document datasets](http://r-pkgs.had.co.nz/data.html#documenting-data) in `R/data.R`
```{r}
#' Prices of 50,000 round cut diamonds.
#'
#' A dataset containing the prices and other attributes of almost 54,000
#' diamonds.
#'
#' @format A data frame with 53940 rows and 10 variables:
#' \describe{
#'   \item{price}{price, in US dollars}
#'   \item{carat}{weight of the diamond, in carats}
#'   ...
#' }
#' @source \url{http://www.diamondse.info/}
"diamonds"
```
## `roxygen2` syntax
[Document the package](http://r-pkgs.had.co.nz/man.html#man-packages) in `R/package_name.R`
```{r}
#' foo: A package for computing the notorious bar statistic.
#'
#' The foo package provides three categories of important functions:
#' foo, bar and baz.
#'
#' @section Foo functions:
#' The foo functions ...
#'
#' @docType package
#' @name foo
NULL
```
## Activity
Let's document code from the [Riffomonas minimalR](http://www.riffomonas.org/minimalR/06_line_plots.html) tutorial!
1. The package: `minimalR`. Edit [R/minimalR.R](R/minimalR.R).
1. The dataset: `baxter_metadata`. Edit [R/data.R](R/data.R)
    (raw data in `inst/extdata/`, processed data in `data/`).
1. Functions: Edit [R/baxter.R](R/baxter.R).
    + `get_metadata`
    + `get_bmi`
    + `get_bmi_category`
    + `is_obese`
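
As a starting point, documentation for two of these functions might look like the sketch below. The argument names, units, and function bodies are assumptions made for illustration; the real definitions in `R/baxter.R` may differ, so adapt the comments to the actual code. The cutoff of 30 kg/m^2 is the standard WHO threshold for obesity.

```{r}
#' Calculate body mass index (BMI).
#'
#' @param weight_kg Weight in kilograms (assumed argument name).
#' @param height_cm Height in centimeters (assumed argument name).
#' @return BMI in kg/m^2, as a numeric vector.
#' @examples
#' get_bmi(weight_kg = 70, height_cm = 175)
get_bmi <- function(weight_kg, height_cm) {
  weight_kg / (height_cm / 100)^2
}

#' Is a BMI in the obese range?
#'
#' @param bmi BMI in kg/m^2.
#' @return \code{TRUE} where \code{bmi} is at least 30, \code{FALSE} otherwise.
#' @examples
#' is_obese(get_bmi(weight_kg = 95, height_cm = 170))
is_obese <- function(bmi) {
  bmi >= 30
}
```

Stating the units in `@param` saves readers from having to infer them from the function body.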
## Activity
1. Clone this repo
```{bash}
git clone https://github.com/SchlossLab/documenting-R
```
or if you previously cloned it, pull new commits:
```{bash}
cd path/to/documenting-R/ && git pull
```
1. Check out a new branch
```{bash}
git checkout -b descriptive-branch-name
```
1. After modifying your part, commit your changes
```{bash}
git add . && git commit -m "descriptive commit message"
```
## Activity Wrap-up
1. Push your changes
```{bash}
git push -u origin descriptive-branch-name
```
1. Open a pull request on GitHub to merge your branch into master.
Mention your issue number(s) & assign me to the PR.
![new PR](figures/new_PR.png){height=275px} ![example PR](figures/example_PR.png){height=275px}
## R Packages
Set up an R package
```{r}
library(usethis)
library(devtools)
# Create the package skeleton in the current working directory.
create_package(getwd())
```
Specify dependencies
```{r}
# Each call adds the package to the Imports field of DESCRIPTION.
use_package("dplyr")
use_package("readxl")
```
Compile documentation
```{r}
# Generate the man/*.Rd help files and NAMESPACE from the roxygen2 comments.
devtools::document()
```
## Additional Reading
- [R Packages](http://r-pkgs.had.co.nz/man.html) - Hadley Wickham
- [pkgdown](https://pkgdown.r-lib.org/): build a website for your R package
- [Mastering Software Development in R](https://bookdown.org/rdpeng/RProgDA/documentation.html) - Roger Peng, Sean Kross, & Brooke Anderson
```{r, echo=FALSE}
library(rmarkdown)
rmarkdown::render('documentingR.Rmd', output_file = 'docs/documentingR.html')
```