-
Notifications
You must be signed in to change notification settings - Fork 1
/
Copy path_targets.Rmd
110 lines (79 loc) · 3.36 KB
/
_targets.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
---
title: "Target Markdown"
output: html_document
---
```{r setup, include = FALSE}
knitr::opts_chunk$set(collapse = TRUE, comment = "#>")
```
Target Markdown is a powerful R Markdown interface for reproducible analysis pipelines, and the chapter at https://books.ropensci.org/targets/markdown.html walks through it in detail. This R Markdown report the example from the chapter. Try it out in both interactive and non-interactive modes, either by running the code chunks in different ways or setting the `tar_interactive` chunk option.
# Packages
The example requires several R packages, and `targets` must be version 0.5.0.9000 or above.
```{r, eval = FALSE}
install.packages(c("biglm", "dplyr", "ggplot2", "readr", "targets", "tidyr"))
```
# Setup
If you are using old versions of `targets` (<= 0.7.0) and/or `knitr` (<= 1.33), you will need to load the `targets` package in the R Markdown document in order for Target Markdown code chunks to work.
```{r}
library(targets)
```
Near the top of the document, you may also wish to remove the `_targets_r` directory previously written by non-interactive runs of the report. Otherwise, your pipeline may contain superfluous targets.
```{r}
library(targets)
tar_unscript()
```
# Globals
We first define some global options/functions common to all targets. The function below plots a histogram of ozone concentrations, and our histogram target will need it.
```{targets example-globals, tar_globals = TRUE}
options(tidyverse.quiet = TRUE)
tar_option_set(packages = c("biglm", "dplyr", "ggplot2", "readr", "tidyr"))
create_plot <- function(data) {
ggplot(data) +
geom_histogram(aes(x = Ozone), bins = 12) +
theme_gray(24)
}
```
# Targets
Our first target borrows the `airquality` dataset built into base R.
```{targets raw-data}
tar_target(raw_data, airquality)
```
Our next targets preprocess the data, make a histogram, and fit a model.
```{targets downstream-targets}
list(
tar_target(data, raw_data %>% filter(!is.na(Ozone))),
tar_target(hist, create_plot(data))
)
```
Set the `tar_simple` chunk option to `TRUE` to define a single target with the command in the code chunk. The chunk below only contains `biglm(Ozone ~ Wind + Temp, data)` in the source, but because `tar_simple` is `TRUE`, it is shorthand for `tar_target(name = fit, command = biglm(Ozone ~ Wind + Temp, data))`. All other arguments to `tar_target()` are set to their default values (configurable with `tar_option_set()`).
```{targets fit, tar_simple = TRUE}
biglm(Ozone ~ Wind + Temp, data)
```
# Pipeline
If you ran all the `{targets}` chunks in non-interactive mode, then your R scripts are set up to run the pipeline.
```{r}
tar_make()
```
# Output
You can retrieve results from the `_targets/` data store using `tar_read()` or `tar_load()`.
```{r, message = FALSE}
library(biglm)
tar_read(fit)
```
```{r}
tar_read(hist)
```
The `targets` dependency graph helps your readers understand the steps of your pipeline at a high level.
```{r}
# some random EDA
if(F){
tar_load(data)
data %>%
gather("variable","value",-x,-y) %>%
filter(variable=="MODCF_interannualSD.tif") %>%
ggplot(aes(x=x,y=y,fill=value))+
# facet_wrap(~variable)+
geom_raster()+
geom_sf(data=region,inherit.aes = F,fill=NA,col="red")
}
```
At this point, you can go back and run `{targets}` chunks in interactive mode without interfering with the code or data of the non-interactive pipeline.