-
Notifications
You must be signed in to change notification settings - Fork 1
/
Copy pathemma_model.Rmd
89 lines (67 loc) · 3.13 KB
/
emma_model.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
---
title: "EMMA Prototype"
description: Modeling vegetation postfire recovery data
author:
- name: Adam Wilson & Glenn Moncrieff
date: 10-13-2021
editor_options:
chunk_output_type: console
output:
github_document:
html_preview: false
---
```{r, echo=F}
library(targets)
library(tidyverse)
# load data saved in the pipeline
tar_load(c(model, model_fit, posterior_summary))
```
The details are given in [@slingsby_near-real_2020;@wilson_climatic_2015], but in short what we do is estimate the age of a site by calculating the years since the last fire. We then fit a curve to model the recovery of vegetation (measured using NDVI) as a function of it's age. For this we use a negative exponential curve with the following form:
$$\mu_{i,t}=\alpha_i+\gamma_i\Big(1-e^{-\frac{age_{i,t}}{\lambda_i}}\Big)$$
where $\mu_{i,t}$ is the expected NDVI for site $i$ at time $t$
The observed greenness $NDVI_{i,t}$ is assumed to follow a normal distribution with mean $\mu_{i,t}$
$$NDVI_{i,t}\sim\mathcal{N}(\mu_{i,t},\sigma_)$$
An additional level models the parameters of the negative exponential curve as a function of environmental variables. This means that sites with similar environmental conditions should have similar recovery curves. The full model also includes a sinusoidal term to capture seasonal variation, but lets keep it simple here.
## ADVI
We have `age` in years, a plot identifier `pid`. the observed ndvi `nd` and two plot level environmental variable `env1`, which is mean annual precipitation, and `env2`, which is the summer maximum temperature.
Lets load up our Stan model which codes the model described above. This is not a particularly clever or efficient way of coding the model, but it is nice and readable and works fine on this example dataset
How long did that take?
```{r vb_time, echo=T, eval=T, message=FALSE}
model_fit$time()$total
```
```{r p1, echo=T, eval=T, message=FALSE}
params=c("tau","alpha_mu","gamma_b2","gamma_b1","lambda_b1","lambda_b2")
posteriors=model_fit$draws(params) %>%
as_tibble() %>%
gather(parameter)
ggplot(posteriors,aes(x=value))+
geom_density(fill="grey")+
facet_wrap(~parameter,scales = "free")
```
## Plot
When we make this comparison, the posterior predictive intervals from ADVI and MCMC are almost identical
```{r plot, echo=T, eval=T, message=FALSE}
posterior_summary %>%
filter(pid %in% as.numeric(sample(levels(as.factor(posterior_summary$pid)),20))) %>% # just show a few
ggplot(aes(x=age)) +
geom_line(aes(y=mean),colour="blue") +
geom_line(aes(y=nd),colour="black",lwd=0.5,alpha=0.3) +
geom_ribbon(aes(ymin=q5,ymax=q95),alpha=0.5)+
facet_wrap(~pid) +
xlim(c(0,20))+
labs(x="time since fire (years)",y="NDVI") +
theme_bw()
```
# Spatial Predictions
This section is not yet working - need to get the coordinates in the original data set.
```{r compare_data2, echo=T, eval=F, message=FALSE}
stan_spatial <- stan_vb %>%
mutate(pid=gsub("[]]","",gsub(".*[[]","",variable))) %>%
bind_cols(select(data,x,y,age,nd))
foreach(t=unique(raw_data$DA),.combine=stack) %do% {
stan_spatial %>%
filter(DA=t) %>%
select(x,y,age,nd,mean,q5) %>%
rasterFromXYZ()
}
```