forked from palday/lme4-convergence
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathdivergent_convergence.Rmd
168 lines (139 loc) · 6.34 KB
/
divergent_convergence.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
---
title: "Convergence in lme4 and lme4.0"
author: "Phillip Alday"
date: `r as.character(Sys.time())`
output: html_document
---
```{r pkgs,message=FALSE}
library(knitcitations)
library(RCurl)
library(lme4)
library(lme4.0)
lme4ver <- as.character(packageVersion("lme4"))
lme4.0ver <- as.character(packageVersion("lme4.0"))
```
```{r opts,echo=FALSE}
opts_chunk$set(tidy=FALSE)
```
The differences in convergence between "old" and "new" `lme4` is well known and often leads to substantially poorer fits with new lme4 for psycholinguistic datasets `r citep("https://hlplab.wordpress.com/2014/03/17/old-and-new-lme4/")`. For the data set here, the difference in fits leads to similar estimates but very different standard errors and thus $t$-values.
The following was done using lme4 `r lme4ver` and lme4.0 `r lme4.0ver`.
```{r setup}
c. <- function(x) scale(x, center=TRUE, scale=FALSE)
cs. <- function(x) scale(x, center=TRUE, scale=TRUE)
reml <- FALSE
modeldata <- read.table("data.tab",header=TRUE,
colClasses=c(mean="numeric",
subj="factor",item="factor",
roi="factor",win="factor",
sdiff="numeric", dist="numeric",
signdist="numeric",ambiguity="factor"))
modeldata <- subset(modeldata, roi == 'Left-Posterior')
modeldata.n400 <- subset(modeldata,win=="N400")
form <- mean ~ ambiguity * c.(sdiff) + (1+c.(sdiff)|item) +
(1+c.(sdiff)|subj)
form.s <- mean ~ ambiguity * cs.(sdiff) + (1+cs.(sdiff)|item) +
(1+cs.(sdiff)|subj)
```
## Plots
(Haven't found a great way to do this, just trying to look
to see whether there is something odd about the data ...)
```{r pix}
library(ggplot2); theme_set(theme_bw())
mm <- transform(modeldata.n400,sdiff=cs.(sdiff),
subj=reorder(subj,mean),
item=reorder(item,mean))
mm0 <- subset(mm,select=c(mean,sdiff,ambiguity,subj,item))
## save("mm0",file="mm0.RData")
## L <- load("mm0.RData")
ggplot(mm0,aes(mean,subj,colour=sdiff))+geom_point(alpha=0.5)+
facet_grid(ambiguity~.)
```
# lme4
```{r lme4fitnew,cache=TRUE}
t.new <- system.time(sdiff.new <- lme4::lmer(form,
data=modeldata.n400, REML=reml))
t.new
print(summary(sdiff.new),correlation=FALSE)
```
Note we still get convergence warnings, even with lme4 `r lme4ver`; this means they are probably **not** just false positives (so in this case they are a good thing!) Should have gotten scaling warnings too ???
```{r lme4fitnew.S,cache=TRUE}
sdiff.new.scaled <- update(sdiff.new,formula=form.s)
```
# lme4.0
```{r lme4fitold,cache=TRUE}
t.old <- system.time(sdiff.old <- lme4.0::lmer(form,
data=modeldata.n400, REML=reml))
t.old
summary(sdiff.old)
```
As stated, the important difference in the fixed effects is the
size of the standard error in `c.(sdiff)` ...
```{r coefplot,echo=FALSE}
tf <- function(x,model="") {
x2 <- as.data.frame(coef(summary(x)))
x2$var <- rownames(x2)
rownames(x2) <- NULL
x2 <- x2[-1,c(4,1,2)]
names(x2) <- c("var","est","stder")
data.frame(model,x2)
}
cc <- rbind(tf(sdiff.new,"new"),tf(sdiff.old,"old"))
ggplot(cc,aes(x=model,y=est,ymin=est-2*stder,ymax=est+2*stder,
colour=model))+geom_pointrange()+
facet_wrap(~var,scale="free")
```
Compare theta values? (We can't get confidence intervals without
profiling ...)
# lme4 starting from lme4.0 results
```{r lme4fitstart,cache=TRUE}
oldtheta <- getME(sdiff.old,"theta")
newtheta <- lme4::getME(sdiff.new,"theta")
sdiff.oldstart <- lme4::lmer(form,
data=modeldata.n400, REML=reml,
start=list(theta=getME(sdiff.old,"theta")))
sdiff.oldstart.nlminb <- lme4::lmer(form,
data=modeldata.n400, REML=reml,
start=list(theta=getME(sdiff.old,"theta")),
control=lmerControl(optimizer="optimx",
optCtrl=list(method="nlminb")))
```
We have an interesting problem here. New `lme4` calculates the
deviance slightly differently; at $\hat \theta_{\textrm{old}}$
it computes a deviance value about 29 units higher than `lme4.0`.
If we start `lme4` from that point, we actually get a **worse**
result (the deviance goes up by about 480 units!) --- presumably
we get thrown off by the initial displacement of the points from
the starting value ...
```{r cmpdevs}
sdiff.newdev <- update(sdiff.new,devFunOnly=TRUE)
devvec <- c(old=deviance(sdiff.old),new=deviance(sdiff.new),
oldtheta.newdev=sdiff.newdev(oldtheta),
oldstart=deviance(sdiff.oldstart),
oldstart.nlminb=deviance(sdiff.oldstart.nlminb))
print(sort(devvec-min(devvec)),digits=3)
```
```{r get_allfit}
afurl <- paste0("https://raw.githubusercontent.com/lme4/",
"lme4/master/misc/issues/allFit.R")
eval(parse(text=getURL(afurl)))
```
I should suppress the warnings here (they'll be preserved in the `@optinfo` slot anyway), but that will require refreshing the cache, so I'm not going to do it right now ...) The second `allFit` call (starting from the `lme4.0` fitted values) doesn't work at all, for some structural reason I don't understand (having to do with package loading/unloading I think). All of this is sufficiently lengthy that I should consider putting it in a batch file rather than relying on knitr caching ...
```{r comp_allfit,warning=FALSE,message=FALSE,cache=TRUE}
detach("package:lme4.0")
t.all <- system.time(sdiff.new.all <- allFit(sdiff.new))
sdiff.oldstart.all <- allFit(sdiff.oldstart)
library(lme4.0)
```
The bottom line is that trying lots of different optimizers doesn't do us any good in this case. Oddly enough, our Nelder-Mead implementation (which we have found to be generally less reliable than BOBYQA for big problems) turns out to do *best* of all the options (other than `lme4.0`) in this case ...
```{r cmplik}
is.OK <- !sapply(sdiff.new.all,inherits,"error")
all.dev <- c(sapply(sdiff.new.all[is.OK],deviance),lme4.0=deviance(sdiff.old))
print(sort(all.dev-min(all.dev)),digits=4)
```
Now try with scaled data:
```{r comp_allfit_scaled,warning=FALSE,message=FALSE,cache=TRUE}
detach("package:lme4.0")
t.all <- system.time(sdiff.new.scaled.all <- allFit(sdiff.new.scaled))
library(lme4.0)
```
The code and data for this example can be found on [GitHub](https://github.com/palday/lme4-convergence)