Hey Cooch, thanks for taking the time to dig into this a little bit. I had started to reply to you yesterday, but took so long to write it that phidot asked me to log back in and I lost the post. All the better, because it gave me the impetus to play around a bit more. One quick bit of apology:
No, I used a single release of 1,000 fishies on the first occasion of a 6-occasion study, which is what I thought you were describing in your original note (i.e., analogous to a batch release, followed over time).
I'm with you now and would've been with you before if I had only taken the time to properly read what you wrote.
Just put on your Bayesian hat -- think of a posterior (which is in effect what you're generating) -- mean, median, or mode? Smart folks say that focusing on the point estimate is missing the point (pun intended) -- the credible interval is the important point.
In my heart, I'm with you. But for this exercise I'm firmly seated in my frequentist's britches. That means the true parameter is fixed, the data are random, and while we care about how an estimator performs over the long run, in practice we usually only have one estimate, and it might be particularly good or particularly bad. For this exercise I just want to know how the estimator performs, so I was going with the frequentist definition of bias. But your comments got me thinking about exploring the long-run properties of the estimator by looking at more than just the mean. Here's what I found out:
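Just so we're using the same words (my gloss, not anything from Skalski et al.): by "bias" I mean the frequentist

[tex]\textrm{Bias}(\hat{\phi}) = E[\hat{\phi}] - \phi[/tex]

which I'll approximate below by the mean of 10,000 simulated estimates minus the true value.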
Mimicking Skalski et al., I started out by playing around with fewer occasions: only 3 detection events and a single release, for a 4-column detection history matrix. Based on your observation that
Systematic bias relative to true parameter is probably an artifact of temporal change in sample size, exacerbated by low p, and phi near a boundary.
I bumped that up to 4 detection events, for a 5-column detection history matrix. I performed three simulations, keeping the following fixed: an initial sample size of n = 1,000, detection probabilities of [tex]p = 0.5[/tex], and [tex]\phi_3 = 0.5[/tex], while choosing the first two survival probabilities such that [tex]\phi_1 \phi_2 = 0.25[/tex]. The three simulation scenarios were 1. [tex]\phi_1 = 5/6,\ \phi_2 = 0.3[/tex], 2. [tex]\phi_1 = 0.3,\ \phi_2 = 5/6[/tex], and 3. [tex]\phi_1 = \phi_2 = 0.5[/tex]. Thus we would expect 250 fish to remain for the third detection event in all three scenarios. For each simulation I randomly generated 10,000 datasets and estimated survival and detection parameters using the single-release CJS estimators outlined in Skalski et al. 1998. What follows is a brief tour of what I learned about CJS survival estimators and an interesting (to me, at least) exploration of the properties of the confidence intervals for the estimates.
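As a quick arithmetic check on that winnowing, using the scenario 3 values that also appear in the R code at the bottom of the post:

```r
# Expected number of fish still alive after each survival interval
# (scenario 3: n = 1000 released, phi = 0.5 for every interval)
n.rel <- 1000
phi <- c(0.5, 0.5, 0.5)
alive <- n.rel * cumprod(phi)
alive  # 500 250 125 -- 250 fish remain entering the third interval
```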
So first off, how did the estimators do? It looks like when the true parameter value was greater (not merely further from 0.5), the "bias" was slightly worse, but the temporal decline in sample size was the more dominant force:
Simulation 1:
Code:
> mean(phi.ests[,1])
[1] 0.8376835
> mean(phi.ests[,2])
[1] 0.3026313
> mean(phi.ests[,3])
[1] 0.5153572
The means of the estimates differ from the true parameter values by about 0.0043, 0.0026, and 0.0153, respectively.
Simulation 2:
Code:
> mean(phi.ests[,1])
[1] 0.3009724
> mean(phi.ests[,2])
[1] 0.8399773
> mean(phi.ests[,3])
[1] 0.5135659
The means of the estimates differ from the true parameter values by about 0.0009, 0.0066, and 0.0135, respectively.
Simulation 3:
Code:
> mean(phi.ests[,1])
[1] 0.5023404
> mean(phi.ests[,2])
[1] 0.5034649
> mean(phi.ests[,3])
[1] 0.5167618
The means of the estimates differ from the true parameter values by about 0.0023, 0.0035, and 0.0168, respectively.
The histogram of the simulated estimates of phi when you're up near a boundary is anything but 'normal' (or even symmetrical), meaning I'd not be sanguine about using the mean as the most appropriate moment to describe the distribution.
But recall that here in the mythical land of Frequencia, maximum likelihood estimators are normal in the province of Asymptotia. I don't think we've quite reached the border of that province, but one surprising result is that even though the mean of the estimates is generally higher than the true value, the median is spot on. So at least we can say that about half the time we'll be underestimating and half the time overestimating; it's just that when you overestimate, you can sometimes badly overestimate. Here is a histogram, arbitrarily taken from the third simulation, where all the phi's are equal. The blue dashed line is the mean, the green dotted line is the median, and the red line is the true parameter value. So yes, the mean is not the best descriptor of the distribution of the estimates, but as I said up above, in practice we don't get 10,000 estimates. We usually only get one, and try to make policy off it:
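Here's a toy illustration of why that happens -- this is not the CJS estimator, just a made-up ratio-type statistic with the same flavor (a random count in the denominator), which is enough to push the mean above the median:

```r
# A ratio with a small random count in the denominator is right-skewed:
# underestimates are bounded, overestimates can be large, so the mean
# drifts above the (spot-on) median. Purely illustrative numbers.
set.seed(1)
detected <- rbinom(10000, size = 50, prob = 0.5)  # e.g. survivors detected out of 50
est <- 12.5 / pmax(detected, 1)                   # hypothetical estimate; true value 0.5
c(mean = mean(est), median = median(est))         # mean > median
```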
Ok, so that's the point estimates. What about confidence intervals? Standard errors were calculated using the equations provided in Skalski et al., and I made use of the normality assumption (despite just pointing out that it might not hold) to calculate 95% confidence intervals as the estimate plus or minus 1.96 times the standard error. Then I checked how often across the 10,000 simulated datasets the true parameter actually fell inside the CIs.
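(As a sanity check that the estimate ± 1.96·SE recipe itself behaves, here it is on a plain binomial proportion -- my toy example, nothing to do with the CJS formulas:)

```r
# Coverage of the Wald 95% CI for a binomial proportion, p = 0.5, 250 trials
set.seed(42)
p.true <- 0.5
phat <- rbinom(10000, 250, p.true) / 250
se.hat <- sqrt(phat * (1 - phat) / 250)
covered <- p.true > phat - 1.96 * se.hat & p.true < phat + 1.96 * se.hat
mean(covered)  # close to the nominal 0.95
```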
Simulation 1:
Code:
> sum(phi.1<(phi.ests[,1]+1.96*SE.S.ests[,1]) & phi.1>(phi.ests[,1]-1.96*SE.S.ests[,1]))/m
[1] 0.9499
> sum(phi.2<(phi.ests[,2]+1.96*SE.S.ests[,2]) & phi.2>(phi.ests[,2]-1.96*SE.S.ests[,2]))/m
[1] 0.9527
> sum(phi.3<(phi.ests[,3]+1.96*SE.S.ests[,3]) & phi.3>(phi.ests[,3]-1.96*SE.S.ests[,3]))/m
[1] 0.9484
Simulation 2:
Code:
> sum(phi.1<(phi.ests[,1]+1.96*SE.S.ests[,1]) & phi.1>(phi.ests[,1]-1.96*SE.S.ests[,1]))/m
[1] 0.9519
> sum(phi.2<(phi.ests[,2]+1.96*SE.S.ests[,2]) & phi.2>(phi.ests[,2]-1.96*SE.S.ests[,2]))/m
[1] 0.949
> sum(phi.3<(phi.ests[,3]+1.96*SE.S.ests[,3]) & phi.3>(phi.ests[,3]-1.96*SE.S.ests[,3]))/m
[1] 0.9482
Simulation 3:
Code:
> sum(phi.1<(phi.ests[,1]+1.96*SE.S.ests[,1]) & phi.1>(phi.ests[,1]-1.96*SE.S.ests[,1]))/m
[1] 0.9467
> sum(phi.2<(phi.ests[,2]+1.96*SE.S.ests[,2]) & phi.2>(phi.ests[,2]-1.96*SE.S.ests[,2]))/m
[1] 0.9484
> sum(phi.3<(phi.ests[,3]+1.96*SE.S.ests[,3]) & phi.3>(phi.ests[,3]-1.96*SE.S.ests[,3]))/m
[1] 0.9477
Hey! Not too shabby. But wait, how does that square with the bias I'm seeing, where some estimates tend to be quite large? Maybe these CIs aren't symmetrical?
Simulation 1:
Code:
> 1-sum(phi.1<(phi.ests[,1]+1.96*SE.S.ests[,1]))/m
[1] 0.0393
> 1-sum(phi.1>(phi.ests[,1]-1.96*SE.S.ests[,1]))/m
[1] 0.0108
> 1-sum(phi.2<(phi.ests[,2]+1.96*SE.S.ests[,2]))/m
[1] 0.0374
> 1-sum(phi.2>(phi.ests[,2]-1.96*SE.S.ests[,2]))/m
[1] 0.0099
> 1-sum(phi.3<(phi.ests[,3]+1.96*SE.S.ests[,3]))/m
[1] 0.0504
> 1-sum(phi.3>(phi.ests[,3]-1.96*SE.S.ests[,3]))/m
[1] 0.0012
Simulation 2:
Code:
> 1-sum(phi.1<(phi.ests[,1]+1.96*SE.S.ests[,1]))/m
[1] 0.0321
> 1-sum(phi.1>(phi.ests[,1]-1.96*SE.S.ests[,1]))/m
[1] 0.016
> 1-sum(phi.2<(phi.ests[,2]+1.96*SE.S.ests[,2]))/m
[1] 0.0422
> 1-sum(phi.2>(phi.ests[,2]-1.96*SE.S.ests[,2]))/m
[1] 0.0088
> 1-sum(phi.3<(phi.ests[,3]+1.96*SE.S.ests[,3]))/m
[1] 0.0497
> 1-sum(phi.3>(phi.ests[,3]-1.96*SE.S.ests[,3]))/m
[1] 0.0021
Simulation 3:
Code:
> 1-sum(phi.1<(phi.ests[,1]+1.96*SE.S.ests[,1]))/m
[1] 0.0402
> 1-sum(phi.1>(phi.ests[,1]-1.96*SE.S.ests[,1]))/m
[1] 0.0131
> 1-sum(phi.2<(phi.ests[,2]+1.96*SE.S.ests[,2]))/m
[1] 0.0418
> 1-sum(phi.2>(phi.ests[,2]-1.96*SE.S.ests[,2]))/m
[1] 0.0098
> 1-sum(phi.3<(phi.ests[,3]+1.96*SE.S.ests[,3]))/m
[1] 0.0501
> 1-sum(phi.3>(phi.ests[,3]-1.96*SE.S.ests[,3]))/m
[1] 0.0022
Well, that's not what I expected. The CIs aren't symmetrical, but when the true parameter lies outside the interval, generally speaking, it's because it's actually greater than the upper confidence limit! In fact, in the scenarios where we tend to get really bad overestimates ([tex]\phi_3[/tex], after winnowing of the original sample size), the true value is almost never below the lower confidence limit. That looks like this:
What's going on here? It turns out that the estimated standard error depends on the estimate itself, and that both are affected by sample size. When a dataset happens to yield a low estimate, it also reports a small standard error, so the whole interval can sit below the true value:
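A stripped-down version of that mechanism (toy numbers, with the SE simply assumed proportional to the estimate, mimicking the leading [tex]\hat{S}^2[/tex] factor in the variance formula):

```r
# When the reported SE scales with the estimate, a low estimate comes with a
# short interval, so the interval misses by sitting entirely below the truth:
# the true value exceeds the upper limit far more often than it falls
# below the lower limit.
set.seed(7)
truth <- 0.5
phat <- rbinom(10000, 40, truth) / 40            # noisy estimates of 0.5
se.hat <- 0.15 * phat                            # assumed: SE proportional to estimate
miss.high <- mean(truth > phat + 1.96 * se.hat)  # truth above the interval
miss.low  <- mean(truth < phat - 1.96 * se.hat)  # truth below the interval
c(miss.high, miss.low)                           # strongly lopsided
```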
So what did I learn here? Well, probably nothing too exciting. It turns out that yes, the CJS estimators can be biased, but this is largely an effect of sample size. Somewhere my old stats professor is going "no duh," because the CJS MLEs, like other MLEs, are asymptotically consistent. And I learned that I should maybe distrust estimates of survival further from the release, due to the winnowing of sample size. I also learned that 1000 fish might not cut it.
R code below for those interested:
Code:
Single.Rls<-function(x){
  # x is a 0/1 capture-history matrix: one row per fish, one column per occasion
  k<-ncol(x)                        # number of occasions
  m<-matrix(NA, nrow=k, ncol=k)     # m[i,j]: caught at i, next caught at j
  R<-rep(NA, k-1)                   # R[i]: number detected at occasion i
  for(i in 1:(k-1)){
    for(j in (i+1):k){
      # caught at both i and j, with no detections in between
      m[i,j]<-sum(x[,i]==1 & x[,j]==1 & rowSums(x[,i:j])==2)
    }
    R[i]<-sum(x[,i]==1)
  }
  r<-rowSums(m, na.rm=TRUE)[1:(k-1)]   # r[i]: of those at i, number ever seen again
  M<-colSums(m, na.rm=TRUE)[2:(k-1)]   # M[j]: total recaptured at occasion j
  z<-rep(NA, length(M))                # z[j]: seen before and after j, but not at j
  for(j in 2:(k-1)){z[j-1]<-sum(m[1:(j-1),(j+1):k], na.rm=TRUE)}
  Tot<-M+z
  # single-release CJS estimators and variance, per Skalski et al. 1998
  S<-r[1:(k-2)]/R[1:(k-2)]*(M/Tot+(z*R[2:(k-1)])/(Tot*r[2:(k-1)]))
  p<-M/(M+z*R[2:(k-1)]/r[2:(k-1)])
  Var.S<-S^2*(1/r[1:(k-2)]-1/R[1:(k-2)]+(1-p)^2*(1/r[2:(k-1)]-1/R[2:(k-1)]))+
    S^2*(1-p)^2*(1-r[2:(k-1)]/R[2:(k-1)])^2*(M/(z*Tot))
  return(list(S.est=S, p.est=p, Est.VarS=Var.S, Est.SE.S=sqrt(Var.S)))
}
phi.1<-.5      # interval survival probabilities (scenario 3: all equal to 0.5)
phi.2<-.5
phi.3<-.5
phi.4<-.9375   # final interval survival; only the product phi.4*p.5 is identifiable
p.2<-.5        # detection probabilities at occasions 2-4
p.3<-.5
p.4<-.5
p.5<-.8        # final detection probability (confounded with phi.4)
n<-1000        # fish released at occasion 1
m<-10000       # simulated datasets
phi.ests<-cbind(rep(NA,m), rep(NA,m), rep(NA,m))
p.ests<-cbind(rep(NA,m), rep(NA,m), rep(NA,m))
SE.S.ests<-cbind(rep(NA,m), rep(NA,m), rep(NA,m))
for(j in 1:m){
  x<-matrix(NA, nrow=n, ncol=5)
  x[,1]<-rep(1,n)                  # all fish released (and detected) at occasion 1
  # y is the latent alive/dead matrix, absent detection problems
  y<-x
  y[,2]<-rbinom(n,1,phi.1)
  x[,2]<-rbinom(n,1,p.2*y[,2])
  y[,3]<-rbinom(n,1,phi.2*y[,2])
  x[,3]<-rbinom(n,1,p.3*y[,3])
  y[,4]<-rbinom(n,1,phi.3*y[,3])
  x[,4]<-rbinom(n,1,p.4*y[,4])
  y[,5]<-rbinom(n,1,phi.4*y[,4])   # last survival interval uses phi.4
  x[,5]<-rbinom(n,1,p.5*y[,5])     # last detection uses p.5
  fit<-Single.Rls(x)               # fit once and reuse all three pieces
  phi.ests[j,]<-fit$S.est
  p.ests[j,]<-fit$p.est
  SE.S.ests[j,]<-fit$Est.SE.S
}
mean(phi.ests[,1])
mean(phi.ests[,2])
mean(phi.ests[,3])
mean(p.ests[,1])
mean(p.ests[,2])
mean(p.ests[,3])
sum(phi.1<(phi.ests[,1]+1.96*SE.S.ests[,1]) & phi.1>(phi.ests[,1]-1.96*SE.S.ests[,1]))/m
sum(phi.2<(phi.ests[,2]+1.96*SE.S.ests[,2]) & phi.2>(phi.ests[,2]-1.96*SE.S.ests[,2]))/m
sum(phi.3<(phi.ests[,3]+1.96*SE.S.ests[,3]) & phi.3>(phi.ests[,3]-1.96*SE.S.ests[,3]))/m
1-sum(phi.1<(phi.ests[,1]+1.96*SE.S.ests[,1]))/m
1-sum(phi.1>(phi.ests[,1]-1.96*SE.S.ests[,1]))/m
1-sum(phi.2<(phi.ests[,2]+1.96*SE.S.ests[,2]))/m
1-sum(phi.2>(phi.ests[,2]-1.96*SE.S.ests[,2]))/m
1-sum(phi.3<(phi.ests[,3]+1.96*SE.S.ests[,3]))/m
1-sum(phi.3>(phi.ests[,3]-1.96*SE.S.ests[,3]))/m
Estimates<-data.frame(Param=c(rep("Phi.1",m),rep("Phi.2",m), rep("Phi.3",m)),True=c(rep(phi.1,m),rep(phi.2,m), rep(phi.3,m)),
Est=c(phi.ests[,1], phi.ests[,2], phi.ests[,3]), SE=c(SE.S.ests[,1], SE.S.ests[,2], SE.S.ests[,3]))
par(mfrow=c(3,1))
hist(phi.ests[,1], breaks=75, main="10,000 estimates of phi.1=.5, n=1000")
abline(v=phi.1, col="red", lwd=2)
abline(v=mean(phi.ests[,1]), col="steelblue", lwd=2, lty=2)
abline(v=median(phi.ests[,1]), col="green", lwd=2, lty=3)
hist(phi.ests[,2], breaks=75, main="10,000 estimates of phi.2=.5, n=1000")
abline(v=phi.2, col="red", lwd=2)
abline(v=mean(phi.ests[,2]), col="steelblue", lwd=2, lty=2)
abline(v=median(phi.ests[,2]), col="green", lwd=2, lty=3)
hist(phi.ests[,3], breaks=75, main="10,000 estimates of phi.3=.5, n=1000")
abline(v=phi.3, col="red", lwd=2)
abline(v=mean(phi.ests[,3]), col="steelblue", lwd=2, lty=2)
abline(v=median(phi.ests[,3]), col="green", lwd=2, lty=3)
library(ggplot2)
library(dplyr)   # for filter() and arrange()
# caterpillar plots: sorted estimates with 95% CIs, true value in red
ggplot(filter(Estimates, Param=="Phi.1") %>% arrange(Est), aes(y=Est, x=c(1:m)))+
  geom_point()+geom_errorbar(aes(ymin=Est-1.96*SE, ymax=Est+1.96*SE), alpha=.1)+
  geom_abline(intercept=phi.1, slope=0, colour="red", size=1.25)
ggplot(filter(Estimates, Param=="Phi.2") %>% arrange(Est), aes(y=Est, x=c(1:m)))+
  geom_point()+geom_errorbar(aes(ymin=Est-1.96*SE, ymax=Est+1.96*SE), alpha=.1)+
  geom_abline(intercept=phi.2, slope=0, colour="red", size=1.25)
ggplot(filter(Estimates, Param=="Phi.3") %>% arrange(Est), aes(y=Est, x=c(1:m)))+
  geom_point()+geom_errorbar(aes(ymin=Est-1.96*SE, ymax=Est+1.96*SE), alpha=.1)+
  geom_abline(intercept=phi.3, slope=0, colour="red", size=1.25)
# the estimated SE tracks the estimate itself
ggplot(Estimates, aes(x=Est, y=SE))+geom_point()+facet_wrap(~Param)+
  geom_vline(aes(xintercept=True))