False Positive Dynamic Occupancy - Weird Estimates

PostPosted: Sun Aug 25, 2024 8:48 am
by gleclair
Hi everyone,

I'm experimenting with false-positive models for a long-term community-science dataset. A species of conservation interest is frequently misidentified, and I want more accurate estimates of occupancy and of patch extinction/colonization rates, since misidentifications are likely making both rates look much higher than they should be.

My understanding of false-positive models is that an uncertain ("soft") detection is coded as a 1 and given less weight in determining occupancy than a certain ("hard") detection, which is coded as a 2 (in this case, when a volunteer provides a photograph confirming the sighting). As usual, nondetections are coded 0. My detection data for this species are sparse: only 2 of 35 sites have a "hard" detection, and 7 more have only "soft" detections.
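For anyone following along, the coding scheme above can be sketched as a small R detection-history matrix (this is a hypothetical illustration; the site names and values are made up, not the poster's data):

```r
# Rows = sites, columns = surveys.
# 0 = nondetection, 1 = uncertain ("soft") detection,
# 2 = certain ("hard") detection, e.g. photo-confirmed.
dethist <- matrix(c(0, 1, 0, 0,   # site A: one soft detection
                    0, 0, 2, 1,   # site B: one hard, one soft
                    0, 0, 0, 0),  # site C: never detected
                  nrow = 3, byrow = TRUE,
                  dimnames = list(c("A", "B", "C"), paste0("surv", 1:4)))
```

A matrix in this form (with seasons as blocks of survey columns) is what you would pass to `createPao()` when building the data object for RPresence.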

When I run either a static false-positive occupancy model or a standard dynamic occupancy model, I get estimates that feel legit (occupancy rates <30%, higher patch extinction rates than colonization). But when I combine false positives WITH dynamic occupancy, I get psi estimates above 90%, colonization rates above 50%, and extinction rates of <2%. This is the exact opposite of what I expected, since I'm fairly sure that many of my detections are false positives, so occupancy should be far lower. I even tried coding sites by region, with sites where I know this species occurs coded differently from those where it is not known to occur, and somehow my occupancy estimates for the known-occupied sites are <50% while the unknown sites are at 100%.

I've even messed with my data and filled in all my NAs with zeros, just to make sure the missing data weren't somehow inflating the rates. Same issue.

What am I missing here?

Re: False Positive Dynamic Occupancy - Weird Estimates

PostPosted: Mon Aug 26, 2024 9:25 am
by jhines
Your understanding of the false-positive models is accurate: "sure" ("hard") detections (coded as 2) do not allow the possibility of non-occupancy at a site in that season, so they provide more information than "unsure"/"soft" detections.

The added flexibility of these models comes at a cost: less precise estimates of occupancy and detection. With only 2 "sure" detections, I suspect the standard errors of your estimates are large.

When you use the multi-season model, I'm guessing there are many seasons with zero "sure" detections. In seasons with no "sure" detections, the model cannot distinguish between a true detection and a false-positive detection. By that I mean the likelihood function will have more than one maximum value, or perhaps many local maxima.

When you suspect that the model has converged on a local maximum, you can use the "randinit" argument to the occMod function. This argument specifies how many random initial-value vectors to try when searching for the maximum of the likelihood. I suggest trying this (e.g., randinit=100) to see whether it finds a different maximum likelihood value that makes more sense. However, I suspect that the data are too sparse to get reliable estimates under the multi-season, false-positive model.
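In R, the randinit suggestion looks something like the sketch below. This assumes a detection-history matrix `dethist` coded 0/1/2 as described earlier; the number of seasons/surveys, the formula names, and the `type` string are illustrative, so check `?occMod` and `?createPao` in your installed RPresence version for the exact arguments and the correct type code for the dynamic false-positive model:

```r
library(RPresence)

# Build the data object: here assuming 5 seasons of 4 surveys each.
pdata <- createPao(dethist, nsurveyseason = rep(4, 5))

# Dynamic (multi-season) false-positive model with 100 random
# starting-value vectors, to guard against local maxima.
m1 <- occMod(model = list(psi ~ 1, gamma ~ 1, epsilon ~ 1,
                          p11 ~ 1, p10 ~ 1, b ~ 1),
             data = pdata,
             type = "do.fp",    # check ?occMod for the exact type string
             randinit = 100)

summary(m1)
```

If the refit with random initial values lands on a noticeably better (lower) -2*log-likelihood and more sensible estimates, the earlier fit had stalled at a local maximum; if the estimates stay extreme with huge standard errors, that points back to the data being too sparse for this model.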