Hi everyone,
I'm experimenting with false positive models for a long-term community science dataset. A species of conservation interest is frequently misidentified, and I want more accurate estimates of occupancy and of patch extinction/colonization rates; the misidentifications are likely pushing my extinction/colonization estimates well above what I'd expect.
My understanding of false positive models is that an uncertain ("soft") detection is coded as a 1 and given less weight in estimating occupancy than a certain ("hard") detection, which is coded as a 2 (in my case, when a volunteer provides a photograph confirming the sighting). Nondetections are coded 0, as usual. My detection data for this species are sparse: only two of 35 sites have a hard detection, and seven more have only soft detections.
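To make that coding concrete, here's a rough sketch (in Python/NumPy, since I haven't shown my actual code) of how I understand the single-season false positive likelihood treats 0/1/2 data, in the style of Miller et al. (2011). The toy histories and the parameter names (psi, p11, p10, b) are purely illustrative, not my real data or model:

```python
# Sketch of a single-season false-positive likelihood for 0/1/2 data,
# following the Miller et al. (2011) "multiple detection states" structure.
# psi = occupancy, p11 = Pr(detect | occupied), p10 = Pr(false positive
# | unoccupied), b = Pr(a detection at an occupied site is certain).
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit  # inverse logit

# Toy detection histories: rows = sites, columns = occasions.
# 0 = nondetection, 1 = soft detection, 2 = hard detection, nan = not surveyed.
# (Toy-sized on purpose; a real fit needs many more sites.)
y = np.array([
    [0, 1, 0],       # soft detection only
    [2, 1, 0],       # hard detection confirms occupancy
    [0, 0, np.nan],  # never detected, one missing occasion
])

def neg_log_lik(theta, y):
    # theta holds logit-scale parameters so the optimizer is unconstrained.
    psi, p11, p10, b = expit(theta)
    ll = 0.0
    for hist in y:
        obs = hist[~np.isnan(hist)]
        # Pr(history | occupied): detections occur with p11 and are
        # classified as certain with probability b.
        pr_occ = np.where(obs == 0, 1 - p11,
                 np.where(obs == 1, p11 * (1 - b), p11 * b)).prod()
        # Pr(history | unoccupied): only false positives, which are always
        # "soft", so a hard detection (2) has probability zero here.
        pr_unocc = np.where(obs == 0, 1 - p10,
                   np.where(obs == 1, p10, 0.0)).prod()
        ll += np.log(psi * pr_occ + (1 - psi) * pr_unocc)
    return -ll

fit = minimize(neg_log_lik, x0=np.zeros(4), args=(y,), method="BFGS")
print(expit(fit.x))  # MLEs of psi, p11, p10, b
```

The key feature for my question is in that second branch: a 2 at an unoccupied site has probability zero, so the two hard detections pin down occupancy at those sites, while soft-only histories get partitioned between true occupancy and false positives.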
When I run either a static false positive occupancy model or a standard dynamic occupancy model, I get estimates that seem plausible (occupancy below 30%, patch extinction higher than colonization). But when I combine false positives with dynamic occupancy, I get psi estimates above 90%, colonization rates above 50%, and extinction rates below 2%. That's the exact opposite of what I expected: I'm fairly confident many of my detections are false positives, so occupancy should be far lower. I even tried coding sites by region, distinguishing sites where the species is known to occur from those where it isn't, and somehow the occupancy estimates for the known-occupied sites come out below 50% while the unknown sites come out at 100%.
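For reference, this is how I understand the dynamic version extends the sketch above (again illustrative, not my fitting code): the same 0/1/2 observation model applies within each season, and occupancy evolves between seasons via colonization (gamma) and extinction (eps) starting from initial occupancy psi1:

```python
# Sketch of a dynamic (multi-season) false-positive likelihood for one site,
# combining MacKenzie et al.-style transitions with the 0/1/2 observation
# model above. All names (psi1, gamma, eps, p11, p10, b) are illustrative.
import numpy as np

def season_prob(obs, p11, p10, b):
    """Pr(one season's 0/1/2 history | unoccupied) and (| occupied)."""
    obs = obs[~np.isnan(obs)]
    pr_occ = np.where(obs == 0, 1 - p11,
             np.where(obs == 1, p11 * (1 - b), p11 * b)).prod()
    pr_unocc = np.where(obs == 0, 1 - p10,
               np.where(obs == 1, p10, 0.0)).prod()
    return np.array([pr_unocc, pr_occ])

def site_log_lik(y_site, psi1, gamma, eps, p11, p10, b):
    """Forward algorithm over seasons for one site.
    y_site: (n_seasons, n_occasions) array of 0/1/2/np.nan."""
    # State probabilities [Pr(unoccupied), Pr(occupied)] after season 1.
    state = np.array([1 - psi1, psi1]) * season_prob(y_site[0], p11, p10, b)
    # Markov transitions between seasons: colonization gamma, extinction eps.
    T = np.array([[1 - gamma, gamma],
                  [eps, 1 - eps]])  # rows: from-state, cols: to-state
    for t in range(1, y_site.shape[0]):
        state = (state @ T) * season_prob(y_site[t], p11, p10, b)
    return np.log(state.sum())

# Hypothetical usage: one site observed over two seasons.
y_site = np.array([[0, 1, 0],
                   [2, 0, np.nan]])
print(site_log_lik(y_site, psi1=0.3, gamma=0.1, eps=0.4,
                   p11=0.6, p10=0.05, b=0.7))
```

If I have the structure right, psi1, gamma, eps, p10, and b all have to be separated from one another using only two hard detections across 35 sites, which is why I'm suspicious the joint model is doing something degenerate rather than something sensible.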
I've even tried filling in all of my NAs with zeros, just to rule out missing data as the cause of the inflated rates. Same issue.
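In NumPy terms, matching the toy arrays above, that fill step was just:

```python
import numpy as np
# Hypothetical example: y is a detection-history array like the sketches above.
y = np.array([[0, 1, np.nan], [2, np.nan, 0]])
y_filled = np.nan_to_num(y, nan=0.0)  # recode every missing occasion as a 0
```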
What am I missing here?