MultScalOcc with p covariates for each visit

posts related to the RMark library, which may not be of general interest to users of 'classic' MARK

MultScalOcc with p covariates for each visit

Postby zwarren » Mon May 07, 2018 3:43 pm

I am trying to perform a multi-scale occupancy analysis following similar code as contained in the LASP dataset. I would like to perform MMI with covariates for large-scale occupancy (psi), local occupancy (theta), and detection probability with variables pertaining to each sampling visit (p).

For the purpose of this reproducible example code, I have 200 grids which each have 3 detectors that recorded simultaneously for 4 nights each. I believe I therefore have a k=3 and an L/mixtures = 4 for an encounter of K*L = 12. In the example code, I have a single psi covariate with one column(forest), three columns for the theta covariate (grass1:3), and I want 12 columns for a p covariate (wind1:12) as this differs by night and detector. When I use the my example code below however, it only recognizes wind1, wind2, and wind3. This is visible when using covariate.predictions(), viewing the simplified PIM, opening the .out file and examining the design matrix, and finally when comparing results of deleting wind4:12 from the initial dataframe.

Assuming it is possible to have covariates for each night/detector combination, what is the proper way to incorporate this? I assume this issue is arising either by the naming of my columns, or an argument I am failing to pass in the mark.wrapper as the ddl shows a model.index of 1:16 with a par.index of 1:12.

I apologize if I am lacking a basic understanding of multi-scale occupancy or RMARK as I am new to both.

Thanks in advance for your help - Zac

Code: Select all
library(RMark)
library(tidyverse)


set.seed(1234)
fake.data.3 <-
  data.frame(
    #create cols of random "wind" values, one for each detector/night combination used for det. prob. (p)
         wind1 = rpois(200, 5),
         wind2 = rpois(200, 5),
         wind3 = rpois(200, 5),
         wind4 = rpois(200, 5),
         wind5 = rpois(200, 5),
         wind6 = rpois(200, 5),
         wind7 = rpois(200, 5),
         wind8 = rpois(200, 5),
         wind9 = rpois(200, 5),
         wind10 = rpois(200, 5),
         wind11 = rpois(200, 5),
         wind12 = rpois(200, 5)) %>%
  #Create a "forest" variable which predicts larger scale occupancy (psi)
  mutate(forest = rpois(200, 15)) %>%
  #Create a "grass" variable which predicts local ocupancy (theta)
  mutate(grass1 = rnorm(200, 40, 10),
         grass2 = rnorm(200, 40, 10),
         grass3 = rnorm(200, 40, 10)) %>%
  #If forest is >= to 16 units, then assign presence as 0 or 1, if its less than 16, then it's always 0
  #final = the true occupany of the large scale unit
  mutate(final = ifelse(forest >= 16, sample(c(0,1), size = 200, replace = T), 0)) %>%
  #d = the true local occupancy of the detector
  #if the site is occupied, and grass is >= 40 units, then assign a 0 or 1 as present with a 30/70 probability, else...0
  mutate(d1 = ifelse(final == 1 & grass1 >= 40, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0),
         d2 = ifelse(final == 1 & grass2 >= 40, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0),
         d3 = ifelse(final == 1 & grass3 >= 40, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0)) %>%
  #n designates which night (1-4), d designates the detector  (1-3)
  #if the detector is occupied, and the corresponding wind value for the night is <= 6, then sample 0 or 1, else 0
  mutate(n1.d1 = ifelse(d1 == 1 & wind1 <= 6, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0),
         n2.d1 = ifelse(d1 == 1 & wind2 <= 6, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0),
         n3.d1 = ifelse(d1 == 1 & wind3 <= 6, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0),
         n4.d1 = ifelse(d1 == 1 & wind4 <= 6, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0),
         n1.d2 = ifelse(d2 == 1 & wind5 <= 6, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0),
         n2.d2 = ifelse(d2 == 1 & wind6 <= 6, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0),
         n3.d2 = ifelse(d2 == 1 & wind7 <= 6, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0),
         n4.d2 = ifelse(d2 == 1 & wind8 <= 6, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0),
         n1.d3 = ifelse(d3 == 1 & wind9 <= 6, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0),
         n2.d3 = ifelse(d3 == 1 & wind10 <= 6, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0),
         n3.d3 = ifelse(d3 == 1 & wind11 <= 6, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0),
         n4.d3 = ifelse(d3 == 1 & wind12 <= 5, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0)) %>%
  #create a spatial detection history
  mutate(ch = paste0(n1.d1, n2.d1, n3.d1, n4.d1, n1.d2, n2.d2, n3.d2, n4.d2, n1.d3, n2.d3, n3.d3, n4.d3)) %>%
  select(ch, forest, contains("grass"), contains("wind"))

#process data
fake.proc.3=process.data(fake.data.3,model="MultScalOcc", begin.time=1,mixtures=4)

#create the data design
ddl.3=make.design.data(fake.proc.3)

#build a formula list with p(wind), theta(grass), and psi(forest)
do.Species.3=function()
{
  p.1=list(formula=~1)   
  p.2=list(formula=~wind)   
 
 
  Theta.1=list(formula=~1)
  Theta.2=list(formula=~grass)
 
  Psi.1=list(formula=~1)
  Psi.2=list(formula=~forest)
 
  cml=create.model.list("MultScalOcc")
  return(mark.wrapper(cml,data=fake.proc.3,ddl=ddl.3,adjust=FALSE,realvcv=TRUE))
}

#run models
fake.results.3=do.Species.3()

#view model table
fake.results.3$model.table

#view results of THE SECOND TOP MODEL AS THIS CONTAINS WIND
fake.results.3[[as.numeric(rownames(fake.results.3$model.table[2,]))]]$results$beta




### Covariate prediction and model averaging

# psi --------------

#create forest values from range
minforest.3 <- min(fake.data.3[,2])
maxforest.3 <- max(fake.data.3[,2])
forest.values.3 <- minforest.3+(0:100)*(maxforest.3-minforest.3)/100

#view pim to see index for predict
PIMS(fake.results.3[[8]],"Psi",simplified=F)

#predict off of the model with all 3 terms
forest.3 <- covariate.predictions(model = fake.results.3[[8]],
                                  data=data.frame(forest = forest.values.3),
                                  indices = c(1))

psi.forest.3 <- forest.3$estimates

plot(psi.forest.3$covdata, psi.forest.3$estimate)


# theta ------------------------------------------

PIMS(fake.results.3[[8]],"Theta",simplified=T)

mingrass.3 <- min(fake.data.3[,3:5])
maxgrass.3 <- max(fake.data.3[,3:5])
grass.values.3 <- mingrass.3+(0:100)*(maxgrass.3-mingrass.3)/100


grass.3 <- covariate.predictions(model = fake.results.3[[8]],
                                 data=data.frame(grass1 = grass.values.3),
                                 indices = c(2))
theta.grass.3.1 <- grass.3$estimates

plot(theta.grass.3.1$covdata, theta.grass.3.1$estimate)


# p(wind) -------------------

minwind.3 <- min(fake.data.3[,6:17])
maxwind.3 <- max(fake.data.3[,6:17])
wind.values.3 <- minwind.3+(0:100)*(maxwind.3-minwind.3)/100

PIMS(fake.results.3[[8]],"p",simplified=F)

wind.1 <- covariate.predictions(fake.results.3[[8]],
                                data=data.frame(wind1=wind.values.3),indices=c(5))
#wind 4 does not predict regardless of indicies
wind.4 <- covariate.predictions(fake.results.3[[8]],
                                data=data.frame(wind4=wind.values.3),indices=c(8))

p.wind.1 <- wind.1$estimates
p.wind.4 <- wind.4$estimates

plot(p.wind.1$covdata, p.wind.1$estimate)
plot(p.wind.4$covdata, p.wind.4$estimate)

zwarren
 
Posts: 3
Joined: Fri Apr 20, 2018 4:10 pm

Re: MultScalOcc with p covariates for each visit

Postby zwarren » Mon May 07, 2018 4:35 pm

UPDATE: After modifying the code and reading for a few days, I believe I believe I figured out the answer immediately after posting this question. It seems the issue is with the naming of the p covariate columns. Rather than naming them Wind1:12, they should be labeled Wind11, Wind12...Wind21, Wind22...Wind31, Wind32...Wind34. Please correct me if I am wrong or have violated any assumptions with my understanding of K and L. Since I was unable to find an applicable example in example code, or in supplemental data, I plan on leaving this up in case anyone has a similar issue.

Code: Select all
 
library(RMark)
library(tidyverse)


set.seed(1234)
fake.data.3 <-
  data.frame(
    #create cols of random "wind" values, one for each detector/night combination used for det. prob. (p)
    wind11 = rpois(200, 5),
    wind12 = rpois(200, 5),
    wind13 = rpois(200, 5),
    wind14 = rpois(200, 5),
    wind21 = rpois(200, 5),
    wind22 = rpois(200, 5),
    wind23 = rpois(200, 5),
    wind24 = rpois(200, 5),
    wind31 = rpois(200, 5),
    wind32 = rpois(200, 5),
    wind33 = rpois(200, 5),
    wind34 = rpois(200, 5)) %>%
  #Create a "forest" variable which predicts larger scale occupancy (psi)
  mutate(forest = rpois(200, 15)) %>%
  #Create a "grass" variable which predicts local ocupancy (theta)
  mutate(grass1 = rnorm(200, 40, 10),
         grass2 = rnorm(200, 40, 10),
         grass3 = rnorm(200, 40, 10)) %>%
  #If forest is >= to 16 units, then assign presence as 0 or 1, if its less than 16, then it's always 0
  #final = the true occupany of the large scale unit
  mutate(final = ifelse(forest >= 16, sample(c(0,1), size = 200, replace = T), 0)) %>%
  #d = the true local occupancy of the detector
  #if the site is occupied, and grass is >= 40 units, then assign a 0 or 1 as present with a 30/70 probability, else...0
  mutate(d1 = ifelse(final == 1 & grass1 >= 40, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0),
         d2 = ifelse(final == 1 & grass2 >= 40, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0),
         d3 = ifelse(final == 1 & grass3 >= 40, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0)) %>%
  #n designates which night (1-4), d designates the detector  (1-3)
  #if the detector is occupied, and the corresponding wind value for the night is <= 6, then sample 0 or 1, else 0
  mutate(n1.d1 = ifelse(d1 == 1 & wind11 <= 6, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0),
         n2.d1 = ifelse(d1 == 1 & wind12 <= 6, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0),
         n3.d1 = ifelse(d1 == 1 & wind13 <= 6, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0),
         n4.d1 = ifelse(d1 == 1 & wind14 <= 6, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0),
         n1.d2 = ifelse(d2 == 1 & wind21 <= 6, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0),
         n2.d2 = ifelse(d2 == 1 & wind22 <= 6, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0),
         n3.d2 = ifelse(d2 == 1 & wind23 <= 6, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0),
         n4.d2 = ifelse(d2 == 1 & wind24 <= 6, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0),
         n1.d3 = ifelse(d3 == 1 & wind31 <= 6, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0),
         n2.d3 = ifelse(d3 == 1 & wind32 <= 6, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0),
         n3.d3 = ifelse(d3 == 1 & wind33 <= 6, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0),
         n4.d3 = ifelse(d3 == 1 & wind34 <= 5, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0)) %>%
  #create a spatial detection history
  mutate(ch = paste0(n1.d1, n2.d1, n3.d1, n4.d1, n1.d2, n2.d2, n3.d2, n4.d2, n1.d3, n2.d3, n3.d3, n4.d3)) %>%
  select(ch, forest, contains("grass"), contains("wind"))

#process data
fake.proc.3=process.data(fake.data.3,model="MultScalOcc", begin.time=1,mixtures=4)

#create the data design
ddl.3=make.design.data(fake.proc.3)

#build a formula list with p(wind), theta(grass), and psi(forest)
do.Species.3=function()
{
  p.1=list(formula=~1)   
  p.2=list(formula=~wind)   
 
 
  Theta.1=list(formula=~1)
  Theta.2=list(formula=~grass)
 
  Psi.1=list(formula=~1)
  Psi.2=list(formula=~forest)
 
  cml=create.model.list("MultScalOcc")
  return(mark.wrapper(cml,data=fake.proc.3,ddl=ddl.3,adjust=FALSE,realvcv=TRUE))
}

#run models
fake.results.3=do.Species.3()

#view model table
fake.results.3$model.table

#view results of THE SECOND TOP MODEL AS THIS CONTAINS WIND
fake.results.3[[as.numeric(rownames(fake.results.3$model.table[1,]))]]$results$beta




### Covariate prediction and model averaging

# psi --------------

#create forest values from range
minforest.3 <- min(fake.data.3[,2])
maxforest.3 <- max(fake.data.3[,2])
forest.values.3 <- minforest.3+(0:100)*(maxforest.3-minforest.3)/100

#view pim to see index for predict
PIMS(fake.results.3[[8]],"Psi",simplified=F)

#predict off of the model with all 3 terms
forest.3 <- covariate.predictions(model = fake.results.3[[8]],
                                  data=data.frame(forest = forest.values.3),
                                  indices = c(1))

psi.forest.3 <- forest.3$estimates

plot(psi.forest.3$covdata, psi.forest.3$estimate)


# theta ------------------------------------------

PIMS(fake.results.3[[8]],"Theta",simplified=T)

mingrass.3 <- min(fake.data.3[,3:5])
maxgrass.3 <- max(fake.data.3[,3:5])
grass.values.3 <- mingrass.3+(0:100)*(maxgrass.3-mingrass.3)/100


grass.3 <- covariate.predictions(model = fake.results.3[[8]],
                                 data=data.frame(grass1 = grass.values.3),
                                 indices = c(2))
theta.grass.3.1 <- grass.3$estimates

plot(theta.grass.3.1$covdata, theta.grass.3.1$estimate)


# p(wind) -------------------

minwind.3 <- min(fake.data.3[,6:17])
maxwind.3 <- max(fake.data.3[,6:17])
wind.values.3 <- minwind.3+(0:100)*(maxwind.3-minwind.3)/100

PIMS(fake.results.3[[8]],"p",simplified=F)

wind.1 <- covariate.predictions(fake.results.3[[8]],
                                data=data.frame(wind11=wind.values.3),indices=c(5))
#wind 4 does not predict regardless of indicies
wind.4 <- covariate.predictions(fake.results.3[[8]],
                                data=data.frame(wind21=wind.values.3),indices=c(9))

p.wind.1 <- wind.1$estimates
p.wind.4 <- wind.4$estimates

plot(p.wind.1$covdata, p.wind.1$estimate)
plot(p.wind.4$covdata, p.wind.4$estimate)
zwarren
 
Posts: 3
Joined: Fri Apr 20, 2018 4:10 pm

Re: MultScalOcc with p covariates for each visit

Postby jlaake » Mon May 07, 2018 5:05 pm

That is correct. They are named vts where v is the base of the variable name, t is time value and s is session. It is the same as in robust design models. The documentation for this should be made better if it exists at all.
jlaake
 
Posts: 1011
Joined: Fri May 12, 2006 12:50 pm
Location: Escondido, CA


Return to RMark

Who is online

Users browsing this forum: Bing [Bot] and 2 guests