MultScalOcc with p covariates for each visit

posts related to the RMark library, which may not be of general interest to users of 'classic' MARK

MultScalOcc with p covariates for each visit

Postby zwarren » Mon May 07, 2018 3:43 pm

I am trying to perform a multi-scale occupancy analysis following similar code as contained in the LASP dataset. I would like to perform MMI with covariates for large-scale occupancy (psi), local occupancy (theta), and detection probability with variables pertaining to each sampling visit (p).

For the purpose of this reproducible example code, I have 200 grids which each have 3 detectors that recorded simultaneously for 4 nights each. I believe I therefore have a k=3 and an L/mixtures = 4 for an encounter of K*L = 12. In the example code, I have a single psi covariate with one column(forest), three columns for the theta covariate (grass1:3), and I want 12 columns for a p covariate (wind1:12) as this differs by night and detector. When I use the my example code below however, it only recognizes wind1, wind2, and wind3. This is visible when using covariate.predictions(), viewing the simplified PIM, opening the .out file and examining the design matrix, and finally when comparing results of deleting wind4:12 from the initial dataframe.

Assuming it is possible to have covariates for each night/detector combination, what is the proper way to incorporate this? I assume this issue is arising either by the naming of my columns, or an argument I am failing to pass in the mark.wrapper as the ddl shows a model.index of 1:16 with a par.index of 1:12.

I apologize if I am lacking a basic understanding of multi-scale occupancy or RMARK as I am new to both.

Thanks in advance for your help - Zac

Code: Select all
library(RMark)
library(tidyverse)


set.seed(1234)
fake.data.3 <-
  data.frame(
    #create cols of random "wind" values, one for each detector/night combination used for det. prob. (p)
         wind1 = rpois(200, 5),
         wind2 = rpois(200, 5),
         wind3 = rpois(200, 5),
         wind4 = rpois(200, 5),
         wind5 = rpois(200, 5),
         wind6 = rpois(200, 5),
         wind7 = rpois(200, 5),
         wind8 = rpois(200, 5),
         wind9 = rpois(200, 5),
         wind10 = rpois(200, 5),
         wind11 = rpois(200, 5),
         wind12 = rpois(200, 5)) %>%
  #Create a "forest" variable which predicts larger scale occupancy (psi)
  mutate(forest = rpois(200, 15)) %>%
  #Create a "grass" variable which predicts local ocupancy (theta)
  mutate(grass1 = rnorm(200, 40, 10),
         grass2 = rnorm(200, 40, 10),
         grass3 = rnorm(200, 40, 10)) %>%
  #If forest is >= to 16 units, then assign presence as 0 or 1, if its less than 16, then it's always 0
  #final = the true occupany of the large scale unit
  mutate(final = ifelse(forest >= 16, sample(c(0,1), size = 200, replace = T), 0)) %>%
  #d = the true local occupancy of the detector
  #if the site is occupied, and grass is >= 40 units, then assign a 0 or 1 as present with a 30/70 probability, else...0
  mutate(d1 = ifelse(final == 1 & grass1 >= 40, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0),
         d2 = ifelse(final == 1 & grass2 >= 40, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0),
         d3 = ifelse(final == 1 & grass3 >= 40, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0)) %>%
  #n designates which night (1-4), d designates the detector  (1-3)
  #if the detector is occupied, and the corresponding wind value for the night is <= 6, then sample 0 or 1, else 0
  mutate(n1.d1 = ifelse(d1 == 1 & wind1 <= 6, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0),
         n2.d1 = ifelse(d1 == 1 & wind2 <= 6, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0),
         n3.d1 = ifelse(d1 == 1 & wind3 <= 6, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0),
         n4.d1 = ifelse(d1 == 1 & wind4 <= 6, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0),
         n1.d2 = ifelse(d2 == 1 & wind5 <= 6, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0),
         n2.d2 = ifelse(d2 == 1 & wind6 <= 6, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0),
         n3.d2 = ifelse(d2 == 1 & wind7 <= 6, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0),
         n4.d2 = ifelse(d2 == 1 & wind8 <= 6, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0),
         n1.d3 = ifelse(d3 == 1 & wind9 <= 6, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0),
         n2.d3 = ifelse(d3 == 1 & wind10 <= 6, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0),
         n3.d3 = ifelse(d3 == 1 & wind11 <= 6, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0),
         n4.d3 = ifelse(d3 == 1 & wind12 <= 5, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0)) %>%
  #create a spatial detection history
  mutate(ch = paste0(n1.d1, n2.d1, n3.d1, n4.d1, n1.d2, n2.d2, n3.d2, n4.d2, n1.d3, n2.d3, n3.d3, n4.d3)) %>%
  select(ch, forest, contains("grass"), contains("wind"))

#process data
fake.proc.3=process.data(fake.data.3,model="MultScalOcc", begin.time=1,mixtures=4)

#create the data design
ddl.3=make.design.data(fake.proc.3)

#build a formula list with p(wind), theta(grass), and psi(forest)
do.Species.3=function()
{
  p.1=list(formula=~1)   
  p.2=list(formula=~wind)   
 
 
  Theta.1=list(formula=~1)
  Theta.2=list(formula=~grass)
 
  Psi.1=list(formula=~1)
  Psi.2=list(formula=~forest)
 
  cml=create.model.list("MultScalOcc")
  return(mark.wrapper(cml,data=fake.proc.3,ddl=ddl.3,adjust=FALSE,realvcv=TRUE))
}

#run models
fake.results.3=do.Species.3()

#view model table
fake.results.3$model.table

#view results of THE SECOND TOP MODEL AS THIS CONTAINS WIND
fake.results.3[[as.numeric(rownames(fake.results.3$model.table[2,]))]]$results$beta




### Covariate prediction and model averaging

# psi --------------

#create forest values from range
minforest.3 <- min(fake.data.3[,2])
maxforest.3 <- max(fake.data.3[,2])
forest.values.3 <- minforest.3+(0:100)*(maxforest.3-minforest.3)/100

#view pim to see index for predict
PIMS(fake.results.3[[8]],"Psi",simplified=F)

#predict off of the model with all 3 terms
forest.3 <- covariate.predictions(model = fake.results.3[[8]],
                                  data=data.frame(forest = forest.values.3),
                                  indices = c(1))

psi.forest.3 <- forest.3$estimates

plot(psi.forest.3$covdata, psi.forest.3$estimate)


# theta ------------------------------------------

PIMS(fake.results.3[[8]],"Theta",simplified=T)

mingrass.3 <- min(fake.data.3[,3:5])
maxgrass.3 <- max(fake.data.3[,3:5])
grass.values.3 <- mingrass.3+(0:100)*(maxgrass.3-mingrass.3)/100


grass.3 <- covariate.predictions(model = fake.results.3[[8]],
                                 data=data.frame(grass1 = grass.values.3),
                                 indices = c(2))
theta.grass.3.1 <- grass.3$estimates

plot(theta.grass.3.1$covdata, theta.grass.3.1$estimate)


# p(wind) -------------------

minwind.3 <- min(fake.data.3[,6:17])
maxwind.3 <- max(fake.data.3[,6:17])
wind.values.3 <- minwind.3+(0:100)*(maxwind.3-minwind.3)/100

PIMS(fake.results.3[[8]],"p",simplified=F)

wind.1 <- covariate.predictions(fake.results.3[[8]],
                                data=data.frame(wind1=wind.values.3),indices=c(5))
#wind 4 does not predict regardless of indicies
wind.4 <- covariate.predictions(fake.results.3[[8]],
                                data=data.frame(wind4=wind.values.3),indices=c(8))

p.wind.1 <- wind.1$estimates
p.wind.4 <- wind.4$estimates

plot(p.wind.1$covdata, p.wind.1$estimate)
plot(p.wind.4$covdata, p.wind.4$estimate)

zwarren
 
Posts: 3
Joined: Fri Apr 20, 2018 4:10 pm

Re: MultScalOcc with p covariates for each visit

Postby zwarren » Mon May 07, 2018 4:35 pm

UPDATE: After modifying the code and reading for a few days, I believe I believe I figured out the answer immediately after posting this question. It seems the issue is with the naming of the p covariate columns. Rather than naming them Wind1:12, they should be labeled Wind11, Wind12...Wind21, Wind22...Wind31, Wind32...Wind34. Please correct me if I am wrong or have violated any assumptions with my understanding of K and L. Since I was unable to find an applicable example in example code, or in supplemental data, I plan on leaving this up in case anyone has a similar issue.

Code: Select all
 
library(RMark)
library(tidyverse)


set.seed(1234)
fake.data.3 <-
  data.frame(
    #create cols of random "wind" values, one for each detector/night combination used for det. prob. (p)
    wind11 = rpois(200, 5),
    wind12 = rpois(200, 5),
    wind13 = rpois(200, 5),
    wind14 = rpois(200, 5),
    wind21 = rpois(200, 5),
    wind22 = rpois(200, 5),
    wind23 = rpois(200, 5),
    wind24 = rpois(200, 5),
    wind31 = rpois(200, 5),
    wind32 = rpois(200, 5),
    wind33 = rpois(200, 5),
    wind34 = rpois(200, 5)) %>%
  #Create a "forest" variable which predicts larger scale occupancy (psi)
  mutate(forest = rpois(200, 15)) %>%
  #Create a "grass" variable which predicts local ocupancy (theta)
  mutate(grass1 = rnorm(200, 40, 10),
         grass2 = rnorm(200, 40, 10),
         grass3 = rnorm(200, 40, 10)) %>%
  #If forest is >= to 16 units, then assign presence as 0 or 1, if its less than 16, then it's always 0
  #final = the true occupany of the large scale unit
  mutate(final = ifelse(forest >= 16, sample(c(0,1), size = 200, replace = T), 0)) %>%
  #d = the true local occupancy of the detector
  #if the site is occupied, and grass is >= 40 units, then assign a 0 or 1 as present with a 30/70 probability, else...0
  mutate(d1 = ifelse(final == 1 & grass1 >= 40, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0),
         d2 = ifelse(final == 1 & grass2 >= 40, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0),
         d3 = ifelse(final == 1 & grass3 >= 40, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0)) %>%
  #n designates which night (1-4), d designates the detector  (1-3)
  #if the detector is occupied, and the corresponding wind value for the night is <= 6, then sample 0 or 1, else 0
  mutate(n1.d1 = ifelse(d1 == 1 & wind11 <= 6, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0),
         n2.d1 = ifelse(d1 == 1 & wind12 <= 6, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0),
         n3.d1 = ifelse(d1 == 1 & wind13 <= 6, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0),
         n4.d1 = ifelse(d1 == 1 & wind14 <= 6, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0),
         n1.d2 = ifelse(d2 == 1 & wind21 <= 6, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0),
         n2.d2 = ifelse(d2 == 1 & wind22 <= 6, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0),
         n3.d2 = ifelse(d2 == 1 & wind23 <= 6, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0),
         n4.d2 = ifelse(d2 == 1 & wind24 <= 6, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0),
         n1.d3 = ifelse(d3 == 1 & wind31 <= 6, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0),
         n2.d3 = ifelse(d3 == 1 & wind32 <= 6, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0),
         n3.d3 = ifelse(d3 == 1 & wind33 <= 6, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0),
         n4.d3 = ifelse(d3 == 1 & wind34 <= 5, sample(c(0,1), size = 200, replace = T, prob = c(.30, .70)), 0)) %>%
  #create a spatial detection history
  mutate(ch = paste0(n1.d1, n2.d1, n3.d1, n4.d1, n1.d2, n2.d2, n3.d2, n4.d2, n1.d3, n2.d3, n3.d3, n4.d3)) %>%
  select(ch, forest, contains("grass"), contains("wind"))

#process data
fake.proc.3=process.data(fake.data.3,model="MultScalOcc", begin.time=1,mixtures=4)

#create the data design
ddl.3=make.design.data(fake.proc.3)

#build a formula list with p(wind), theta(grass), and psi(forest)
do.Species.3=function()
{
  p.1=list(formula=~1)   
  p.2=list(formula=~wind)   
 
 
  Theta.1=list(formula=~1)
  Theta.2=list(formula=~grass)
 
  Psi.1=list(formula=~1)
  Psi.2=list(formula=~forest)
 
  cml=create.model.list("MultScalOcc")
  return(mark.wrapper(cml,data=fake.proc.3,ddl=ddl.3,adjust=FALSE,realvcv=TRUE))
}

#run models
fake.results.3=do.Species.3()

#view model table
fake.results.3$model.table

#view results of THE SECOND TOP MODEL AS THIS CONTAINS WIND
fake.results.3[[as.numeric(rownames(fake.results.3$model.table[1,]))]]$results$beta




### Covariate prediction and model averaging

# psi --------------

#create forest values from range
minforest.3 <- min(fake.data.3[,2])
maxforest.3 <- max(fake.data.3[,2])
forest.values.3 <- minforest.3+(0:100)*(maxforest.3-minforest.3)/100

#view pim to see index for predict
PIMS(fake.results.3[[8]],"Psi",simplified=F)

#predict off of the model with all 3 terms
forest.3 <- covariate.predictions(model = fake.results.3[[8]],
                                  data=data.frame(forest = forest.values.3),
                                  indices = c(1))

psi.forest.3 <- forest.3$estimates

plot(psi.forest.3$covdata, psi.forest.3$estimate)


# theta ------------------------------------------

PIMS(fake.results.3[[8]],"Theta",simplified=T)

mingrass.3 <- min(fake.data.3[,3:5])
maxgrass.3 <- max(fake.data.3[,3:5])
grass.values.3 <- mingrass.3+(0:100)*(maxgrass.3-mingrass.3)/100


grass.3 <- covariate.predictions(model = fake.results.3[[8]],
                                 data=data.frame(grass1 = grass.values.3),
                                 indices = c(2))
theta.grass.3.1 <- grass.3$estimates

plot(theta.grass.3.1$covdata, theta.grass.3.1$estimate)


# p(wind) -------------------

minwind.3 <- min(fake.data.3[,6:17])
maxwind.3 <- max(fake.data.3[,6:17])
wind.values.3 <- minwind.3+(0:100)*(maxwind.3-minwind.3)/100

PIMS(fake.results.3[[8]],"p",simplified=F)

wind.1 <- covariate.predictions(fake.results.3[[8]],
                                data=data.frame(wind11=wind.values.3),indices=c(5))
#wind 4 does not predict regardless of indicies
wind.4 <- covariate.predictions(fake.results.3[[8]],
                                data=data.frame(wind21=wind.values.3),indices=c(9))

p.wind.1 <- wind.1$estimates
p.wind.4 <- wind.4$estimates

plot(p.wind.1$covdata, p.wind.1$estimate)
plot(p.wind.4$covdata, p.wind.4$estimate)
zwarren
 
Posts: 3
Joined: Fri Apr 20, 2018 4:10 pm

Re: MultScalOcc with p covariates for each visit

Postby jlaake » Mon May 07, 2018 5:05 pm

That is correct. They are named vts where v is the base of the variable name, t is time value and s is session. It is the same as in robust design models. The documentation for this should be made better if it exists at all.
jlaake
 
Posts: 1417
Joined: Fri May 12, 2006 12:50 pm
Location: Escondido, CA

Re: MultScalOcc with p covariates for each visit

Postby s02hoff » Thu Aug 18, 2022 1:22 pm

I am running into a similar issue and probably not understanding how to input my variables... I have a similar design where k=3 and l=8, so 24 values for my p covariate tmin. Initially I had named the columns tmin1, tmin2, etc in the design data, but when attempting to use covariate.predictions with an index of 24 the estimated predicted values do not vary.

This is what the output of the PIMS shows:
session = 1;session.label = 1;group = Group 1
1 2 3 4 5 6 7 8
1 5 6 7 8 9 10 11 12
session = 2;session.label = 2;group = Group 1
1 2 3 4 5 6 7 8
1 13 14 15 16 17 18 19 20
session = 3;session.label = 3;group = Group 1
1 2 3 4 5 6 7 8
1 21 22 23 24 25 26 27 28

How should I name the columns, and what do I name the variable I am calling in the data frame for predicting?
s02hoff
 
Posts: 4
Joined: Tue Jul 12, 2022 4:10 pm

Re: MultScalOcc with p covariates for each visit

Postby jlaake » Thu Aug 18, 2022 3:45 pm

In the initial posting of this thread, the variable was an individual covariate with the values differing between locations. In that case, the variables are in the data frame with the encounter history. You are talking about putting them in the design data which is only appropriate if the values of tmin are the same across locations or locations within a group for a particular session/occasion. If they vary across locations then you put them in the data file with the encounter history and name the series of variables as described in those posts. If they are the same across all locations or the same for all locations within each group then it would be named tmin as a single column in the design data and would have different values for the entries in the design data. The function merge_design.covariates was create to merge covariates into design data but it only works bygroup and bytime and has never been extended to work with a robust design which has 2 levels of "time". If tmin is supposed to be in the design data then you can't use merge_design.covariates but you can use the R function merge. I can help you through that if that is your situation.
jlaake
 
Posts: 1417
Joined: Fri May 12, 2006 12:50 pm
Location: Escondido, CA

Re: MultScalOcc with p covariates for each visit

Postby s02hoff » Sat Aug 20, 2022 2:24 pm

Much thanks to Jeff for taking the time to help me with this issue!
I was indeed naming my columns incorrectly for p and theta covariates. Once I named my tmin covariate according to the session and time (tmin11,tmin12...tmin21, tmin22.. etc. rather than tmin1, tmin2, tmin3), covariate predictions were estimating the detection parameter correctly.
s02hoff
 
Posts: 4
Joined: Tue Jul 12, 2022 4:10 pm


Return to RMark

Who is online

Users browsing this forum: Google [Bot] and 15 guests