code for generating model subsets from global model

posts related to the RMark library, which may not be of general interest to users of 'classic' MARK

code for generating model subsets from global model

Postby Rotella » Thu Apr 18, 2013 10:57 pm

The R package "MuMIn" (multi-model inference) contains a function 'dredge' that can be used to easily generate a set of models that contains all possible combinations of the terms in a user-specified global model. It can be applied in RMark as shown in 2 basic examples below. One example uses the mallard nest-survival dataset that is available in RMark. The other works with the dipper dataset. The help for "MuMIn" provides further details about using the function, setting various options for constraining the model combinations to evaluate, and what information criterion to use.


#################################
# Example with nest survival models #
library(RMark)
library(MuMIn)
data(mallard)
Big.Mod=mark(mallard,nocc=90,model="Nest",
model.parameters=list(S=list(formula=~NestAge+Robel+PpnGrass)))
mods=dredge(Big.Mod)
mods
print(mods, abbrev.names=FALSE)
print.data.frame(mods,digits=7,na.print = "", abbrev.names = FALSE)
coeffs(mods)

################################
# Dipper example #
require(RMark)
require(MuMIn)
data(dipper)

# Process data
dipper.processed=process.data(dipper,groups=("sex"))

# Create default design data
dipper.ddl=make.design.data(dipper.processed)

# Add Flood covariates for Phi and p that have different values
dipper.ddl$Phi$Flood=0
dipper.ddl$Phi$Flood[dipper.ddl$Phi$time==2 | dipper.ddl$Phi$time==3]=1
dipper.ddl$p$Flood=0
dipper.ddl$p$Flood[dipper.ddl$p$time==3]=1

Phidot=list(formula=~1)
Phitime=list(formula=~time)
PhiFlood=list(formula=~Flood)

# Define range of models for p

pdot=list(formula=~1)
ptime=list(formula=~time)

gen=mark(dipper.processed,dipper.ddl,
model.parameters=list(Phi=list(formula=~sex*time),p=list(formula=~Flood)))

mods=dredge(gen)
mods
coeffs(mods)
Rotella
 
Posts: 72
Joined: Mon Jun 09, 2003 11:32 am

Re: code for generating model subsets from global model

Postby jlaake » Fri Apr 19, 2013 9:37 am

Thanks for that Jay. This example and the one I posted on bs (splines) are both good examples of the usefulness of building tools in R and having a community of folks offering ideas like this one. I've not yet used MuMIn except for the examples Jay sent. I did see that there is a parallel option so this may be an alternative to mark.wrapper and mark.wrapper.parallel. I believe this will provide the capabilities now in MARK that was discussed at http://www.phidot.org/forum/viewtopic.php?f=1&t=988&hilit=all+possible+subset+Rmark

--jeff
jlaake
 
Posts: 1417
Joined: Fri May 12, 2006 12:50 pm
Location: Escondido, CA

Re: code for generating model subsets from global model

Postby jCeradini » Wed Dec 14, 2016 1:21 pm

Hi folks,

When I run Jay's dipper example to estimate all possible combos via MuMIn::dredge, I end up with 10 of the exact same models in the output: Phi=list(formula=~sex*time),p=list(formula=~Flood). Sorry, I cannot get the table output to display better, but it is clear nonetheless that they're all the same. Is this still the best method for running all combos via RMark? How am I breaking it?
I also tinkered with Bret's example (viewtopic.php?f=21&t=2395&p=7502&hilit=all+possible+formulas#p7502) but did not have much luck, plus it seems that it requires writing out all the formulas still, I think. I was not able to find the other thread on "bs (splines)" that Jeff refers to and sounds like may give another example of how to create all possible combos.

Thanks!

Code: Select all
> mods=dredge(gen)

Fixed terms are "Phi((Intercept))" and "p((Intercept))"
Global model call: mark(data = dipper.processed, ddl = dipper.ddl, model.parameters = list(Phi = list(formula = ~sex *
    time), p = list(formula = ~Flood)), output = FALSE)
---
Model selection table
   Phi((Int)) Phi(sex) Phi(tim) Phi(sex:tim) p((Int)) p(Fld) df  logLik  AICc delta weight
1      0.5626        +        +            +    2.186 0.2969 14 -329.08 687.2     0    0.1
2      0.5626        +        +            +    2.186 0.2969 14 -329.08 687.2     0    0.1
3      0.5626        +        +            +    2.186 0.2969 14 -329.08 687.2     0    0.1
4      0.5626        +        +            +    2.186 0.2969 14 -329.08 687.2     0    0.1
8      0.5626        +        +            +    2.186 0.2969 14 -329.08 687.2     0    0.1
9      0.5626        +        +            +    2.186 0.2969 14 -329.08 687.2     0    0.1
10     0.5626        +        +            +    2.186 0.2969 14 -329.08 687.2     0    0.1
11     0.5626        +        +            +    2.186 0.2969 14 -329.08 687.2     0    0.1
12     0.5626        +        +            +    2.186 0.2969 14 -329.08 687.2     0    0.1
16     0.5626        +        +            +    2.186 0.2969 14 -329.08 687.2     0    0.1
jCeradini
 
Posts: 72
Joined: Mon Oct 13, 2014 3:53 pm

Re: code for generating model subsets from global model

Postby jlaake » Wed Dec 14, 2016 1:33 pm

I've not used dredge before but if you use create.model.list with mark.wrapper you can run all subsets as long as you name the formulas with a suffix (something after .)like Phi.1, Phi.dot or whatever. Best to do that inside of a function that you define so that create.model.list only picks up formulas and objects defined inside the function.

When you run into an issue like that, use debug and see what is going wrong by stepping through the code. Typing debug(dredge) before the call to dredge will let you step through the code in dredge to see what is going on. See ?debug to see how to use debug. Q gets out of debug and undebug(dredge) stops debuging for next call to dredge.

--jeff
jlaake
 
Posts: 1417
Joined: Fri May 12, 2006 12:50 pm
Location: Escondido, CA

Re: code for generating model subsets from global model

Postby Rotella » Wed Dec 14, 2016 1:43 pm

There was a bug in the code for the 'dredge' function that caused it to stop working on RMark objects. Kamil has fixed that now in the updated MuMIn package. To update to the newest version, implement the code below. It is also wise to update all of your packages, R, and RStudio to the latest versions.

Code: Select all
install.packages("MuMIn", repos="http://R-Forge.R-project.org”)
Rotella
 
Posts: 72
Joined: Mon Jun 09, 2003 11:32 am

Re: code for generating model subsets from global model

Postby jCeradini » Wed Dec 14, 2016 2:51 pm

Thanks Jay - I like that solution! I ran install.packages("MuMIn", repos="http://R-Forge.R-project.org”) and dredge now works as expected.

Thanks for the tips Jeff. debug sounds very handy for all kinds of R problems. Also, regarding create.model.list and mark.wrapper, doesn't that approach do all combos of the predefined parameters formulas but not, additionally, all formula combos within each parameter? Or, am I misunderstanding how to specify the RMark function to get similar output to dredge? I would certainly prefer staying within the RMark functions because, I think, the resulting list of models is easier to work with....but dredge isn't bad.
Working off of Jay's code:

All combos via dredge
Code: Select all
#Dipper example
data(dipper)

# Process data
dipper.processed=process.data(dipper,groups=("sex"))

# Create default design data
dipper.ddl=make.design.data(dipper.processed)

# Add Flood covariates for Phi and p that have different values
dipper.ddl$Phi$Flood=0
dipper.ddl$Phi$Flood[dipper.ddl$Phi$time==2 | dipper.ddl$Phi$time==3]=1
dipper.ddl$p$Flood=0
dipper.ddl$p$Flood[dipper.ddl$p$time==3]=1

gen=mark(dipper.processed,dipper.ddl,
         model.parameters=list(Phi=list(formula=~sex+time),p=list(formula=~Flood)),
         output = FALSE)

mods=dredge(gen)
mods

  Phi((Int)) Phi(sex) Phi(tim) p((Int))  p(Fld) df   logLik  AICc delta weight
1     0.2421                      2.226          2 -333.419 670.9  0.00  0.436
2     0.2036        +             2.227          3 -333.338 672.7  1.87  0.172
5     0.2421                      2.238 -0.1157  3 -333.412 672.9  2.01  0.159
3     0.5144                 +    2.220          7 -329.865 674.0  3.13  0.091
6     0.2037        +             2.239 -0.1136  4 -333.331 674.8  3.89  0.062
7     0.5195                 +    2.186  0.2992  8 -329.824 676.0  5.13  0.034
4     0.4829        +        +    2.222          8 -329.825 676.0  5.13  0.034
8     0.4882        +        +    2.187  0.2970  9 -329.784 678.0  7.14  0.012


All combos via RMark function
Code: Select all
combos <- function()
{
  # Phi
  Phi.1=list(formula=~1)
  Phi.2=list(formula=~time)
  Phi.3=list(formula=~sex)
 
  # p
  p.1=list(formula=~1)
  p.2=list(formula=~Flood)

  cml <- create.model.list("CJS")
  return(mark.wrapper(cml, data = dipper.processed, ddl = dipper.ddl, adjust = FALSE, output = FALSE))
}

mods2 <- combos()
mods2

                      model npar     AICc DeltaAICc     weight Deviance
1        Phi(~1)p(~1)    2 670.8660  0.000000 0.45722113 84.36055
5      Phi(~sex)p(~1)    3 672.7331  1.867043 0.17976411 84.19909
2    Phi(~1)p(~Flood)    3 672.8806  2.014583 0.16698025 84.34663
3     Phi(~time)p(~1)    7 673.9980  3.132004 0.09550372 77.25297
6  Phi(~sex)p(~Flood)    4 674.7578  3.891783 0.06531852 84.18569
4 Phi(~time)p(~Flood)    8 675.9936  5.127545 0.03521227 77.17114


Thanks again.
jCeradini
 
Posts: 72
Joined: Mon Oct 13, 2014 3:53 pm

Re: code for generating model subsets from global model

Postby jlaake » Wed Dec 14, 2016 3:03 pm

I've not used dredge so you know better than I but create.model.list does not subset within the parameter formula. It only creates all combinations across different parameters.
jlaake
 
Posts: 1417
Joined: Fri May 12, 2006 12:50 pm
Location: Escondido, CA

Re: code for generating model subsets from global model

Postby jCeradini » Wed Dec 14, 2016 3:07 pm

Ok, that's what I thought - just wanted to make sure there wasn't some modification I was missing. Thanks Jeff.
jCeradini
 
Posts: 72
Joined: Mon Oct 13, 2014 3:53 pm

Re: code for generating model subsets from global model

Postby jCeradini » Thu Jan 19, 2017 7:21 pm

Likely reinventing a lower quality wheel here, but I've found this function handy when needing all possible additive combinations of predictors, and still wanting the output to be an RMark list of models. Feel free to improve the function in any way. The function literally just prints the formulas in the console and then I copy and paste them into the RMark function.

You need to specify the exact column names of the predictors, the parameter, and may or may not specify a predictor that want in every model. Also creates a constant model with no predictors no matter what. You need to manually add any models with interactions afterwards.

Function
Code: Select all
require(gtools) # combinations function
require(plyr) # rbind.fill

#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Make all possible combos of predictors and combine into formulas
# - Need to use exact column names of predictors
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

all.combos <- function(preds, param, preds.in.all=NULL){

  vars <- data.frame(preds = preds)
  vars$preds <- as.character(vars$preds)
  pred.num <- nrow(vars)
  combos <- 1:pred.num

  # Make all possible combos of predictors as columns in DF
  pred.combos <- data.frame()
  for(i in 1:pred.num){
    temp <- data.frame(combinations(pred.num, combos[i], vars$preds))
    pred.combos <- rbind.fill(temp, pred.combos)
  }

 
  # Each ROW is a model
  # Rename last column
  last.col <- paste("X", ncol(pred.combos), sep = "") # rename last col
  names(pred.combos)[ncol(pred.combos)] <- last.col # rename last col
  last.var.col <- ncol(pred.combos)
 
  # Change all cols to character
  i <- sapply(pred.combos, is.factor)
  pred.combos[i] <- lapply(pred.combos[i], as.character)

  # Add extra row for constant model (intercept only)
  constant <- data.frame(x = 1)
  names(constant) <- last.col
  pred.combos <- rbind.fill(pred.combos, constant)
 
  # Add column for preds.in.all, which will be in every model
  pred.combos$in.all <- preds.in.all
  # Remove from constant model row
  pred.combos$in.all[pred.combos[last.var.col] == "1"] <- NA
 
  # Add parameter column
  pred.combos$param <- param
 
 
  # Paste all pred cols together to make model names
  # 1st reorder cols so paste is correct
  pred.combos <- pred.combos[ ,c(ncol(pred.combos), 1:(ncol(pred.combos)-1))] # reorder
  pred.combos$mod <- apply(pred.combos, 1, paste, collapse = ".") # paste
  pred.combos$mod <- gsub(".NA", "", pred.combos$mod, fixed = TRUE) # remove NA's

 
  # Paste all pred cols together to make formulas
  pred.combos$formula <- apply(pred.combos[ ,-c(1, ncol(pred.combos))], 1, paste, collapse = " + ")
  # Strip NA's from formulas
  pred.combos$formula <- gsub(" + NA", "", pred.combos$formula, fixed = TRUE)
  pred.combos$formula <- gsub("NA + ", "", pred.combos$formula, fixed = TRUE)

 
  # Paste list-formula stuff so RMark knows it is a formula
  pred.combos$formula <- paste("<- list(formula = ~", pred.combos$formula, ")", sep = "")
 
 
  # Combine mods and formula
  pred.combos$formula <- paste(pred.combos$mod, pred.combos$formula, sep = " ")

 
  # Print equations, without rownames, in console so can copy-paste directly into function
  print(pred.combos[ ,"formula", drop=FALSE], row.names = FALSE)

}


Some examples of output
Code: Select all
all.combos(preds = c("cover", "ndvi", "year"), param = "Psi")

 Psi.cover.ndvi.year <- list(formula = ~cover + ndvi + year)
             Psi.cover.ndvi <- list(formula = ~cover + ndvi)
             Psi.cover.year <- list(formula = ~cover + year)
               Psi.ndvi.year <- list(formula = ~ndvi + year)
                         Psi.cover <- list(formula = ~cover)
                           Psi.ndvi <- list(formula = ~ndvi)
                           Psi.year <- list(formula = ~year)
                                 Psi.1 <- list(formula = ~1)


all.combos(preds = c("cover", "ndvi"), param = "p")

 p.cover.ndvi <- list(formula = ~cover + ndvi)
             p.cover <- list(formula = ~cover)
               p.ndvi <- list(formula = ~ndvi)
                     p.1 <- list(formula = ~1)


all.combos(preds = c("ndvi", "precip"), param = "p", preds.in.all = "year")

p.ndvi.precip.year <- list(formula = ~ndvi + precip + year)
                 p.ndvi.year <- list(formula = ~ndvi + year)
             p.precip.year <- list(formula = ~precip + year)
                                   p.1 <- list(formula = ~1)


Would then just paste formulas into a function
Code: Select all
# Detection formulas
all.combos(preds = c("precip", "temp"), param = "p", preds.in.all = "year")

# Occupancy formulas
all.combos(preds = c("cover", "slope", "year"), param = "Psi")


combos.formulas <- function()
{
  # Detection
  p.precip.temp.year <- list(formula = ~precip + temp + year)
  p.precip.year <- list(formula = ~precip + year)
  p.temp.year <- list(formula = ~temp + year)
  p.1 <- list(formula = ~1)
 
  # Occupancy
  Psi.cover.slope.year <- list(formula = ~cover + slope + year)
  Psi.cover.slope <- list(formula = ~cover + slope)
  Psi.cover.year <- list(formula = ~cover + year)
  Psi.slope.year <- list(formula = ~slope + year)
  Psi.cover <- list(formula = ~cover)
  Psi.slope <- list(formula = ~slope)
  Psi.year <- list(formula = ~year)
  Psi.1 <- list(formula = ~1)
 
  cml <- create.model.list("Occupancy")
 
  return(mark.wrapper(cml, data = data.proc, ddl = data.ddl, adjust = FALSE, output = FALSE))
 
}

# Run models
mod.set <- combos.formulas()


Joe
jCeradini
 
Posts: 72
Joined: Mon Oct 13, 2014 3:53 pm

Re: code for generating model subsets from global model

Postby WiPhi » Mon May 01, 2017 1:09 pm

I just wanted to share a small chunk of code that works to remove nonsensical models before calling MARK when exploring mixtures.

As noted elsewhere, models without heterogeneity can still be fit with functions like "CJSMixture" if a formula for pi, fixes the mixture at 1 is included along with other mixture models:

Code: Select all
pi.fix=list(formula=~1,fixed=1)
pi.mix=list(formula=~1)


“pi.fix” models will produce the same results as fitting a "CJS" model with no mixture.
(Note that here I am fitting a mixture of 2 clases with the ~1. If more than >2 classes are specified, the formula must be ~mixture as recently covered here:
http://www.phidot.org/forum/viewtopic.php?f=21&t=3436&p=11209#p11209

When using create.model.list() within a function call contain both no-heterogeneity and mixture models, misspecified models will be fit whenever the no-heterogeneity 'pi.fix' model is crossed with any parameter model that specifies an effect of ‘mixture’.

This code will remove all models from the model list (here called 'cml') that specify a mixture but mixture is not called by another parameter model (detection, for example) or when mixture is called in another parameter but pi is fixed:
Code: Select all
 cml<-cml[!((grepl("fix",cml$pi)&grepl("mix",cml$p))|(!grepl("fix",cml$pi)&!grepl("mix",cml$p))),]

This will work as long as the character string ‘fix’ is used in the pi model for no-heterogeneity and the character string ‘mix’ is used in any parameter model that includes the mixing parameter.

Mark will run without this, but this results in a cleaner final model list.
WiPhi
 
Posts: 27
Joined: Thu Mar 31, 2016 6:15 pm


Return to RMark

Who is online

Users browsing this forum: No registered users and 14 guests

cron