Warning when using merge_design.covariates

Postby BBAllen » Tue Jun 13, 2017 6:35 pm

I am receiving this warning message when trying to merge 11 time-varying covariates to my capture history that includes three groups (Year, Age, Sex) and three individual covariates (Wt, z.wt, and rsf).

Warning message:
In `[<-.factor`(`*tmp*`, ri, value = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, :
invalid factor level, NA generated

I have no issues running process.data or make.design.data on the ch dataset. But when I merge the data frame, which includes a "time" column and a "group" column for one of the group terms, RMark throws the above warning. I then reran process.data with only one group defined (Year), which is the group column in the time-varying covariate data frame being merged.

So, I have no idea what the invalid factor level is, where these NAs are being generated, or how to identify the invalid factor.

Any ideas???
BBAllen
 
Posts: 3
Joined: Tue Mar 01, 2016 2:42 pm
Location: University of Maine, Orono, ME

Re: Warning when using merge_design.covariates

Postby jlaake » Wed Jun 14, 2017 12:33 pm

I'm very confused by your post. You mention group variables and 3 individual covariates in your data but don't mention any design covariates. As the function name implies, merge_design.covariates is for design covariates and not individual covariates. To be clear, an individual covariate can be different for each capture history (typically each different critter) and is stored in the data with the capture history, whereas a design covariate only differs by time or group and is stored in the design data. Now you state you are merging 11 time-varying covariates to your capture history, which is in the data and not the design data. What exactly do you mean by a time-varying covariate? Do they vary only by time, or are they time-varying individual covariates? The terminology is rather important here.
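To make the distinction concrete, here is a minimal base-R sketch with made-up values. The data frames `ch_data`, `design`, and `design_cov` and their columns are hypothetical, and base `merge()` merely stands in for what merge_design.covariates does with the real design data.

```r
# Individual covariate: one value per capture history, stored with the data.
ch_data <- data.frame(ch = c("1011", "1100", "0110"),
                      Wt = c(183, 230, 143),
                      stringsAsFactors = FALSE)

# Toy design data (hypothetical): one row per parameter, indexed by time and group.
design <- data.frame(time  = rep(1:3, times = 2),
                     group = rep(c("F", "M"), each = 3))

# Design covariate: one value per time, shared by every animal.
design_cov <- data.frame(time = 1:3, min.temp = c(-2.0, 1.5, 0.3))

# The covariate is attached to the design data by time, not to ch_data.
merged <- merge(design, design_cov, by = "time")
```

Every row of `merged` picks up the `min.temp` value for its time, while `ch_data` keeps only the per-animal covariate.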

You really didn't provide enough details to work out the problem. I thought at first it might be due to multiple factor levels for groups, but I can't reproduce the error you get (see below). One thought is that the problem lies with one or more of the names of the 11 time-varying covariates, or that you have missing values (i.e., NA).
Code: Select all
# an example that works
data(dipper)
dipper$region=factor(c(rep(1,100),rep(2,194)))
dp=process.data(dipper,groups=c("sex","region"))
ddl=make.design.data(dp)
ddf=data.frame(time=rep(1:6,4),group=rep(apply(dp$group,1,paste,collapse=""),each=6),cov=1:24)
merge_design.covariates(ddl$Phi,ddf,bygroup=TRUE)


# thought maybe the problem was group levels being out of order, but that isn't it.
ddf=data.frame(time=rep(1:6,4),group=rep(apply(cbind(dp$group[,2],dp$group[,1]),1,paste,collapse=""),each=6),cov=1:24)
merge_design.covariates(ddl$Phi,ddf,bygroup=TRUE)
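The two guesses above (covariate names and missing values) can be checked before merging. The following is only a sketch in base R: `design_df` is a hypothetical stand-in for ddl$Phi, `cov_df` is a made-up covariate data frame, and the group-label check is an extra assumption about what might go wrong, not something confirmed in this thread.

```r
# Hypothetical stand-ins: 'design_df' plays the role of ddl$Phi and
# 'cov_df' the covariate data frame; neither comes from the thread.
design_df <- data.frame(time  = factor(rep(1:3, times = 2)),
                        group = factor(rep(c("2010", "2011"), each = 3)),
                        Time  = rep(0:2, times = 2))
cov_df <- data.frame(time     = rep(1:3, times = 2),
                     group    = rep(c("2010", "2011"), each = 3),
                     Time     = rep(0:2, times = 2),
                     min.temp = c(-2.0, NA, 0.3, 1.1, 0.4, -0.6))

# Missing values per covariate column.
na_counts <- colSums(is.na(cov_df))

# Column names shared with the design data, other than the merge keys.
name_clashes <- setdiff(intersect(names(cov_df), names(design_df)),
                        c("time", "group"))

# Group labels in the covariates that are not levels of the design data.
unmatched_groups <- setdiff(as.character(cov_df$group),
                            levels(design_df$group))
```

Any non-zero entry in `na_counts`, or any name in `name_clashes` or `unmatched_groups`, is worth a look before calling merge_design.covariates.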

jlaake
 
Posts: 953
Joined: Fri May 12, 2006 12:50 pm
Location: National Marine Mammal Laboratory, Seattle, WA

Re: Warning when using merge_design.covariates

Postby BBAllen » Wed Jun 14, 2017 2:03 pm

Yeah, sorry you are suffering confusion from my confusion. This stuff is a little foreign to me, and I'm not exactly clear on the correct order of operations, I guess. Let me give it another shot:

Background: I am working with telemetry data and doing a CJS analysis to investigate factors influencing departure rates from a migratory stopover. I am defining Phi as the probability of remaining in the study area and 1-Phi the departure probability.

I have a daily encounter (capture) history for 186 individuals over four years. There are 102 days per year. I have defined the year in which the individual is captured as a group, as well as sex and age. I also have weight (measured at capture), z-standardized weight, and an average rspf value (averaged over the duration the individual stayed in the study area) for each individual (the three individual covariates; not time-varying).

The 11 time-varying covariates are daily measurements of environmental variables (avg. wind speed, min. temp, avg. barometric pressure, etc.), which, if I am understanding correctly, are design covariates.

Here is the code I've been trying. I included intermediary steps just in case I have assigned something as a factor when I should not have.

Code: Select all
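library(RMark) # process.data, make.design.data, merge_design.covariates
library(plyr)  # assumed source of join(); the original post does not show which package was loaded
amwo <- read.csv("~/Dropbox/Thesis/nj_amwo/all_years/data_csv/daily_ch_rmark.csv", header=T)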
rspf.cjs <- read.csv("~/Dropbox/Thesis/nj_amwo/all_years/data_csv/rspf_cjs.csv") # generated from chapt1_rsf.R script
names(rspf.cjs) <- c("uid", "rspf")


wx.moon <- read.csv("/Users/brianballen/Dropbox/Thesis/nj_amwo/all_years/data_csv/wx.moon.csv")


amwo2 <- join(amwo, rspf.cjs, by= "uid", type= "left")
amwo2$z.rspf<- scale(amwo2$rspf)
amwo2$z.rspf[is.na(amwo2$z.rspf)] <- 0
amwo.cjs <- amwo2[c("uid", "ch", "Year","Age", "Sex", "Wt", "Z_Wt", "z.rspf")]
amwo.cjs$ch <- as.character(amwo$ch)
amwo.cjs$Year <- as.factor(amwo$Year)

amwo.process <- process.data(amwo.cjs, model="CJS", groups = c("Year"))
amwo.ddl <- make.design.data(amwo.process)

wx.moon$group <- as.factor(wx.moon$group)
wx.moon$X <- NULL
amwo.ddl$Phi <- merge_design.covariates(amwo.ddl$Phi,wx.moon, bytime = TRUE, bygroup = TRUE)


Output
Code: Select all
> amwo <- read.csv("~/Dropbox/Thesis/nj_amwo/all_years/data_csv/daily_ch_rmark.csv", header=T)
> rspf.cjs <- read.csv("~/Dropbox/Thesis/nj_amwo/all_years/data_csv/rspf_cjs.csv") # generated from chapt1_rsf.R script
> names(rspf.cjs) <- c("uid", "rspf")
>
>
> wx.moon <- read.csv("/Users/brianballen/Dropbox/Thesis/nj_amwo/all_years/data_csv/wx.moon.csv")
>
>
> amwo2 <- join(amwo, rspf.cjs, by= "uid", type= "left")
> amwo2$z.rspf<- scale(amwo2$rspf)
> amwo2$z.rspf[is.na(amwo2$z.rspf)] <- 0
> amwo.cjs <- amwo2[c("uid", "ch", "Year","Age", "Sex", "Wt", "Z_Wt", "z.rspf")]
> amwo.cjs$ch <- as.character(amwo$ch)
> amwo.cjs$Year <- as.factor(amwo$Year)
>
> amwo.process <- process.data(amwo.cjs, model="CJS", groups = c("Year"))
> amwo.ddl <- make.design.data(amwo.process)
>
> wx.moon$group <- as.factor(wx.moon$group)
> wx.moon$X <- NULL
> amwo.ddl$Phi <- merge_design.covariates(amwo.ddl$Phi,wx.moon, bytime = TRUE, bygroup = TRUE)
Warning message:
In `[<-.factor`(`*tmp*`, ri, value = c(1L, 2L, 3L, 4L, 5L, 6L, 7L,  :
  invalid factor level, NA generated


And here is the head of amwo.cjs and wx.moon. I have a feeling this will provide insight into the issue.

    > head(amwo.cjs)
    uid
    1 164.033_10a
    2 164.043_10a
    3 164.052_10a
    4 164.093_10a
    5 164.113_10a
    6 164.122_10a
    ch Year Age
    1 ...000001100000100100010111000000000000000000000000000000000000000000000000000000..................... 2010 HY
    2 ...100000000000000000000000000000000000000000000000000000000000000000000000000000..................... 2010 HY
    3 ...011001000000100100001111001110000000000000000000000000000000000000000000000000..................... 2010 AHY
    4 ...000001110100000100011111000000000000000000000000000000000000000000000000000000..................... 2010 HY
    5 ...000001110100101100011111000000000000000000000000000000000000000000000000000000..................... 2010 HY
    6 ...000000110100101100011111001100000111110011111001111100110100000000000000000000..................... 2010 HY
    Sex Wt Z_Wt z.rspf
    1 F 183 0.5933 -0.81063965
    2 F 230 2.6292 0.00000000
    3 M 143 -1.1395 -2.62820102
    4 F 182 0.5499 -0.78084812
    5 F 179 0.4200 -0.41633140
    6 F 183 0.5933 0.01694618
    > head(wx.moon)
    time group moon min.temp precip avg.bar avg.wind max.wind avg.cc avg.ch avg.wnd.dir
    1 1 2010 1.3529332 0 0 0 0 0 0 0 0
    2 2 2010 1.2393887 0 0 0 0 0 0 0 0
    3 3 2010 1.0974580 0 0 0 0 0 0 0 0
    4 4 2010 0.8987551 0 0 0 0 0 0 0 0
    5 5 2010 0.6716660 0 0 0 0 0 0 0 0
    6 6 2010 0.3878047 0 0 0 0 0 0 0 0

You'll notice that in the code only Year is a group, although I had stated that Year, Age, and Sex were groups. I thought having Age and Sex as groups was the issue, so I decided to define them as individual covariates, since it doesn't affect my analysis.

Re: Warning when using merge_design.covariates

Postby jlaake » Thu Jun 15, 2017 10:55 am

A few things here.

1) The head function is not particularly useful here. Use the str function instead, because it shows the type of each column (i.e., numeric, factor, etc.) as well as the first few values.
2) Age and Sex are factor variables as defined and cannot be used as individual covariates. Values of individual covariates are plugged into the design matrix and must be numeric. You really want Age and Sex to be group variables. You also want to assign an initial.age to each age group, but since you have years as groups it is not clear whether you have recaptures across years or not, so age would only be within year.
3) The message you got is only a warning that it plugged in NA for a value of a factor. This means that you can look at the result in amwo.ddl$Phi and see what is causing the problem. You can use str and summary.
4) My guess is that one of your numeric variables in wx.moon is actually a factor variable rather than numeric, but it is hard to know without looking at the values in the ddl.
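
One way to follow up on points 3 and 4 is sketched below in base R. The data frame `wx` and its `avg.wind` column are made up, not the poster's data. The sketch relies on two facts: before R 4.0, read.csv turned character columns into factors by default, so a numeric column with a stray text entry would arrive as a factor; and assigning a value that is not an existing level of a factor produces this same warning.

```r
# Made-up stand-in for a covariate file; 'avg.wind' contains a stray text entry,
# so it is stored as a factor rather than as numeric.
wx <- data.frame(time = 1:3,
                 avg.wind = c("3.1", "n/a", "4.0"),
                 stringsAsFactors = TRUE)

# Which columns are factors? (str(wx) shows the same information.)
factor_cols <- names(wx)[sapply(wx, is.factor)]

# Assigning a value that is not an existing level stores NA and warns:
# "invalid factor level, NA generated"
wx$avg.wind[1] <- "5.2"

# Locate the NA that was generated.
na_rows <- which(is.na(wx$avg.wind))

# Convert a factor that should be numeric; non-numeric entries become NA.
wx$avg.wind <- suppressWarnings(as.numeric(as.character(wx$avg.wind)))
```

Running the same `sapply(..., is.factor)` and `is.na` checks on wx.moon and on amwo.ddl$Phi after the merge should show which column is involved.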

If you can't resolve this, let's take this offlist. jefflaake@gmail.com

--jeff

