I'm going to expand on what Jim has posted because it depends on the variable class (mainly numeric or factor here). I give some examples below and you should be able to do this with your ddl dataframes like Phi and p unless the variables are individual covariates (ie only in the data and not in the design data). This is explained in detail in the workshop notes which you should read (see link you get when you type library(RMark)). In my description below I'm assuming that sex, color and time are all variables in the design data.
Here is a simple dataframe in which sex, color and time are all factor variables.
- Code: Select all
> data=data.frame(color=c("yellow","green"),sex=c("M","F"),time=c("1","2"),stringsAsFactors=TRUE)
> data
color sex time
1 yellow M 1
2 green F 2
> str(data)
'data.frame': 2 obs. of 3 variables:
$ color: Factor w/ 2 levels "green","yellow": 2 1
$ sex : Factor w/ 2 levels "F","M": 2 1
$ time : Factor w/ 2 levels "1","2": 1 2
>
Now using the data, I'm creating a design matrix with the formula ~sex:color:time which are all factor variables. This is what RMark does.
- Code: Select all
model.matrix(~sex:color:time,data)
(Intercept) sexF:colorgreen:time1 sexM:colorgreen:time1 sexF:coloryellow:time1 sexM:coloryellow:time1 sexF:colorgreen:time2 sexM:colorgreen:time2 sexF:coloryellow:time2 sexM:coloryellow:time2
1 1 0 0 0 1 0 0 0 0
2 1 0 0 0 0 1 0 0 0
attr(,"assign")
[1] 0 1 1 1 1 1 1 1 1
attr(,"contrasts")
attr(,"contrasts")$sex
[1] "contr.treatment"
attr(,"contrasts")$color
[1] "contr.treatment"
attr(,"contrasts")$time
[1] "contr.treatment"
Note that there are 9 columns in the design matrix, an intercept and a column for each potential combination of the 2 levels of 3 factor variables 8=2*2*2. However there are only 8 potential combinations so this is one too many parameters if you had data in all 8 cells (not the case in my simple example). Note that RMark will drop any column which is all 0's because there are no design data for the cell. If you were to want to use this approach you should use ~-1+sex:color:time so it would drop the intercept and only have the 8 parameters - one for each cell (combination).
Now as Jim mentioned using ~sex*color*time with all factor variables, will give you the main effects and all 2 and 3 way interactions for a total of 8 parameters. The beta's will differ but if all cells have design data and the models converge, it should give the AIC as ~-1+sex:color:time but ~sex:color:time should give the same likelihood but AIC will be greater by 2 because it has one too many parameters.
- Code: Select all
> model.matrix(~sex*color*time,data)
(Intercept) sexM coloryellow time2 sexM:coloryellow sexM:time2 coloryellow:time2 sexM:coloryellow:time2
1 1 1 1 0 1 0 0 0
2 1 0 0 1 0 0 0 0
attr(,"assign")
[1] 0 1 2 3 4 5 6 7
attr(,"contrasts")
attr(,"contrasts")$sex
[1] "contr.treatment"
attr(,"contrasts")$color
[1] "contr.treatment"
attr(,"contrasts")$time
[1] "contr.treatment"
Now this all changes if one or more of the 3 parameters is numeric. Here I made time an integer with values 1 and 2.
- Code: Select all
> data=data.frame(color=c("yellow","green"),sex=c("M","F"),time=1:2,stringsAsFactors=TRUE)
> str(data)
'data.frame': 2 obs. of 3 variables:
$ color: Factor w/ 2 levels "green","yellow": 2 1
$ sex : Factor w/ 2 levels "F","M": 2 1
$ time : int 1 2
Now if I use ~sex:color:time I get 5 parameters. An intercept and 4 slopes for time - one slope for each of the 4 sex-color combinations.
- Code: Select all
> model.matrix(~sex:color:time,data)
(Intercept) sexF:colorgreen:time sexM:colorgreen:time sexF:coloryellow:time sexM:coloryellow:time
1 1 0 0 0 1
2 1 2 0 0 0
attr(,"assign")
[1] 0 1 1 1 1
attr(,"contrasts")
attr(,"contrasts")$sex
[1] "contr.treatment"
attr(,"contrasts")$color
[1] "contr.treatment"
But if I use ~sex*color*time I get an intercept, each main effect for sex and color and then 5 slopes for time combined with the factor interactions. In this case with just 2 values of time, you get 8 betas for the 8 combinations but if time had 3 or more values you would still only get 8 parameters because it is fitting lines (on the link scale) for time.
- Code: Select all
> model.matrix(~sex*color*time,data)
(Intercept) sexM coloryellow time sexM:coloryellow sexM:time coloryellow:time sexM:coloryellow:time
1 1 1 1 1 1 1 1 1
2 1 0 0 2 0 0 0 0
attr(,"assign")
[1] 0 1 2 3 4 5 6 7
attr(,"contrasts")
attr(,"contrasts")$sex
[1] "contr.treatment"
attr(,"contrasts")$color
[1] "contr.treatment"
>
Now remember that this all assumes that there is at least one record in the design data for each combination of of the three factor variables (if they are all factor variables). For instance, if all males were yellow and females could be green or yellow and you used sex and color as group variables in process.data then you won't get the male-green-time cells and you will end up with fewer parameters (as you should) with -1+sex:color:time. An example of where this is useful is described for age and year models in the workshop notes. If you only mark young birds, in the first year you only have young birds and in subsequent years you have young and older birds.