## DM formatting to average across multiple groups (not time)


### DM formatting to average across multiple groups (not time)

Is there an example in the MARK book that shows how to set up the design matrix to average parameters across groups (betas) that don't correspond to time-varying survival? If using the dipper data, it would be looking for an average across sex, colony, or sex and colony.

Ultimately, I'm looking for a way to get averages across 2 different grouping factors in a nest survival model where there is also a cubic effect of day-of-season. I just figured I'd try to find a simpler example with the dipper data first and try to build off that.

I spent most of my time digging through chapter 6, so it's possible I missed it somewhere else.
tlyons4

Posts: 18
Joined: Thu Nov 30, 2017 7:29 pm

### Re: DM formatting to average across multiple groups (not tim

tlyons4 wrote:Is there an example in the MARK book that shows how to set up the design matrix to average parameters across groups (betas) that don't correspond to time-varying survival? If using the dipper data, it would be looking for an average across sex, colony, or sex and colony.

I don't really follow -- what do you mean by 'that don't correspond to time-varying survival'? Taking the dipper data as an example, do you simply want a mean for males and a mean for females? A mean over time, within each sex? If so (and I'm guessing), then there are a couple of ways you can do this -- random effects using Burnham's method of moments (Appendix D), or MCMC (Appendix E).
cooch

Posts: 1338
Joined: Thu May 15, 2003 4:11 pm
Location: Cornell University

### Re: DM formatting to average across multiple groups (not tim

I was referring to an example in chapter 6 on pgs 6-81 & 6-83 where it uses coding in the design matrix to calculate an average survival probability when survival varies through time.

With the dipper data, it would be having a model with sex and colony, then calculating a mean for each sex across colonies, each colony across sexes, and then an overall mean.

Sorry for not being clearer.
tlyons4

Posts: 18
Joined: Thu Nov 30, 2017 7:29 pm

### Re: DM formatting to average across multiple groups (not tim

tlyons4 wrote:I was referring to an example in chapter 6 on pgs 6-81 & 6-83 where it uses coding in the design matrix to calculate an average survival probability when survival varies through time.

With the dipper data, it would be having a model with sex and colony, then calculating a mean for each sex across colonies, each colony across sexes, and then an overall mean.

Sorry for not being clearer.

Same answer, then -- the DM coding approach does fine at getting the 'mean', but if you want the right SE/process variance, you should use either RE or MCMC. How you approach it will depend on your DM. There is a fair bit on this in the MCMC appendix, specifically. But unless you have a fair number of sampling occasions (rule of thumb is at least 10), estimates of the mean may not be particularly robust.
cooch

Posts: 1338
Joined: Thu May 15, 2003 4:11 pm
Location: Cornell University

### Re: DM formatting to average across multiple groups (not tim

I guess that just seems odd to me. Why try to use a random effect to get an average across sex? It wouldn't work as there are only two levels, and it's normally considered a fixed effect. Or can it be included in the DM like below? Column 1 is the intercept, column 2 is the offset for sex.

    1   1
    1   1
    1   1
    1  -1
    1  -1
    1  -1

Probably a better way of putting it: I'm after the marginal means of a 2-way model.
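To make "marginal means of a 2-way model" concrete, here is a minimal Python sketch of an additive sex + colony model under +1/-1 (effect) coding on the identity link. The beta values and the colony labels `A` and `B` are hypothetical and chosen only for illustration; they are not estimates from the dipper data or any other data set. With balanced coding like this, the intercept equals the unweighted grand mean, and each marginal mean is the intercept plus or minus the corresponding main effect.

```python
from itertools import product

# Hypothetical betas on the identity link (NOT estimates from any data set).
betas = {"intercept": 0.60, "sex": 0.05, "colony": -0.10}

# Effect coding: +1 / -1 for the two levels of each factor.
sex_code = {"female": 1, "male": -1}
colony_code = {"A": 1, "B": -1}  # colony labels are placeholders


def cell_value(sex, colony):
    """Linear predictor for one sex-by-colony cell."""
    return (betas["intercept"]
            + betas["sex"] * sex_code[sex]
            + betas["colony"] * colony_code[colony])


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# All four cells of the 2 x 2 layout.
cells = {(s, c): cell_value(s, c) for s, c in product(sex_code, colony_code)}

# Marginal means: average each sex over colonies, each colony over sexes.
sex_means = {s: mean(cells[(s, c)] for c in colony_code) for s in sex_code}
colony_means = {c: mean(cells[(s, c)] for s in sex_code) for c in colony_code}
grand_mean = mean(cells.values())

print("sex means:   ", {k: round(v, 4) for k, v in sex_means.items()})
print("colony means:", {k: round(v, 4) for k, v in colony_means.items()})
print("grand mean:  ", round(grand_mean, 4))
```

On a logit link the same averaging would hold on the link scale rather than on the probability scale, so the back-transformed intercept would not generally equal the average of the real estimates.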
tlyons4

Posts: 18
Joined: Thu Nov 30, 2017 7:29 pm

### Re: DM formatting to average across multiple groups (not tim

To get the appropriate estimate of the variance. Presumably it isn't just the 'average' you're after.
cooch

Posts: 1338
Joined: Thu May 15, 2003 4:11 pm
Location: Cornell University

### Re: DM formatting to average across multiple groups (not tim

For completeness, consider the data set at the bottom of this post: simulated with 6 occasions, 2 groups (call them female, male), true phi are 0.70 and 0.85 respectively. Constant p, same for both groups.

If you build the following DM, corresponding to a standard offset structure for phi(group)p(dot) -- the true generating model -- using the identity link to make interpreting the betas easier:

    1 1
    1 1
    1 1
    1 1
    1 1
    1 0
    1 0
    1 0
    1 0
    1 0

you get the following estimates:

    Parameter                    Beta         Standard Error
    -------------------------  --------------  --------------
    1:                           0.8495034       0.0012078
    2:                          -0.1475446       0.0019485

Beta(1) is the reference (males, corresponding to 0.85), while beta(2) is the difference of the other group (females) relative to the control (0.7 is 0.15 less than 0.85, i.e., ~ -0.15).
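As a quick check on that interpretation, the Python sketch below multiplies the phi rows of the offset-coded DM by the reported betas. It assumes the first five rows (coded 1 in column 2) belong to group 1 (females) and the last five (coded 0) to group 2 (males), and it leaves out the p column, as the DM shown above does.

```python
# Phi rows of the offset-coded design matrix (p column omitted).
# Column 1 = intercept, column 2 = group indicator.
# Assumption: rows coded 1 are females, rows coded 0 are males (the reference).
design_matrix = [[1, 1]] * 5 + [[1, 0]] * 5

# Betas reported in the output above (identity link).
betas = [0.8495034, -0.1475446]


def linear_predictor(row, betas):
    """Row of the design matrix times the beta vector."""
    return sum(x * b for x, b in zip(row, betas))


# With the identity link, the linear predictor is the survival estimate itself.
phi = [linear_predictor(row, betas) for row in design_matrix]
phi_female, phi_male = phi[0], phi[5]

print(f"female: {phi_female:.4f}, male: {phi_male:.4f}")
# prints "female: 0.7020, male: 0.8495"
```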

If you use the following DM:

    1  1
    1  1
    1  1
    1  1
    1  1
    1 -1
    1 -1
    1 -1
    1 -1
    1 -1

you get the following

    Parameter                    Beta         Standard Error
    -------------------------  --------------  --------------
    1:                           0.7757311       0.0010306
    2:                          -0.0737723       0.9742689E-003

Beta(1) is the average of the 2 groups ([0.85+0.70]/2=0.775), while beta(2) is the deviation of each group from this mean (e.g., 0.85 is 0.075 higher than the mean of 0.775, and 0.70 is 0.075 lower).
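The same arithmetic can be sketched in Python. This assumes, as the sign of beta(2) suggests, that the rows coded +1 are females and the rows coded -1 are males; the betas are the ones reported for the +1/-1 DM.

```python
# Betas reported for the +1/-1 coded DM (identity link).
betas = [0.7757311, -0.0737723]

# Assumption: females are coded +1 and males -1 in column 2.
phi_female = betas[0] + betas[1] * 1
phi_male = betas[0] + betas[1] * -1

# Under +1/-1 coding the intercept is the unweighted mean of the two
# group estimates, and the slope is half the female-minus-male difference.
mean_of_groups = (phi_female + phi_male) / 2
half_difference = (phi_female - phi_male) / 2

print(f"female: {phi_female:.4f}, male: {phi_male:.4f}")
print(f"mean of groups:  {mean_of_groups:.7f}")
print(f"half difference: {half_difference:.7f}")
```

The group estimates match those implied by the offset-coded DM; only the meaning of the betas changes.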

OK, so what's wrong with using this approach? The problem is in the estimate of the error. The SE from above is 0.0010306. If we estimate the variance as the square of this, we get a really small number (about 1.06e-6).

If you run a RE approach to the same question you get the following estimates:

    Beta-hat
    -----------
    0.775241

So, the estimate is basically identical to what we saw with the previous DM, but the reported variance is 0.0061345, which is a robust estimate of the process variance, and (more to the point) is much bigger than the estimate based on the SE from the DM approach, above. [As a quick aside, you get the same answer using MCMC as you would with the RE approach based on the 'method of moments'].
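To put numbers on "much bigger", this short Python sketch simply repeats the arithmetic from the post: it squares the SE of beta(1) from the +1/-1 DM and compares it with the variance reported by the RE approach. Nothing here is refit; both inputs are copied from the output above.

```python
# SE of beta(1) from the +1/-1 coded DM, as reported above.
se_dm = 0.0010306

# Variance reported by the random effects (method of moments) approach.
process_variance = 0.0061345

# Squaring the SE gives the variance implied by the DM approach.
variance_dm = se_dm ** 2

# How many times larger the RE variance is.
ratio = process_variance / variance_dm

print(f"variance from DM SE: {variance_dm:.3e}")
print(f"RE process variance: {process_variance:.3e}")
print(f"ratio (RE / DM):     {ratio:.0f}")
```

The two differ by more than three orders of magnitude.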

So, as per my earlier post, you should try the RE or MCMC approach, instead of trying to get there from here by tweaking the DM.

Here are the encounter data:

`111000 1103 840;110000 1990 1239;111110 411 642;111100 601 698;100000 3549 1824;111111 553 1432;101000 261 207;101100 134 164;111010 101 172;110111 141 364;110110 93 163;110100 170 152;111101 131 369;111001 40 93;101110 109 168;101111 133 373;101101 33 94;100110 34 50;100100 41 57;110010 31 36;111011 137 375;100011 6 22;110101 34 94;101011 37 93;110011 35 83;100111 41 88;100101 8 24;101010 15 34;110001 10 19;101001 10 20;100010 7 6;100001 1 5;011001 73 140;011011 226 522;011000 1991 1257;011110 785 1017;011100 1167 969;010000 3420 1817;011111 969 2151;010100 268 230;010111 242 510;011010 203 252;010101 55 154;010110 201 234;011101 278 515;010011 69 149;010010 44 54;010001 9 29;001000 3384 1817;001100 2018 1421;001111 1822 3097;001110 1449 1496;001011 439 797;001010 343 377;001101 425 809;001001 120 186;000101 867 1146;000100 3585 2027;000111 3099 4644;000110 2449 2183;000011 5617 6795;000010 4383 3205;`
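For anyone who wants to inspect these records outside MARK, here is a minimal Python sketch that splits the semicolon-terminated records into an encounter history plus per-group frequencies. The function name `parse_records` is made up for this example, and it assumes each record is an encounter history followed by one count per group, with the first count belonging to group 1. Only the first three records are used as sample input.

```python
def parse_records(text):
    """Split 'history count count;' records into (history, [counts]) tuples."""
    records = []
    for chunk in text.split(";"):
        fields = chunk.split()
        if not fields:
            continue  # skip the empty piece after the final semicolon
        history, *counts = fields
        records.append((history, [int(c) for c in counts]))
    return records


# First three records from the data above.
sample = "111000 1103 840;110000 1990 1239;111110 411 642;"
records = parse_records(sample)

# Total number of individuals per group across the parsed records.
group_totals = [sum(col) for col in zip(*(counts for _, counts in records))]

for history, counts in records:
    print(history, counts)
print("totals per group:", group_totals)
```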
cooch

Posts: 1338
Joined: Thu May 15, 2003 4:11 pm
Location: Cornell University

### Re: DM formatting to average across multiple groups (not tim

Thank you for the clarification. I'm working on wrapping my head around it still. I appreciate the help.
tlyons4

Posts: 18
Joined: Thu Nov 30, 2017 7:29 pm