Hello all,
My advisor and I are trying to understand the names assigned to the beta coefficients when we run a single-season occupancy model in RPresence. I've read up on design matrices and my advisor is familiar with them, but we still don't understand the naming conventions that lead to this output.
For comparison, we've run three models:
Method Only Model: psi ~ 1, p ~ method
  method is a factor with levels 1 and 2, indicating visual surveys vs. eDNA
eDNA Covariate Model: psi ~ 1, p ~ method + pH + volume + conductivity + salinity
  All variables are numerical except method (same as above). These variables only apply to survey 3 (eDNA).
Weather Model: psi ~ 1, p ~ method + clouds + precipitation + wind + airtemp
  method, clouds, precipitation, and wind are categorical; airtemp is numerical
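For reference, this is how we understand base R to expand an additive formula with factors into design-matrix columns. It is only a sketch with made-up data: the "Clear" level and all values are hypothetical, and we are not claiming RPresence builds its matrices this way internally.

```r
# Toy covariate data; the values and the "Clear" level are made up for illustration.
toy <- data.frame(
  method  = factor(c(1, 1, 2, 1, 1, 2)),
  clouds  = factor(c("Clear", "Overcast", "Partly Cloudy",
                     "Overcast", "Clear", "Partly Cloudy")),
  airtemp = c(12.5, 14.0, 9.8, 11.2, 15.1, 10.4)
)

# Additive formula only: no ':' or '*' terms, so no interaction columns.
X <- model.matrix(~ method + clouds + airtemp, data = toy)

# One column per non-reference factor level, plus intercept and numeric covariate.
print(colnames(X))
```

With default treatment contrasts, each factor contributes one column per non-reference level, which is why we expect names like cloudsOvercast and cloudsPartly Cloudy but no product terms.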
The design matrix for the method only model looks like this:
   b1  b2
p1 "1" "0"
p2 "1" "0"
p3 "1" "1"
And the beta output looks like this:
                est       se
p1_B1.Int -0.313603 0.256614
p3_B2     -0.544725 0.329389
Based on our understanding, b1 is thus the intercept and b2 is a modifier for eDNA detection, i.e., on the logit scale,
logit(p) = B1 + B2*(1 if eDNA, 0 otherwise)
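To make that reading concrete, here is a minimal R sketch that multiplies the design matrix by the estimates above. It assumes a logit link (our assumption), so plogis() converts each row's linear predictor back to a detection probability.

```r
# Beta estimates copied from the method-only output above.
beta <- c(B1 = -0.313603, B2 = -0.544725)

# Design matrix copied from above: one row per survey, one column per beta.
dm <- matrix(c(1, 0,
               1, 0,
               1, 1),
             nrow = 3, byrow = TRUE,
             dimnames = list(c("p1", "p2", "p3"), c("b1", "b2")))

# Assumption: logit link, so each row gives logit(p) and plogis() inverts it.
logit_p <- drop(dm %*% beta)
p <- plogis(logit_p)
print(round(p, 3))
```

Under that assumption, p1 and p2 come out identical (roughly 0.42) and p3 is lower (roughly 0.30), which matches how we read the negative B2.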
This makes sense to us, but when we add in other variables, we get confused.
Here's the design matrix for the eDNA covariate model:
   b1  b2  b3     b4         b5          b6
p1 "1" "0" "0"    "0"        "0"         "0"
p2 "1" "0" "0"    "0"        "0"         "0"
p3 "1" "1" "p.pH" "p.volume" "p.conduct" "p.salinity"
And the beta output:
                             est       se
p1_B1.Int              -0.230920 0.241361
p3_B2                   7.063321 0.378967
p3_B3.p.pH             -0.803916 0.000207
p3_B3.p.pH.p.volume    -0.000166 0.000156
p3_B3.p.pH.p.conduct   -0.004731 0.000114
p3_B3.p.pH.p.salinity   0.809555 0.000208
And for the weather model:
   b1  b2  b3                 b4                      b5
p1 "1" "0" "p.cloudsOvercast" "p.cloudsPartly Cloudy" "p.precipRain"
p2 "1" "0" "p.cloudsOvercast" "p.cloudsPartly Cloudy" "p.precipRain"
p3 "1" "1" "0"                "0"                     "0"
   b6                        b7                 b8             b9
p1 "p.precipSnow/Sleet/Hail" "p.windNone/Light" "p.windStrong" "p.airtemp"
p2 "p.precipSnow/Sleet/Hail" "p.windNone/Light" "p.windStrong" "p.airtemp"
p3 "0"                       "0"                "0"            "0"
Output:
                                             est           se
p1_B1.Int                          -1.640442e+00 6.329470e-01
p3_B2                               7.094350e-01 6.606490e-01
p1_B1.Int.p.cloudsOvercast          2.535100e-01 8.406270e-01
p1_B1.Int.p.cloudsPartly Cloudy     6.360650e-01 4.939090e-01
p1_B1.Int.p.precipRain             -3.255465e+06 2.991532e+11
p1_B1.Int.p.precipSnow/Sleet/Hail  -9.832576e+05 9.035419e+10
p1_B1.Int.p.windNone/Light          1.270829e+00 5.908420e-01
p1_B1.Int.p.windStrong              2.209185e+00 1.367805e+00
p1_B1.Int.p.airtemp                 1.400000e-05 1.500000e-04
What we don't understand:
1) Why don't B4 and higher (eDNA model) and B3 and higher (weather model) appear by number in the beta list?
2) Why does everything after p3_B2 look like an interaction term when we didn't include any interaction terms when defining the model (only additive)?
3) Why does pH show up in all subsequent beta names in the eDNA covariate model?
4) Why does pH show up in all subsequent betas, but the same thing doesn't happen to p.cloudsOvercast in the weather model? We ask because if we change the order of the variables in the eDNA model, the first variable listed after method is the one that ends up appearing in the names of subsequent betas.
Based on my reading about design matrices, each row of the matrix is essentially equivalent to a linear equation. So my understanding of the beta values in the eDNA covariate model is that each row means:
p1 (probability of detection in survey 1, on the logit scale) = B1 + 0*(all other Bs)
p2 (probability of detection in survey 2) = identical to p1, and I believe RPresence doesn't list things when they are identical to a previous row, which is why p2 doesn't appear in the beta lists?
p3 (probability of detection in survey 3, on the logit scale) = B1 + B2*(1 = yes, eDNA) + B3*pH + B4*volume + B5*conductivity + B6*salinity
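Here is the same row-by-row reading as a minimal R sketch. It assumes a logit link and that the six estimates map positionally onto columns b1 to b6, which is exactly the part we are unsure about. The covariate values are made up for illustration and are not from our data.

```r
# Beta estimates copied from the eDNA covariate output, assumed to map
# positionally onto design-matrix columns b1..b6 (our reading, not confirmed).
beta <- c(b1 = -0.230920, b2 = 7.063321, b3 = -0.803916,
          b4 = -0.000166, b5 = -0.004731, b6 = 0.809555)

# Hypothetical covariate values for one eDNA sample (made up for illustration).
covs <- c(pH = 7.2, volume = 500, conduct = 300, salinity = 0.2)

# Design-matrix rows: surveys 1 and 2 use only the intercept; survey 3 adds
# the method indicator and the four sample covariates.
row_p1 <- c(1, 0, 0, 0, 0, 0)
row_p3 <- unname(c(1, 1, covs))

# Assumption: logit link, so plogis() converts the linear predictor to a probability.
p1 <- plogis(sum(row_p1 * beta))
p3 <- plogis(sum(row_p3 * beta))
print(c(p1 = p1, p3 = p3))
```

If the names in the beta output really are just labels, this positional mapping is all that matters; if they indicate actual interaction terms, the survey 3 row above would be wrong.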
My advisor agrees with this interpretation, so we're trying to understand how the Beta names output in model_object$beta$p line up with B1 - B9 in the design matrix, and why it looks like there are interaction terms involved when we didn't specify any. The number of rows in the beta output matches the number of columns in the design matrix, so we're wondering if this is just some naming artifact that we can ignore or if it is actually an indication of issues in the underlying code.
Thanks!