www.phidot.org

by **kdavis79** » Wed Oct 16, 2024 3:32 pm

Hello all,

My advisor and I are trying to understand the names assigned to the beta coefficients when we run a single-season occupancy model in RPresence. I've read up on design matrices and my advisor is familiar with them, but we don't understand the naming conventions that lead to this output.

For comparison, we've run three models:
Method Only Model: psi ~ 1, p ~ method
method = a factor of 1 and 2 indicating visual surveys vs eDNA

eDNA Covariate Model: psi~1, p ~ method + pH + volume + conductivity + salinity
all variables are numerical except method (same as above). These variables only apply to survey 3 (eDNA)

Weather Model: psi ~ 1, p ~ method + clouds + precipitation + wind + airtemp
Method, clouds, precip, and wind are categorical
Airtemp is numerical

The design matrix for the method only model looks like this:
b1 b2
p1 "1" "0"
p2 "1" "0"
p3 "1" "1"
And the beta output looks like this:
est se
p1_B1.Int -0.313603 0.256614
p3_B2 -0.544725 0.329389

Based on our understanding, b1 is thus the intercept and b2 is a slope modifier for eDNA detection, i.e.,
p = B1 + B2(eDNA or no)

This makes sense to us, but when we add in other variables, we get confused.

Here's the design matrix for the eDNA covariate model:
b1 b2 b3 b4 b5 b6
p1 "1" "0" "0" "0" "0" "0"
p2 "1" "0" "0" "0" "0" "0"
p3 "1" "1" "p.pH" "p.volume" "p.conduct" "p.salinity"
And the beta output:
est se
p1_B1.Int -0.230920 0.241361
p3_B2 7.063321 0.378967
p3_B3.p.pH -0.803916 0.000207
p3_B3.p.pH.p.volume -0.000166 0.000156
p3_B3.p.pH.p.conduct -0.004731 0.000114
p3_B3.p.pH.p.salinity 0.809555 0.000208

And for the weather model:
b1 b2 b3 b4 b5
p1 "1" "0" "p.cloudsOvercast" "p.cloudsPartly Cloudy" "p.precipRain"
p2 "1" "0" "p.cloudsOvercast" "p.cloudsPartly Cloudy" "p.precipRain"
p3 "1" "1" "0" "0" "0"
b6 b7 b8 b9
p1 "p.precipSnow/Sleet/Hail" "p.windNone/Light" "p.windStrong" "p.airtemp"
p2 "p.precipSnow/Sleet/Hail" "p.windNone/Light" "p.windStrong" "p.airtemp"
p3 "0" "0" "0" "0"
Output:
est se
p1_B1.Int -1.640442e+00 6.329470e-01
p3_B2 7.094350e-01 6.606490e-01
p1_B1.Int.p.cloudsOvercast 2.535100e-01 8.406270e-01
p1_B1.Int.p.cloudsPartly Cloudy 6.360650e-01 4.939090e-01
p1_B1.Int.p.precipRain -3.255465e+06 2.991532e+11
p1_B1.Int.p.precipSnow/Sleet/Hail -9.832576e+05 9.035419e+10
p1_B1.Int.p.windNone/Light 1.270829e+00 5.908420e-01
p1_B1.Int.p.windStrong 2.209185e+00 1.367805e+00
p1_B1.Int.p.airtemp 1.400000e-05 1.500000e-04

What we don't understand:
1) Why don't B4 and higher (eDNA covariate model) and B3 and higher (weather model) appear in the beta list?
2) Why does everything after p3_B2 look like an interaction term when we didn't include any interaction terms when defining the model (only additive)?
3) Why does pH show up in all subsequent beta names in the eDNA covariate model?
4) Why does pH show up in all subsequent betas, but the same thing doesn't happen to p.cloudsOvercast in the weather model? The reason we ask this is because if we change the order of the variables in the eDNA model, the first variable listed after method is the one that ends up appearing in the names of subsequent betas.

Based on my reading about design matrices, each row of the matrix is essentially equivalent to a linear equation. So my understanding of the beta values in the eDNA covariate model is that each row means:
p1 (probability of detection in survey 1) = B1 + 0*(all other Bs)
p2 (probability of detection in survey 2) = identical to p1, and I believe RPresence doesn't list things when they are identical to a previous row, which is why p2 doesn't appear in the beta lists?
p3 (probability of detection in survey 3) = B1 + B2*(1 = yes, eDNA) + B3 (pH) + B4 (volume) + B5 (conductivity) + B6 (salinity)

My advisor agrees with this interpretation, so we're trying to understand how the Beta names output in model_object$beta$p line up with B1 - B9 in the design matrix, and why it looks like there are interaction terms involved when we didn't specify any. The number of rows in the beta output matches the number of columns in the design matrix, so we're wondering if this is just some naming artifact that we can ignore or if it is actually an indication of issues in the underlying code.

Thanks!

by **jhines** » Wed Oct 16, 2024 7:39 pm

Hi,

You're correct on the method-only model as b1 is the intercept and b2 is the effect of the covariate (method). So, p1=p2=inv-logit(b1) and p3=inv-logit(b1+b2). B2 is the difference (on logit scale) between detection for visual surveys and eDNA. From the estimates, it appears detection for eDNA is lower (b2=-.545) than detection for visual surveys.
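
As a quick check of the arithmetic above, here is a minimal Python sketch that back-transforms the method-only estimates to the probability scale. The estimates are copied from the beta output in the question; the function and variable names are only illustrative and are not part of RPresence.

```python
import math

def inv_logit(x: float) -> float:
    """Map a logit-scale value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

# Estimates copied from the method-only beta output in the question.
b1 = -0.313603  # intercept (p1_B1.Int)
b2 = -0.544725  # eDNA effect on the logit scale (p3_B2)

p_visual = inv_logit(b1)     # surveys 1 and 2: design-matrix row "1", "0"
p_edna = inv_logit(b1 + b2)  # survey 3: design-matrix row "1", "1"

print(f"visual: {p_visual:.3f}, eDNA: {p_edna:.3f}")
```

With these estimates, detection comes out at roughly 0.42 for visual surveys and roughly 0.30 for eDNA, which matches the sign of b2.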

I'm not sure why the covariate names show pH for all of the later p3 betas in the eDNA covariate model. I'll look into it, but I suspect it has something to do with the way R does partial matching of strings. You could try naming the covariates without the parameter names attached (e.g., name covariates "pH", "volume", "conduct", "salinity" instead of "p.pH", "p.volume", ...) to see if that prevents the partial matching of names.

The algorithm I use to name covariates looks at each design matrix column, finds the first cell that is neither zero nor one, and appends that covariate name to the beta name. Since it finds the covariate names in row 3 of the eDNA covariate model, the beta parameters B3, B4, B5 and B6 get names starting with "p3". In the weather model, covariate names appear in the first row, so the B3-B9 beta names start with "p1". The algorithm seemed to work well in most cases, but not in this case. If you'd like to send me (jhines@usgs.gov) your script and data, I'd love to update the algorithm.
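
To make the naming rule described above concrete, here is a rough Python sketch of it. This is not the RPresence source: `name_betas` is a hypothetical helper, the fallback for columns containing only 0/1 is an assumption, and the ".Int" suffix that RPresence adds to the intercept is left out. It shows the names this rule would give for the eDNA covariate design matrix if each column were named independently.

```python
def name_betas(design, row_names):
    """Name each design-matrix column from the first row that uses it.

    Hypothetical helper, not RPresence code. `design` is a list of rows,
    each a list of cell strings ("0", "1", or a covariate name).
    """
    names = []
    for j in range(len(design[0])):
        column = [row[j] for row in design]
        # Prefer the first cell holding a covariate name (neither "0" nor "1").
        hit = next((i for i, c in enumerate(column) if c not in ("0", "1")), None)
        if hit is None:
            # Assumption: for pure 0/1 columns, use the first non-zero cell.
            hit = next((i for i, c in enumerate(column) if c != "0"), 0)
        name = f"{row_names[hit]}_B{j + 1}"
        if column[hit] not in ("0", "1"):
            name += "." + column[hit]  # append the covariate name
        names.append(name)
    return names

# eDNA covariate design matrix, copied from the question.
edna_design = [
    ["1", "0", "0", "0", "0", "0"],
    ["1", "0", "0", "0", "0", "0"],
    ["1", "1", "p.pH", "p.volume", "p.conduct", "p.salinity"],
]
edna_names = name_betas(edna_design, ["p1", "p2", "p3"])
print(edna_names)
```

Under this rule each column gets exactly one covariate name (for example "p3_B4.p.volume"), so the repeated "p.pH" in the reported names is a labelling artifact and does not indicate an interaction term.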

Your reading of the design matrices is correct and that is the best way to interpret the betas (not the beta names). The beta names appear to be confounded in this case, as you don't have any interaction terms.
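
Reading each design-matrix row as a linear equation on the logit scale can be sketched in Python as follows. The betas are matched to columns by position (B1 to B6), not by their printed names. The covariate values below are made up purely for illustration and are not from the poster's data; `detection_prob` is a hypothetical helper.

```python
import math

# Estimates from the eDNA covariate beta output, in design-matrix column order.
betas = [-0.230920, 7.063321, -0.803916, -0.000166, -0.004731, 0.809555]

# Hypothetical covariate values for one eDNA sample (NOT from the real data).
covs = {"p.pH": 7.0, "p.volume": 500.0, "p.conduct": 300.0, "p.salinity": 0.2}

# Design-matrix rows copied from the question.
rows = {
    "p1": ["1", "0", "0", "0", "0", "0"],
    "p2": ["1", "0", "0", "0", "0", "0"],
    "p3": ["1", "1", "p.pH", "p.volume", "p.conduct", "p.salinity"],
}

def detection_prob(row, betas, covs):
    """Evaluate one design-matrix row: inv-logit of sum(beta_j * x_j)."""
    # Cells are either a covariate name (look up its value) or "0"/"1".
    x = [covs[cell] if cell in covs else float(cell) for cell in row]
    eta = sum(b * xi for b, xi in zip(betas, x))  # linear predictor (logit scale)
    return 1.0 / (1.0 + math.exp(-eta))

probs = {name: detection_prob(row, betas, covs) for name, row in rows.items()}
for name, p in probs.items():
    print(f"{name}: {p:.3f}")
```

Rows p1 and p2 use only B1, so they give the same probability; row p3 adds B2 plus each covariate value multiplied by its own beta.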

Jim

Confused by Beta Names in RPresence
