Introduction to Bayesian regression models

04 - Categorical predictors

Stefano Coretta

University of Edinburgh

MALD again

Figure 1: RTs by lexical status.

MALD: IsWord

Does the lexical status of the target word (real or nonce) affect reaction times?

Categorical predictors: treatment contrasts

brm_cat <- brm(
  RT ~ 1 + PhonLev + IsWord,
  data = mald,
  family = lognormal,
  cores = 4,
  seed = 9812,
  file = "data/cache/brm_cat.rds"
)

brm_cat
 Family: lognormal 
  Links: mu = identity; sigma = identity 
Formula: RT ~ 1 + PhonLev + IsWord 
   Data: mald (Number of observations: 3000) 
  Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup draws = 4000

Regression Coefficients:
            Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept       6.66      0.03     6.61     6.72 1.00     6091     3254
PhonLev         0.03      0.00     0.02     0.03 1.00     6579     3225
IsWordFALSE     0.12      0.01     0.10     0.14 1.00     4237     2689

Further Distributional Parameters:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma     0.28      0.00     0.27     0.29 1.00     3265     2928

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

Interpret treatment contrasts

Regression Coefficients:
            Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept       6.66      0.03     6.61     6.72 1.00     6091     3254
PhonLev         0.03      0.00     0.02     0.03 1.00     6579     3225
IsWordFALSE     0.12      0.01     0.10     0.14 1.00     4237     2689
  • Intercept: mean RT when PhonLev is 0 and IsWord = TRUE.

  • PhonLev: change in mean RT for each unit increase in PhonLev (average across IsWord).

  • IsWordFalse: difference in mean RT between IsWord = FALSE and TRUE (when PhonLev = 0).

So, does the lexical status of the word affect RTs?

Indexing categorical predictors

\[ \begin{align} RT_i & \sim Lognormal(\mu_i, \sigma)\\ log(\mu_i) & = \beta_1 \cdot W_{\text{T}[i]} + \beta_2 \cdot W_{\text{F}[i]} + \beta_3 \cdot PL_i\\ \end{align} \]

  • \(\beta_1\): mean RT when IsWord = TRUE (\(W_{T}\)).

  • \(\beta_1\): mean RT when IsWord = FALSE (\(W_{F}\)).

  • \(\beta_3\): change in RT for each unit increase of PhonLev (\(PL\)).

Indexing with intercept suppression

brm_ind <- brm(
  RT ~ 0 + IsWord + PhonLev,
  data = mald,
  family = lognormal,
  cores = 4,
  seed = 6293,
  file = "data/cache/brm_ind.rds"
)

brm_ind
 Family: lognormal 
  Links: mu = identity; sigma = identity 
Formula: RT ~ 0 + IsWord + PhonLev 
   Data: mald (Number of observations: 3000) 
  Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup draws = 4000

Regression Coefficients:
            Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
IsWordTRUE      6.66      0.03     6.61     6.72 1.00     1356     1375
IsWordFALSE     6.78      0.03     6.72     6.84 1.00     1335     1369
PhonLev         0.03      0.00     0.02     0.03 1.00     1335     1430

Further Distributional Parameters:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma     0.28      0.00     0.27     0.29 1.00     1695     1735

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

RTs by lexical status (at mean PhonLev)

conditional_effects(brm_ind, "IsWord")

RTs by lexical status (PhonLev = 0)

conditional_effects(brm_ind, "IsWord", conditions = list(PhonLev = 0))

RTs by lexical status (PhonLev = 5)

conditional_effects(brm_ind, "IsWord", conditions = list(PhonLev = 5))

RTs by lexical status (PhonLev = 10)

conditional_effects(brm_ind, "IsWord", conditions = list(PhonLev = 10))

RTs by lexical status (PhonLev = 14)

conditional_effects(brm_ind, "IsWord", conditions = list(PhonLev = 14))

Calculate RT difference from the MCMC draws: calculate RTs

brm_ind_draws <- as_draws_df(brm_ind) |> 
  mutate(
    # RT when PhonLev = 5 (min PhonLev)
    rt_true_5 = exp(b_IsWordTRUE + b_PhonLev * 5),
    rt_false_5 = exp(b_IsWordFALSE + b_PhonLev * 5),
    # RT when PhonLev = 7 (mean PhonLev)
    rt_true_7 = exp(b_IsWordTRUE + b_PhonLev * 7),
    rt_false_7 = exp(b_IsWordFALSE + b_PhonLev * 7),
    # RT when PhonLev = 14 (max PhonLev)
    rt_true_14 = exp(b_IsWordTRUE + b_PhonLev * 14),
    rt_false_14 = exp(b_IsWordFALSE + b_PhonLev * 14),
  )

Calculate RT difference from the MCMC draws: get difference

brm_ind_draws <- brm_ind_draws |> 
  mutate(
    rt_05_diff = rt_false_5 - rt_true_5,
    rt_07_diff = rt_false_7 - rt_true_7,
    rt_14_diff = rt_false_14 - rt_true_14,
  )

brm_ind_draws |> select(rt_05_diff:rt_14_diff)
# A tibble: 4,000 × 3
   rt_05_diff rt_07_diff rt_14_diff
        <dbl>      <dbl>      <dbl>
 1      105.        110.       128.
 2      109.        114.       131.
 3      108.        112.       128.
 4      124.        128.       145.
 5      110.        116.       139.
 6      119.        125.       149.
 7       98.4       104.       129.
 8      116.        123.       152.
 9      114.        121.       150.
10      110.        117.       146.
# ℹ 3,990 more rows

Calculate CrIs of difference

library(posterior)

brm_ind_draws |> 
  select(rt_05_diff:rt_14_diff) |> 
  pivot_longer(everything()) |> 
  group_by(name) |> 
  summarise(
    # Use quantile2() from posterior package
    lo_95 = round(quantile2(value, 0.025)),
    hi_95 = round(quantile2(value, 0.0975))
  )
# A tibble: 3 × 3
  name       lo_95 hi_95
  <chr>      <dbl> <dbl>
1 rt_05_diff    93   100
2 rt_07_diff    98   105
3 rt_14_diff   118   126

MALD: IsWord

Does the lexical status of the target word (real or nonce) affect reaction times?

Yes.

But…

Figure 2: RTs by lexical status and mean phone-level distance.