polite <- read_csv("data/winter2012/polite.csv")
polite_filt <- polite |>
drop_na(f0mn)35 Regression models: interactions

The previous chapter explained how to include two predictors in a regression model and how to interpret the results of the model. But we were left wondering: what if the effect of attitude differs between the two genders? In this chapter we will talk about interactions: regression models can estimated different effects of one predictor depending on the levels of the other other predictor, by adding a so-called interaction term. Keep reading to learn about them.
35.1 Is the effect of attitude modulated by gender?
Let’s start by reading and filtering the polite data (Winter and Grawunder 2012). We drop NA values from the f0mn column.
Figure 35.1 shows once again the mean f0 values split by gender and attitude. In principle, we have no reason to expect that the effect of attitude will be the same in both genders, and conversely we have no reason to expect the opposite (that it will be different). The best way forward in regression modelling is to allow the model to estimate the (possible) difference: depending on the results, you will be able to report the estimated difference and reason as to whether the difference is meaningful or not.
Code
polite_filt |>
ggplot(aes(gender, f0mn, colour = attitude)) +
geom_jitter(alpha = 0.7, position = position_jitterdodge(jitter.width = 0.1, seed = 2836))
35.2 Interaction terms
To allow a regression model to estimate “interactions” between predictors, or in other words to estimate different effects of one predictors based on the levels of the other, we need to include so-called interaction terms. An interaction term, is a model term that tells the model to estimate interactions. Before we see how to include interaction terms in R/brms, let’s go through the model’s formula.
We will model mean f0 f0mn as a function of gender and attitude and an interaction between the two. Here is what the formula looks like:
\[ \begin{aligned} f0mn_i & \sim Gaussian(\mu_i, \sigma)\\ \mu_i & = \quad \beta_0 \\ & \quad + \beta_1 \cdot gen_{\text{M}[i]} \\ & \quad + \beta_2 \cdot att_{\text{POL}[i]} \\ & \quad + \beta_3 \cdot gen_{\text{M}[i]} \cdot att_{\text{POL}[i]} \end{aligned} \]
There are now not just three \(\beta\) coefficients, but four: \(\beta_0, \beta_1, \beta_2, \beta_3\). The new coefficient that we haven’t seen before is \(\beta_3\). This is the coefficient of the interaction term of gender and attitude. Let’s work out the formula of \(\mu\) depending on the combinations gender and attitude: remember that \(gen_\text{M}\) is 0 for female and 1 for male, and \(att_\text{POL}\) is 0 for informal and 1 for polite.
| female and informal | \(\beta_0\) |
| male and informal | \(\beta_0 + \beta_1\) |
| female and polite | \(\beta_0 + \beta_2\) |
| male and polite | \(\beta_0 + \beta_1 + \beta_2 + \beta_3\) |
We get this because when either \(gen_\text{M}\) or \(att_\text{POL}\) are 0, the corresponding \(\beta\) coefficient gets dropped from the equation. Importantly, notice how for male/polite we need all four coefficients. For male and polite mean f0, we need the coefficient for male \(\beta_1\) and that for polite \(\beta_2\), but also the interaction coefficient \(\beta_3\) because in this combination \(gen_\text{M}\) and \(att_\text{POL}\) are both 1. You can thing of \(\beta_3\) as the “adjustment” to the individual effects of gender = male and attitude = polite. In other words, the interaction coefficient \(\beta_3\) tells you how the combined effects of gender = male and attitude = polite differ from their simple sum (i.e. \(\beta_1 + \beta_2\)).
Moving on to R, we can include an interaction term between gender and attitude in the R formula with gender:attitude (the two predictors separated by a colon :).
f0_bm_int <- brm(
f0mn ~ gender + attitude + gender:attitude,
family = gaussian,
data = polite,
cores = 4,
seed = 7123,
file = "cache/ch-regression-interaction_f0_bm_int"
)Note that there is a syntactic sugar for interactions: instead of writing in full gender + attitude + gender:attitude you can just write gender * attitude (the two predictors separated by a star *). The “star” syntax expands to the full syntax. It is up to you which syntax you use, although for beginners I suggest using the longer one for clarity. Run the model now.
Getting a regression model to estimate interactions is easy, but a warning: because we have included an interaction in the model, the interpretation of the coefficients has changed slightly relative to how we interpret the coefficients of categorical predictors in models without interactions!
35.3 Interpreting models with interactions
summary(f0_bm_int) Family: gaussian
Links: mu = identity; sigma = identity
Formula: f0mn ~ gender + attitude + gender:attitude
Data: polite (Number of observations: 212)
Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
total post-warmup draws = 4000
Regression Coefficients:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept 256.56 5.20 246.38 267.27 1.00 2632 2950
genderM -119.38 7.66 -134.08 -104.50 1.00 2532 2665
attitudepol -17.48 7.28 -31.76 -3.18 1.00 2573 2806
genderM:attitudepol 6.65 10.67 -14.20 27.47 1.00 2191 2693
Further Distributional Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma 39.10 1.94 35.52 43.14 1.00 3834 2779
Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
conditional_effects(f0_bm_int, "gender:attitude")
f0_bm_int_draws <- as_draws_df(f0_bm_int)f0_bm_int_draws <- f0_bm_int_draws |>
mutate(
f_inf = b_Intercept,
f_pol = b_Intercept + b_attitudepol,
m_inf = b_Intercept + b_genderM,
m_pol = b_Intercept + b_genderM + b_attitudepol + `b_genderM:attitudepol`
)library(posterior)This is posterior version 1.6.1
Attaching package: 'posterior'
The following objects are masked from 'package:stats':
mad, sd, var
The following objects are masked from 'package:base':
%in%, match
f0_bm_int_draws |>
mutate(
m_pol_inf = m_pol - m_inf
) |>
summarise(
mean_diff = mean(m_pol_inf), sd_diff = sd(m_pol_inf),
lo_diff = quantile2(m_pol_inf, probs = 0.025), hi_diff = quantile2(m_pol_inf, probs = 0.975)
) |>
round()