class: center, middle, inverse, title-slide

.title[
# Statistics and Quantitative Methods (S2)
]
.subtitle[
## Week 8 - Multiple predictors and interactions
]
.author[
### Elizabeth Pankratz
]
.institute[
### University of Edinburgh
]
.date[
### 2023/03/14
]

---
class: center middle reverse

# Please fill out today's attendance form:

`https://forms.office.com/e/A0X816LnUp`

![:scale 30%](imgs/qr.png)

<!-- .bg-washed-green.b--dark-green.ba.bw2.br3.shadow-5.ph4.mt2[ -->
<!-- We will focus on **accuracy** (correct identification of real word: **correct/incorrect**) for L1 participants. ]-->

---
layout: false
layout: true

## Summary from last time

---

.bg-washed-blue.b--dark-blue.ba.bw2.br3.shadow-5.ph4.mt2[
**(1)** What are **binary outcomes?**

- Outcomes with two discrete levels (e.g., yes/no, grammatical/ungrammatical, correct/incorrect).
]

.bg-washed-blue.b--dark-blue.ba.bw2.br3.shadow-5.ph4.mt2[
**(2)** How do we **visualise** them?

- As proportions.
]

.bg-washed-blue.b--dark-blue.ba.bw2.br3.shadow-5.ph4.mt2[
**(3)** What **distribution** do binary outcomes follow?

- A Bernoulli distribution with one parameter, `\(p\)`, the probability of success.
- In brms: `family = bernoulli()`.
]

---

.bg-washed-blue.b--dark-blue.ba.bw2.br3.shadow-5.ph4.mt2[
**(4)** How do we fit a model requiring a **continuous outcome space** using binary data?

- We model the binary outcomes as being generated based on the probability of success `\(p\)`.
- We transform this probability into log-odds, which is continuous.
]

.bg-washed-blue.b--dark-blue.ba.bw2.br3.shadow-5.ph4.mt2[
**(5)** How do we **interpret** the estimates of such a model?

- The model's estimates are in log-odds space.
- To interpret them, we convert back to probability space using `plogis()`.
]

.bg-washed-blue.b--dark-blue.ba.bw2.br3.shadow-5.ph4.mt2[
**(6)** What are **credible intervals**?

- A credible interval is the region between two quantiles (e.g., between the 2.5% and 97.5% quantiles).
- For Bayesian modelling, it is the region in which, with some probability (usually 95%), the true value lies.
]

---
layout:false

## Five learning objectives for today

--

.bg-washed-blue.b--dark-blue.ba.bw2.br3.shadow-5.ph4.mt2[
**(1)** What is a **factorial design?**
]

--

.bg-washed-blue.b--dark-blue.ba.bw2.br3.shadow-5.ph4.mt2[
**(2)** How do we estimate and interpret the effects of **multiple predictors?**
]

--

.bg-washed-blue.b--dark-blue.ba.bw2.br3.shadow-5.ph4.mt2[
**(3)** How do we deal with situations when **one predictor's effect is different, depending on the value of the other predictor?**
]

--

.bg-washed-blue.b--dark-blue.ba.bw2.br3.shadow-5.ph4.mt2[
**(4)** How can such interactions between predictors be **built into our models?**
]

--

.bg-washed-blue.b--dark-blue.ba.bw2.br3.shadow-5.ph4.mt2[
**(5)** How do we **interpret model estimates** of interactions?
]

---

## What is a factorial design?

<br>

.pull-left[
.center[
.pull-left[![:scale 80%](imgs/tree_a.pdf)]
.pull-right[![:scale 80%](imgs/tree_b.pdf)]
]
]

--

.pull-right[
<br>
.center[
.f4[

|            | B = B1 | B = B2 |
|----------- |------- |------- |
| **A = A1** | A1, B1 | A1, B2 |
| **A = A2** | A2, B1 | A2, B2 |

]]]

--

<br><br>

.bg-washed-blue.b--dark-blue.ba.bw2.br3.shadow-5.ph4.mt2[
How can we analyse **both predictors** at the same time?
]

---

## We've seen multiple predictors before!
<br>
<br>
<br>
<br>
<br>

Last week's model with dummy variables:

<br>

.f3[
`$$logit(p) = \beta_0 + (\beta_1 \cdot relation_{ncons}) + (\beta_2 \cdot relation_{cons})$$`
]

---
layout:false
layout: true

## Running example: Morphological processing

---

.bg-washed-blue.b--dark-blue.ba.bw2.br3.shadow-5.ph4.mt2[
**Lexical decision task:** Is the string you see a word in English or not?

- Outcome variable: **Accuracy** (incorrect/correct)

- Predictors:
  - **Relation_type** (Unrelated/Constituent)
  - **Branching** (Left (*[unripe]-ness*) / Right (*un-[graceful]*))
]

<br>

--

|                                 | Branching = Left  | Branching = Right  |
|-------------------------------- |------------------ |------------------- |
| **Relation_type = Unrelated**   | Unrelated, Left   | Unrelated, Right   |
| **Relation_type = Constituent** | Constituent, Left | Constituent, Right |

---


```r
shallow <- shallow %>%
  filter(Relation_type != 'NonConstituent') %>%
  mutate(
    Relation_type = factor(Relation_type, levels = c('Unrelated', 'Constituent')),
    Branching = factor(Branching, levels = c('Left', 'Right')),
    Accuracy = factor(Accuracy, levels = c('incorrect', 'correct'))
  )
```

---
layout:false

## Visualising `Relation_type` and `Branching`

<img src="index_files/figure-html/barplot-shallow-1.png" width="60%" style="display: block; margin: auto;" />

---
layout:false
layout:true

## Setting up `Relation_type` and `Branching`

---

<br>

|                                 | Branching = Left  | Branching = Right  |
|-------------------------------- |------------------ |------------------- |
| **Relation_type = Unrelated**   | Unrelated, Left   | Unrelated, Right   |
| **Relation_type = Constituent** | Constituent, Left | Constituent, Right |

<br>

--

| Relation_type | Branching |
|-------------- |---------- |
| Unrelated     | Left      |
| Unrelated     | Right     |
| Constituent   | Left      |
| Constituent   | Right     |

---

<!-- my condolences to anyone trying to read this -->

| When: | Then coded as: |
| --- | --- |
| <table> <thead> <tr> <th>Relation_type</th> <th>Branching</th> </tr> </thead> <tbody> <tr> <td>Unrelated</td> <td>Left</td> </tr> <tr> <td>Unrelated</td> <td>Right</td> </tr> <tr> <td>Constituent</td> <td>Left</td> </tr> <tr> <td>Constituent</td> <td>Right</td> </tr> </tbody> </table> | <table> <thead> <tr> <th>Relation_type</th> <th>Branching</th> </tr> </thead> <tbody> <tr> <td>0</td> <td>0</td> </tr> <tr> <td>0</td> <td>1</td> </tr> <tr> <td>1</td> <td>0</td> </tr> <tr> <td>1</td> <td>1</td> </tr> </tbody> </table> |

<!-- Here's how those two nested tables look in markdown -->
<!-- | Relation_type | Branching | -->
<!-- |--------------- |----------- | -->
<!-- | Unrelated | Left | -->
<!-- | Unrelated | Right | -->
<!-- | Constituent | Left | -->
<!-- | Constituent | Right | -->

<!-- | Relation_type | Branching | -->
<!-- |--------------- |----------- | -->
<!-- | 0 | 0 | -->
<!-- | 0 | 1 | -->
<!-- | 1 | 0 | -->
<!-- | 1 | 1 | -->

<br>

--

Let's verify that this coding is what R will use:

.pull-left[

```r
contrasts(shallow$Relation_type)
```

```
##             Constituent
## Unrelated             0
## Constituent           1
```
]

.pull-right[

```r
contrasts(shallow$Branching)
```

```
##       Right
## Left      0
## Right     1
```
]

---
layout: false

## What will the `\(\beta\)`s mean now?
| When: | Then coded as: | | --- | --- | | <table> <thead> <tr> <th>Relation_type</th> <th>Branching</th> </tr> </thead> <tbody> <tr> <td>Unrelated</td> <td>Left</td> </tr> <tr> <td>Unrelated</td> <td>Right</td> </tr> <tr> <td>Constituent</td> <td>Left</td> </tr> <tr> <td>Constituent</td> <td>Right</td> </tr> </tbody> </table> | <table> <thead> <tr> <th>Relation_type</th> <th>Branching</th> </tr> </thead> <tbody> <tr> <td>0</td> <td>0</td> </tr> <tr> <td>0</td> <td>1</td> </tr> <tr> <td>1</td> <td>0</td> </tr> <tr> <td>1</td> <td>1</td> </tr> </tbody> </table> | <br> `$$logit(p) = \beta_0 + (\beta_1 \cdot relation) + (\beta_2 \cdot branch)$$` <br> -- $$ `\begin{aligned} \text{Unrelated, Left:} & & \beta_0 &+ (\beta_1 \cdot 0) + (\beta_2 \cdot 0) &&= \beta_0 &\\ \text{Unrelated, Right:} & & \beta_0 &+ (\beta_1 \cdot 0) + (\beta_2 \cdot 1) &&= \beta_0 + \beta_2 &\\ \text{Constituent, Left:} & & \beta_0 &+ (\beta_1 \cdot 1) + (\beta_2 \cdot 0) &&= \beta_0 + \beta_1 &\\ \text{Constituent, Right:} & & \beta_0 &+ (\beta_1 \cdot 1) + (\beta_2 \cdot 1) &&= \beta_0 + \beta_1 + \beta_2 &\\ \end{aligned}` $$ --- ## The model we'll fit <br> <br> .f3[ $$ `\begin{aligned} \text{acc} & \sim Bernoulli(p) \\ logit(p) & = \beta_0 + \beta_1 \cdot relation + \beta_2 \cdot branch \\ \beta_0 & \sim Gaussian(\mu_0, \sigma_0) \\ \beta_1 & \sim Gaussian(\mu_1, \sigma_1) \\ \beta_2 & \sim Gaussian(\mu_2, \sigma_2) \\ \end{aligned}` $$ ] --- ```r acc_mult_bm <- brm( Accuracy ~ Relation_type + Branching, family = bernoulli(), data = shallow, backend = 'cmdstanr', file = 'data/cache/acc_mult_bm' ) ``` -- ```r summary(acc_mult_bm) ``` ``` ## Family: bernoulli ## Links: mu = logit ## Formula: Accuracy ~ Relation_type + Branching ## Data: shallow (Number of observations: 692) ## Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1; ## total post-warmup draws = 4000 ## ## Population-Level Effects: ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## Intercept 1.08 0.17 0.75 1.43 1.00 4499 3212 ## Relation_typeConstituent 0.69 0.25 0.20 1.17 1.00 2877 2567 ## BranchingRight 1.44 0.27 0.91 1.99 1.00 2465 2441 ## ## Draws were sampled using sample(hmc). For each parameter, Bulk_ESS ## and Tail_ESS are effective sample size measures, and Rhat is the potential ## scale reduction factor on split chains (at convergence, Rhat = 1). 
``` --- layout:true ## Model estimates: `Intercept` --- ``` ## Population-Level Effects: ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## Intercept 1.08 0.17 0.75 1.43 1.00 4499 3212 ## Relation_typeConstituent 0.69 0.25 0.20 1.17 1.00 2877 2567 ## BranchingRight 1.44 0.27 0.91 1.99 1.00 2465 2441 ``` -- <br> .f4[ $$ `\begin{aligned} \text{acc} & \sim Bernoulli(p) \\ logit(p) & = \beta_0 + \beta_1 \cdot relation + \beta_2 \cdot branch \\ \beta_0 & \sim Gaussian(\mu_0, \sigma_0) \\ \beta_1 & \sim Gaussian(\mu_1, \sigma_1) \\ \beta_2 & \sim Gaussian(\mu_2, \sigma_2) \\ \end{aligned}` $$ ] --- ``` ## Population-Level Effects: ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## Intercept 1.08 0.17 0.75 1.43 1.00 4499 3212 ## Relation_typeConstituent 0.69 0.25 0.20 1.17 1.00 2877 2567 ## BranchingRight 1.44 0.27 0.91 1.99 1.00 2465 2441 ``` <br> .f4[ $$ `\begin{aligned} \text{acc} & \sim Bernoulli(p) \\ logit(p) & = \beta_0 + \beta_1 \cdot relation + \beta_2 \cdot branch \\ \beta_0 & \sim Gaussian(1.08, 0.17) \\ \beta_1 & \sim Gaussian(\mu_1, \sigma_1) \\ \beta_2 & \sim Gaussian(\mu_2, \sigma_2) \\ \end{aligned}` $$ ] --- layout:false ## Model estimates: `Relation_type` ``` ## Population-Level Effects: ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## Intercept 1.08 0.17 0.75 1.43 1.00 4499 3212 ## Relation_typeConstituent 0.69 0.25 0.20 1.17 1.00 2877 2567 ## BranchingRight 1.44 0.27 0.91 1.99 1.00 2465 2441 ``` <br> .f4[ $$ `\begin{aligned} \text{acc} & \sim Bernoulli(p) \\ logit(p) & = \beta_0 + \beta_1 \cdot relation + \beta_2 \cdot branch \\ \beta_0 & \sim Gaussian(1.08, 0.17) \\ \beta_1 & \sim Gaussian(0.69, 0.25) \\ \beta_2 & \sim Gaussian(\mu_2, \sigma_2) \\ \end{aligned}` $$ ] --- ## Model estimates: `Branching` ``` ## Population-Level Effects: ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## Intercept 1.08 0.17 0.75 1.43 1.00 4499 3212 ## Relation_typeConstituent 0.69 0.25 0.20 1.17 1.00 2877 2567 ## BranchingRight 1.44 0.27 0.91 1.99 1.00 2465 2441 ``` <br> .f4[ $$ `\begin{aligned} \text{acc} & \sim Bernoulli(p) \\ logit(p) & = \beta_0 + \beta_1 \cdot relation + \beta_2 \cdot branch \\ \beta_0 & \sim Gaussian(1.08, 0.17) \\ \beta_1 & \sim Gaussian(0.69, 0.25) \\ \beta_2 & \sim Gaussian(1.44, 0.27) \\ \end{aligned}` $$ ] --- ## Conditional posterior probability of log-odds of "correct" response $$ `\begin{aligned} \text{Unrelated, Left:} & & \beta_0 &+ (\beta_1 \cdot 0) + (\beta_2 \cdot 0) &&= \beta_0 &\\ \text{Unrelated, Right:} & & \beta_0 &+ (\beta_1 \cdot 0) + (\beta_2 \cdot 1) &&= \beta_0 + \beta_2 &\\ \text{Constituent, Left:} & & \beta_0 &+ (\beta_1 \cdot 1) + (\beta_2 \cdot 0) &&= \beta_0 + \beta_1 &\\ \text{Constituent, Right:} & & \beta_0 &+ (\beta_1 \cdot 1) + (\beta_2 \cdot 1) &&= \beta_0 + \beta_1 + \beta_2 &\\ \end{aligned}` $$ <br> -- From the model: `\(\beta_0 = 1.08\)`, `\(\beta_1 = 0.69\)`, and `\(\beta_2 = 1.44\)`. <br> -- $$ `\begin{aligned} \text{Unrelated, Left:} & & 1.08 &+ (0.69 \cdot 0) + (1.44 \cdot 0) &&= 1.08 &= 1.08 \text{ log-odds}\\ \text{Unrelated, Right:} & & 1.08 &+ (0.69 \cdot 0) + (1.44 \cdot 1) &&= 1.08 + 1.44 &= 2.52 \text{ log-odds}\\ \text{Constituent, Left:} & & 1.08 &+ (0.69 \cdot 1) + (1.44 \cdot 0) &&= 1.08 + 0.69 &= 1.77 \text{ log-odds}\\ \text{Constituent, Right:} & & 1.08 &+ (0.69 \cdot 1) + (1.44 \cdot 1) &&= 1.08 + 0.69 + 1.44 &= 3.21 \text{ log-odds}\\ \end{aligned}` $$ ??? Ask: now, how move to probs? 
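---

## From log-odds to probability with `plogis()`

The four estimates above are on the log-odds scale. As last week, we can map them back into probability space with `plogis()`. Below is a minimal sketch using the posterior means from the previous slide (the `log_odds` vector and its labels are just for illustration):


```r
# Posterior mean log-odds of a "correct" response for each cell,
# taken from the calculations on the previous slide.
log_odds <- c(
  unrelated_left    = 1.08,
  unrelated_right   = 2.52,
  constituent_left  = 1.77,
  constituent_right = 3.21
)

# plogis() is the inverse of logit(): it converts log-odds to probabilities.
# This gives roughly 0.75, 0.93, 0.85, and 0.96.
round(plogis(log_odds), 2)
```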
--- <img src="index_files/figure-html/mult-draws-dens-1.png" width="80%" style="display: block; margin: auto;" /> --- <img src="index_files/figure-html/mult-draws-dens2-1.png" width="80%" style="display: block; margin: auto;" /> --- layout:true ## Do these estimates match the data? --- <img src="index_files/figure-html/barplot-shallow2-1.png" width="60%" style="display: block; margin: auto;" /> --- <img src="index_files/figure-html/barplot-shallow-mult-1.png" width="60%" style="display: block; margin: auto;" /> ??? The model assumes there's a difference in Branching = Right where there doesn't seem to be one. And because that difference is small, it's also underestimating the larger difference in Branching = Left. If only there were some way to tell the model that the effect of relation type can be different between the two levels of branching. Oh wait... --- layout:false class: center middle reverse # Time for a wee break! ### If you haven't yet, please fill in today's attendance form: `https://forms.office.com/e/A0X816LnUp` ![:scale 30%](imgs/qr.png) --- ## How does the current model fall short? -- .bg-washed-blue.b--dark-blue.ba.bw2.br3.shadow-5.ph4.mt2[ The question we want to answer: - **"Is the effect of `Relation_type` different when `Branching == Left` compared to when `Branching == Right`?"** ] -- .bg-washed-blue.b--dark-blue.ba.bw2.br3.shadow-5.ph4.mt2[ Our current model doesn't let effects differ between levels of other variables. ] -- .bg-washed-green.b--dark-green.ba.bw2.br3.shadow-5.ph4.mt2[ The solution: Include an **interaction** between these two predictors. ] --- layout:true ## In other words: we want a _difference of differences_ --- <img src="index_files/figure-html/barplot-shallow3-1.png" width="60%" style="display: block; margin: auto;" /> --- <img src="index_files/figure-html/shallow-diffs-1.png" width="60%" style="display: block; margin: auto;" /> --- layout: false ## How do we make this happen? .bg-washed-blue.b--dark-blue.ba.bw2.br3.shadow-5.ph4.mt2[ Create a **new variable** that represents the interaction. ] -- .bg-washed-blue.b--dark-blue.ba.bw2.br3.shadow-5.ph4.mt2[ Assign that variable **its own `\(\beta\)` coefficient.** ] -- .bg-washed-blue.b--dark-blue.ba.bw2.br3.shadow-5.ph4.mt2[ Estimate `\(\beta\)` as usual. - This estimate tells us **how much one predictor's effect changes between levels of the other**. ] -- .bg-washed-green.b--dark-green.ba.bw2.br3.shadow-5.ph4.mt2[ **What does the new variable look like?** ] --- ## Treatment coding, feat. 
interaction | When: | Then coded as: | | --- | --- | | <table> <thead> <tr> <th>Relation_type</th> <th>Branching</th> </tr> </thead> <tbody> <tr> <td>Unrelated</td> <td>Left</td> </tr> <tr> <td>Unrelated</td> <td>Right</td> </tr> <tr> <td>Constituent</td> <td>Left</td> </tr> <tr> <td>Constituent</td> <td>Right</td> </tr> </tbody> </table> | <table> <thead> <tr> <th>Relation_type</th> <th>Branching</th> <th>Relation_type:Branching</th> </tr> </thead> <tbody> <tr> <td>0</td> <td>0</td> <td>0 \* 0 = 0</td> </tr> <tr> <td>0</td> <td>1</td> <td>0 \* 1 = 0</td> </tr> <tr> <td>1</td> <td>0</td> <td>1 \* 0 = 0</td> </tr> <tr> <td>1</td> <td>1</td> <td>1 \* 1 = 1</td> </tr> </tbody> </table> | <br> -- `$$logit(p) = \beta_0 + (\beta_1 \cdot relation) + (\beta_2 \cdot branch) + (\beta_3 \cdot relation \cdot branch)$$` <br> -- $$ `\begin{aligned} \text{Unrelated, Left:} && \beta_0 &+ (\beta_1 \cdot 0) + (\beta_2 \cdot 0) + (\beta_3 \cdot 0) &&= \beta_0 &\\ \text{Unrelated, Right:} && \beta_0 &+ (\beta_1 \cdot 0) + (\beta_2 \cdot 1) + (\beta_3 \cdot 0) &&= \beta_0 + \beta_2 &\\ \text{Constituent, Left:} && \beta_0 &+ (\beta_1 \cdot 1) + (\beta_2 \cdot 0) + (\beta_3 \cdot 0) &&= \beta_0 + \beta_1 &\\ \text{Constituent, Right:} && \beta_0 &+ (\beta_1 \cdot 1) + (\beta_2 \cdot 1) + (\beta_3 \cdot 1) &&= \beta_0 + \beta_1 + \beta_2 + \beta_3 &\\ \end{aligned}` $$ --- ## What does the extra `\(\beta\)` give us? .bg-washed-blue.b--dark-blue.ba.bw2.br3.shadow-5.ph4.mt2[ With both variables at their non-baseline level (coded as 1) in a model **with no interaction:** $$ \beta_0 + (\beta_1 \cdot 1) + (\beta_2 \cdot 1) = \beta_0 + \beta_1 + \beta_2 \\ $$ ] .bg-washed-blue.b--dark-blue.ba.bw2.br3.shadow-5.ph4.mt2[ And in a model **with an interaction:** $$ \beta_0 + (\beta_1 \cdot 1) + (\beta_2 \cdot 1) + (\beta_3 \cdot 1) = \beta_0 + \beta_1 + \beta_2 + \beta_3 \\ $$ ] -- .bg-washed-green.b--dark-green.ba.bw2.br3.shadow-5.ph4.mt2[ `\(\beta_3\)` can **adjust** the value of `\(\beta_1\)` or `\(\beta_2\)`. In other words, it can **modulate the other effects.** ] --- ## The model we'll fit <br> <br> .f4[ $$ `\begin{aligned} \text{acc} & \sim Bernoulli(p) \\ logit(p) & = \beta_0 + (\beta_1 \cdot relation) + (\beta_2 \cdot branch) + (\beta_3 \cdot relation \cdot branch)\\ \beta_0 & \sim Gaussian(\mu_0, \sigma_0) \\ \beta_1 & \sim Gaussian(\mu_1, \sigma_1) \\ \beta_2 & \sim Gaussian(\mu_2, \sigma_2) \\ \beta_3 & \sim Gaussian(\mu_3, \sigma_3) \\ \end{aligned}` $$ ] --- ## How to specify a model with an interaction ```r acc_inter_bm <- brm( Accuracy ~ Relation_type + Branching + Relation_type:Branching, family = bernoulli(), data = shallow, backend = 'cmdstanr', file = 'data/cache/acc_inter_bm' ) ``` .bg-washed-blue.b--dark-blue.ba.bw2.br3.shadow-5.ph4.mt2[ A note on the syntax for specifying interactions in R: **`Relation_type + Branching + Relation_type:Branching`** is the same as **`Relation_type * Branching`** but more transparent about what's happening behind the scenes! 
]

---

<br>


```r
summary(acc_inter_bm)
```

```
##  Family: bernoulli 
##   Links: mu = logit 
## Formula: Accuracy ~ Relation_type + Branching + Relation_type:Branching 
##    Data: shallow (Number of observations: 692) 
##   Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
##          total post-warmup draws = 4000
## 
## Population-Level Effects: 
##                                          Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept                                    1.02      0.17     0.69     1.38 1.00     3445     3109
## Relation_typeConstituent                     0.85      0.29     0.28     1.43 1.00     2527     2141
## BranchingRight                               1.69      0.36     1.04     2.42 1.00     2131     2325
## Relation_typeConstituent:BranchingRight     -0.63      0.55    -1.72     0.42 1.00     1860     2203
## 
## Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
```

???
`\(\beta_3\)`'s mean of -0.63 means that, when we move from Branching Left -> Right, the effect of Relation_type is smaller.

Or equivalently (but not how we're looking at it here), it means that, when we move from Relation_type Unrelated -> Constituent, the effect of Branching is smaller.

---

<br>

```
## Population-Level Effects: 
##                                          Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept                                    1.02      0.17     0.69     1.38 1.00     3445     3109
## Relation_typeConstituent                     0.85      0.29     0.28     1.43 1.00     2527     2141
## BranchingRight                               1.69      0.36     1.04     2.42 1.00     2131     2325
## Relation_typeConstituent:BranchingRight     -0.63      0.55    -1.72     0.42 1.00     1860     2203
```

<br>

.f4[
$$
`\begin{aligned}
\text{acc} & \sim Bernoulli(p) \\
logit(p) & = \beta_0 + (\beta_1 \cdot relation) + (\beta_2 \cdot branch) + (\beta_3 \cdot relation \cdot branch)\\
\beta_0 & \sim Gaussian(1.02, 0.17) \\
\beta_1 & \sim Gaussian(0.85, 0.29) \\
\beta_2 & \sim Gaussian(1.69, 0.36) \\
\beta_3 & \sim Gaussian(-0.63, 0.55) \\
\end{aligned}`
$$
]

---
layout: true

## How to interpret `\(\beta_3 \sim Gaussian(-0.63, 0.55)\)`

---

.bg-washed-blue.b--dark-blue.ba.bw2.br3.shadow-5.ph4.mt2[
**Interpretation 1:**

- `\(\beta_3\)`'s mean of `\(-0.63\)` is a **_negative_ adjustment to the effect of `Relation_type`** when we move from `Branching == Left` (baseline) to `Branching == Right` (non-baseline).
- In other words, the model thinks the effect of `Relation_type` is **smaller** in `Branching == Right` (non-baseline) than in `Branching == Left` (baseline).
- The 95% CrI `\([-1.72, 0.42]\)` includes zero, so there is some uncertainty about the direction of the interaction.
]

--

.bg-washed-blue.b--dark-blue.ba.bw2.br3.shadow-5.ph4.mt2[
**Equivalently, interpretation 2:**

- `\(\beta_3\)` is a **_negative_ adjustment to the effect of `Branching`** when we move from `Relation_type == Unrelated` (baseline) to `Relation_type == Constituent` (non-baseline).
- Or: the model thinks the effect of `Branching` is **(probably) smaller** in `Relation_type == Constituent` (non-baseline) than in `Relation_type == Unrelated` (baseline).
] --- .pull-left[ <img src="index_files/figure-html/barplot-symm1-1.png" width="100%" style="display: block; margin: auto;" /> ] .pull-right[ <img src="index_files/figure-html/barplot-symm2-1.png" width="100%" style="display: block; margin: auto;" /> ] --- layout:false ## Conditional posterior probability of log-odds of "correct" response $$ `\begin{aligned} \text{Unrelated, Left:} && \beta_0 &+ (\beta_1 \cdot 0) + (\beta_2 \cdot 0) + (\beta_3 \cdot 0) &&= \beta_0 &\\ \text{Unrelated, Right:} && \beta_0 &+ (\beta_1 \cdot 0) + (\beta_2 \cdot 1) + (\beta_3 \cdot 0) &&= \beta_0 + \beta_2 &\\ \text{Constituent, Left:} && \beta_0 &+ (\beta_1 \cdot 1) + (\beta_2 \cdot 0) + (\beta_3 \cdot 0) &&= \beta_0 + \beta_1 &\\ \text{Constituent, Right:} && \beta_0 &+ (\beta_1 \cdot 1) + (\beta_2 \cdot 1) + (\beta_3 \cdot 1) &&= \beta_0 + \beta_1 + \beta_2 + \beta_3 &\\ \end{aligned}` $$ <br> -- From the model: `\(\beta_0 = 1.02\)`, `\(\beta_1 = 0.85\)`, `\(\beta_2 = 1.69\)`, and `\(\beta_3 = -0.63\)`. <br> -- $$ `\begin{aligned} \text{Unr., L:} && 1.02 &+ (0.85 \cdot 0) + (1.69 \cdot 0) + (-0.63 \cdot 0) &&= 1.02 &= 1.02 \text{ log-odds}\\ \text{Unr., R:} && 1.02 &+ (0.85 \cdot 0) + (1.69 \cdot 1) + (-0.63 \cdot 0) &&= 1.02 + 1.69 &= 2.71 \text{ log-odds}\\ \text{Con., L:} && 1.02 &+ (0.85 \cdot 1) + (1.69 \cdot 0) + (-0.63 \cdot 0) &&= 1.02 + 0.85 &= 1.87 \text{ log-odds}\\ \text{Con., R:} && 1.02 &+ (0.85 \cdot 1) + (1.69 \cdot 1) + (-0.63 \cdot 1) &&= 1.02 + 0.85 + 1.69 -0.63 &= 2.93 \text{ log-odds}\\ \end{aligned}` $$ --- <img src="index_files/figure-html/inter-draws-dens-1.png" width="80%" style="display: block; margin: auto;" /> --- ## Compare estimates with and without the interaction .pull-left[ <img src="index_files/figure-html/mult-draws-dens-L-1.png" width="100%" style="display: block; margin: auto;" /> ] .pull-right[ <img src="index_files/figure-html/inter-draws-dens-R-1.png" width="100%" style="display: block; margin: auto;" /> ] --- ## How well does the interaction model fit the data? <img src="index_files/figure-html/barplot-shallow-inter-1.png" width="60%" style="display: block; margin: auto;" /> --- .pull-left[ <img src="index_files/figure-html/mult-barplot-L-1.png" width="100%" style="display: block; margin: auto;" /> ] .pull-right[ <img src="index_files/figure-html/inter-barplot-R-1.png" width="100%" style="display: block; margin: auto;" /> ] --- layout: false layout: true ## Learning objectives revisited --- -- .bg-washed-blue.b--dark-blue.ba.bw2.br3.shadow-5.ph4.mt2[ **(1)** What is a **factorial design?** - A design with multiple predictors in which data is gathered for **every combination** of those predictors' levels. ] -- .bg-washed-blue.b--dark-blue.ba.bw2.br3.shadow-5.ph4.mt2[ **(2)** How do we estimate and interpret the effects of **multiple predictors?** - A predictor's `\(\beta\)` is the effect of that predictor *when the other predictor's value is 0*. ] -- .bg-washed-blue.b--dark-blue.ba.bw2.br3.shadow-5.ph4.mt2[ **(3)** How do we deal with situations when **one predictor's effect is different, depending on the value of the other predictor?** - We fit a model that contains an **interaction term** between those two predictors. ] --- .bg-washed-blue.b--dark-blue.ba.bw2.br3.shadow-5.ph4.mt2[ **(4)** How can such interactions between predictors be **built into our models?** - The interaction term is defined as the **product** of the two interacting predictors. - It gets its own `\(\beta\)` coefficient, which we estimate as usual. 
] -- .bg-washed-blue.b--dark-blue.ba.bw2.br3.shadow-5.ph4.mt2[ **(5)** How do we **interpret model estimates** of interactions? - The interaction term's `\(\beta\)` tells us **how much one predictor's effect changes between the baseline and non-baseline levels of the other**. - Interactions are symmetrical, and so they have two equivalent interpretations. ]
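---

## Bonus: checking the interaction coding in R

The interaction term is the product of the two dummy variables, and R builds it for us behind the scenes. One way to see this (a minimal sketch, assuming the `shallow` data frame prepared earlier) is to inspect the model matrix:


```r
# model.matrix() expands the formula into the numeric predictors the model
# actually receives: an intercept, the two dummy variables, and their product.
# unique() keeps one row per combination of factor levels.
unique(model.matrix(~ Relation_type * Branching, data = shallow))
```

The last column is 1 only when both dummy variables are 1, exactly as in the interaction coding table we saw earlier.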