class: center, middle, inverse, title-slide

.title[
# Bayesian Linear Models
]
.subtitle[
## 02 - BRM basics
]
.author[
### Stefano Coretta
]
.institute[
### University of Edinburgh
]
.date[
### 2023/07/07
]

---
layout: true

## Estimate mean and SD

---

<img src="index_files/figure-html/rt-dist-2-1.png" height="500px" style="display: block; margin: auto;" />

---

.f3[
$$
`\begin{aligned}
RT_i & \sim Gaussian(\mu, \sigma) \\
\\
\mu & = 1046 \\
\sigma & = 348
\end{aligned}`
$$
]

--

- The RT values are distributed according to a Gaussian distribution with mean `\(\mu\)` and standard deviation `\(\sigma\)`.

- The mean and SD from the sample do not take into consideration the **uncertainty and variability** in the sample.

---

.f3[
$$
`\begin{aligned}
RT_i & \sim Gaussian(\mu, \sigma) \\
\\
\mu & \sim Gaussian(\mu_1, \sigma_1) \\
\sigma & \sim Gaussian_+(\mu_2, \sigma_2)
\end{aligned}`
$$
]

--

- We can account for uncertainty and variability by assuming that the mean and SD are themselves values coming from a probability distribution.

- The mean `\(\mu\)` is a value from a Gaussian distribution with mean `\(\mu_1\)` and SD `\(\sigma_1\)`.

- The SD `\(\sigma\)` is a value from a Gaussian distribution with mean `\(\mu_2\)` and SD `\(\sigma_2\)`.

- SD can only be positive, so the Gaussian distribution is truncated to positive values only ( `\(Gaussian_+\)` ).

---

```r
# Attach the brms package
library(brms)

# Run a Bayesian model
rt_bm <- brm(
  # This is the formula of the model.
  RT ~ 1,
  # This is the probability distribution family.
  family = gaussian(),
  # And the data.
  data = mald
)
```

---

```
##  Family: gaussian
##   Links: mu = identity; sigma = identity
## Formula: RT ~ 1
##    Data: mald (Number of observations: 3000)
##   Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
##          total post-warmup draws = 4000
##
## Population-Level Effects:
##           Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept  1046.44      6.26  1034.22  1058.64 1.00     3527     2803
##
## Family Specific Parameters:
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma   347.48      4.38   339.16   356.06 1.00     3292     2558
##
## Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
```

---

```
##  Family: gaussian
##   Links: mu = identity; sigma = identity
## Formula: RT ~ 1
##    Data: mald (Number of observations: 3000)
##   Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
##          total post-warmup draws = 4000
##
## Population-Level Effects:
##           Estimate Est.Error l-70% CI u-70% CI Rhat Bulk_ESS Tail_ESS
## Intercept  1046.44      6.26  1040.04  1052.96 1.00     3527     2803
##
## Family Specific Parameters:
##       Estimate Est.Error l-70% CI u-70% CI Rhat Bulk_ESS Tail_ESS
## sigma   347.48      4.38   342.91   352.13 1.00     3292     2558
##
## Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
```

---

.f3[
$$
`\begin{aligned}
RT_i & \sim Gaussian(\mu, \sigma) \\
\\
\mu & \sim Gaussian(1046, 6.26) \\
\sigma & \sim Gaussian_+(347, 4.38)
\end{aligned}`
$$
]

--

- There is a 95% probability that `\(\mu\)` is between 1034 and 1059 ms. There is a 70% probability that it is between 1040 and 1053 ms.

- There is a 95% probability that `\(\sigma\)` is between 339 and 356 ms. There is a 70% probability that it is between 343 and 352 ms.

--

Great!
But what about the RT when the word is a real word vs when it is not?

---
layout: false
layout: true

## Real vs nonce words

---

```r
# Attach the brms package
library(brms)

# Run a Bayesian model
rt_bm_2 <- brm(
  # This is the formula of the model.
  RT ~ IsWord,
  # This is the probability distribution family.
  family = gaussian(),
  # And the data.
  data = mald
)
```

---

```
##  Family: gaussian
##   Links: mu = identity; sigma = identity
## Formula: RT ~ IsWord
##    Data: mald (Number of observations: 3000)
##   Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
##          total post-warmup draws = 4000
##
## Population-Level Effects:
##             Estimate Est.Error l-89% CI u-89% CI Rhat Bulk_ESS Tail_ESS
## Intercept     981.30      8.63   967.63   995.52 1.00     3804     3037
## IsWordFALSE   132.77     12.38   113.01   152.14 1.00     3822     2809
##
## Family Specific Parameters:
##       Estimate Est.Error l-89% CI u-89% CI Rhat Bulk_ESS Tail_ESS
## sigma   341.12      4.42   334.06   348.38 1.00     3932     2800
##
## Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
```

---

```
##  Family: gaussian
##   Links: mu = identity; sigma = identity
## Formula: RT ~ IsWord
##    Data: mald (Number of observations: 3000)
##   Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
##          total post-warmup draws = 4000
##
## Population-Level Effects:
##             Estimate Est.Error l-89% CI u-89% CI Rhat Bulk_ESS Tail_ESS
## Intercept     981.30      8.63   967.63   995.52 1.00     3804     3037
## IsWordFALSE   132.77     12.38   113.01   152.14 1.00     3822     2809
##
## Family Specific Parameters:
##       Estimate Est.Error l-89% CI u-89% CI Rhat Bulk_ESS Tail_ESS
## sigma   341.12      4.42   334.06   348.38 1.00     3932     2800
```

.bg-washed-yellow.b--gold.ba.bw2.br3.shadow-5.ph4.mt2[
- The `Intercept` is the mean RT when [IsWord=TRUE].
- `IsWordFALSE` is the **difference** between the mean RT when [IsWord=FALSE] and the mean RT when [IsWord=TRUE].
]

---

```
##  Family: gaussian
##   Links: mu = identity; sigma = identity
## Formula: RT ~ IsWord
##    Data: mald (Number of observations: 3000)
##   Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
##          total post-warmup draws = 4000
##
## Population-Level Effects:
##             Estimate Est.Error l-89% CI u-89% CI Rhat Bulk_ESS Tail_ESS
## Intercept     981.30      8.63   967.63   995.52 1.00     3804     3037
## IsWordFALSE   132.77     12.38   113.01   152.14 1.00     3822     2809
##
## Family Specific Parameters:
##       Estimate Est.Error l-89% CI u-89% CI Rhat Bulk_ESS Tail_ESS
## sigma   341.12      4.42   334.06   348.38 1.00     3932     2800
```

.bg-washed-blue.b--dark-blue.ba.bw2.br3.shadow-5.ph4.mt2[
There is an **89% probability** that the mean RT when [IsWord=FALSE] is **between 113 and 152 ms longer** than the mean RT when [IsWord=TRUE].
]

---

.f3[
$$
`\begin{aligned}
RT_i & \sim Gaussian(\mu, \sigma) \\
\\
\mu & = \beta_0 + \beta_1 \times IsWord_{F} \\
\end{aligned}`
$$
]

.bg-washed-yellow.b--gold.ba.bw2.br3.shadow-5.ph4.mt2[
- `\(\beta_0\)` is the mean RT when [IsWord=TRUE].

- `\(\beta_1\)` is the **difference** between the mean RT when [IsWord=FALSE] and the mean RT when [IsWord=TRUE].

- `\(IsWord_F\)` is `0` when [IsWord=TRUE] and `1` when [IsWord=FALSE].
]

---

.f3[
$$
`\begin{aligned}
RT_i & \sim Gaussian(\mu, \sigma) \\
\\
\mu & = \beta_0 + \beta_1 \times IsWord_{F} \\
\end{aligned}`
$$
]

.bg-washed-blue.b--dark-blue.ba.bw2.br3.shadow-5.ph4.mt2[
**Mean RT when [IsWord=TRUE].**

$$
`\begin{aligned}
\mu & = \beta_0 + \beta_1 \times IsWord_{F} = \beta_0 + \beta_1 \times 0 = \beta_0 \\
\end{aligned}`
$$

**Mean RT when [IsWord=FALSE].**

$$
`\begin{aligned}
\mu & = \beta_0 + \beta_1 \times IsWord_{F} = \beta_0 + \beta_1 \times 1 = \beta_0 + \beta_1 \\
\end{aligned}`
$$
]

---

.f3[
$$
`\begin{aligned}
RT_i & \sim Gaussian(\mu, \sigma) \\
\\
\mu & = \beta_0 + \beta_1 \times IsWord_{F} \\
\beta_0 & \sim Gaussian(\mu_0, \sigma_0) \\
\beta_1 & \sim Gaussian(\mu_1, \sigma_1) \\
\\
\sigma & \sim Gaussian_+(\mu_2, \sigma_2)
\end{aligned}`
$$
]
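---

The formula for `\(\mu\)` can be checked by hand with the estimates from the model summary. The following is a minimal base-R sketch, assuming we plug in the posterior means of `Intercept` (981.30) and `IsWordFALSE` (132.77) as point values. The names `beta_0`, `beta_1`, `is_word_f` and `mu_hat` are hypothetical and used only for illustration; they are not part of brms or of the original analysis.

```r
# Posterior means copied from the model summary (point estimates only;
# this ignores the uncertainty expressed by the credible intervals).
beta_0 <- 981.30  # Intercept: mean RT when IsWord = TRUE
beta_1 <- 132.77  # IsWordFALSE: difference in mean RT (FALSE minus TRUE)

# Hypothetical helper: dummy-code IsWord as in the formula,
# 0 when IsWord = TRUE and 1 when IsWord = FALSE.
is_word_f <- function(is_word) as.numeric(!is_word)

# Linear predictor: mu = beta_0 + beta_1 * IsWord_F
mu_hat <- function(is_word) beta_0 + beta_1 * is_word_f(is_word)

# Mean RT for real words (TRUE) and nonce words (FALSE)
mu_hat(c(TRUE, FALSE))
```

With these point estimates the mean RT is about 981 ms for real words and about 1114 ms for nonce words. A full answer would carry the uncertainty of both coefficients through the sum, which this sketch does not attempt.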