class: center, middle, inverse, title-slide .title[ # Statistics and Quantitative Methods (S1) ] .subtitle[ ## Week 8 ] .author[ ### Dr Stefano Coretta ] .institute[ ### University of Edinburgh ] .date[ ### 2022/11/08 ] --- class: center middle .f1.link.dim.br3.ph3.pv2.mb2.dib.white.bg-purple[ [.white[Linear models illustrated]](https://stefanocoretta.shinyapps.io/lines/) ] --- layout: true ## Repeated measures: Multiple subjects --- ```r mald_1_1 ``` ``` ## # A tibble: 5,000 × 7 ## Subject Item IsWord PhonLev RT ACC RT_log ## <chr> <chr> <fct> <dbl> <int> <fct> <dbl> ## 1 15308 acreage TRUE 6.01 617 correct 6.42 ## 2 15308 maxraezaxr FALSE 6.78 1198 correct 7.09 ## 3 15308 prognosis TRUE 8.14 954 correct 6.86 ## 4 15308 giggles TRUE 6.22 579 correct 6.36 ## 5 15308 baazh FALSE 6.13 1011 correct 6.92 ## 6 15308 unflagging TRUE 7.66 1402 correct 7.25 ## 7 15308 ihnpaykaxrz FALSE 7.47 1059 correct 6.97 ## 8 15308 hawk TRUE 6.09 739 correct 6.61 ## 9 15308 assessing TRUE 6.37 789 correct 6.67 ## 10 15308 mehlaxl FALSE 5.80 926 correct 6.83 ## # … with 4,990 more rows ``` --- ```r mald_1_1 %>% mutate(ACC_num = ifelse(ACC == "correct", 1, 0)) %>% ggplot(aes(PhonLev, ACC_num, colour = IsWord)) + geom_point() + geom_smooth(method = "glm", method.args = list(family = "binomial")) ``` <img src="index_files/figure-html/mald-acc-1.png" height="400px" style="display: block; margin: auto;" /> --- layout: false class: center middle .f1.link.dim.br3.ph3.pv2.mb2.dib.white.bg-purple[ [.white[Linear models illustrated]](https://stefanocoretta.shinyapps.io/lines/) ] --- layout: true ## Random effects: varying intercept --- ```r mald_lm <- glm(ACC ~ IsWord * PhonLev, data = mald_1_1, family = binomial()) tidy(mald_lm) ``` ``` ## # A tibble: 4 × 5 ## term estimate std.error statistic p.value ## <chr> <dbl> <dbl> <dbl> <dbl> ## 1 (Intercept) 1.59 0.368 4.31 0.0000162 ## 2 IsWordFALSE -2.77 0.591 -4.68 0.00000285 ## 3 PhonLev 0.0719 0.0520 1.38 0.167 ## 4 IsWordFALSE:PhonLev 0.385 0.0861 4.47 0.00000784 ``` --- ```r mald_lm_p <- ggpredict(mald_lm, terms = c("PhonLev [all]", "IsWord")) %>% plot(limits = c(0.75, 1)) mald_lm_p ``` <img src="index_files/figure-html/mald-lm-p-1.png" height="400px" style="display: block; margin: auto;" /> --- ```r subjs <- sample(mald_1_1$Subject, 9) mald_1_1 %>% filter(Subject %in% subjs) %>% ggplot(aes(Subject, fill = ACC)) + geom_bar(position = "fill") ``` <img src="index_files/figure-html/mald-subj-1.png" height="400px" style="display: block; margin: auto;" /> --- ```r library(lme4) mald_lm_2 <- glmer(ACC ~ IsWord * PhonLev + (1 | Subject), data = mald_1_1, family = binomial()) cat(capture.output(summary(mald_lm_2))[13:23], sep = "\n") ``` ``` ## ## Random effects: ## Groups Name Variance Std.Dev. ## Subject (Intercept) 0.09121 0.302 ## Number of obs: 5000, groups: Subject, 50 ## ## Fixed effects: ## Estimate Std. Error z value Pr(>|z|) ## (Intercept) 1.61126 0.37304 4.319 1.57e-05 ## IsWordFALSE -2.77650 0.59261 -4.685 2.80e-06 ## PhonLev 0.07383 0.05234 1.410 0.158 ``` --- ```r mald_lm_2_p <- ggpredict(mald_lm_2, terms = c("PhonLev [all]", "IsWord")) %>% plot(limits = c(0.75, 1)) mald_lm_2_p ``` <img src="index_files/figure-html/mald-lm-2-p-1.png" height="400px" style="display: block; margin: auto;" /> --- ```r # library(patchwork) mald_lm_p + mald_lm_2_p + plot_layout(nrow = 2, guides = "collect") ``` <img src="index_files/figure-html/mald-lm-lm-p-1.png" height="400px" style="display: block; margin: auto;" /> --- Let's check the **conditional means** ```r mald_lm_2_r <- ranef(mald_lm_2) %>% as_tibble() mald_lm_2_r ``` ``` ## # A tibble: 50 × 5 ## grpvar term grp condval condsd ## <chr> <fct> <fct> <dbl> <dbl> ## 1 Subject (Intercept) 15308 0.450 0.234 ## 2 Subject (Intercept) 15345 0.0591 0.221 ## 3 Subject (Intercept) 15373 -0.0208 0.217 ## 4 Subject (Intercept) 15384 -0.392 0.201 ## 5 Subject (Intercept) 15388 0.00308 0.216 ## 6 Subject (Intercept) 15540 -0.388 0.204 ## 7 Subject (Intercept) 15739 -0.147 0.215 ## 8 Subject (Intercept) 15903 -0.251 0.208 ## 9 Subject (Intercept) 16018 -0.260 0.209 ## 10 Subject (Intercept) 16054 -0.402 0.205 ## # … with 40 more rows ``` --- ```r mald_lm_2_r %>% ggplot(aes(condval, grp)) + geom_vline(xintercept = 0, colour = "#7fc97f") + geom_pointrange( aes(xmin = condval + 1.96 * condsd, xmax = condval - 1.96 * condsd), alpha = 0.7 ) ``` --- .center[ ![:scale 50%](index_files/figure-html/mald-lm-2-r-p-1.png) ] --- ```r ggpredict(mald_lm_2, terms = c("PhonLev [all]", "IsWord", "Subject [sample=8]"), type = "re") %>% plot() ``` <img src="index_files/figure-html/mald-lm-2-pred-r-1.png" height="400px" style="display: block; margin: auto;" /> --- ### Reporting Easy! Same as before but in the model specification blurb add the following: > We also included a by-subject varying (or random) intercept. --- layout: false class: center middle .f1.link.dim.br3.ph3.pv2.mb2.dib.white.bg-purple[ [.white[Linear models illustrated]](https://stefanocoretta.shinyapps.io/lines/) ] --- layout: true ## Random effects: varying slopes --- ```r mald_1_1 %>% filter(Subject %in% sample(mald_1_1$Subject, 9)) %>% mutate(ACC_num = ifelse(ACC == "correct", 1, 0)) %>% ggplot(aes(PhonLev, ACC_num, colour = IsWord)) + geom_point() + geom_smooth(method = "glm", method.args = list(family = "binomial")) + facet_wrap(~ Subject) ``` --- .center[ ![:scale 50%](index_files/figure-html/mald-subj-2-1.png) ] --- ```r mald_lm_3 <- glmer(ACC ~ IsWord * PhonLev + (IsWord | Subject), data = mald_1_1, family = binomial()) cat(capture.output(summary(mald_lm_3))[13:24], sep = "\n") ``` ``` ## ## Random effects: ## Groups Name Variance Std.Dev. Corr ## Subject (Intercept) 0.3114 0.558 ## IsWordFALSE 1.4020 1.184 -0.89 ## Number of obs: 5000, groups: Subject, 50 ## ## Fixed effects: ## Estimate Std. Error z value Pr(>|z|) ## (Intercept) 1.66918 0.38633 4.321 1.56e-05 ## IsWordFALSE -2.83255 0.63037 -4.493 7.01e-06 ## PhonLev 0.07740 0.05336 1.451 0.147 ``` --- ```r mald_lm_3_p <- ggpredict(mald_lm_3, terms = c("PhonLev [all]", "IsWord")) %>% plot(limits = c(0.75, 1)) mald_lm_3_p ``` <img src="index_files/figure-html/mald-lm-3-p-1.png" height="400px" style="display: block; margin: auto;" /> --- ```r mald_lm_p + mald_lm_3_p + plot_layout(nrow = 2, guides = "collect") ``` <img src="index_files/figure-html/mald-lm-lm-3-p-1.png" height="400px" style="display: block; margin: auto;" /> --- Let's check the **conditional means** ```r mald_lm_3_r <- ranef(mald_lm_3) %>% as_tibble() mald_lm_3_r ``` ``` ## # A tibble: 100 × 5 ## grpvar term grp condval condsd ## <chr> <fct> <fct> <dbl> <dbl> ## 1 Subject (Intercept) 15308 0.124 0.340 ## 2 Subject (Intercept) 15345 0.714 0.358 ## 3 Subject (Intercept) 15373 -0.0677 0.321 ## 4 Subject (Intercept) 15384 0.380 0.350 ## 5 Subject (Intercept) 15388 0.229 0.349 ## 6 Subject (Intercept) 15540 -0.415 0.297 ## 7 Subject (Intercept) 15739 -0.689 0.294 ## 8 Subject (Intercept) 15903 0.920 0.367 ## 9 Subject (Intercept) 16018 0.0954 0.329 ## 10 Subject (Intercept) 16054 0.967 0.372 ## # … with 90 more rows ``` --- ```r mald_lm_3_r %>% filter(term == "IsWordFALSE") %>% ggplot(aes(condval, reorder(grp, condval))) + geom_vline(xintercept = 0, colour = "#7fc97f") + geom_pointrange( aes(xmin = condval + 1.96 * condsd, xmax = condval - 1.96 * condsd), alpha = 0.7 ) ``` --- .center[ ![:scale 50%](index_files/figure-html/mald-lm-3-r-p-1.png) ] --- ```r ggpredict(mald_lm_3, terms = c("PhonLev [all]", "IsWord", "Subject [sample=9]"), type = "re") %>% plot() ``` <img src="index_files/figure-html/mald-lm-3-pred-r-1.png" height="400px" style="display: block; margin: auto;" /> --- ### Reporting Easy! Same as before but in the model specification blurb add the following: > We also included a by-subject varying intercept plus a by-subject varying slope for IsWord. --- layout: false layout: true ## Repeated measures: Multiple subjects and words --- ```r shallow_filt <- shallow %>% filter(Branching == "Left") shallow_filt ``` ``` ## # A tibble: 1,170 × 11 ## Group ID List Target RT RT_log Criti…¹ Word_…² Relat…³ Branc…⁴ accur…⁵ ## <chr> <chr> <chr> <chr> <dbl> <dbl> <chr> <chr> <fct> <chr> <chr> ## 1 L1 L1_01 A unawa… 603 6.40 Critic… Word Unrela… Left correct ## 2 L1 L1_01 A unhol… 739 6.61 Critic… Word Consti… Left correct ## 3 L1 L1_01 A unhap… 370 5.91 Critic… Word Unrela… Left correct ## 4 L1 L1_01 A unsha… 821 6.71 Critic… Word Consti… Left correct ## 5 L1 L1_01 A unrip… 1035 6.94 Critic… Word Unrela… Left incorr… ## 6 L1 L1_01 A unkin… 498 6.21 Critic… Word NonCon… Left correct ## 7 L1 L1_01 A unwar… 1133 7.03 Critic… Word NonCon… Left correct ## 8 L1 L1_01 A uncle… 513 6.24 Critic… Word Consti… Left correct ## 9 L1 L1_01 A reobt… 964 6.87 Critic… Word NonCon… Left incorr… ## 10 L1 L1_01 A resea… 709 6.56 Critic… Word Unrela… Left correct ## # … with 1,160 more rows, and abbreviated variable names ¹Critical_Filler, ## # ²Word_Nonword, ³Relation_type, ⁴Branching, ⁵accuracy ``` --- ```r unique(shallow_filt$ID) ``` ``` ## [1] "L1_01" "L1_02" "L1_03" "L1_05" "L1_06" "L1_07" "L1_08" "L1_09" "L1_10" ## [10] "L1_11" "L1_12" "L1_13" "L1_14" "L1_15" "L1_16" "L1_17" "L1_18" "L1_19" ## [19] "L1_20" "L1_21" "L1_22" "L1_23" "L1_24" "L1_25" "L1_26" "L1_27" "L1_28" ## [28] "L1_29" "L1_30" "L2_01" "L2_02" "L2_03" "L2_04" "L2_05" "L2_06" "L2_07" ## [37] "L2_08" "L2_10" "L2_11" "L2_12" "L2_13" "L2_14" "L2_16" "L2_17" "L2_18" ## [46] "L2_19" "L2_20" "L2_21" "L2_22" "L2_23" "L2_24" "L2_25" "L2_26" "L2_27" ## [55] "L2_28" "L2_29" "L2_30" "L2_31" "L2_32" "L2_33" "L2_34" "L2_35" "L2_36" ## [64] "L2_37" "L2_38" ``` ```r unique(shallow_filt$Target) ``` ``` ## [1] "unawareness" "unholiness" "unhappiness" "unsharpness" "unripeness" ## [6] "unkindness" "unwariness" "unclearness" "reobtainable" "resealable" ## [11] "recomputable" "readjustable" "reattachable" "resellable" "reclosable" ## [16] "uncleverness" "reusable" "rehydratable" ``` -- `ID` and `Target` are crossed, i.e. each subject has seen all the targets and each target has been seen by each subject. We talk about **crossed random effects**. --- ```r shal_lm <- lmer( RT ~ Relation_type * Group + (1 | ID) + (1 | Target), data = shallow_filt ) cat(capture.output(summary(shal_lm))[11:25], sep = "\n") ``` ``` ## Random effects: ## Groups Name Variance Std.Dev. ## ID (Intercept) 21495 146.61 ## Target (Intercept) 7928 89.04 ## Residual 86941 294.86 ## Number of obs: 1170, groups: ID, 65; Target, 18 ## ## Fixed effects: ## Estimate Std. Error t value ## (Intercept) 897.788 41.005 21.894 ## Relation_typeConstituent -95.502 31.619 -3.020 ## Relation_typeNonConstituent -18.350 31.619 -0.580 ## GroupL2 160.338 47.336 3.387 ## Relation_typeConstituent:GroupL2 4.362 42.488 0.103 ## Relation_typeNonConstituent:GroupL2 -45.018 42.488 -1.060 ``` --- ```r shal_lm_p <- ggpredict(shal_lm, terms = c("Group", "Relation_type")) %>% plot() shal_lm_p ``` <img src="index_files/figure-html/shal-lm-p-1.png" height="400px" style="display: block; margin: auto;" /> --- ```r shal_lm_r <- ranef(shal_lm) %>% as_tibble() shal_lm_r ``` ``` ## # A tibble: 83 × 5 ## grpvar term grp condval condsd ## <chr> <fct> <fct> <dbl> <dbl> ## 1 ID (Intercept) L1_01 -106. 63.9 ## 2 ID (Intercept) L1_02 13.9 63.9 ## 3 ID (Intercept) L1_03 -12.6 63.9 ## 4 ID (Intercept) L1_05 164. 63.9 ## 5 ID (Intercept) L1_06 -83.9 63.9 ## 6 ID (Intercept) L1_07 -55.3 63.9 ## 7 ID (Intercept) L1_08 -16.9 63.9 ## 8 ID (Intercept) L1_09 -101. 63.9 ## 9 ID (Intercept) L1_10 29.2 63.9 ## 10 ID (Intercept) L1_11 -167. 63.9 ## # … with 73 more rows ``` --- ```r shal_lm_r %>% filter(grpvar == "ID") %>% ggplot(aes(condval, reorder(grp, condval))) + geom_vline(xintercept = 0, colour = "#7fc97f") + geom_pointrange( aes(xmin = condval + 1.96 * condsd, xmax = condval - 1.96 * condsd), alpha = 0.7 ) ``` --- .center[ ![:scale 40%](index_files/figure-html/shal-lm-r-p-1.png) ] --- ```r shal_lm_r %>% filter(grpvar == "Target") %>% ggplot(aes(condval, reorder(grp, condval))) + geom_vline(xintercept = 0, colour = "#7fc97f") + geom_pointrange( aes(xmin = condval + 1.96 * condsd, xmax = condval - 1.96 * condsd), alpha = 0.7 ) ``` --- .center[ ![:scale 40%](index_files/figure-html/shal-lm-r-p-2-1.png) ] --- ```r ggpredict(shal_lm, terms = c("ID [sample=6]", "Relation_type"), type = "re") %>% plot() ``` <img src="index_files/figure-html/shal-lm-pred-r-id-1.png" height="400px" style="display: block; margin: auto;" /> --- ```r ggpredict(shal_lm, terms = c("Target [sample=6]", "Relation_type"), type = "re") %>% plot() ``` <img src="index_files/figure-html/shal-lm-pred-r-target-1.png" height="400px" style="display: block; margin: auto;" /> --- ```r shal_lm_2 <- lmer(RT ~ Relation_type * Group + (Relation_type | ID) + (Relation_type | Target), data = shallow_filt) cat(capture.output(summary(shal_lm_2))[11:20], sep = "\n") ``` ``` ## ## Random effects: ## Groups Name Variance Std.Dev. Corr ## ID (Intercept) 24923.06 157.870 ## Relation_typeConstituent 517.75 22.754 -0.93 ## Relation_typeNonConstituent 837.73 28.944 -0.46 0.76 ## Target (Intercept) 7220.07 84.971 ## Relation_typeConstituent 70.07 8.371 -1.00 ## Relation_typeNonConstituent 468.28 21.640 1.00 -1.00 ## Residual 86554.56 294.202 ``` --- ```r cat(capture.output(summary(shal_lm_2))[22:29], sep = "\n") ``` ``` ## ## Fixed effects: ## Estimate Std. Error t value ## (Intercept) 897.854 41.930 21.413 ## Relation_typeConstituent -95.618 31.890 -2.998 ## Relation_typeNonConstituent -18.300 32.409 -0.565 ## GroupL2 160.432 49.499 3.241 ## Relation_typeConstituent:GroupL2 4.199 42.770 0.098 ``` --- ```r shal_lm_2_p <- ggpredict(shal_lm_2, terms = c("Group", "Relation_type")) %>% plot() shal_lm_2_p ``` <img src="index_files/figure-html/shal-lm-2-p-1.png" height="400px" style="display: block; margin: auto;" /> --- ```r ggpredict(shal_lm_2, terms = c("ID [sample=10]", "Relation_type"), type = "re") %>% plot() ``` <img src="index_files/figure-html/shal-lm-2-pred-r-id-1.png" height="400px" style="display: block; margin: auto;" /> --- ```r ggpredict(shal_lm_2, terms = c("Target [sample=6]", "Relation_type"), type = "re") %>% plot() ``` <img src="index_files/figure-html/shal-lm-2-pred-r-target-1.png" height="400px" style="display: block; margin: auto;" /> --- ### Reporting Easy! Same as before but in the model specification blurb add the following: > We also included by-subject and by-target varying intercepts, plus by-subject and by-target varying slopes for relation type.