06 - What’s next?

Stefano Coretta

This is just the beginning

You have learnt a great deal already!

  • Statistics as a tool to deal with uncertainty and variability.

  • Regression models (Gaussian, log-normal, Bernoulli, Poisson, negative binomial, ordinal) for a variety of outcome variables.

  • One predictor: numeric or categorical.

  • Working with posterior draws.

But there is more you need to learn to use regression models in actual research!

Many predictors: interactions

Figure 1: Regression lines illustrating an interaction between phonetic distance and lexical status.

Other types of outcome variables

  • Variables bounded between 0 and 1, like proportions. Use a beta regression.

  • Categorical variables with more than two categories (but unordered). Use a categorical (multinomial) regression.

  • Multiple outcome variables! Use multivariate regression models.

  • Can you think of more types?

Non-linear effects: smooth terms

Figure 2: Vowel duration and speech rate.

Repeated measures: multilevel/hierarchical/mixed-effects

Figure 3: Jitter plot showing RTs by lexical status for a selection of participants.

Model diagnostics

  • Model diagnostics help us determine cases when there is something wrong with the model or the data or both.

  • The posterior predictive check plot is one type of diagnostic, but there are others (like \(\hat{R}\) and Effective Sample Size in the model summary).

Prior probability distributions

  • We’ve been using the default priors set by brms. They are usually fine.

  • It might be preferable to specify custom priors.

  • In any case, learning the basics of prior specification is fundamental.

Frequentist vs Bayesian statistics

  • Modern research is dominated by the “null ritual”, a degenerate form of frequentist statistics (Gigerenzer, Krauss, and Vitouch 2004; Gigerenzer 2004, 2018). It is based on “rejecting a nil hypothesis”.

  • The “null ritual” is not a robust statistical approach.

  • Proper frequentist methods are difficult to apply in practice: they tell us \(P(d | H)\) but we want \(P(H | d)\).

  • Bayesian statistics is about estimating uncertainty and accounting for variability with posterior probability distributions.

Dimensionality reduction

  • Methods to reduce “dimensionality” of the data.

    • Principal Component Analysis.

    • Multiple Correspondence Analysis.

    • Clustering Methods.

Causal inference

  • “Correlation is not causation!”

  • Well… it is if you adopt a causal inference approach.

  • Directed Acyclic Graphs (DAGs).

A DAG of project funding selection.

Causal inference: collider bias

From McElreath (2020).

Modelling mindsets

  • Molnar (2022): Modeling Mindsets: The Many Cultures Of Learning From Data.

  • Statistical modelling, frequentism, Bayesianims, likelihoodism, causal inference, machine learning, supervised learning, unsupervised learning, reinforcement learning, deep learning.

  • Be a T-shaped Modeller.

    • Deep knowledge of one approach, superficial knowledge of the rest.

Where can I learn all that?

Thank you!

Mr. Happy Face, the 2022 winner of the World’s Ugliest Dog Competition, looks towards the camera. Josh Edelson/AFP/Getty Images

References

Gigerenzer, Gerd. 2004. “Mindless Statistics.” The Journal of Socio-Economics 33 (5): 587–606. https://doi.org/10.1016/j.socec.2004.09.033.
———. 2018. “Statistical Rituals: The Replication Delusion and How We Got There.” Advances in Methods and Practices in Psychological Science 1 (2): 198218. https://doi.org/10.1177/2515245918771329.
Gigerenzer, Gerd, Stefan Krauss, and Oliver Vitouch. 2004. “The Null Ritual. What You Always Wanted to Know about Significance Testing but Were Afraid to Ask.” In, 391408.
McElreath, Richard. 2020. Statistical Rethinking: A Bayesian Course with Examples in R and Stan. Second edition. Chapman & Hall/CRC Texts in Statistical Science Series. Boca Raton: CRC Press.
Molnar, Christoph. 2022. Modeling Mindsets: The Many Cultures of Learning from Data. MUCBOOK.