Plotting prior distributions with ggplot2
The choice of priors is a fundamental step of the Bayesian inference process. Vasishth et al. (2018) recommend plotting the chosen priors to see if they are reasonable.
In this post I will show how to easily plot prior distributions in ggplot2 (which is part of the tidyverse).
Let’s load the tidyverse first.
library(tidyverse)
theme_set(theme_minimal()) # I just like this theme :)
Plotting your priors
Let’s start with a simple normal prior with mean \(\mu = 0\) and standard deviation \(\sigma = 1\).
The plot is initialised with a call to ggplot() that only needs to know the range of x values. The range comes from the x column of the data (here tibble(x = -4:4)), which aes(x) maps to the x-axis, so the x-axis of this plot runs from -4 to 4. For a normal distribution, it is useful to set the limits to the mean ± 4 times the standard deviation (this range covers more than 99.99% of the distribution).
The function ggplot2::stat_function() allows us to specify a distribution family with the fun argument. This argument takes the density function (the R functions of the form dxxx) of the chosen distribution family, so for the normal (Gaussian) distribution we use dnorm(). The argument n specifies the number of equally spaced points at which to evaluate the density (here 101), while args takes a list with the parameters of the distribution (here the mean 0 and the standard deviation 1).
ggplot(data = tibble(x = -4:4), aes(x)) +
  stat_function(fun = dnorm, n = 101, args = list(mean = 0, sd = 1)) +
  labs(title = "Normal (Gaussian) distribution")
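For priors that are not centred on 0, the mean ± 4 standard deviations rule of thumb can be wrapped in a small helper. The sketch below is only illustrative: normal_limits() and plot_normal_prior() are hypothetical names, not ggplot2 functions, the Normal(100, 15) prior is an arbitrary example, and data.frame() is used instead of tibble() so that only ggplot2 needs to be attached.

```r
library(ggplot2)

# Hypothetical helper (not part of ggplot2):
# x-axis limits as mean ± 4 standard deviations
normal_limits <- function(mean = 0, sd = 1) {
  c(mean - 4 * sd, mean + 4 * sd)
}

# Hypothetical helper: plot a normal prior over those limits
plot_normal_prior <- function(mean = 0, sd = 1) {
  ggplot(data = data.frame(x = normal_limits(mean, sd)), aes(x)) +
    stat_function(fun = dnorm, n = 101, args = list(mean = mean, sd = sd)) +
    labs(title = paste0("Normal(", mean, ", ", sd, ") prior"))
}

# Arbitrary example prior, chosen only for illustration
p_normal <- plot_normal_prior(mean = 100, sd = 15)
p_normal

# Share of the probability mass inside mean ± 4 sd (same for any normal)
mass_inside <- pnorm(4) - pnorm(-4)
mass_inside
```

The last two lines check the rule of thumb itself: the mass inside ± 4 standard deviations is above 99.99%, so the curve is effectively complete.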
A beta prior is bounded between 0 and 1, so we set the x range to 0:1 in the data. The beta distribution has two parameters, shape1 and shape2 (here 2 and 5).
ggplot(data = tibble(x = 0:1), aes(x)) +
  stat_function(fun = dbeta, n = 101, args = list(2, 5)) +
  labs(title = "Beta distribution")
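When deciding between several candidate priors, it can help to draw them on the same plot. A minimal sketch, assuming three beta priors picked purely for illustration: each stat_function() layer gets a constant colour label inside aes(), which makes ggplot2 build a legend. It uses data.frame() rather than tibble() so that only ggplot2 is required.

```r
library(ggplot2)

# Three candidate beta priors on the same axes.
# The shape values are arbitrary examples, not recommendations.
p_beta <- ggplot(data = data.frame(x = c(0, 1)), aes(x)) +
  stat_function(aes(colour = "Beta(1, 1)"), fun = dbeta, n = 101,
                args = list(shape1 = 1, shape2 = 1)) +
  stat_function(aes(colour = "Beta(2, 5)"), fun = dbeta, n = 101,
                args = list(shape1 = 2, shape2 = 5)) +
  stat_function(aes(colour = "Beta(5, 2)"), fun = dbeta, n = 101,
                args = list(shape1 = 5, shape2 = 2)) +
  labs(title = "Candidate beta priors", colour = "Prior")
p_beta
```

Beta(1, 1) is flat over the whole interval, while Beta(2, 5) and Beta(5, 2) are mirror images of each other, which is easy to see once they share a plot.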
Another common distribution is the Cauchy, here with location -2 and scale 1.
ggplot(data = tibble(x = -10:10), aes(x)) +
  stat_function(fun = dcauchy, n = 201, args = list(-2, 1)) +
  labs(title = "Cauchy distribution")
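The Cauchy has much heavier tails than the normal, so no finite x range shows all of it. The sketch below overlays a normal and a Cauchy with the same location and scale, then computes how much of the Cauchy falls inside the plotted range. The object names (p_tails, mass_shown) are made up for this example.

```r
library(ggplot2)

# Normal and Cauchy densities with the same location (-2) and scale (1)
p_tails <- ggplot(data = data.frame(x = c(-10, 10)), aes(x)) +
  stat_function(aes(colour = "Normal(-2, 1)"), fun = dnorm, n = 201,
                args = list(mean = -2, sd = 1)) +
  stat_function(aes(colour = "Cauchy(-2, 1)"), fun = dcauchy, n = 201,
                args = list(location = -2, scale = 1)) +
  labs(title = "Normal vs Cauchy", colour = "Distribution")
p_tails

# Probability mass of the Cauchy that falls between -10 and 10
mass_shown <- pcauchy(10, location = -2, scale = 1) -
  pcauchy(-10, location = -2, scale = 1)
mass_shown
```

With these parameters about 93% of the Cauchy mass lies between -10 and 10, so the plot necessarily leaves part of the tails out. That is worth keeping in mind when judging whether such a prior is reasonable.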
The Poisson distribution is discrete, so it can be plotted by changing the geom to points and choosing n so that the function is evaluated only at integers.
# the range 0:20 includes 21 integers, so n = 21
ggplot(data = tibble(x = 0:20), aes(x)) +
  stat_function(fun = dpois, n = 21, args = list(4), geom = "point") +
  labs(title = "Poisson distribution")
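The reason n = 21 works is that stat_function() evaluates the function at n equally spaced points across the x range, and 21 equally spaced points between 0 and 20 are exactly the integers. The sketch below checks this and shows one possible alternative that avoids relying on n: compute the probabilities directly and draw them as bars. The names x_grid, pois and p_pois are made up for this example.

```r
library(ggplot2)

# 21 equally spaced points between 0 and 20 are exactly the integers 0..20
x_grid <- seq(0, 20, length.out = 21)
all(x_grid == round(x_grid))

# Alternative: compute the Poisson probabilities ourselves (lambda = 4)
pois <- data.frame(x = 0:20)
pois$prob <- dpois(pois$x, lambda = 4)

# Draw one bar per integer value
p_pois <- ggplot(pois, aes(x, prob)) +
  geom_col() +
  labs(title = "Poisson distribution", y = "probability")
p_pois
```

With a different n, for example the default of 101, dpois() would be asked for non-integer values, for which it returns 0 with a warning.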
Of course, any family with a corresponding dxxx function can be plotted (see ?Distributions and package-provided families).
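The fun argument is not limited to built-in densities: it accepts any R function whose first argument is the vector of x values. As an illustration, here is a sketch with a hand-written half-normal density, a shape sometimes used as a prior for standard deviations. dhalfnorm() is a hypothetical helper defined here, not a function from base R or ggplot2.

```r
library(ggplot2)

# Hypothetical helper: density of a half-normal distribution
# (a normal centred on 0, folded onto the non-negative values)
dhalfnorm <- function(x, sd = 1) {
  ifelse(x < 0, 0, 2 * dnorm(x, mean = 0, sd = sd))
}

p_half <- ggplot(data = data.frame(x = c(0, 4)), aes(x)) +
  stat_function(fun = dhalfnorm, n = 101, args = list(sd = 1)) +
  labs(title = "Half-normal distribution")
p_half

# Sanity check: a density should integrate to 1 over its support
total_mass <- integrate(dhalfnorm, lower = 0, upper = Inf, sd = 1)$value
total_mass
```

Checking that a hand-written density integrates to 1 is a cheap way to catch mistakes before plotting it.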