class: center, middle, inverse, title-slide

.title[
# Statistics and Quantitative Methods (S2)
]
.subtitle[
## Week 3
]
.author[
### Dr Stefano Coretta
]
.institute[
### University of Edinburgh
]
.date[
### 2023/01/31
]

---

# Data visualisation

.center[
![](../../img/data-viz.png)
]

---

# Good data visualisation

.bg-washed-blue.b--dark-blue.ba.bw2.br3.shadow-5.ph4.mt2[
Alberto Cairo has identified four common features of a good data visualisation ([Spiegelhalter 2019](https://www.penguin.co.uk/books/294857/the-art-of-statistics-by-spiegelhalter-david/9780241258767):64-66):

1. It contains **reliable information**.

2. The design has been chosen so that relevant **patterns become noticeable**.

3. It is presented in an **attractive** manner, but appearance should not get in the way of **honesty, clarity and depth**.

4. When appropriate, it is organized in a way that **enables some exploration**.
]

---

# Quick poll

.f3[*Which of the following are not continuous measures, i.e. they are discrete?*]

<br>

.pull-left[
.f3[Join at]

.f1[slido.com]

.f1[\#9921 685]
]

.pull-right[
.center[
![](../../img/QR-SQM-2-Week-3.png)
]
]

???

Slido poll. <https://app.sli.do/event/96zG4S3rZ4hyaoFNPcMqTg>

---
layout: false

# Quick poll

.f3[*Which of the following graphs are you familiar with?*]

<br>

.pull-left[
.f3[Join at]

.f1[slido.com]

.f1[\#9921 685]
]

.pull-right[
.center[
![](../../img/QR-SQM-2-Week-3.png)
]
]

???

Slido poll. <https://app.sli.do/event/96zG4S3rZ4hyaoFNPcMqTg>

---

# Bar chart

<img src="index_files/figure-html/status-bar-1.png" width="60%" style="display: block; margin: auto;" />

???

Bar charts are great for counts (of anything).

The *x*-axis shows the status levels, while the *y*-axis shows the number of languages per status level.

---
layout: true

# Stacked bar chart

---

<img src="index_files/figure-html/status-stack-1-1.png" width="60%" style="display: block; margin: auto;" />

???

In this plot I separated endangered vs non-endangered languages.
Within the endangered languages I further show the counts of different status levels.

---

<img src="index_files/figure-html/status-stack-2-1.png" width="60%" style="display: block; margin: auto;" />

???

Here, the *x*-axis corresponds to the language macro-areas in the data.

Within each bar, the counts for each of the status levels are given.

---
layout: false

# Stacked proportion (filled) bar chart

<img src="index_files/figure-html/status-filled-1.png" width="60%" style="display: block; margin: auto;" />

???

So far we have seen raw counts. What about proportions?

You can show proportions by using a "filled" bar chart. Each bar is stretched so that it covers the entire range from 0 to 1.

Note that proportions are between 0 and 1, while percentages are between 0 and 100%.

By default, `geom_bar()` labels the *y*-axis "count", so you have to manually change the label to "proportion".

---

# Dot matrix chart

<img src="index_files/figure-html/status-matrix-1.png" width="45%" style="display: block; margin: auto;" />

---

# Mosaic plot

<img src="index_files/figure-html/status-mosaic-1.png" width="60%" style="display: block; margin: auto;" />

---
layout: true

# Line plot

---

<img src="index_files/figure-html/forms-line-1.png" width="60%" style="display: block; margin: auto;" />

---

<img src="index_files/figure-html/forms-point-1.png" width="60%" style="display: block; margin: auto;" />

---

<img src="index_files/figure-html/forms-line-point-1.png" width="60%" style="display: block; margin: auto;" />

---

<img src="index_files/figure-html/gest-line-1.png" width="60%" style="display: block; margin: auto;" />

---

<img src="index_files/figure-html/gest-line-facet-1-1.png" width="60%" style="display: block; margin: auto;" />

---

<img src="index_files/figure-html/gest-line-facet-2-1.png" width="60%" style="display: block; margin: auto;" />

---
layout: false
layout: true

# Connected dots plot

---

<img src="index_files/figure-html/gest-conn-1-1.png" width="60%" style="display: block; margin:
auto;" />

---

<img src="index_files/figure-html/gest-conn-2-1.png" width="60%" style="display: block; margin: auto;" />

---
layout: false
layout: true

# Strip chart

---

<img src="index_files/figure-html/pol-strip-f0-1.png" width="60%" style="display: block; margin: auto;" />

---

<img src="index_files/figure-html/pol-strip-hnr-1.png" width="60%" style="display: block; margin: auto;" />

---
layout: false
layout: true

# Density plot

---

<img src="index_files/figure-html/pol-dens-1-1.png" width="60%" style="display: block; margin: auto;" />

---

<img src="index_files/figure-html/pol-dens-2-1.png" width="60%" style="display: block; margin: auto;" />

---

<img src="index_files/figure-html/pol-dens-3-1.png" width="60%" style="display: block; margin: auto;" />

---

<img src="index_files/figure-html/pol-dens-4-1.png" width="60%" style="display: block; margin: auto;" />

---
layout: false
layout: true

# Violin plot

---

<img src="index_files/figure-html/pol-vio-1-1.png" width="60%" style="display: block; margin: auto;" />

---

<img src="index_files/figure-html/pol-vio-2-1.png" width="60%" style="display: block; margin: auto;" />

---

<img src="index_files/figure-html/pol-vio-3-1.png" width="60%" style="display: block; margin: auto;" />

---

<img src="index_files/figure-html/pol-vio-4-1.png" width="60%" style="display: block; margin: auto;" />

---
layout: false
layout: true

# Scatter plot

---

<img src="index_files/figure-html/pol-sca-1-1.png" width="60%" style="display: block; margin: auto;" />

---

<img src="index_files/figure-html/pol-sca-2-1.png" width="60%" style="display: block; margin: auto;" />

---

<img src="index_files/figure-html/pol-sca-3-1.png" width="60%" style="display: block; margin: auto;" />

---

<img src="index_files/figure-html/pol-sca-4-1.png" width="60%" style="display: block; margin: auto;" />

---

<img src="index_files/figure-html/pol-sca-5-1.png" width="60%" style="display: block; margin: auto;" />

---

<img src="index_files/figure-html/mald-1-1.png" width="60%"
style="display: block; margin: auto;" />

---
layout: false
class: center middle reverse

# DO'S AND DON'TS

---
layout: true

# DO

---

<img src="index_files/figure-html/mald-bar-1-1.png" width="60%" style="display: block; margin: auto;" />

???

Bar charts should be used for discrete numeric variables, not for continuous variables.

---

<img src="index_files/figure-html/mald-bar-2-1.png" width="60%" style="display: block; margin: auto;" />

???

If you want to show proportions instead of raw counts, use proportion bar charts (also known as filled bar charts).

---

<img src="index_files/figure-html/mald-bar-3-1.png" width="60%" style="display: block; margin: auto;" />

???

To show proportions from multiple subjects/items, use strip charts.

---
layout: false

# DON'T

<img src="index_files/figure-html/mald-dont-1.png" width="60%" style="display: block; margin: auto;" />

???

Never ever ever use bar charts with error bars to show mean proportions.

They are misleading:

- The bars do not indicate discrete numeric values: mean proportions are continuous variables.

- Error bars mask the true variability of the data: show raw proportions instead.

For more see: <https://www.data-to-viz.com/caveat/error_bar.html>, <https://stats.stackexchange.com/questions/349422/does-it-make-sense-to-add-error-bars-in-a-bar-chart-of-frequencies/367889#367889>

---

# DO

<img src="index_files/figure-html/pol-do-1.png" width="60%" style="display: block; margin: auto;" />

???

For continuous variables, like acoustic measures or reaction times, use violins with overlaid strip charts.

You can include very narrow box plots, but remember that box plots mask variability in the raw data.

---

# DON'T

<img src="index_files/figure-html/pol-dont-1.png" width="60%" style="display: block; margin: auto;" />

???

Can you see what difference it makes to use box plots only?
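
The previous DO slide recommends violins with overlaid strip charts. A minimal sketch of that combination, assuming ggplot2 and a simulated data frame `dat` with hypothetical columns `group` and `value` (not the data plotted in these slides), might look like this:

```r
library(ggplot2)

# Simulated data for illustration only (assumption: not the slides' data).
set.seed(1)
dat <- data.frame(
  group = rep(c("A", "B"), each = 50),
  value = c(rnorm(50, mean = 200, sd = 20), rnorm(50, mean = 220, sd = 35))
)

p_violin <- ggplot(dat, aes(x = group, y = value)) +
  # Violin: shows the shape of the distribution in each group.
  geom_violin() +
  # Overlaid strip chart: one point per observation, jittered
  # horizontally only so the plotted values stay unchanged.
  geom_jitter(width = 0.1, height = 0, alpha = 0.5) +
  # Optional very narrow box plot; outliers are hidden because the
  # raw points are already drawn by the strip chart.
  geom_boxplot(width = 0.05, outlier.shape = NA)

p_violin
```

Dropping the `geom_violin()` and `geom_jitter()` layers leaves only the box plot, which is roughly the situation this DON'T slide warns about.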
---

# Summary

.bg-washed-blue.b--dark-blue.ba.bw2.br3.shadow-5.ph4.mt2[
- Carefully think about which type of variable you are working with: **continuous or discrete**?

- The type of variable allows you to select appropriate types of plots. Your **go-to plots** are:

  - Bar charts (and variants).

  - Strip charts.

  - Line plots.

  - Density plots.

  - Violin plots.

- Be mindful of the **DOs and DON'Ts** of plotting.
]
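
???

As a reminder of one bar chart variant, here is a minimal sketch of a filled (proportion) bar chart, assuming ggplot2 and a toy data frame `langs` with hypothetical columns `macroarea` and `status` (invented values, not the data used in these slides). It illustrates the point from the filled bar chart slide that the *y*-axis label has to be changed by hand.

```r
library(ggplot2)

# Toy data for illustration only: one row per (made-up) language.
langs <- data.frame(
  macroarea = c("Africa", "Africa", "Africa",
                "Eurasia", "Eurasia", "Eurasia", "Eurasia"),
  status = c("threatened", "not endangered", "not endangered",
             "extinct", "threatened", "not endangered", "not endangered")
)

p_filled <- ggplot(langs, aes(x = macroarea, fill = status)) +
  # position = "fill" stretches every bar to cover the range 0 to 1.
  geom_bar(position = "fill") +
  # Without this, the y-axis label would still read "count".
  labs(y = "Proportion")

p_filled
```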