17 Research cycle
In Chapter 1, you learned about the research process, which includes the research context, data acquisition, data analysis and communication. A different perspective on the research process, one that highlights the temporal succession of its steps, is the research cycle, represented in an idealised form in Figure 17.1.

The cycle starts with the development of research questions and hypotheses. This step involves a thorough literature review and the identification of the topic, research problem, goal, questions and, possibly, hypotheses (as described in Chapter 2). Once the research questions and hypotheses have been determined, the researcher proceeds with the design of the study, which sets out to answer the research questions and assess the research hypotheses. The study design process involves determining a large number of interconnected aspects, like materials, procedures, data management and data analysis plans, target population, sampling method and so on. At times, the study design process reveals shortcomings or unforeseen aspects of the research questions/hypotheses, which can then be updated accordingly.
Once the study design has been finalised, researchers proceed with acquiring data according to the protocols detailed in their plan. After data acquisition is complete, they analyse the data and interpret the results in light of the research questions and hypotheses. Finally, the outcomes of the study are published in some form and the next research cycle begins.
This all sounds very reasonable, but in reality researchers’ practice is quite different. This chapter introduces the concept of “researcher’s degrees of freedom” and describes the so-called Questionable Research Practices (QRPs). We will review literature that shows the grim reality of how common QRPs are. In ?sec-open-research, you will learn about principles and tools that are designed to help minimise the presence and impact of QRPs in one’s own research.
17.1 Researcher’s degrees of freedom
Data analysis involves many decisions, such as how to operationalise and measure a given phenomenon or behaviour, which data to submit to statistical modelling and which to exclude from the final analysis, or which inferential approach to employ. This “freedom” can be problematic because humans show cognitive biases that can lead to erroneous inferences (Tversky and Kahneman 1974). For example, humans are prone to see coherent patterns even where there are none (Brugger 2001), convince themselves of the validity of prior expectations by cherry-picking evidence (aka confirmation bias, “I knew it,” Nickerson 1998), and perceive events as having been predictable in hindsight (“I knew it all along,” Fischhoff 1975). In conjunction with an academic incentive system that rewards certain discovery processes more than others (Koole and Lakens 2012; Sterling 1959), we often find ourselves exploring many possible analytic pathways but reporting only a selected few, depending on the quality of the narrative we can build with them.
This issue is particularly amplified in fields in which the raw data lend themselves to many possible ways of being measured (Roettger 2019). Combined with a wide variety of conceptual and methodological traditions as well as varying levels of quantitative training across sub-fields, the inherent flexibility of data analysis might lead to a vast plurality of analytic approaches that can itself lead to different scientific conclusions (Roettger, Winter, and Baayen 2019). Analytic flexibility has been widely discussed from a conceptual point of view (Nosek and Lakens 2014; Simmons, Nelson, and Simonsohn 2011; Wagenmakers et al. 2012) and in regard to its application in individual scientific fields (e.g., Charles et al. 2019; Roettger, Winter, and Baayen 2019; Wicherts et al. 2016). This notwithstanding, there are still many unknowns regarding the extent of analytic plurality in practice.
Consequently, a substantial body of published articles likely presents overconfident interpretations of data and statistical results based on idiosyncratic analytic strategies (e.g., Gelman and Loken 2014; Simmons, Nelson, and Simonsohn 2011). These interpretations, and the conclusions derived from them, are thus associated with an unknown degree of uncertainty (dependent on the strength of evidence provided) and an unknown degree of generalisability (dependent on the chosen analysis). Moreover, the same data could lead to very different conclusions depending on the analytic path taken by the researcher. However, instead of being critically evaluated, scientific results often remain unchallenged in the publication record.
17.2 Questionable Research Practices

Questionable research practices (QRPs) are practices, whether intentional or not, that undermine the robustness of research (Simmons, Nelson, and Simonsohn 2011; Morin 2015; Flake and Fried 2020). They negatively affect the research enterprise, and yet they are employed (most of the time unintentionally) by a surprisingly high number of researchers (John, Loewenstein, and Prelec 2012). For each step in the research cycle, questionable practices are available to researchers; they are part of the researcher’s degrees of freedom introduced in the previous section. In this section, we will briefly review some of the most common questionable research practices identified in the literature.
Makel, Plucker, and Hegarty (2012) looked at the publication history of 100 psychological journals since 1900. They found that only 1.07% of the papers (that is, about 1 in 100) were replications of previous studies. This means that the vast majority of studies are run only once and the field moves on. As Tukey (1969, 84) said, “Confirmation comes from repetition. Any attempt to avoid this statement leads to failure and more probably to destruction”. This lack of replication attempts is problematic, given that we can’t be certain that the results obtained from a single study would replicate if the study were run again. While Makel, Plucker, and Hegarty (2012) focused on psychology, Kobrock and Roettger (2023) find that linguistics is in an even more dire situation: only 0.08% of experimental articles (1 in 1250) contain an independent direct replication.
Another issue that affects modern research regards study design, including aspects related to sample size. Several studies have found that most research employs study designs that grant only a 50% probability (i.e. a statistical power of 0.5) of detecting effects of medium size (Cohen 1962; Sedlmeier and Gigerenzer 1992; Bezeau and Graves 2001). Gaeta and Brydges (2020) find a similar scenario in speech, language and hearing research: the majority of the studies they screened did not have an adequate sample size to detect medium-sized effects.
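To make these power figures concrete, here is a minimal sketch (in Python, using statsmodels; the sample sizes and alpha level are illustrative choices, not values taken from the studies cited above) that computes the probability of detecting a medium-sized effect (Cohen’s d = 0.5) in a two-group comparison.

```python
# A minimal power sketch: probability of detecting a medium effect
# (Cohen's d = 0.5) in a two-sample t-test at alpha = 0.05.
# The sample sizes below are illustrative, not taken from the cited studies.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

for n_per_group in [20, 30, 64, 100]:
    power = analysis.power(effect_size=0.5, nobs1=n_per_group, alpha=0.05)
    print(f"n = {n_per_group:>3} per group -> power = {power:.2f}")

# Sample size per group needed to reach the conventional 80% power.
n_needed = analysis.solve_power(effect_size=0.5, power=0.8, alpha=0.05)
print(f"n per group for 80% power: {n_needed:.0f}")
```

With about 30 participants per group, the power to detect a medium effect is indeed close to the 50% reported in these surveys, whereas roughly 64 participants per group are needed to reach the conventional 80% threshold.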
In a study about the prevalence of questionable research practices, John, Loewenstein, and Prelec (2012) found that about 50% of the researchers surveyed admitted to selective reporting, i.e. reporting only some of the statistical analyses or studies conducted. Since only a fraction of the researchers who engage in a practice can be expected to admit to it, the authors combine this admission rate with an estimated admission probability and arrive at a prevalence of 100% for selective reporting (in other words, we can expect all published studies to be affected by selective reporting). They also found that about 35% of the researchers admitted to having changed the research question/hypothesis after seeing the results (or “claiming to have predicted an unexpected finding”), also known as HARKing (Hypothesising After the Results are Known, Kerr 1998). Combining this admission rate with the estimated admission probability, they arrive at a prevalence of 90%.
We will talk more about sharing research data when you learn about Open Research practices in ?sec-open-research, but note that when Wicherts et al. (2006) contacted the authors of 141 psychology articles asking them to share their research data, a worrying 73% of the authors never did. Similarly, Bochynska et al. (2023) surveyed 600 linguistics articles and found that less than 10% of them shared the data as part of the publication.
Publication bias refers to the tendency to publish “positive” results (i.e. results that indicate the presence of an effect) over “negative” ones. Fanelli (2010, 2012) found that, across disciplines, about 80% of published results are positive, with an even higher prevalence (about 90%) in fields like psychology and economics. Such a high prevalence of positive results indicates that many “negative” results (i.e. results that don’t suggest the presence of an effect) are never published, because in a neutral scenario (where researchers propose and test hypotheses in an iterative process) there should be many more negative results. Ioannidis (2005), for example, uses simple computational modelling to show that, under realistic assumptions about prior odds, statistical power and bias, a rate of true positive findings of 50% or above is very difficult to attain, and concludes that “most published research findings are false”. Relatedly, Nissen et al. (2016) also use computational modelling to show how false claims can frequently become canonised as fact when too few negative results are published. Further to these points, Scheel (2022) stresses that “most psychological research findings are not even wrong”, in that most claims made in the literature are “so critically underspecified that attempts to empirically evaluate them are doomed to failure” (Scheel 2022, 1).
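Ioannidis’s argument can be illustrated with a simple calculation of the positive predictive value (PPV) of a significant result, that is, the probability that a “positive” finding reflects a true effect given the prior probability that the tested hypothesis is true, the statistical power of the study and the alpha level. The sketch below (in Python; the scenario values are illustrative assumptions, not figures taken from the paper) runs this calculation for a few scenarios.

```python
# A minimal sketch of the calculation behind Ioannidis (2005): the positive
# predictive value (PPV) of a significant result, i.e. the probability that
# a "positive" finding reflects a true effect.
# The scenarios below are illustrative assumptions, not values from the paper.

def ppv(prior: float, power: float, alpha: float = 0.05) -> float:
    """P(effect is real | significant result), given the prior probability
    that the tested hypothesis is true, statistical power and alpha."""
    true_positives = power * prior
    false_positives = alpha * (1 - prior)
    return true_positives / (true_positives + false_positives)

scenarios = [
    (0.5, 0.8),  # well-motivated hypothesis, well-powered study
    (0.5, 0.5),  # well-motivated hypothesis, 50% power
    (0.1, 0.5),  # exploratory hypothesis, 50% power
    (0.1, 0.2),  # exploratory hypothesis, underpowered study
]

for prior, power in scenarios:
    print(f"prior = {prior:.1f}, power = {power:.1f} -> PPV = {ppv(prior, power):.2f}")
```

Under low prior odds and low power, fewer than half of the significant results reflect true effects, and factoring in bias (which Ioannidis also models) lowers the PPV further; this is what leads to the conclusion that most published findings are false.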