Abstract
Invalid responses pose a significant risk of distorting survey data, compromising statistical inferences, and introducing errors in conclusions drawn from surveys. Given the pivotal role of surveys in research, development, and decision-making, it is imperative to identify careless survey respondents. The existing literature on this subject comprises two primary categories of approaches: methods that rely on survey items and methods involving post hoc analyses. The latter, which do not demand preemptive preparation, predominantly incorporate statistical techniques or metadata analysis aimed at identifying distinct response patterns that are associated with careless responses. However, several inherent limitations hinder the precise identification of careless respondents. One notable challenge is the lack of consensus concerning the thresholds to use for the various measures. Furthermore, each method is designed to detect a specific response pattern associated with carelessness, so different methods can lead to conflicting outcomes. In this article, we seek to assess the efficacy of the existing methods using a novel survey methodology encompassing responses to both meaningful scales and meaningless gibberish scales, the latter of which compel respondents to answer without considering item content. Using this approach, we propose the application of machine learning to identify careless survey respondents. Our findings indicate that supervised machine learning combined with unique gibberish data is a potent method for identifying careless respondents, aligning with and outperforming other approaches in terms of effectiveness and versatility.
| Original language | English |
|---|---|
| Article number | 25152459251378420 |
| Journal | Advances in Methods and Practices in Psychological Science |
| Volume | 8 |
| Issue number | 4 |
| State | Published - 1 Oct 2025 |
Bibliographical note
Publisher Copyright: © The Author(s) 2025. This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access page (https://us.sagepub.com/en-us/nam/open-access-at-sage).
Keywords
- careless responding
- data cleaning
- gibberish scales
- insufficient effort responding
- invalid response
- meaningless scales
- open data
- open materials
- preregistration
Fingerprint
Dive into the research topics of 'Identifying Careless Survey Respondents Through Machine Learning Using Responses to a Gibberish Scale'. Together they form a unique fingerprint.