Reliability of ordinal outcomes in forensic black-box studies

Hina M. Arora*, Naomi Kaplan-Damary, Hal S. Stern

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review


Forensic science disciplines such as latent print examination, bullet and cartridge case comparisons, and shoeprint analysis involve subjective decisions by forensic experts throughout the examination process. Most of the decisions involve ordinal categories. Examples include a three-category outcome for latent print comparisons (exclusion, inconclusive, identification) and a seven-category outcome for footwear comparisons (exclusion, indications of non-association, inconclusive, limited association of class characteristics, association of class characteristics, high degree of association, identification). Because the results of forensic examinations of evidence can heavily influence the outcomes of court proceedings, it is important to assess the reliability and accuracy of the underlying decisions. “Black box” studies are the most common approach for assessing the reliability and accuracy of subjective decisions. In these studies, researchers produce evidence samples, each consisting of a sample of questioned source and a sample of known source, for which the ground truth (same source or different source) is known. Examiners provide assessments for selected samples using the same approach they would use in actual casework. These studies often have two phases: the first phase comprises decisions on samples of varying complexity by different examiners, and the second phase involves repeated decisions by the same examiner on a (usually) small subset of the samples that the examiner encountered in the first phase. We provide a statistical method to analyze ordinal decisions from black-box trials with the objective of obtaining inferences for the reliability of these decisions and quantifying the variation in decisions attributable to the examiners, the samples, and statistical interaction effects between examiners and samples.
We present simulation studies to judge the performance of the model on data with known parameter values and apply the model to data from a handwritten signature complexity study, a latent fingerprint examination black-box study, and a handwriting comparisons black-box study.
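
The variance decomposition described above can be illustrated with a small simulation. The sketch below is not the authors' model: it assumes a latent continuous score built from normally distributed examiner, sample, interaction, and residual effects, which fixed cutpoints then map to ordinal categories. The function names, default standard deviations, and cutpoints are all hypothetical, and unlike a typical black-box study, every examiner-sample pair is repeated here for simplicity.

```python
import numpy as np


def simulate_black_box(n_examiners=30, n_samples=40, n_repeats=2,
                       sd_examiner=0.5, sd_sample=1.0, sd_interaction=0.3,
                       sd_residual=0.5, cutpoints=(-0.5, 0.5), seed=0):
    """Simulate ordinal decisions from an assumed latent two-way model.

    Returns an integer array of shape (n_examiners, n_samples, n_repeats)
    with categories 0..len(cutpoints).
    """
    rng = np.random.default_rng(seed)
    # Random effects; shapes are chosen so that they broadcast when summed.
    examiner = rng.normal(0.0, sd_examiner, size=(n_examiners, 1, 1))
    sample = rng.normal(0.0, sd_sample, size=(1, n_samples, 1))
    interaction = rng.normal(0.0, sd_interaction,
                             size=(n_examiners, n_samples, 1))
    residual = rng.normal(0.0, sd_residual,
                          size=(n_examiners, n_samples, n_repeats))
    latent = examiner + sample + interaction + residual
    # Cut the latent score into ordered categories
    # (for three categories: exclusion, inconclusive, identification).
    return np.digitize(latent, np.asarray(cutpoints))


def variance_shares(sd_examiner, sd_sample, sd_interaction, sd_residual):
    """Share of latent-scale variance attributable to each source."""
    variances = {
        "examiner": sd_examiner ** 2,
        "sample": sd_sample ** 2,
        "interaction": sd_interaction ** 2,
        "residual": sd_residual ** 2,
    }
    total = sum(variances.values())
    return {name: value / total for name, value in variances.items()}


decisions = simulate_black_box()
shares = variance_shares(0.5, 1.0, 0.3, 0.5)
# Crude repeatability summary: how often an examiner's repeated decision
# on the same sample matches the first decision.
repeat_agreement = float(np.mean(decisions[:, :, 0] == decisions[:, :, 1]))
```

Under these assumptions, raising `sd_residual` relative to the other components lowers `repeat_agreement`, while raising `sd_examiner` increases disagreement between examiners on the same sample.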

Original language: American English
Article number: 111909
Journal: Forensic Science International
State: Published - Jan 2024

Bibliographical note

Publisher Copyright:
© 2023 The Authors


  • Bayesian Methodology
  • Black-Box Study
  • Ordinal Decisions
  • Reliability
  • Two-way ANOVA

