When are there too many collisions? Variants of the birthday problem

Research output: Contribution to journal › Article › peer-review

Abstract

Due to restrictions on the use of unique identifiers of individuals in data sets, there may be instances in which two or more data sets have some of the individuals in common, with no direct way to detect such occurrences. More generally, a collision occurs when two or more observations are in agreement with respect to variables associated with the observations. This article discusses several possible statistical/probabilistic approaches to determining when the number of collisions (or near-collisions) exceeds what would be expected by chance if in fact the observations are all distinct. The methods and results are related to the Birthday Problem and to Occupancy Problems.
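As a rough illustration of what "expected by chance" means in the classical setting, the sketch below computes the birthday-problem baseline. It is not taken from the article: it assumes `n` distinct observations falling independently and uniformly into `d` equally likely cells, and it uses a Poisson approximation for the count of agreeing pairs. The function names are hypothetical.

```python
import math


def expected_colliding_pairs(n: int, d: int) -> float:
    """Expected number of pairs that agree, assuming n observations
    fall independently and uniformly into d cells (an assumption)."""
    return math.comb(n, 2) / d


def prob_at_least_one_collision(n: int, d: int) -> float:
    """Exact classical birthday probability of at least one collision."""
    if n > d:
        return 1.0  # pigeonhole: a collision is certain
    # Sum logs of (1 - k/d) to avoid underflow for large n.
    log_no_collision = sum(math.log1p(-k / d) for k in range(n))
    return 1.0 - math.exp(log_no_collision)


def poisson_upper_tail(observed: int, mean: float) -> float:
    """P(X >= observed) for X ~ Poisson(mean), used here as an
    approximate null distribution for the number of colliding pairs."""
    cdf = sum(math.exp(-mean) * mean**k / math.factorial(k)
              for k in range(observed))
    return max(0.0, 1.0 - cdf)


if __name__ == "__main__":
    n, d = 23, 365
    mean = expected_colliding_pairs(n, d)
    print(f"expected colliding pairs: {mean:.4f}")
    print(f"P(at least one collision), exact: "
          f"{prob_at_least_one_collision(n, d):.4f}")
    print(f"P(at least one collision), Poisson: "
          f"{poisson_upper_tail(1, mean):.4f}")
```

Under these assumptions, an observed number of colliding pairs with a very small upper-tail probability would suggest more agreement than chance alone explains. Non-uniform cell probabilities or near-collisions, which the abstract mentions, would need a different null model.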

Original language: English (US)
Journal: Communications in Statistics - Theory and Methods
State: Accepted/In press - 2023

Bibliographical note

Publisher Copyright:
© 2023 Taylor & Francis Group, LLC.

Keywords

  • birthday problem
  • coincidences
  • collisions
  • occupancy problem
