Description: |
When searching for adversarial activity across multiple networks, one of the greatest challenges is accurately aligning entities across different channels of information. This task becomes harder when little is known about each individual beyond a name. In this study, we analyze name rarity and how it can be used to align people across three distinct data channels: Venmo financial transactions, Reddit online discussions, and a bibliographic data source of academic writings. We explore how the uniqueness of a name can be used to decide whether a person on one network is likely the same as a person on another, in the absence of any additional ground truth. Although 100% confidence cannot be attained, this information clarifies when a possible alignment is more or less likely to refer to the same individual, increasing our confidence in accurately detecting adversarial behavioral patterns. In the data collected, 0.1% of people had the same name across data sets, and 22.5% of those names are considered rare by our threshold. We also examine the accuracy of our method and show how real names can be extracted from account usernames and compared in a similar manner. |