Sunday, February 10, 2013

Fake it properly

The article "Are You Faking It?" is an interesting read for anyone dealing with De-Identification, specifically Pseudonymization. I cover Pseudonymization in a few articles. It is the art of modifying the identifiers on data in a way that preserves linkage over time of data about the same subject, but breaks the ability to figure out who that subject is. The process of Pseudonymization creates a pseudonym, or alias, for the subject. The process of Anonymization simply removes identifiers outright. Both are forms of De-Identification. There are special cases of De-Identification that meet specific government requirements, such as HIPAA.
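To make the difference concrete, here is a minimal sketch in Python. It is only one possible approach and is not taken from the article: it assumes a keyed hash (HMAC-SHA256) is an acceptable way to derive the alias, and the key value, the "PSN-" prefix, and the function names are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret; whoever holds it can re-create the mapping, so it must be protected.
SECRET_KEY = b"replace-with-a-protected-secret"


def pseudonymize(identifier: str, key: bytes = SECRET_KEY) -> str:
    """Return a stable alias: the same identifier always yields the same pseudonym."""
    digest = hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()
    return "PSN-" + digest[:16]  # "PSN-" is an illustrative prefix, not a standard


def anonymize(record: dict, identifier_fields: set) -> dict:
    """Return a copy of the record with the identifying fields removed outright."""
    return {k: v for k, v in record.items() if k not in identifier_fields}


visit_1 = {"patient_id": "MRN-0001", "result": "A"}
visit_2 = {"patient_id": "MRN-0001", "result": "B"}

# Pseudonymization: both visits still link to the same (unknown) subject.
linked = pseudonymize(visit_1["patient_id"]) == pseudonymize(visit_2["patient_id"])

# Anonymization: the identifier is simply gone, and so is the linkage.
stripped = anonymize(visit_1, {"patient_id"})
```

The pseudonymized visits can still be grouped by subject, while the anonymized record carries no identifier at all.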

The really useful part of "Are You Faking It?" is its guidance on the creation of pseudonyms: specifically, guidance on what NOT to do as well as what to do. The article starts by explaining some actual negatives of poorly executed faked databases, including the interesting results of poorly done database shuffling. It then explains that even faked identities need to be carefully crafted. It finishes with an explanation of the USA SSN format, and how to create fake identities whose SSN values can't possibly be real.

It is interesting that SSNs starting with 666 have been reserved and will never be issued. Given the religious sensitivity of 666, it is not too surprising. I wonder how many other identity systems make this same kind of reservation. It does open up a really great pattern for pseudonyms. Since a value in the reserved range is never associated with an actual human, there should be no problem using it as a fake identity. Further, since the 666 prefix is rather easy to see, it is a quick way to identify data that are pseudonymized.
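As a rough illustration of that pattern, the following Python sketch derives an SSN-shaped pseudonym in the 666 range and checks whether a value follows it. This is my own sketch under stated assumptions, not the article's method: the function names and the use of a keyed hash are hypothetical. Also note that the 666 range only has room for just under a million values, so a real system would need a lookup table or similar to guarantee that two subjects never collide on the same pseudonym.

```python
import hashlib
import hmac
import re

FAKE_AREA = "666"  # area number that is never issued for real SSNs
FAKE_SSN_PATTERN = re.compile(r"666-[0-9]{2}-[0-9]{4}")


def fake_ssn(real_id: str, key: bytes) -> str:
    """Derive a repeatable SSN-shaped pseudonym of the form 666-GG-SSSS."""
    digest = hmac.new(key, real_id.encode("utf-8"), hashlib.sha256).digest()
    n = int.from_bytes(digest[:8], "big")
    group = 1 + n % 99             # 01..99, skipping the unused group 00
    serial = 1 + (n // 99) % 9999  # 0001..9999, skipping the unused serial 0000
    return f"{FAKE_AREA}-{group:02d}-{serial:04d}"


def looks_pseudonymized(ssn: str) -> bool:
    """Quick check: does this value follow the 666-GG-SSSS pseudonym pattern?"""
    return FAKE_SSN_PATTERN.fullmatch(ssn) is not None


key = b"hypothetical-secret-key"
alias = fake_ssn("123-45-6789", key)  # always the same alias for this input and key
```

Because the same input and key always produce the same alias, records about one subject stay linked, and `looks_pseudonymized` gives the quick visual check described above in code form.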