Sunday, December 12, 2010

k-Anonymity for Our Public Lives

I was just reading a paper that proposes a way to provide anonymity for location-based services, and it really got me thinking about priorities in information security and privacy. The authors are in the College of Computing at the Georgia Institute of Technology, and they cite information leakage and a slippery slope toward a George Orwell-inspired, 1984-style world where large amounts of location-based information, combined with public records, provide a complete picture of the individual.

Take, for example, a particular mobile device that always looks for places to eat near a rehab hospital. That same device begins long trips (GPS data) from a particular (home) address, and that home address is linked to a name in the white pages. This contrived example is not bulletproof, but it illustrates how information leakage can, over time, expose a lot more about us than we originally thought.
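To make the linking step concrete, here is a rough sketch of how those three sources could be stitched together. Everything in it is made up for illustration: the device IDs, the addresses, the name, and the helper names `most_common` and `link_device` are all my own assumptions and do not come from the paper or from any real dataset.

```python
from collections import Counter

# Hypothetical, invented records. None of these values are real.
searches = [
    {"device": "dev-1", "query": "restaurants", "near": "rehab hospital"},
    {"device": "dev-1", "query": "coffee", "near": "rehab hospital"},
    {"device": "dev-1", "query": "lunch", "near": "rehab hospital"},
    {"device": "dev-1", "query": "parking", "near": "airport"},
]
trips = [
    {"device": "dev-1", "start_address": "12 Example St"},
    {"device": "dev-1", "start_address": "12 Example St"},
    {"device": "dev-1", "start_address": "99 Sample Ave"},
]
white_pages = {"12 Example St": "J. Doe"}


def most_common(values):
    """Return the most frequent value, or None if there are no values."""
    counts = Counter(values)
    return counts.most_common(1)[0][0] if counts else None


def link_device(device):
    """Join the three sources on the device ID and the inferred home address."""
    place = most_common(s["near"] for s in searches if s["device"] == device)
    home = most_common(t["start_address"] for t in trips if t["device"] == device)
    name = white_pages.get(home)  # None when the address is not listed
    return {
        "device": device,
        "frequent_place": place,
        "likely_home": home,
        "name": name,
    }


profile = link_device("dev-1")
print(profile)
```

No single record here says much on its own. It is the join across the sources that turns an anonymous device ID into a name attached to a habit.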

All of this is well and good, and I appreciate the efforts of academic researchers; however, I can't help but think that we are not addressing the real problem. If you are worried about data leakage, then you should be terrified by how readily most of us throw information about our professional and private lives into the public domain. Facebook is the prime example, and to their credit they have taken steps to remedy the data disclosure problems in older versions of their system. People just have no notion of how the things they publish can affect them later on. I even have an example where the people in question didn't even intend to provide a lot of information...I give you as an example the

Jack and Jane Smith case...

Jack and Jane Smith think they are a clever pair, because every time there is a marketing or signup sheet that asks for an email address, they list mine instead of their own. This seems innocent enough, but what they don't realize is that every piece of solicited junk that shows up in my inbox fills in more details about their personal and private lives. The fact that they have chosen to use their real names on these mailings accelerates the process even further. The really funny part is that Jack fancies himself an IT professional (don't get me started on this).

This project all got started when I got a "happy thanksgiving" email from Freedom Toyota in Hamburg, PA. That got me thinking that maybe this wasn't random internet spam, and that maybe all the emails I keep getting for Jack and Jane are substantiated in some way. Then I decided to do a bit of googling to find out more about who these people are...

Jack Smith is a 49-year-old IT guy who lives in Schuylkill Haven, PA (I have the exact address and phone number). I know when and where he went to high school (Facebook), where he has worked (LinkedIn), and lots of seemingly benign things like his kids' names and relatives (mom?).

Jane Smith is the 45-year-old wife of Jack and lives with her husband in a modest home in a relatively urban area. Not to be outdone by her husband, she also provides lots of personal information on her website.

They have some kids, and here is where things aren't quite so clear, probably as a result of multiple marriages and kids with former spouses, but they definitely have a kid named Jen. What is interesting is that Jen is probably not Jane's daughter, because the name Donna pops up quite a bit. Jen lists a number of siblings on her Facebook page who don't show up on Jack's, Jane's, or Donna's pages (family relationships can be quite complex). But what is clear is that we know who Jen's main crush is. She is also quite the aspiring photographer and fond of the UK...

If you still think all of this is pretty innocent, then let me propose the following...I know names, addresses, phone numbers, mothers, daughters, maiden names, high schools, cities and dates of birth, and what kind of house and cars they own. These sound a lot like the types of security questions that get asked when you call a customer service desk, right? Further, all of this was acquired without spending a dime of my money. I can't fathom the information I would get by paying a few dollars for background and records searches, or by digging into the parents! This could all lead to identity theft or worse.

When I got to this point I realized that I should go back and change the names...If there is a real Jack and Jane Smith, then you know that your name is just too common and generic, and you should have expected this or at least seen it coming...

A broader perspective...

The point is that there is no point in devising intricate and complex algorithms for hiding personal information if people are going to be stupid and give it away. This is a new area of concern for adults, but what will it mean for our kids, whose photos and lives have been the subject of baby blogs and Twitter feeds since before they were even born? Will the definition of privacy change, as the founder of Facebook claims?

In design/engineering you are taught to tackle the problem that has the greatest weight (opportunity for improvement) first. Said another way...if you are drowning and holding onto an anvil, let go of that before you attempt to save yourself by emptying your pockets of loose change!

PS: I have no intention of releasing the information about the Smiths, but I do intend to continue collecting information about them for as long as they are dumb enough to use my email address for marketing signups and spam-prone forms...