Wednesday, 2 September 2015

Only metadata

This is interesting. A journalist posted his phone metadata online and challenged people to find stuff out about his life. That data included:

  • Who he called and texted (in our dataset, exact phone numbers have been hidden and replaced by unique identifying codes).
  • How long each phone call lasted.
  • The time of the communication.
  • The location of the cell tower contacted when outgoing calls were initiated.
  • The location of the cell tower contacted for SMS and internet connections.

The results were hit and miss but accurate about the big stuff (and some people did very well on the little stuff as well).  Because I’m me, I’m more interested in the stuff they got wrong.  Some highlights:

  • Some people thought the journalist was partying all night on New Year’s Eve because his phone was active all night and all morning.  In fact, he was in bed before 12.  The overnight pings probably came from other people sending him Happy New Year messages, and he had a 5am shift the following day, which would account for the early-morning activity.  It’s interesting that our perception of the norm colours our expectations so completely; nobody guessed the truth.
  • Some people inferred that he’s a member of a particular golf course, which isn’t true.  This is interesting because it would likely be a very easy thing to check by ringing the place up and asking.  They might not tell you outright but there’d probably be ways to wheedle the information out of them.  I’d probably try asking them to give the guy an important message when he was next in.  People like to be helpful.  My point is that although people used other datasets to help their analysis, they apparently didn’t use social engineering.  The first thing I’d have done is work out where he works and then ring them up and ask questions.  Maybe even ring him up.  That would likely have told me at least that he works shifts, which nobody in the challenge got from the metadata alone.
  • Actually, thinking about that, it’s surprising that nobody guessed that from the data, especially since some people accurately guessed his bus route.  I haven’t looked at the data but patterns surrounding when he used the bus, ferry or drove to work seem like they’d stand out.  Perhaps people just didn’t think to look, since shift work is relatively uncommon these days.  Perhaps it was a limitation in the tools people used.  If nothing else, this might give us some insight into how to design better surveillance tools.  Which is all we need, right?
  • It was easy to identify domestic flights but much harder to guess international ones for obvious reasons.  There are ways to track down international flights without access to foreign metadata (law enforcement agencies would rarely face this problem) but they are tricky, often time-limited, and you’d probably need to be there in person.  It’s interesting to think about ways to do this.
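On the shift-work point: I haven’t looked at the data, so this is only a rough sketch of the kind of check I have in mind, not anything the challenge participants actually ran. It assumes you already have a list of timestamps for activity the phone’s owner initiated (the function names and the whole approach are my own invention for illustration). It takes the earliest activity on each day and measures how much that start time jumps around. A big spread might hint at shifts; a small one suggests a regular routine.

```python
from datetime import datetime
from statistics import pstdev


def first_activity_hours(timestamps):
    """Map each calendar date to the (fractional) hour of its earliest record."""
    earliest = {}
    for ts in timestamps:
        day = ts.date()
        if day not in earliest or ts < earliest[day]:
            earliest[day] = ts
    return {day: ts.hour + ts.minute / 60 for day, ts in earliest.items()}


def start_time_spread(timestamps):
    """Population standard deviation, in hours, of the daily first-activity time.

    Hypothetical heuristic: a large value means the day starts at very
    different times, which is one possible sign of shift work.
    """
    hours = list(first_activity_hours(timestamps).values())
    return pstdev(hours) if len(hours) > 1 else 0.0


# Made-up example data, not taken from the journalist's dataset.
regular = [datetime(2015, 1, d, 8, 0) for d in range(1, 6)]
shifty = [
    datetime(2015, 1, 1, 5, 0),
    datetime(2015, 1, 2, 14, 0),
    datetime(2015, 1, 3, 22, 0),
]
print(start_time_spread(regular), start_time_spread(shifty))
```

The New Year’s Eve example shows the obvious weakness: incoming messages at midnight would look like an early start, so you’d want to filter those out first. Late nights that run past midnight would fool it too.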

People getting stuff wrong is an aspect of privacy that a lot of people forget. I’ve written about it before in various places.  I’m out and about so can’t look up the refs just now, but I know I’ve written about an interview with a security expert.  The security expert looked at the journalist’s Foursquare check-ins and noted that he checked in often at a deli and a doctor’s surgery.  You can see how an insurance company might construct a narrative involving the journalist eating a lot of fatty deli meat and having to see his doctor often as a result.  The security expert pointed out exactly this narrative.  In fact, the doctor was the journalist’s daughter’s paediatrician.  The journalist liked to check in at the deli because hardly anyone else ever did, making him the Foursquare mayor of the place.  But the fake narrative was fairly convincing.

I’ve given a few talks about this sort of thing at conferences and other venues.  When I do this, I usually also talk about ways to control the narrative, which I’ve also written about before.  It’s an interesting subject.  Being very private can increase your risk of being misconstrued, but false trails and misinformation are surprisingly difficult to pull off convincingly and introduce new risks (that guy must have some really juicy stuff to hide).  Pre-emptive strikes, such as posting embarrassing photos before anyone else does so you can control the context, carry their own dangers.

It’s tricky and it’s what I’m convinced is at the heart of whatever privacy is.
