Monday, July 09, 2007

Astroglide Data Loss Could Result In $18 Million Fine

[scroll way down for a spreadsheet containing numbers of Astroglide requests per state]

Executive Summary

In April 2007, Biofilm Inc. accidentally published on the Internet the names and addresses of over 200,000 customers who had requested a free sample of their popular sex lubricant Astroglide. This blog post highlights the fact that the leaked data could serve as highly effective bait for targeted phishing attacks and other kinds of scams. A full breakdown of the number of requests from each state is released. These numbers are then used to estimate potential fines against Biofilm should state Attorneys General wish to get involved.


Privacy is a strange beast. It is one of our "rights" least well defined and protected by the law. The U.S. Constitution contains no express right to privacy. Likewise, data protection is something that has yet to be properly addressed by US law.

Consumers regularly surrender their personal information to random strangers in return for t-shirts and teddy bears as credit card sign-up bonuses. Similarly, many consumers permit the tracking of individual items in their supermarket purchases by companies in return for modest discounts or "points" through loyalty schemes.

Data protection and privacy become far more important when they relate to personal and sexual information. Most consumers would probably be more concerned about someone else gaining access to the order info for their Good Vibrations (an online seller of marital aids) account than for their past book purchases from Amazon.

Likewise, when Congress rushed to pass extremely pro-privacy restrictions on the release of video-rental records in 1988, it was not because they were concerned about tabloid journalists learning how many times a particular Senator had rented Citizen Kane.

A Slippery Problem

The main subject of this blog post relates to a data loss/accidental release by a California company named Biofilm, Inc. They are the makers of Astroglide, a popular sexual lubricant.

For most of April 2007, a database of names and addresses of individuals who had requested free samples of Astroglide was inadvertently left unprotected on the company's website. In addition to random visitors being able to access the database, Google's search engine spider software made copies of the database - cached copies of which continued to be available online from Google's site for more than a week after Astroglide removed the data from their own website.

Within hours of Wired News picking up the Astroglide story, fellow Indiana University PhD student Sid Stamm and I began frantically downloading all the data from Google's cache. The leaked Astroglide database contains the names and addresses of individuals who requested a sample between 2003 and 2007. With a bit of effort to clean out duplicate entries, we soon had a database of just over two hundred thousand unique names and addresses.

I've been struggling to come up with an interesting, useful and ethical way to use this data. While the obvious Yahoo Maps mashup is amusing (and scarily mind blowing), it's just not fair to the people who gave Astroglide their data in good faith. They do not deserve to have their privacy violated and abused more than they have already suffered. The screenshot posted at the top of this blog post is real - but out of respect to the people in the database, I will not be putting the mashup online.

More Than Just Embarrassment

There is almost no chance that the Astroglide data could be used to steal someone's identity. Unfortunately, the data loss laws passed by the various states only really have identity theft in mind, and so they did not kick in in this incident. This is primarily because the exposed data does not match the strict definition of PII (personally identifiable information): in this case, no social security, credit card or other account numbers were revealed.

Adam Shostack is quite vocal about his belief that data breaches/data loss incidents are not just about identity theft. He writes that "[Data Breaches] are about honesty about a commitment that an organization has made while collecting data, and a failure to meet that commitment."

My immediate reaction and concern when reading about the Astroglide incident was, "how embarrassing." Yes, it would be quite unpleasant for the people in the database if their colleagues, friends and attendees of their church learned that they had requested a sexual lubricant. Having this information come up in a Google search for the person's name could even pose a problem during some job interviews.

The Astroglide incident is bigger than just the issue of embarrassment. The smallest bit of information about an individual can serve as a vehicle for targeted phishing and other kinds of fraud. I discussed this with Prof. Markus Jakobsson and he came up with two fantastic examples of scams that could use this data.
  • A version of the Spanish lottery scam with a spear phishing touch: A would-be phisher could send a postcard to each name on the list, advising them that since they are fans of the product, they were enrolled in an online lottery - and that they have won. All that they need to do is to go online to claim their winnings.

  • A class action version of the Nigerian 419 scam: A swindler could send a postcard to victims, notifying them of the data loss, and stating that they have been invited to join a class action lawsuit against Biofilm/Astroglide. The victim would be told that they will receive several hundred dollars as part of the settlement, and all that they need to do to claim their share is to fill out the postcard with their banking details and send it off.

These and other similar attacks would be much easier (and cheaper for the attacker) if they could be conducted by email. Turning each of the 200,000 names and addresses into a valid email address is not an easy task - thankfully. This at least raises the cost of any attempted scam to the cost of a stamp for each potential victim.

A few months ago, I highlighted an incident at Indiana University where phishers were able to obtain a list of valid email addresses for IU students. They were then able to use this list, which consisted solely of users' names and email addresses, to launch a highly successful spear phishing attack against the IU Credit Union.

Likewise, my colleagues in the Stop Phishing Research Group at Indiana University have conducted several targeted phishing studies that have clearly demonstrated the impact that even the smallest bit of accurate information on a user can have on the effectiveness of a phishing attack. Simply put, anything that is known about people can be used to win their trust. Such insights are used to improve consumer education in the recent effort

I suspect that most phishing attacks against credit unions and small regional banks already involve some form of data breach/loss. The economics of phishing simply do not add up otherwise - a phisher would be far better off claiming to be Citibank/Chase if they are sending out an email to 3 million randomly collected email addresses. I predict that we'll see a lot more of these kinds of phishing attacks. However, because notification won't be required in data loss incidents where social security or credit card numbers are not lost, the public will not be told how the phishers got their target list.

Phishers are constantly evolving their techniques. As in-browser anti-phishing technology becomes the norm, and spam filters mature, we will likely see a shift towards more targeted phishing. These attacks involve far fewer email messages, and are thus more likely to stay below the radar of the anti-phishing blacklist teams at Google, Microsoft and Phishtank. While data loss/breach incidents involving social security numbers of course pose an identity theft risk, the risk of this information being used for phishing and other scam attacks is currently being completely overlooked.

The solution to this, of course, is to amend the data breach/loss notification laws to apply when any customer information is lost or released to unauthorized parties. Companies will fight this, citing the high cost of notification and a desire to avoid needlessly worrying their customers. The laws will stay the same, and phishers will laugh all the way to the bank.

Could Biofilm/Astroglide be fined?

Contrast the Astroglide data loss to a completely separate yet similar incident:

Between August and November of 2002, the order information (name, address, items purchased) for over 560 customers was available to any curious visitor on the website of American underwear retailer Victoria's Secret. This was due to a web security snafu, which was soon fixed after it was reported. The following year, New York Attorney General Eliot Spitzer negotiated a settlement with Victoria's Secret, in which the company agreed to pay the state of New York $50,000 as well as notifying each customer whose data was inadvertently made available online. The New York Times had a full write up of the story online.

I think it's really useful to compare the two different cases. In both, data was accidentally put on the Internet. Neither dataset contained credit card numbers, social security numbers, or what we would usually think of as PII. As such, the various state data breach/loss laws didn't kick in.

However, while the data lost (name and address) wasn't particularly sensitive - after all, in many cases, it can be looked up in the phone book - it is the combination of that data with a highly sensitive and sexual product which would give the average consumer a legitimate cause for concern.

Victoria's Secret agreed to notify every customer whose data was accidentally put online. Astroglide has not told a single customer. Victoria's Secret agreed to pay $50,000 to the state of NY for about 560 customers, although only 26 of them were actually NY residents. Astroglide has not paid a single penny to any state as a result of this incident.

I think Biofilm should be held accountable for the accidental publication of the names and addresses of 200,000 customers. To remedy this, I have spent quite a bit of time over the past couple weeks filing complaints with numerous state Attorneys General, including the notoriously pro-privacy AGs in California and New York. I have filed a complaint with the Federal Trade Commission. A few hundred overseas consumers tried to get Biofilm to send them a sample by airmail. Thus, I'm working with The Canadian Internet Policy and Public Interest Clinic to file a complaint with the appropriate Canadian authorities. I've also already filed complaints with the data protection agencies in the UK, Ireland, Belgium, The Netherlands and Finland.

A wise lawyer has informed me that the ultimate way to kickstart things is to find a California resident victim, and have that person file an action under California Business and Professions Code Section 17200. My name is not in the database and I do not live in California. Furthermore, I do not feel that it would be ethical to go through the list of 17 thousand California residents, looking them up in Google, hopefully finding an email address, and then contacting those individuals to ask them to file a complaint. Thus, as much as I'd like to get a Business and Professions Code complaint filed against Biofilm, my hands are currently tied.

There are two ways to judge the cost of data loss per customer for Victoria's Secret. $50,000 divided by 26 New York residents equals approximately $1,923 per customer. However, given that no other state fined Victoria's Secret, it is probably safer to divide the $50,000 fine by all 560 customers, which gives us a fine of approximately $90 per customer.

Using that $90 per customer figure, I decided to figure out how large of a fine Astroglide could potentially face, assuming of course, that one or more state Attorneys General began investigating.
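The arithmetic behind these figures can be sketched in a few lines of Python. This is only an illustration of the estimate: the 200,000 count is the rounded size of the de-duplicated database mentioned earlier, and the function and variable names are my own.

```python
# Victoria's Secret settlement figures, as reported above.
SETTLEMENT_USD = 50_000
NY_CUSTOMERS = 26
ALL_CUSTOMERS = 560

# Rounded size of the de-duplicated Astroglide database (an approximation).
ASTROGLIDE_CUSTOMERS = 200_000


def fine_per_customer(settlement, customers):
    """Return the settlement spread evenly over the given customers."""
    return settlement / customers


per_ny_resident = fine_per_customer(SETTLEMENT_USD, NY_CUSTOMERS)
per_customer = fine_per_customer(SETTLEMENT_USD, ALL_CUSTOMERS)

# Round the more conservative figure to the nearest $10, as in the text.
rounded_per_customer = round(per_customer, -1)
estimated_fine = rounded_per_customer * ASTROGLIDE_CUSTOMERS

print(f"Per NY resident: ${per_ny_resident:,.0f}")
print(f"Per customer:    ${per_customer:,.2f}")
print(f"Estimated fine:  ${estimated_fine:,.0f}")
```

The estimate scales linearly, so a different per-customer figure or a different customer count can be swapped in to see how sensitive the total is to either assumption.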

I pulled per-state stats from the database - which are broad enough that I feel confident that I can release them without putting any individual user's privacy at risk. Using state population estimates from the US Census Bureau, I was also able to calculate a ratio for the number of people in each state per Astroglide request. As much as I was hoping that KY (Kentucky) would win - I could already visualize the Fark headline - North Dakota won, with one Astroglide sample request per 908 state residents. New Mexico came in "last" with one request per 2656 state residents. Analysis of what these numbers actually mean is an exercise best left to the reader.
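For anyone who wants to reproduce the ratio from the spreadsheet, the calculation is a single division per state. The sketch below uses made-up placeholder figures ("State A" and "State B" are hypothetical), not values from the database or the Census Bureau.

```python
def residents_per_request(population, requests):
    """Return how many state residents there are per sample request."""
    if requests <= 0:
        raise ValueError("requests must be positive")
    return population / requests


# Hypothetical placeholder figures, for illustration only.
example_states = {
    "State A": {"population": 1_000_000, "requests": 1_000},
    "State B": {"population": 5_000_000, "requests": 2_000},
}

ratios = {
    name: residents_per_request(s["population"], s["requests"])
    for name, s in example_states.items()
}

# A lower ratio means more requests per capita, so sort ascending.
ranked = sorted(ratios.items(), key=lambda item: item[1])
for name, ratio in ranked:
    print(f"{name}: one request per {ratio:,.0f} residents")
```

Sorting in ascending order puts the state with the most requests per capita first, which is the sense in which a state "wins" above.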

While it may not be realistic to expect Biofilm to pay $18 million in fines ($90 for each of roughly 200,000 customers), it's quite surprising that they've been able to get away without even having to notify all of their customers. My hope is that by putting this limited bit of information online, I can start a debate on this issue.


This blog post will hopefully raise the profile of the Astroglide data loss incident, which unfortunately disappeared from the headlines after a day or two without Biofilm being held accountable for the massive breach of customer trust. It should also highlight the fact that once data has been cached by Google, putting the proverbial genie back in the bottle is next to impossible. If two PhD students can pull a copy of the database from Google's servers, so can malicious parties, including would-be phishers. It is perfectly reasonable to expect that multiple copies of the database were downloaded before Google heeded Biofilm's request, a few days later, and removed the data from its cache servers. Likewise, it is quite reasonable to expect that at least one of the downloaders has criminal intentions - or at least a willingness to sell the data on to others.

Consumers in the database face more than just embarrassment. To minimize the risk associated with phishing and other scam attacks, Biofilm should be forced to notify each of the 200,000+ exposed individuals. The take-home lesson from all of this is that these kinds of data loss incidents will continue to occur in the future and it's highly unlikely that consumers will be told. Existing data breach/data loss laws have been narrowly focused to target the threat of identity theft, a noble goal, but by no means the only threat that consumers face. These laws should be amended to correct this problem. Consumers have a right to be told whenever their information is inadvertently released to unauthorized parties.


Anonymous said...

What I'm not clear about is why the data is embarrassing. Do people get embarrassed to admit they have children because that reveals that they've been having sex?

Anonymous said...

@anonymous - That's just silly. Kids come from hospitals, which is why I stay far away from them.

Christopher Soghoian said...

@anonymous 1:

What happens when someone creates a mashup of the astroglide database with state marriage records?

Suddenly, you have a nice big list of people who are engaging in sex acts before marriage. Tsk tsk. What would your grandparents think?

Anonymous said...

what is really embarrassing is the use of yahoo maps. eww.

Unknown said...

My grandparents know I fuck. My girlfriend's grandparents know that she fucks me. I don't think anyone who knows us would really be surprised. Hell, we might be on your list -- we live in Bloomington -- but I couldn't care.

You have good intentions, Chris. Try directing your energy at something where you can make a difference.

Anonymous said...

My grandparents know I fuck. My girlfriend's grandparents know that she fucks me. I don't think anyone who knows us would really be surprised. Hell, we might be on your list -- we live in Bloomington -- but I couldn't care.

Cool. Please provide more information and I'll create a website in tribute to you and your girlfriend's intimate moments.

Anonymous said...

The fact that somebody's relatives know that you get laid is not the point. The point is that potentially embarrassing information on the net was released without consent and privacy was violated. I am sure there are plenty of people that would rather not be on that list.

Anonymous said...

I think this post is more symptomatic of a problem in American society than anything else.

It does not say anywhere that Biofilm published the data willfully. It is a fuck up, fuck ups do happen. Nobody died, and if you have people anal enough to look up the name of the people on the list so that they blacklist them from their circle of friends, it begs the question of why you would want to know people like that in the 1st place.

As for the phishing problem, as you say yourself, it is not that simple and phishers have plenty of other ways.

In any case, if you are stupid enough to give away your personal informations on the back of a postcard, you deserve all you will get.

It has nothing to do with you, but you still get involved under some spurious and self righteous reasons. I think the problem lies with you.

Unknown said...

the use of the phrase "marital aids" is offensive. They are _sex toys_ my friend. or "sexual aids" if you must shy away from the obvious (and imho accurate) connotation of fun.

To imply that they are only for married people implies that sex is only for married people. I strenuously object that we go quietly along with the puritanical idiocy that pretends that sex outside of marriage is not the usual -and healthy occurrence.

Anonymous said...

People who have premarital sex are idiots. They say it's "natural," but don't want to deal with the NATURAL consequence: pregnancy. If this weren't the NATURAL consequence, we'd have to take a pill to GET pregnant, not to stay childless. Use your brains for once, instead of your nuts.

Anonymous said...

Utah has a pretty steamy ratio ;>