Google.org, which is the philanthropic arm of Google, has released Google Flu Trends to great fanfare and criticism.
Google Flu tracks searches for flu symptoms on Google’s search service. So if I type “achy headache” into Google, it might count the search as evidence that I, or someone I was caring for, had the flu. Enough people use Google for search that Google can use searches like this to track the spread of the virus across the country. The science of tracking diseases is called epidemiology.
Currently, epidemiologists use anonymized data from several sources to track outbreaks of disease. They can get data from pharmacy purchases or from emergency room visits, and they merge this data with other information such as weather patterns. Using these data sources, the Centers for Disease Control and Prevention (CDC) can get a pretty good picture of what is happening in the US regarding outbreaks of disease. It should be noted that these “traditional methods” allow the CDC to watch out for far more than just influenza. They use the system to ensure that any number of potentially catastrophic diseases do not silently spread across the planet. What is interesting about Google Flu is that it can predict flu outbreaks about two weeks earlier than the methods mentioned above.
While I think that Google Flu Trends is fascinating, I am more interested in the privacy implications. I use Gmail. I use Google Maps extensively (I make maps labelled with the cool things in my neighbourhood). Google has a photo of the front of my house on Street View. I have used Google Checkout to make purchases, so Google knows my credit card information (or did). It is pretty obvious that Google is sophisticated enough to make an educated guess that I might have influenza based on a search that I make. It is probably capable of guessing that I have HIV, or cancer, or diabetes. All of this is independent of me using their Google Health application to track even more detailed information about allergies, procedures and drugs. “Google knows” is a bloody good assumption without evidence to the contrary.
Sounds pretty scary, doesn’t it? The only reason I am even the least bit comfortable with this is Google’s corporate motto: Don’t Be Evil.
Google takes this pretty seriously. You can tell because they lose money by not offering Gmail in China, where they cannot guarantee privacy of communications. They also told the Justice Department to shove off when it asked for search histories. Both of these efforts cost them money so that they could live up to their motto.
That does not mean that I trust Google; it means that I distrust them less.
I have been an on/off critic of Dr. Peel and her Patient Privacy Rights group for quite some time. But I must applaud her recent efforts to advocate for patient privacy rights regarding Google Flu Trends.
In a move consistent with their model, Google responded to the Google Flu Trends concerns. Google specifically claims that their search data retention policy applies to the flu-related data as well. That is very good news for people like me, who tend to obsess about the details of security and privacy of health information.
-FT
You have to read Google’s response to see why I find it comforting…. First, they do not provide the CDC anything but aggregate data. Second, the data that they are using for this project -is- subject to their current internal de-identification policy.
What I meant is: why did you need comforting in the first place? Google isn’t collecting any more data than it did prior to FluTrends, and FluTrends is aggregated at the state level. In other words, I don’t see why the release of FluTrends is, in itself, anything to worry about. If you were worried about Google privacy before FluTrends, you should still be worried now, and if you weren’t, there’s no additional threat.
There are no privacy implications.
Google can tell when it’s Thanksgiving by the searches for turkey, but they can’t infer from this whether a specific individual is a vegetarian or not.
I think you have it backwards. Google is in a position to aggregate your searches. That means that it can infer that you have HIV and/or other private information, based on aggregating all of your search terms. Google definitely can see that. The question is what does Google release and what does it keep in confidence?
Now if Google provides the CDC with “dietary” search information (which is not a stretch) then the CDC “might be able to tell that it is Thanksgiving, by searches for turkey”, without knowing about the vegetarian status of a particular individual. But Google sure as hell knows.