Work of the week: The Parable of Google Flu: Traps in Big Data Analysis

This paper’s[1] release in the spring of 2014 made a big splash, drawing media attention from outlets including The New York Times[2], NPR[3], Slate[4], and The Guardian[5]. In it, David Lazer and his collaborators question the accuracy of Google Flu Trends (GFT), a highly celebrated algorithm that uses Google search queries to estimate flu activity in real time, and which, since its development in 2008[6], has become a poster child for the promise of big data. The authors are not the first to criticize GFT[7], nor are they the first to receive media attention for it (1). Their argument’s value lies in its scope. By using GFT as an example, the authors send out a broader plea that big data analysts carry out their work with transparency, that they balance their analyses with traditional sources, and that they incorporate a certain finesse into their algorithms, so that the algorithms remain accurate even when the methods of data generation change.

Lazer et al. make the important assertion that the size of our data sets can never replace good, robust statistical thinking, though the former can greatly enhance the latter’s capabilities. The creators of GFT knew this, but in the excitement that followed its development, there was, and still remains, the risk of others proceeding without due caution. Those of us who make daily use of large data sets must remember that big data is powerful, but won't walk on water. 

(1) This Nature News article[8] from 2013 received a similar cascade of popular media responses. 


[1] Lazer, D., Kennedy, R., King, G., & Vespignani, A. (2014). Big data. The parable of Google Flu: traps in big data analysis. Science (New York, N.Y.), 343(6176), 1203–5. doi:10.1126/science.1248506

[2] Lohr, S. (2014, March 28). Google Flu Trends: The Limits of Big Data. New York Times.

[3] Harris, R. (2014, March 13). Google’s Flu Tracker Suffers From Sniffles. NPR.

[4] Auerbach, D. (2014, March 19). The Mystery of the Exploding Tongue: How Reliable is Google Flu Trends? Slate.

[5] Arthur, C. (2014, March). Google Flu Trends is no longer good at predicting flu, scientists find. The Guardian.

[6] Ginsberg, J., Mohebbi, M. H., Patel, R. S., Brammer, L., Smolinski, M. S., & Brilliant, L. (2009). Detecting influenza epidemics using search engine query data. Nature, 457(7232), 1012–4. doi:10.1038/nature07634

[7] Olson, D. R., Konty, K. J., Paladini, M., Viboud, C., & Simonsen, L. (2013). Reassessing Google Flu Trends data for detection of seasonal and pandemic influenza: a comparative epidemiological study at three geographic scales. PLoS Computational Biology, 9(10), e1003256. doi:10.1371/journal.pcbi.1003256

[8] Butler, D. (2013). When Google got flu wrong. Nature, 494(7436), 155–6. doi:10.1038/494155a

The respiratory illness season is off to an eventful start

Recent reports describe an outbreak of Enterovirus 68, which in mid-August caused some cases of severe respiratory illness in Missouri and Illinois [1], and which may be the cause of a spike in respiratory illness across the Midwest region of the United States. As of September 5th, the Children's Hospital Colorado had treated over 900 cases of severe respiratory illness in three weeks [2]. This NPR blog provides a good overview of the disease [3]. The Centers for Disease Control and Prevention have also published an informative fact sheet [4]. Little surveillance information is available, since the disease is "not a reportable disease in the United States" (see [4]). Most cases affect children under 5, especially those with pre-existing respiratory conditions such as asthma. As there is no specific treatment or vaccine, prevention and early response are the best safeguards.

Keeping an eye on the Enterovirus outbreak could be particularly important with the coming influenza season; recent research suggests that, at least with the H1N1 pandemic influenza strain, co-infection with other respiratory viruses can predispose a patient toward substantially more severe disease [5].


[1] Claire M. Midgley, Mary Anne Jackson, Rangaraj Selvarangan, George Turabelidze, Emily Obringer, Daniel Johnson, B. Louise Giles, Ajanta Patel, Fredrick Echols, M. Steven Oberste, W. Allan Nix, John T. Watson, Susan I. Gerber. (2014) Severe Respiratory Illness Associated with Enterovirus D68 — Missouri and Illinois, 2014. Morbidity and Mortality Weekly Report, 63(Early Release): 1-2.

[2] Electra Draper. (5 Sept. 2014) Colorado children's hospitals see spike in severe respiratory illness. The Denver Post.

[3] Nancy Shute. (8 Sept. 2014) CDC warns of fast-spreading enterovirus afflicting children. NPR.


[5] Frank P. Esper, Timothy Spahlinger, Lan Zhou. (2011) Rate and influence of respiratory virus co-infection on pandemic (H1N1) influenza disease. Journal of Infection, 63 (4): 260–266. doi: 10.1016/j.jinf.2011.04.004



Why I got a flu shot

While visiting Seattle a few months ago, a friend asked me why she should get a flu shot. "I've never gotten the shot. I've only had the flu a few times, and when I did, it wasn't bad. I just don't see the point."

"It's true," I replied. She's a generally healthy twenty-four-year-old, and, statistically speaking, has very little to fear from the flu. "Chances are getting the flu won't affect you much. But, if you get infected, and then pass it on to someone who's over 75 with weak lungs - that could be really, really awful."

She was silent for a moment.

"Sorry," I said. "I don't mean to be harsh."

"No," she replied, "I see your point. I'll get the shot as soon as I get back home."

We young adults often see little point in sticking ourselves with a needle each year for just a 60% reduction in the risk of getting the flu. On top of the momentary discomfort and the cost (mine was about $30; see below for resources to find reduced-price and free immunizations), a quick web search turns up all kinds of alarming pages devoted to exposing the supposed harms of vaccination. It's enough to make an otherwise-healthy person decide to keep their money and take their chances.

For me, though, getting the flu vaccine isn't about myself. I don't hope to disprove or discount the fears that people have about vaccines, but, given the extensive reading I've done and my own personal experience, I've decided that getting the vaccine is worth it. The vaccine tends to be least effective in the people who are at highest risk of developing severe complications from influenza, namely the 65+ crowd. If my getting vaccinated can save them a trip to the hospital or worse, then I see it as my duty to get the vaccine. Zooming out to a population scale, some recent work [1] [2] [3] suggests that school-age children and young adults tend to drive influenza epidemics in their greater communities, so it's likely that wide vaccination coverage for this age group - the one least likely to suffer acute symptoms - will, somewhat counter-intuitively, provide the most effective prevention against severe cases and large epidemics.

That's not to say that the individual benefit of vaccination for young adults should be overlooked. While seasonal flu can and does cause severe complications in young adults every year, pandemic flu tends to disproportionately affect school-age children and young adults, often with uncommonly severe symptoms [4] [5]. Vaccination can help in such cases, even if the vaccine is mismatched to the pandemic strain; Garcia-Garcia et al. note that the general 2009 seasonal flu vaccine likely provided some immunity against the H1N1 pandemic strain, particularly against severe forms [6]. That's important, because we often forget that symptoms can be quite severe; I had no idea that the influenza virus could attack heart cells until I met someone to whom it had happened, and who, in her early twenties, suffered (and thankfully survived) cardiac arrest as a result.

Flu is serious business. When weighing whether or not to get the vaccine this year, I implore you to think both of yourself and of the others who might be impacted by your decision. The CDC provides lots of great tools for those interested in reading more about the flu, including information about the virus, vaccination, and tracking. 

Take care of yourselves, and each other, this flu season.

Resources for cheap or free flu shots:


[1] Dennis L. Chao, M. Elizabeth Halloran, and Ira M. Longini. (2010) School opening dates predict pandemic influenza A(H1N1) outbreaks in the United States. J Infect Dis. 202 (6): 877-80. doi:10.1086/655810

[2] Gog JR, Ballesteros S, Viboud C, Simonsen L, Bjornstad ON, et al. (2014) Spatial Transmission of 2009 Pandemic Influenza in the US. PLoS Comput Biol 10(6): e1003635. doi: 10.1371/journal.pcbi.1003635

[3] Dena Schanzer, Julie Vachon, and Louise Pelletier. Age-specific Differences in Influenza A Epidemic Curves: Do Children Drive the Spread of Influenza Epidemics? Am. J. Epidemiol. (2011) 174 (1): 109-117. doi:10.1093/aje/kwr037

[4] Karageorgopoulos DE, Vouloumanou EK, Korbila IP, Kapaskelis A, Falagas ME (2011) Age Distribution of Cases of 2009 (H1N1) Pandemic Influenza in Comparison with Seasonal Influenza. PLoS ONE 6(7): e21690. doi: 10.1371/journal.pone.0021690 

[5] Lone Simonsen, Matthew J. Clarke, Lawrence B. Schonberger, Nancy H. Arden, Nancy J. Cox, and Keiji Fukuda. Pandemic versus Epidemic Influenza Mortality: A Pattern of Changing Age Distribution. J Infect Dis. (1998) 178 (1): 53-60. doi:10.1086/515616

[6] Garcia-Garcia Lourdes, Valdespino-Gómez Jose Luis, Lazcano-Ponce Eduardo, Jimenez-Corona Aida, Higuera-Iglesias Anjarath, Cruz-Hervert Pablo, et al. Partial protection of seasonal trivalent inactivated vaccine against novel pandemic influenza A/H1N1 2009: case-control study in Mexico City. BMJ 2009; 339:b3928