

But the authorities still wanted to know what we were saying. First, the Department of Homeland Security instructed the social media giants to watch their networks for chatter on crime and terrorist threats. Then influenza came around.
Google was the first to monitor our annual battle with the flu, through its now-discontinued Flu Trends service. Google cancelled the public service five years ago after announcing that the figures were spotty and unreliable. However, colleges and universities can still purchase access to the figures today. The decline of Flu Trends opened the door for social media to get into the business of influenza monitoring.
Four years ago, Stanford University ran a symposium on the state of health data mining on Twitter. The symposium found that Natural Language Processing (NLP) is good at identifying the specific medical symptoms that people complain about in their tweets. Further insight into people’s moods about disease can be gained by analyzing their sentiment. IBM provides the Tone Analyzer tool for analyzing user sentiment, and with the right expertise you could build a social media API on top of it. Scientists have already used sentiment analysis to better understand people’s feelings about drug use, such as marijuana use, but the effectiveness of sentiment analysis is limited by the short length of tweets.
No high-level engineering knowledge is required to build a health sentiment system. IBM already provides the tools.
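To make the idea concrete, here is a minimal sketch of symptom spotting and sentiment scoring on a single tweet. It is not IBM's Tone Analyzer and not a real NLP model: the word lists and the function names `find_symptoms` and `sentiment_score` are hypothetical stand-ins chosen only for illustration.

```python
import re

# Hypothetical toy lexicons; a real system would use a trained NLP model
# or a hosted service rather than hand-picked word lists.
SYMPTOMS = {"cough", "fever", "headache", "chills", "sore throat"}
NEGATIVE = {"awful", "miserable", "terrible", "worst", "sick"}
POSITIVE = {"better", "fine", "great", "recovered", "good"}


def find_symptoms(tweet: str) -> set[str]:
    """Return the symptom terms that appear as whole words in the tweet."""
    text = tweet.lower()
    return {s for s in SYMPTOMS if re.search(rf"\b{re.escape(s)}\b", text)}


def sentiment_score(tweet: str) -> int:
    """Count positive words minus negative words; below zero reads as negative."""
    words = re.findall(r"[a-z']+", tweet.lower())
    positive = sum(w in POSITIVE for w in words)
    negative = sum(w in NEGATIVE for w in words)
    return positive - negative


tweet = "Day three of this fever and sore throat, feeling miserable"
print(find_symptoms(tweet), sentiment_score(tweet))
```

A tweet that contains none of the listed words scores zero, which is one small illustration of why short messages give sentiment analysis so little to work with.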
No high-level programming skill is required to implement a health surveillance feature. You just need to train Alexa with a custom skill.
First, create a skill and train it to recognize signs of sickness, such as “cough, cough!”. Skip the response. You can add further triggers for the illness as well.
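As a rough sketch, the skill's training data could look like the interaction model built below. The invocation name, the intent name `SicknessIntent`, and the sample phrases are all hypothetical, and the sketch assumes the triggers are spoken phrases, since sample utterances are plain text. The built-in Cancel, Help, and Stop intents are included because custom skills are normally expected to carry them.

```python
import json


def build_interaction_model(invocation_name, intent_name, samples):
    """Build an Alexa interaction model with one custom intent."""
    return {
        "interactionModel": {
            "languageModel": {
                "invocationName": invocation_name,
                "intents": [
                    # Custom intent: each sample is one spoken trigger phrase.
                    {"name": intent_name, "slots": [], "samples": list(samples)},
                    # Built-in intents usually expected in a custom skill.
                    {"name": "AMAZON.CancelIntent", "samples": []},
                    {"name": "AMAZON.HelpIntent", "samples": []},
                    {"name": "AMAZON.StopIntent", "samples": []},
                ],
            }
        }
    }


# Hypothetical names and phrases, for illustration only.
model = build_interaction_model(
    "health check",
    "SicknessIntent",
    ["cough cough", "i feel sick", "i have a fever"],
)
print(json.dumps(model, indent=2))
```

Adding another trigger for the illness is then just a matter of appending one more phrase to the list of samples.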
With the skill created, you can now write a custom Lambda function that pushes each response the skill captures into the cloud.
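A Lambda function along these lines might look like the sketch below. It assumes the hypothetical intent name `SicknessIntent`, and it uses a stand-in `store_record` function that simply prints the record, which in Lambda ends up in CloudWatch Logs. A real deployment would swap in whatever cloud storage it actually uses.

```python
import json
from datetime import datetime, timezone

SICKNESS_INTENT = "SicknessIntent"  # hypothetical intent name


def store_record(record):
    """Stand-in sink: output printed inside Lambda lands in CloudWatch Logs."""
    print(json.dumps(record))


def lambda_handler(event, context, sink=store_record):
    """Record each sickness trigger, then end the session without speaking."""
    request = event.get("request", {})
    is_sickness = (
        request.get("type") == "IntentRequest"
        and request.get("intent", {}).get("name") == SICKNESS_INTENT
    )
    if is_sickness:
        sink({
            "intent": SICKNESS_INTENT,
            # Use the request's own timestamp, falling back to the current time.
            "timestamp": request.get("timestamp")
            or datetime.now(timezone.utc).isoformat(),
        })
    # No outputSpeech in the reply, so the skill stays silent.
    return {"version": "1.0", "response": {"shouldEndSession": True}}
```

The `sink` parameter exists only so the storage step can be replaced or tested locally; Lambda itself calls the handler with just the event and context.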
Alexa has always been listening to you. Amazon likely knows when you’re sick as soon as you do.

I began my career in manufacturing as a tech analyzing the origin of production problems. Then after ten years I realized that I was only investigating part of the problem. Today I am a freelance research analyst and ghostwriter for consulting companies. My work has also been published in Curious Droid and DisrupterDaily.