Tracking Health on Twitter

May 04, 2015

Some of our health decisions are at least in part a product of where we live. For example, it might be more tempting to buy a fast food burger if there are few restaurants and grocery stores with healthy choices nearby. But historically it has been difficult for epidemiologists to get a handle on the factors that influence health within specific neighborhoods. In a project called Hashtag Health, Quynh Nguyen, Ph.D., an assistant professor in the department of Health Promotion and Education at the University of Utah, is using Twitter to track some of these factors and how they impact communities. She talks about what she is looking for, and why it is important to gauge their impact on overall happiness as well as health.

Episode Transcript

Interviewer: Tracking Health on Twitter, up next on The Scope.

Announcer: Examining the latest research and telling you about the latest breakthroughs. The Science and Research Show is on The Scope.

Interviewer: I'm talking with Dr. Quynh Nguyen, Assistant Professor of Health Promotion and Education at the University of Utah.

Dr. Nguyen, every day we make choices that affect our health. You know, we decide whether we're going to eat that donut or smoke that cigarette. But I think you would argue there are forces beyond our own decisions that influence our health.

Dr. Nguyen: Yes, as an epidemiologist, a lot of what I see in the research is focused on individualized risk factors, like their own health behaviors, what people decide to do or not do. Exercise, eat that donut, eat that burger, or French fry. But our behaviors are in a setting, in a context. That's not looked at enough.

Interviewer: So what are the things that drive certain behaviors that impact our health?

Dr. Nguyen: There's definitely some hot topics in this area. We're interested in the term "food desert" so that's one particularly growing field. Food deserts are places where there is a lack of healthy foods. There you would find small convenience stores rather than supermarkets, or fast food chains instead of more local restaurants. Perhaps that's influencing your behavior. Maybe it's easier to grab that burger instead of making a chicken meal.

Interviewer: So how do you go about measuring that?

Dr. Nguyen: We are looking at tweets, the Twitter data in particular. What people are saying they're eating and tweeting about. We can also use, in particular, mentions of fast food chains, so we're keeping track of whether they're mentioning McDonald's or KFC. And also, social media data can be used because people check into places, so you can use their check-ins as indicators of how they're utilizing the neighborhood resources.

Then we kind of look at that as an alternative to what has been conventionally done, and hopefully using social media data is going to be cheaper, more efficient, and can be updated more easily than other types of neighborhood data.
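
As a rough illustration of the chain-mention tracking described above, here is a minimal sketch in Python. It is not the project's actual code: the patterns, the function name `count_chain_mentions`, and the sample tweets are all hypothetical, and the interview names only McDonald's and KFC as examples.

```python
import re
from collections import Counter

# Hypothetical patterns; only these two chains are named in the interview.
CHAIN_PATTERNS = {
    "McDonald's": re.compile(r"\bmc\s?donald'?s?\b", re.IGNORECASE),
    "KFC": re.compile(r"\bkfc\b", re.IGNORECASE),
}


def count_chain_mentions(tweets):
    """Count how many tweets mention each chain at least once."""
    counts = Counter()
    for text in tweets:
        for chain, pattern in CHAIN_PATTERNS.items():
            if pattern.search(text):
                counts[chain] += 1
    return counts


# Invented sample tweets, for illustration only.
sample_tweets = [
    "Grabbing McDonald's on the way home",
    "kfc for lunch again",
    "Made a chicken dinner tonight",
    "mcdonalds fries > everything",
]
mentions = count_chain_mentions(sample_tweets)
```

A real pipeline would likely need a much longer chain list and some handling of misspellings and unrelated uses of the same words.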

Interviewer: How do you go about measuring that? What has been the problem up to this point?

Dr. Nguyen: Well, I think that the idea for this dataset called "Hashtag Health" came about during my post-doc years when I was preparing data for five cities: Boston, Chicago, New York, Baltimore, and Los Angeles. At face value, you think, "That's not going to be a problem at all. Those cities are well-known." But it was very hard to get consistent data across geographies. The only consistent data we had were [inaudible 00:02:46] data and violent crime rates, and everything else varied. So if you're a neighborhood researcher, you find it very hard to study beyond a city.

I wanted to overcome the data problem by using a relatively vague and cheap, and hopefully once we build that algorithm, cost-efficient way to categorize neighborhoods, and also to categorize it in a different way than has been before. Because a lot of it has been focused on the physical resources of a neighborhood, the transit, like is there a bus or train that runs through the neighborhood, or is there a grocery store.

But what about the people who live there, how are they interacting with each other, what are they saying? So we want to capture more of the cultural and social processes. So it's both an untapped source that we're using and also, we're capturing a different side of neighborhoods that hasn't been captured.

Interviewer: I mean, the old way of doing things, you have to commission someone to look at one very specific thing, but you're just taking advantage of something that people are doing anyway.

Dr. Nguyen: I think Twitter data, like you said, is very good. For one thing, it's continuous. There's always a continuous stream. There's someone tweeting in the middle of the night. Instead of getting participants for maybe 30 minutes, you're going to get a continuous stream and you can update your data very easily. Also, it can allow for more massive studies. The project I'm working on is first going to try to capture neighborhood data for the entire state of Utah. For each census tract, we're going to have indicators of food, exercise, and happiness as a starting point. Then we're going to grow from Utah to the U.S., so it's kind of imagining a bigger type of neighborhood study than could be possible with more conventional data.
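
To make the per-tract indicators a little more concrete, the sketch below averages food, exercise, and happiness values over the tweets in each census tract. It rests on several assumptions not stated in the interview: that each tweet has already been assigned to a tract, that food and exercise are 0/1 flags, and that happiness is a numeric score. The tract IDs, field names, and the function `summarize_by_tract` are hypothetical.

```python
from collections import defaultdict

INDICATORS = ("food", "exercise", "happiness")


def summarize_by_tract(records):
    """Average each indicator over the tweets that fall in each tract."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[rec["tract"]].append(rec)

    summary = {}
    for tract, recs in grouped.items():
        summary[tract] = {
            name: sum(r[name] for r in recs) / len(recs) for name in INDICATORS
        }
        summary[tract]["n_tweets"] = len(recs)
    return summary


# Invented records: food/exercise are 0/1 flags, happiness is a score.
records = [
    {"tract": "tract_A", "food": 1, "exercise": 0, "happiness": 6.0},
    {"tract": "tract_A", "food": 0, "exercise": 1, "happiness": 4.0},
    {"tract": "tract_B", "food": 0, "exercise": 0, "happiness": 7.0},
]
tract_summary = summarize_by_tract(records)
```

Keeping the tweet count next to each average makes it easier to spot tracts where only a handful of tweets are driving the indicator.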

Interviewer: We had talked about the food example before, the fast food example, where you can look for words like McDonald's. What about happiness? How would you gauge happiness? What are you looking for?

Dr. Nguyen: Yes. That's a really great question. Happiness has actually been the hardest out of the three indicators we proposed to grade for Utah, because we're using a machine learning algorithm to do that, and so you're imagining, "How can a computer program predict the sentiment of a tweet?" And that's going to be quite difficult.

So we're starting with a continuous scale of one to nine, so we're using a data set that has about 10,000 words that have been scored for happiness. Each word is assigned a happiness value, and then for each tweet we summarize the happiness values for that tweet. Then we compared it with human ratings, so that's the gold standard. If a human is reading it, what are they seeing? Is this tweet neutral, negative, or positive? The agreement is around 73%.
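
The word-list scoring and the comparison with human ratings described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the project's algorithm: the tiny lexicon and its values are invented, the tweet score is taken as the mean of the scored words (the interview says only that the values are summarized), and the cutoffs of 4.0 and 6.0 for negative and positive are arbitrary. All function names are hypothetical.

```python
import re

# Toy lexicon on a 1-9 scale; the words and values are invented.
HAPPINESS = {
    "love": 8.4, "happy": 8.3, "great": 7.5,
    "hate": 2.3, "sad": 2.4, "traffic": 3.3,
}


def tweet_happiness(text, lexicon=HAPPINESS):
    """Mean happiness of the scored words in a tweet, or None if none match."""
    words = re.findall(r"[a-z']+", text.lower())
    scores = [lexicon[w] for w in words if w in lexicon]
    if not scores:
        return None
    return sum(scores) / len(scores)


def to_label(score, low=4.0, high=6.0):
    """Map a score to negative/neutral/positive using assumed cutoffs."""
    if score is None:
        return "neutral"
    if score < low:
        return "negative"
    if score > high:
        return "positive"
    return "neutral"


def percent_agreement(predicted, human):
    """Fraction of tweets where the predicted label matches the human label."""
    matches = sum(p == h for p, h in zip(predicted, human))
    return matches / len(human)


# Invented tweets and human labels, for illustration only.
tweets = [
    "I love this great hike",
    "So sad, hate this traffic",
    "Heading to the store",
    "Happy but stuck in traffic",
]
human_labels = ["positive", "negative", "neutral", "negative"]
predicted_labels = [to_label(tweet_happiness(t)) for t in tweets]
agreement = percent_agreement(predicted_labels, human_labels)
```

The last sample tweet shows one way a word-level approach can disagree with a human reader: a single high-scoring word pulls the average toward neutral even though the overall message reads as negative.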

Interviewer: I imagine you're selecting for sort of a young population. It's probably the 20, 30-year-old set mostly tweeting.

Dr. Nguyen: More people who use Twitter are younger. So we're going to pilot our social media database looking at young adult outcomes. That was intentional because we recognize social media users tend to be younger.

Interviewer: How do you even find out what they're unhappy about?

Dr. Nguyen: For us, we think that happiness is important because happiness might be related to an array of different things. Our first [pilots] will look at whether these indicators are going to predict young adult obesity, but then I could see it also predicting perhaps suicide, or depression in the community.

As health professionals, we know that health is not just the absence of disease. It's like a little bit more. I think the ultimate goal is happiness. When you're around happy people, maybe that also breeds happiness. It might be a contagion effect, and vice-versa. If you're in a sad community, maybe that's also detrimental to your own mental health.

Announcer: Interesting. Informative. And all in the name of better health. This is The Scope Health Sciences Radio.