{"id":2461,"date":"2014-12-03T04:55:23","date_gmt":"2014-12-03T04:55:23","guid":{"rendered":"http:\/\/www.lifeandnews.com\/articles\/?p=2461"},"modified":"2016-08-10T16:46:23","modified_gmt":"2016-08-10T16:46:23","slug":"studying-society-via-social-media-is-not-so-simple","status":"publish","type":"post","link":"https:\/\/www.lifeandnews.com\/articles\/studying-society-via-social-media-is-not-so-simple\/","title":{"rendered":"Studying society via social media is not so simple"},"content":{"rendered":"<p>By <a href=\"http:\/\/theconversation.com\/profiles\/jurgen-pfeffer-146258\">J\u00fcrgen Pfeffer<\/a><em>, Carnegie Mellon University<\/em> and <a href=\"http:\/\/theconversation.com\/profiles\/derek-ruths-146533\">Derek Ruths<\/a><em>, McGill University<\/em><\/p>\n<p>Behavioral scientists have seized on social media and their massive data sets as a way to quickly and cheaply figure out what people are thinking and doing. But some of those tweets and thumbs ups can be misleading. Researchers must figure out how to make sure their forecasts and analyses actually represent the offline world.<\/p>\n<h2>Big Data\u2019s overwhelming appeal<\/h2>\n<p>Imagine you\u2019re interested in analyzing society to learn the answers to questions like: how bad is the flu this year? How will people vote in an upcoming election? How do people talk about and cope with diabetes? You could interview people on the street or call them on their phones. That\u2019s what traditional polling firms do \u2013 but it takes time and can be quite costly. A promising alternative involves collecting and analyzing social media data \u2013 quickly and for free.<\/p>\n<p>Hundreds of millions of people use social media platforms like <a href=\"http:\/\/newsroom.fb.com\/company-info\/\">Facebook<\/a> and <a href=\"https:\/\/about.twitter.com\/company\">Twitter<\/a> every day. Individually, they create traces of their activities when they tweet, like and friend each other. 
Collectively, these users have produced massive, real-time streams of data that offer minute-by-minute updates on social trends \u2013 where people are, what people are doing and what they are thinking about. For the last several years, researchers in academia and industry have been developing ways to utilize this flood of data in their investigations and have published thousands of papers drawing on it.<\/p>\n<p>A typical Twitter study could look like the following. Imagine you\u2019re interested in information diffusion after a tragic event. The moment you hear about such an event \u2013 for instance, the Boston Marathon bombing \u2013 you activate software on your computer that collects, in real time, tweets that contain your keywords of interest \u2013 maybe Boston in this case. Since there are no Twitter archives available for researchers, you\u2019d utilize Twitter\u2019s data interface and collect all data that come for free. After a couple of hours or days you stop the data collection and start the analysis.<\/p>\n<figure class=\"align-center zoomable\"><a href=\"https:\/\/62e528761d0685343e1c-f3d1b99a743ffa4142d9d7f1978d9686.ssl.cf2.rackcdn.com\/files\/65737\/area14mp\/image-20141127-16934-12xp0j3.jpg\"><img src=\"https:\/\/62e528761d0685343e1c-f3d1b99a743ffa4142d9d7f1978d9686.ssl.cf2.rackcdn.com\/files\/65737\/width668\/image-20141127-16934-12xp0j3.jpg\" alt=\"\" \/><\/a><\/figure>\n<p><span class=\"caption\">So much data, there for the taking\u2026.<\/span><br \/>\n<span class=\"attribution\"><a class=\"source\" href=\"https:\/\/www.flickr.com\/photos\/peterras\/15149258618\/in\/photolist-p5FUN7-8CBj3s-8CAgVh-8ku7v6-aQy1JM-6yNCXa-6z8prX-dGK1zs-cCyLwu-9BBi5g-eeyh3d-cCyLyE-aSKGWZ-6ADDuF-dU7Cn1-8RszNr-7kETST-7U4KnJ-6dENoE-93daad-7H6fnx-98eeX8-8BKnwi-7mrUDT-7mvLx9-7mrS54-oh7hti-jfYZRK-6tXvwF-nZULtL-oHsWUa-84Gxki-xn5e8-78y1BK-dDyZav-dyxsH5-aQt1AB-bqJgjX-6MPfLz-7e5YK6-82TGSb-9Yf1ju-dvGh9G-6h65V7-cYiDZd-81gMx7-8CBeSS-cCyLxA-7E1AHQ-7E1WjA\" 
rel=\"nofollow\">Peter Kirkeskov Rasmussen<\/a>, <a class=\"license\" href=\"http:\/\/creativecommons.org\/licenses\/by-nc-sa\/4.0\/\" rel=\"nofollow\">CC BY-NC-SA<\/a><\/span><\/p>\n<h2>What to watch out for<\/h2>\n<p>Not surprisingly, this effort to measure and predict human behavior from social media data is fraught with pitfalls \u2013 both obvious and very subtle. For instance, we know that different social media platforms are preferred by <a href=\"http:\/\/www.pewinternet.org\/Reports\/2013%20\/Social-media-users.aspx\">different demographic groups<\/a>. However, most social media studies don\u2019t carefully account for the fact that Twitter is used mostly in cities or that most Pinterest users are upper middle-class and female. This oversight can introduce serious errors into predictions and measurements.<\/p>\n<p>Many of the \u201cindividuals\u201d that populate social media platforms are actually accounts managed by public relations companies (think Justin Bieber or Nike) or not even humans at all but automated robots. Because these accounts aren\u2019t portraying anything that even approximates normal human behavior, studies need to remove such accounts before making predictions. However, finding robot accounts can be quite hard.<\/p>\n<p>Another big issue is how the data to be studied are collected. Academic researchers need free \u2013 or at least very cheap \u2013 access to social media data to perform their studies. Few social media outlets provide this, with Twitter being the exception. 
Because social media studies are often based on sampled data (researchers get about 1% of all tweets from the free Twitter interface), what\u2019s available to researchers might not be a <a href=\"http:\/\/arxiv.org\/abs\/1306.5204\">representative sample<\/a> of the overall social media data.<\/p>\n<figure class=\"align-center zoomable\"><a href=\"https:\/\/62e528761d0685343e1c-f3d1b99a743ffa4142d9d7f1978d9686.ssl.cf2.rackcdn.com\/files\/65735\/area14mp\/image-20141127-21951-1wyygg6.jpg\"><img src=\"https:\/\/62e528761d0685343e1c-f3d1b99a743ffa4142d9d7f1978d9686.ssl.cf2.rackcdn.com\/files\/65735\/width668\/image-20141127-21951-1wyygg6.jpg\" alt=\"\" \/><\/a><\/figure>\n<p><span class=\"caption\">Simply collecting billions of data points isn\u2019t enough.<\/span><br \/>\n<span class=\"attribution\"><a class=\"source\" href=\"https:\/\/www.flickr.com\/photos\/geoliv\/6481563277\/in\/photolist-aSKGWZ-6ADDuF-dU7Cn1-8RszNr-7kETST-7U4KnJ-6dENoE-93daad-7H6fnx-98eeX8-8BKnwi-7mrUDT-7mvLx9-7mrS54-oh7hti-jfYZRK-6tXvwF-nZULtL-oHsWUa-84Gxki-xn5e8-78y1BK-dDyZav-dyxsH5-aQt1AB-bqJgjX-6MPfLz-7e5YK6-82TGSb-9Yf1ju-dvGh9G-6h65V7-cYiDZd-81gMx7-8CBeSS-cCyLxA-7E1AHQ-7E1WjA-hhwd41-8CAnKJ-8CAoBW-8Cx9hT-8Cx6Qt-8ywotc-6ADDuK-bqNiKF-8h6sWa-8Wb619-5n4FWw-6u1GFx\" rel=\"nofollow\">Geoff Livingston<\/a>, <a class=\"license\" href=\"http:\/\/creativecommons.org\/licenses\/by-sa\/4.0\/\" rel=\"nofollow\">CC BY-SA<\/a><\/span><\/p>\n<h2>How to do it better<\/h2>\n<p>In order to realize the immense potential of social media-based studies of human populations, research must tackle these kinds of issues head-on. 
In our <a href=\"http:\/\/www.sciencemag.org\/content\/346\/6213\/1063.summary\">recent paper<\/a> in Science on caveats for social media researchers, we discuss the need to control for bias in all the ways it appears \u2013 through platform-specific <a href=\"http:\/\/www.aaai.org\/ocs\/index.php\/ICWSM\/ICWSM13\/paper\/viewFile\/6128\/6347\">population makeup<\/a>, data collection and user sampling. This will involve improvements both in how data are collected and in how data are processed: for example, better methods for identifying non-human accounts on social media are needed.<\/p>\n<p>Ultimately, researchers must be more aware of what is being analyzed when they work with social media data. What data are actually being collected? What systems are actually being studied? What social processes are actually being observed? Through greater awareness of and attention to these questions, the research community will be better able to realize the great promise of social media-based studies.<\/p>\n<p><img loading=\"lazy\" src=\"https:\/\/counter.theconversation.edu.au\/content\/34631\/count.gif\" alt=\"The Conversation\" width=\"1\" height=\"1\" \/><\/p>\n<p>This article was originally published on <a href=\"http:\/\/theconversation.com\">The Conversation<\/a>.<br \/>\nRead the <a href=\"http:\/\/theconversation.com\/studying-society-via-social-media-is-not-so-simple-34631\">original article<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>By J\u00fcrgen Pfeffer, Carnegie Mellon University and Derek Ruths, McGill University Behavioral scientists have seized on social media and their massive data sets as a way to quickly and cheaply figure out what people are thinking and doing. But some of those tweets and thumbs ups can be misleading. 
Researchers must figure out how to [&hellip;]<\/p>\n","protected":false},"author":39,"featured_media":5512,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[36,38],"tags":[],"_links":{"self":[{"href":"https:\/\/www.lifeandnews.com\/articles\/wp-json\/wp\/v2\/posts\/2461"}],"collection":[{"href":"https:\/\/www.lifeandnews.com\/articles\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.lifeandnews.com\/articles\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.lifeandnews.com\/articles\/wp-json\/wp\/v2\/users\/39"}],"replies":[{"embeddable":true,"href":"https:\/\/www.lifeandnews.com\/articles\/wp-json\/wp\/v2\/comments?post=2461"}],"version-history":[{"count":2,"href":"https:\/\/www.lifeandnews.com\/articles\/wp-json\/wp\/v2\/posts\/2461\/revisions"}],"predecessor-version":[{"id":5513,"href":"https:\/\/www.lifeandnews.com\/articles\/wp-json\/wp\/v2\/posts\/2461\/revisions\/5513"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.lifeandnews.com\/articles\/wp-json\/wp\/v2\/media\/5512"}],"wp:attachment":[{"href":"https:\/\/www.lifeandnews.com\/articles\/wp-json\/wp\/v2\/media?parent=2461"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.lifeandnews.com\/articles\/wp-json\/wp\/v2\/categories?post=2461"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.lifeandnews.com\/articles\/wp-json\/wp\/v2\/tags?post=2461"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}