In any community there’s bound to be friction, but some… take it further than others. Reddit is a platform for thousands of online communities (known as “subreddits”), where community members can submit content, and upvote, downvote, or comment on content that others have submitted. Topics of discussion on Reddit run the gamut of human interest, but one of Reddit’s favorite topics to talk about is, unsurprisingly, Reddit itself.
A recent post on AskReddit posing the question “What popular subreddit has a really toxic community?” surged to the top of the front page with 4,000 upvotes and over 10,000 comments as Redditors voiced their opinions on which Reddit communities they found to be the most abhorrent (the /r/ prefix denotes a subreddit):
As I sifted through the thread, my data geek sensibilities tingled as I wondered, “Why must we rely upon opinion for such a question? Shouldn’t there be an objective way to measure toxicity?”
With this in mind, I set out to scientifically measure toxicity and supportiveness in Reddit comments and communities. I then compared the results against Reddit’s own evaluation of its subreddits to see where Redditors were right, where they were wrong, and what they may have missed. While this post is specific to Reddit, our methodology here could be applied to offer an objective score of community health for any data set featuring user comments.
So what is Toxicity? Before we could do any analysis around which subreddits were the most Toxic, we needed to define what we would be measuring. At a high level, Toxic comments are ones that would make someone who disagrees with the viewpoint of the commenter feel uncomfortable and less likely to want to participate in that Reddit community. To be more specific, we defined a comment as Toxic if it met either of the following criteria:
However, the problem with only measuring Toxic comments is that it biases against subreddits that simply tend to be more polarizing and evoke more emotional responses generally. In order to account for this, we also measured Supportiveness in comments, defined as language that is directly addressing another Redditor in a supportive (e.g. “We’re rooting for you!”) or appreciative (e.g. “Thanks for the awesome post!”) manner.
By measuring both Toxicity and Supportiveness we are able to get a holistic view of community health that can be used to more fairly compare and contrast subreddit communities.
Comments were pulled via the Reddit API from the top 250 subreddits by number of subscribers, in addition to any subreddit mentioned in the AskReddit thread with over 150 upvotes. Comments were pulled from articles on the front page of each subreddit, and 1,000 comments were randomly chosen from each subreddit for analysis. Any subreddit that had fewer than 1,000 comments was excluded from the analysis.
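The sampling step described above can be sketched in a few lines of Python. This is only an illustration, not the code used for the study: the function name `sample_comments`, the fixed seed, and the toy subreddit names are all assumptions, and the comments are assumed to have been pulled from the API already.

```python
import random

SAMPLE_SIZE = 1000  # comments kept per subreddit, as stated in the post


def sample_comments(comments_by_subreddit, sample_size=SAMPLE_SIZE, seed=0):
    """Return a fixed-size random sample per subreddit, dropping small ones."""
    rng = random.Random(seed)  # seeded only so the sketch is reproducible
    sampled = {}
    for subreddit, comments in comments_by_subreddit.items():
        if len(comments) < sample_size:
            continue  # excluded: fewer comments than the sample size
        sampled[subreddit] = rng.sample(comments, sample_size)
    return sampled


# Toy input standing in for comments already pulled from the API.
toy = {
    "r/big": [f"comment {i}" for i in range(2500)],
    "r/small": [f"comment {i}" for i in range(40)],
}
result = sample_comments(toy)
print({name: len(comments) for name, comments in result.items()})
# prints {'r/big': 1000}
```

Sampling without replacement keeps every subreddit on the same footing, so a very active community does not dominate the analysis simply by having more comments.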
Idibon specializes in combining machine learning with human annotation of text, and for this task I was able to take advantage of our technology to improve both the efficiency and accuracy of our experiment. Specifically, a task as nuanced as labelling comments as Toxic/non-Toxic given our definition requires human annotation. But annotating all 250 subreddits at 1,000 comments each would have required roughly 2,300 person-hours: 250,000 comments, times 3 annotations per comment (in order to get multiple opinions for consensus), times about 11 seconds per annotation (the average amount of time it took our contributors).
Instead, we were able to use Idibon’s Sentiment Analysis model to narrow down the comments human annotators would need to see to only those that were most likely to carry negative or positive sentiment (a good high-level proxy for Toxicity/Supportiveness), and only for subreddits that contained highly negative or positive sentiment generally. Using this tool, we narrowed down our dataset to 100 subreddits and 100 comments per subreddit, cutting the number of comments to annotate from 250,000 to 10,000, a decrease of 96%.
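For readers who want to check the annotation budget, the arithmetic can be written out as a short Python sketch. It takes the figures quoted above at face value (11 seconds per annotation, 3 annotations per comment); the helper name `person_hours` is made up for this illustration.

```python
SECONDS_PER_ANNOTATION = 11   # average time per annotation quoted in the post
ANNOTATIONS_PER_COMMENT = 3   # each comment is labeled three times


def person_hours(n_comments):
    """Annotation effort, in hours, for a given number of comments."""
    seconds = n_comments * ANNOTATIONS_PER_COMMENT * SECONDS_PER_ANNOTATION
    return seconds / 3600


full_comments = 250 * 1000     # every sampled comment from every subreddit
reduced_comments = 100 * 100   # after filtering with the sentiment model
reduction = 1 - reduced_comments / full_comments

print(f"comments: {full_comments:,} -> {reduced_comments:,}")
print(f"reduction: {reduction:.0%}")
print(f"effort: {person_hours(full_comments):,.0f} h -> "
      f"{person_hours(reduced_comments):,.0f} h")
```

Because the per-comment cost is constant, the reduction in person-hours matches the 96% reduction in comments.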
At Idibon, we have three primary ways of engaging a third party to annotate text: the crowd, a global network of analysts, and experts who are analysts for our clients. In this case, we took our 10,000 comments to the crowd with CrowdFlower, an online human annotation service, where nearly 500 annotators from around the globe labeled our Reddit comments based on our criteria, until each comment had been labeled 3 times.
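The post does not say how the three labels per comment were reconciled. One common approach, assumed here purely for illustration, is a majority vote that leaves a comment unlabeled when no label wins outright. The label names and comment IDs below are hypothetical.

```python
from collections import Counter


def consensus_label(labels):
    """Return the strict-majority label, or None when annotators disagree."""
    label, count = Counter(labels).most_common(1)[0]
    return label if count > len(labels) / 2 else None


# Hypothetical annotations: three labels per comment.
annotations = {
    "c1": ["toxic", "toxic", "neutral"],
    "c2": ["supportive", "supportive", "supportive"],
    "c3": ["toxic", "neutral", "supportive"],
}
consensus = {cid: consensus_label(labels) for cid, labels in annotations.items()}
print(consensus)
# prints {'c1': 'toxic', 'c2': 'supportive', 'c3': None}
```

With three annotators, a strict majority means at least two agree, which is presumably why an odd number of annotations per comment is convenient.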
In determining what makes a subreddit community Toxic or Supportive, simply counting the number of Toxic and Supportive comments wouldn’t be sufficient. One of the unique aspects of Reddit is that members of the community have the ability to upvote and downvote comments, which gives us a window into not only what individual commenters are saying, but whether or not and to what extent the community as a whole supports those comments. With this in mind, overall Toxicity/Supportiveness of a subreddit was determined as a function of the scores of all the Toxic and Supportive comments in a subreddit.
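The exact scoring function is not given in the text, so the following is only one plausible reading: the share of a subreddit's total comment score that is carried by comments with a given label. The function name `score_weighted_share`, the clipping of negative scores to zero, and the toy data are all assumptions made for this sketch.

```python
def score_weighted_share(comments, label):
    """Share of a subreddit's comment score carried by comments with `label`.

    `comments` is a list of (label, score) pairs. Assumption: negative
    scores are clipped to zero so that downvoted comments add no weight.
    """
    total = sum(max(score, 0) for _, score in comments)
    if total == 0:
        return 0.0  # no positively scored comments at all
    labeled = sum(max(score, 0) for lab, score in comments if lab == label)
    return labeled / total


# Toy subreddit: (label, score) per annotated comment.
toy_comments = [
    ("toxic", 30),
    ("toxic", -5),       # downvoted toxic comment contributes nothing
    ("supportive", 10),
    ("neutral", 60),
]
toxicity = score_weighted_share(toy_comments, "toxic")
supportiveness = score_weighted_share(toy_comments, "supportive")
print(f"toxicity={toxicity:.0%}, supportiveness={supportiveness:.0%}")
# prints toxicity=30%, supportiveness=10%
```

Weighting by score rather than by count means a toxic comment the community downvotes barely moves the result, while a heavily upvoted one moves it a lot.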
Here are the results for subreddits plotted by Toxicity and Supportiveness:
In the interactive chart above, the red bubbles represent subreddits that were mentioned in the “What popular subreddit has a really toxic community?” thread with a score greater than 150 (upvotes minus downvotes), while those in gray were picked from the top 250 subreddits by subscribers. As we move up and right in the chart, subreddits were found to be more Toxic and less Supportive, while those in the bottom left are the least Toxic and most Supportive. Bubbles are sized by number of subscribers in the subreddit.
So how good was Reddit at picking out its most Toxic communities? Well, it seems they got most of the big ones, with a few exceptions. The winner by far, with 44% Toxicity and 1.7% Supportiveness, was /r/ShitRedditSays, which received 4,234 upvotes on the thread. /r/ShitRedditSays is, somewhat ironically, a subreddit dedicated to finding and discussing bigoted posts around Reddit, where the term “Redditor” is often used as an insult, and the Toxicity was generally directed at the Reddit community at large. However, it’s also important to note that a significant portion of their Toxicity score came from conversations between SRS members and other Redditors who come specifically to disagree and pick fights with the community, a trap that many members tend to fall into, and which leads to some rather nasty and highly unproductive conversations.
While many of the most Toxic subreddits were mentioned in the thread, there were also a number of highly Toxic subreddits that Reddit seemed to miss, such as /r/SubredditDrama, /r/TumblrinAction (a subreddit dedicated to mocking Tumblr – where marginalized groups, particularly LGBTQ, post about their experiences), /r/4chan, and /r/news.
On the other end of the spectrum, it seems that some of the subreddits that were picked out as being Toxic were found to be some of the most Supportive communities by our study. In particular, /r/GetMotivated, with 50% Supportiveness and 6% Toxicity, seemed far from the Toxic community described by /u/LookHardBody as comprised of “two type[s] of people […] The people that post content to motivate others or because it motivated them and commenters who comment why it’s bullshit, stupid and unmotivational because it wasn’t specifically tailored to them.”
However, upon inspection of the data, there certainly were these types of negative posts in /r/GetMotivated as claimed, but they were not supported by the community at large. In fact, the average score for Supportive posts in /r/GetMotivated was 41, while Toxic posts had an average score of only 1.4. Overall, /r/GetMotivated fits in nicely next to /r/loseit and /r/DIY as a subreddit built specifically for members to seek/give advice and support from the community, an unsurprisingly supportive bunch.
Another example of why it’s important to look at comment scores comes when we look at bigotry across subreddits:
Looking specifically at bigoted comments, the importance of taking score into account rather than number of comments becomes even more apparent. For a small number of communities (/r/Libertarian, /r/Jokes, /r/community, and /r/aww) the total aggregated score of comments that our annotators labeled as bigoted was actually negative, so despite having bigoted comments present in their communities, those bigoted comments were rejected by the community as a whole. On the other end of the spectrum we see /r/TheRedPill, a subreddit dedicated to proud male chauvinism, where bigoted comments received overwhelming approval from the community at large.
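The aggregation behind this comparison is simple to sketch: sum the scores of the comments labeled bigoted in each subreddit, then look at the sign of the total. The subreddit names and scores below are invented placeholders, not data from the study, and `aggregate_scores` is a name chosen for this illustration.

```python
from collections import defaultdict


def aggregate_scores(rows):
    """Sum comment scores per subreddit from (subreddit, score) pairs."""
    totals = defaultdict(int)
    for subreddit, score in rows:
        totals[subreddit] += score
    return dict(totals)


# Invented scores for comments that annotators labeled as bigoted.
bigoted_rows = [
    ("r/example_a", 12),
    ("r/example_a", -20),
    ("r/example_b", 45),
    ("r/example_b", 8),
]
totals = aggregate_scores(bigoted_rows)
# A negative total means the community downvoted bigoted comments overall.
rejected = sorted(name for name, total in totals.items() if total < 0)
print(totals)    # prints {'r/example_a': -8, 'r/example_b': 53}
print(rejected)  # prints ['r/example_a']
```

Both toy subreddits contain two bigoted comments, so a raw count would treat them identically; only the summed score separates the community that rejects such comments from the one that rewards them.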
In researching this post, I have delved deep into the darkest recesses of the interwebs. I have read comments that cannot be unread and seen things that cannot be unseen, but for good cause!
Sentiment analysis is only the tip of the iceberg in understanding how people relate to one another, how communities form, and what characteristics make up a community abstracted from its individual members. In the case of subreddits, hopefully this post will give you some idea of what communities you’d want to be a part of and which you might want to avoid.
On a broader scale, these methods help answer larger questions like “How can we build communities that we’re proud of and that encourage effective communication?” and “How should we structure our discourse so that people really hear one another?” Answering these questions will allow us to strengthen our connections with those around us and improve our daily experiences in an increasingly digital world.
– Ben Bell (@BenSethBell)
PS: Like this article? Check out our AMA on it!
Ben excels at sciencing the holy heck out of your data. He owns the creation of machine learning applications from data creation (building labeled data sets through controlled human annotation) to model building, evaluation, and improvement. Ben is a language nerd, and is specifically interested in how culture is manifested in language. He is also passionate about international development, and has worked across Latin America developing initiatives in education and entrepreneurship, and he is excited to use NLP to bring text analytics to all the world’s languages. While Ben is at work, his mom puts on a patriotic leotard, a cape, and knee-high, high-heeled boots and fights crime as a super heroine.