Discussion forums such as Reddit, 4chan and 8chan have grown in popularity with the evolution of media and the internet. These communities gained their reputations because they allow individuals all over the world to post anything they want and to interact with other posts under an anonymous account. Anonymity comes with a caveat, though: it invites a plethora of expressions ranging from aimless thoughts to harmful opinions. This study examines the prevalence of hate speech across these three websites, which differ mainly in how posts in their communities are regulated. The initial approach was to use a web scraper to flag harmful comments on several boards of each website and then compare the results. The findings show that hate speech is prevalent on all three sites and suggest that a lack of content regulation allows for a higher prevalence of hate speech.
Over the course of several years, online discussion forums have gained extreme popularity because individuals can now post anything they want under a random, anonymous username. With the internet reaching far and wide, opinions are bound to be shared. The rise of these websites and the anonymity they offer created a place to exercise one’s free speech, which can easily cross the line into comments that are considered hate speech. We studied the interplay between free speech and hate speech, as online “trolls” hide behind freedom of expression when it really only leaves space for hate speech to thrive. So how does the prevalence of these comments differ across the three platforms? We predict that forums with more moderation contain less “hate speech” because more regulation allows hateful comments to be flagged and allows posts and replies deemed discriminatory to be monitored and deleted.
The line between hate speech and free speech is an especially blurry one, in part because, for American citizens, the First Amendment protects free speech. So what exactly do we consider hate speech to be?
Under the definition of the law and the United Nations, hate speech is “any form of expression through which speakers intend to vilify, humiliate, or incite hatred against a group or a class of persons on the basis of race, religion, skin color sexual identity, gender identity, ethnicity, disability, or national origin.” These are usually hateful and offensive comments that target the characteristics of marginalized groups. Hate speech is generally protected unless it is found to be incitement, which is an extension of speech but is illegal. Free speech, then, is anything other than incitement and serious threats.
Addressing hate speech does not limit freedom of speech; it merely prevents hate speech from becoming a bigger danger and turning into violence. We must additionally be cautious of the newer concept of our digital footprint.
Reddit, 4chan and 8chan were chosen for this study because they are the most active sites and are where we commonly expect such hateful language to appear. The sites are comparable because they differ distinctly in moderation, which we expect to have some effect on the presence of hate speech in their communities. Reddit is generally considered the most regulated of the three and is often used for “memes” and shared thoughts. It is also the only one of the three that requires creating an account, which makes it easier to track the individual behind the IP address. 4chan is substantially less regulated: no account is needed, and posting is completely anonymous. Though its content is only marginally different from Reddit’s, hate speech is predicted to be more prevalent there because it is home to the infamous political imageboard /pol/, known as a place where the alt-right community interacts (Hagen, 2023). Lastly, 8chan is considered more lax than 4chan, even though the two are the most similar to each other, mainly because of how 8chan originated. When 4chan introduced new regulation on its site, members of the alt-right movement shifted toward 8chan, where there was less moderation (Greengard, 2023). These sites continue to be used and have yet to receive much attention in research.
Unfortunately, due to personal difficulties in collaboration, this study relies heavily on the data and results of Rieger et al. (2021). We were fortunate to find that their study is the one most similar to our goals, given the lack of research in this field.
Rieger et al. focus on the relationship between active right-wing extremists on online fringe platforms and the alt-right movement. They gather data from the /pol/ imageboards of 4chan and 8chan and from Reddit’s “The_Donald” subreddit. They use a combination of manual quantitative content analysis and topic modeling to “understand the extent and nature of different types of hate speech and its thematic clusters” (Rieger et al., 2021). They define hate speech as the expression of “hatred or degrading attitudes toward a collective,” meaning a group that shares a race, ethnicity, religion, sexual orientation, etc., and they use a Hatebase dictionary in their methods to flag words and phrases they consider “hate speech”.
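The dictionary-based flagging mentioned above can be illustrated with a minimal Python sketch. It is not the pipeline used by Rieger et al. or by our scraper. The lexicon holds hypothetical placeholder tokens rather than real dictionary entries, and the names `PLACEHOLDER_LEXICON`, `build_pattern` and `flag_post` are assumptions introduced only for this example.

```python
import re

# Hypothetical placeholder lexicon (an assumption for illustration); a real
# study would load terms from a curated dictionary instead of these tokens.
PLACEHOLDER_LEXICON = {"slur_a", "slur_b", "hateful phrase"}


def build_pattern(lexicon):
    """Compile one case-insensitive regex matching any lexicon term as a whole word or phrase."""
    # Longest terms first so multi-word phrases are tried before shorter terms.
    terms = sorted(lexicon, key=len, reverse=True)
    alternation = "|".join(re.escape(term) for term in terms)
    # Lookarounds keep a term from matching inside a longer word.
    return re.compile(r"(?<!\w)(?:" + alternation + r")(?!\w)", re.IGNORECASE)


def flag_post(text, pattern):
    """Return the sorted, lowercased lexicon terms found in one post."""
    return sorted({match.group(0).lower() for match in pattern.finditer(text)})


pattern = build_pattern(PLACEHOLDER_LEXICON)
posts = [
    "An ordinary comment about the weather.",
    "This one contains SLUR_A and a hateful phrase.",
]
flags = [flag_post(post, pattern) for post in posts]
```

A post counts as flagged when its list of matches is non-empty. Matching on exact terms is simple, but it misses misspellings, coded language and any term missing from the lexicon.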
Using Table 4 of Rieger et al., we focus on the first row, “Hate speech total,” which shows the total amount of hate speech they were able to scrape from each forum’s respective board. We ignore the other data because they concern hate speech targeted at specific minority groups, which Rieger et al. focus on but which is unrelated to our study.
Despite some internal issues, we were able to conduct a web scrape of hate speech present on several of 4chan’s imageboards. Although, at the time of writing, we are unable to provide an in-depth analysis of that data, our quantitative analysis found that /pol/ had the highest percentage of hate speech among the 4chan boards we scraped. We therefore infer that the totals Rieger et al. report for each board are reasonable to apply to other boards. 4chan has moderate regulation, and in Rieger et al.’s study its relative percentage of hate speech falls between those of Reddit’s “The_Donald” and 8chan’s /pol/. From this, we conclude that there is a relationship between a lack of regulation and the amount of hate speech present on each website.
Graph 1.
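The per-board percentages compared above can be computed along the following lines. This is a hedged sketch rather than our actual analysis code. The function name `hate_speech_share`, the board labels and the toy records are all invented for illustration and do not reflect our scraped data or the figures in Rieger et al.

```python
from collections import Counter


def hate_speech_share(records):
    """Map each board to the percentage of its posts that were flagged."""
    totals = Counter()
    flagged = Counter()
    for board, is_flagged in records:
        totals[board] += 1
        if is_flagged:
            flagged[board] += 1
    return {board: 100.0 * flagged[board] / totals[board] for board in totals}


# Toy records, invented for illustration only: (board, was the post flagged?)
toy_records = [
    ("board_a", True), ("board_a", False), ("board_a", False), ("board_a", False),
    ("board_b", True), ("board_b", True), ("board_b", False), ("board_b", False),
]
shares = hate_speech_share(toy_records)

# Boards ordered from highest to lowest share of flagged posts.
ranked = sorted(shares, key=shares.get, reverse=True)
```

Reporting a percentage rather than a raw count keeps boards with very different posting volumes comparable.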
This study has several limitations. The data does not account for words that were not flagged but may also be considered hate speech, nor for photos and audio. Additionally, the time at which data is scraped can change the amount of hate speech present. For example, the Rieger et al. data was collected during the elections, when the alt-right was clearly more active. The results are therefore subject to bias, since the time frame and unlisted words are not accounted for. Furthermore, we are unsure whether a lack of regulation is truly the reason for the prevalence of hate speech, since hate speech is bound to exist with or without regulation, but we do predict a correlation with the severity of moderation.
We are aware that most of the research, data and analysis is incomplete. With further study, more work and fuller collaboration, it can be improved and built upon. In particular, the study would improve if hate speech were reviewed over a longer period of time and if we could differentiate between simple insults and actual discrimination and threats. Scraping more boards on Reddit and 8chan and comparing them with our 4chan data would also lead to more accurate findings.
Darcy Leigh, The Settler Coloniality of Free Speech, International Political Sociology, Volume 16, Issue 3, September 2022, olac004, https://doi.org/10.1093/ips/olac004
Geiger, Jonathon, “Hate Speech, Habitus, and Identity Signaling on 4chan’s Politically Incorrect Board” (2020). Master’s Theses. 772. https://aquila.usm.edu/masters_theses/772
Greengard, Samuel. “4chan and 8chan (8kun)”. Encyclopedia Britannica, 9 Jan. 2023, https://www.britannica.com/topic/4chan.
Hagen, Sal and Marc Tuters. 2020. “The Internet Hate Machine: On the Weird Collectivity of Anonymous Far-Right Groups.” In Rise of the Far Right: Technologies of Recruitment and Mobilization, edited by Melody Devries, Judith Bessant and Robb Watts. Lanham: Rowman & Littlefield: 171-192. Available at SSRN (June 1, 2021): https://ssrn.com/abstract=4383332
Hate speech versus freedom of speech. (n.d.). Retrieved March 16, 2023, from https://www.un.org/en/hate-speech/understanding-hate-speech/hate-speech-versus-freedom-of-speech#:~:text=%E2%80%9CAddressing%20hate%20speech%20does%20not,is%20prohibited%20under%20international%20law.%E2%80%9D
Rieger, D., Kümpel, A. S., Wich, M., Kiening, T., & Groh, G. (2021). Assessing the Extent and Types of Hate Speech in Fringe Communities: A Case Study of Alt-Right Communities on 8chan, 4chan, and Reddit. Social Media + Society, 7(4). https://doi.org/10.1177/20563051211052906
Sellars, Andrew, Defining Hate Speech (December 1, 2016). Berkman Klein Center Research Publication No. 2016-20, Boston Univ. School of Law, Public Law Research Paper No. 16-48, Available at SSRN: https://ssrn.com/abstract=2882244 or http://dx.doi.org/10.2139/ssrn.2882244
Wermiel, S. J. (n.d.). The Ongoing Challenge to Define Free Speech. Retrieved March 16, 2023, from https://www.americanbar.org/groups/crsj/publications/human_rights_magazine_home/the-ongoing-challenge-to-define-free-speech/the-ongoing-challenge-to-define-free-speech/