
Study finds racial bias in tweets flagged as hate speech

Tweets believed to be written by African Americans are much more likely to be tagged as hate speech than tweets associated with whites, according to a Cornell study analyzing five collections of Twitter data marked for abusive language.

All five datasets, compiled by academics for research, showed bias against Twitter users believed to be African American. Although social media companies, including Twitter, probably don't use these datasets for their own hate-speech detection systems, the consistency of the results suggests that similar bias could be widespread.

"We found consistent, systematic and substantial racial biases," said Thomas Davidson, a doctoral candidate in sociology and first author of the study, which was presented at the Annual Meeting of the Association for Computational Linguistics, July 28-Aug. 2 in Florence, Italy.

"These systems are being developed to identify language that's used to target marginalized populations online," Davidson said. "It's extremely concerning if the same systems are themselves discriminating against the population they're designed to protect."

As internet giants increasingly turn to artificial intelligence to flag hateful content amid millions of posts, concern about bias in machine learning models is on the rise. Because bias often begins in the data used to train these models, the researchers sought to evaluate datasets that were created to help understand and classify hate speech.

To perform their analysis, they selected five datasets, one of which Davidson helped develop at Cornell, consisting of a combined 270,000 Twitter posts. All five had been annotated by humans to flag abusive language or hate speech.

For each dataset, the researchers trained a machine learning model to predict hateful or offensive speech. 
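The training step described above follows a standard text-classification pattern: fit a model on human-annotated tweets so it can predict abusive language on new ones. The following is a minimal illustrative sketch using scikit-learn, with a tiny invented dataset; it is not the researchers' actual model or data.

```python
# Illustrative sketch: train a classifier on human-annotated tweets
# to predict abusive vs. benign language. The inline examples and
# labels below are invented placeholders, not from the study.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

tweets = [
    "you are a wonderful person",
    "have a great day everyone",
    "i hate you, get lost",
    "you people are trash",
]
labels = [0, 0, 1, 1]  # 0 = benign, 1 = abusive (annotated by humans in the study)

# TF-IDF features feeding a logistic regression, a common baseline setup
clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(tweets, labels)

print(clf.predict(["have a wonderful day"]))
```

In practice such models are trained per dataset, as the article notes, since each dataset's annotation scheme (hate speech, harassment, abuse) differs.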

They then used a sixth database of more than 59 million tweets, matched with census data and identified by location and words associated with particular demographics, in order to predict the likelihood that a tweet was written by someone of a certain race.

Though their analysis couldn't conclusively predict the race of a tweet's author, it classified tweets into "black-aligned" and "white-aligned," reflecting the fact that they contained language associated with either of those demographics.
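One simple way to picture this "alignment" step is to score a tweet by which demographic-associated word distribution better explains it. The sketch below is a hedged toy version of that idea; the per-group word probabilities are invented placeholders, not the study's demographic language model.

```python
# Toy sketch of dialect "alignment" scoring: assign a tweet to the group
# whose (hypothetical) unigram distribution gives it higher likelihood.
# All probabilities here are invented for illustration.
import math

p_black = {"finna": 0.01, "hello": 0.002, "gonna": 0.005}   # placeholder model
p_white = {"finna": 0.0001, "hello": 0.003, "gonna": 0.004}  # placeholder model

def log_prob(tokens, model, floor=1e-6):
    # Sum log-probabilities, with a small floor for unseen words
    return sum(math.log(model.get(t, floor)) for t in tokens)

def alignment(tweet):
    tokens = tweet.lower().split()
    lb = log_prob(tokens, p_black)
    lw = log_prob(tokens, p_white)
    return "black-aligned" if lb > lw else "white-aligned"

print(alignment("finna go"))
```

As the article stresses, this labels language, not people: it only says a tweet's wording is statistically associated with one group's usage.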

In all five cases, the algorithms classified likely African American tweets as sexism, hate speech, harassment or abuse at much higher rates than tweets believed to be written by whites, in some cases more than twice as frequently.
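The disparity itself is a straightforward rate comparison: the fraction of each group's tweets that the classifier flags, and the ratio between those fractions. A short sketch, with invented counts chosen only to illustrate the "more than twice as frequently" kind of gap the study reports:

```python
# Sketch of the disparity measurement: flag rate per dialect group and
# the ratio between groups. The counts below are invented for illustration.
flags = {
    "black-aligned": {"flagged": 460, "total": 1000},
    "white-aligned": {"flagged": 200, "total": 1000},
}

# Flag rate = flagged tweets / total tweets in each group
rates = {group: d["flagged"] / d["total"] for group, d in flags.items()}
ratio = rates["black-aligned"] / rates["white-aligned"]

print(rates)
print(round(ratio, 2))  # a ratio above 2 would mirror the reported gap
```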

The researchers believe the disparity has two causes: an oversampling of African Americans' tweets when databases are created, and inadequate training for the people annotating tweets for potential hateful content.

"When we as researchers, or the people we pay online to do crowdsourced annotation, look at these tweets and have to decide, 'Is this hateful or not hateful?' we may see language written in what linguists consider African American English and be more likely to think that it's something that is offensive due to our own internal biases," Davidson said. "We want people annotating data to be aware of the nuances of online speech and to be very careful in what they're considering hate speech."

The paper was co-authored with Debasmita Bhattacharya '21 and Ingmar Weber of the Qatar Computing Research Institute.
