IBM researchers have created an algorithm to police the internet’s potty mouths.
Instead of outright removing the offensive language, the algorithm suggests alternative, more polite words to use in their place.
The researchers collected roughly 10 million posts from Twitter and Reddit, labeling them as containing offensive or non-offensive language.
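For a sense of what that labelled corpus might look like, the snippet below sketches a couple of hypothetical records in Python; the field names, example posts and labels are illustrative assumptions, since the IBM dataset itself is not reproduced here.

```python
# Hypothetical labelled records in the style the article describes: a post
# paired with an offensive / non-offensive tag. These examples are invented
# for illustration and are not drawn from the IBM corpus.
labeled_posts = [
    {"text": "For f*** sake, first world problems are the worst", "label": "offensive"},
    {"text": "First world problems are annoying but manageable", "label": "non-offensive"},
]

# Simple summary of how the labels are distributed in this tiny sample.
offensive_share = sum(p["label"] == "offensive" for p in labeled_posts) / len(labeled_posts)
print(f"{offensive_share:.0%} of this sample is labelled offensive")
```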
They chose to have the tool suggest non-offensive words rather than remove posts outright, partly to prevent authoritarian governments or companies from abusing it to clamp down on critical or political commentary.
Ultimately, the goal is to reduce the prevalence of hate speech on popular social media platforms like Twitter, Reddit, Facebook and others.
‘The use of offensive language is a common problem of abusive behavior on online social media networks,’ the researchers explained.
‘Various work in the past have attacked this problem by using different machine learning models to detect abusive behavior.’
‘Most of these work follow the assumption that it is enough to filter out the entire offensive post.’
‘However, a user that is consuming some online content may not want an entirely filtered out message, but instead have it in a style that is non-offensive and still be able to comprehend it in a polite tone,’ they added.
The algorithm begins by analyzing the sentence’s meaning and whether it includes offensive language.
From there, once the text is confirmed to include offensive terms, the algorithm generates a less offensive phrase.
A third prong of the algorithm analyzes whether the new sentence has changed in tone.
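As a rough illustration of those three stages, here is a minimal, lexicon-based sketch in Python. It is not the researchers' model, which learns its rewrites from the labelled data; the word list, the substitutions and the crude word-overlap check below are assumptions invented purely to show how a detect, rewrite, verify pipeline fits together.

```python
import re

# Hypothetical lexicon of censored offensive terms and milder substitutes.
# This stands in for what the IBM system learns from data; it is not taken
# from the study.
SUBSTITUTIONS = {
    "f***": "hell",
    "b******": "bruh",
}

TOKEN = re.compile(r"[a-z*']+")


def contains_offensive(text: str) -> bool:
    """Stage 1: flag a post that uses any term from the lexicon."""
    return any(tok in SUBSTITUTIONS for tok in TOKEN.findall(text.lower()))


def transfer(text: str) -> str:
    """Stage 2: rewrite the post by swapping each offensive term for a
    milder alternative while leaving the rest of the wording untouched."""
    pattern = re.compile(
        "|".join(re.escape(term) for term in SUBSTITUTIONS), re.IGNORECASE
    )
    return pattern.sub(lambda m: SUBSTITUTIONS[m.group(0).lower()], text)


def is_acceptable(original: str, rewritten: str) -> bool:
    """Stage 3: accept the rewrite only if it no longer trips the offense
    check and still shares most of its words with the original -- a crude
    stand-in for 'the tone and meaning have not drifted'."""
    orig_words = set(TOKEN.findall(original.lower()))
    new_words = set(TOKEN.findall(rewritten.lower()))
    overlap = len(orig_words & new_words) / max(len(orig_words), 1)
    return not contains_offensive(rewritten) and overlap >= 0.7


if __name__ == "__main__":
    post = "For f*** sake, first world problems are the worst"
    if contains_offensive(post):
        polite = transfer(post)
        verdict = "accepted" if is_acceptable(post, polite) else "rejected"
        print(f"{polite!r} ({verdict})")
```

Run on the Reddit example discussed below, this sketch produces the same ‘For hell sake’ rewrite and accepts it; the published system presumably reaches such outputs with models trained on the labelled corpus rather than a fixed word list.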
The result was that, in almost all cases, the algorithm was able to produce ‘reliable, non-offensive transferred sentences’.
‘Our proposed method achieves high accuracy on both datasets, which means that almost 100% of the time [the] classifier detects that the transferred sentences are non-offensive,’ according to the study.
The researchers provide several examples in the study where the algorithm effectively produced less offensive phrasing.
For example, one post from Reddit said: ‘For f*** sake, first world problems are the worst’.
After the algorithm analyzed the sentence, it produced the following: ‘For hell sake, first world problems are the worst’.
Interestingly, the researchers also compared their algorithm to previous work in the field.
For the first world problems scenario, previous research generated: ‘For the money, are one different countries’.
In that case, the difference was stark: the phrasing produced by the earlier system read more like gibberish than a natural sentence.
In another example, the algorithm transformed ‘i’m back b******!’ to ‘i’m back bruh!’.
The researchers have yet to release a tool incorporating their algorithm for public use.
They also note that the algorithm has limitations, such as the fact that it only catches hateful language that includes swear words.
So the system would likely be less successful at identifying hate speech that is sarcastic or more subtly worded.
However, it could have far-ranging benefits for improving conversational AI.
For example, Microsoft has faced roadblock after roadblock after two of its chatbots, Tay and Zo, began spouting racist, sexist and otherwise controversial commentary. Within hours of Tay going live in March 2016, Twitter users took advantage of flaws in its algorithm, and Zo, launched in July 2017, later made controversial comments about the Quran, among other things.
Microsoft was ultimately forced to shut both chatbots down after the experiments ran wild.
Experts say IBM’s algorithm has the potential to be used for censorship and the restriction of free speech.
But Cicero Nogueira dos Santos, a co-author of the study, believes it could be a helpful tool in clamping down on posts containing hate speech, racism and sexism, according to New Scientist.
‘This work is a first step in the direction of a new promising approach for fighting abusive posts on social media,’ the study states.
‘Although we focus on offensive language, we believe that further improvements on the proposed methods will allow us to cope with other types of abusive behaviors.’