The anti-swearing AI trying to clean up abuse on Reddit and Twitter

IBM researchers have created an algorithm to police the internet’s potty mouths. 

Instead of outright removing the offensive language, the algorithm suggests alternative, more polite words to use in their place. 

The researchers collected roughly 10 million posts from Twitter and Reddit, labeling them as containing offensive or non-offensive language.  

 

IBM researchers have created an algorithm to police the internet’s potty mouths. Instead of removing the offensive language, the algorithm suggests more polite words to use

They chose to suggest non-offensive words, rather than removing posts outright, to prevent authoritarian governments or companies from abusing the tool to clamp down on critical or political commentary.  

Ultimately, the goal is to reduce the prevalence of hate speech on popular social media platforms like Twitter, Reddit, Facebook and others. 

‘The use of offensive language is a common problem of abusive behavior on online social media networks,’ the researchers explained. 

‘Various work in the past have attacked this problem by using different machine learning models to detect abusive behavior.’

‘Most of these work follow the assumption that it is enough to filter out the entire offensive post.’

‘However, a user that is consuming some online content may not want an entirely filtered out message, but instead have it in a style that is non-offensive and still be able to comprehend it in a polite tone,’ they added.       

The researchers collected roughly 10 million posts from Twitter and Reddit, labeling them as containing offensive or non-offensive language 

The algorithm begins by analyzing the sentence’s meaning and whether it includes offensive language.

From there, once the text is confirmed to include offensive terms, the algorithm generates a less offensive phrase. 

A third prong of the algorithm analyzes whether the new sentence has changed in tone. 
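The study itself does not publish code, but the three steps described above can be sketched in simplified form. In the sketch below, the function names and the tiny substitution table are illustrative assumptions for the sake of the example; IBM's actual system relies on trained neural models rather than word lists.

```python
# Minimal sketch of the three-stage flow described above. The function
# names and the tiny substitution table are illustrative assumptions;
# IBM's system uses trained neural models, not word lists.

OFFENSIVE_TO_POLITE = {"f***": "hell", "b******": "bruh"}  # toy lookup table

def is_offensive(sentence: str) -> bool:
    """Stage 1: flag sentences that contain offensive terms."""
    return any(word in OFFENSIVE_TO_POLITE for word in sentence.lower().split())

def suggest_polite_rewrite(sentence: str) -> str:
    """Stage 2: propose a less offensive alternative phrasing."""
    return " ".join(OFFENSIVE_TO_POLITE.get(word, word) for word in sentence.split())

def tone_preserved(original: str, rewrite: str) -> bool:
    """Stage 3: crude check that the rewrite keeps most of the original wording."""
    kept = [word for word in original.split() if word in rewrite.split()]
    return len(kept) / max(len(original.split()), 1) > 0.5

post = "For f*** sake, first world problems are the worst"
if is_offensive(post):
    rewrite = suggest_polite_rewrite(post)
    if tone_preserved(post, rewrite):
        print(rewrite)  # For hell sake, first world problems are the worst
```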

The result was that, in almost all cases, the algorithm was able to produce ‘reliable, non-offensive transferred sentences’.

‘Our proposed method achieves high accuracy on both datasets, which means that almost 100% of the time [the] classifier detects that the transferred sentences are non-offensive,’ according to the study.   
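In effect, the accuracy the researchers quote is the share of rewritten sentences that a separately trained offensiveness classifier marks as clean. A minimal sketch of that metric follows, using a stand-in wordlist classifier rather than the trained classifier the study relies on.

```python
# Rough sketch of the accuracy figure quoted above: the share of rewritten
# sentences that an offensiveness classifier labels as non-offensive.
# classify_offensive is a stand-in assumption; the study uses a trained classifier.

def classify_offensive(sentence: str) -> bool:
    return any(token in {"f***", "b******"} for token in sentence.lower().split())

def transfer_accuracy(rewritten: list[str]) -> float:
    non_offensive = sum(1 for s in rewritten if not classify_offensive(s))
    return non_offensive / len(rewritten)

rewrites = ["For hell sake, first world problems are the worst", "i'm back bruh!"]
print(transfer_accuracy(rewrites))  # 1.0
```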

The researchers provide several examples in the study of instances where the algorithm was able to effectively produce less offensive phrasing. 

The researchers provided several examples in the study (pictured) of instances where the algorithm was able to effectively produce less offensive phrasing

Microsoft has continued to face roadblock after roadblock after two of its chatbots, Zo and Tay, began spouting off racist, sexist and controversial commentary

For example, one post from Reddit said: ‘For f*** sake, first world problems are the worst’.

After the algorithm analyzed the sentence, it produced the following: ‘For hell sake, first world problems are the worst’. 

Interestingly, the researchers also compared their algorithm to previous work in the field. 

For the first world problems scenario, previous research generated: ‘For the money, are one different countries’. 

In that case, the difference was clear: the phrasing produced by the earlier system read more like gibberish than a natural sentence. 

In another example, the algorithm transformed ‘i’m back b******!’ to ‘i’m back bruh!’.

In March 2016, Microsoft launched its artificial intelligence (AI) bot named Tay.  Within hours of it going live, Twitter users took advantage of flaws in Tay’s algorithm

WHAT HAPPENED TO TAY?

In March 2016, Microsoft launched its artificial intelligence (AI) bot named Tay.

It was aimed at 18-to-24-year-olds and was designed to improve the firm’s understanding of conversational language among young people online.

But within hours of it going live, Twitter users took advantage of flaws in Tay’s algorithm that meant the AI chatbot responded to certain questions with racist answers.

These included the bot using racial slurs, defending white supremacist propaganda, and supporting genocide.

The bot managed to spout offensive tweets such as, ‘Bush did 9/11 and Hitler would have done a better job than the monkey we have got now.’

And, ‘donald trump is the only hope we’ve got’, in addition to ‘Repeat after me, Hitler did nothing wrong.’

Followed by, ‘Ted Cruz is the Cuban Hitler…that’s what I’ve heard so many others say.’

The offensive tweets have now been deleted.  

The researchers have yet to release a tool incorporating their algorithm for public use.

They also note that there are limitations to the algorithm, such as the fact that the hateful language must include swear words. 

So the system would likely be less successful at identifying hate speech that’s sarcastic or includes greater nuance. 

However, it could have far-ranging benefits for improving conversational AI. 

For example, Microsoft has continued to face roadblock after roadblock after two of its chatbots, Zo and Tay, began spouting off racist, sexist and controversial commentary. 

The company was ultimately forced to shut down both chatbots after the experiments ran wild.

Microsoft was ultimately forced to shut down Tay and Zo after the experiments ran wild. The Zo AI was launched in July 2017 but made controversial comments about the Quran and other topics

Experts say IBM’s algorithm could also be used for censorship and the restriction of free speech.

But Cicero Nogueira dos Santos, a co-author of the study, believes it could be a helpful tool in clamping down on posts containing hate speech, racism and sexism, according to New Scientist. 

‘This work is a first step in the direction of a new promising approach for fighting abusive posts on social media,’ the study states.   

‘Although we focus on offensive language, we believe that further improvements on the proposed methods will allow us to cope with other types of abusive behaviors.’ 


