IEEE SB NITP: FAKE NEWS DETECTION

Greetings, readers,

While researching this topic, we came across many tools and studies in this domain, and we have tried hard to pick out the most relevant ones. In this article you will find a WhatsApp case study, a Facebook case study, ways to unearth and combat fake news, AI tools, some notable research in the field, and ways to detect fake text generated by AI.

FAKE NEWS DETECTION

In today's internet era, new digital platforms have unleashed innovative forms of communication and greater global reach. On the other hand, disinformation and hoaxes, popularly referred to as “fake news”, are accelerating and affecting the way individuals interpret daily developments.

India’s battle against Covid-19 faces many impediments: a huge population, inadequate infrastructure and more. But beyond these, familiar foes are rearing their heads: misinformation and fake news.

This is not limited to the pandemic. Past events have also shown that fake news and rumours spread like wildfire in the country. Havoc is created as fake news rises and sends shockwaves across the media and the world.

Biggest market for WhatsApp

India is the biggest market for the Facebook-owned messaging application WhatsApp, with more than 400 million users in the country, as per 2019 reports.

Dozens of people have died in mob lynchings by furious crowds during the past four years over WhatsApp rumours. Since the coronavirus outbreak last December in China, misinformation and misleading facts, especially through WhatsApp, have increased, says Shachi Sutaria, a fact checker focused on science and health, with Boom, one of India's leading fact-checking websites. She says:
"Normally, we don't see such high levels of misinformation around health issues in India. Earlier, we would get two to three messages a week on health issues that we would fact check. Now, we get up to five to six messages every day, much of it on coronavirus." [Extracted from her latest interview with Mr. Purohit Jain.]

WhatsApp comes forward for fake news detection

With over two billion users, WhatsApp is one of the most powerful mobile applications for spreading information. That is why, to curb the fake news spreading across social media, especially during the crucial period of the coronavirus outbreak, WhatsApp developers are reportedly working on a feature called Search Messages on the Web.

WhatsApp beta for Android version 2.20.94 may bring in a feature that will help users combat fake news and misinformation. The feature, called Search Messages on the Web, will let a user check whether a message is true or authentic. It has not been very long since WhatsApp started labelling text messages, photos and videos as forwarded, which is one of the determining factors that helps users track information that could be misleading. The new feature is already part of the beta, is under testing, and may roll out in the coming months.

But now the question arises: what could be a possible solution to put out this fire? Technologists have been constantly addressing the problem using Artificial Intelligence and Machine Learning. So, IS ARTIFICIAL INTELLIGENCE-DRIVEN DATA VERACITY THE LENS ON MISINFORMATION?

Let’s check how.

A comprehensive MIT study, published in March 2018, looks at a decade of tweets and finds that not only is the truth slower to spread, but also that the threat of bots and the natural network effects of social media are no excuse: we’re doing it to ourselves.

The study looked at the trajectories of more than 100,000 news stories, independently verified or proven false, as they spread (or failed to) on Twitter. The conclusion, as summarized in the abstract: “Falsehood diffused farther, faster, deeper, and more broadly than the truth in all categories of information.”

When it comes to combating rumours, Facebook has stepped forward by using Artificial Intelligence to search for words and patterns that indicate fake stories.

Facebook: Case Study viz. Fake News Detection

One of the major platforms where fake news spreads mercilessly is Facebook. Since Zuckerberg’s session before Congress last year, Facebook has been trying to curb fake news with new enthusiasm. The efforts, though commendable, are not very effective.

Facebook uses various machine learning algorithms to identify hate text and figure out the context of the text, but it still depends on manual flagging of fake news. It has a team from AP behind it that manually flags news as fake or real. One of the major hindrances is the number of different languages used on the platform and the lack of language-specific reviewers.

ML is also used to generate more sophisticated fake news, which makes it much less likely to be detected. For now, manual detection and reporting of news is the best way Facebook can curb fake news.

AI is now seen as a turning point in detecting and checking fake news.
AI makes it possible to understand behaviours through pattern recognition, drawing on stories that were flagged as fake in the past. As the volume of data increases day by day, so does the need to handle hoaxes. AI has become a beacon of hope for ensuring data veracity and, above all, for fake news detection.
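
As a rough illustration of learning word patterns from previously flagged stories, here is a minimal sketch of a naive Bayes text classifier. The headlines, labels and function names are made up for this example, and this is not a description of Facebook's actual system.

```python
import math
from collections import Counter

PUNCTUATION = ".,!?:;\"'"

def tokenize(text):
    """Lowercase the text, split on whitespace and strip simple punctuation."""
    cleaned = (w.strip(PUNCTUATION).lower() for w in text.split())
    return [w for w in cleaned if w]

def train(examples):
    """examples: list of (text, label). Returns per-label word counts and story counts."""
    word_counts = {}
    doc_counts = Counter()
    for text, label in examples:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokenize(text))
    return word_counts, doc_counts

def predict(text, word_counts, doc_counts):
    """Return the label with the highest log-probability (Laplace smoothing)."""
    vocab = set()
    for counts in word_counts.values():
        vocab.update(counts)
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, float("-inf")
    for label, counts in word_counts.items():
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(counts.values())
        for word in tokenize(text):
            score += math.log((counts[word] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical, hand-written examples (not real flagged stories).
FLAGGED = [
    ("Miracle cure kills virus overnight, doctors shocked", "fake"),
    ("Shocking secret cure they do not want you to know", "fake"),
    ("Health ministry issues new guidelines on hand washing", "real"),
    ("Ministry reports new cases and issues travel guidelines", "real"),
]

word_counts, doc_counts = train(FLAGGED)
print(predict("Shocking miracle cure for virus", word_counts, doc_counts))
```

With only four training headlines this is a toy, but it shows the idea: words that appeared often in stories flagged as fake pull a new headline towards the "fake" label.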

So, now the question arises: HOW TO DEAL WITH MISINFORMATION DYNAMICS? Misinformation dynamics links fake news to a big data concept called data veracity. Reports suggest that AI can again help us solve this problem.

METHODS TO UNEARTH FAKE NEWS

AI is all set to identify fake news, and we are keen to find out how. Some methods are:

So, now we would like to bring you some tools that could help combat fake news:

NEWSWHIP

NewsWhip (www.newswhip.com) is a social media engagement tracking firm that tracks and predicts the impact of millions of stories, empowering the world’s news and communications professionals. It tracks content by the amount and location of user engagement, and also tracks audience interest and how it changes over time.

HOW DOES IT WORK?

1) Social monitoring platform - The social monitoring platform, NewsWhip Spike, uses APIs from leading social platforms to provide accurate, real-time insights into the stories, videos, pictures and topics that are spreading fastest on social media. An algorithm builds a picture of what content is getting attention among different audiences.

2) Social media intelligence - NewsWhip Spike and the social analytics platform NewsWhip Analytics use data signals such as social media reactions to provide content intelligence.

3) Facebook analytics tool - NewsWhip Analytics provides insight into the responses to a Facebook page's content.

4) Social influencer tracking - Using the NewsWhip discovery platform, one can monitor social influencers, and then use the analytics platform to look at historical trends in social influencer monitoring.

NewsWhip has received very positive reviews from news agencies, publishers, brands, creative agencies, marketing agencies, PR and communications firms, governments and non-profit organisations.

SNOPES

Snopes is a fact-checking website that helps determine the veracity of the latest news. The website researches a claim extensively across various sources and then publishes its findings, so it is widely regarded as a reliable source of information. It has been used by prominent newspapers and magazines as well.

CROWDTANGLE

CrowdTangle is a social media monitoring and content discovery tool with multi-user capabilities. It has developed an intelligence tool that discovers which posts on various social media pages are performing best, and enables news outlets to compare the performance of up to five of their social accounts on Facebook, Twitter or Reddit.

It is a web-based service and integrates with other apps such as Slack. Its main users are social media strategists in small, medium and large enterprises. It costs about $449 per month for one platform, $899 for three platforms and $1299 for five platforms with an annual contract. It scored 86/100 in the Social Media category, with a user satisfaction score of 93/100.

MEEDAN

Meedan is a non-profit social technology company. It aims to increase global interaction on the web by translating texts between English and Arabic, using machine translation and machine-augmented translation. People across the world appreciate the initiative and find it user-friendly, as it makes their work easier. The facts and news being translated are also checked along the way.

GOOGLE TRENDS

Google Trends is a website by Google that analyses the popularity of top search queries in Google Search. With it, one can see how search volume for a term has varied over time and across locations. The location, time frame, category or industry, and type of search (web, news, shopping or YouTube) can be changed for more fine-grained data.

HOW DOES IT WORK?

Google Trends works on a sample of Google searches. Sampling gives a dataset that is representative of all searches and can be shown graphically, while allowing insights to be processed within minutes of an event happening in the real world.

NORMALISATION OF DATA

Google Trends normalizes search data to make comparisons between terms easier. Search results are normalized to the time and location of a query by the following process:

• Each data point is divided by the total searches of the geography and time range it represents, to compare relative popularity. Otherwise, places with the most search volume would always be ranked highest.

• The resulting numbers are then scaled on a range of 0 to 100 based on a topic’s proportion to all searches on all topics.
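
The two steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name and the sample numbers are made up, and scaling so that the most popular period gets 100 is our reading of the "0 to 100" range, not Google's published code.

```python
def normalise(term_counts, total_counts):
    """Turn raw search counts into 0-100 relative popularity values.

    term_counts[i]  = searches for the term in period (or region) i
    total_counts[i] = all searches in that same period (or region)
    """
    # Step 1: divide by total searches so that busy periods do not dominate.
    shares = [t / total if total else 0.0
              for t, total in zip(term_counts, total_counts)]
    peak = max(shares, default=0.0)
    if peak == 0:
        return [0 for _ in shares]
    # Step 2: scale to 0-100 (assumption: the highest share maps to 100).
    return [round(100 * share / peak) for share in shares]

# Hypothetical numbers for three time periods.
term_counts = [50, 200, 100]
total_counts = [1000, 2000, 4000]
print(normalise(term_counts, total_counts))  # [50, 100, 25]
```

Note that the third period has more raw searches for the term than the first (100 against 50) but a lower score, because it is a smaller share of all the searches made in that period.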

Google Trends is very popular among industrialists, competitors and professionals at major companies across the globe, and reviewers have rated its use cases and deployment scope highly.

LES DÉCODEURS

Les Décodeurs, or The Decoders, is an initiative started in 2014 by the French daily newspaper ‘Le Monde’. It is a section of the newspaper whose purpose is to verify information on various themes. In 2017, its journalists created Décodex, a search engine whose objective is to provide as many simple tools as possible to facilitate the verification of information. The creators believe that, while it may not be possible to verify all the information circulating online, Décodex can give everyone the means to discern the most obvious falsehoods and to be warned when consulting a site known for spreading false information.

PHEME

Pheme is a tool powered by Artificial Intelligence and Machine Learning algorithms that marks a technological leap in assessing the veracity of user-generated and online content.

Pheme intends to better understand and automate the question of veracity. The interdisciplinary big data project is being funded by the European Union and is named after Pheme, the Greek goddess of fame and rumours. It brings together partners from the domains of natural language processing and text mining, web science, social network analysis, and information visualization.

It also aims to develop and release veracity intelligence algorithms as open source material so that we can all benefit from them. Such algorithms could then be applied to social media networks, web search or email systems to detect rumours, lies, or any other kind of misinformation being spread.

SOME OTHER WAYS FOR FAKE NEWS DETECTION

Fake news detection through Geometric Deep Learning

UK startup ‘Fabula AI’ reckons it has devised a way for artificial intelligence to help user-generated content platforms get on top of the disinformation crisis that keeps rocking the world of social media with antisocial scandals.

Fabula has patented what it dubs a “new class” of machine learning algorithms to detect “fake news”. They belong to the emergent field of “Geometric Deep Learning”, where the datasets to be studied are so large and complex that traditional machine learning techniques struggle to find purchase on this ‘non-Euclidean’ space.

Geometric Deep Learning - It is the class of deep learning that can operate on non-Euclidean data (such as molecules, graphs, trees and networks), with the goal of teaching models to perform predictions and classifications on these data types.

Fabula's approach to detecting disinformation does not rely on algorithms parsing news content to identify malicious nonsense. Instead, it looks at how such stuff spreads on social networks, and therefore also at who is spreading it.

There are characteristic patterns in how ‘fake news’ spreads compared with the genuine article, as the MIT study mentioned above found.

The essence of geometric deep learning is that it can work with network-structured data. So here we can incorporate heterogeneous data such as user characteristics, the social network interactions between users, and the spread of the news itself: so many features that would otherwise be impossible to deal with using traditional machine learning techniques.
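
To give a feel for what "working with network-structured data" means, here is a minimal sketch of one message-passing step on a small sharing graph, where every user's feature vector is mixed with those of the users they are connected to. This is a hand-written toy, not Fabula's patented algorithm: the user names, features and functions are hypothetical, and a real geometric deep learning model would also apply learned weights and non-linearities at each step.

```python
def aggregate(features, edges):
    """One message-passing round on an undirected graph.

    Each node's new vector is the mean of its own vector and the
    vectors of its neighbours.
    """
    neighbours = {node: [node] for node in features}
    for a, b in edges:
        neighbours[a].append(b)
        neighbours[b].append(a)
    updated = {}
    for node, group in neighbours.items():
        dim = len(features[node])
        updated[node] = [sum(features[n][i] for n in group) / len(group)
                         for i in range(dim)]
    return updated

def cascade_embedding(features, edges, rounds=2):
    """Summarise a whole sharing cascade as a single vector."""
    for _ in range(rounds):
        features = aggregate(features, edges)
    dim = len(next(iter(features.values())))
    return [sum(vec[i] for vec in features.values()) / len(features)
            for i in range(dim)]

# Hypothetical cascade: user "a" posts a story, "b" reshares it from "a",
# and "c" reshares it from "b". The single feature per user is an
# illustrative "share of past posts that were flagged" value.
features = {"a": [1.0], "b": [0.0], "c": [0.0]}
edges = [("a", "b"), ("b", "c")]

print(aggregate(features, edges))
print(cascade_embedding(features, edges))
```

After one round, user "b" already carries some of the signal from "a", which is how information about who is spreading a story ends up in the features a classifier sees.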

Fabula envisages its own role, as the company behind the tech, as that of an open, decentralised “truth-risk scoring platform” — akin to a credit referencing agency just related to content, not cash.
Scoring comes into it because the AI generates a score for each piece of content, reflecting how confident it is that the piece is fake rather than true news.

In its own tests Fabula says its algorithms were able to identify 93 percent of “fake news” within hours of dissemination — which is “significantly higher” than any other published method for detecting ‘fake news’.

For their training dataset Fabula relied on true/fake labels attached to news stories by third party fact checking NGOs, including Snopes and PolitiFact. And, overall, pulling together the dataset was a process of “many months”.

Fake news detection through Stance Detection

In a paper presented at the 2019 NeurIPS AI conference, researchers at Darwin AI and Canada’s University of Waterloo presented an AI system that uses advanced language models to automate stance detection, an important first step towards identifying disinformation.

Before creating an AI system that can fight fake news, we must first understand the requirements of verifying the veracity of a claim. In their paper, the AI researchers break down the process into the following steps:

1) Retrieving documents that are relevant to the claim.

2) Detecting the stance or position of those documents with respect to the claim.

3) Calculating a reputation score for the document, based on its source and language quality.

4) Verifying the claim based on the information obtained from the relevant documents.
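
The four steps can be sketched as a small pipeline. Everything here is a simplified stand-in written for this article, not the researchers' code: retrieval is plain word overlap, the stance detector is a keyword placeholder where a trained model would go, the reputation score only looks at the source (not language quality), and the documents and scores are invented.

```python
from dataclasses import dataclass

@dataclass
class Document:
    source: str
    text: str

def words(text):
    """Lowercased set of words with simple punctuation stripped."""
    return {w.strip(".,!?:;").lower() for w in text.split()}

def retrieve(claim, corpus, min_overlap=2):
    """Step 1: keep documents that share enough words with the claim."""
    claim_words = words(claim)
    return [d for d in corpus
            if len(claim_words & words(d.text)) >= min_overlap]

def detect_stance(claim, doc):
    """Step 2 (placeholder): -1 disagrees, +1 agrees, 0 no stance."""
    text = words(doc.text)
    if text & {"false", "hoax", "debunked", "myth"}:
        return -1
    if text & {"confirmed", "confirms", "true"}:
        return 1
    return 0

def reputation(doc, scores):
    """Step 3: source reputation in [0, 1]; unknown sources get 0.5."""
    return scores.get(doc.source, 0.5)

def verify(claim, corpus, scores):
    """Step 4: reputation-weighted vote over the retrieved documents."""
    total = sum(detect_stance(claim, d) * reputation(d, scores)
                for d in retrieve(claim, corpus))
    if total > 0:
        return "supported"
    if total < 0:
        return "refuted"
    return "unverified"

# Hypothetical documents and reputation scores.
corpus = [
    Document("health-agency.example",
             "Claim that garlic cures coronavirus is false, experts say"),
    Document("random-blog.example",
             "Confirmed: garlic cures coronavirus in one day"),
    Document("sports.example", "Local team wins cricket final"),
]
scores = {"health-agency.example": 0.9, "random-blog.example": 0.2}

print(verify("Garlic cures coronavirus", corpus, scores))  # refuted
```

The sports article is dropped at step 1, and the disagreeing document from the more reputable source outweighs the agreeing blog post at step 4.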

Instead of going for an end-to-end AI-powered fake-news detector that takes a piece of news as input and outputs “fake” or “real”, the researchers focused on the second step of the pipeline. They created an AI algorithm that determines whether a certain document agrees, disagrees, or takes no stance on a specific claim.

The University of Waterloo researchers used RoBERTa, an open-source deep bidirectional transformer language model that was developed by Facebook in 2019.

Transformers, a type of deep learning architecture, use a technique called attention to find the relevant bits of information in a sequence, instead of processing it strictly one element at a time. This lets them handle long sequences more efficiently than earlier sequential deep learning models. Transformer language models are also pre-trained on unlabelled text, which means the pre-training stage doesn’t require the time- and labour-intensive data-labelling work that goes into most contemporary AI work.

For stance detection, the researchers used the dataset used in the Fake News Challenge (FNC-1), a competition launched in 2017 to test and expand the capabilities of AI in detecting online disinformation. The dataset consists of 50,000 articles as training data and a 25,000-article test set. The AI takes as input the headline and text of an article, and outputs the stance of the text relative to the headline. The body of the article may agree or disagree with the claim made in the headline, may discuss it without taking a stance, or may be unrelated to the topic.

The RoBERTa-based stance-detection model presented by the University of Waterloo researchers scored better than the AI models that won the original FNC competition.

A significant advantage of deep bidirectional transformer language models is that we can harness pre-trained models, which have already been trained on very large datasets using significant computing resources, and then fine-tune them for specific tasks such as stance-detection.

THE OTHER SIDE

Every coin has two sides. We have relied heavily on AI for checking fake news. But reports also say that AI can be used to spread fake news, write fake reviews, and create a pretend mob of social media users, made up of AI bots, that bombards comment sections with specific agendas.
However, Harvard University and MIT-IBM Watson AI Lab researchers recently developed a new tool that spots text that has been generated by AI.

The tool, called the Giant Language Model Test Room (GLTR), takes advantage of the fact that AI text generators use fairly predictable statistical patterns in text.

While these patterns might not be easy to spot by an average reader, it seems that an algorithm can do a pretty good job at it. The AI tool, essentially, can tell if the text is too predictable to have been written by a human.

How does it work?

GLTR tests for predictability in texts by looking at the statistical probability of one word being chosen after another in a sentence.

We ran the opening passage of 1984 through the tool in order to put George Orwell through his paces — and to prove he wasn't really an AI sent back in time from the future.

Less predictable words are flagged in purple; a sprinkling of these throughout a text suggests that it was most likely written by a human hand.

Green words, on the other hand, are the most predictable, while yellow and red fall in between.
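
The colour-coding idea can be sketched with a toy language model. The real GLTR uses a large neural language model; the bigram counter, the tiny corpus and all function names below are stand-ins made up for illustration. The default cut-offs of 10, 100 and 1000 follow GLTR's colour scheme as we understand it, so treat them as an assumption.

```python
from collections import Counter, defaultdict

def build_bigram_model(corpus):
    """Count which word follows which in a whitespace-tokenised corpus."""
    model = defaultdict(Counter)
    tokens = corpus.lower().split()
    for prev, nxt in zip(tokens, tokens[1:]):
        model[prev][nxt] += 1
    return model

def rank_of(model, prev, word):
    """1-based rank of `word` among predictions after `prev` (None if unseen)."""
    ranked = [w for w, _ in model.get(prev, Counter()).most_common()]
    return ranked.index(word) + 1 if word in ranked else None

def colour(rank, thresholds=(10, 100, 1000)):
    """Map a prediction rank to a colour: green is the most predictable."""
    green, yellow, red = thresholds
    if rank is None or rank > red:
        return "purple"
    if rank <= green:
        return "green"
    if rank <= yellow:
        return "yellow"
    return "red"

def annotate(model, text, thresholds=(10, 100, 1000)):
    """Colour every word after the first, given the word before it."""
    tokens = text.lower().split()
    return [(word, colour(rank_of(model, prev, word), thresholds))
            for prev, word in zip(tokens, tokens[1:])]

# Toy corpus; the tiny thresholds (1, 2, 3) are only there because
# the toy vocabulary is so small.
corpus = "the cat sat on the mat the cat ate the fish the dog sat on the rug"
model = build_bigram_model(corpus)
for word, shade in annotate(model, "the cat sat on the moon", (1, 2, 3)):
    print(word, shade)
```

Here "moon" never follows "the" in the toy corpus, so it is the one word flagged purple, while the words the model would have guessed first come out green.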

Of course, we can envision a future in which text-generating machine-learning tools are trained on these purple words in order to trick GLTR by coming across as more human — here's hoping the GLTR researchers can keep up.

To test GLTR, the researchers asked Harvard students to identify AI-generated text — firstly with the tool, and then without it.

The students only successfully spotted half of all fake texts on their own. With the tool, on the other hand, they spotted 72%.

So, we can conclude that Artificial Intelligence offers many ways to unearth fake news, combat it, and even detect fake text generated by AI itself.

Thanks for reading. We hope you liked the article and will use one of these tools to check any fake news that reaches you amidst this chaos and confusion.

IEEE SB NITP would like to once again remind its readers to obey the guidelines as issued by the Government and WHO. Stay at home and be safe. Please drop your views in the comment section.

IEEE SB NITP

Thursday, 26 March 2020
