Propaganda is a new weapon that influences people's opinions or beliefs about an ideology, whether that ideology is right or wrong.
Working in that direction will raise the profile of the domain and can produce many open-source solutions. Early identification of propaganda is crucial for fighting the spread of manipulation in the news and for testing hypotheses, such as which news sources are biased and propagandistic. It will also benefit individual users, who will be able to check the integrity of news sources.
This motivated us to organize a "Hack the News" Datathon, a Data Science Hackathon open to the global community.
Data Enthusiasts and Experts in the area of text-mining and NLP are challenged to develop a machine learning model for identifying propaganda using their preferred data analysis methods and tools and following cutting-edge methodology.
We believe that having fun together while solving such a problem is something we need to focus on.
Everything began on April 11, 2018, when three of our most active members and experienced Data Scientists, Laura, Preslav, and Viktor, came up with the idea of having a Datathon on news analysis. Alberto, Giovanni, and Preslav were already actively investigating automatic propaganda identification, so focusing the Datathon on this task came naturally. Alberto and Giovanni then took the lead in defining the data case and its unique dataset. This could not have happened without the support of our partners from A Data Pro, who helped with the data annotation for building the unique dataset. Thanks to the teams’ effort and dedication, the data case is now ready for the global community to dive deep into the problem and find sophisticated solutions.
Propaganda is defined as the deliberate spreading of ideas, facts, or allegations to influence the opinions of an audience toward predetermined ends. This case focuses on detecting the use of propaganda in news articles. Two training sets of news articles written in English are provided. One is annotated at the article level as propagandistic or not. The other is annotated at the fragment level with one of 18 propaganda techniques (or no technique). The purpose of the Datathon is to develop intelligent systems, using the training sets provided, that can classify entire articles as well as text fragments as propagandistic or not. A publicly available leaderboard will rank the participants' solutions in real time, as measured on a validation set. The final evaluation will be on a separate test set, on which no feedback will be provided until the end of the competition.
The challenge is divided into three levels of difficulty, which correspond to the expertise and personal preference of the participants.
Given a news article, you are required to build an intelligent system that is able to detect whether the article is propagandistic or not.
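As a minimal sketch of what an article-level system could look like, the following trains a tiny multinomial Naive Bayes classifier on word counts. The toy articles, the label names, and the helpers `tokenize`, `train`, and `predict` are assumptions made for illustration; the Datathon does not prescribe a method or a data format.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train(articles, labels):
    """Collect per-class word counts, class frequencies, and the vocabulary."""
    word_counts = defaultdict(Counter)
    class_counts = Counter(labels)
    for text, label in zip(articles, labels):
        word_counts[label].update(tokenize(text))
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, class_counts, vocab


def predict(model, text):
    """Return the class with the highest log-posterior (add-one smoothing)."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocab:
                continue  # ignore words never seen during training
            count = word_counts[label][word]
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy training data (invented for this sketch, not taken from the Datathon sets)
articles = [
    "The traitors will destroy our glorious nation",
    "The council approved the budget on Tuesday",
    "Only a fool would trust these corrupt enemies",
    "The report lists rainfall figures for March",
]
labels = ["propaganda", "non-propaganda", "propaganda", "non-propaganda"]

model = train(articles, labels)
print(predict(model, "These corrupt traitors are enemies of the nation"))
```

A real submission would need far more data and stronger features, but the train-then-predict shape stays the same.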
Given a news article, you are required to build an intelligent system that is able to detect whether each of its sentences is propagandistic or not. A sentence is considered propagandistic if it contains at least one of the eighteen propaganda techniques (the definitions are available in the detailed description).
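The sentence-level rule can be illustrated with a short sketch that derives sentence labels from fragment annotations. It assumes fragments are given as half-open character offsets with a technique name, and it uses a naive punctuation-based sentence splitter; both the annotation format and the technique names here are hypothetical, not the official ones.

```python
import re


def split_sentences(article):
    """Return (start, end, text) per sentence, splitting naively on ., ! and ?."""
    sentences = []
    for match in re.finditer(r"[^.!?]+[.!?]*", article):
        raw = match.group()
        stripped = raw.strip()
        if stripped:
            # Skip leading whitespace so offsets point at the first character
            start = match.start() + (len(raw) - len(raw.lstrip()))
            sentences.append((start, start + len(stripped), stripped))
    return sentences


def label_sentences(article, fragments):
    """Mark a sentence True if at least one annotated fragment overlaps it."""
    labels = []
    for sent_start, sent_end, _ in split_sentences(article):
        hit = any(
            frag_start < sent_end and frag_end > sent_start
            for _, frag_start, frag_end in fragments
        )
        labels.append(hit)
    return labels


article = (
    "The vote is tomorrow. Only traitors oppose our glorious plan! "
    "Polls open at nine."
)

# Hypothetical fragment annotations: (technique, start, end)
start = article.index("traitors")
fragments = [("name_calling", start, start + len("traitors"))]

for (_, _, text), is_propaganda in zip(
    split_sentences(article), label_sentences(article, fragments)
):
    print(is_propaganda, text)
```

Only the second sentence overlaps an annotated fragment, so it is the only one flagged.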
Given a news article, you are required to build an intelligent system that is able to spot the location of the uses of each of the propagandistic techniques, if any.
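For the fragment-level task, a system has to output where each technique occurs. The sketch below is a deliberately simple lexicon lookup that emits `(start, end, technique)` spans; the cue phrases, the technique names, and the output format are assumptions for illustration only, and a competitive solution would likely replace the lookup with a learned sequence model.

```python
import re

# Hypothetical cue lexicon: technique names and phrases are illustrative only
LEXICON = {
    "name_calling": ["traitors", "enemies of the people"],
    "loaded_language": ["glorious", "disgraceful"],
}


def find_spans(article, lexicon=LEXICON):
    """Return sorted (start, end, technique) tuples for every lexicon hit."""
    spans = []
    for technique, phrases in lexicon.items():
        for phrase in phrases:
            # Word boundaries avoid matching inside longer words
            pattern = r"\b" + re.escape(phrase) + r"\b"
            for match in re.finditer(pattern, article, flags=re.IGNORECASE):
                spans.append((match.start(), match.end(), technique))
    return sorted(spans)


sample = "Only traitors oppose our glorious plan."
for start, end, technique in find_spans(sample):
    print(start, end, technique, repr(sample[start:end]))
```

Offsets are half-open character positions, so `sample[start:end]` recovers the flagged text directly.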
All times are in the UTC time zone (± 00:00).
If you participate from a Datathon location, please be aware that local times will differ due to the different time zones.
Pre-event week
Get to know other participants & Pre-event team forming
21.Jan Registration on the DSS website and login to Data.Chat (mandatory for all participants)
Get to know the DSS platform
22.Jan Deadline for filling in your profiles
Dataset is open for exploration
23.Jan Ask advisors your questions
24.Jan End of registration for the matchmaking process (if you don't have a team, we will suggest the smartest participants for you to team up with! 🙂 )
If you want to participate, leave your email and we will remind you. A good cause needs a good stimulus. Let's raise a bigger crowdfunded award for those who will develop the tool for fighting propaganda!
Fukuyama couldn’t have been more wrong when he predicted, in 1989, the end of history and the triumph of liberal democracy. Bad actors are using fake news, propaganda, and disinformation to advance dangerous ideologies. Can we use AI to regain trust in journalism and weed out biased and untrustworthy news sources?
Viktor Senderov Guest Researcher @ Naturhistoriska riksmuseet
A tool for detecting the use of propaganda may have an impact on the way readers consume news in the future, and it could be used by media platforms to establish their credibility. I am looking forward to seeing many smart ideas, since this is a challenging task that will probably require thinking outside the box to deliver a successful solution.
Giovanni Da San Martino Scientist @ Qatar Computing Research Institute
The best way to fight disinformation is by raising awareness. Disinformation comes in different flavors, e.g., fake news, propaganda, bias. DSS already organized a hackathon on “Fake News” last year, and now the aim is a harder nut to crack: detecting propaganda and spotting the use of propaganda techniques in news article texts. I am very excited and looking forward to a great hackathon.
Preslav Nakov Senior Scientist @ Qatar Computing Research Institute
Fake news has reached alarming levels and needs to be dealt with immediately. In many cases, propaganda is equivalent to fake news, as it aims to mislead readers using different strategies. Detecting propaganda will help users perceive news in a better way and, consequently, shape public opinion according to more truthful information.
Ramy Baly Postdoctoral associate @ MIT CSAIL
As never before, spreading propaganda is at the fingertips of anybody, big or small business. Making people aware of it is crucial to reduce its impact.
Alberto Barrón-Cedeño Scientist @ QCRI
Propaganda in news is ubiquitous, ranging from blatant to extremely subtle and effective. As it has in the past, it can lead to economic and social disasters. To fight it at scale, algorithms are necessary. Thankfully, the first annotated datasets of propaganda in the news are emerging. I am looking forward to the ingenious Machine Learning models that can reveal patterns useful for automatic detection.
Laura Tolosi Data Scientist
Contributors
The companies, organizations, and people who support us in this good cause.
Data Science Society (a volunteer organization) develops a friendly environment where data enthusiasts are able to learn, share, and experiment with real data cases within our global family. We organize online Datathons, monthly challenges, digital meetups, webinars, workshops, summer schools, and many other events. On the Data.Platform there are more than 1900 registered data scientists from more than 50 countries.