Why Hack the News Datathon?

Propaganda is a new weapon that influences people's opinions or beliefs about an ideology, whether that ideology is right or wrong.

Working on propaganda detection will make the domain more popular and can generate many open-source solutions. Early identification of propaganda is crucial for fighting the manipulation spread through news and for testing hypotheses, such as which news sources are biased and propagandistic. It will also benefit individual users, who will be able to check the integrity of news sources.

This motivated us to organize a "Hack the News" Datathon, a Data Science Hackathon open to the global community.

Data Enthusiasts and Experts in the area of text-mining and NLP are challenged to develop a machine learning model for identifying propaganda using their preferred data analysis methods and tools and following cutting-edge methodology.

We also believe that having fun together while solving such a problem deserves our focus.

Watch the Finals

Organizers

Everything began on April 11, 2018, when three of our most active members and experienced Data Scientists, Laura, Preslav, and Viktor, came up with the idea of holding a Datathon on news analysis. Alberto, Giovanni, and Preslav were already actively investigating automatic propaganda identification, so focusing the Datathon on this task came naturally. Alberto and Giovanni then took the lead in defining the data case and its unique dataset. This could not have happened without the support of our partners from A Data Pro, who helped with the data annotation for building the dataset.
Thanks to the teams’ effort and dedication, the data case is now ready for the global community to dive deep into the problem and find sophisticated solutions.

The Case

Propaganda is defined as the deliberate spreading of ideas, facts, or allegations to influence the opinions of an audience with reference to predetermined ends. This case focuses on detecting the use of propaganda in news articles.

Two training sets of news articles written in English are provided. One is annotated at the article level: each article is labeled as propagandistic or not. The other is annotated at the fragment level: each fragment is labeled with one of 18 propaganda techniques (or no technique).

The purpose of the Datathon is to develop intelligent systems, using the training sets provided, that are able to classify entire articles as well as text fragments as propagandistic or not.

A publicly available leaderboard will rank the solutions of the participants in real time, as measured on a validation set. The final evaluation will be on a separate test set, on which no feedback will be provided until the end of the competition.

The challenge is divided into three levels of difficulty, which correspond to the expertise and personal preference of the participants.

Difficulty Levels

  1. Given a news article, you are required to build an intelligent system that is able to detect whether the article is propagandistic or not.
  2. Given a news article, you are required to build an intelligent system that is able to detect whether each of its sentences is propagandistic or not. A sentence is considered propagandistic if it contains at least one of the eighteen propaganda techniques (the definitions are available in the detailed description).
  3. Given a news article, you are required to build an intelligent system that is able to spot the location of each use of a propaganda technique, if any.
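
As a rough illustration of the sentence-level rule above, the sketch below marks a sentence as propagandistic when at least one annotated fragment overlaps it. The annotation format (character offsets plus a technique name), the one-sentence-per-line assumption, the function names, and the example text are all hypothetical and are not taken from the Datathon data description.

```python
from typing import List, Tuple

# Hypothetical annotation format: (start_offset, end_offset, technique_name),
# with end_offset exclusive. The real dataset format may differ.
Span = Tuple[int, int, str]


def sentence_offsets(article: str) -> List[Tuple[int, int]]:
    """Return (start, end) character offsets for each non-empty line.

    Assumes one sentence per line, which is a simplification.
    """
    offsets = []
    position = 0
    for line in article.split("\n"):
        if line.strip():
            offsets.append((position, position + len(line)))
        position += len(line) + 1  # +1 for the newline character
    return offsets


def label_sentences(article: str, spans: List[Span]) -> List[bool]:
    """Mark a sentence as propagandistic if any annotated span overlaps it."""
    labels = []
    for sent_start, sent_end in sentence_offsets(article):
        overlaps = any(
            span_start < sent_end and span_end > sent_start
            for span_start, span_end, _technique in spans
        )
        labels.append(overlaps)
    return labels


# Made-up example: three sentences, one annotated fragment in the second.
article = (
    "The council met on Monday.\n"
    "Only a traitor would oppose this plan.\n"
    "The vote is next week."
)
start = article.index("traitor")
spans = [(start, start + len("traitor"), "name calling")]  # illustrative label

labels = label_sentences(article, spans)
print(labels)
```

The same overlap check could be reused to compare predicted fragments against annotated ones, although how fragment-level predictions are actually scored is not specified here.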

Test your models with the DEV set

Want to check your results with the DEV set?

Advisors

Local Hosts

Schedule

All times are in the UTC time zone (±00:00).
If you participate from a Datathon location, please be aware that local times will differ because of the time zone difference.

Pre-event week

  1. Get to know other participants & pre-event team forming
  2. Registration on the DSS website and login to Data.Chat (mandatory for all participants)
  3. Get to know the DSS platform
  4. Deadline for filling in your profiles
  5. Dataset opens for exploration
  6. Ask advisors your questions
  7. End of registration for the matchmaking process (if you don't have a team, we will suggest the smartest participants for you to team up with! 🙂)

Ready, Set, Go on 25th Jan!

  1. Official Opening
  2. Team article Deadline
  3. The Hacking begins
  4. Win the #HackSocialMedia game: Post your video and image stories on Twitter and Facebook with #HackNews and #Datathon!

The Hacking continues on 26th Jan!

  1. Deep Data Learning and News Hacking
  2. Cases Q&A by Industry Experts
  3. The test set is uploaded

Last push on 27th January

  1. The Hacking continues!
  2. Deadline to upload a final version of your article
  3. Finalists Announcements

It’s Show Time on 29th Jan!

  1. Presentations by the Finalists (19:00 BG time)
  2. Finalists' Jury (20:30 - 21:00 BG time)
  3. Winners announcement (21:00 BG time)
  4. Local Data.Celebration Party (21:15 BG time)

How to join?

Partners

The communities, organizations, media, and institutions which are helping us reach the global Data Science network.

The Award

Launch: 10th of Dec at GoGetFunding!

If you want to participate, leave your email and we will remind you. A good cause needs a good stimulus: let's raise a bigger crowdfunded award for those who will develop the tool for fighting propaganda!

The award amount: 1770

Number of contributors: 6

The Contributors

Data Science Society

Ontotext

GemSeek

Fukuyama couldn’t have been more wrong when, in 1989, he predicted the end of history and the triumph of liberal democracy. Bad actors are using fake news, propaganda, and disinformation to advance dangerous ideologies. Can we use AI to regain trust in journalism and weed out biased and untrustworthy news sources?

Viktor Senderov Guest Researcher @ Naturhistoriska riksmuseet

A tool for detecting the use of propaganda may have an impact on the way readers consume news in the future, and it could be used by media platforms to establish their credibility. I am looking forward to seeing many smart ideas, since it is a challenging task that will probably require thinking outside the box to deliver a successful solution.

Giovanni Da San Martino Scientist @ Qatar Computing Research Institute

The best way to fight disinformation is by raising awareness. Disinformation comes in different flavors, e.g., fake news, propaganda, bias. DSS already organized a hackathon on “Fake News” last year, and now the aim is a harder nut to crack: detecting propaganda and spotting the use of propaganda techniques in news article texts. I am very excited and looking forward to a great hackathon.

Preslav Nakov Senior Scientist @ Qatar Computing Research Institute

Fake news has reached alarming levels and needs to be dealt with immediately. In many cases, propaganda is equivalent to fake news, as it aims to mislead readers using different strategies. Detecting propaganda will help users perceive news in a better way and, consequently, shape public opinion according to more truthful information.

Ramy Baly Postdoctoral associate @ MIT CSAIL

More than ever before, spreading propaganda is at anybody's fingertips, whether a big or a small business. Making people aware of it is crucial to reducing its impact.

Alberto Barrón-Cedeño Scientist @ QCRI

Propaganda in news is ubiquitous, ranging from blatant to extremely subtle and effective. As it has in the past, it can lead to economic and social disasters. To fight it at scale, algorithms are necessary. Thankfully, the first annotated datasets of propaganda in the news are emerging. I am looking forward to the ingenious Machine Learning models that can reveal patterns useful for automatic detection.

Laura Tolosi Data Scientist

Contributors

The companies, organizations, and people who support us in this good cause.

The Society

Data Science Society (a volunteer organization) develops a friendly environment where data enthusiasts are able to learn, share, and experiment with real data cases within our global family.
We organize online Datathons, monthly challenges, digital meetups, webinars, workshops, summer schools, and many other events. On the Data.Platform there are more than 1900 registered data scientists from more than 50 countries.

Datathon 2017

Academia Datathon 2018

Global Datathon 2018

Global Datathon 2018 2.0