We achieved a lot during 2018 and we are proud of it, but what matters much more is that we have established traditions we plan to continue throughout 2019 and beyond.
One of those traditions is to present the amazing people behind each of our events. As some of you already know, they are always top experts, amazing researchers and fantastic people.
Hack the news Datathon would not be possible without the 5 amazing individuals that we will present below:
Dr. Preslav Nakov is a Senior Scientist at the Qatar Computing Research Institute, HBKU (QCRI is ranked #3 in Asia, and #11 in the world in NLP). His research interests include computational linguistics and natural language processing (for English, Arabic, Chinese, Malay/Indonesian, Bulgarian and other languages), machine translation, question answering, fact-checking, sentiment analysis, lexical semantics, Web as a corpus, and biomedical text processing.
Preslav received a PhD degree in Computer Science from the University of California at Berkeley (supported by a Fulbright grant and a UC Berkeley fellowship), and an MSc degree from Sofia University. He was a Research Fellow at the National University of Singapore (2008-2011), an honorary lecturer at Sofia University (2008, 2014-present), research staff at the Bulgarian Academy of Sciences (2008), and a visiting researcher at the University of Southern California, Information Sciences Institute (2005).
Preslav co-authored a Morgan & Claypool book on Semantic Relations between Nominals, two books on computer algorithms, and many research papers in top-tier conferences and journals. He received the Young Researcher Award at RANLP’2011. He was also the first to receive the Bulgarian President’s John Atanasoff award, named after the inventor of the first automatic electronic digital computer.
Preslav is Secretary of ACL SIGLEX, the Special Interest Group (SIG) on the Lexicon of the Association for Computational Linguistics (ACL). He is also Secretary of SIGSLAV, the ACL SIG on Slavic Natural Language Processing. Preslav is an Action Editor of the Transactions of the Association for Computational Linguistics (TACL) journal, a Member of the Editorial Board of the Journal of Natural Language Engineering, an Associate Editor of the AI Communications journal, and an Editorial Board member of the Language Science Press Book Series on Phraseology and Multiword Expressions.
He served on the program committees of the major conferences and workshops in Computational Linguistics, including as a co-organizer and as an area/publication/tutorial/shared task chair, Senior PC member, student faculty advisor, etc.; Preslav co-chaired SemEval 2014-2016 and was an area co-chair of ACL, EMNLP, NAACL-HLT, and *SEM, a Senior PC member of AAAI and IJCAI, a shared task co-chair of IJCNLP, and a tutorial chair of ACL.
Preslav is a dear friend of Data Science Society and has supported many of our events in the past. Here is what he said about the Hack the news Datathon:
The best way to fight disinformation is by raising awareness. Disinformation comes in different flavors, e.g., fake news, propaganda, bias. DSS already organized a hackathon on “Fake News” last year, and now the aim is at a harder nut to crack: detecting propaganda and at spotting the use of propaganda techniques in news article texts. I am very excited and looking forward to a great hackathon.
Laura Tolosi – Halacheva
Laura Tolosi – Halacheva has over 14 years of experience in the field of machine learning. She has worked on medical applications, applying ML modeling to cancer genetics data. As part of the Research and Development team at Ontotext she has worked on many NLP projects, such as rumor detection in social media and sentiment analysis around the Brexit event. Recently, she has been developing reinforcement learning algorithms for automated financial trading. She is an enthusiastic participant in Data Science competitions, both as a contestant and as a mentor.
Here is what Laura has to say about the importance of fighting propaganda in news and what her expectations are:
Propaganda in news is ubiquitous, ranging from blatant to extremely subtle and effective. As it has in the past, it can lead to economic and social disasters. To fight it at scale, algorithms are necessary. Thankfully, the first annotated datasets of propaganda in the news are emerging. I am looking forward to the ingenious Machine Learning models that can reveal patterns useful for automatic detection.
Giovanni Da San Martino
Giovanni Da San Martino is a scientist at Qatar Computing Research Institute.
Giovanni received a PhD in Computer Science from the University of Bologna in 2009. He was a postdoc at the University of Padua until 2014.
His research interests are at the intersection of machine learning and natural language processing, including learning algorithms for structured data and applications to paraphrasing, community question answering and propaganda detection. He is the author of 35 publications in the field.
He has also served on the program committees of several machine learning and natural language processing conferences and journals: IJCAI, IJCNN, ESANN, ICANN, NAACL, ACL, COLING.
Here is what Giovanni has to say about the solutions:
A tool for detecting the use of propaganda may have an impact on the way readers will consume news in the future, and it could be used by media platforms to establish their credibility. I am looking forward to see many smart ideas since it is a challenging task probably requiring to think out of the box to deliver a successful solution.
Viktor Senderov is a guest researcher at Naturhistoriska riksmuseet.
In the late 1990s Viktor attended SMG – the notorious Sofia High School for Mathematics – where his teacher was Ivan Simeonov.
That is where Viktor’s love for competitions in math and computer science began. He participated in two datathons organized by Data Science Society, in 2017 and 2018, and earned a gold and a silver placement, respectively. After this, he was invited to join the org-crew and is now a mentor in Natural Language Processing.
Viktor is a Marie Skłodowska-Curie (an EU-funded project) Ph.D. Candidate in Computer Science (expected award Dec 2018), with a dissertation on the Semantic Web and its application in biodiversity informatics. He is currently working at the Swedish Museum of Natural History, in the lab of Fredrik Ronquist, on developing methods that use probabilistic programming techniques to solve complex questions in the area of evolutionary genomics. Previously, he was at Pensoft Publishers and at the Bulgarian Academy of Sciences.
He holds B.Sc. and M.Sc. degrees in mathematics and statistics from Germany (Uni Magdeburg, Uni Munich). His current interests are probabilistic programming, natural language processing, and Bayesian methods in evolutionary biology. He is also an “advanced amateur” when it comes to software development: he likes to play around with functional languages such as OCaml, is considering learning Rust, and has authored a few R packages. He also teaches R to biologists in his spare time.
Here is what Viktor has to say to those of you who dare to join the challenge:
Fukuyama couldn’t have been more wrong when he in 1989 predicted the end of history and the triumph of liberal democracy. Bad actors are using fake news, propaganda, and disinformation to advance dangerous ideologies. Can we use AI to regain trust in journalism and weed out biased and untrustworthy news sources?
Last but not least, we would like to introduce Alberto Barrón-Cedeño, who is a Scientist at Qatar Computing Research Institute.
Alberto has been working in the natural language processing domain for 10+ years and has published 60+ research papers.
He is interested in the development of technology for the analysis and exploitation of (multilingual) text and has worked in various European and national projects on natural language processing, information retrieval, and machine translation.
As part of his activities, he has co-organised various editions of the PAN Lab on Text Forensics and the CheckThat! Lab on the identification and verification of political claims (both at CLEF) as well as a shared task on community question answering at ECML.
Alberto shares his view on why it is important to fight propaganda in news:
As never before, spreading propaganda is at the fingertips of anybody, big or small business. Making people aware of it is crucial to reduce its impact.
The story behind
Everything began on April 11, 2018, when three of our most active members and experienced Data Scientists – Laura, Preslav, and Viktor – came up with the idea of having a Datathon on news analysis. Alberto, Giovanni, and Preslav were already actively investigating automatic propaganda identification, so focusing the datathon on this task came naturally.
Then, Alberto and Giovanni took the lead in defining the data case and its unique dataset. This could not have happened without the support of our partners from A Data Pro, who helped with the data annotation.
Thanks to the team’s effort and dedication, the data case is now ready for the global community to dive deep into the problem and find sophisticated solutions.
Do you want to learn more about what propaganda is? Visit our article explaining it.
If you believe in the cause but cannot join the Datathon, you can still contribute by donating. A huge portion of the donations will go towards awards for the best solutions.