The first global datathon to fight propaganda in the news finished last night. The Hack the News Datathon was co-organized by the Data Science Society and the Qatar Computing Research Institute, HBKU, with further support from A Data Pro, who took care of the data annotation. More than 300 data science enthusiasts, experts, and […]
January 26, 2019, Sofia/Doha — The global Hack the News Datathon kicked off last night, gathering together more than 250 AI and data science academics, professionals and aficionados from over 50 countries to help develop a tool that can automatically identify propaganda in the news (winners to be announced on January 29, 2019). Unlike previous […]
Why should you join the Data Science Monthly Challenge, and what can you expect? The Data Science Monthly Challenge provides an exceptional opportunity for participants, regardless of their background and previous experience, to be involved in finding a solution to a real data science problem step by step. The proposed gradual approach towards advanced business […]
Monthly Challenge: https://www.datasciencesociety.net/events/text-mining-data-science-monthly-challenge/ Mentors’ Weekly Instructions: https://www.datasciencesociety.net/text-mining-data-science-monthly-challenge/ Real Business Problem: Classification of companies into industry sectors is a fundamental task for unlocking advanced business intelligence capabilities. However, different data sources rarely use the same classification system, if they use one at all. This is a huge obstacle to taking advantage of the available details in Open Data and very niche commercial […]
Introduction to NLP
Natural Language Processing (NLP) is the field of computer science concerned with developing algorithms for the analysis of human languages. Artificial intelligence approaches (e.g., machine learning) have been used to solve many NLP tasks, such as parsing, part-of-speech (POS) tagging, named entity recognition, word sense disambiguation, document classification, machine translation, textual entailment, question answering, and summarization. Natural languages are notoriously difficult for machines to understand and model, mostly because of ambiguity (e.g., humor, sarcasm, puns), a lack of clear structure, and diversity across languages (e.g., models for English are not directly applicable to Chinese). Even so, in recent years we’re witnessing rapid progress in the field of NLP, driven by deep learning models, which are becoming more and more complex and better able to capture the subtleties of human languages.
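To make one of the tasks listed above more concrete, here is a minimal sketch of document classification using a bag-of-words Naive Bayes model written with the Python standard library only. It is an illustration under stated assumptions, not a method described in this article: the toy training sentences, the labels, and the helper names (`tokenize`, `train`, `predict`) are all hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train(examples):
    """Count documents per label and word occurrences per label."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for token in tokenize(text):
            word_counts[label][token] += 1
            vocab.add(token)
    return label_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log-probability for the text."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in label_counts.items():
        # Start from the log prior of the label.
        score = math.log(n_docs / total_docs)
        total_words = sum(word_counts[label].values())
        for token in tokenize(text):
            if token not in vocab:
                continue  # ignore words never seen during training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            count = word_counts[label][token] + 1
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data, invented purely for illustration.
TOY_EXAMPLES = [
    ("the team won the match", "sports"),
    ("a great goal in the final game", "sports"),
    ("the bank raised interest rates", "finance"),
    ("stocks fell as the market closed", "finance"),
]

model = train(TOY_EXAMPLES)
print(predict(model, "the market and interest rates"))
print(predict(model, "the team scored a goal"))
```

A model this simple treats every word independently, so it cannot handle the ambiguity and sarcasm mentioned above; that limitation is part of why more expressive deep learning models have become popular.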