Datathon – HackNews – Solution – TEXT_MINERS

Posted in Datathons Solutions

Propaganda is a form of communication aimed at influencing the attitude of a community toward some cause or position. It often presents facts selectively to encourage a particular synthesis. Such disinformation damages the reputation of respectable news outlets and organisations, and it is bad for business. The objective of the Hackathon is to distinguish propaganda from non-propaganda news and to develop a model that supports this task. Further objectives are to detect propagandistic phrases and to identify the type of propaganda each one represents. To detect potentially propagandistic and non-propagandistic sentences in a news article, we try the following algorithms: Passive Aggressive, Multilayer Perceptron, Logistic Regression, AdaBoost, Decision Tree, Random Forest, KNN, SVM and Naive Bayes. For evaluation we calculate the F1 score, which accounts for the class imbalance in the testing dataset better than plain accuracy does. We then use the best model to detect propagandistic articles and phrases and to identify the type of propaganda.
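As a rough illustration of why F1 is preferred over accuracy when classes are imbalanced, here is a minimal sketch in plain Python. The labels and the two sets of predictions are invented toy data (1 = propaganda, 0 = non-propaganda), and the helper names `f1_score` and `accuracy` are our own, not taken from the team's solution.

```python
def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and/or recall is zero, so F1 is zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy, imbalanced data: only 2 of 10 sentences are propaganda.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

# A useless model that never flags anything.
always_negative = [0] * 10

# A model that finds one propaganda sentence and raises one false alarm.
detector = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]

for name, preds in [("always_negative", always_negative), ("detector", detector)]:
    print(name, "accuracy:", accuracy(y_true, preds), "F1:", f1_score(y_true, preds))
```

Both models reach the same accuracy on this toy set, but only the F1 score shows that the first one never detects any propaganda.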

“What’s in a Jargon?”

Posted in Learn, NLP

Every day we come across fancy jargon like data science, machine learning, artificial intelligence, computer vision, NLP, etc. You must have wondered why terms like data science and AI are used together in the names of research institutes like the Alan Turing Institute for Data Science and Artificial Intelligence. Do these two terms mean the same thing, or not? If they are the same, why not merge them into a single term? If not, why not keep two separate names instead of using them alongside one another?

Monthly Challenge – Sofia Air – Solution – Jacob Avila

Posted in Prediction systems (8 Comments)

Preliminary Analysis: Since the objective is to forecast air quality for the next 24 hours per station, the first step should be data understanding: grouping the citizen science air quality measurements by station and summarizing them by day. To complete this task for inspection and pre-processing in order to find missing data, outliers and […]
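The grouping step described in this excerpt could look roughly like the sketch below. The record layout (station id, ISO timestamp, measured value), the station names and the readings are all hypothetical; the real dataset's columns may differ.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical records: (station id, ISO timestamp, measured value).
readings = [
    ("station_a", "2018-01-01T08:00:00", 40.0),
    ("station_a", "2018-01-01T20:00:00", 60.0),
    ("station_a", "2018-01-02T08:00:00", 30.0),
    ("station_b", "2018-01-01T09:00:00", 20.0),
]


def daily_means(rows):
    """Group readings by (station, calendar day) and average each group."""
    buckets = defaultdict(list)
    for station, timestamp, value in rows:
        day = datetime.fromisoformat(timestamp).date()
        buckets[(station, day)].append(value)
    return {key: mean(values) for key, values in buckets.items()}


summary = daily_means(readings)
for (station, day), value in sorted(summary.items()):
    print(station, day, value)
```

A summary of this shape makes it easier to spot days with no readings for a station, which is one way missing data shows up.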

Datathon Air Sofia Solution – Team Teljapenosss

Posted in Prediction systems (3 Comments)

Team Teljapenosss. Team Members: Jalapeno (Nasiba Zokirova). Team Mentor: petya-par. Business Understanding: The levels of air pollution allegedly caused by solid fuel heating and motor vehicle traffic are ever growing in the City of Sofia. The primary economic impact for the City of Sofia was a ruling by the European Court of […]

Datathon Sofia Air Mentors’ Guidelines – On IOT Prediction

Posted in GD2018 Mentors, Mentors

In this article, the mentors give some preliminary guidelines, advice, and suggestions to the participants for the case. Every mentor should write their name and chat name at the beginning of their texts, so that there are no mix-ups with the other mentors. By the rules, it is essential to follow the CRISP-DM methodology (http://www.sv-europe.com/crisp-dm-methodology/). The DSS […]

Datathon Kaufland Mentors’ Guidelines – On Predictive Maintenance

Posted in GD2018 Mentors, Mentors

In this article, the mentors give some preliminary guidelines, advice, and suggestions to the participants for the case. Every mentor should write their name and chat name at the beginning of their texts so that there are no mix-ups with the other mentors. By the rules, it is essential to follow the CRISP-DM methodology (http://www.sv-europe.com/crisp-dm-methodology/). The DSS […]

Using Machine Learning to explain and predict the life expectancy of different countries

Posted in Prediction systems

The project tries to create a model, based on data provided by the World Health Organization (WHO), to estimate the life expectancy of different countries in years. The data covers the period from 2000 to 2015 and originates from here: https://www.kaggle.com/kumarajarshi/life-expectancy-who/data. The resulting models have been tested to see whether they maintain their accuracy when predicting life expectancy for data they have not been trained on. Seven algorithms have been used:

Linear Regression
Ridge Regression
Lasso Regression
ElasticNet Regression
Linear Regression with Polynomial features
Decision Tree Regression
Random Forest Regression
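
The idea of checking whether a model keeps its accuracy on unseen data can be sketched as follows. This is a simplified illustration, not the project's code: it uses synthetic one-feature data rather than the WHO dataset, and it covers only two of the listed models (ordinary least squares and ridge) in closed form. The function names, the 150/50 split and the `alpha` value are assumptions made for the example.

```python
import random


def fit_ridge_1d(xs, ys, alpha=0.0):
    """Fit y = intercept + slope * x.

    alpha = 0 gives ordinary least squares; alpha > 0 gives ridge
    regression with an unpenalised intercept.
    """
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / (sxx + alpha)
    return y_mean - slope * x_mean, slope


def r2(xs, ys, intercept, slope):
    """Coefficient of determination of the fitted line on (xs, ys)."""
    y_mean = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_mean) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Synthetic data standing in for one predictor and life expectancy.
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(200)]
ys = [50 + 2.5 * x + rng.gauss(0, 1.0) for x in xs]

# Hold out the last 50 points; the models never see them during fitting.
train_x, test_x = xs[:150], xs[150:]
train_y, test_y = ys[:150], ys[150:]

ols = fit_ridge_1d(train_x, train_y, alpha=0.0)
ridge = fit_ridge_1d(train_x, train_y, alpha=10.0)

r2_ols_test = r2(test_x, test_y, *ols)
r2_ridge_test = r2(test_x, test_y, *ridge)
print("OLS   test R^2:", round(r2_ols_test, 3))
print("Ridge test R^2:", round(r2_ridge_test, 3))
```

Comparing the score on the held-out points with the score on the training points is what reveals whether a model generalises or has merely memorised its training data.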