Datathon – HackNews – Solution – DataExploiters

5 Comments · Posted in Datathons Solutions

This article describes our submission for the Hack the News Datathon 2019. The submission focuses on Task 2, propaganda sentence classification. The article outlines our exploratory data analysis, methodology and future work. Our work revolves around the BERT model: we believe it offers an excellent language model that is also good at attending to context, which is an important aspect of propaganda detection.

Hack the News Datathon Case – Propaganda Detection

2 Comments · Posted in Cases, Datathon cases, NLP

1. Business Problem Formulation
The current political landscape is shaped by extreme polarization of opinions and by the proliferation of fake news. For example, a recent study published in Science found that rumors and fake news tend to spread six times faster than truthful information. This situation both damages the reputation of respectable news outlets and […]

Datathon Kaufland Solution – Predictive Maintenance Based on Sensor Data for Forklifts

2 Comments · Posted in Prediction systems

Kaufland-Case
1. Business Understanding
Industrial vibration analysis is a measurement tool used to identify, predict, and prevent failures. Implementing vibration analysis on the machines improves their reliability and leads to better machine efficiency and reduced downtime by eliminating mechanical or electrical failures. Vibration analysis is used to identify faults in machinery, plan machinery […]
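As a rough illustration of what vibration analysis measures, the sketch below computes two common summary features of a vibration signal: its RMS amplitude and its dominant frequency. It is not taken from the team's solution. The signal, the sample rate and the helper names `rms` and `dominant_frequency` are assumptions made up for this example, and the naive DFT is only suitable for short signals.

```python
import cmath
import math


def rms(samples):
    """Root-mean-square amplitude, a common overall vibration level."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def dominant_frequency(samples, sample_rate_hz):
    """Frequency (Hz) of the strongest non-DC component, via a naive DFT."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [x - mean for x in samples]  # remove the constant offset
    best_bin, best_magnitude = 0, 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(centered)
        )
        if abs(coeff) > best_magnitude:
            best_bin, best_magnitude = k, abs(coeff)
    return best_bin * sample_rate_hz / n


# Hypothetical signal: a strong 25 Hz component plus a weak 60 Hz component,
# sampled at 200 Hz for one second. These numbers are invented.
SAMPLE_RATE_HZ = 200
signal = [
    math.sin(2 * math.pi * 25 * i / SAMPLE_RATE_HZ)
    + 0.2 * math.sin(2 * math.pi * 60 * i / SAMPLE_RATE_HZ)
    for i in range(200)
]

print(rms(signal))
print(dominant_frequency(signal, SAMPLE_RATE_HZ))
```

A shift in the dominant frequency or a rising RMS level over time is the kind of change such features are meant to surface. Which features matter for forklifts is not stated in the excerpt.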

Datathon Kaufland Solution – Kaufland case – Team3

1 Comment · Posted in Datathons Solutions

In [1]:
import s3fs
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import seaborn as sns
import numpy as np
import pywt

In [2]:
fs = s3fs.S3FileSystem(anon=True)
fs.ls('datacases/datathon-2018-2/')
Out[2]:
['datacases/datathon-2018-2/kaufland',
 'datacases/datathon-2018-2/nsi',
 'datacases/datathon-2018-2/ontotext',
 'datacases/datathon-2018-2/telelink',
 'datacases/datathon-2018-2/telenor']

In [3]:
fs.ls('datacases/datathon-2018-2/kaufland')
Out[3]:
['datacases/datathon-2018-2/kaufland/20180820_Kaufland_case_IoT_and_predictive_maintenance_events.xlsx',
 'datacases/datathon-2018-2/kaufland/20180920_Kaufland_case_IoT_and_predictive_maintenance.csv',
 'datacases/datathon-2018-2/kaufland/sample_Kaufland_case_IoT_and_predictive_maintenance.csv']

Events

In [4]:
with fs.open('datacases/datathon-2018-2/kaufland/20180820_Kaufland_case_IoT_and_predictive_maintenance_events.xlsx', 'rb') as f:
    df_events = pd.read_excel(f)

In [5]:
df_events
Out[5]: […]

Datathon NSI Solution – Predicting Household Budgets

2 Comments · Posted in Datathons Solutions

Predicting Household Budgets
Authors: SoRd1, Jack, pr0faka, Kolio
Team: Pigeons
Statistics is the painful elaboration of the obvious.
Hello everyone 🙂 We all hope that you had a great time during the Datathon, because we did. We are working on the case from NSI – to predict the household expenditures per group for the years in which […]

Datathon NSI Solution – The curious case of ‘Household Budget Survey(HBS)’

6 Comments · Posted in Prediction systems

The National Statistical Institute of Bulgaria (NSI) conducts an annual Household Budget Survey (HBS) with the objective of obtaining reliable and scientifically founded data on the income, expenditure, consumption and other elements of the living standard of the population, as well as on the changes that have occurred over the years. NSI is considering changing the periodicity of the Household Budget Survey from yearly to once every five years in order to optimize the cost of carrying out the survey. Hence we are creating a model that predicts household expenditure for the next four years. The algorithms we rely on are linear regression and the autoregressive integrated moving average (ARIMA) time series model. So let's not waste any time and move on with it!
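To make the linear regression part of this idea concrete, here is a minimal sketch that fits a straight line to yearly expenditure and extrapolates it four years ahead. It is not the team's model. The expenditure figures and the helper names `fit_line` and `forecast` are invented for illustration, and a real HBS series would need per-group fitting and validation.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast(xs, ys, future_xs):
    """Extrapolate the fitted line to new x values."""
    slope, intercept = fit_line(xs, ys)
    return [intercept + slope * x for x in future_xs]


# Hypothetical yearly expenditure for one household group (made-up numbers).
years = [2012, 2013, 2014, 2015, 2016]
expenditure = [100.0, 104.0, 108.0, 112.0, 116.0]

# Predict the four years that would fall between two five-yearly surveys.
predicted = forecast(years, expenditure, [2017, 2018, 2019, 2020])
print(predicted)
```

A straight line only captures a steady trend. That is presumably why the post pairs it with ARIMA, which can also model year-to-year dependence.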

Datathon Sofia Air Solution – Telelink Case Solution

5 Comments · Posted in Prediction systems

Telelink Case Solution
Team Dimas
The Team Members
– apetkov
– desinik
– rdimitrov
– melania-berbatova
– vrategov
Github Repo:
Workflow
The main workflow happens over at our GitHub page. You can read the latest version of this article here:
## Content
0. Data
We were given the following 4 datasets: Air […]

Datathon Sofia Air Solution – The Telelink Case handled by the Urban air quality Gurus!

4 Comments · Posted in Datathons Solutions

1. Business Understanding
Particulate matter (PM) is considered the air pollutant of greatest concern to the health of the urban population. Research has shown that exposure to PM can lead to increased days lost from work or school, emergency room visits, hospital stays, and deaths. Both short- and long-term exposures to PM can lead to […]

Datathon Telenor Solution – Ravens for Communication

1 Comment · Posted in Datathons Solutions

It is a well-known fact that Exploratory Data Analysis is a cornerstone of data analysis.
The analysis of the data shows that Brass Raven Birdy is the raven with the most failures and Metallic Raven Sunburst Polly is the most successful raven. The Targeryan family has the most raven failures, whereas the Baelish family has the fewest. Within the Baelish family, Peter Baelish has the highest failure rate and Euron has the fewest failures.
An ARIMA model is used to predict the number of failures for the next 4 days.
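The excerpt does not say which ARIMA order the team fitted, so the sketch below uses the simplest member of the family as a stand-in: ARIMA(0,1,0) with drift, which is a random walk with drift. Its forecast is the last observed value plus the average daily change for each step ahead. The daily failure counts and the function name `random_walk_drift_forecast` are hypothetical. In practice a library such as statsmodels would be used to fit and compare richer orders.

```python
def random_walk_drift_forecast(series, horizon):
    """Forecast with ARIMA(0,1,0) plus drift.

    The first difference of the series is modelled as a constant (the drift)
    plus noise, so the h-step forecast is last_value + h * drift.
    """
    diffs = [later - earlier for earlier, later in zip(series, series[1:])]
    drift = sum(diffs) / len(diffs)
    return [series[-1] + drift * step for step in range(1, horizon + 1)]


# Hypothetical daily failure counts (invented numbers, not the Telenor data).
daily_failures = [12, 15, 14, 18, 20, 19, 24]

# Predict the number of failures for the next 4 days.
next_four_days = random_walk_drift_forecast(daily_failures, 4)
print(next_four_days)
```

This baseline is useful mainly as a reference point: a fitted ARIMA model with autoregressive or moving-average terms should beat it on held-out days to justify the extra complexity.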