Datathon Telenor Mentors’ Guidelines – On TelCo predictions

Posted in GD2018 Mentors, Mentors

In this article, the mentors give some preliminary guidelines, advice, and suggestions to the participants for the case. Every mentor should write their name and chat name at the beginning of their texts so that there are no mix-ups with the other mentors. By rules, it is essential to follow CRISP-DM methodology (http://www.sv-europe.com/crisp-dm-methodology/). The DSS […]

Datathon Sofia Air Mentors’ Guidelines – On IOT Prediction

Posted in GD2018 Mentors, Mentors

In this article, the mentors give some preliminary guidelines, advice, and suggestions to the participants for the case. Every mentor should write their name and chat name at the beginning of their texts so that there are no mix-ups with the other mentors. By rules, it is essential to follow CRISP-DM methodology (http://www.sv-europe.com/crisp-dm-methodology/). The DSS […]

Datathon Kaufland Mentors’ Guidelines – On Predictive Maintenance

Posted in GD2018 Mentors, Mentors

In this article, the mentors give some preliminary guidelines, advice, and suggestions to the participants for the case. Every mentor should write their name and chat name at the beginning of their texts so that there are no mix-ups with the other mentors. By rules, it is essential to follow CRISP-DM methodology (http://www.sv-europe.com/crisp-dm-methodology/). The DSS […]

Datathon – HackNews – Solution – data_monks

Posted in Datathons Solutions (6 comments)

The word propaganda designates any attempt to influence the opinions or actions of others toward some predetermined end by appealing to their emotions or prejudices or by distorting the facts. We are fooled by propaganda chiefly because it appeals to our emotions rather than to our reason. It makes us believe and do things we would not otherwise believe or do. And since it appeals more to our emotions, we often don’t recognize it when we see it.
The current political landscape is shaped by extreme polarization of opinions and by the proliferation of fake news.
Studies and surveys have found that rumours and fake news tend to spread six times faster than truthful information. This situation damages the reputation of respectable news outlets and undermines the very foundations of democracy, which needs a free and reliable press to thrive. Therefore, it is in the interest of the public as well as of the news organizations to be able to detect and fight disinformation in all its forms.
Here, we are trying to create a tool that can help identify propagandistic articles with the help of Predictive Analytics.
The main objectives are:
(i) to flag the article as a whole
(ii) to detect the potentially propagandistic sentences in a news article
(iii) to identify the exact type and span of use of propagandistic techniques

Datathon Telenor Solution – Ravens for Communication

Posted in Datathons Solutions (1 comment)

It is a well-known fact that Exploratory Data Analysis is a cornerstone of data analysis.
Analysis of the data shows that Brass Raven Birdy failed most often and that Metallic Raven Sunburst Polly is the most successful raven. The Targaryen family has the most raven fails, whereas the Baelish family has the fewest. Within the Baelish family, Petyr Baelish has the highest failure rate and Euron has the fewest failures.
An ARIMA model is used to predict the number of failures for the next 4 days.
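
As a rough illustration of the ARIMA step, here is a minimal sketch in base R. The failure counts are synthetic and the order `c(1, 1, 0)` is an assumption chosen for the example; neither comes from the team's solution.

```r
# Synthetic daily failure counts for 30 days (NOT the Telenor data)
set.seed(42)
fails <- ts(round(100 + cumsum(rnorm(30, mean = 0, sd = 5))))

# Fit an ARIMA(1,1,0) model; the order is an illustrative assumption
fit <- arima(fails, order = c(1, 1, 0), method = "ML")

# Point forecasts and standard errors for the next 4 days
fc <- predict(fit, n.ahead = 4)
print(fc$pred)
print(fc$se)
```

In practice the order would be chosen from the data, for example by comparing AIC values across candidate models.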

Datathon Telenor Solution – Winner Winner-Data Dinner- The Telenor Case

Posted in Prediction systems (2 comments)

Based on one month of data on flight fails, we have to perform a time-series analysis, predict the future number of fails, and find out how many of the ravens sent are not going to make it. Communication is an essential part of human existence, and lives depend on regular communication, so it is necessary to identify the failures in order to improve the communication system, predict further flaws, and correct them. As we dig deeper into this data, we will find valuable insights that help us reduce the failure rate, improve the success rate, and provide a better working system in the near future.

Datathon Telenor Solution – WRANGLING WITH DATA DROPS

Posted in Datathons Solutions (2 comments)

This article proposes a very tractable approach to modelling changes in regime. The Time and Date parameters are examined in the outcome of this analysis.
In the 21st century, cell phones are the most commonly used and most important wireless technology. They are so common that they can be seen in everyone’s hand, regardless of age group or region. India has a population of 1.32 billion and nearly 340 million cell phones. They are used for communication, messaging, and downloading and uploading data on the internet.
There are times when a user encounters communication issues such as call termination, data drops in the middle of a session, wrong connections, etc., which may affect the overall experience of network subscribers. Telecom service providers have to implement data management technology to improve their infrastructure and minimize the effect of call drops and data drops in order to provide quality service to their customers.
Nearly all signals contain energy at harmonic frequencies in addition to the energy at the fundamental frequency. If all the energy in a signal is contained at the fundamental frequency, then that signal is a perfect sine wave. Telecommunication signals also contain many harmonics, which are strongly affected by semiconductor interfacing and by physical or digital barriers.
Keywords: Data drop, Call drop

By KrYpToNiAnS

Posted in Datathons Solutions, Learn, Team solutions (3 comments)

List of required packages

```{r}
library(data.table) # for the fread function
library(dplyr)      # for the pipe (%>%), select and filter
library(plyr)       # for the join function
library(tseries)    # for time series utilities
library(forecast)   # for the nnetar and forecast functions
library(caret)      # for neural network prediction
library(ggplot2)    # for plots
library(mice)       # for imputing NA/missing values
library(zoo)        # for na.approx (interpolating missing values)
```

### Working with Dataset [price_data.csv]

```{r}
url <- "matrix_one_file/price_data.csv"

crypto <- fread(url, header = TRUE)

# Keep 21 columns: the first (used as time below) plus 20 price series
crypto_main <- crypto[, c(1:17, 20, 25, 34, 37)]
View(crypto_main)
crypto_loop <- crypto_main[, 2:21]
name <- names(crypto_loop)

# Automation for prediction: repeat the forecast for every price column
for (i in name) {

  crypto_work <- crypto_main %>% select(time, all_of(i))
  names(crypto_work) <- c("Time", "Price")
  crypto_work$Time <- as.POSIXct(crypto_work$Time, format = "%Y-%m-%d %H:%M:%S")
  d <- colnames(crypto_work)[2]

  # Training data for the time series: from 18th Jan onwards. The join with
  # the 5-minute grid below keeps only timestamps up to 24th Jan 11:55.
  crypto_work1 <- crypto_work %>% filter(Time >= as.POSIXct("2018-01-18 00:00:00"))
  Time <- seq(ISOdatetime(2018, 1, 18, 0, 0, 0), ISOdatetime(2018, 1, 24, 11, 55, 0), by = (60 * 5))
  df <- data.frame(Time)
  crypto_temp <- join(df, crypto_work1, by = "Time")
  crypto_temp$Price <- na.approx(crypto_temp$Price) # interpolate missing prices

  # Original values from 25th Jan to 29th Jan, on the same 5-minute grid
  crypto_orignal_value <- crypto_work %>% filter(Time >= as.POSIXct("2018-01-25 00:00:00"))
  Time <- seq(ISOdatetime(2018, 1, 25, 0, 0, 0), ISOdatetime(2018, 1, 29, 11, 55, 0), by = (60 * 5))
  df1 <- data.frame(Time)
  crypto_temp1 <- join(df1, crypto_orignal_value, by = "Time")
  crypto_temp1$Price <- na.approx(crypto_temp1$Price)

  # Initializing variables
  df_new <- data.frame()
  new_df <- data.frame()
  value <- c()
  start <- 1

  for (j in 1:5) {      # one pass per forecast day

    for (k in 1:288) {  # 288 five-minute steps per day

      # Refit on all data so far and forecast one step ahead
      crypto_price <- ts(crypto_temp$Price, start = c(1, 1), frequency = 288)
      fit1 <- nnetar(crypto_price)
      a <- forecast(fit1, h = 1)
      value <- append(value, a$mean)

      # Append the forecast so that the next step can build on it
      df_new <- data.frame(crypto_temp1$Time[start], as.numeric(a$mean))
      names(df_new) <- c("Time", "Price")
      crypto_temp <- rbind(crypto_temp, df_new)
      start <- start + 1

    }

    # Save the 288 forecasts of day j; the original index 1873+(start-k)
    # started one row late and skipped the first forecast of the day
    output_file <- tail(crypto_temp, 288)
    rownames(output_file) <- c()
    file_name <- paste(i, d, "(", j, ")", ".csv", sep = "")
    write.csv(output_file, file_name)

  }

}
```
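
The script above builds `crypto_temp1` with the observed prices for the forecast window, but the excerpt does not show how the forecasts are scored against them. A minimal sketch of one way to do that, assuming two equal-length numeric vectors; the helper name `forecast_errors` and the sample numbers are hypothetical, not part of the team's code:

```r
# Hypothetical helper: root mean squared error and
# mean absolute percentage error of a forecast
forecast_errors <- function(actual, predicted) {
  stopifnot(length(actual) == length(predicted))
  err <- actual - predicted
  c(rmse = sqrt(mean(err^2)),
    mape = mean(abs(err / actual)) * 100)
}

# Made-up prices for illustration only
actual    <- c(100, 102, 101, 105)
predicted <- c(99, 103, 100, 107)
res <- forecast_errors(actual, predicted)
print(res)
```

With the variables from the script, the call would look like `forecast_errors(crypto_temp1$Price[1:288], output_file$Price)` for the first forecast day, assuming both vectors cover the same timestamps.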

Datathon NSI Solution – The curious case of ‘Household Budget Survey(HBS)’

Posted in Prediction systems (6 comments)

The National Statistical Institute of Bulgaria (NSI) annually conducts a Household Budget Survey (HBS) with the objective of obtaining reliable and scientifically founded data on the income, expenditure, consumption and other elements of the living standard of the population, as well as the changes that have occurred over the years. To optimize the cost of carrying out the survey, NSI is considering changing the periodicity of the Household Budget Survey from yearly to once every five years. Hence, we are creating a model that will predict household expenditure for the next four years. The algorithms we rely on are linear regression and the autoregressive integrated moving average (ARIMA) model. So let’s not waste any time and move on with it!
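
To make the linear regression part concrete, here is a minimal sketch in base R that fits a linear trend and extrapolates it four years ahead. The yearly figures and the column names `year` and `expenditure` are made up for illustration; they are not NSI data.

```r
# Synthetic yearly average household expenditure (NOT NSI data)
hbs <- data.frame(
  year        = 2008:2017,
  expenditure = c(7200, 7350, 7500, 7800, 8100,
                  8300, 8500, 8900, 9200, 9600)
)

# Fit a simple linear trend: expenditure as a function of year
fit <- lm(expenditure ~ year, data = hbs)

# Predict the next four years
future <- data.frame(year = 2018:2021)
future$predicted <- predict(fit, newdata = future)
print(future)
```

A straight-line trend is the simplest possible baseline; an ARIMA model could be fitted to the same series to compare the two forecasts.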

Datathon Telenor Solution – Analysis Of Mobile Data Connectivity Delays

Posted in Datathons Solutions

Problem statement: This data set concerns a time series analysis of the failure rate of ravens carrying messages from King’s Landing to the North. The case study is an analogy between Telenor telecommunications and Game of Thrones. Because of the obstacles that cause failures, various techniques and schemes are employed in the planning, design and optimization of raven networks to combat these propagation effects.

We used RStudio for Exploratory Data Analysis.
From the tasks given to us, we concluded that:
1. Brass Raven Birdy has been delayed the most times, followed by Brown raven Ruby and Yellow raven Rio,
while Metallic Sunburst Raven Polly has been delayed the fewest times, followed by Green Sheen raven Azul and Less combative raven Zazu.
2. The family with the most fails is Targaryen, while the family with the fewest fails is Lannister.
3. The family member with the most fails is Petyr Baelish, and the one with the fewest is Euron.
We have done further analysis to predict the fails for the next four days using time series analysis.
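
Rankings like the ones above can be obtained by counting fails per group. A minimal sketch in base R, assuming one row per failed flight; the column names `raven` and `family` and all records are hypothetical placeholders, not the case data:

```r
# Made-up failure log: one row per failed flight
fails <- data.frame(
  raven  = c("A", "A", "B", "C", "A", "B"),
  family = c("X", "Y", "X", "X", "Y", "X"),
  stringsAsFactors = FALSE
)

# Count fails per raven and per family, most frequent first
raven_counts  <- sort(table(fails$raven), decreasing = TRUE)
family_counts <- sort(table(fails$family), decreasing = TRUE)

most_failed_raven  <- names(raven_counts)[1]
least_failed_raven <- names(raven_counts)[length(raven_counts)]
print(raven_counts)
print(family_counts)
```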

Academia Datathon

Posted in Datathons Solutions, Learn, Team solutions (1 comment)

Team Toolset: R-Tool and MS-Excel. Business Understanding: Over the years, the world has been moving towards digital assets, and this gave birth to a new kind of currency known as cryptocurrency. Cryptocurrencies are a type of digital, alternative and virtual currency. The first decentralized cryptocurrency was Bitcoin, created in 2009. Altcoins was a new currency, which was derived […]