Telelink Case Solution Team Dimas The Team Members – apetkov – desinik – rdimitrov – melania-berbatova – vrategov Github Repo: https://github.com/Bugzey/Team-Midas Workflow The main workflow happens over at our github page. You can read the latest version of this article here: https://github.com/Bugzey/Team-Midas/blob/master/7.%20Documentation/Doc_010%20Documentation.md ## Content 0. Data We were given the following 4 datasets: Air Tube-20180928T185037Z-001.zip […]
1. Business Understanding Particulate matter is considered the air pollutant of greatest concern to the health of the urban population. Research has shown that exposure to PM can lead to increased days lost from work or school, emergency room visits, hospital stays, and deaths. Both short and long-term exposures to PM can lead to […]
Cell phones have become a necessity for many people throughout the world. The ability to keep in touch with family and business associates and to access email are only a few of the reasons for the increasing importance of cell phones. Today’s technically advanced cell phones are capable not only of placing and receiving phone calls but also of storing data, taking pictures, and even serving as walkie-talkies, to name just a few of the available options.
The dataset for The Telenor Case – What do Game of Thrones and Telecoms Have in Common? contains data on delays in networks (RAVENS). The delays range from 26/07/2018 to 05/08/2018. Each RAVEN_NAME represents a tower. There are 7847 unique RAVEN_NAMES across different networks such as 2G, 3G, and 4G, and there are 5 unique families.
To provide the best solution to the business problem, we solved it in two steps: (i) data analysis and coding in Python, and (ii) time series model building in RStudio.
In the data analysis we counted the number of delays (failures) of the RAVENS. We also found the top 10 RAVENS with and without fails, and identified the family names and member names with the most and the fewest network failures.
Prediction and forecasting are done by building time series models. As the name suggests, time series analysis works with data indexed by time (years, days, hours, minutes) to derive hidden insights that support informed decisions. Time series models are especially useful when the data is serially correlated. To predict the next four days from the mobile data, we divided the data into train and test sets. We carried out the time series analysis using ARIMA, Simple Exponential Analysis, and Recurrent Neural Networks (RNN).
Finally, comparing the root mean square error of these algorithms, we found the RNN (Recurrent Neural Network) to be the best algorithm for predicting the coming days. The prediction of delays for the next four days was therefore based on the RNN. We have plotted graphs of the time series models for all the algorithms.
It is well known that Exploratory Data Analysis is a cornerstone of data analysis.
The analysis shows that Brass raven Birdy is the raven with the most fails and Metallic Sunburst raven Polly is the most successful raven. The Targaryen family has the most raven fails, whereas the Baelish family has the fewest. Among the family members, Petyr Baelish has the highest failure rate and Euron has the fewest failures.
An ARIMA model is used to predict the number of failures for the next 4 days.
“You know nothing, Jon Snow ……”
is what Ygritte yells to Jon.
Our situation was the same: we knew nothing about the TELENOR case until we had seen the dataset.
When we first heard about the Datathon, as beginners we were very excited to take part in it.
Finally we received our datasets, and our first challenge was to import them into the programming platforms.
The dataset is around 4 GB in size, so importing it took some time and put us in this situation:
What we don’t know is what usually gets us killed. – Petyr Baelish
We quote this line to express our feeling that we did not know what was in the dataset but wanted to explore it.
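One common way around the import hurdle with a file of this size is to stream it row by row instead of loading it all at once. The following is a minimal sketch of that idea, not the approach the team actually used; `RAVEN_NAME` comes from the dataset description, while the `FAMILY` column, the sample values, and the helper name `count_rows_by_column` are assumptions made for illustration.

```python
import csv
import io
from collections import Counter


def count_rows_by_column(lines, column):
    """Stream CSV rows one at a time and count rows per value of `column`."""
    reader = csv.DictReader(lines)
    counts = Counter()
    for row in reader:  # only the current row is held in memory
        counts[row[column]] += 1
    return counts


# Tiny in-memory stand-in for the real ~4 GB file (contents are invented).
sample = io.StringIO(
    "RAVEN_NAME,FAMILY\n"
    "Brass raven Birdy,Stark\n"
    "Brass raven Birdy,Stark\n"
    "Blue raven Axel,Lannister\n"
)
counts = count_rows_by_column(sample, "RAVEN_NAME")
print(counts.most_common(1))
```

With a real file, the same function could be called with an open file handle (for example `open(path, newline="")`) in place of the `StringIO` object.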
At last, we were ready to analyze What do Game of Thrones and Telecoms Have in Common?
When we first went through the dataset, we noticed that the Telenor data contains 16 columns and 30091754 rows.
Our first analysis showed how complicated the data is. It contains many interesting aspects, which we explored through Exploratory Data Analysis.
Our main challenge is to predict the fails in the next four days.
First, we carried out the Exploratory Data Analysis.
(i) Top 10 ravens with fails :
Brass raven Birdy
Brown raven Ruby
Yellow raven Rio
Blue raven Axel
Razzle Dazzle Rose raven Cleo
Cadmium Red raven Bubba
Vain And Lazy raven Polly
Fearful Carrion raven Gizmo
Blast Off Bronze raven Zazu
Loving raven Maxwell
(ii) Top 10 ravens without fails:
Metallic Sunburst raven Polly
Green Sheen raven Azul
Less Combative raven Zazu
Weak raven Buddy
Copper raven Tweety
Spectral Yellow raven Zazu
Mythical raven Tiki
Cyber Grape raven Faith
Mysterious And Venerable raven Bubba
Shadow Blue raven Sammy
(iii) The family with most fails :
(iv) The family with least fails :
(v) The family member with most fails :
(vi) The family member with least fails :
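The rankings above come down to counting fails per raven, per family, and per family member, then sorting. A minimal sketch of that counting step follows; the record layout and all values are invented for illustration and do not reflect the real column names or results.

```python
from collections import Counter

# Hypothetical records: (raven name, family, failed?). Values are invented.
records = [
    ("raven A", "family X", True),
    ("raven A", "family X", True),
    ("raven B", "family X", False),
    ("raven B", "family X", True),
    ("raven C", "family Y", False),
]


def fails_per_key(records, key_index):
    """Count failed records per key; keys with no fails are kept with 0."""
    fails = Counter()
    for rec in records:
        fails[rec[key_index]] += int(rec[2])
    return fails


raven_fails = fails_per_key(records, 0)
family_fails = fails_per_key(records, 1)

# Ravens with the most fails, and ravens with the fewest fails.
top_with_fails = raven_fails.most_common(10)
top_without_fails = sorted(raven_fails.items(), key=lambda kv: kv[1])[:10]

# Families with the most and the fewest fails.
most_failing_family = max(family_fails, key=family_fails.get)
least_failing_family = min(family_fails, key=family_fails.get)
```

The same helper applies to family members by pointing `key_index` at a member field, if the records carry one.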
After the EDA, we need to predict the next four days of delays in mobile data connectivity. For this we use time series analysis.
In the time series analysis we used three algorithms: ARIMA, Simple Exponential Analysis, and Recurrent Neural Networks.
We fitted an ARIMA model and predicted the failures for four days, then fitted a second model using Simple Exponential Analysis.
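As an illustration of the second model, here is a minimal sketch that assumes "Simple Exponential Analysis" refers to simple exponential smoothing. The smoothing factor `alpha` and the function name `ses_forecast` are assumptions; the team's actual R code and parameters are not given in the text.

```python
def ses_forecast(series, alpha, horizon):
    """Simple exponential smoothing with a flat forecast.

    The level is updated as level = alpha * value + (1 - alpha) * level,
    and every future step is forecast as the final level.
    """
    if not series:
        raise ValueError("series must not be empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]
    for value in series[1:]:
        level = alpha * value + (1 - alpha) * level
    return [level] * horizon


# Invented daily fail counts, forecast four days ahead.
forecast = ses_forecast([10, 20], alpha=0.5, horizon=4)
```

Because simple exponential smoothing has no trend term, the four forecast values are identical; a series with a clear trend would need a different model.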
We also used Recurrent Neural Networks to predict the failures.
After fitting the three models with the three different algorithms, we evaluated them by splitting the data into train and test sets.
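For time series, a train/test split is usually chronological rather than random, so that the model is tested on days it has not seen. The sketch below holds out the last four days; the daily counts are invented and the helper name `split_last_days` is hypothetical.

```python
def split_last_days(daily_counts, test_days=4):
    """Chronological split: the last `test_days` values form the test set."""
    if not 0 < test_days < len(daily_counts):
        raise ValueError("test_days must be between 1 and len(daily_counts) - 1")
    return daily_counts[:-test_days], daily_counts[-test_days:]


# Eleven invented daily fail counts, one per day of the observation window.
daily_counts = [50, 52, 49, 51, 53, 55, 54, 56, 58, 57, 59]
train, test = split_last_days(daily_counts, test_days=4)
```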
We chose the best-fit model using the Root Mean Square Error (RMSE): of the three models, the one with the lowest RMSE is taken as the best fit.
For the mobile failure dataset, the RNN (Recurrent Neural Network) has the lowest RMSE.
So the RNN is taken as the best-fit model for predicting the next four days of mobile data delays.
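The selection rule can be sketched as follows. The actual and predicted values are invented, and the model names are placeholders, not the team's reported results.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equally long sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Invented test-set values and predictions from two placeholder models.
actual = [100, 110, 120, 130]
candidates = {
    "model_a": [100, 110, 120, 130],
    "model_b": [90, 100, 110, 120],
}
scores = {name: rmse(actual, preds) for name, preds in candidates.items()}
best_model = min(scores, key=scores.get)  # lowest RMSE wins
```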
Based on the RNN, the predicted delays for the next four days are 973776, 973725, 973674, and 973623 for 5, 6, 7, and 8 August 2018 respectively.
Problem statement: This data set concerns time series analysis of the failure rate of ravens sending messages from King’s Landing to the North. The case study is an analogy between Telenor telecommunications and Game of Thrones. Because of the obstacles that cause failures, various techniques and schemes are employed in the planning, design, and optimization of raven networks to combat these propagation effects.
We used RStudio for the Exploratory Data Analysis.
For the tasks given to us, we concluded that:
1. Brass raven Birdy has been delayed the most times, followed by Brown raven Ruby and Yellow raven Rio,
while Metallic Sunburst raven Polly has been delayed the fewest times, followed by Green Sheen raven Azul and Less Combative raven Zazu.
2. The family with the most fails is Targaryen, while the family with the fewest fails is Lannister.
3. The family member with the most fails is Petyr Baelish, and the one with the fewest fails is Euron.
We carried out further analysis to predict the fails for the next four days using time series analysis.
The objective of this analysis is to find the ravens that are not reaching their destination on time. This helps us identify the towers (ravens) that require the most attention, so that the main causes of the delays can be addressed.
The dataset describes the networks between the towers (ravens). Land-based communication happens with the help of signals.
A cellular network or mobile network is a communication network where the last link is wireless. This wireless transmission is done by a tower that comprises a transmitter and a receiver. The channel carries both data and voice transmission.
Every cellular network has a different set of frequencies, to avoid overlapping and interference. Despite many precautions in maintaining the setup, a few parameters still affect the transmission. They can be classified as:
Interference between the frequencies
External Factors (Predators etc.)
Our first approach is therefore to create a “Decision Model” that can add value to the business and help improve communication.
The tools we are using for prediction are:
1. Visual analysis using different plots
2. ARMA (Auto-Regressive Moving Average model)
This Decision Model will help us forecast the failure rate of the ravens for the next 4-7 days.
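To give a feel for the autoregressive part of such a model, here is a deliberately simplified sketch: an AR(1) model fitted by least squares, which is a stand-in for illustration and not a full ARMA fit (it has no moving-average term). The function names and the sample series are assumptions.

```python
def fit_ar1(series):
    """Least-squares fit of y[t] = c + phi * y[t-1]; returns (c, phi)."""
    x = series[:-1]  # lagged values
    y = series[1:]   # current values
    n = len(x)
    if n < 2:
        raise ValueError("need at least three observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    if var_x == 0:
        raise ValueError("lagged values are constant; phi cannot be estimated")
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    phi = cov_xy / var_x
    c = mean_y - phi * mean_x
    return c, phi


def forecast_ar1(series, steps):
    """Iterate the fitted AR(1) recursion `steps` times past the last value."""
    c, phi = fit_ar1(series)
    forecasts = []
    last = series[-1]
    for _ in range(steps):
        last = c + phi * last
        forecasts.append(last)
    return forecasts


# Invented series that follows y[t] = 1 + 2 * y[t-1] exactly.
series = [1, 3, 7, 15]
c, phi = fit_ar1(series)
next_values = forecast_ar1(series, steps=2)
```

In practice an ARMA or ARIMA model would be fitted with a statistics library rather than by hand; the point here is only how each forecast is fed back in to produce the next one.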
In this paper we propose the use of a combination of LSTM and EDM models to address the issue of anomaly classification and prediction in time series data. Working with sensor data for automated storage and retrieval systems for a German hypermarket chain, we show that predictors based on variance and median methods show sufficient promise in the handling of anomalies.
— Team Teljapenosss Team Members — Jalapeno (Nasiba Zokirova) Team Mentor: petya-par Business Understanding The levels of air pollution allegedly caused by solid fuel heating and motor vehicle traffic are ever growing in the City of Sofia. The primary economical impact for the City of Sofia was a ruling by the European Court of […]
Dear Society, you should register for the Global Datathon 2018 – http://bit.ly/2wadU9C in order to see the case descriptions! 🙂
In this article the mentors give some preliminary guidelines, advice and suggestions to the participants for the case. Every mentor should write their name and chat name at the beginning of their texts, so that there are no mix-ups with the other mentors. By the rules it is essential to follow the CRISP-DM methodology (http://www.sv-europe.com/crisp-dm-methodology/). The DSS […]