Script in R below:
library(stringr)
#Step 1 ----------
rm(list=ls())
dd <- read.csv("C:\\Users\\estoyanova\\OneDrive - VMware, Inc\\ES\\UNI\\master BA\\Boriana-Monthly challenge\\Air Tube\\data_bg_2017.csv", header = TRUE, sep = ",", na.strings = c(""," ", "NA", "#NA"), stringsAsFactors = FALSE)
topo <- read.csv("C:\\Users\\estoyanova\\OneDrive - VMware, Inc\\ES\\UNI\\master BA\\Boriana-Monthly challenge\\TOPO-DATA\\sofia_topo.csv", header = TRUE, sep = ",", na.strings = c(""," ", "NA", "#NA"), stringsAsFactors […]
I have just begun Andrew Ng's machine learning course on Coursera, so I thought this challenge would be a good test of what I have learned. I apologise for the delay in writing this article; I was not sure whether I should take on this challenge, since the dataset seemed difficult to […]
Data Exploration, Observations and Data Preparation Planning
1. Business Understanding
Air quality in Sofia, Bulgaria, has been a problem for some time. The city's population is constantly increasing, which brings more traffic onto the streets. Car ownership in Sofia is among the highest in Europe, at around 600 cars per 1000 citizens. Another huge issue in […]
Preliminary Analysis
Because the objective is to predict air quality for the next 24 hours per station, the first step should be to understand the citizen science air quality measurements by grouping them by station and summarizing them by day. To complete this task for inspection and pre-processing in order to find missing data, outliers and […]
The theory is not enough
Academic education is indeed the best long-term investment in our professional and personal achievements. However, it has become crucial for universities to include practical seminars in their educational programs, with the aim of preparing students for the real problems they will solve as professionals in a given […]
Why should you join the Data Science Monthly Challenge, and what can you expect?
The Data Science Monthly Challenge gives participants an exceptional opportunity to work step by step on a solution to a real data science problem [https://bit.ly/2CAg0V8]. This gradual approach to advanced business problems lets participants familiarize themselves in depth with each of the important steps to consider when developing an effective, high-quality data science project.
Last but not least, the monthly challenge is an excellent opportunity for data enthusiasts to prepare for the Global Datathon organized by the Data Science Society, where time is constrained and the level of competition is much higher. The skills and deeper understanding acquired during the monthly challenges will play a key role and give teams a competitive advantage in large-scale events such as the Global Datathons. The monthly challenge can also inspire those with a more competitive attitude: each article will be open to voting and peer-to-peer review, and each week the best-voted articles in progress will be uploaded to the News section of the site.
So, what are you waiting for? 🙂
Register now for the learning challenge before the 15th of October at http://bit.ly/2QyNshI