Hi Everybody,
I have done a basic EDA of some of the data and summarized it in the Python Jupyter notebook below. (Thanks, junior, for advising how to upload an .ipynb file to the platform.)
This notebook covers:
– Loading and merging the EEA Data
– Loading and merging the Air Tube Data
– Changing data types of timestamp variables
– Exploring the locations of the official and civil stations on a map
– Exploring the trend in PM10 concentration from the official stations' measurements
– Exploring different heatmaps of pollution by hour and by day of month
– Exploring the relationships between the different weather variables and the P1 and P2 variables from the Air Tube datasets
– Exploring the correlations between the measurements of the civil stations that are within 1 km of the official stations
– Comparing PM10 metrics for official stations and civil stations within 0.7 km range
– Fitting a SARIMAX model to predict the average daily PM10 pollution for the official stations and scoring it on the existing 2018 data
– Reviewing the model's performance and comparing it to the actual metrics in 2018
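For readers who want to try the station-matching steps without opening the notebook, here is a minimal sketch of how civil stations within a given radius of an official station could be selected. It is not taken from the notebook: the haversine formula is one common way to get the distance between two latitude/longitude points, and the function names, station names and coordinates below are all hypothetical, made up for illustration.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def civil_stations_near(official, civil, max_km=1.0):
    """Map each official station name to the civil sensors within max_km of it."""
    return {
        name: [
            sensor_id
            for sensor_id, (c_lat, c_lon) in civil.items()
            if haversine_km(o_lat, o_lon, c_lat, c_lon) <= max_km
        ]
        for name, (o_lat, o_lon) in official.items()
    }


# Hypothetical coordinates, not real station locations.
official = {"station_A": (42.6700, 23.3000)}
civil = {
    "sensor_1": (42.6750, 23.3000),  # roughly 0.56 km north of station_A
    "sensor_2": (42.7000, 23.3000),  # roughly 3.3 km north of station_A
}

nearby = civil_stations_near(official, civil, max_km=1.0)
print(nearby)  # {'station_A': ['sensor_1']}
```

Changing `max_km` to 0.7 gives the tighter range used for the PM10 comparison; with real data the two dictionaries would be built from the station coordinate columns of the EEA and Air Tube datasets.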
P.S. – the last chart doesn’t show up on this platform, so you can view it at: https://plot.ly/create/?fid=MartinPetrov:13#/
15 thoughts on “Sofia Air Quality EDA (Exploratory Data Analysis) and basic ARIMA model – updated”
But why don’t you import the Jupyter notebook here? Anything outside the website is not considered by the mentors.
Your assignments to peer review (and give feedback below the corresponding articles) for week 1 of the Monthly challenge are the following teams:
https://www.datasciencesociety.net/monthly-challenge-sofia-air-solution-lone-fighter/
https://www.datasciencesociety.net/monthly-challenge-sofia-air-solution-iseveryonehigh/
https://www.datasciencesociety.net/monthly-challenge-sofia-air-solution-dirty-minds/
Thanks junior, I did what you advised.
That is by far the best article for week 1. Congratulations!
Great work! Really like the idea for the heatmap visualization by day and month.
Thanks
Hello Martin,
You have done a great job on the Week One task! Your data visualization is very good and the idea of heatmaps is inspiring.
We are looking forward to your solution for Week 2.
Good luck!
Team Kiwi (:
Thank you guys, glad you liked it
Good job! I like the idea of putting a short table of contents at the beginning, and that you have used a lot of plots and tables. I also like the brief explanations of every step. All this makes the article comprehensible.
As a beginner, I can say that your code is very useful for me.
Thank you, hope it will help you get started in data analysis and visualisations.
Your assignments to peer review (and give feedback below the corresponding articles) for week 2 of the Monthly challenge are the following teams:
https://www.datasciencesociety.net/air-quality-week-1/
https://www.datasciencesociety.net/monthly-challenge-sofia-air-solution-jeremy-desir-weber/
https://www.datasciencesociety.net/sofia-air-week-1/
Your assignments to peer review (and give feedback below the corresponding articles) for week 3 of the Monthly challenge are the following teams:
https://www.datasciencesociety.net/monthly-challenge-sofia-air-solution-kung-fu-panda/
https://www.datasciencesociety.net/monthly-challenge-sofia-air-solution-banana/
https://www.datasciencesociety.net/air-quality-week-1/
Your assignments to peer review (and give feedback below the corresponding articles) for week 4 of the Monthly challenge are the following teams:
https://www.datasciencesociety.net/monthly-challenge-sofia-air-solution-kiwi-team/
https://www.datasciencesociety.net/the-pumpkins/
https://www.datasciencesociety.net/data-exploration-observations-planning/
Great job you have done here. Clear and simply structured code – we can understand the steps and why they are presented in this way. Very nice visualizations and one of the best solutions to the task! 🙂 – Team Yagoda
Thank you