
Monthly Challenge – Sofia Air – Solution – Kung Fu Panda


1. Business Understanding

Air quality in Sofia, Bulgaria, has been a problem for some time. The city's population is constantly increasing, which brings more traffic onto the streets. Car ownership in Sofia is among the highest in Europe, at around 600 cars per 1,000 citizens.

Another major issue in Sofia is the use of coal and wood for heating during the cold months of the year. Some neighborhoods are heated mostly with these fuels.

One factor that has not been well researched is the effect of construction sites. Trucks leaving these sites carry dirt onto the streets. As a result, the streets and sidewalks become completely covered with (fine) dust, which is then stirred up into the air by movement.

The city's location is also a major contributing factor to the poor air quality. Temperature inversions and the resulting fog in the winter months slow the dispersion of air pollutants.

2. Data Understanding

Preliminary analysis – to submit

3. Data Preparation



4. Modeling



5. Evaluation




6. Deployment


12 thoughts on “Monthly Challenge – Sofia Air – Solution – Kung Fu Panda”

  1.

    I would recommend importing the Jupyter notebook (ipynb file) directly into the platform. Furthermore, the section “Data Understanding – Preliminary Analysis” has too much about “how” you work with the data and not enough about “why” you do it in the first place. What is the “goal” of each step in the analysis? Keep up the good work! 🙂

    1.

      Thanks! I will try to improve the “why” part. Basically, the goal of the preliminary analysis is to get an understanding of the data: which data is relevant for our purpose and which part of the data we should use further.

  2.

    Unfortunately, your solution is written in Python and our team is new to this language, so we are not sure whether we can give you a specific evaluation or comment. Keep working, it looks nice 🙂 – Team Yagoda

  3.

    Your work on Week 2 is pretty good! We noticed that you only filtered the data on P10 particles. There are also serious outliers in temperature and pressure, which matters if you are going to use them as explanatory variables for P10. As a suggestion from our team, you can trace the stations that emit false observations and exclude them from the next steps of the analysis.
    We really like your visualizations and we are looking forward to your solution next week!
    Team Kiwi (:
