Monthly Challenge – Sofia Air – Solution – Kiwi Team


14 thoughts on “Monthly Challenge – Sofia Air – Solution – Kiwi Team”

  1. Please upload your code here in the article. This platform can import and render Jupyter notebooks, or at the very least you can include code snippets in quote blocks.

  2. Thank you for sharing your solution with us. The maps are really helpful and give us a clear view of the situation. Using the leaflet library was a smart decision. 🙂 – Team Yagoda

  3. Fantastic work! I gained a lot of insight into the data that I never had before!

    I was thinking of fixing the erroneous values in temperature (and the other measurements) by replacing each one with the mean of the readings immediately before and after it, instead of the mean of the whole column. For example, if the temperature is 17 at 1pm, 53 at 2pm and 19 at 3pm, I would set a maximum-temperature threshold that flags 53 as an anomaly, and then replace 53 with 18, which is the average of 17 and 19. This operation would be done after grouping by geohash.

    It is just a suggestion, and I am just a beginner, so it may not be helpful :p

    Good luck!
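
The neighbour-mean repair suggested in comment 3 could be sketched roughly as follows. This is only an illustration under stated assumptions, not the Kiwi team's code (their solution is in R): the threshold `MAX_TEMP_C`, the function names, the geohash strings and the `(geohash, hour, temperature)` row layout are all hypothetical. When an anomaly has only one valid neighbour, the sketch falls back to that neighbour, which is an extra choice the comment does not specify.

```python
from collections import defaultdict

# Hypothetical threshold; the comment does not state a value.
MAX_TEMP_C = 45.0


def repair_series(values, max_value=MAX_TEMP_C):
    """Replace readings above max_value with the mean of the nearest valid
    readings before and after them (or the single valid neighbour)."""
    valid = [v <= max_value for v in values]
    repaired = list(values)
    for i, ok in enumerate(valid):
        if ok:
            continue
        # Nearest valid reading before and after position i, if any.
        prev_val = next((values[j] for j in range(i - 1, -1, -1) if valid[j]), None)
        next_val = next((values[j] for j in range(i + 1, len(values)) if valid[j]), None)
        neighbours = [v for v in (prev_val, next_val) if v is not None]
        if neighbours:
            repaired[i] = sum(neighbours) / len(neighbours)
    return repaired


def repair_by_geohash(rows, max_value=MAX_TEMP_C):
    """rows: iterable of (geohash, hour, temperature) tuples.
    Groups by geohash first, then repairs each group in time order."""
    groups = defaultdict(list)
    for geohash, hour, temp in rows:
        groups[geohash].append((hour, temp))
    result = {}
    for geohash, readings in groups.items():
        readings.sort()
        temps = repair_series([t for _, t in readings], max_value)
        result[geohash] = [(hour, t) for (hour, _), t in zip(readings, temps)]
    return result


# The example from the comment: 17 at 1pm, 53 at 2pm, 19 at 3pm.
rows = [
    ("sx8dfs", 13, 17.0),
    ("sx8dfs", 14, 53.0),
    ("sx8dfs", 15, 19.0),
    ("sx8dg1", 14, 21.0),
]
repaired = repair_by_geohash(rows)
print(repaired["sx8dfs"])  # [(13, 17.0), (14, 18.0), (15, 19.0)]
```

Grouping before repairing matters here: without it, the "previous" and "next" readings could come from a different location.
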

  4. This is great work, in my opinion. The code is very clean and functional, and the analysis you did beforehand and along the way helped me understand the task better. As everyone else said, the final result is beautiful and very informative. Great work!

  5. Excellent write-up and great progress from last week. I enjoyed your method of clustering the geohashes into regions to filter down to the geohashes that are within Sofia's boundaries.

    I liked how you visualized the data in 3D to bring in the perspective of elevation.
    All of your temperature, pressure and humidity boundaries seemed like good trade-offs. Given that the problem scope is limited to Sofia, is there a reason to keep data points at much higher elevations than the points inside the city boundaries? Given that Sofia is at 581 m (or somewhere around there), should data at elevations above 1000 meters be included at all?

    Overall, a fantastic approach. I really like how your team is pulling this together in a well-thought-out and well-reasoned way. Good luck for next week.

  6. Great progress so far! As I am not familiar with R, I cannot comment on the code. The visualizations look great though 🙂
    However, I have some questions:
    How did you come up with the limits for temperature, pressure and humidity? What about the errors in the P10 measurements?
    What is the point of aggregating the data by geo unit? Why not model the data at each sensor location? What do we gain from the aggregation?

  7. Hello Kiwi, congratulations on this relevant work so far! I particularly appreciate the data viz and the explanations surrounding your code. From my understanding, you removed the 2017 stations not present in 2018? This is indeed an interesting way to avoid dropping most of 2018, which is what would happen if the opposite were done (removing the 2018 stations not present in 2017); I am still hesitating over what is best on this point. I am not sure I understand what your localizeErrors function does, or how to interpret the subsequent figures. Looking forward to your updates for week 3! Best
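
The trade-off raised in comment 7 can be made concrete with a small sketch. It is an illustration only, with made-up station ids and values and a hypothetical helper `keep_known_stations`; it does not reproduce the team's R code or the real station counts.

```python
def keep_known_stations(rows, reference_rows):
    """Keep only the rows whose station id also appears in reference_rows.
    Each row is a (station_id, value) tuple."""
    reference_ids = {station for station, _ in reference_rows}
    return [row for row in rows if row[0] in reference_ids]


# Made-up data: few stations in 2017, many more in 2018.
rows_2017 = [("a", 10.0), ("b", 12.0)]
rows_2018 = [("a", 11.0), ("c", 9.0), ("d", 14.0), ("e", 13.0)]

# Option 1 (as the comment reads the write-up): drop 2017 stations missing in 2018.
option_1 = keep_known_stations(rows_2017, rows_2018) + rows_2018

# Option 2 (the opposite): drop 2018 stations missing in 2017.
option_2 = rows_2017 + keep_known_stations(rows_2018, rows_2017)

print(len(option_1), len(option_2))  # 5 3
```

With more stations in the later year, the first option keeps all of the 2018 rows, while the second discards every 2018 station that has no 2017 history.
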
