
Air quality week 1


8 thoughts on “Air quality week 1”

  1.

    You should use the built-in capabilities of our website, which let you upload a Jupyter notebook directly, including code, visualisations, your remarks, etc. Uploading a PDF is lame.

  2.

    Review Comments:

    Overall, good job. You seem to be heading in the right direction. You seem to know your technology and tooling and have been applying good reasoning along the way. I am not familiar with tidyverse, so I won’t comment on its use, but it seems so minimal that I may consider learning it :-). I enjoyed reading your to-the-point execution of the goals and your use of quick plotting to visualize the points.

    A few things I would have liked to see in the business section: (a) the assumptions being made; (b) the risks to the project, from both an execution and a risk perspective; (c) the underlying assumptions that aren’t clearly stated. For example, the write-up seems to question whether the government data is accurate, sufficient, etc. Does this affect your analysis or introduce data bias?

    In the data section, on the positive side, you walked through it methodically, worked through understanding the citizen data, and arrived at a filtered set. Some feedback on the data section:

    1. You seem to ignore the data from the API – perhaps you plan to get to this later but consider if that makes your analysis better, worse or not known.
    2. You excluded all data points outside of Sofia. While that is what is being asked, is there any reason you did this so soon? Could you have done a few iterations of the analysis and then removed them? Especially considering air pollution and inversion: if there is an inversion in Sofia, it is probably also present in neighboring areas. Is activity in the neighboring areas making Sofia’s problem worse? I think data granularity is an important element. I can tell you that Seattle, WA had thick smoke blanketing the area a few months back from the fires in Vancouver, Canada and Oregon. Air pollutants have an odd way of moving around. Perhaps this is a personal choice, but I would explore the neighboring data for patterns first, before eliminating it. Given the objectives in the problem scope, your approach is probably just fine; I will let the mentors decide.
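The second point in the review above, keeping the stations outside Sofia for a few iterations instead of dropping them up front, could be sketched roughly as follows. This is a minimal Python sketch under assumptions: the `Station` type, the `SOFIA_BOX` bounding box and its coordinate values, and the sample stations are all hypothetical placeholders, not taken from the case data or from the team's notebook.

```python
from dataclasses import dataclass

# Hypothetical bounding box roughly around Sofia. The values are
# illustrative placeholders, not the boundary used in the case.
SOFIA_BOX = {"lat_min": 42.6, "lat_max": 42.8, "lon_min": 23.2, "lon_max": 23.5}


@dataclass
class Station:
    station_id: str
    lat: float
    lon: float


def in_box(station, box=SOFIA_BOX):
    """Return True if the station lies inside the bounding box."""
    return (box["lat_min"] <= station.lat <= box["lat_max"]
            and box["lon_min"] <= station.lon <= box["lon_max"])


def tag_stations(stations, box=SOFIA_BOX):
    """Pair every station with an inside/outside flag; nothing is dropped."""
    return [(s, in_box(s, box)) for s in stations]


# Made-up stations for illustration only.
stations = [
    Station("a", 42.70, 23.32),  # inside the placeholder box
    Station("b", 42.15, 24.75),  # outside the placeholder box
]

tagged = tag_stations(stations)
inside = [s.station_id for s, flag in tagged if flag]
outside = [s.station_id for s, flag in tagged if not flag]
```

Tagging instead of filtering keeps the neighboring stations available for pattern checks, and the final Sofia-only set is still one list comprehension away.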

  3.

    Hello, this is a good start at analysing the locations of the civil stations. I do think you need to do some more exploratory data analysis in order to understand the distribution of the variables in all of the datasets provided, as well as the correlations between them. What predictive model are you aiming to use?
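The exploratory step suggested in the comment above, looking at each variable's distribution and at pairwise correlations, might start from something like the following. It is only a sketch using the Python standard library; the variable names `pm10` and `humidity` and all the readings are made up for illustration and are not from the provided datasets.

```python
import math
import statistics


def summarize(values):
    """Basic distribution summary for one numeric variable."""
    return {
        "n": len(values),
        "min": min(values),
        "median": statistics.median(values),
        "mean": statistics.mean(values),
        "max": max(values),
        "stdev": statistics.stdev(values),
    }


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Made-up readings, chosen only to exercise the functions.
pm10 = [10.0, 20.0, 30.0, 40.0]
humidity = [80.0, 70.0, 60.0, 50.0]

pm10_summary = summarize(pm10)
r = pearson(pm10, humidity)
```

Running the same summary over every variable in every dataset, and the correlation over every pair, gives a first picture of which inputs a predictive model could use.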
