sama

Popular comments by sama

Data Exploration, Observations, Planning

This is very good and thorough. I am convinced that you know what you are doing and have a clear plan for how to proceed. I did notice that you seem to be planning to group the observations (average them?) into a single observation per day, rather than keeping the current hourly data. It seems to me that this would lose a significant amount of information, particularly if the ultimate goal is to create predictions for an entire 24-hour interval. You might also try using a Jupyter notebook in the future for a cleaner presentation of your code.
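
To illustrate the concern about daily averaging, here is a minimal sketch assuming the data is handled with pandas. The series name `pm10` and its values are invented for illustration and are not taken from the challenge data. It builds two days of hourly readings with a daily cycle and then averages them per day.

```python
import numpy as np
import pandas as pd

# Hypothetical hourly timestamps covering two full days.
hours = pd.date_range("2018-01-01", periods=48, freq=pd.Timedelta(hours=1))

# Invented readings: a baseline of 20 with a sinusoidal daily cycle of amplitude 10.
values = 20 + 10 * np.sin(2 * np.pi * hours.hour.to_numpy() / 24)
pm10 = pd.Series(values, index=hours, name="pm10")

# Collapse the hourly readings into one mean value per calendar day.
daily = pm10.resample("D").mean()

print(len(pm10), "hourly values ->", len(daily), "daily values")
print("hourly range:", pm10.max() - pm10.min())
print("daily range: ", daily.max() - daily.min())
```

In this toy case the within-day swing disappears entirely from the daily means, which is the kind of information a 24-hour prediction would likely need.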

Monthly Challenge – Sofia Air – Solution – [iseveryonehigh]

I think this looks like a good start!

I’m not sure (and admittedly my article doesn’t have much of this either), but I think that you may want to include more background about the project goals. The instructions (https://www.datasciencesociety.net/october-data-science-monthly-challenge/) also had some more suggestions about data cleaning that you may want to implement – things like checking for missing values and removing stations which weren’t measured in 2018.
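
As a rough sketch of those two cleaning steps, assuming pandas and a long-format table: the column names `station`, `time`, and `pm10` and all of the values below are hypothetical, not the actual challenge schema.

```python
import pandas as pd

# Invented sample data; column names are assumptions for illustration only.
df = pd.DataFrame({
    "station": ["a", "a", "b", "b", "c"],
    "time": pd.to_datetime([
        "2017-12-31 23:00", "2018-01-01 00:00",
        "2017-06-01 12:00", "2017-06-01 13:00",
        "2018-03-05 08:00",
    ]),
    "pm10": [31.0, None, 18.0, 22.0, 40.0],
})

# Count missing values in each column.
missing_per_column = df.isna().sum()

# Keep only stations that have at least one row timestamped in 2018.
stations_2018 = df.loc[df["time"].dt.year == 2018, "station"].unique()
df_2018_stations = df[df["station"].isin(stations_2018)]

print(missing_per_column)
print(df_2018_stations)
```

Here station `b` is dropped because it has no 2018 rows, while the earlier 2017 row of station `a` is kept; whether to keep such rows depends on what the analysis needs.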

Also, while taking care of the 0 values seems like a good idea, I’m curious what you replaced them with. I’m not familiar with ffill. Did you assign mean values for each of the “missing” data points?
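
For reference, if ffill here is the pandas method, it is a forward fill: each missing value is replaced by the last valid value before it, not by a mean. The sketch below contrasts the two, assuming the zeros are first converted to NaN; the readings are invented for illustration.

```python
import numpy as np
import pandas as pd

# Invented readings where 0.0 stands in for a missing measurement.
pm10 = pd.Series([12.0, 0.0, 0.0, 18.0, 0.0, 24.0])

# Treat zeros as missing so the fill methods can act on them.
with_gaps = pm10.replace(0.0, np.nan)

# Forward fill: carry the last valid reading forward into each gap.
forward_filled = with_gaps.ffill()

# Mean imputation: replace each gap with the mean of the valid readings.
mean_filled = with_gaps.fillna(with_gaps.mean())

print(forward_filled.tolist())  # [12.0, 12.0, 12.0, 18.0, 18.0, 24.0]
print(mean_filled.tolist())     # [12.0, 18.0, 18.0, 18.0, 18.0, 24.0]
```

Note that a forward fill leaves a gap untouched if it occurs at the very start of the series, since there is no earlier value to carry forward.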

I really like your heatmaps, and think that they are useful for visualizing the data before you really dive into using it.

Overall a good start!