Popular comments by sama
Hi Jacob, I think the article could use more detail, and perhaps some of your code or output from SPSS. Is this meant to be your final article, or are you still planning to add to it?
This is very good and thorough. I am convinced that you know what you are doing and have come up with a plan to proceed. I did notice that you seem to be planning to group observations (average them?) into a single observation per day, rather than keeping the current hourly information. It seems to me that this would lose a significant amount of information, particularly if the ultimate goal is to create predictions for an entire 24-hour interval. You might also try a Jupyter notebook in the future for a cleaner presentation of your code.
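To make the concern about daily averaging concrete, here is a minimal sketch in plain Python. The hourly readings are invented for illustration and do not come from the challenge data; the function names are hypothetical as well. It only shows that a single daily mean hides the spread between the quietest and busiest hours.

```python
from statistics import mean

# Hypothetical hourly readings for one station on one day (24 values).
hourly = [10, 9, 8, 8, 9, 12, 20, 35, 40, 30, 22, 18,
          16, 15, 15, 17, 22, 30, 38, 33, 25, 18, 14, 11]


def daily_mean(readings):
    """Collapse a day of hourly readings into one averaged observation."""
    return mean(readings)


def hourly_spread(readings):
    """Range between the highest and lowest hour, which the mean discards."""
    return max(readings) - min(readings)


if __name__ == "__main__":
    print("daily mean:", daily_mean(hourly))
    print("spread lost by averaging:", hourly_spread(hourly))
```

If the model has to predict each hour of the next day, the within-day pattern is exactly what it needs, so keeping the hourly rows (or adding the hour as a feature) may be worth trying before aggregating.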
I think this looks like a good start!
I’m not sure (and admittedly my article doesn’t have much of this either), but I think that you may want to include more background about the project goals. The instructions (https://www.datasciencesociety.net/october-data-science-monthly-challenge/) also had some more suggestions about data cleaning that you may want to implement – things like checking for missing values and removing stations which weren’t measured in 2018.
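As a rough illustration of those two cleaning steps, here is a sketch in plain Python. The record layout (`station`, `year`, `value`), the station names, and the helper functions are all assumptions made up for this example, not the structure of the challenge data.

```python
# Hypothetical records; None marks a missing measurement.
records = [
    {"station": "A", "year": 2017, "value": 21.0},
    {"station": "A", "year": 2018, "value": None},
    {"station": "A", "year": 2018, "value": 19.5},
    {"station": "B", "year": 2017, "value": 30.2},
    {"station": "B", "year": 2018, "value": None},
]


def count_missing(rows):
    """Number of rows with no measured value."""
    return sum(1 for r in rows if r["value"] is None)


def stations_with_data(rows, year):
    """Stations that have at least one real measurement in the given year."""
    return {r["station"] for r in rows
            if r["year"] == year and r["value"] is not None}


def keep_stations(rows, stations):
    """Drop every row belonging to a station outside the given set."""
    return [r for r in rows if r["station"] in stations]


if __name__ == "__main__":
    print("missing values:", count_missing(records))
    measured_2018 = stations_with_data(records, 2018)
    print("rows kept:", len(keep_stations(records, measured_2018)))
```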
Also, while taking care of the 0 values seems like a good idea, I’m curious what you replaced them with. I’m not familiar with ffill. Did you assign mean values for each of the “missing” data points?
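For reference, a forward fill (what pandas calls `ffill`) copies the last valid value forward into each gap rather than assigning a mean. The sketch below contrasts the two approaches in plain Python. It assumes, as in the article, that 0 marks a missing reading; the function names and the sample values are made up for illustration.

```python
from statistics import mean


def forward_fill(values, missing=0):
    """Replace each missing value with the most recent non-missing one.

    Gaps at the very start have nothing to copy from, so they stay as-is.
    """
    filled = []
    last = None
    for v in values:
        if v == missing:
            filled.append(last if last is not None else v)
        else:
            last = v
            filled.append(v)
    return filled


def mean_fill(values, missing=0):
    """Replace each missing value with the mean of the non-missing values."""
    present = [v for v in values if v != missing]
    if not present:
        return list(values)  # nothing to average, leave unchanged
    fill = mean(present)
    return [fill if v == missing else v for v in values]


if __name__ == "__main__":
    readings = [12, 0, 0, 15, 0, 9]  # hypothetical readings, 0 = missing
    print(forward_fill(readings))
    print(mean_fill(readings))
```

The two can give quite different series: forward fill repeats the last observed value through a gap, while mean fill ignores when the gap occurred.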
I really like your heatmaps, and think that they are useful for visualizing the data before you really dive into using it.
Overall a good start!