========> Code :Archive <=========
The purpose of this case is to understand the factors that cause air pollution, in Sofia in particular. The scope is to trace pollution from its sources to the different measuring stations and then to predict future pollution levels. This helps show how serious the pollution is and how it is likely to grow. The ultimate goal is to support action against this trend by identifying its main causes.
The data were collected from different weather stations and also record the sources of pollution.
The Sofia topology data helped identify clusters of polluting sources and their elevations.
We used the data specified for Level 1, combined into a single data set, to determine:
- Average wind speed per day
- Different pollution sources
- Elevation of each source
- Yearly average pollution per source, from which daily values were deduced
- Distance from source to measuring stations
- Stability of the weather
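As an illustration of the distance feature in the list above, here is a minimal sketch that computes the great-circle distance between a source and a measuring station from their coordinates. The haversine formula is a common choice for this, but it is an assumption here, not necessarily what the team used. The function name and the coordinates are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to 1.0 to guard against rounding just above the asin domain.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


# Hypothetical coordinates, for illustration only.
source = (42.70, 23.32)
station = (42.65, 23.38)
distance_km = haversine_km(*source, *station)
```

The same function can be applied to every source and station pair to build the distance column of the combined data set.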
The Gaussian plume model was implemented. Its output, along with the day and the pollution source, served as input to our predictive model.
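For reference, a minimal sketch of the textbook steady-state Gaussian plume equation with ground reflection follows. It is not the team's implementation. The function name and parameters are hypothetical, and the dispersion coefficients sigma_y and sigma_z are taken as inputs because they depend on the downwind distance and the atmospheric stability class.

```python
import math


def gaussian_plume(q, u, y, z, h, sigma_y, sigma_z):
    """Concentration (g/m^3) at crosswind offset y (m) and height z (m).

    q: emission rate (g/s), u: wind speed (m/s), h: effective release height (m),
    sigma_y, sigma_z: dispersion coefficients (m) at the receptor's downwind distance.
    """
    if u <= 0 or sigma_y <= 0 or sigma_z <= 0:
        raise ValueError("u, sigma_y and sigma_z must be positive")
    crosswind = math.exp(-y ** 2 / (2 * sigma_y ** 2))
    # The second term mirrors the source below the ground to model reflection.
    vertical = (math.exp(-(z - h) ** 2 / (2 * sigma_z ** 2))
                + math.exp(-(z + h) ** 2 / (2 * sigma_z ** 2)))
    return q / (2 * math.pi * u * sigma_y * sigma_z) * crosswind * vertical
```

Wind speed, source elevation, distance, and weather stability from the list above map onto u, h, and the two sigma values respectively.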
The predictive model:
An autoencoder compresses the input data into a chosen number of neurons and is then forced to rebuild the original data from that compressed representation. This forces the model to extract key elements of the data, which we can interpret as features. Note that this model falls under unsupervised learning: there are no separate input-output pairs, because the input and the target output are the same.
We tried different models, but the LSTM model was used to make the prediction because it gave us the best results. Its predictive ability comes from the cell state, which allows it to learn longer-term trends in the data. This suited our case, because we needed to predict the weather for the next day based on data from the previous 20 days.
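The 20-day lookback described above can be sketched as a sliding window over a daily series. This is an illustrative sketch only. The function name `make_windows` is hypothetical, and the real input had several features per day rather than a single value.

```python
def make_windows(series, lookback=20):
    """Split a daily series into (previous `lookback` days, next day) pairs."""
    inputs, targets = [], []
    for end in range(lookback, len(series)):
        inputs.append(series[end - lookback:end])  # the 20 days before `end`
        targets.append(series[end])                # the day to predict
    return inputs, targets


# Hypothetical daily values, for illustration only.
daily_values = [float(day) for day in range(25)]
windows, next_days = make_windows(daily_values, lookback=20)
```

Each window becomes one training sample for the sequence model, and the matching entry in `next_days` is its target.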
The ADAgrad optimizer essentially uses a different learning rate for every parameter and every time step. The reasoning behind ADAgrad is that parameters that are updated infrequently should have larger learning rates, while parameters that are updated frequently should have smaller ones. In other words, ADAgrad scales each parameter's stochastic gradient descent step by the inverse square root of that parameter's accumulated squared gradients. Optimizers that build on ADAgrad, such as Adam, keep this idea and add the following properties:
- The learning rate is different for every parameter and every iteration.
- The learning rate does not diminish over time as it does with ADAgrad.
- The gradient update uses running estimates of the first and second moments of the gradients, allowing for a more statistically sound descent.
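To make the contrast concrete, here is a minimal sketch of an ADAgrad update next to an Adam-style update. It assumes the properties listed above describe Adam, which the text does not state explicitly. The function names and hyperparameter values are illustrative assumptions, not the team's settings.

```python
import math


def adagrad_step(params, grads, accum, lr=0.1, eps=1e-8):
    """ADAgrad: divide each step by the root of the summed squared gradients."""
    for i, g in enumerate(grads):
        accum[i] += g * g
        params[i] -= lr * g / (math.sqrt(accum[i]) + eps)


def adam_step(params, grads, m, v, t, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """Adam: use bias-corrected running averages of the gradient and its square."""
    for i, g in enumerate(grads):
        m[i] = beta1 * m[i] + (1 - beta1) * g
        v[i] = beta2 * v[i] + (1 - beta2) * g * g
        m_hat = m[i] / (1 - beta1 ** t)  # bias correction; t starts at 1
        v_hat = v[i] / (1 - beta2 ** t)
        params[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)


# Feed both optimizers a constant gradient and record the step sizes.
adagrad_sizes, adam_sizes = [], []
p_ada, acc = [0.0], [0.0]
p_adam, m, v = [0.0], [0.0], [0.0]
for t in range(1, 11):
    before = p_ada[0]
    adagrad_step(p_ada, [1.0], acc)
    adagrad_sizes.append(abs(p_ada[0] - before))
    before = p_adam[0]
    adam_step(p_adam, [1.0], m, v, t)
    adam_sizes.append(abs(p_adam[0] - before))
```

Under a constant gradient, the ADAgrad steps shrink in proportion to 1/sqrt(t) because the accumulator only grows, while the Adam-style steps stay close to the base learning rate.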
All of the analysis above can be implemented with relative ease thanks to Keras and its functional API.
The evaluation used the provided data set, along with a randomly generated test set created by the team.
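As a rough illustration of the Keras functional API, the sketch below builds a small autoencoder and an LSTM forecaster of the kind described earlier. The layer sizes, feature counts, loss, and the "adam" optimizer string are all assumptions made for the example. The team's actual architecture and settings are not given in the text.

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_autoencoder(n_features=8, bottleneck=3):
    """Autoencoder: compress to `bottleneck` neurons, then rebuild the input."""
    inputs = keras.Input(shape=(n_features,))
    encoded = layers.Dense(bottleneck, activation="relu")(inputs)
    decoded = layers.Dense(n_features)(encoded)
    autoencoder = keras.Model(inputs, decoded)
    encoder = keras.Model(inputs, encoded)  # reuses the trained encoding layer
    autoencoder.compile(optimizer="adam", loss="mse")
    return autoencoder, encoder


def build_lstm_forecaster(lookback=20, n_features=1, units=32):
    """LSTM that maps `lookback` days of features to one next-day value."""
    inputs = keras.Input(shape=(lookback, n_features))
    hidden = layers.LSTM(units)(inputs)
    outputs = layers.Dense(1)(hidden)
    model = keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="mse")
    return model


autoencoder, encoder = build_autoencoder()
forecaster = build_lstm_forecaster()
```

The autoencoder would be trained with the same array as input and target, for example `autoencoder.fit(x, x)`, which matches the unsupervised setup described above. The encoder output can then serve as extracted features.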
In our evaluation, the model came close to 86% accuracy.