1. Business Understanding
Industrial vibration analysis is a measurement tool used to identify, predict, and prevent failures. Implementing vibration analysis on machines improves their reliability, raises their efficiency, and reduces downtime by eliminating mechanical or electrical failures. Vibration analysis is used to identify faults in machinery, plan machinery repairs, and keep machinery functioning for as long as possible without failure. The velocity and the acceleration of the movements observed in the vibration process are good measures of the vibration.
2. Data Understanding
Time series data from velocity and acceleration measurements was provided for 7 machines, with 6 sensors on each machine. Data collected over several hours per day was available. Different time spans were covered for different machines.
Gaining an initial understanding of the data was a complex task because of the large number of records and the poorly structured data.
To understand the data better, graphs were explored and the descriptive statistics were reviewed in detail. Connections among different machines and sensors were investigated. The readings of sensors situated on the same part of a machine are highly correlated.
Additional data was provided for some of the maintenance events, such as grinding the rails and changing parts.
[Figure: example plot of the daily maximum values of effective velocity for RBG1, drive_gear sensor]
3. Data Preparation
To restructure the data in a more convenient and practical form, several data preparation techniques were used:
- The data was aggregated on different levels: machine, sensor type, hourly basis, daily basis.
A proxy for the daily maximum was used. To ignore the noise in the raw maximal values, the 90th percentile of the measurements is calculated for each hour, and the maximum of these hourly values is taken for the day. Maximal values of velocity and acceleration registered by the sensors are expected to be connected with possible disturbances in the functioning of the machine.
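The daily proxy described above could be sketched as follows. This is a minimal illustration in plain Python, not the code from the notebooks: the function names, the `(timestamp, value)` input shape, and the linear-interpolation percentile are all assumptions.

```python
from collections import defaultdict
from datetime import datetime
import math


def percentile(values, q):
    """Percentile (q in [0, 100]) of a non-empty list, with linear interpolation."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    if lo == hi:
        return ordered[lo]
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def daily_proxy_max(readings, q=90):
    """Map each date to the maximum of its hourly q-th percentiles.

    readings: iterable of (datetime, value) pairs for one sensor (assumed shape).
    """
    # Group the raw readings by (date, hour).
    by_hour = defaultdict(list)
    for ts, value in readings:
        by_hour[(ts.date(), ts.hour)].append(value)

    # Hourly percentile first, then the daily maximum of those percentiles.
    daily = {}
    for (day, _hour), values in by_hour.items():
        p = percentile(values, q)
        daily[day] = max(daily.get(day, p), p)
    return daily
```

Compared with the raw daily maximum, a single isolated spike within an hour does not drive the proxy, because it falls above that hour's 90th percentile.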
- Missing dates in the time series were filled in for the purposes of the time series analysis. The missing values are populated with the value from the closest date.
The number of consecutive days without observations may also be a strong predictor of machine performance. For example, if the machine was idle for only a day, it was probably a small maintenance event, while long periods with no observations can suggest a significant failure.
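A rough sketch of the gap filling and gap counting is given here, assuming the daily values are held in a dictionary keyed by date. The function name and the tie-breaking rule (the earlier date wins when two observed dates are equally close) are assumptions, since the text does not specify them.

```python
from datetime import date, timedelta


def fill_missing_days(daily):
    """Fill calendar gaps with the value from the closest observed date.

    daily: {date: value} with at least one entry.
    Returns (filled, gap_lengths), where gap_lengths holds the length of
    every run of consecutive missing days, in date order.
    """
    observed = sorted(daily)
    filled = dict(daily)
    gap_lengths = []
    for prev, nxt in zip(observed, observed[1:]):
        gap = (nxt - prev).days - 1  # number of missing days between the two
        if gap <= 0:
            continue
        gap_lengths.append(gap)
        for offset in range(1, gap + 1):
            day = prev + timedelta(days=offset)
            # Distance to prev is `offset`; distance to nxt is `gap + 1 - offset`.
            # Assumption: on a tie the earlier observation is used.
            nearest = prev if offset <= gap + 1 - offset else nxt
            filled[day] = daily[nearest]
    return filled, gap_lengths
```

The returned gap lengths can be attached to the rows as a predictor or tabulated per machine.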
[Figure: distribution of the number of consecutive missing days for two of the machines]
- Predictor characteristics. The following rolling characteristics were developed:
– Rolling mean with window 7 days
– Rolling std with window 7 days
– Rolling max with window 7 days
– Rolling min with window 7 days
– Rolling difference with lag 1 day
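The rolling characteristics listed above could be computed along these lines. The sketch assumes a gap-free list of daily values in date order. The key names and the use of the sample standard deviation are illustrative choices, not taken from the notebooks.

```python
from statistics import mean, stdev


def rolling_features(values, window=7):
    """Return one feature dict per day for a list of daily values.

    Window features are None until a full window of `window` days is available;
    the lag-1 difference is None for the first day.
    """
    rows = []
    for i, value in enumerate(values):
        row = {
            "value": value,
            "diff_1": value - values[i - 1] if i >= 1 else None,
            "roll_mean": None,
            "roll_std": None,
            "roll_max": None,
            "roll_min": None,
        }
        if i + 1 >= window:
            chunk = values[i + 1 - window : i + 1]  # the last `window` days, inclusive
            row["roll_mean"] = mean(chunk)
            row["roll_std"] = stdev(chunk)  # sample standard deviation (assumption)
            row["roll_max"] = max(chunk)
            row["roll_min"] = min(chunk)
        rows.append(row)
    return rows
```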
- The dates when maintenance activities have been performed were flagged as an additional variable.
- Outcome definition
The outcome is defined according to the following rules:
– Maintenance/repairs of the machines
– Values where the sum of the rolling means of the acceleration values exceeds 4 times the standard deviation of that sum
– All other cases
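One possible reading of these rules is sketched here. It assumes that the first two rules mark a day as a failure (1) and that all other cases are non-failures (0), and it takes the threshold literally as 4 times the standard deviation of the summed rolling means. The function name and the input layout are hypothetical.

```python
from statistics import stdev


def outcome_flags(accel_roll_means, maintenance, k=4.0):
    """Return a 0/1 outcome flag per day.

    accel_roll_means: per-day lists with the rolling mean of each
        acceleration measurement (assumed layout).
    maintenance: per-day booleans, True when maintenance or repairs took place.
    """
    sums = [sum(day) for day in accel_roll_means]
    threshold = k * stdev(sums)  # 4 x the standard deviation of the sum
    return [
        1 if (is_maintenance or total > threshold) else 0
        for total, is_maintenance in zip(sums, maintenance)
    ]
```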
Outcome flag distribution:
Failures: 51 cases
Non-failures: 3372 days
- SMOTE oversampling
Because of the small number of failures, a synthetic oversampling technique was used to multiply the failure records. Synthetic data points were created as additional failure records, so that the smaller class in the classification outcome has the same number of observations as the other class. In our case the failures are synthetically oversampled to equal the number of non-failures. This was done only to build the classification model. The fitted classifier was then applied to the original, non-synthetic data.
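To make the idea concrete, here is a simplified sketch of SMOTE-style interpolation in plain Python. It is not the library implementation and not necessarily what the notebooks ran. The function name, the default `k`, and the fixed seed are assumptions.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points from a list of minority-class feature vectors.

    Each synthetic point lies on the segment between a randomly chosen minority
    point and one of its k nearest minority neighbours.
    Requires at least two minority points.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` within the minority class only.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        t = rng.random()  # position along the segment base -> other
        synthetic.append([b + t * (o - b) for b, o in zip(base, other)])
    return synthetic
```

To balance the classes, `n_new` would be the number of non-failures minus the number of failures, and the synthetic points would be added to the training data only.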
4. Modeling
The modelling was separated into two main stages:
- Different standard time series analyses were conducted:
– Decomposition of the time series
– Stationarity tests
– ARIMA methods
– Fourier analysis
- Various classification techniques were applied using the outputs from the first step:
– Clustering analysis
– Classification methods
4.1 Time series analysis
Various time series analyses were applied to capture any seasonality or behaviour pattern in the measurements or in the maintenance process. As expected, a unit root test showed that the series are non-stationary. Lagged values and differences were therefore produced for further analysis.
The ACF and PACF results showed significant lags in the series. In the autoregressive part these are lag 1 and lag 7.
Based on the autocorrelation function and the partial autocorrelation function, lags 1 and 7 were picked as autoregressive predictors and used as inputs to the final model.
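As an illustration of the lag selection, the sketch below computes the sample autocorrelation at a given lag and builds the lag-1 and lag-7 predictors. It uses plain Python rather than a statistics package, and the function and key names are hypothetical.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of a list of numbers at a positive lag."""
    n = len(series)
    m = sum(series) / n
    denom = sum((x - m) ** 2 for x in series)
    num = sum((series[t] - m) * (series[t - lag] - m) for t in range(lag, n))
    return num / denom


def lag_features(series, lags=(1, 7)):
    """Build one row per day with the target value and its lagged values.

    The first max(lags) days are dropped because their lags are not available.
    """
    start = max(lags)
    return [
        {"y": series[t], **{f"lag_{lag}": series[t - lag] for lag in lags}}
        for t in range(start, len(series))
    ]
```

A series with a weekly pattern shows a high autocorrelation at lag 7, which is the kind of signal that motivates keeping that lag as a predictor.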
4.2 Fourier transform
A Fourier transform was applied to an additionally created dummy variable that shows whether measurement tests were run on a given date. The resulting spectrum shows a strong cyclical component with a period of approximately 100 days.
Our main assumption is that this 100-day cycle corresponds to regular maintenance of the machine (such as grinding, changing parts, or setup). However, this was not confirmed by the repair records provided by the organizers.
4.3 Clustering Analysis
The clustering analysis was applied separately to each machine. Well-separated clusters were observed based on the trends of the different measurements across the time series.
4.4 Classification approach
A Support Vector Machine was built as the classification approach. To use computing resources efficiently, the kernel version of the model was used. A grid search over the kernel, gamma and C parameters was performed to select the best-performing model.
All of the models were validated with a hold-out sample consisting of 25% of each machine's observations.
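A per-machine hold-out split could look like the following sketch. The text does not say whether the 25% was drawn at random or taken chronologically, so the random draw is an assumption, as are the function name and the `machine` key.

```python
import random
from collections import defaultdict


def holdout_split(rows, machine_key="machine", test_share=0.25, seed=0):
    """Split rows into (train, test), holding out test_share of each machine's rows.

    rows: list of dicts that each carry a machine identifier under machine_key.
    """
    rng = random.Random(seed)
    by_machine = defaultdict(list)
    for row in rows:
        by_machine[row[machine_key]].append(row)

    train, test = [], []
    for machine_rows in by_machine.values():
        shuffled = machine_rows[:]  # copy so the caller's list is left untouched
        rng.shuffle(shuffled)       # assumption: random rather than chronological
        n_test = round(len(shuffled) * test_share)
        test.extend(shuffled[:n_test])
        train.extend(shuffled[n_test:])
    return train, test
```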
Confusion matrices were created on the validation sample to assess the discrimination performance of the model. The model achieved an F1 score of 0.998279.
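For reference, the confusion matrix and the F1 score for a binary 0/1 outcome can be computed as in the sketch below. The function names are illustrative and failure (1) is taken as the positive class. The example does not reproduce the reported score.

```python
def confusion_matrix(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels, with 1 as the positive class."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == 1 and predicted == 1:
            tp += 1
        elif actual == 0 and predicted == 1:
            fp += 1
        elif actual == 1 and predicted == 0:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def f1_score(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when there are no positives at all."""
    tp, fp, fn, _tn = confusion_matrix(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0
```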
The model can be used to build an application that returns 0 when the current state of the machine does not show a high probability of failure, and 1 if the machine is about to fail. New data can be added on a daily or weekly basis, and the retrained model can be used to predict the class of the current state of the machine.
Code applied below:
Modeling Notebook (please download the zip file for modelling code)
Data Prep Notebook