The goal is to accurately and reliably forecast the time-series trends of the sensors for each aircraft gas turbine engine in the testing dataset, and to identify the sensors that are sufficient to quantify the degradation of the aircraft turbofan engines.
Remaining Useful Life (RUL) prediction is important in many applications of ML in the manufacturing and service industries: it gives an idea of when a component or machine is likely to fail, so that it can be serviced before it breaks down.
Dataset
The "train_data.csv" file is the one to use to identify which sensors show similar degradation trends across the 50 training engines. You may also be interested in checking the consistency of the cycles at and near the failure cycle for the 50 training engines. The "test_data.csv" file holds the data for the testing engines. For the project, you only need to forecast the number of cycles remaining to failure, which is available in "RUL_forecast_length.csv".
Implementation
We will start by importing the required libraries:
# Importing the required libraries
import tensorflow as tf
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
Then we load the training and test datasets:
# Reading the .csv files
train_df = pd.read_csv('./train_data.csv')
test_df = pd.read_csv('./test_data.csv')
rul_df = pd.read_csv('./RUL_forecast_length.csv')
Next we look at the first few rows to get an idea of how the data looks, which makes it easier to decide what preprocessing is needed to make the data clean and ML-ready.
train_df.head()
Then we get the cycle-count statistics for the engines in the training data.
The statistics show that the minimum number of cycles an engine runs is 128 and the maximum is 287, with a mean cycle count of 198 cycles.
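As a rough illustration of how such per-engine cycle statistics could be computed, here is a minimal sketch on a toy frame. The column names "engine_id" and "cycle" are assumptions, since the actual headers of "train_data.csv" are not shown here, and cycle_statistics is a hypothetical helper.

```python
import pandas as pd

# Toy frame standing in for train_data.csv. The column names
# "engine_id" and "cycle" are assumptions, not taken from the real file.
toy_df = pd.DataFrame({
    "engine_id": [1, 1, 1, 2, 2, 2, 2, 2],
    "cycle":     [1, 2, 3, 1, 2, 3, 4, 5],
})

def cycle_statistics(df, id_col="engine_id", cycle_col="cycle"):
    """Summarise the last recorded cycle of every engine (hypothetical helper)."""
    # The largest cycle number per engine is the number of cycles it ran.
    max_cycles = df.groupby(id_col)[cycle_col].max()
    # describe() gives count, mean, std, min, quartiles and max in one call.
    return max_cycles.describe()

stats = cycle_statistics(toy_df)
print(stats[["min", "max", "mean"]])
```

On the real training data the same call would be made on train_df, with the column names adjusted to match the file.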
Then we add the target feature, since the data doesn't have a precalculated RUL value. For each engine we take the maximum cycle it ran, set the RUL at that last cycle to zero, and count up by one per cycle going back towards the first cycle, which therefore gets the largest value. This gives us a regression variable that can be predicted by the ML model. The same is done for the test data.
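The labelling step described above can be sketched as follows. This is a minimal sketch, not the original implementation: the column names "engine_id" and "cycle" and the helper add_rul are assumptions. The optional final_rul argument is also an assumption about how the values in "RUL_forecast_length.csv" might be applied to test engines whose records stop before failure.

```python
import pandas as pd

def add_rul(df, final_rul=None, id_col="engine_id", cycle_col="cycle"):
    """Add an 'RUL' column counting down to zero at each engine's last cycle.

    final_rul (optional, assumed usage): a Series indexed by engine id giving
    the cycles still remaining after the last recorded cycle.
    """
    out = df.copy()
    # Broadcast each engine's maximum cycle back onto all of its rows.
    max_cycle = out.groupby(id_col)[cycle_col].transform("max")
    out["RUL"] = max_cycle - out[cycle_col]
    if final_rul is not None:
        # Shift the countdown by the remaining life of each engine.
        out["RUL"] = out["RUL"] + out[id_col].map(final_rul)
    return out

# Toy data with assumed column names.
toy_df = pd.DataFrame({
    "engine_id": [1, 1, 1, 2, 2, 2, 2, 2],
    "cycle":     [1, 2, 3, 1, 2, 3, 4, 5],
})

train_labelled = add_rul(toy_df)
test_labelled = add_rul(toy_df, final_rul=pd.Series({1: 10, 2: 20}))
print(train_labelled)
```

Using transform("max") keeps the result aligned with the original rows, so no merge is needed.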
Then we move on to feature selection by plotting the sensor data as each engine approaches its failure cycle. The plots reveal two types of sensor: those whose readings vary as the engine approaches the failure cycle, and those that stay constant throughout the cycles.
Some plots are shown below.
So only the sensors showing variation were selected, and the rest were removed from the training and testing data.
Now we standardise the numerical features in the data. ML models are then trained to fit the data, and the R2 and RMSE of each model are noted against the test data. The best model is then hyperparameter-tuned to get the best fit for the data.
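The selection above was made by inspecting plots. A programmatic stand-in, which is a swapped-in technique rather than the method used here, is to drop sensors whose standard deviation is effectively zero. The sensor names, the toy values, the min_std threshold and the helper varying_sensors are all assumptions for illustration.

```python
import pandas as pd

def varying_sensors(df, sensor_cols, min_std=1e-6):
    """Return the sensor columns whose standard deviation exceeds min_std."""
    stds = df[sensor_cols].std()
    return [col for col in sensor_cols if stds[col] > min_std]

# Toy readings with made-up values and assumed column names.
toy_df = pd.DataFrame({
    "sensor_1": [518.67, 518.67, 518.67, 518.67],  # flat throughout
    "sensor_2": [641.8, 642.1, 642.6, 643.2],      # drifts over the cycles
    "sensor_3": [21.61, 21.61, 21.61, 21.61],      # flat throughout
})
sensor_cols = ["sensor_1", "sensor_2", "sensor_3"]

keep = varying_sensors(toy_df, sensor_cols)
reduced_df = toy_df[keep]
print(keep)
```

The list of kept columns should be derived from the training data only and then applied unchanged to the test data, so both sets end up with the same features.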
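The standardisation and scoring steps can be sketched with NumPy alone. The models that were compared are not named here, so the least-squares linear fit below is only a placeholder baseline. The synthetic data and the helper names standardise, rmse and r2 are assumptions for illustration.

```python
import numpy as np

def standardise(train_X, test_X):
    """Scale both sets with the mean and std of the training set only."""
    mean = train_X.mean(axis=0)
    std = train_X.std(axis=0)
    std[std == 0] = 1.0  # guard against division by zero for flat columns
    return (train_X - mean) / std, (test_X - mean) / std

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def r2(y_true, y_pred):
    """Coefficient of determination."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Synthetic, noise-free data standing in for the sensor features and RUL.
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0, 0.5])
train_X = rng.normal(size=(200, 3))
test_X = rng.normal(size=(50, 3))
train_y = train_X @ true_w + 5.0
test_y = test_X @ true_w + 5.0

train_Xs, test_Xs = standardise(train_X, test_X)

# Placeholder baseline: ordinary least squares with an intercept column.
A_train = np.column_stack([train_Xs, np.ones(len(train_Xs))])
coef, *_ = np.linalg.lstsq(A_train, train_y, rcond=None)
A_test = np.column_stack([test_Xs, np.ones(len(test_Xs))])
pred = A_test @ coef

test_rmse = rmse(test_y, pred)
test_r2 = r2(test_y, pred)
print(test_rmse, test_r2)
```

Fitting the scaling statistics on the training set and reusing them for the test set avoids leaking test information into the model. The same two metric functions can score any other model whose predictions are passed in as pred.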
If you need an implementation of the above problem or any of its variants, feel free to contact us.