Journal Name: International Journa ...

Optimisations of four imputation frameworks for performance exploring based on decision tree algorithms in big data analysis problems

Bektaş, Jale

How to treat missing values is a recurring problem in big data analysis, and various imputation strategies have been developed in response. This study focuses on four imputation frameworks that propose novel perspectives based on expectation-maximisation (EM), self-organising map (SOM), K-means and multilayer perceptron (MLP). Initially, several transformation steps (normalisation, standardisation, interquartile range and wavelet) were applied. The imputed datasets were then analysed using decision tree algorithms (DTAs) with optimised parameters. These analyses showed that the DTAs were not strikingly affected by any data transformation technique except interquartile range. Even though the dataset contains a missing value ratio of 33.73%, the EM imputation fr...
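To make the idea of cluster-based imputation concrete, the following is a minimal sketch of one common form of K-means imputation. It is not the framework proposed in the study, whose details are not given in this abstract. The function name `kmeans_impute`, the toy data, the mean-fill starting point and the deterministic choice of initial centroids are all assumptions made for illustration. The transformation and decision tree steps are not shown.

```python
import math


def kmeans_impute(rows, k=2, iters=10):
    """Fill None entries with the matching coordinate of each row's cluster centroid.

    Illustrative sketch only (hypothetical helper, not the study's method).
    Assumes every column has at least one observed value and len(rows) >= k.
    """
    n_cols = len(rows[0])

    # Step 1: start from a column-mean fill so that distances are defined.
    col_means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        col_means.append(sum(observed) / len(observed))
    filled = [
        [col_means[j] if r[j] is None else r[j] for j in range(n_cols)]
        for r in rows
    ]

    # Step 2: deterministic initial centroids (assumption: the first k rows).
    centroids = [list(filled[i]) for i in range(k)]

    for _ in range(iters):
        # Assign each row to its nearest centroid (Euclidean distance).
        labels = [
            min(range(k), key=lambda c: math.dist(row, centroids[c]))
            for row in filled
        ]
        # Recompute each centroid as the mean of its member rows.
        for c in range(k):
            members = [filled[i] for i in range(len(filled)) if labels[i] == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        # Overwrite only the originally missing cells with centroid values.
        for i, r in enumerate(rows):
            for j in range(n_cols):
                if r[j] is None:
                    filled[i][j] = centroids[labels[i]][j]
    return filled


# Toy data with two well-separated groups; None marks a missing value.
data = [
    [1.0, 2.0],
    [10.0, 20.0],
    [1.2, None],
    [None, 19.0],
    [0.8, 2.2],
    [9.5, 21.0],
]
filled = kmeans_impute(data, k=2, iters=10)
```

In this sketch the observed values are never modified. Only the missing cells are refined, moving from the global column mean towards the mean of the cluster the row falls into. A real pipeline would typically use a tested library implementation and a more robust centroid initialisation.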