Ensembles of two-directional Laplacian Pyramids for Data Imputation
Amit Shreiber, M.Sc.
Student at the Department of Industrial Engineering, Tel Aviv University
Via Zoom
Abstract:
Collected datasets usually contain some incomplete data. The literature offers several known methods for handling missing data, for example deleting observations that contain missing values, or replacing each missing value with the mean of the other observations of the relevant variable. These methods and many others have advantages but also limitations. Another technique for imputing missing values is based on Laplacian pyramids, a multi-scale regression approach. The Laplacian pyramids approach constructs kernels of decreasing widths to learn a given dataset in a multi-scale manner. In practice, kernels of decreasing widths are constructed on both the row space and the column space of the given dataset, and each step of the algorithm approximates the complete data by considering the connections in both directions. This research developed two ensemble methods, based on the Laplacian pyramid, for imputing missing values. Both methods generate multiple imputed data realizations: at least two new values are generated as imputation options for each missing value, and the final imputed value is derived from the vector of values produced by this multiple imputation procedure. The first method shuffles the original dataset and creates new datasets by randomly choosing entries from the shuffled dataset; Laplacian pyramids are then applied to each new dataset. The second method draws new sub-datasets using the Hypergeometric distribution; once the new datasets are sampled, missing values are replaced in the same way as in the first method. The ensemble methods were examined on datasets of different dimensions, and all the missingness mechanisms were tested. In most cases, the proposed methods yield lower errors than existing imputation methods applied to the benchmark dataset.
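As a rough illustration of the ideas in the abstract, the Python sketch below imputes missing entries of a one-dimensional signal with a small pyramid of Gaussian kernels whose widths halve at every level, and then averages the predictions of several such pyramids. It is not the speaker's implementation and makes several simplifying assumptions: it works in a single direction rather than on both the rows and the columns of a data matrix, it uses a fixed number of levels, and it builds the ensemble from plain random subsamples combined by their mean, not from the shuffling or Hypergeometric sampling schemes presented in the talk. All names and parameters (`pyramid_predict`, `ensemble_impute`, `sigma0`, `n_levels`, `keep_fraction`) are hypothetical.

```python
import numpy as np


def _row_normalized_kernel(sq_dists, sigma):
    """Gaussian kernel with rows scaled to sum to one (a smoothing operator)."""
    kernel = np.exp(-sq_dists / sigma**2)
    # Guard against rows that underflow to zero at very small widths.
    row_sums = np.maximum(kernel.sum(axis=1, keepdims=True), np.finfo(float).tiny)
    return kernel / row_sums


def pyramid_predict(x_obs, y_obs, x_new, sigma0=4.0, n_levels=8):
    """Smooth the residual with kernels of halving width; sum the levels at x_new."""
    x_obs = np.asarray(x_obs, dtype=float)
    y_obs = np.asarray(y_obs, dtype=float)
    x_new = np.asarray(x_new, dtype=float)
    sq_oo = (x_obs[:, None] - x_obs[None, :]) ** 2  # observed vs. observed
    sq_no = (x_new[:, None] - x_obs[None, :]) ** 2  # new vs. observed
    fit_obs = np.zeros_like(y_obs)
    fit_new = np.zeros(len(x_new))
    for level in range(n_levels):
        sigma = sigma0 / 2**level          # assumed schedule: halve the width
        residual = y_obs - fit_obs         # what coarser levels failed to explain
        fit_new += _row_normalized_kernel(sq_no, sigma) @ residual
        fit_obs += _row_normalized_kernel(sq_oo, sigma) @ residual
    return fit_new


def ensemble_impute(x, y, n_members=10, keep_fraction=0.8, seed=0):
    """Impute NaNs in y: one pyramid per random subsample, then average."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    missing = np.isnan(y)
    observed_idx = np.flatnonzero(~missing)
    n_keep = max(2, int(keep_fraction * observed_idx.size))
    # One row of candidate values per ensemble member.
    candidates = np.empty((n_members, int(missing.sum())))
    for m in range(n_members):
        subset = rng.choice(observed_idx, size=n_keep, replace=False)
        candidates[m] = pyramid_predict(x[subset], y[subset], x[missing])
    imputed = y.copy()
    imputed[missing] = candidates.mean(axis=0)  # assumed aggregation: the mean
    return imputed, candidates


# Toy usage: hide a few samples of a sine wave and impute them.
x = np.linspace(0.0, 10.0, 200)
truth = np.sin(x)
y = truth.copy()
y[[20, 60, 94, 140, 170]] = np.nan
imputed, candidates = ensemble_impute(x, y)
print("max abs imputation error:", np.abs(imputed - truth).max())
```

Each column of `candidates` plays the role of the vector of imputation options for one missing value; the mean is only one possible way to reduce that vector to a single imputed value.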
Bio:
Amit Shreiber is an M.Sc. student at the Department of Industrial Engineering at Tel Aviv University, specializing in Data Science. Amit holds a B.Sc. degree in Industrial Engineering and Management, Information Systems, from Ben-Gurion University of the Negev. Amit works as a data scientist at “Haaretz”. This work was conducted under the supervision of Dr. Neta Rabin.