Ensembles of two-directional Laplacian Pyramids for Data Imputation

Amit Shreiber, M.Sc.

Student at the Department of Industrial Engineering at Tel Aviv University

22 November 2022, 14:00 
Zoom & Room 206


Abstract:

Collected datasets usually contain some incomplete data. The literature offers several known methods for handling missing data, for example deleting observations that contain missing values, or replacing each missing value with the mean of the other observations of the relevant variable. These methods, and many others, have advantages but also limitations. Another technique for imputing missing values is based on Laplacian Pyramids, a multi-scale regression method. The Laplacian Pyramids approach constructs kernels of decreasing widths to learn a given dataset in a multi-scale manner. In practice, the kernels are constructed on both the row space and the column space of the dataset, so that each step of the algorithm approximates the complete data by considering the connections in both directions. This research developed two ensemble methods, based on the Laplacian Pyramid, for imputing missing values. Both methods generate multiple imputed realizations of the data: at least two candidate values are generated for each missing value, and the final imputed value is derived from the vector of candidates produced by this multiple imputation procedure. The first method shuffles the original dataset and creates new datasets by randomly choosing entries from the shuffled dataset; the Laplacian Pyramids are then applied to each new dataset. The second method draws new sub-datasets using the Hypergeometric distribution; once the new datasets are sampled, missing values are replaced in the same way as in the first method. The ensemble methods were examined on datasets of different dimensions, and all the missingness mechanisms were tested. In most cases, the proposed methods produce lower errors than existing imputation methods applied to the benchmark dataset.
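The multi-scale idea in the abstract can be illustrated with a minimal sketch. It is not the speaker's algorithm: it builds kernels on the row space only (the talk uses both rows and columns), and the ensemble step resamples rows uniformly without replacement, which merely stands in for the shuffle-based and Hypergeometric sampling schemes that the abstract does not specify. The function names `lp_impute_rows` and `ensemble_impute` and the parameters `sigma0`, `levels`, `n_members` and `frac` are all hypothetical.

```python
import numpy as np


def lp_impute_rows(X, sigma0=4.0, levels=5):
    """Fill NaNs with a row-direction, Laplacian-pyramid-style smoother (sketch)."""
    X = np.asarray(X, dtype=float)
    mask = ~np.isnan(X)                       # True where a value is observed
    Xz = np.where(mask, X, 0.0)

    # Squared row-to-row distances, averaged over columns observed in both rows.
    both = mask[:, None, :] & mask[None, :, :]
    diff2 = (Xz[:, None, :] - Xz[None, :, :]) ** 2
    counts = both.sum(axis=2)
    d2 = (diff2 * both).sum(axis=2) / np.maximum(counts, 1)
    d2[counts == 0] = np.inf                  # no shared columns: zero kernel weight

    approx = np.zeros_like(Xz)
    for level in range(levels):
        sigma = sigma0 / 2 ** level           # kernel width halves at every level
        K = np.exp(-d2 / sigma ** 2)
        # Residual of the current approximation, on observed entries only.
        residual = np.where(mask, X - approx, 0.0)
        num = K @ residual
        den = K @ mask.astype(float)
        # Kernel-weighted mean of the residuals; skip entries with no support.
        approx += np.divide(num, den, out=np.zeros_like(num), where=den > 1e-12)
    return np.where(mask, X, approx)          # observed values are kept unchanged


def ensemble_impute(X, n_members=10, frac=0.8, seed=0):
    """Average imputations over random row subsets (assumed sampling scheme)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    size = min(n, max(2, int(frac * n)))
    total = np.zeros_like(X)
    hits = np.zeros_like(X)
    for _ in range(n_members):
        rows = rng.choice(n, size=size, replace=False)
        total[rows] += lp_impute_rows(X[rows])
        hits[rows] += 1
    full = lp_impute_rows(X)                  # fallback for rows never sampled
    mean = np.divide(total, hits, out=full.copy(), where=hits > 0)
    return np.where(np.isnan(X), mean, X)
```

Calling `ensemble_impute(X)` on a NumPy array whose missing entries are `np.nan` returns an array of the same shape with the observed entries untouched. Averaging the candidates is only one way to combine them; the abstract says only that the imputed value is based on the vector of generated values.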

 

Bio:

Amit Shreiber is an M.Sc. student at the Department of Industrial Engineering at Tel Aviv University, specializing in Data Science. Amit holds a B.Sc. degree in Industrial Engineering and Management, Information Systems from Ben-Gurion University of the Negev. Amit works as a data scientist at “Haaretz”. This work was conducted under the supervision of Dr. Neta Rabin.
