Abstract
We propose a method for imputing missing values in large-scale matrix data based on a low-rank tensor approximation technique called the block tensor train (BTT) decomposition. Given sparsely observed data points, the proposed method iteratively computes the singular value decomposition (SVD) of the underlying data matrix with missing values. The SVD is computed via a low-rank BTT decomposition, which dramatically reduces storage and time complexity for large-scale data matrices that admit a low-rank tensor structure. An iterative soft-thresholding algorithm is implemented for missing data estimation, based on an alternating least squares method for BTT decomposition. Experimental results on simulated data and real benchmark data demonstrate that the proposed method can estimate a large number of missing values more accurately than a standard matrix-based method. The R source code of the BTT-based imputation method is available at https://github.com/namgillee/BTTSoftImpute.
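To make the iterative soft-thresholding idea concrete, the following is a minimal sketch of the matrix-based variant: fill the missing entries with the current estimate, take an SVD, shrink the singular values, and repeat. It is not the BTT-based method of the paper; a dense `numpy.linalg.svd` stands in for the BTT-based SVD, and the function name `soft_impute` and the parameters `lam`, `n_iter`, and `tol` are assumptions chosen for illustration.

```python
import numpy as np


def soft_impute(X, mask, lam, n_iter=200, tol=1e-6):
    """Illustrative matrix-based soft-thresholded SVD imputation.

    X    : 2-D array; values where mask is False are ignored (may be NaN)
    mask : boolean array of the same shape, True where X is observed
    lam  : threshold subtracted from each singular value (assumed name)
    """
    Z = np.zeros(X.shape, dtype=float)  # current low-rank estimate
    for _ in range(n_iter):
        # Keep observed entries; fill missing ones with the current estimate.
        filled = np.where(mask, X, Z)
        # Dense SVD here; the paper replaces this step with a BTT-based SVD.
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        s_thr = np.maximum(s - lam, 0.0)  # soft-threshold the singular values
        Z_new = (U * s_thr) @ Vt
        # Stop when the relative change in Frobenius norm is small.
        change = np.linalg.norm(Z_new - Z) / max(np.linalg.norm(Z), 1e-12)
        Z = Z_new
        if change < tol:
            break
    return Z


# Usage on a synthetic rank-2 matrix with roughly 40% of entries missing.
rng = np.random.default_rng(0)
M = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 20))
mask = rng.random(M.shape) < 0.6
X = np.where(mask, M, np.nan)
Z = soft_impute(X, mask, lam=0.1)
print("error on missing entries:", np.linalg.norm((Z - M)[~mask]))
```

The dense SVD costs time and memory that grow with the full matrix size, which is the bottleneck the abstract says the BTT representation is meant to reduce for matrices with low-rank tensor structure.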
Original language | English (US) |
---|---|
Pages (from-to) | 1283-1305 |
Number of pages | 23 |
Journal | Statistical Papers |
Volume | 59 |
Issue number | 4 |
State | Published - Dec 1 2018 |
Bibliographical note
Funding Information: This study was supported by a 2017 Research Grant from Kangwon National University and by a National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2017R1C1B5076912).
Publisher Copyright:
© 2018, Springer-Verlag GmbH Germany, part of Springer Nature.
Copyright:
Copyright 2018 Elsevier B.V., All rights reserved.
Keywords
- Imputation
- Multidimensional array
- Singular value decomposition
- Tensor network