TY - GEN
T1 - Just compress and relax: Handling missing values in big tensor analysis
T2 - 6th International Symposium on Communications, Control and Signal Processing, ISCCSP 2014
AU - Marcos, J. H.
AU - Sidiropoulos, Nikolaos
PY - 2014
Y1 - 2014
N2 - In applications of tensor analysis, missing data is an important issue that is usually handled via weighted least-squares fitting, imputation, or iterative expectation-maximization. The resulting algorithms are often cumbersome, and tend to fail when the percentage of missing samples is large. This paper proposes a novel and refreshingly simple approach for handling randomly missing values in big tensor analysis. The stepping stone is random multi-way tensor compression, which enables indirect tensor factorization via analysis of compressed 'replicas' of the big tensor. A Bernoulli model for the misses, and two opposite ends of the tensor modeling spectrum are considered: independent and identically distributed (i.i.d.) tensor elements, and low-rank (and in particular rank-one) tensors whose latent factors are i.i.d. In both cases, analytical results are established, showing that the tensor approximation error variance is inversely proportional to the number of available elements. Coupled with recent developments in robust CP decomposition, these results show that it is possible to ignore missing values without losing the ability to identify the underlying model.
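N1 - Editor's illustrative sketch (not part of the original record): a minimal NumPy example of the approach the abstract describes, i.e., treating Bernoulli-pattern misses as zeros and analyzing a randomly compressed replica of the big tensor. The rank-one test tensor, the dimensions, the 1/p rescaling, and the Gaussian compression matrices U, V, W are all hypothetical choices for illustration; the paper's actual estimators and its robust CP decomposition step are not reproduced here.

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical rank-one tensor X = outer(a, b, c) with i.i.d. latent factors,
# matching one of the two cases the abstract considers.
I, J, K = 50, 50, 50
a, b, c = rng.standard_normal(I), rng.standard_normal(J), rng.standard_normal(K)
X = np.einsum('i,j,k->ijk', a, b, c)

# Bernoulli(p) model for the misses: each entry is observed with probability p.
# "Just compress and relax": zero-fill the misses (rescaled by 1/p so the
# zero-filled tensor is an unbiased estimate of X) and compress anyway.
p = 0.3
mask = rng.random(X.shape) < p
Xz = np.where(mask, X, 0.0) / p

# Random multi-way compression: a small replica Y = Xz x1 U x2 V x3 W.
L, M, N = 10, 10, 10
U = rng.standard_normal((L, I)) / np.sqrt(I)
V = rng.standard_normal((M, J)) / np.sqrt(J)
W = rng.standard_normal((N, K)) / np.sqrt(K)
Y = np.einsum('li,mj,nk,ijk->lmn', U, V, W, Xz)

# Compressed replica of the full tensor, for reference; the gap between the
# two replicas shrinks as the fraction of available elements grows.
Y_full = np.einsum('li,mj,nk,ijk->lmn', U, V, W, X)
err = np.linalg.norm(Y - Y_full) / np.linalg.norm(Y_full)
print(f"observed fraction p = {p:.2f}, relative compression error = {err:.3f}")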
KW - CANDECOMP/PARAFAC
KW - tensor decomposition
KW - big data
KW - imputation
KW - missing elements
KW - missing values
KW - multi-way arrays
KW - tensor completion
UR - http://www.scopus.com/inward/record.url?scp=84906751229&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84906751229&partnerID=8YFLogxK
U2 - 10.1109/ISCCSP.2014.6877854
DO - 10.1109/ISCCSP.2014.6877854
M3 - Conference contribution
AN - SCOPUS:84906751229
SN - 9781479928903
T3 - ISCCSP 2014 - 2014 6th International Symposium on Communications, Control and Signal Processing, Proceedings
SP - 218
EP - 221
BT - ISCCSP 2014 - 2014 6th International Symposium on Communications, Control and Signal Processing, Proceedings
PB - IEEE Computer Society
Y2 - 21 May 2014 through 23 May 2014
ER -