Abstract
In statistical analysis, researchers often perform coordinatewise Gaussianization so that each variable is marginally normal. The normal score transformation is a method for coordinatewise Gaussianization and is widely used in statistics, econometrics, genetics and other areas. However, few studies exist on the theoretical properties of the normal score transformation, especially in high-dimensional problems where the dimension p diverges with the sample size n. In this article, we show that the normal score transformation uniformly converges to its population counterpart even when the dimension p diverges with the sample size n. Our result can justify applying the normal score transformation prior to any downstream statistical method for which the theoretical normal transformation is beneficial. The same results are established for the Winsorized normal transformation, another popular choice for coordinatewise Gaussianization. We demonstrate the benefits of coordinatewise Gaussianization by studying its applications to the Gaussian copula model, the nearest shrunken centroids classifier and distance correlation. The benefits are clearly shown in theory and supported by numerical studies. Moreover, we also point out scenarios where coordinatewise Gaussianization does not help and can even be harmful. We offer a general recommendation on how to use coordinatewise Gaussianization in applications. Supplementary materials for this article are available online.
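To make the abstract concrete, the following is a minimal sketch of the coordinatewise normal score transformation: each variable is replaced by the standard normal quantile of its within-column rank, so every coordinate is approximately marginally standard normal. The n + 1 rank denominator and the tie handling are illustrative conventions, not necessarily the exact normalization used in the article.

```python
import numpy as np
from scipy.stats import norm, rankdata


def normal_score_transform(X):
    """Coordinatewise normal score transformation (illustrative sketch).

    Each column of the n x p matrix X is mapped to
    Phi^{-1}(rank / (n + 1)), making every variable approximately
    marginally standard normal while preserving within-column ranks.
    """
    X = np.asarray(X, dtype=float)
    n, _ = X.shape
    # ranks lie in {1, ..., n}; dividing by n + 1 keeps them strictly inside (0, 1)
    ranks = np.apply_along_axis(rankdata, 0, X)
    return norm.ppf(ranks / (n + 1))


# Example usage: heavy-tailed data become marginally (approximately) normal.
rng = np.random.default_rng(0)
X = rng.standard_t(df=2, size=(200, 5))  # heavy-tailed columns
Z = normal_score_transform(X)
print(Z.mean(axis=0), Z.std(axis=0))     # roughly 0 and 1 per column
```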
| Original language | English (US) |
|---|---|
| Pages (from-to) | 2329-2343 |
| Number of pages | 15 |
| Journal | Journal of the American Statistical Association |
| Volume | 118 |
| Issue number | 544 |
| DOIs | |
| State | Published - 2023 |
Bibliographical note
Publisher Copyright: © 2022 American Statistical Association.
Keywords
- Gaussianization
- Gaussianized distance correlation
- Heavy tails
- Nearest shrunken centroids classifier
- Normal score transformation