TY - JOUR
T1 - Combining self-supervision and privileged information for representation learning from tabular data
AU - Yang, Haoyu
AU - Steinbach, Michael
AU - Melton, Genevieve
AU - Kumar, Vipin
AU - Simon, Gyorgy
N1 - Publisher Copyright: © The Author(s) 2025.
PY - 2025/8
Y1 - 2025/8
AB - When building predictive models for real-world applications, much data is discarded because conventional learning algorithms cannot utilize it, although such data could be very informative. This paper focuses on representation learning using two types of additional data: privileged information (PI) and unlabeled data. PI refers to data that is available during training but not at test time. Existing methods transfer the knowledge embedded in PI via supervised mechanisms, making them unable to use unlabeled data. In contrast, self-supervised learning methods can use unlabeled data but cannot learn from PI. Although these techniques appear complementary, we demonstrate that combining them is non-trivial. This paper introduces the privileged information regularized (PIReg) self-supervised learning framework, which utilizes both PI and unlabeled data to learn better representations.
KW - Health care
KW - Privileged information
KW - Representation learning
KW - Self-supervised learning
UR - https://www.scopus.com/pages/publications/105003951413
UR - https://www.scopus.com/inward/citedby.url?scp=105003951413&partnerID=8YFLogxK
U2 - 10.1007/s10115-025-02418-1
DO - 10.1007/s10115-025-02418-1
M3 - Article
AN - SCOPUS:105003951413
SN - 0219-1377
VL - 67
SP - 6907
EP - 6935
JO - Knowledge and Information Systems
JF - Knowledge and Information Systems
IS - 8
ER -