Abstract
We investigate graph-based Laplacian semi-supervised learning at low labeling rates (ratios of labeled to total number of data points) and establish a threshold for the learning to be well posed. Laplacian learning uses harmonic extension on a graph to propagate labels. It is known that when the number of labeled data points is finite while the number of unlabeled data points tends to infinity, the Laplacian learning becomes degenerate and the solutions become roughly constant with a spike at each labeled data point. In this work, we allow the number of labeled data points to grow to infinity as the total number of data points grows. We show that for a random geometric graph with length scale ε > 0, if the labeling rate β ≪ ε^2, then the solution becomes degenerate and spikes form. On the other hand, if β ≫ ε^2, then Laplacian learning is well-posed and consistent with a continuum Laplace equation. Furthermore, in the well-posed setting we prove quantitative error estimates of O(εβ^{-1/2}) for the difference between the solutions of the discrete problem and continuum PDE, up to logarithmic factors. We also study p-Laplacian regularization and show the same degeneracy result when β ≪ ε^p. The proofs of our well-posedness results use the random walk interpretation of Laplacian learning and PDE arguments, while the proofs of the ill-posedness results use Γ-convergence tools from the calculus of variations. We also present numerical results on synthetic and real data to illustrate our results.
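As an illustration of the harmonic extension described in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes an ε-ball random geometric graph with unit edge weights, a dense NumPy solve, and that every connected component of the graph contains at least one labeled point (otherwise the linear system is singular). The function name `laplace_learning` and all parameter names are hypothetical choices made for this example.

```python
import numpy as np


def laplace_learning(X, labeled_idx, labels, eps):
    """Harmonic extension of labels on an eps-ball graph (illustrative sketch).

    Assumptions (not taken from the paper): unit edge weights, dense linear
    algebra, and a labeled point in every connected component of the graph.
    """
    X = np.asarray(X, dtype=float)
    labeled_idx = np.asarray(labeled_idx)
    n = X.shape[0]

    # Connect two distinct points whenever their distance is at most eps.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    W = (d2 <= eps ** 2).astype(float)
    np.fill_diagonal(W, 0.0)

    # Unnormalized graph Laplacian L = D - W.
    L = np.diag(W.sum(axis=1)) - W

    is_labeled = np.zeros(n, dtype=bool)
    is_labeled[labeled_idx] = True
    unlabeled_idx = np.flatnonzero(~is_labeled)

    u = np.zeros(n)
    u[labeled_idx] = labels  # labels act as Dirichlet boundary values

    # Require (L u)_i = 0 at every unlabeled node i:
    #   L_UU u_U = -L_UL u_L
    L_UU = L[np.ix_(unlabeled_idx, unlabeled_idx)]
    L_UL = L[np.ix_(unlabeled_idx, labeled_idx)]
    u[unlabeled_idx] = np.linalg.solve(L_UU, -L_UL @ u[labeled_idx])
    return u


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, m, eps = 300, 30, 0.2            # total points, labeled points, length scale
    X = rng.random((n, 2))              # uniform samples in the unit square
    labeled_idx = np.arange(m)
    labels = (X[labeled_idx, 0] > 0.5).astype(float)

    u = laplace_learning(X, labeled_idx, labels, eps)
    beta = m / n                        # labeling rate
    print(f"labeling rate beta = {beta:.3f}, eps^2 = {eps ** 2:.3f}")
    print(f"solution range: [{u.min():.3f}, {u.max():.3f}]")
```

In this toy setup the labeling rate is β = m/n, which can be compared with ε^2 to see which side of the threshold stated in the abstract a given configuration falls on. Because the solution is harmonic at unlabeled nodes, its values stay between the smallest and largest label.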
Original language | English (US) |
---|---|
Article number | 10 |
Journal | Research in Mathematical Sciences |
Volume | 10 |
Issue number | 1 |
DOIs | |
State | Published - Mar 2023 |
Bibliographical note
Funding Information: JC was supported by NSF DMS Grant 1713691 and is grateful for the hospitality of the Center for Nonlinear Analysis at Carnegie Mellon University, and to Marta Lewicka for helpful discussions. DS is grateful to NSF for support via grant DMS-1814991. MT is grateful for the hospitality of the Center for Nonlinear Analysis at Carnegie Mellon University and the School of Mathematics at the University of Minnesota, and for the support of the Cantab Capital Institute for the Mathematics of Information and Cambridge Image Analysis at the University of Cambridge. MT has also received funding from the European Research Council under the European Union’s Horizon 2020 research and innovation programme, grant agreement No 777826 (NoMADS) and grant agreement No 647812.
Publisher Copyright:
© 2023, The Author(s), under exclusive licence to Springer Nature Switzerland AG.
Keywords
- Asymptotic consistency
- Gamma-convergence
- Non-local variational problems
- PDEs on graphs
- Random walks on graphs
- Regression
- Semi-supervised learning