Network cross-validation by edge sampling

Tianxi Li, Elizaveta Levina, Ji Zhu

Research output: Contribution to journal › Article › peer-review

73 Scopus citations

Abstract

While many statistical models and methods are now available for network analysis, resampling of network data remains a challenging problem. Cross-validation is a useful general tool for model selection and parameter tuning, but it is not directly applicable to networks since splitting network nodes into groups requires deleting edges and destroys some of the network structure. In this paper we propose a new network resampling strategy, based on splitting node pairs rather than nodes, that is applicable to cross-validation for a wide range of network model selection tasks. We provide theoretical justification for our method in a general setting and examples of how the method can be used in specific network model selection and parameter tuning tasks. Numerical results on simulated networks and on a statisticians' citation network show that the proposed cross-validation approach works well for model selection.
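
The idea of splitting node pairs rather than nodes can be illustrated with a minimal sketch. It is not the authors' implementation: the helper name `split_node_pairs`, the 10% hold-out fraction, and the toy random network are all assumptions made for illustration. It only shows that holding out node pairs leaves every node in the training data; the model fitted on the remaining pairs is left unspecified.

```python
import numpy as np


def split_node_pairs(n, holdout_frac=0.1, seed=0):
    """Return a symmetric boolean mask marking held-out node pairs.

    Hypothetical helper: pairs (i, j) with i < j are sampled uniformly
    without replacement, so no node is removed from the network.
    """
    rng = np.random.default_rng(seed)
    rows, cols = np.triu_indices(n, k=1)      # all unordered pairs i < j
    n_holdout = int(round(holdout_frac * rows.size))
    chosen = rng.choice(rows.size, size=n_holdout, replace=False)
    holdout = np.zeros((n, n), dtype=bool)
    holdout[rows[chosen], cols[chosen]] = True
    return holdout | holdout.T                # undirected: mirror the mask


# Toy undirected network (assumed data, for illustration only).
n = 30
rng = np.random.default_rng(1)
upper = np.triu(rng.random((n, n)) < 0.2, k=1)
A = (upper | upper.T).astype(int)

holdout = split_node_pairs(n, holdout_frac=0.1)
train = ~holdout & ~np.eye(n, dtype=bool)     # observed pairs, no self-loops

# A candidate model would be fitted on A[train] and scored on A[holdout].
print("held-out pairs:", holdout.sum() // 2)
print("fewest training pairs for any node:", train.sum(axis=1).min())
```

Repeating the split with different seeds and averaging the held-out loss over candidate models or tuning parameters would give a cross-validation loop in the usual sense.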

Original language: English (US)
Pages (from-to): 257-276
Number of pages: 20
Journal: Biometrika
Volume: 107
Issue number: 2
DOIs
State: Published - Jun 1 2020
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 2020 Biometrika Trust.

Keywords

  • Cross-validation
  • Model selection
  • Parameter tuning
  • Random network
