Finding frequent patterns in a large sparse graph

Michikro Kuramochi, George Karypis

Research output: Contribution to journalArticle

226 Scopus citations

Abstract

Graph-based modeling has emerged as a powerful abstraction capable of capturing in a single and unified framework many of the relational, spatial, topological, and other characteristics that are present in a variety of datasets and application areas. Computationally efficient algorithms that find patterns corresponding to frequently occurring subgraphs play an important role in developing data mining-driven methodologies for analyzing the graphs resulting from such datasets. This paper presents two algorithms, based on the horizontal and vertical pattern discovery paradigms, that find the connected subgraphs that have a sufficient number of edge-disjoint embeddings in a single large undirected labeled sparse graph. These algorithms use three different methods for determining the number of edge-disjoint embeddings of a subgraph and employ novel algorithms for candidate generation and frequency counting, which allow them to operate on datasets with different characteristics and to quickly prune unpromising subgraphs. Experimental evaluation on real datasets from various domains show that both algorithms achieve good performance, scale well to sparse input graphs with more than 120,000 vertices or 110,000 edges, and significantly outperform previously developed algorithms.

Original languageEnglish (US)
Pages (from-to)243-271
Number of pages29
JournalData Mining and Knowledge Discovery
Volume11
Issue number3
DOIs
StatePublished - Nov 1 2005

Keywords

  • Frequent subgraph
  • Graph mining
  • Pattern discovery

Fingerprint Dive into the research topics of 'Finding frequent patterns in a large sparse graph'. Together they form a unique fingerprint.

  • Cite this