Out-of-core coherent closed quasi-clique mining from large dense graph databases

Zhiping Zeng, Jianyong Wang, Lizhu Zhou, George Karypis

Research output: Contribution to journal › Article › peer-review

68 Scopus citations

Abstract

Because graphs can represent more generic and more complicated relationships among different objects, graph mining has played a significant role in data mining and has attracted increasing attention in the data mining community. In addition, frequent coherent subgraphs can provide valuable knowledge about the underlying internal structure of a graph database, and mining frequently occurring coherent subgraphs from large dense graph databases has found several applications and recently received considerable attention in the graph mining community. In this article, we study how to efficiently mine the complete set of coherent closed quasi-cliques from large dense graph databases, an especially challenging task because the downward-closure property no longer holds. By fully exploring some properties of quasi-cliques, we propose several novel optimization techniques that effectively prune unpromising and redundant subsearch spaces. We also devise an efficient closure checking scheme to facilitate the discovery of closed quasi-cliques only. Since large databases cannot be held in main memory, we further design an out-of-core solution with efficient index structures for mining coherent closed quasi-cliques from large dense graph databases. We call this solution Cocain*. A thorough performance study shows that Cocain* is very efficient and scalable for large dense graph databases.
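
The remark that the downward-closure property no longer holds can be illustrated with a small sketch. It assumes the common degree-based definition, under which a vertex set is a γ-quasi-clique if every vertex has at least ⌈γ·(n − 1)⌉ neighbours inside the set of n vertices; the abstract itself does not state the definition, and the function name `is_quasi_clique` is hypothetical rather than part of Cocain*.

```python
import math

def is_quasi_clique(vertices, edges, gamma):
    """Check the assumed degree-based gamma-quasi-clique condition.

    vertices: iterable of vertex labels
    edges: iterable of undirected (u, w) pairs, each listed once
    gamma: threshold in [0, 1]
    """
    vs = set(vertices)
    n = len(vs)
    if n == 0:
        return False
    # Minimum number of neighbours each vertex needs inside the set.
    need = math.ceil(gamma * (n - 1))
    degree = {v: 0 for v in vs}
    # Count only edges whose endpoints both lie in the vertex set
    # (the induced subgraph), ignoring self-loops.
    for u, w in edges:
        if u in vs and w in vs and u != w:
            degree[u] += 1
            degree[w] += 1
    return all(d >= need for d in degree.values())

# A 4-cycle a-b-c-d-a: every vertex has 2 of the 3 possible neighbours.
cycle = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]

whole = is_quasi_clique(["a", "b", "c", "d"], cycle, 0.5)
part = is_quasi_clique(["a", "c"], cycle, 0.5)
```

Under this assumed definition, the full 4-cycle satisfies the condition for γ = 0.5 while its subset {a, c} does not, since a and c are not adjacent. A subset of a quasi-clique therefore need not be a quasi-clique, so the Apriori-style pruning that works for cliques cannot be applied directly.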

Original language: English (US)
Article number: 1242530
Journal: ACM Transactions on Database Systems
Volume: 32
Issue number: 2
State: Published - Jun 1 2007

Keywords

  • Coherent subgraph
  • Frequent closed subgraph
  • Graph mining
  • Out-of-core algorithm
  • Quasi-clique
