TY - GEN
T1 - A multi-level parallel implementation of a program for finding frequent patterns in a large sparse graph
AU - Reinhardt, Steve
AU - Karypis, George
PY - 2007
Y1 - 2007
N2 - Graphs capture the essential elements of many problems broadly defined as searching or categorizing. With the rapid increase of data volumes from sensors, many application disciplines need to process larger graphs quickly. This paper presents the results of parallelizing with OpenMP an algorithm that finds, in a single large labeled undirected sparse graph, the connected subgraphs with a given minimum number of edge-disjoint embeddings. Parallelism is exploited at two levels in the algorithm. The lack of a priori knowledge of the extent of parallelism for a given input required use of a dynamic, multi-level approach based on the proposed OpenMP taskq/task extensions. The parallel implementation required the addition of 21 directives and about 50 accompanying lines of code, in an original code of about 15,000 lines. Experimental results show excellent speed-up to 30 processors for the graphs used, with a best speed-up of 26.1 compared to the serial version. The taskq/task constructs show promise for problems exhibiting unstructured parallelism.
AB - Graphs capture the essential elements of many problems broadly defined as searching or categorizing. With the rapid increase of data volumes from sensors, many application disciplines need to process larger graphs quickly. This paper presents the results of parallelizing with OpenMP an algorithm that finds, in a single large labeled undirected sparse graph, the connected subgraphs with a given minimum number of edge-disjoint embeddings. Parallelism is exploited at two levels in the algorithm. The lack of a priori knowledge of the extent of parallelism for a given input required use of a dynamic, multi-level approach based on the proposed OpenMP taskq/task extensions. The parallel implementation required the addition of 21 directives and about 50 accompanying lines of code, in an original code of about 15,000 lines. Experimental results show excellent speed-up to 30 processors for the graphs used, with a best speed-up of 26.1 compared to the serial version. The taskq/task constructs show promise for problems exhibiting unstructured parallelism.
KW - Data mining
KW - Frequent subgraph
KW - OpenMP
KW - Parallel processing
KW - Pattern discovery
KW - Unstructured parallelism
UR - http://www.scopus.com/inward/record.url?scp=34548787729&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=34548787729&partnerID=8YFLogxK
U2 - 10.1109/IPDPS.2007.370404
DO - 10.1109/IPDPS.2007.370404
M3 - Conference contribution
AN - SCOPUS:34548787729
SN - 1424409101
SN - 9781424409105
T3 - Proceedings - 21st International Parallel and Distributed Processing Symposium, IPDPS 2007; Abstracts and CD-ROM
BT - Proceedings - 21st International Parallel and Distributed Processing Symposium, IPDPS 2007; Abstracts and CD-ROM
T2 - 21st International Parallel and Distributed Processing Symposium, IPDPS 2007
Y2 - 26 March 2007 through 30 March 2007
ER -