Please wait a minute...
Frontiers of Computer Science

ISSN 2095-2228

ISSN 2095-2236(Online)

CN 10-1014/TP

Postal Subscription Code 80-970

2018 Impact Factor: 1.129

Front. Comput. Sci.    2018, Vol. 12 Issue (6) : 1208-1219    https://doi.org/10.1007/s11704-016-6168-0
RESEARCH ARTICLE
Strongly connected components based efficient computation of page rank
Hongguo YANG(), Derong SHEN, Yue KOU, Tiezhen NIE, Ge YU
School of Computer Science and Engineering, Northeastern University, Shenyang 110004, China
 Download: PDF(569 KB)  
 Export: BibTeX | EndNote | Reference Manager | ProCite | RefWorks
Abstract

In this paper, an efficient page rank (PR) exact algorithm is proposed, which can improve the computation efficiency without sacrificing results accuracy. The existing exact algorithms are generally based on the original power method (PM). In order to reduce the number of I/Os required to improve efficiency, they partition the big graph into multiple smaller ones that can be totally fitted in memory. The algorithmproposed in this paper can further reduce the required number of I/Os. Instead of partitioning the graph into the general subgraphs, our algorithm partitions graph into a special kind of subgraphs: SCCs (strongly connected components), the nodes in which are reachable to each other. By exploiting the property of SCC, some theories are proposed, based on which the computation iterations can be constrained on these SCC subgraphs. Our algorithm can reduce lots of I/Os and save a large amount of computations, as well as keeping the results accuracy. In a word, our algorithm is more efficient among the existing exact algorithms. The experiments demonstrate that the algorithms proposed in this paper can make an obvious efficiency improvement and can attain high accurate results.

Keywords page rank      strongly connected component      power method      I/Os     
Corresponding Author(s): Hongguo YANG   
Just Accepted Date: 05 July 2016   Online First Date: 28 August 2017    Issue Date: 04 December 2018
 Cite this article:   
Hongguo YANG,Derong SHEN,Yue KOU, et al. Strongly connected components based efficient computation of page rank[J]. Front. Comput. Sci., 2018, 12(6): 1208-1219.
 URL:  
https://academic.hep.com.cn/fcs/EN/10.1007/s11704-016-6168-0
https://academic.hep.com.cn/fcs/EN/Y2018/V12/I6/1208
1 Mihalcea R, Tarau P. Textrank: bringing order into texts. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing. 2004
2 Page L, Brin S, Motwani R, Winograd T. The pagerank citation ranking: bringing order to the web. Stanford University Technical Report, 1999
3 Mitliagkas I, Borokhovich M, Dimakis A G, Caramanis C. Frogwild!: fast pagerank approximations on graph engines. Proceedings of the VLDB Endowment, 2015, 8(8): 874–885
https://doi.org/10.14778/2757807.2757812
4 Avrachenkov K, Litvak N, Nemirovsky D, Osipova N. Monte carlo methods in pagerank computation: when one iteration is sufficient. SIAM Journal on Numerical Analysis, 2005, 45(2): 890–904
https://doi.org/10.1137/050643799
5 Zhou Y, Liu L, Lee K, Zhang Q. Graph Twist: fast iterative graph computation with two-tier optimizations. Proceedings of the VLDB Endowment, 2015, 8(11): 1262–1273
https://doi.org/10.14778/2809974.2809987
6 Gonzalez J E, Xin R S, Dave A, Crankshaw D, Franklin M J, Stoica I. Graphx: graph processing in a distributed dataflow framework. In: Proceedings of the 11th USENIX Symposium on Operating Systems Design and Implementation. 2014, 599–613
7 Roy A, Mihailovic I, Zwaenepoel W. X-stream: edge-centric graph processing using streaming partitions. In: Proceedings of the 24th ACM Symposium on Operating Systems Principles. 2013, 472–488
https://doi.org/10.1145/2517349.2522740
8 Haveliwala T. Efficient computation of pagerank. Stanford University Technical Report, 2002
9 Shao B, Wang H X, Li Y T. Trinity: a distributed graph engine on a memory cloud. In: Proceedings of the ACM SIGMOD International Conference on Management of Data. 2013, 505–516
https://doi.org/10.1145/2463676.2467799
10 Gonzalez J E, Low Y, Gu H, Bickson D, Guestrin C. Powergraph: distributed graph-parallel computation on natural graphs. In: Proceedings of the 10th USENIX Symposium on Operating Systems Design and Implementation. 2012, 17–30
11 Brinkmeier M. Distributed calculation of pagerank using strongly connected components. In: Proceedings of International Workshop on Innovative Internet Community Systems. 2005, 29–40
12 Engström C, Silvestrov S. A componentwise pagerank algorithm. In: Proceedings of the 16th Applied Stochastic Models and Data Analysis International Conference with Demographics 2015 Workshop. 2015, 185–198
13 Xie W L, Wang G Z, Bindel D, Demers A, Gehrke J. Fast iterative graph computation with block updates. Proceedings of the VLDB Endowment, 2013, 6(14): 2014–2025
https://doi.org/10.14778/2556549.2556581
14 Kyrölä A, Blelloch G, Guestrin C. Large-scale graph computation on just a PC. In: Proceedings of the 10th USENIX Symposium on Operating Systems Design and Implementation. 2012, 31–46
15 Kohlschütter C, Chirita P A, Nejdl W. Efficient parallel computation of pagerank. In: Proceedings of the 28th European Conference on Information Retrieval. 2006, 241–252
https://doi.org/10.1007/11735106_22
16 Kamvar S D, Haveliwala T H, Manning C D, Golub G H. Extrapolation methods for accelerating pagerank computations. In: Proceedings of the 12th International Conference on World Wide Web. 2003, 261–270
https://doi.org/10.1145/775152.775190
17 Kamvar S D, Haveliwala T H, Manning C D, Golub G H. Exploiting the block structure of the web for computing pagerank. Stanford University Technical Report, 2003
18 Custódio A L, Rocha H, Vicente L N. Incorporating minimum frobenius norm models in direct search. Computational Optimization and Applications, 2010, 46(2): 265–278
https://doi.org/10.1007/s10589-009-9283-0
19 Nuutila E, Soisalon-Soininen E. On finding the strongly connected components in a directed graph. Information Processing Letters, 1994, 49(1): 9–14
https://doi.org/10.1016/0020-0190(94)90047-7
[1] Supplementary Material 1 Download
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed