Computing infrastructure for big data processing
Ling LIU
Distributed Data Intensive Systems Lab, School of Computer Science, Georgia Institute of Technology, Atlanta 30332, USA
Abstract Computing systems have undergone a fundamental transformation, from single-processor devices at the turn of the century to today's ubiquitous networked devices and warehouse-scale computing via the cloud, and parallelism has become ubiquitous at many levels. At the micro level, parallelism is being explored from the underlying circuits, through pipelining and instruction-level parallelism, to multiple or many cores on a chip and within a machine. At the macro level, parallelism is being promoted from multiple machines on a rack, to many racks in a data center, to the globally shared infrastructure of the Internet. With the push of big data, we are entering a new era of parallel computing driven by novel and groundbreaking research innovation on elastic parallelism and scalability. In this paper, we give an overview of computing infrastructure for big data processing, focusing on the architectural, storage, and networking challenges of supporting big data processing. We briefly discuss emerging computing infrastructure and technologies that are promising for improving data parallelism and task parallelism and for encouraging vertical and horizontal computation parallelism.

Keywords
big data
cloud computing
data analytics
elastic scalability
heterogeneous computing
GPU
PCM
big data processing

Corresponding Author(s):
LIU Ling, Email: lingliu@cc.gatech.edu

Issue Date: 01 April 2013