Frontiers of Information Technology & Electronic Engineering

ISSN 2095-9184

Front. Inform. Technol. Electron. Eng, 2018, Vol. 19, Issue 10: 1224-1229. https://doi.org/10.1631/FITEE.1800424
Perspectives
Exploring high-performance processor architecture beyond the exascale
Xiang-hui XIE, Xun JIA
State Key Laboratory of Mathematical Engineering and Advanced Computing, Wuxi 214125, China
Abstract

The ever-increasing need for high performance in scientific computation and engineering applications will push high-performance computing beyond the exascale. As an integral part of a supercomputing system, high-performance processors and their architecture designs are crucial in improving system performance. In this paper, three architecture design goals for high-performance processors beyond the exascale are introduced: effective performance scaling, efficient resource utilization, and adaptation to diverse applications. A high-performance many-core processor architecture with scalar processing and application-specific acceleration (Massa) is then proposed, which aims to achieve these three goals by employing distributed computational resources and application-customized hardware. Finally, some future research directions regarding the Massa architecture are discussed.

Keywords: High-performance computing; Beyond the exascale; Processor architecture; Application-customized hardware; Distributed computational resources
Corresponding Author(s): Xiang-hui XIE, Xun JIA
Issue Date: 03 December 2018
 Cite this article:   
Xiang-hui XIE, Xun JIA. Exploring high-performance processor architecture beyond the exascale[J]. Front. Inform. Technol. Electron. Eng, 2018, 19(10): 1224-1229.
 URL:  
https://academic.hep.com.cn/fitee/EN/10.1631/FITEE.1800424
https://academic.hep.com.cn/fitee/EN/Y2018/V19/I10/1224