Frontiers of Computer Science


Front. Comput. Sci., 2014, Vol. 8, Issue (3): 357-366. https://doi.org/10.1007/s11704-014-3500-9
RESEARCH ARTICLE
The TH Express high performance interconnect networks
Zhengbin PANG 1,2; Min XIE 2; Jun ZHANG 2; Yi ZHENG 2; Guibin WANG 2,*; Dezun DONG 1,2; Guang SUO 2
1. Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Changsha 410073, China
2. College of Computer, National University of Defense Technology, Changsha 410073, China
Abstract

Interconnection networks play an important role in scalable high performance computing (HPC) systems. The TH Express-2 interconnect has been used in the MilkyWay-2 system to provide high-bandwidth, low-latency interprocessor communication, and continuous effort is devoted to the development of our proprietary interconnect. This paper describes the state of the art of our proprietary interconnect, with particular emphasis on the design of the network interface. Several key features are introduced, including user-level communication, remote direct memory access (RDMA), offloaded collective operations, and hardware-supported reliable end-to-end communication. The designs of a low-level message passing infrastructure and upper-level message passing services are also presented. Preliminary performance results demonstrate the efficiency of the TH interconnect interface.
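The offloaded collective operations mentioned above allow the network interface to progress a collective while the host processor keeps computing. The following minimal sketch illustrates that overlap pattern using standard MPI-3 non-blocking collectives (MPI_Iallreduce); it does not use the proprietary TH Express interface, and the dummy workload loop is purely illustrative.

/* Sketch: overlapping computation with a non-blocking allreduce, the
 * usage pattern that NIC-offloaded collectives are designed to speed up.
 * Standard MPI-3 calls only; not the TH Express low-level interface. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double local = rank + 1.0, global = 0.0;
    MPI_Request req;

    /* Start the reduction; an offload-capable NIC can progress it in
     * hardware while the CPU executes the loop below. */
    MPI_Iallreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM,
                   MPI_COMM_WORLD, &req);

    /* Independent computation overlapped with the in-flight collective
     * (hypothetical workload). */
    double work = 0.0;
    for (int i = 0; i < 1000000; i++) work += i * 1e-9;

    /* Complete the collective before using its result. */
    MPI_Wait(&req, MPI_STATUS_IGNORE);

    if (rank == 0) printf("sum = %g, work = %g\n", global, work);

    MPI_Finalize();
    return 0;
}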

Keywords: HPC; network interface chip (NIC); TH Express interconnect; offload collective operation
Corresponding Author(s): Guibin WANG   
Issue Date: 24 June 2014
 Cite this article:   
Zhengbin PANG, Min XIE, Jun ZHANG, et al. The TH Express high performance interconnect networks[J]. Front. Comput. Sci., 2014, 8(3): 357-366.
 URL:  
https://academic.hep.com.cn/fcs/EN/10.1007/s11704-014-3500-9
https://academic.hep.com.cn/fcs/EN/Y2014/V8/I3/357