Frontiers of Computer Science

ISSN 2095-2228

ISSN 2095-2236(Online)

CN 10-1014/TP

Postal distribution code 80-970

2019 Impact Factor: 1.275

Frontiers of Computer Science  2017, Vol. 11 Issue (5): 746-761   https://doi.org/10.1007/s11704-016-6159-1
A survey of neural network accelerators
Zhen LI1, Yuqing WANG2, Tian ZHI1, Tianshi CHEN1
1. State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
2. School of Computer Science and Technology, University of Science and Technology of China, Hefei 230026, China
Abstract

Machine-learning techniques have recently proved successful in various domains, especially in emerging commercial applications. Among these techniques, artificial neural networks (ANNs), which require a considerable amount of computation and memory, are some of the most popular algorithms and have been applied in a broad range of applications such as speech recognition, face identification, and natural language processing. Conventional CPUs and GPUs are the most straightforward platforms for running ANNs, but they are energy-inefficient because of the excessive effort they spend on flexibility. In response, many researchers have in recent years proposed neural network accelerators to achieve high performance and low power consumption. The main purpose of this survey is to briefly review recent related work, as well as the DianNao-family accelerators. In summary, this review can serve as a reference for hardware researchers in the area of neural networks.

Key words: neural networks, accelerators, FPGAs, ASICs, DianNao series
Received: 2016-03-18      Published: 2017-09-26
Corresponding Author(s): Tianshi CHEN   
Cite this article:
Zhen LI, Yuqing WANG, Tian ZHI, Tianshi CHEN. A survey of neural network accelerators. Front. Comput. Sci., 2017, 11(5): 746-761.
Link to this article:
https://academic.hep.com.cn/fcs/CN/10.1007/s11704-016-6159-1
https://academic.hep.com.cn/fcs/CN/Y2017/V11/I5/746