Low Power and Efficient Re-Configurable Multiplier for Accelerator

Nikitha Reddy N; Gogula Subash; Hemaditya P; Maran Ponnambalam

doi:10.34256/ijcci2221

Nikitha Reddy N, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Chennai, Tamil Nadu 601103, India
Gogula Subash, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Chennai, Tamil Nadu 601103, India
Hemaditya P, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Chennai, Tamil Nadu 601103, India
Maran Ponnambalam, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Chennai, Tamil Nadu 601103, India

Keywords: CNN, VHDL, Xilinx, FPGA, Multiplier, Accelerator

Abstract

Deep learning is a rapidly growing field at the leading edge of technology, with applications in many areas of our lives, including object detection, speech recognition, and natural language processing. Its advantages of high accuracy, speed, and flexibility are now being exploited in practically all major sciences and technologies. As a result, any effort to improve the performance of related techniques is worthwhile. We tend to generate data faster than we can analyse, comprehend, transfer, and reconstruct it. Demanding data-intensive applications such as Big Data, Deep Learning, Machine Learning (ML), the Internet of Things (IoT), and high-speed computing are driving the demand for "accelerators" to offload work from general-purpose CPUs. An accelerator (a hardware device) works in tandem with the CPU server to improve data processing speed and performance. A variety of off-the-shelf accelerator architectures are available, including GPU, ASIC, and FPGA architectures. This work therefore focuses on designing a multiplier unit for such accelerators. The design increases the performance of deep neural networks (DNNs), reduces area, and increases the training speed of the system.
