Vol. 3 No. 1 (2024): The QUEST: Journal of Multidisciplinary Research and Development

Circuit-level Optimization for Machine Learning: Enhancing Efficiency and Performance in Neural Network Accelerators

Genesis Tumbaga
Dr. Emilio B. Espinosa Sr. Memorial State College of Agriculture and Technology
Gajil Santos
Western Mindanao State University

Published 03/28/2024

Keywords

  • Binary Neural Networks
  • Current-Mode Logic
  • Neural network accelerators

How to Cite

Tumbaga, G., & Santos, G. (2024). Circuit-level Optimization for Machine Learning: Enhancing Efficiency and Performance in Neural Network Accelerators. The QUEST: Journal of Multidisciplinary Research and Development, 3(1). https://doi.org/10.60008/thequest.v3i1.130

Abstract

This study reviews recent advances in neural network accelerator design aimed at improving energy efficiency and performance. Two approaches are presented: an all-digital deep learning inference accelerator for Binary Neural Networks (BNNs) that achieves high energy efficiency through Current-Mode Logic, wide inner-product computation, lightweight pipelining, and data reuse; and an approach that integrates Adaptive Linear Separability (ALS) into low-power, approximate-computing-based accelerators. The all-digital BNN accelerator reaches an energy efficiency of 617 TOPS/W, approaching figures previously reported only for analog binary circuits, while the ALS integration proves effective for designing approximate computing components with minimal accuracy loss. Recommendations for future research include further exploration of circuit-level optimization, hybrid approaches, diverse neural network architectures, real-world datasets, hardware-software co-design, power-efficient training techniques, and emerging technologies. Together, these directions point toward more energy-efficient, higher-performance neural network accelerators for a broad range of machine learning applications.
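
To ground the wide inner-product computation that the abstract credits for the BNN accelerator's efficiency: in the standard BNN formulation, weights and activations are binarized to {-1, +1} and packed into machine words, so an N-element inner product collapses to a bitwise XNOR followed by a population count. The sketch below is illustrative only and is not the authors' design; the helper names, the +1-to-bit-1 packing scheme, and the software popcount are assumptions made for demonstration.

    # Illustrative XNOR-popcount inner product as used in Binary Neural
    # Network (BNN) accelerators. A minimal sketch, not the paper's circuit;
    # the packing convention (+1 -> bit 1, -1 -> bit 0) is an assumption.

    def pack_bits(values):
        """Pack a list of +1/-1 values into an integer bit-mask."""
        word = 0
        for i, v in enumerate(values):
            if v == 1:
                word |= 1 << i
        return word

    def bnn_inner_product(a_bits, w_bits, n):
        """Inner product of two {-1, +1} vectors packed as n-bit masks.

        XNOR marks the positions where the operands agree; each agreement
        contributes +1 and each disagreement -1, so
        dot = popcount(XNOR) - (n - popcount(XNOR)) = 2*popcount - n.
        """
        mask = (1 << n) - 1                  # keep only the n valid bit positions
        agree = ~(a_bits ^ w_bits) & mask    # XNOR, truncated to n bits
        return 2 * bin(agree).count("1") - n

    # Usage: a 4-element example, checkable by hand.
    a = [+1, -1, +1, +1]
    w = [+1, +1, -1, +1]
    print(bnn_inner_product(pack_bits(a), pack_bits(w), len(a)))
    # (+1) + (-1) + (-1) + (+1) = 0

In hardware, one wide XNOR-popcount retires dozens or hundreds of binary multiply-accumulates per cycle, which is how digital designs approach analog-level efficiency; at the 617 TOPS/W reported in the abstract, each joule funds roughly 6.17 x 10^14 such binary operations.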


