Book

[1] M. Kang, S. K. Gonugondla, and N. R. Shanbhag, “Deep In-memory Architectures for Machine Learning,” Springer, Vol. 1, pp. 1–197, Dec. 2019.

E-book: https://link.springer.com/book/10.1007%2F978-3-030-35971-3

Conferences

- Z. Xia, J. Kim, and M. Kang, “LEAF: An Adaptation Framework against Noisy Data on Edge through Ultra Low-Cost Training,” Design Automation Conference (DAC), 2024, to appear.

- C. E. Song, Y. Li, A. Ramnani, P. Agrawal, S. J. Jang, S. S. Lee, T. Rosing, and M. Kang, “52.5 TOPS/W 1.7GHz Reconfigurable XGBoost Inference Accelerator based on Modular-Unit-Tree with Dynamic Data and Compute Gating,” IEEE Custom Integrated Circuits Conference (CICC), Apr. 2024, to appear.

- M. Lee, S. Park, H. Kim, M. Yoon, J. Lee, J. Choi, N.S. Kim, M. Kang, and J. Choi, “SPADE: Sparse Pillar-based 3D Object Detection Accelerator for Autonomous Driving,” IEEE International Symposium on High-Performance Computer Architecture (HPCA), 2024, to appear.

- K. H. Kim and M. Kang, “Mixed-Signal Dot-Product Processor With Switched-Capacitors for Machine Learning,” International Conference on Electronics, Information, and Communication (ICEIC), to appear.

- A. Agrawal, et al., M. Kang, et al., “A Switched-Capacitor Integer Compute Unit with Decoupled Storage and Arithmetic for Cloud AI Inference in 5nm CMOS,” IEEE Symposium on VLSI Technology and Circuits (VLSI Symposium), Jun. 2023.

- Y. Khodke, S.S. Sundaram, Y. Li, and M. Kang, “AI Processor with Sparsity-adaptive Real-time Dynamic Frequency Modulation for Convolutional Neural Networks and Transformers,” IEEE Custom Integrated Circuits Conference (CICC), Apr. 2023.

- A. Yazdanbakhsh*, A. Moradi*, Z. Li*, and M. Kang, “Sparse Attention Acceleration with Synergistic In-Memory Pruning and On-Chip Recomputation,” International Symposium on Microarchitecture (MICRO), Oct. 2022 (*equal contribution).

- Z. Li*, S. Ghodrati*, A. Yazdanbakhsh*, H. Esmaeilzadeh, and M. Kang, “Accelerating Attention through Gradient-Based Learned Runtime Pruning,” International Symposium on Computer Architecture (ISCA), Jun. 2022 (*equal contribution).

- J. Joo, M. Yoon, J. Choi, M. Kang, et al., “Understanding and Reducing Block-Load Overhead of Systolic Deep Learning Accelerators,” International SoC Conference (ISOCC), Oct. 2021.

- S. Venkataramani, et al., M. Kang, et al., and K. Gopalakrishnan, “RaPiD: AI Accelerator for Ultra-low Precision Training and Inference,” International Symposium on Computer Architecture (ISCA), Jun. 2021.

- A. Agrawal, S.K. Lee, J. Silberman, M. Ziegler, M. Kang, et al., “A 7nm 4-Core AI Chip with 25.6TFLOPS Hybrid FP8 Training, 102.4TOPS INT4 Inference and Workload-Aware Throttling,” IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC), Feb. 2021.

- J. Oh, S. Lee, M. Kang, et al., “A 3.0 TFLOPS 0.62V Scalable Processor Core for High Compute Utilization AI Training and Inference,” IEEE Symposium on VLSI Circuits (VLSI Symposium), Jun. 2020.

- A. D. Patil, H. Hua, M. Kang, and N. R. Shanbhag, “An MRAM-Based Deep in-Memory Architecture for Deep Neural Networks,” IEEE International Symposium on Circuits and Systems (ISCAS), May 2019.

- Y. Kim, M. Kang, L. R. Varshney, and N. R. Shanbhag, “SRAM Bit-line Swings Optimization using Generalized Waterfilling,” IEEE International Symposium on Information Theory (ISIT), Jun. 2018.

- P. Srivastava*, M. Kang*, S. K. Gonugondla, J. Choi, N. S. Kim, V. Adve, and N. R. Shanbhag, “PROMISE: An End-to-End Design of a Programmable Mixed-Signal Accelerator for Machine-Learning Algorithms,” International Symposium on Computer Architecture (ISCA), Jun. 2018 (*equal contribution; IEEE MICRO Top Picks Honorable Mention, 2019).

- S. K. Gonugondla, M. Kang, and N. R. Shanbhag, “Energy-Efficient Deep In-memory Architecture for NAND Flash Memories,” IEEE International Symposium on Circuits and Systems (ISCAS), May 2018 (Best Paper Award, “Neural Systems and Applications” track).

- S. K. Gonugondla, M. Kang, and N. R. Shanbhag, “A 42pJ/Decision 3.12TOPS/W Robust In-Memory Machine Learning Classifier with On-Chip Training,” IEEE International Solid-State Circuits Conference (ISSCC), Feb. 2018.

- M. Kang, S. K. Gonugondla, and N. R. Shanbhag, “A 19.4 nJ/decision 364 K decisions/s in-memory random forest classifier in 6T SRAM array,” IEEE European Solid-State Circuits Conference (ESSCIRC), Sep. 2017, pp. 263–266.

- M. Kang, S. K. Gonugondla, A. D. Patil, and N. R. Shanbhag, “A 481pJ/decision 3.4M decision/s Multifunctional Deep In-memory Inference Processor,” arXiv preprint, https://arxiv.org/abs/1610.07501, Oct. 2016.

- M. Kang, S. K. Gonugondla, and N. R. Shanbhag, “An Energy-efficient Memory-based High-Throughput VLSI Architecture for Convolutional Networks,” IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Apr. 2015, pp. 1037-1041.

- M. Kang, E. P. Kim, M. S. Keel, and N. R. Shanbhag, “Energy-efficient and High Throughput Sparse Distributed Memory Architecture,” IEEE International Symposium on Circuits and Systems (ISCAS), May 2015, pp. 2505–2508 (Best Paper Award, “Neural Systems and Applications” track).

- M. Kang, M. S. Keel, N. R. Shanbhag, S. Eilert, and K. Curewitz, “An Energy-Efficient VLSI Architecture for Pattern Recognition via Deep Embedding of Computation in SRAM,” IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), May 2014, pp. 8326–8330.

- Y. Choi, et al., M. Kang, “A 20nm 1.8V 8Gb PRAM with 40MB/s Program Bandwidth,” IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC), Feb. 2012, pp. 46–48.

- H.K. Park, S.C. Song, M.H. Abu-Rahma, L. Ge, M. Kang, B.M. Han, J. Wang, R. Choi, S.O. Jung, and G. Yeap, “Accurate Projection of Vccmin by Modeling ‘Dual Slope’ in FinFET-based SRAM, and Impact of Long Term Reliability on End of Life Vccmin,” IEEE International Reliability Physics Symposium (IRPS), May 2010, pp. 1008–1013.

Journals

- S.S. Sundaram, Y. Khodke, Y. Li, S.J. Jang, S.S. Lee, and M. Kang, “FreFlex: A High-Performance Processor for Convolution and Attention Computations via Sparsity-adaptive Dynamic Frequency Boosting,” IEEE Journal of Solid-State Circuits (JSSC), Nov. 2023.

- Y.K. Lee, D.H. Ko, S. Cho, M. Yeo, M. Kang, and S.O. Jung, “Split WL 6T SRAM-based Bit-Serial Computing-in-Memory Macro with High Signal Margin and High Throughput,” IEEE Transactions on Circuits and Systems II (TCAS-II), Nov. 2023.

- S.K. Lee, A. Agrawal, J. Silberman, M. Ziegler, M. Kang, et al., “A 7nm 4-Core Mixed-Precision AI Chip with 26.2TFLOPS Hybrid-FP8 Training, 104.9TOPS INT4 Inference and Workload-Aware Throttling,” IEEE Journal of Solid-State Circuits (JSSC), Jan. 2022.

- M. Kang, S. Gonugondla, and N. R. Shanbhag, “Deep In-memory Architectures in SRAM: An Analog Approach to Approximate Computing,” Proceedings of the IEEE [Invited], Vol. 108, Issue 12, pp. 2251–2275, Dec. 2020.

- S. Venkataramani, X. Sun, N. Wang, C.Y. Chen, J. Choi, M. Kang, et al., “Efficient AI System Design with Cross-layer Approximate Computing,” Proceedings of the IEEE [Invited], Vol. 108, Issue 12, pp. 2232–2250, Dec. 2020.

- M. Kang, Y. Kim, A. Patil, and N. R. Shanbhag, “Deep In-Memory Architectures for Machine Learning: Accuracy Versus Efficiency Trade-Offs,” IEEE Transactions on Circuits and Systems I (TCAS-I), Vol. 67, Issue 5, May 2020.

- M. Kang, P. Srivastava, V. Adve, N. S. Kim, and N. R. Shanbhag, “An Energy-Efficient Programmable Mixed-Signal Accelerator for Machine Learning Algorithms,” IEEE MICRO, Vol. 39, Issue 5, pp. 64–72, Jul. 2019.

- S. Gonugondla, M. Kang, and N. R. Shanbhag, “A Variation-Tolerant In-Memory Machine Learning Classifier via On-Chip Training,” IEEE Journal of Solid-State Circuits (JSSC), [Invited], Sept. 2018.

- M. Kang, S. Lim, S. Gonugondla, and N. R. Shanbhag, “An In-Memory VLSI Architecture for Convolutional Neural Networks,” IEEE Journal on Emerging and Selected Topics in Circuits and Systems, [Invited], Apr. 2018.

- M. Kang, S. Gonugondla, S. Lim, and N. R. Shanbhag, “A 19.4 nJ/decision, 364K decisions/s, In-memory Random Forest Multi-class Inference Accelerator,” IEEE Journal of Solid-State Circuits (JSSC), [Invited], Jul. 2018.

- Y. Kim, M. Kang, L. R. Varshney, and N. R. Shanbhag, “Generalized Water-filling for Source-Aware Energy-Efficient SRAMs,” IEEE Transactions on Communications (TCOM), May 2018.

- M. Kang, S. Gonugondla, A. Patil, and N. R. Shanbhag, “A Multi-Functional In-Memory Inference Processor Using a Standard 6T SRAM Array,” IEEE Journal of Solid-State Circuits (JSSC), Vol. 53, Issue 2, pp. 642–655, Jan. 2018.

- Y. Kim, M. Kang, L. R. Varshney, and N. R. Shanbhag, “Generalized Water-filling for Source-Aware Energy-Efficient SRAMs,” arXiv preprint, https://arxiv.org/pdf/1710.07153, Nov. 2017.

- M. Kang and N. R. Shanbhag, “In-memory Computing Architectures for Sparse Distributed Memory,” IEEE Transactions on Biomedical Circuits and Systems, [Invited], Vol. 10, No. 4, pp. 855–863, Aug. 2016.

- S. Zhang, M. Kang, C. Sakr, and N. R. Shanbhag, “Reducing the Energy Cost of Inference via In-sensor Information Processing,” arXiv preprint, https://arxiv.org/abs/1607.00667, Jul. 2016.

- M. Kang, H. K. Park, J. Wang, G. Yeap, and S.O. Jung, “Asymmetric Independent-Gate MOSFET SRAM for High Stability,” IEEE Transactions on Electron Devices (TED), Vol. 58, No. 9, pp. 2959-2965, Sept. 2011.

- M. Kang, M.H. Abu-Rahma, L. Ge, B.M. Han, J. Wang, G. Yeap, and S.O. Jung, “FinFET SRAM Optimization with Fin Thickness and Surface Orientation,” IEEE Transactions on Electron Devices (TED), Vol. 57, No. 11, pp. 2785–2793, Nov. 2010.

- M. Kang, S.H. Woo, and S.O. Jung, “Dynamic Mixed Serial-Parallel Content Addressable Memory (DMSP CAM),” International Journal of Circuit Theory and Applications, Vol. 41, Issue 7, pp. 721–731, Jul. 2013.

- M. Kang and S.O. Jung, “Serial-Parallel Content Addressable Memory with a Conditional Driver (SPCwCD),” IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, Vol. E92-A, No. 1, pp. 318–321, Nov. 2009.

UCSD Electrical and Computer Engineering (ECE) Department