Time-Multiplexed In-Memory Computation Scheme for Mapping Quantized Neural Networks on Hybrid CMOS-OxRAM Building Blocks
/ Authors
/ Abstract
In this work, we experimentally demonstrate two key building blocks for realizing Binary/Ternary Neural Networks (BNNs/TNNs): (i) 130 nm CMOS-based sigmoidal neurons and (ii) HfO$_2$-based multi-level cell (MLC) OxRAM synaptic blocks. An optimized vector-matrix multiplication (VMM) programming scheme that utilizes the two building blocks is also presented. In contrast to prior approaches that use differential synaptic structures, the scheme uses a single device per synapse with two sets of READ operations. The proposed hardware mapping strategy shows a performance change of <5% (a decrease of 2-5% for TNN, an increase of 0.2% for BNN) compared to a software-based implementation, with significant memory savings on the order of 16-32× for a classification problem on the Fashion MNIST (FMNIST) dataset. The impact of OxRAM device variability on the performance of the hardware BNN/TNN is also analyzed.
Journal: IEEE Transactions on Nanotechnology
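
The 16-32× memory-savings range quoted in the abstract is consistent with replacing full-precision weights by 2-bit (ternary) or 1-bit (binary) weights. The short sketch below only illustrates that arithmetic. It assumes a 32-bit floating-point software baseline, which the abstract does not state, and the layer size (784 inputs for 28×28 FMNIST images, 100 hidden units) and all function names are hypothetical.

```python
# Illustrative arithmetic only; not taken from the paper.
# Assumption: the software baseline stores each weight as a 32-bit float.
FP32_BITS = 32
BNN_BITS = 1   # binary weight: 2 states fit in 1 bit
TNN_BITS = 2   # ternary weight: 3 states need 2 bits

# Hypothetical fully connected layer: 28x28 FMNIST image -> 100 hidden units.
NUM_WEIGHTS = 28 * 28 * 100


def weight_memory_bits(num_weights: int, bits_per_weight: int) -> int:
    """Total storage, in bits, for a weight matrix at a given precision."""
    return num_weights * bits_per_weight


def memory_saving(baseline_bits: int, quantized_bits: int) -> float:
    """Ratio of baseline storage to quantized storage."""
    return baseline_bits / quantized_bits


baseline = weight_memory_bits(NUM_WEIGHTS, FP32_BITS)
bnn_saving = memory_saving(baseline, weight_memory_bits(NUM_WEIGHTS, BNN_BITS))
tnn_saving = memory_saving(baseline, weight_memory_bits(NUM_WEIGHTS, TNN_BITS))

print(f"BNN saving: {bnn_saving:.0f}x, TNN saving: {tnn_saving:.0f}x")
```

Under these assumptions the ratio does not depend on the layer size, since the number of weights cancels; only the bits per weight matter.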