Showing 1–20 of 20 results
Date | Name

Nov 15, 2022 | HeatViT: Hardware-Efficient Adaptive Token Pruning for Vision Transformers
Dec 27, 2021 | SPViT: Enabling Faster Vision Transformers via Soft Token Pruning
Nov 2, 2022 | Data Level Lottery Ticket Hypothesis for Vision Transformers
Nov 19, 2019 | DARB: A Density-Aware Regular-Block Pruning for Deep Neural Networks
Jul 25, 2024 | SuperFlow: A Fully-Customized RTL-to-GDS Design Automation Flow for Adiabatic Quantum-Flux-Parametron Superconducting Circuits
Jul 25, 2024 | Quasar-ViT: Hardware-Oriented Quantization-Aware Architecture Search for Vision Transformers
May 28, 2025 | Enabling Flexible Multi-LLM Integration for Scalable Knowledge Aggregation
Feb 16, 2024 | Squat: Quant Small Language Models on the Edge
Jan 8, 2025 | RoRA: Efficient Fine-Tuning of LLM with Reliability Optimization for Rank Adaptation
Aug 6, 2025 | RCR-Router: Efficient Role-Aware Context Routing for Multi-Agent LLM Systems with Structured Memory
Jul 23, 2023 | A Life-Cycle Energy and Inventory Analysis of Adiabatic Quantum-Flux-Parametron Circuits
May 11, 2020 | CSB-RNN: A Faster-than-Realtime RNN Acceleration Framework with Compressed Structured Blocks
Aug 25, 2021 | GRIM: A General, Real-Time Deep Learning Inference Framework for Mobile Devices based on Fine-Grained Structured Weight Sparsity
May 20, 2025 | Structured Agent Distillation for Large Language Model
Mar 17, 2026 | When Should a Robot Think? Resource-Aware Reasoning via Reinforcement Learning for Embodied Robotic Decision-Making
Jul 4, 2022 | Quantum Neural Network Compression
Nov 19, 2022 | Peeling the Onion: Hierarchical Reduction of Data Redundancy for Efficient Vision Transformer Training
Sep 21, 2023 | SupeRBNN: Randomized Binary Neural Network Using Adiabatic Superconductor Josephson Devices
Feb 19, 2020 | RTMobile: Beyond Real-Time Mobile Acceleration of RNNs for Speech Recognition
Dec 9, 2023 | Agile-Quant: Activation-Guided Quantization for Faster Inference of LLMs on the Edge