Showing 1–20 of 59 results
Date / Name
Jan 30, 2024 / A Scalable RISC-V Vector Processor Enabling Efficient Multi-Precision DNN Inference
Nov 10, 2025 / P3-LLM: An Integrated NPU-PIM Accelerator for LLM Inference Using Hybrid Numerical Formats
May 28, 2025 / Efficient Precision-Scalable Hardware for Microscaling (MX) Processing in Robotics Learning
Nov 9, 2025 / Precision-Scalable Microscaling Datapaths with Optimized Reduction Tree for Efficient NPU Integration
Jan 13, 2020 / Hybrid Precoding in Cooperative Millimeter Wave Networks
Jan 22, 2024 / BETA: Binarized Energy-Efficient Transformer Accelerator at the Edge
Aug 26, 2025 / APT-LLM: Exploiting Arbitrary-Precision Tensor Core Computing for LLM Acceleration
Jan 7, 2026 / A Scheduling Framework for Efficient MoE Inference on Edge GPU-NDP Systems
Feb 3, 2023 / PDPU: An Open-Source Posit Dot-Product Unit for Deep Learning Applications
Sep 21, 2024 / SPEED: A Scalable RISC-V Vector Processor Enabling Efficient Multi-Precision DNN Inference
Sep 26, 2024 / Efficient Arbitrary Precision Acceleration for Large Language Models on GPU Tensor Cores
Feb 27, 2025 / A Novel P-bit-based Probabilistic Computing Approach for Solving the 3-D Protein Folding Problem
May 25, 2025 / Enable Lightweight and Precision-Scalable Posit/IEEE-754 Arithmetic in RISC-V Cores for Transprecision Computing
Jul 16, 2024 / Co-Designing Binarized Transformer and Hardware Accelerator for Efficient End-to-End Edge Deployment
Sep 15, 2023 / A Precision-Scalable RISC-V DNN Processor with On-Device Learning Capability at the Extreme Edge
Sep 22, 2023 / Efficient N:M Sparse DNN Training Using Algorithm, Architecture, and Dataflow Co-Design
Aug 12, 2022 / An Algorithm-Hardware Co-Optimized Framework for Accelerating N:M Sparse Transformers
Jun 6, 2022 / Crosstalk Suppression in Individually Addressed Two-Qubit Gates in a Trapped-Ion Quantum Computer
Jun 4, 2021 / Joint Scheduling and Throughput Maximization in Self-backhauled Millimeter Wave Cellular Networks
Nov 24, 2024 / Anda: Unlocking Efficient LLM Inference with a Variable-Length Grouped Activation Data Format