Showing 1–13 of 13 results
Date / Name
Oct 25, 2023 / LLM-FP4: 4-Bit Floating-Point Quantized Transformers
Jan 8, 2026 / GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization
Feb 14, 2024 / DoRA: Weight-Decomposed Low-Rank Adaptation
Oct 16, 2025 / DLER: Doing Length pEnalty Right - Incentivizing More Intelligence per Token via Reinforcement Learning
Oct 28, 2024 / EoRA: Fine-tuning-free Compensation for Compressed LLM with Eigenspace Low-Rank Approximation
Feb 4, 2023 / Oscillation-free Quantization for Low-bit Vision Transformers
Dec 14, 2023 / CMOSE: Comprehensive Multi-Modality Online Student Engagement Dataset with High-Quality Labels
Jul 10, 2024 / RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization
Nov 20, 2024 / Hymba: A Hybrid-head Architecture for Small Language Models
Jun 12, 2023 / Efficient and Robust Quantization-aware Training via Adaptive Coreset Selection
Apr 10, 2025 / APSQ: Additive Partial Sum Quantization with Algorithm-Hardware Co-Design
Mar 28, 2024 / Genetic Quantization-Aware Approximation for Non-Linear Operations in Transformers
Dec 19, 2025 / A 28nm 0.22μJ/token memory-compute-intensity-aware CNN-Transformer accelerator with hybrid-attention-based layer-fusion and cascaded pruning for semantic-segmentation