Showing 21–40 of 249 results
Date | Name
Feb 19, 2026 | Collaborative Processing for Multi-Tenant Inference on Memory-Constrained Edge TPUs
Feb 16, 2026 | Efficient Multi-round LLM Inference over Disaggregated Serving
Feb 12, 2026 | Legitimate Overrides in Decentralized Protocols
Jan 27, 2026 | Decentralized Nonsmooth Nonconvex Optimization with Client Sampling
Jan 23, 2026 | GPU-Accelerated Selected Basis Diagonalization with Thrust for SQD-based Algorithms
Jan 19, 2026 | SWORD: A Secure LoW-Latency Offline-First Authentication and Data Sharing Scheme for Resource Constrained Distributed Networks
Jan 19, 2026 | Unleashing Efficient Asynchronous RL Post-Training via Staleness-Constrained Rollout Coordination
Dec 3, 2025 | VLCs: Managing Parallelism with Virtualized Libraries
Nov 18, 2025 | FLARE: Adaptive Multi-Dimensional Reputation for Robust Client Reliability in Federated Learning
Nov 11, 2025 | Parallel Sampling via Autospeculation
Nov 10, 2025 | Lightning Grasp: High Performance Procedural Grasp Synthesis with Contact Fields
Nov 2, 2025 | FREESH: Fair, Resource- and Energy-Efficient Scheduling for LLM Serving on Heterogeneous GPUs
Oct 30, 2025 | Mind the Gap: Revealing Inconsistencies Across Heterogeneous AI Accelerators
Oct 30, 2025 | FlowMesh: A Service Fabric for Composable LLM Workflows
Oct 30, 2025 | ReSpec: Towards Optimizing Speculative Decoding in Reinforcement Learning Systems
Oct 29, 2025 | Multi-Resolution Model Fusion for Accelerating the Convolutional Neural Network Training
Oct 23, 2025 | Collective Communication for 100k+ GPUs
Oct 3, 2025 | TridentServe: A Stage-level Serving System for Diffusion Pipelines
Sep 22, 2025 | Expert-as-a-Service: Towards Efficient, Scalable, and Robust Large-scale MoE Serving
Sep 9, 2025 | HYLU: Hybrid Parallel Sparse LU Factorization