arXiv2
arXiv2 Search: au:"Swastik Roy"
Showing 1–6 of 6 results
Oct 20, 2025
OPTAGENT: Optimizing Multi-Agent LLM Interactions Through Verbal Reinforcement Learning for Enhanced Reasoning
Sep 29, 2025
BeyondBench: Contamination-Resistant Evaluation of Reasoning in Language Models
Dec 2, 2025
SPARK: Stepwise Process-Aware Rewards for Reference-Free Reinforcement Learning
Nov 20, 2025
JudgeBoard: Benchmarking and Enhancing Small Language Models for Reasoning Evaluation
Mar 17, 2025
The Amazon Nova Family of Models: Technical Report and Model Card
Apr 24, 2026
C-MORAL: Controllable Multi-Objective Molecular Optimization with Reinforcement Alignment for LLMs