"au:"Jiri Gesi"" — arXiv2 Search

Showing 1–16 of 16 results

Mar 2, 2022 - Code Smells in Machine Learning Systems
Feb 26, 2024 - Beyond Self-learned Attention: Mitigating Attention Bias in Transformer-based Models Using Attention Guidance
Feb 18, 2025 - UXAgent: An LLM Agent-Based Usability Testing Framework for Web Design
Jul 28, 2025 - Multi-Agent-as-Judge: Aligning LLM-Agent-Based Automated Evaluation with Multi-Dimensional Human Evaluation
Sep 25, 2025 - LLM Agent Meets Agentic AI: Can LLM Agents Simulate Customers to Evaluate Agentic-AI-based Shopping Assistants?
Jun 5, 2025 - OPeRA: A Dataset of Observation, Persona, Rationale, and Action for Evaluating LLMs on Human Online Shopping Behavior Simulation
Mar 26, 2025 - Can LLM Agents Simulate Multi-Turn Human Behavior? Evidence from Real Online Customer Behavior Data
Jan 28, 2026 - Trajectory2Task: Training Robust Tool-Calling Agents with Synthesized Yet Verifiable Data for Complex User Intents
Oct 17, 2025 - WEBSERV: A Browser-Server Environment for Efficient Training of Reinforcement Learning-based Web Agents at Scale
Jul 23, 2025 - Shop-R1: Rewarding LLMs to Simulate Human Behavior in Online Shopping via Reinforcement Learning
Oct 22, 2025 - See, Think, Act: Online Shopper Behavior Simulation with VLM Agents
Aug 5, 2025 - Goedel-Prover-V2: Scaling Formal Theorem Proving with Scaffolded Data Synthesis and Self-Correction
Sep 25, 2025 - SFT Doesn't Always Hurt General Capabilities: Revisiting Domain-Specific Fine-Tuning in LLMs
Mar 7, 2024 - Towards Robustness Analysis of E-Commerce Ranking System
Oct 3, 2024 - Does the Order of Fine-tuning Matter and Why?
Apr 13, 2025 - UXAgent: A System for Simulating Usability Testing of Web Design with LLM Agents