Who Defines "Best"? Towards Interactive, User-Defined Evaluation of LLM Leaderboards — arXiv2