NovBench: Evaluating Large Language Models on Academic Paper Novelty Assessment — arXiv2