SIMMER: Cross-Modal Food Image–Recipe Retrieval via MLLM-Based Embedding — arXiv2