NVILA: Efficient Frontier Visual Language Models — arXiv2