How to Train Long-Context Language Models (Effectively) — arXiv2