C-for-Metal: High Performance SIMD Programming on Intel GPUs — arXiv2