Untangling tradeoffs between recurrence and self-attention in neural networks — arXiv2