Explainable Video Action Reasoning via Prior Knowledge and State Transitions — arXiv2