Weakness Analysis of Cyberspace Configuration Based on Reinforcement Learning — arXiv2