From Supervision to Exploration: What Does Protein Language Model Learn During Reinforcement Learning? — arXiv2