Bridging SFT and DPO for Diffusion Model Alignment with Self-Sampling Preference Optimization — arXiv2