Date          Name
Jun 14, 2024  Bootstrapping Language Models with DPO Implicit Rewards
Jan 24, 2022  Multiscale Generative Models: Improving Performance of a Generative Model Using Feedback from Other Dependent Generative Models
Nov 26, 2023  Generative Modelling of Stochastic Actions with Arbitrary Constraints in Reinforcement Learning
Mar 4, 2024   Masked Thought: Simply Masking Partial Reasoning Steps Can Improve Mathematical Reasoning Learning of Language Models
Feb 16, 2025  DLBayesian: An Alternative Bayesian Reconstruction of Limited-view CT by Optimizing Deep Learning Parameters
Sep 21, 2022  Extraction-based Deep Learning Reconstruction of Interior Tomography
Jan 16, 2025  On Learning Informative Trajectory Embeddings for Imitation, Classification and Regression
Jul 24, 2024  Towards Neural Network based Cognitive Models of Dynamic Decision-Making by Humans
Jun 16, 2023  Semi-Offline Reinforcement Learning for Optimized Text Generation
Nov 3, 2024   Sample-Efficient Alignment for LLMs
Dec 7, 2023   Fortify the Shortest Stave in Attention: Enhancing Context Awareness of Large Language Models for Effective Tool Use
Apr 18, 2024  Uncovering Safety Risks of Large Language Models through Concept Activation Vector
Jun 15, 2024  Unlocking Large Language Model's Planning Capabilities with Maximum Diversity Fine-tuning
Oct 15, 2024  Heterogeneous Graph Generation: A Hierarchical Approach using Node Feature Pooling
Mar 26, 2025  Understanding R1-Zero-Like Training: A Critical Perspective
Jul 16, 2019  GRID: a Student Project to Monitor the Transient Gamma-Ray Sky in the Multi-Messenger Astronomy Era
Oct 1, 2025   GEM: A Gym for Agentic LLMs
Apr 14, 2025  Efficient Process Reward Model Training via Active Learning
Feb 27, 2026  CoME: Empowering Channel-of-Mobile-Experts with Informative Hybrid-Capabilities Reasoning
Oct 25, 2023  CycleAlign: Iterative Distillation from Black-box LLM to White-box Models for Better Human Alignment