Showing 241–260 of 468 results
Date | Name
Aug 3, 2025 | IMU: Influence-guided Machine Unlearning
Jul 2, 2025 | Intrinsic Fingerprint of LLMs: Continue Training is NOT All You Need to Steal A Model!
Jun 20, 2025 | Differentiation-Based Extraction of Proprietary Data from Fine-Tuned LLMs
Jun 17, 2025 | Toward Principled LLM Safety Testing: Solving the Jailbreak Oracle Problem
Jun 16, 2025 | Position: Certified Robustness Does Not (Yet) Imply Model Security
Jun 1, 2025 | Unlearning Inversion Attacks for Graph Neural Networks
May 28, 2025 | Jailbreak Distillation: Renewable Safety Benchmarking
May 21, 2025 | Scalable Defense against In-the-wild Jailbreaking Attacks with Safety Context Retrieval
May 18, 2025 | PoLO: Proof-of-Learning and Proof-of-Ownership at Once with Chained Watermarking
May 16, 2025 | GenoArmory: A Unified Evaluation Framework for Adversarial Attacks on Genomic Foundation Models
May 9, 2025 | Crowding Out The Noise: Algorithmic Collective Action Under Differential Privacy
Apr 23, 2025 | Steering the CensorShip: Uncovering Representation Vectors for LLM "Thought" Control
Apr 23, 2025 | Amplified Vulnerabilities: Structured Jailbreak Attacks on LLM-based Multi-Agent Debate
Apr 12, 2025 | SmartShift: A Secure and Efficient Approach to Smart Contract Migration
Mar 26, 2025 | Certified randomness using a trapped-ion quantum processor
Mar 23, 2025 | STShield: Single-Token Sentinel for Real-Time Jailbreak Detection in Large Language Models
Mar 4, 2025 | One Stone, Two Birds: Enhancing Adversarial Defense Through the Lens of Distributional Discrepancy
Feb 25, 2025 | MM-PoisonRAG: Disrupting Multimodal RAG with Local and Global Poisoning Attacks
Feb 22, 2025 | Efficient LLM Moderation with Multi-Layer Latent Prototypes
Feb 19, 2025 | The Canary's Echo: Auditing Privacy Risks of LLM-Generated Synthetic Text