Showing 1–20 of 22 results
Sep 25, 2024: Monge-Kantorovich Fitting With Sobolev Budgets
Feb 19, 2024: Query-Based Adversarial Prompt Generation
Oct 29, 2023: Label Poisoning is All You Need
Oct 14, 2022: Zonotope Domains for Lagrangian Neural Network Verification
Jul 23, 2024: Data Mixture Inference: What do BPE Tokenizers Reveal about their Training Data?
Jun 17, 2025: Sampling from Your Language Model One Byte at a Time
Oct 12, 2022: Few-shot Backdoor Attacks via Neural Tangent Kernels
Jul 2, 2024: PLeaS -- Merging Models with Permutations and Least Squares
Mar 17, 2025: SuperBPE: Space Travel for Language Models
Apr 27, 2023: DataComp: In search of the next generation of multimodal datasets
Feb 6, 2026: Anchored Decoding: Provably Reducing Copyright Risk for Any Language Model
Jan 30, 2026: Are you going to finish that? A Practical Study of the Partial Token Problem
Nov 28, 2023: Scalable Extraction of Training Data from (Production) Language Models
Apr 22, 2021: SPECTRE: Defending Against Backdoor Attacks Using Robust Statistics
Nov 1, 2024: OML: A Primitive for Reconciling Open Access with Owner Control in AI Model Distribution
Jun 23, 2025: Broken Tokens? Your Language Model can Secretly Handle Non-Canonical Tokenizations
May 24, 2022: Towards a Defense Against Federated Backdoor Attacks Under Continuous Training
Apr 23, 2024: Insufficient Statistics Perturbation: Stable Estimators for Private Least Squares
Mar 11, 2024: Stealing Part of a Production Language Model
Feb 11, 2025: Scalable Fingerprinting of Large Language Models