Date / Name
Sep 28, 2023 / Training a Large Video Model on a Single Machine in a Day
May 3, 2021 / Learning to drive from a world on rails
Jan 11, 2024 / Distilling Vision-Language Models on Millions of Videos
Mar 12, 2021 / Probabilistic two-stage detection
Feb 25, 2021 / Simple multi-dataset detection
Oct 19, 2023 / Predicting a Protein's Stability under a Million Mutations
May 5, 2022 / Cross-view Transformers for real-time Map-view Semantic Segmentation
Nov 13, 2024 / Cut Your Losses in Large-Vocabulary Language Models
Dec 2, 2019 / A Multigrid Method for Efficiently Training Video Models
Dec 12, 2018 / Long-Term Feature Banks for Detailed Video Understanding
May 31, 2016 / Adversarial Feature Learning
Nov 29, 2023 / Language-conditioned Detection Transformer
Jul 16, 2023 / Language Conditioned Traffic Generation
Feb 3, 2025 / Reinforcement Learning for Long-Horizon Interactive LLM Agents
Oct 20, 2012 / Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials
Sep 9, 2024 / Promptable Closed-loop Traffic Simulation
May 30, 2019 / Does computer vision matter for action?
Dec 8, 2022 / Learning Video Representations from Large Language Models
Nov 26, 2018 / Joint Monocular 3D Vehicle Detection and Tracking
Oct 2, 2015 / Learning a Discriminative Model for the Perception of Realism in Composite Images