Showing 1–20 of 29 results
Date / Name
Oct 19, 2022 / On the Adversarial Robustness of Mixture of Experts
Jun 10, 2021 / Scaling Vision with Sparse Mixture of Experts
Sep 15, 2023 / Scaling Laws for Sparsely-Connected Foundation Models
Jan 29, 2024 / Routers in Vision Mixture of Experts: An Empirical Study
Jun 6, 2022 / Multimodal Contrastive Learning with LIMoE: the Language-Image Mixture of Experts
Mar 2, 2017 / Active Learning for Accurate Estimation of Linear Models
Feb 9, 2016 / Online Active Linear Regression via Thresholding
Feb 27, 2024 / Stable LM 2 1.6B Technical Report
Oct 14, 2020 / Deep Ensembles for Low-Data Transfer Learning
Sep 28, 2020 / Scalable Transfer Learning with Expert Models
Oct 13, 2020 / Which Model to Transfer? Finding the Needle in the Growing Haystack
Aug 2, 2023 / From Sparse to Soft Mixtures of Experts
Feb 26, 2018 / Deep Bayesian Bandits Showdown: An Empirical Comparison of Bayesian Deep Networks for Thompson Sampling
Feb 27, 2014 / Learning multifractal structure in large networks
Feb 24, 2022 / Learning to Merge Tokens in Vision Transformers
Mar 1, 2017 / Human Interaction with Recommendation Systems
Sep 14, 2022 / PaLI: A Jointly-Scaled Multilingual Language-Image Model
Oct 7, 2021 / Sparse MoEs meet Efficient Ensembles
May 29, 2023 / PaLI-X: On Scaling up a Multilingual Vision and Language Model
Feb 10, 2023 / Scaling Vision Transformers to 22 Billion Parameters