Long Short-Term Memory Based Recurrent Neural Network Architectures for Large Vocabulary Speech Recognition Paper • 1402.1128 • Published Feb 5, 2014 • 1
Large Language Models' Detection of Political Orientation in Newspapers Paper • 2406.00018 • Published May 23, 2024 • 1
Training Sparse Mixture Of Experts Text Embedding Models Paper • 2502.07972 • Published Feb 11, 2025 • 12
Article SetFit: Efficient Few-Shot Learning Without Prompts +4 Unso, lewtun, luketheduke, danielkorat, orenpereg, moshew • Sep 26, 2022 • 41
Article Small Language Models (SLM): A Comprehensive Overview jjokah • Feb 22, 2025 • 163
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models Paper • 2402.03300 • Published Feb 5, 2024 • 147
No Language Left Behind: Scaling Human-Centered Machine Translation Paper • 2207.04672 • Published Jul 11, 2022 • 4
Token-Level Generalization in LoRA Adapter Backdoors: Attack Characterization and Behavioral Detection Paper • 2605.30189 • Published 26 days ago • 9
Embedding Atlas: Low-Friction, Interactive Embedding Visualization Paper • 2505.06386 • Published Jul 8, 2025 • 1
TinyStories: How Small Can Language Models Be and Still Speak Coherent English? Paper • 2305.07759 • Published May 12, 2023 • 46
🐶 IDEFICS 🐶 Collection Collection assembling all the models and spaces related to IDEFICS • 6 items • Updated Apr 15, 2024 • 9