Sleeping Agents Sudanese Arabic Navigable RAG Demo: Compare Sudanese Arabic phrase retrieval methods
Sleeping Agents Interleaved Retrieval-Reasoning Benchmark: Compare Standard vs Interleaved RAG with simulated benchmarks
Running Agents Agent Architecture Visualizer: Simulate and visualize AI agent loops with permissions
Running Agents TESSY Reasoning Demo - Sudanese Arabic: Analyze Sudanese Arabic samples with standard vs TESSY reasoning
Paused Agents Sudanese Arabic SWE-AGILE Reasoning Benchmark: Run Sudanese Arabic reasoning benchmark with context strategies
Sleeping Agents Sudanese Arabic Synthetic Data Quality Benchmark: Evaluate Sudanese Arabic models and compare their generated responses
Paused Agents Sudanese Arabic Reading Comprehension Benchmark: Run Sudanese Arabic QA benchmark and compare models
Sleeping Agents Sudanese Arabic Code-Switching Detection: Detect Arabic-English code-switches in Sudanese text
Paused Agents Process Reward Agents: Test-Time Reasoning Scaling: Compare greedy vs reward-guided reasoning for a question
Paused Agents Sudanese CoT Reasoning Benchmark: Generate step-by-step Sudanese Arabic reasoning and analysis
Paused Agents Sudanese Synthetic Instructions: Generate synthetic Sudanese Arabic instruction datasets
Paused Agents Master Key Hypothesis Demo: Explore simulated cross-model transfer with 3D visualizations