양건우

leviking87

AI & ML interests

Alignment-focused model research.

Recent Activity

liked a model 16 days ago

mistralai/Mistral-7B-Instruct-v0.2

Organizations

None yet

upvoted a paper 15 days ago

Where Do Deep-Research Agents Go Wrong? Span-Level Error Localization in Agent Trajectories

Paper • 2606.02060 • Published 19 days ago • 54

upvoted a paper 16 days ago

On the Scaling of PEFT: Towards Million Personal Models of Trillion Parameters

Paper • 2606.02437 • Published 19 days ago • 231

upvoted 2 papers 23 days ago

Lean Refactor: Multi-Objective Controllable Proof Optimization via Agentic Strategy Search

Paper • 2605.20244 • Published May 18 • 4

EvalVerse: Pipeline-Aware and Expert-Calibrated Benchmarking for Professional Cinematic Video Generation

Paper • 2605.23271 • Published 29 days ago • 80

upvoted a paper 28 days ago

Perception or Prejudice: Can MLLMs Go Beyond First Impressions of Personality?

Paper • 2605.22109 • Published 30 days ago • 170

upvoted a paper 29 days ago

WorldAct: Activating Monolithic 3D Worlds into Interactive-Ready Object-Centric Scenes

Paper • 2605.15843 • Published May 15 • 6

upvoted a paper 30 days ago

SQuTR: A Robustness Benchmark for Spoken Query to Text Retrieval under Acoustic Noise

Paper • 2602.12783 • Published Feb 13 • 246

upvoted 3 papers about 1 month ago

MemLens: Benchmarking Multimodal Long-Term Memory in Large Vision-Language Models

Paper • 2605.14906 • Published May 14 • 78

DECO: Sparse Mixture-of-Experts with Dense-Comparable Performance on End-Side Devices

Paper • 2605.10933 • Published May 11 • 4

Scaling Continual Learning to 300+ Tasks with Bi-Level Routing Mixture-of-Experts

Paper • 2602.03473 • Published May 8 • 11

upvoted 5 papers 2 months ago

Watch Before You Answer: Learning from Visually Grounded Post-Training

Paper • 2604.05117 • Published Apr 6 • 36

Structural Graph Probing of Vision-Language Models

Paper • 2603.27070 • Published Mar 28 • 6

Adam's Law: Textual Frequency Law on Large Language Models

Paper • 2604.02176 • Published Apr 2 • 507

Signals: Trajectory Sampling and Triage for Agentic Interactions

Paper • 2604.00356 • Published Apr 1 • 9

ACES: Who Tests the Tests? Leave-One-Out AUC Consistency for Code Generation

Paper • 2604.03922 • Published Apr 5 • 53

upvoted 5 papers 3 months ago

CARLA-Air: Fly Drones Inside a CARLA World -- A Unified Infrastructure for Air-Ground Embodied Intelligence

Paper • 2603.28032 • Published Mar 30 • 343

Distilling Conversations: Abstract Compression of Conversational Audio Context for LLM-based ASR

Paper • 2603.26246 • Published Mar 27 • 2

ClawKeeper: Comprehensive Safety Protection for OpenClaw Agents Through Skills, Plugins, and Watchers

Paper • 2603.24414 • Published Mar 25 • 183

Demystifing Video Reasoning

Paper • 2603.16870 • Published Mar 17 • 373

SocialOmni: Benchmarking Audio-Visual Social Interactivity in Omni Models

Paper • 2603.16859 • Published Mar 17 • 249