Hugging Face
master
PRO
fantos
21
9
356
138 followers · 120 following
AI & ML interests
None yet
Recent Activity
reacted to ginigen-ai's post about 4 hours ago
🧠 Does your LLM know when it's about to be wrong?
Most leaderboards measure accuracy. We measure metacognition: whether a model catches its own errors. Benchmark + leaderboard + adapters, all open.
The surprise: even a K-AI #1 model (JGOS-31B-Citizen) is the strongest on multiple-choice traps (trap_rate 0.005, about 2 misses in 400) yet blind to its own free-form mistakes (self-confidence AUROC = 0.5, pure random). A tiny base-frozen adapter recovers that signal.
Two independent axes (never compared across a row):
① trap_rate: does it fall for tempting trap options? (lower = stronger)
② adapter gain Δ: how much a lightweight adapter catches errors the model itself misses. (higher = more adapter value)
What's open:
300+100 trap problems (each with a hidden trap + TICOS type)
24-model leaderboard
🧩 11 per-model adapters. These are adapters, NOT fine-tunes (base stays frozen; the adapter just reads the hidden state → P(wrong))
Submit any HF model: it is auto-scored daily at 09:00 KST and added to the board.
Leaderboard: https://huggingface.co/spaces/ginigen-ai/Metacognition-Leaderboard-Space
Benchmark: https://huggingface.co/datasets/ginigen-ai/Metacognition-Bench
🧩 Adapters: https://huggingface.co/collections/FINAL-Bench/metacognition-adapters-6a42c032e6beb803dd032961
Article: https://huggingface.co/blog/ginigen-ai/metacognition
Benchmark by ginigen-ai · Adapters by FINAL-Bench (Darwin/Chimera platform + AETHER metacognition tech).
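The two numbers quoted in the post (a trap_rate of 0.005 and a self-confidence AUROC of 0.5) can be illustrated with a minimal Python sketch. This is not the benchmark's scoring code: the function names `trap_rate` and `auroc`, the input layout, and the choice to score "P(wrong)" against a wrong-answer label are all assumptions made for illustration.

```python
def trap_rate(fell_for_trap):
    """Fraction of trap problems on which the model picked the trap option.

    `fell_for_trap` is a hypothetical list of booleans, one per problem.
    """
    if not fell_for_trap:
        raise ValueError("need at least one problem")
    return sum(1 for hit in fell_for_trap if hit) / len(fell_for_trap)


def auroc(p_wrong, is_wrong):
    """AUROC via pairwise comparison (ties count as half a win).

    `p_wrong` holds a predicted probability that each answer is wrong and
    `is_wrong` the matching ground-truth flags (both assumed inputs).
    The result is the chance that a randomly chosen wrong answer receives a
    higher score than a randomly chosen correct one.
    """
    wrong = [s for s, y in zip(p_wrong, is_wrong) if y]
    right = [s for s, y in zip(p_wrong, is_wrong) if not y]
    if not wrong or not right:
        raise ValueError("need at least one wrong and one correct answer")
    wins = 0.0
    for w in wrong:
        for r in right:
            if w > r:
                wins += 1.0
            elif w == r:
                wins += 0.5
    return wins / (len(wrong) * len(right))


# 2 misses in 400 trap problems, as in the post.
rate = trap_rate([True] * 2 + [False] * 398)

# A confidence signal that does not vary with correctness is uninformative.
flat = auroc([0.5, 0.5, 0.5, 0.5], [True, False, True, False])

# A signal that ranks every wrong answer above every correct one.
sharp = auroc([0.9, 0.1, 0.8, 0.2], [True, False, True, False])
```

Under these assumptions `rate` is 0.005, `flat` is 0.5 (the "pure random" case the post describes), and `sharp` is 1.0. The gap between a model's own AUROC and the adapter's AUROC is one plausible reading of the "adapter gain Δ" axis, though the post does not give its exact formula.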
reacted to ginigen-ai's post with ❤️ about 4 hours ago
reacted to ginigen-ai's post with 🔥 about 4 hours ago
View all activity
Organizations
fantos's models: 9
Sort: Recently updated
fantos/MiniCPM-o-2_6 • Any-to-Any • 9B • Updated Nov 2, 2025 • 5
fantos/Qwen3-Omni-30B-A3B-Thinking • Any-to-Any • 32B • Updated Nov 2, 2025 • 6
fantos/Ming-flash-omni-Preview • Any-to-Any • 104B • Updated Nov 2, 2025 • 35
fantos/Qwen-Image-Edit-Rapid-AIO • Text-to-Image • Updated Nov 2, 2025 • 1
fantos/GLM-4.6 • Text Generation • 357B • Updated Nov 2, 2025 • 5
fantos/neutts-air • Text-to-Speech • 0.7B • Updated Nov 2, 2025 • 10
fantos/PaddleOCR-VL • Image-Text-to-Text • 1.0B • Updated Nov 2, 2025 • 12
fantos/DeepSeek-OCR • Image-Text-to-Text • 3B • Updated Nov 2, 2025 • 7
fantos/QwQ-32B-bnb-4bit • Text Generation • 33B • Updated Mar 20, 2025 • 5 • 63