SAM-G-Reasoning

SAM-G-Reasoning is a 30.3M-parameter model fine-tuned from SAM-G on 196k verified multi-step reasoning traces and action plans. It emits explicit step-by-step traces (step 1: ... step 2: ... Answer: X) for questions and ordered JSON plans for multi-step instructions. Built by AMEFORGE for procedural reasoning on the edge.

  • Parameters: 30.3M · Footprint: 121 MB fp32 · Base: SAM-G
  • Fine-tuning: prompt-masked SFT (loss on the reasoning span only), cosine 8e-5, 8k steps
  • Aggregate exact-match: 77.8% (held-out, disjoint seed)
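As a rough illustration of the fine-tuning recipe listed above, the sketch below builds prompt-masked labels and a cosine learning-rate schedule in plain Python. The peak rate (8e-5) and step count (8k) come from the card. Everything else is an assumption: the `-100` ignore value follows the PyTorch `CrossEntropyLoss` default, the schedule has no warmup and decays to zero, and the function names are hypothetical.

```python
import math

IGNORE_INDEX = -100  # assumption: PyTorch CrossEntropyLoss default ignore_index
PEAK_LR = 8e-5       # from the card
TOTAL_STEPS = 8000   # from the card


def mask_prompt_labels(prompt_ids, target_ids):
    """Concatenate prompt and reasoning span; only the span contributes to the loss."""
    input_ids = list(prompt_ids) + list(target_ids)
    # Prompt positions get IGNORE_INDEX so a loss that honors it skips them.
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(target_ids)
    return input_ids, labels


def cosine_lr(step, peak=PEAK_LR, total=TOTAL_STEPS):
    """Cosine decay from `peak` to 0 over `total` steps (no warmup: an assumption)."""
    progress = min(max(step, 0), total) / total
    return 0.5 * peak * (1.0 + math.cos(math.pi * progress))


# Toy token ids, purely illustrative.
input_ids, labels = mask_prompt_labels([5, 6, 7], [8, 9])
print(labels)  # [-100, -100, -100, 8, 9]
```

The usual one-position shift for next-token prediction is left to the training loop; the sketch only shows which positions are masked.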

What it is good at, and what it is not

The model was stress-tested on twelve verified families. The pattern is clear: it excels at procedural reasoning (following steps, tracking state, chaining actions) and is limited on calculation-heavy tasks, as expected at 30M parameters.

Family                           Exact match (%)  Type
logic (ponens/tollens/chains)    100              procedural
plan_chain (multi-step actions)  100              procedural
conversion (unit chains)         100              procedural
sequence (next term)             100              procedural
date_time (clock/calendar)       92               procedural
compare (max/min)                92               procedural
state_track (device toggles)     83               working memory
parity_digits                    58               mixed
count_filter                     67               calculation
sort_list                        50               calculation
word_problem                     50               calculation
arith_chain                      42               calculation
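A quick arithmetic check: the per-family scores above are consistent with the 77.8% aggregate being an unweighted mean over the twelve families. The card does not state how the aggregate was computed, and the table values are rounded, so treat this as a plausibility check rather than the definition of the metric.

```python
# Per-family exact-match scores copied from the table above.
scores = {
    "logic": 100, "plan_chain": 100, "conversion": 100, "sequence": 100,
    "date_time": 92, "compare": 92, "state_track": 83, "parity_digits": 58,
    "count_filter": 67, "sort_list": 50, "word_problem": 50, "arith_chain": 42,
}

# Unweighted (macro) mean across families.
macro_avg = sum(scores.values()) / len(scores)
print(f"{macro_avg:.1f}")  # 77.8
```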

State-tracking at 83% is notable for this scale, since it requires maintaining a mutable state across several operations. Arithmetic-chain and sorting plateau because exact multi-digit calculation is not reliably learnable at 30M parameters; for those, delegate to a tool rather than the model.
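To make the state-tracking task concrete, here is a minimal deterministic reference simulator for device toggles, using the same lamp and fan example as the prompt in the Usage section. It is a sketch for checking model answers, not part of the model. The op vocabulary (`toggle X`, `turn on X`, `turn off X`) is an assumption inferred from that one prompt, and `simulate` is a hypothetical name.

```python
def simulate(initial, ops):
    """Apply ops in order to a copy of the initial device states."""
    state = dict(initial)
    for op in ops:
        words = op.split()
        device = words[-1]  # assumes the device name is the last word
        if words[0] == "toggle":
            state[device] = "off" if state[device] == "on" else "on"
        elif words[:2] == ["turn", "on"]:
            state[device] = "on"
        elif words[:2] == ["turn", "off"]:
            state[device] = "off"
        else:
            raise ValueError(f"unsupported op: {op!r}")
    return state


final = simulate(
    {"lamp": "off", "fan": "on"},
    ["toggle lamp", "turn off fan", "toggle lamp"],
)
print(final["lamp"])  # off
```

A simulator like this can serve as the ground truth when scoring the model's final answer on held-out prompts.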

Intended use

Agentic control loops: decompose an instruction into ordered steps, track execution state, and emit structured action plans, entirely offline. Best used as the planning and state-tracking layer of an agent, with arithmetic and data lookups delegated to deterministic tools.
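The delegation pattern described above might look like the following sketch, in which a plan step that needs arithmetic is routed to a small deterministic evaluator instead of the model. The plan-step shape (`{"tool": ..., "input": ...}`) and the names `eval_arithmetic` and `run_step` are hypothetical; the card does not document the model's actual JSON plan schema.

```python
import ast
import operator

# Whitelisted binary operators; anything else is rejected.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def eval_arithmetic(expr):
    """Evaluate +, -, *, / over numeric literals without calling eval()."""
    def _eval(node):
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -_eval(node.operand)
        raise ValueError("unsupported expression")
    return _eval(ast.parse(expr, mode="eval").body)


def run_step(step):
    """Dispatch one plan step (hypothetical shape) to a deterministic tool."""
    if step["tool"] == "calc":
        return eval_arithmetic(step["input"])
    raise ValueError(f"unknown tool: {step['tool']!r}")


print(run_step({"tool": "calc", "input": "12 * (7 + 5) - 4"}))  # 140
```

Walking the syntax tree with an operator whitelist keeps the tool deterministic and avoids executing arbitrary code produced by the model.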

Usage

import sentencepiece as spm
import torch

# Load the SAM-G SentencePiece tokenizer.
sp = spm.SentencePieceProcessor()
sp.Load("samg_tokenizer.model")

prompt = "states: lamp=off, fan=on. ops: toggle lamp, turn off fan, toggle lamp. final state of lamp? [CHAT]"
ids = torch.tensor([sp.EncodeAsIds(prompt)])  # shape: (1, sequence_length)
# greedy-decode -> "step 1: ... step 2: ... step 3: ... Answer: off"
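Once a trace has been decoded, it can be split into steps and a final answer. The sketch below assumes only the `step N: ... Answer: X` layout shown above. The step contents in the sample string are invented for illustration, and scoring by comparing the final answer string is one plausible reading of exact-match, not a definition taken from the card. `parse_trace` and `exact_match` are hypothetical helper names.

```python
import re

# A step body runs until the next "step N:", the "Answer:" marker, or end of text.
STEP_RE = re.compile(
    r"step\s+(\d+):\s*(.*?)(?=\s*step\s+\d+:|\s*Answer:|$)", re.S
)
ANSWER_RE = re.compile(r"Answer:\s*(.+?)\s*$", re.S)


def parse_trace(text):
    """Return (list of step bodies, final answer or None)."""
    steps = [body.strip() for _, body in STEP_RE.findall(text)]
    match = ANSWER_RE.search(text)
    answer = match.group(1).strip() if match else None
    return steps, answer


def exact_match(text, gold):
    """True when the parsed final answer equals the gold string exactly."""
    _, answer = parse_trace(text)
    return answer is not None and answer == gold


# Invented sample trace; real model output may be worded differently.
sample = (
    "step 1: toggle lamp -> on step 2: turn off fan -> off "
    "step 3: toggle lamp -> off Answer: off"
)
steps, answer = parse_trace(sample)
print(len(steps), answer)  # 3 off
```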

Limitations

  • Calculation-heavy families (arithmetic, sorting, word problems) plateau at 42–50%; do not use the model for exact math, delegate to tools instead.
  • Reasoning traces are synthetic. Evaluation prompts are drawn from the same task families as the training data, generated with a disjoint seed, so the scores do not measure out-of-distribution generalization.
  • Not a general assistant; inherits the base model's knowledge limits.

Citation

@misc{samgreasoning2026,
  title  = {SAM-G-Reasoning: Procedural Multi-Step Reasoning at 30M Parameters},
  author = {AMEFORGE Lab},
  year   = {2026}
}
