Quark-50m-Instruct

Quark-50m-Instruct is a small (≈56M parameters) decoder-only language model fine-tuned for instruction following. It uses the same architecture as the SmolLM family and was pretrained on 5 billion tokens from HuggingFaceTB/smollm-corpus.

Model type: Causal Language Model (LLaMA‑style decoder)
Architecture: GQA · SwiGLU · RMSNorm · RoPE · Weight‑tying
Pretraining tokens: 5 B
Fine‑tuning: Instruction‑tuned (details below)
Creators: OvercastLab (research & development lab for ML/AI)
Release date: 22 April 2026

Model Summary

Quark-50m-Instruct is designed as an efficient assistant that runs on consumer GPUs (e.g., an RTX 3070 with 8 GB VRAM) and even on CPU for light workloads. It is not competitive with large models on knowledge‑intensive tasks, but it is well suited to:

Simple conversational tasks
Code generation and explanation (Python)
Short text rewriting and summarisation
On‑device / edge inference
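
To put the on-device claim in rough numbers, the sketch below estimates the memory needed for the weights and for the key/value cache at full context. The inputs (56.7M parameters, BF16, 24 layers, 2 KV heads, hidden size 384 with 6 query heads, 2,048-token context) are the figures quoted elsewhere in this card. The head size of 64 is an assumption (hidden size divided by query heads), and the estimate ignores activations and framework overhead, so treat it as a lower bound rather than a measurement.

```python
# Rough inference-memory estimate built only from figures quoted in this card.
# Assumptions: head size = hidden size / query heads; activations and
# framework overhead are ignored.

PARAMS = 56.7e6        # parameter count from the Safetensors metadata
BYTES_PER_VALUE = 2    # BF16 stores each value in 2 bytes

LAYERS = 24
KV_HEADS = 2
HEAD_DIM = 384 // 6    # assumed head size: 64
CONTEXT = 2048         # maximum context length in tokens


def weight_bytes(params: float, bytes_per_value: int) -> float:
    """Memory taken by the model weights."""
    return params * bytes_per_value


def kv_cache_bytes(layers: int, kv_heads: int, head_dim: int,
                   tokens: int, bytes_per_value: int) -> int:
    """Memory taken by cached keys and values for one sequence."""
    # The leading factor 2 covers keys and values.
    return 2 * layers * kv_heads * head_dim * tokens * bytes_per_value


weights_mb = weight_bytes(PARAMS, BYTES_PER_VALUE) / 1e6
cache_mib = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, CONTEXT, BYTES_PER_VALUE) / 2**20

print(f"weights: {weights_mb:.1f} MB")
print(f"KV cache at {CONTEXT} tokens: {cache_mib:.1f} MiB")
```

Under these assumptions the weights take about 113 MB and the cache about 24 MiB for a single 2,048-token sequence. Using only 2 KV heads instead of 6 keeps the cache at a third of what full multi-head attention would need.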

The architecture closely follows the efficient‑small‑LM blueprint popularised by SmolLM:

Component	Details
Vocab size	49,152
Hidden size	384
Layers	24
Attention	Grouped Query (6 Q heads, 2 KV heads)
FFN	SwiGLU with 1,024 intermediate
Position	RoPE (θ = 10,000)
Normalisation	RMSNorm (pre‑block)

Total trainable parameters: ≈56 M (with weight tying).
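
As a sanity check, the parameter count can be derived from the table above. The sketch below is an estimate under stated assumptions, not the authors' accounting: no bias terms, head size equal to hidden size divided by the number of query heads, two RMSNorm weight vectors per layer plus one final norm, and input/output embeddings shared when weights are tied. The names `Config` and `count_parameters` are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class Config:
    """Values copied from the architecture table."""
    vocab_size: int = 49_152
    hidden_size: int = 384
    num_layers: int = 24
    num_q_heads: int = 6
    num_kv_heads: int = 2
    intermediate_size: int = 1_024


def count_parameters(cfg: Config, tie_embeddings: bool = True) -> int:
    """Estimate parameters of a LLaMA-style decoder (assumes no biases)."""
    head_dim = cfg.hidden_size // cfg.num_q_heads   # 64
    kv_dim = cfg.num_kv_heads * head_dim            # 128

    embed = cfg.vocab_size * cfg.hidden_size
    # Query and output projections are hidden x hidden;
    # key and value projections are hidden x kv_dim (grouped-query attention).
    attn = 2 * cfg.hidden_size * cfg.hidden_size + 2 * cfg.hidden_size * kv_dim
    # SwiGLU uses three matrices: gate, up, and down.
    ffn = 3 * cfg.hidden_size * cfg.intermediate_size
    norms = 2 * cfg.hidden_size                     # pre-attention and pre-FFN

    total = embed + cfg.num_layers * (attn + ffn + norms) + cfg.hidden_size
    if not tie_embeddings:
        total += embed                              # separate output head
    return total


tied = count_parameters(Config())
untied = count_parameters(Config(), tie_embeddings=False)
print(f"tied:   {tied:,}")    # 56,641,920
print(f"untied: {untied:,}")  # 75,516,288
```

With tied embeddings the estimate is about 56.6M, in line with the ≈56M quoted in the introduction. The embedding matrix alone accounts for roughly a third of that total, which is why weight tying matters at this scale.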

Benchmark Evaluation Metrics

Category	Benchmark	Metric	Score / Value	Status
Linguistics & Grammar	BLiMP	Accuracy	68.12%	Success
Commonsense & Reasoning	PIQA	Normalized Accuracy	57.83%	Success
	COPA	Accuracy	57.00%	Success
	BoolQ	Accuracy	52.17%	Success
	WinoGrande	Accuracy	47.36%	Success
	HellaSwag	Normalized Accuracy	28.49%	Success
	RACE	Accuracy	26.41%	Success
	CommonsenseQA	Accuracy	20.31%	Success
Academic & Knowledge	SciQ	Normalized Accuracy	49.00%	Success
	ARC-Easy	Normalized Accuracy	36.49%	Success
	MMLU	Accuracy	25.64%	Success
	ARC-Challenge	Normalized Accuracy	25.17%	Success
	OpenBookQA	Normalized Accuracy	25.40%	Success
Language Modeling	LAMBADA	Accuracy	15.87%	Success
	WikiText-2	Word Perplexity	251.76	Success

Note: Two further tasks produced no score. Arithmetic failed because its task script (arithmetic.py) is outdated, and SocialIQA failed because of a registration tag error (siqa). The other 15 tasks ran to completion and are reported in the table above.
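
The three metric types in the table can be sketched as follows. This card does not state which evaluation harness or exact definitions were used, so the code assumes common conventions: accuracy picks the answer choice with the highest log-likelihood, normalized accuracy divides each log-likelihood by the choice's length in characters before comparing, and word perplexity is the exponential of the negative log-likelihood per word. The function names and toy numbers are invented for illustration.

```python
import math


def accuracy_pick(logprobs):
    """Index of the choice with the highest total log-likelihood."""
    return max(range(len(logprobs)), key=lambda i: logprobs[i])


def normalized_accuracy_pick(logprobs, choices):
    """Same, after dividing each log-likelihood by the choice length in characters."""
    return max(range(len(logprobs)), key=lambda i: logprobs[i] / len(choices[i]))


def word_perplexity(total_logprob, num_words):
    """exp of the average negative log-likelihood per word."""
    return math.exp(-total_logprob / num_words)


# Toy example (invented numbers): the long answer has a lower total
# log-likelihood but a better per-character score.
choices = ["yes", "it is very likely to rain later today"]
logprobs = [-4.0, -12.0]

raw_choice = accuracy_pick(logprobs)
norm_choice = normalized_accuracy_pick(logprobs, choices)
print(raw_choice, norm_choice)

# A word perplexity of 251.76 corresponds to an average of
# ln(251.76) nats of loss per word.
nats_per_word = math.log(251.76)
print(round(nats_per_word, 2))
```

Length normalization removes the bias towards short answers, which is why the two accuracy variants can disagree on the same item.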

Uses

Direct Use

The model can be used via the 🤗 Transformers library for standard text generation. It expects chat‑formatted input (see example below).

Downstream Use

Because the model is released under the Apache‑2.0 license, you may fine‑tune Quark-50m-Instruct on your own data for domain‑specific tasks, for instance a customer‑support bot, a code reviewer, or a story writer.
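
A first step towards such fine-tuning is usually converting domain data into chat-formatted examples. The sketch below turns question/answer pairs into JSON Lines with a `messages` list per example. That layout is a widely used convention rather than something this card prescribes, so check what your training tool expects. The helper names and the sample support question are hypothetical.

```python
import json

SYSTEM_PROMPT = "You are Quark, a helpful assistant."


def to_chat_example(question: str, answer: str, system: str = SYSTEM_PROMPT) -> dict:
    """Wrap one question/answer pair as a three-turn chat example."""
    return {
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": question.strip()},
            {"role": "assistant", "content": answer.strip()},
        ]
    }


def to_jsonl(pairs) -> str:
    """Serialize (question, answer) pairs as one JSON object per line."""
    return "\n".join(
        json.dumps(to_chat_example(q, a), ensure_ascii=False) for q, a in pairs
    )


# Hypothetical customer-support data.
pairs = [
    ("How do I reset my password? ",
     "Open Settings, choose Security, then select Reset password."),
]

jsonl = to_jsonl(pairs)
print(jsonl)
```

Keeping the same system prompt as in inference keeps the training and deployment formats aligned.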

Limitations

Limited world knowledge (pretraining data ends in mid‑2025).
Short context window (2,048 tokens).
Its small size makes it more prone to factual errors than larger models.
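
One practical way to live with the 2,048-token window is to drop the oldest turns of a conversation before each request. The sketch below is a minimal illustration, not part of the model's tooling: `fit_to_context` is a hypothetical helper, and the whitespace-based counter is a stand-in for the real tokenizer, which would also add a few template tokens per message.

```python
def fit_to_context(messages, count_tokens, max_context=2048, max_new_tokens=128):
    """Drop the oldest non-system messages until the prompt fits the window.

    count_tokens is any callable mapping a string to a token count.
    Space for max_new_tokens is reserved for the model's reply.
    """
    budget = max_context - max_new_tokens
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]

    def total(msgs):
        return sum(count_tokens(m["content"]) for m in msgs)

    # Always keep the latest message, even if it alone exceeds the budget.
    while len(rest) > 1 and total(system + rest) > budget:
        rest.pop(0)
    return system + rest


def rough_count(text):
    """Stand-in token counter (assumption): one token per whitespace-separated word."""
    return len(text.split())


history = [
    {"role": "system", "content": "You are Quark, a helpful assistant."},
    {"role": "user", "content": "a " * 1600},
    {"role": "assistant", "content": "b " * 400},
    {"role": "user", "content": "What is RoPE?"},
]

trimmed = fit_to_context(history, rough_count)
print([m["role"] for m in trimmed])
```

In this example the 1,600-word first turn pushes the prompt past the 1,920-token budget, so it is dropped while the system prompt and the two most recent messages are kept.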

How to Get Started

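# Requires transformers, torch, and accelerate (needed for device_map="auto").
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "ThingAI/Quark-50m-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

messages = [
    {"role": "system", "content": "You are Quark, a helpful assistant."},
    {"role": "user", "content": "Explain grouped-query attention in one sentence."},
]

# Render the conversation with the model's chat template and tokenize it.
# return_dict=True returns input_ids together with the attention mask.
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=128)

# Decode only the newly generated tokens, skipping the prompt.
new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))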

Model size: 56.7M params (Safetensors)
Tensor type: BF16
