KnowForge Encoder
A tiny text classifier (131,888 parameters) trained from scratch on the KnowForge dataset.
Given a natural-language input prompt, it predicts:
- transform_type: which reasoning operation is required
- answer_type: what kind of answer to expect
This model is a fast routing component, not a generative model. It is designed to run in milliseconds on CPU, making it suitable for pre-filtering or routing in a KnowForge inference pipeline.
Quick Start
pip install -r requirements.txt
python inference.py "A is taller than B. B is taller than C. Is A taller than C?"
# Transform: relation_to_graph (99.12%)
# Answer type: exact_answer (87.34%)
from inference import predict
result = predict("A is taller than B. B is taller than C. Is A taller than C?")
print(result["transform_type"]) # "relation_to_graph"
print(result["transform_confidence"]) # 0.9912
print(result["answer_type"]) # "exact_answer"
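Because the classifier is meant for routing, a caller might gate on the reported confidence before trusting the predicted transform. The sketch below assumes only the result keys shown above (`transform_type`, `transform_confidence`). The `route` helper, the 0.9 threshold, and the `"fallback"` label are hypothetical and not part of the KnowForge code.

```python
def route(result: dict, threshold: float = 0.9) -> str:
    """Return the predicted transform, or "fallback" when confidence is low.

    `result` is assumed to have the shape returned by `predict` above.
    The threshold and the "fallback" label are illustrative choices.
    """
    if result["transform_confidence"] >= threshold:
        return result["transform_type"]
    return "fallback"


# Hand-built dict with the same keys as the Quick Start output.
example = {
    "transform_type": "relation_to_graph",
    "transform_confidence": 0.9912,
    "answer_type": "exact_answer",
}
print(route(example))  # relation_to_graph
```

A suitable threshold depends on how costly a misrouted query is in the surrounding pipeline, so it would need tuning on held-out data.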
What It Classifies
Transform types (3 classes)
| Class | Meaning |
|---|---|
| linear_to_cyclic | Modular arithmetic in cyclic domains (clocks, calendars, wrap-around) |
| relation_to_graph | Transitive relation query over a directed entity graph |
| relation_property_check | Structural property check on a declared relation system |
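To make the first two classes concrete, here is a minimal sketch of the kind of computation a downstream solver might perform for each. This is not what the classifier does (it only predicts the label), and the function names and example data are hypothetical.

```python
from collections import deque


def clock_add(start_hour: int, delta_hours: int, modulus: int = 12) -> int:
    """linear_to_cyclic: add hours on a clock face numbered 1..modulus."""
    return (start_hour - 1 + delta_hours) % modulus + 1


def is_reachable(edges: list[tuple[str, str]], src: str, dst: str) -> bool:
    """relation_to_graph: is dst reachable from src via one or more edges?"""
    graph: dict[str, list[str]] = {}
    for a, b in edges:
        graph.setdefault(a, []).append(b)
    seen, queue = {src}, deque([src])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt == dst:
                return True
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


print(clock_add(9, 5))  # 2  (9 o'clock plus 5 hours)
# "A is taller than B. B is taller than C. Is A taller than C?"
print(is_reachable([("A", "B"), ("B", "C")], "A", "C"))  # True
```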
Answer types (4 classes)
| Class | Meaning |
|---|---|
| exact_answer | A single definite value follows from the rules |
| conditional_answer | Answer depends on an unstated condition |
| need_more_rule | Insufficient rules to determine the answer |
| unresolvable_without_observation | Answer requires empirical observation not in the rules |
Architecture
Conv1d text classifier trained entirely from scratch, with no pretrained embeddings.
| Component | Detail |
|---|---|
| Embedding | 808 × 64 (word-level, learned) |
| Encoder | 2 × Conv1d(kernel=3) + ReLU, output dim 128 |
| Pooling | Global max pooling over sequence |
| Heads | transform (3), answer_type (4), plus auxiliary heads |
| Parameters | 131,888 |
| Training time | ~25 min on CPU |
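As a rough sanity check on the parameter count, the listed components can be tallied by hand. The sketch below makes several assumptions that the table does not state: the convolutions map 64 to 128 and then 128 to 128 channels, every layer has a bias, and each head is a single linear layer on the 128-dimensional pooled vector. The auxiliary heads are not described, so they are left as a remainder.

```python
def conv1d_params(in_ch: int, out_ch: int, kernel: int) -> int:
    """Weights plus one bias per output channel."""
    return in_ch * out_ch * kernel + out_ch


def linear_params(in_dim: int, out_dim: int) -> int:
    """Weights plus one bias per output unit."""
    return in_dim * out_dim + out_dim


embedding = 808 * 64                   # from the table
conv1 = conv1d_params(64, 128, 3)      # ASSUMPTION: 64 -> 128 channels
conv2 = conv1d_params(128, 128, 3)     # ASSUMPTION: 128 -> 128 channels
transform_head = linear_params(128, 3)
answer_head = linear_params(128, 4)

listed = embedding + conv1 + conv2 + transform_head + answer_head
print(listed)            # 126599
print(131_888 - listed)  # 5289, attributable to auxiliary heads under these assumptions
```

Under these assumptions the embedding table alone accounts for about 39% of the parameters, which is typical for small word-level models.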
Performance
Evaluated after 28 epochs (best checkpoint by dev loss); train scores are shown for comparison:
| Metric | Score |
|---|---|
| transform_acc (dev) | 99.55% |
| atype_acc (dev) | 99.19% |
| transform_acc (train) | 99.66% |
| atype_acc (train) | 99.37% |
Transform accuracy on the full test pipeline evaluation: 99.64%.
Limitations
- Vocabulary size 808, trained on KnowForge synthetic text only. Out-of-domain vocabulary falls back to `<UNK>`. Accuracy degrades on phrasings that differ markedly from the training data.
- No context. The model sees only the raw input text, not the rule structure. It classifies by surface patterns learned from training data.
- Not a reasoning model. This classifier routes queries; it does not solve them. Use KnowForge-0.6B for full answer generation.
- Synthetic distribution only. Tested exclusively on procedurally generated KnowForge examples. Behaviour on real-world inputs is not evaluated.
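The `<UNK>` fallback mentioned above can be illustrated with a toy word-level encoder. This is a sketch only: the vocabulary here is a hypothetical six-entry stand-in for the real 808-word one, and lowercasing plus whitespace splitting is an assumed tokenization, not necessarily what `inference.py` does. The `unk_rate` helper is also hypothetical; it shows one way a caller might flag inputs that are likely out of domain.

```python
UNK = "<UNK>"

# Hypothetical toy vocabulary; the model's actual vocabulary is not shown here.
vocab = {UNK: 0, "a": 1, "is": 2, "taller": 3, "than": 4, "b": 5}


def encode(text: str) -> list[int]:
    """Lowercase, split on whitespace, and map unknown words to <UNK>."""
    return [vocab.get(tok, vocab[UNK]) for tok in text.lower().split()]


def unk_rate(text: str) -> float:
    """Fraction of tokens that fell back to <UNK> (0.0 for empty input)."""
    ids = encode(text)
    return sum(i == vocab[UNK] for i in ids) / len(ids) if ids else 0.0


print(encode("A is taller than B"))     # [1, 2, 3, 4, 5]
print(unk_rate("A is heavier than B"))  # 0.2
```

A high `<UNK>` rate suggests the classifier is seeing mostly placeholder tokens, in which case its prediction deserves less trust.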