Experimenting-1
Small language model (9.7M parameters) trained from scratch.
Architecture
| Property | Value |
|---|---|
| Layers | 11 |
| Hidden size | 256 |
| Intermediate size | 640 |
| Attention heads | 4 query heads (GQA, 2 key/value heads) |
| Max sequence length | 1024 |
| Vocab size | 8192 |
| Tied embeddings | True |
| Total parameters | 9.672M |
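
The total in the table can be cross-checked from the other rows. The sketch below assumes a Llama-style decoder: bias-free attention and MLP projections, a gated MLP with three weight matrices, and two norm weight vectors per layer plus one final norm. The card does not name the architecture family, so these are assumptions, and `count_params` is a hypothetical helper, not part of any library.

```python
def count_params(layers, hidden, intermediate, heads, kv_heads, vocab, tied=True):
    """Estimate parameter count for an assumed Llama-style decoder."""
    head_dim = hidden // heads            # 256 / 4 = 64
    kv_dim = kv_heads * head_dim          # GQA: K and V use fewer heads than Q

    embed = vocab * hidden                # token embedding matrix
    attn = 2 * hidden * hidden            # q_proj and o_proj (assumed bias-free)
    attn += 2 * hidden * kv_dim           # k_proj and v_proj (assumed bias-free)
    mlp = 3 * hidden * intermediate       # gate, up, down (assumed gated MLP)
    norms = 2 * hidden                    # two norm weight vectors per layer

    total = embed + layers * (attn + mlp + norms) + hidden  # + final norm
    if not tied:
        total += vocab * hidden           # separate output head when untied
    return total


total = count_params(layers=11, hidden=256, intermediate=640,
                     heads=4, kv_heads=2, vocab=8192, tied=True)
print(f"{total:,} parameters ({total / 1e6:.3f}M)")  # 9,672,448 parameters (9.672M)
```

Under these assumptions the estimate matches the 9.672M in the table, with the tied embedding matrix (8192 × 256 ≈ 2.1M) accounting for roughly a fifth of the total.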
Training
- Tokens seen: 41,902,080
- Validation loss: inf
- Validation perplexity (PPL): inf
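
To put these numbers in context, here is a small sketch assuming the usual definition of perplexity as the exponential of the mean per-token cross-entropy loss in nats. The card does not say how the values were computed or why the loss is infinite; the sketch only shows that an infinite loss implies an infinite perplexity, and does some arithmetic on the token count. `perplexity` is a hypothetical helper.

```python
import math


def perplexity(mean_loss):
    """Perplexity as exp(mean per-token loss in nats); inf if it overflows."""
    try:
        return math.exp(mean_loss)
    except OverflowError:
        return math.inf


tokens_seen = 41_902_080
max_seq_len = 1024
total_params = 9.672e6

print(perplexity(float("inf")))               # inf
print(tokens_seen / max_seq_len)              # 40920.0
print(round(tokens_seen / total_params, 2))   # 4.33
```

The token count is an exact multiple of the maximum sequence length (40,920 sequences of 1024 tokens), and it works out to roughly 4.3 training tokens per parameter. Whether the data was actually packed into full-length sequences is not stated in the card.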
Usage
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("GODELEV/Experimenting-1")
model = AutoModelForCausalLM.from_pretrained("GODELEV/Experimenting-1")
inputs = tokenizer("Hello", return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(output[0], skip_special_tokens=True))