Instructions for using ControlLLM/Control-LLM-Llama3.1-8B-Math16-Instruct with libraries, inference providers, notebooks, and local apps. Follow the sections below to get started.
- Libraries
- Transformers
How to use ControlLLM/Control-LLM-Llama3.1-8B-Math16-Instruct with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="ControlLLM/Control-LLM-Llama3.1-8B-Math16-Instruct")
```

```python
# Load model directly (use the causal-LM class, since this is a text-generation model)
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("ControlLLM/Control-LLM-Llama3.1-8B-Math16-Instruct", dtype="auto")
```

- Notebooks
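For generation with this instruct model, the pipeline also accepts chat-style message lists and applies the Llama 3.1 chat template automatically. A minimal sketch (the prompt content is illustrative, and the download/generation calls are commented out because they fetch the full 8B-parameter weights):

```python
# Chat-style input for the instruct model; the text-generation pipeline
# applies the model's chat template to a list of role/content messages.
messages = [
    {"role": "system", "content": "You are a careful math assistant."},
    {"role": "user", "content": "What is 17 * 24? Show your steps."},
]

# Requires the model weights locally (large download), so commented out here:
# from transformers import pipeline
# pipe = pipeline(
#     "text-generation",
#     model="ControlLLM/Control-LLM-Llama3.1-8B-Math16-Instruct",
# )
# out = pipe(messages, max_new_tokens=256)
# print(out[0]["generated_text"][-1]["content"])
```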
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use ControlLLM/Control-LLM-Llama3.1-8B-Math16-Instruct with vLLM:
Install from pip and serve the model:
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "ControlLLM/Control-LLM-Llama3.1-8B-Math16-Instruct"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "ControlLLM/Control-LLM-Llama3.1-8B-Math16-Instruct",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```

Use Docker
```shell
docker model run hf.co/ControlLLM/Control-LLM-Llama3.1-8B-Math16-Instruct
```
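Beyond curl, the served endpoint can be called from Python with nothing but the standard library, since the server speaks the OpenAI completions protocol. A sketch (the request itself is commented out because it needs the vLLM server from the step above running on localhost:8000):

```python
import json
import urllib.request

# Same payload as the curl example above.
payload = {
    "model": "ControlLLM/Control-LLM-Llama3.1-8B-Math16-Instruct",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5,
}

# Requires a running vLLM server on localhost:8000, so commented out here:
# req = urllib.request.Request(
#     "http://localhost:8000/v1/completions",
#     data=json.dumps(payload).encode("utf-8"),
#     headers={"Content-Type": "application/json"},
# )
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["choices"][0]["text"])
```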
- SGLang
How to use ControlLLM/Control-LLM-Llama3.1-8B-Math16-Instruct with SGLang:
Install from pip and serve the model:
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "ControlLLM/Control-LLM-Llama3.1-8B-Math16-Instruct" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "ControlLLM/Control-LLM-Llama3.1-8B-Math16-Instruct",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```

Use Docker images
```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "ControlLLM/Control-LLM-Llama3.1-8B-Math16-Instruct" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "ControlLLM/Control-LLM-Llama3.1-8B-Math16-Instruct",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```

- Docker Model Runner
How to use ControlLLM/Control-LLM-Llama3.1-8B-Math16-Instruct with Docker Model Runner:
```shell
docker model run hf.co/ControlLLM/Control-LLM-Llama3.1-8B-Math16-Instruct
```
Control-LLM-Llama3.1-8B-Math16
This model is a fine-tune of Llama-3.1-8B-Instruct for mathematical tasks, trained on the OpenMath2 dataset.
Linked Paper
This model is associated with the paper: Control-LLM.
Linked Open-Source Code - training, evaluation, and benchmarks
This model is associated with the GitHub repository: Control-LLM.
Evaluation Results
Here is an overview of the evaluation results and findings:
Benchmark Results Table
The table below summarizes evaluation results across mathematical tasks and original capabilities.
| Model | MH | M | G8K | M-Avg | ARC | GPQA | MMLU | MMLU-P | O-Avg | Overall |
|---|---|---|---|---|---|---|---|---|---|---|
| Llama3.1-8B-Inst | 23.7 | 50.9 | 85.6 | 52.1 | 83.4 | 29.9 | 72.4 | 46.7 | 60.5 | 56.3 |
| Control LLM* | 36.0 | 61.7 | 89.7 | 62.5 | 82.5 | 30.8 | 71.6 | 45.4 | 57.6 | 60.0 |
Explanation:
- MH: Math Hard
- M: Math
- G8K: GSM8K
- M-Avg: math average across Math Hard, Math, and GSM8K
- ARC: AI2 Reasoning Challenge
- GPQA: Graduate-Level Google-Proof Q&A
- MMLU: Massive Multitask Language Understanding
- MMLU-P: MMLU-Pro
- O-Avg: original-capability average across ARC, GPQA, MMLU, and MMLU-Pro
- Overall: combined average across all tasks
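As a consistency check on the legend, the Control LLM row's averaged columns can be reproduced as plain means of their constituent columns, with Overall the mean of the two averages (an assumed aggregation; the baseline row's published averages do not reduce to plain means of these columns, so it may be aggregated differently):

```python
# Control LLM row from the table above.
math_scores = [36.0, 61.7, 89.7]        # MH, M, G8K
orig_scores = [82.5, 30.8, 71.6, 45.4]  # ARC, GPQA, MMLU, MMLU-Pro

m_avg = sum(math_scores) / len(math_scores)  # ~62.47, table: 62.5
o_avg = sum(orig_scores) / len(orig_scores)  # ~57.58, table: 57.6
overall = (m_avg + o_avg) / 2                # ~60.02, table: 60.0

# Each matches the table to one decimal place.
assert abs(m_avg - 62.5) < 0.05
assert abs(o_avg - 57.6) < 0.05
assert abs(overall - 60.0) < 0.05
```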
Catastrophic Forgetting on OpenMath
The following plot illustrates and compares catastrophic forgetting mitigation during training.
Alignment Result
The plot below highlights the alignment result of the model trained with Control LLM.
Model tree for ControlLLM/Control-LLM-Llama3.1-8B-Math16-Instruct
Base model
meta-llama/Llama-3.1-8B
Dataset used to train ControlLLM/Control-LLM-Llama3.1-8B-Math16-Instruct
Paper for ControlLLM/Control-LLM-Llama3.1-8B-Math16-Instruct
Evaluation results
- exact_match,none on Math, Math Hard, GSM8K (self-reported): 0.621
- exact_match,none (gsm8k_0shot_instruct) on Math, Math Hard, GSM8K (self-reported): 0.897
- exact_match,none (meta_math_0shot_instruct) on Math, Math Hard, GSM8K (self-reported): 0.617
- exact_match,none (meta_math_hard_0shot_instruct) on Math, Math Hard, GSM8K (self-reported): 0.360
- exact_match,strict-match on Llama-3.1-8B-Instruct-evals dataset (self-reported): 0.600
- exact_match,strict-match (meta_arc_0shot_instruct) on Llama-3.1-8B-Instruct-evals dataset (self-reported): 0.825
- exact_match,strict-match (meta_gpqa_0shot_cot_instruct) on Llama-3.1-8B-Instruct-evals dataset (self-reported): 0.308
- exact_match,strict-match (meta_mmlu_0shot_instruct) on Llama-3.1-8B-Instruct-evals dataset (self-reported): 0.716
- exact_match,strict-match (meta_mmlu_pro_5shot_instruct) on Llama-3.1-8B-Instruct-evals dataset (self-reported): 0.454

