Instructions to use NxcodeOfficial/NxCode-SafeCoder-30B with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use NxcodeOfficial/NxCode-SafeCoder-30B with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="NxcodeOfficial/NxCode-SafeCoder-30B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("NxcodeOfficial/NxCode-SafeCoder-30B")
model = AutoModelForCausalLM.from_pretrained("NxcodeOfficial/NxCode-SafeCoder-30B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
	messages,
	add_generation_prompt=True,
	tokenize=True,
	return_dict=True,
	return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use NxcodeOfficial/NxCode-SafeCoder-30B with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "NxcodeOfficial/NxCode-SafeCoder-30B"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "NxcodeOfficial/NxCode-SafeCoder-30B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
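
The same request can also be sent from Python. The sketch below uses only the standard library and assumes the vLLM server started above is listening on `localhost:8000`. The helper names `build_chat_payload` and `chat` are illustrative and not part of vLLM.

```python
import json
import urllib.request

MODEL_ID = "NxcodeOfficial/NxCode-SafeCoder-30B"


def build_chat_payload(user_message, model=MODEL_ID):
    """Build the JSON body for an OpenAI-compatible /v1/chat/completions call."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }


def chat(user_message, base_url="http://localhost:8000"):
    """POST one user message and return the assistant's reply text."""
    body = json.dumps(build_chat_payload(user_message)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        reply = json.load(response)
    # OpenAI-compatible servers return the text under choices[0].message.content
    return reply["choices"][0]["message"]["content"]
```

Calling `chat("What is the capital of France?")` requires the server to be running. For a server on another port, pass a different `base_url`, for example `http://localhost:30000`.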

Use Docker

docker model run hf.co/NxcodeOfficial/NxCode-SafeCoder-30B

SGLang

How to use NxcodeOfficial/NxCode-SafeCoder-30B with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "NxcodeOfficial/NxCode-SafeCoder-30B" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "NxcodeOfficial/NxCode-SafeCoder-30B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "NxcodeOfficial/NxCode-SafeCoder-30B" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "NxcodeOfficial/NxCode-SafeCoder-30B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use NxcodeOfficial/NxCode-SafeCoder-30B with Docker Model Runner:
```
docker model run hf.co/NxcodeOfficial/NxCode-SafeCoder-30B
```

NxCode-SafeCoder-30B / README.md

alex-nxcode

Create README.md

56ab9c7 verified 5 months ago


	---
	license: apache-2.0
	language:
	- en
	- code
	tags:
	- security
	- code-repair
	- oss-bench
	- moe
	- chain-of-thought
	- nlp
	- c
	- cpp
	- php
	library_name: transformers
	pipeline_tag: text-generation
	---

	<div align="center">

	# 🛡️ NxCode-SafeCoder-30B

	The Next-Generation Mixture-of-Experts Model for Secure Code Intelligence

	[![License](https://img.shields.io/badge/License-Apache_2.0-green.svg)](https://opensource.org/licenses/Apache-2.0)
	[![Model Architecture](https://img.shields.io/badge/Architecture-MoE-blue.svg)](https://huggingface.co/docs/transformers/index)
	[![Task](https://img.shields.io/badge/Task-Vulnerability_Patching-red.svg)]()

	</div>

	---

	## 📖 Model Overview

	NxCode-SafeCoder-30B is a state-of-the-art code generation model engineered specifically for software security auditing and automated vulnerability remediation.

	Built upon a highly efficient Mixture-of-Experts (MoE) architecture, it delivers the knowledge density of a 30B parameter model while maintaining the inference latency of a much smaller model (only ~3B active parameters per token).
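
As a rough illustration of that trade-off, the sketch below works through the arithmetic. It assumes exactly 30B total and 3B active parameters, bfloat16 weights (2 bytes per parameter), and the common rule of thumb of about 2 FLOPs per active parameter per generated token. Real figures depend on the architecture and exclude the KV cache and activations.

```python
TOTAL_PARAMS = 30e9      # total parameters (from the model name; assumed exact)
ACTIVE_PARAMS = 3e9      # parameters used per token (the "~3B" above; assumed exact)
BYTES_PER_PARAM = 2      # bfloat16 stores each weight in 2 bytes

# Fraction of the network that runs for each token
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

# All experts must be resident in memory, so weight storage follows the total count
weight_memory_gb = TOTAL_PARAMS * BYTES_PER_PARAM / 1e9

# Rule of thumb: about 2 FLOPs per active parameter per token (an approximation)
moe_gflops_per_token = 2 * ACTIVE_PARAMS / 1e9
dense_gflops_per_token = 2 * TOTAL_PARAMS / 1e9

print(f"active fraction: {active_fraction:.0%}")
print(f"bf16 weights: {weight_memory_gb:.0f} GB")
print(f"compute per token: {moe_gflops_per_token:.0f} vs {dense_gflops_per_token:.0f} GFLOPs")
```

Under these assumptions, compute per token is about a tenth of an equally sized dense model, while memory must still hold all 30B weights (about 60 GB in bfloat16).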

	Unlike general-purpose coding assistants, NxCode-SafeCoder is aligned using a Security-First Chain-of-Thought (CoT) methodology. It effectively mimics the workflow of a senior security researcher: Analyze -> Reason -> Fix.

	## ✨ Key Capabilities

	* 🛡️ Surgical Vulnerability Patching: Excels at fixing complex memory safety issues (Buffer Overflows, Use-After-Free, Double Free) in C/C++ and PHP.
	* 🧠 Dual-Phase Generation: The model is trained to output a detailed Security Analysis (`### Analysis`) before generating the Fixed Code, ensuring the fix is logically sound and side-effect free.
	* ⚡ High-Throughput Inference: Fully optimized for vLLM, achieving >600 tokens/s on NVIDIA A100 GPUs, making it suitable for large-scale codebase scanning.
	* 📉 Minimal False Positives: Triggers far fewer sanitizer alerts than GPT-4o and Llama-3-70B in fuzzing benchmarks (OSS-Bench).
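
Because the output is split into an analysis and a fix, downstream tooling usually needs to separate the two. The following is a minimal sketch, assuming the response contains a `### Analysis` heading followed later by one fenced code block holding the fixed function. The exact output layout is not documented here, `split_response` is a hypothetical helper, and the sample text (including its `### Fixed Code` heading) is made up for illustration.

```python
import re

FENCE = "`" * 3  # triple backtick, built indirectly so this example can sit inside a fenced block

# Matches the first fenced block: optional language tag, then the body up to the closing fence
CODE_BLOCK = re.compile(FENCE + r"[\w+-]*\n(.*?)" + FENCE, re.DOTALL)


def split_response(text):
    """Return (analysis, fixed_code); fixed_code is None when no fenced block is found."""
    match = CODE_BLOCK.search(text)
    if match is None:
        return text.strip(), None
    analysis = text[: match.start()]
    # Drop the assumed "### Analysis" heading and any heading that introduces the code
    lines = [line for line in analysis.splitlines() if not line.startswith("###")]
    return "\n".join(lines).strip(), match.group(1).strip()


# Made-up sample response, used only to exercise the helper
sample = (
    "### Analysis\n"
    "The length is never validated before use.\n\n"
    "### Fixed Code\n"
    + FENCE + "c\n"
    "if (len == 0) return NULL;\n"
    + FENCE + "\n"
)
analysis, fixed_code = split_response(sample)
print(analysis)
print(fixed_code)
```

Returning `None` for the code makes it easy to flag responses where the model produced only an analysis, so they can be retried or reviewed by hand.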

	## 📊 Performance

	Evaluated with the OSS-Bench framework (random split; php-src and SQLite targets).

	| Model | Architecture | Compilation Rate | Test Pass Rate | Sanitizer Alerts (Lower is Better) |
	| :--- | :--- | :---: | :---: | :---: |
	| NxCode-SafeCoder-30B | MoE (30B) | High | High | Lowest 🏆 |
	| GPT-4o | Dense | High | High | Medium |
	| Llama-3-70B-Instruct | Dense | Medium | Medium | High |
	| DeepSeek-Coder-33B | Dense | High | Medium | Medium |

	> Note: While general-purpose models often generate code that compiles, they frequently miss subtle boundary checks or introduce new logic errors. NxCode-SafeCoder prioritizes memory safety above all else.

	## 💻 Usage

	### 1. Using Transformers

	````python
	from transformers import AutoModelForCausalLM, AutoTokenizer
	import torch

	model_name = "NxcodeOfficial/NxCode-SafeCoder-30B"

	# Load in bfloat16 with Flash Attention 2 for best performance
	# (requires the flash-attn package and a supported GPU)
	tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
	model = AutoModelForCausalLM.from_pretrained(
	    model_name,
	    device_map="auto",
	    trust_remote_code=True,
	    torch_dtype=torch.bfloat16,
	    attn_implementation="flash_attention_2",
	)

	# Standard Security Prompt Template
	# Note: The model expects the function to be wrapped in a C code block
	prompt = """You are a Linux Kernel security expert. Fix the vulnerabilities in the following C function.

	Repository: linux
	File: mm/mmap.c

	Function:
	```c
	void *simple_mmap(void *addr, size_t len) {
	    // Vulnerable: No checks
	    return mmap(addr, len, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	}
	```
	"""

	messages = [{"role": "user", "content": prompt}]
	# add_generation_prompt=True appends the assistant turn marker so the model starts its reply
	inputs = tokenizer.apply_chat_template(
	    messages,
	    add_generation_prompt=True,
	    return_dict=True,
	    return_tensors="pt",
	).to(model.device)

	# Generate (the model will output the Analysis first, then the Code).
	# do_sample=True is required for temperature to take effect.
	outputs = model.generate(**inputs, max_new_tokens=1024, do_sample=True, temperature=0.2)
	# Decode only the newly generated tokens, not the prompt
	print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
	````
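
To apply the same template to other functions, the prompt can be assembled programmatically. This is a sketch that mirrors the template shown above. `build_security_prompt` and its parameters are hypothetical names, and the role line is kept exactly as in the example.

```python
FENCE = "`" * 3  # triple backtick, built indirectly so this example can sit inside a fenced block


def build_security_prompt(repository, file_path, function_source):
    """Fill the security prompt template with one C function."""
    return (
        "You are a Linux Kernel security expert. "
        "Fix the vulnerabilities in the following C function.\n\n"
        f"Repository: {repository}\n"
        f"File: {file_path}\n\n"
        "Function:\n"
        f"{FENCE}c\n"
        f"{function_source.strip()}\n"
        f"{FENCE}\n"
    )


prompt = build_security_prompt(
    repository="linux",
    file_path="mm/mmap.c",
    function_source="void *simple_mmap(void *addr, size_t len) { /* ... */ }",
)
messages = [{"role": "user", "content": prompt}]
print(prompt)
```

Keeping the template in one function means every request wraps the source in a C code block, which the note in the example says the model expects.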

	### 2. Using vLLM (Production Recommended)

	For maximum throughput (e.g., scanning entire repositories), use vLLM.

	```python
	from vllm import LLM, SamplingParams

	llm = LLM(
	    model="NxcodeOfficial/NxCode-SafeCoder-30B",
	    trust_remote_code=True,
	    tensor_parallel_size=1,  # Fits on a single A100 80GB
	    gpu_memory_utilization=0.95,
	    max_model_len=8192,  # Recommended limit to avoid OOM
	)

	# ... (inference code)
	```
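
The elided inference step could look like the sketch below. It assumes a recent vLLM release in which `LLM.chat` accepts a list of conversations and returns one result per conversation, with the generated text under `outputs[0].text`. `generate_fixes` is a hypothetical helper, not part of vLLM.

```python
def generate_fixes(llm, prompts, sampling_params):
    """Run one chat request per prompt and return the generated texts in order.

    `llm` is expected to behave like a `vllm.LLM` instance (assumption:
    `llm.chat` accepts a batch of conversations).
    """
    conversations = [[{"role": "user", "content": prompt}] for prompt in prompts]
    results = llm.chat(conversations, sampling_params=sampling_params)
    # Each result holds one or more candidate completions; keep the first
    return [result.outputs[0].text for result in results]
```

With the `llm` object created above, a call might be `generate_fixes(llm, prompts, SamplingParams(temperature=0.2, max_tokens=1024))`, where `prompts` is a list of prompt strings in the security template and the sampling values are carried over from the Transformers example rather than tuned for vLLM.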

	## 🔬 Methodology

	The model was fine-tuned on a proprietary dataset containing 10k+ high-quality security patches distilled from advanced reasoning engines. The training process utilized:
	1. Expert Routing Optimization: Tuning the MoE router to specialize specific experts for code analysis vs. code generation.
	2. Conservative Alignment: Reinforcing the preference for safer standard libraries (e.g., `strncpy`, `snprintf`) and explicit null-pointer checks.
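
One way to spot-check a generated patch against this preference is a simple textual scan. The sketch below is a heuristic written for illustration, not part of the training process: the list of unbounded functions and the helper `find_unbounded_calls` are assumptions, and a regex scan (which also matches calls inside comments or strings) cannot replace sanitizers or static analysis.

```python
import re

# Unbounded C string functions mapped to bounded alternatives (illustrative list)
BOUNDED_ALTERNATIVES = {
    "strcpy": "strncpy",
    "strcat": "strncat",
    "sprintf": "snprintf",
    "gets": "fgets",
}

# \b keeps "strcpy(" from matching inside longer identifiers such as "my_strcpy("
CALL_PATTERN = re.compile(r"\b(" + "|".join(BOUNDED_ALTERNATIVES) + r")\s*\(")


def find_unbounded_calls(c_source):
    """Return (line_number, function, suggested_alternative) for each flagged call."""
    findings = []
    for line_number, line in enumerate(c_source.splitlines(), start=1):
        for match in CALL_PATTERN.finditer(line):
            name = match.group(1)
            findings.append((line_number, name, BOUNDED_ALTERNATIVES[name]))
    return findings


# Made-up patch text: line 2 uses strcpy, line 3 already uses snprintf
patch = 'char buf[16];\nstrcpy(buf, src);\nsnprintf(out, sizeof out, "%s", buf);\n'
print(find_unbounded_calls(patch))
```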

	## 📜 Citation

	If you use this model in your research or product, please cite:

	```bibtex
	@misc{nxcode2025safecoder,
	  title = {NxCode-SafeCoder: Automating Secure Code Repair with MoE},
	  author = {NxCode Team},
	  year = {2025},
	  publisher = {Hugging Face},
	  howpublished = {\url{https://huggingface.co/NxcodeOfficial/NxCode-SafeCoder-30B}}
	}
	```

	## ⚖️ License

	Apache 2.0