Instructions to use NxcodeOfficial/NxCode-SafeCoder-30B with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use NxcodeOfficial/NxCode-SafeCoder-30B with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="NxcodeOfficial/NxCode-SafeCoder-30B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("NxcodeOfficial/NxCode-SafeCoder-30B")
model = AutoModelForCausalLM.from_pretrained("NxcodeOfficial/NxCode-SafeCoder-30B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use NxcodeOfficial/NxCode-SafeCoder-30B with vLLM:
Install from pip and serve model
```bash
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "NxcodeOfficial/NxCode-SafeCoder-30B"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "NxcodeOfficial/NxCode-SafeCoder-30B",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
```
Use Docker
```bash
docker model run hf.co/NxcodeOfficial/NxCode-SafeCoder-30B
```
- SGLang
How to use NxcodeOfficial/NxCode-SafeCoder-30B with SGLang:
Install from pip and serve model
```bash
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "NxcodeOfficial/NxCode-SafeCoder-30B" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "NxcodeOfficial/NxCode-SafeCoder-30B",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
```
Use Docker images
```bash
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "NxcodeOfficial/NxCode-SafeCoder-30B" \
        --host 0.0.0.0 \
        --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "NxcodeOfficial/NxCode-SafeCoder-30B",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
```
- Docker Model Runner
How to use NxcodeOfficial/NxCode-SafeCoder-30B with Docker Model Runner:
```bash
docker model run hf.co/NxcodeOfficial/NxCode-SafeCoder-30B
```
---
license: apache-2.0
language:
- en
- code
tags:
- security
- code-repair
- oss-bench
- moe
- chain-of-thought
- nlp
- c
- cpp
- php
library_name: transformers
pipeline_tag: text-generation
---
<div align="center">

# 🛡️ NxCode-SafeCoder-30B

**The Next-Generation Mixture-of-Experts Model for Secure Code Intelligence**

[License: Apache 2.0](https://opensource.org/licenses/Apache-2.0)
[Transformers](https://huggingface.co/docs/transformers/index)

</div>

---

## Model Overview

**NxCode-SafeCoder-30B** is a code generation model built specifically for **software security auditing and automated vulnerability remediation**.

Built on a **Mixture-of-Experts (MoE)** architecture, it offers the capacity of a 30B-parameter model with inference latency closer to that of a much smaller model, because only ~3B parameters are active per token.

Unlike general-purpose coding assistants, NxCode-SafeCoder is aligned using a **Security-First Chain-of-Thought (CoT)** methodology. It mimics the workflow of a senior security researcher: **Analyze -> Reason -> Fix**.
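
As a rough illustration of the parameter counts quoted above, the sketch below estimates the weight memory and the share of parameters used per token. It assumes exactly 30B total and 3B active parameters and 2 bytes per parameter for bfloat16 weights. It ignores the KV cache, activations, and framework overhead, so real memory use will be higher.

```python
# Back-of-the-envelope numbers for a 30B-total / ~3B-active MoE model.
# The parameter counts are the rounded figures from the model card;
# everything else here is an illustrative assumption.

TOTAL_PARAMS = 30e9      # all experts must be resident in memory
ACTIVE_PARAMS = 3e9      # parameters actually used for each token
BYTES_PER_PARAM = 2      # bfloat16 stores each weight in 2 bytes


def weight_memory_gb(num_params: float, bytes_per_param: int = BYTES_PER_PARAM) -> float:
    """Memory needed for the weights alone, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


def active_fraction(active: float = ACTIVE_PARAMS, total: float = TOTAL_PARAMS) -> float:
    """Share of the parameters that participate in a single forward pass."""
    return active / total


print(f"weights in bf16: ~{weight_memory_gb(TOTAL_PARAMS):.0f} GB")
print(f"active per token: ~{active_fraction():.0%} of parameters")
```

Memory scales with the total parameter count, while per-token compute scales roughly with the active count. Under these assumptions the weights alone take about 60 GB, which is consistent with the single 80 GB A100 mentioned in the vLLM usage example.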

## ✨ Key Capabilities

* **🛡️ Surgical Vulnerability Patching**: Excels at fixing complex memory safety issues (buffer overflows, use-after-free, double free) in C/C++ and PHP.
* **🧠 Dual-Phase Generation**: The model is trained to output a detailed **Security Analysis** (`### Analysis`) before generating the **Fixed Code**, which encourages fixes that are logically sound and free of side effects.
* **⚡ High-Throughput Inference**: Optimized for **vLLM**, achieving **>600 tokens/s** on NVIDIA A100 GPUs, which makes it suitable for large-scale codebase scanning.
* **Fewer Sanitizer Alerts**: Generated patches trigger far fewer sanitizer alerts than those from GPT-4o and Llama-3-70B in fuzzing benchmarks (OSS-Bench).
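
The dual-phase output described above can be split programmatically. The helper below is a minimal sketch: it assumes the response contains a `### Analysis` heading followed later by at least one fenced code block holding the fixed code. The name `split_response` and the exact layout of the response are assumptions for illustration, not a documented output format.

```python
import re
from typing import NamedTuple, Optional

FENCE = "`" * 3  # triple backtick, built indirectly so this example nests cleanly

# Matches a fenced block with an optional language tag and captures its body.
CODE_BLOCK_RE = re.compile(FENCE + r"[\w+-]*\n(.*?)" + FENCE, re.DOTALL)


class RepairResult(NamedTuple):
    analysis: str
    fixed_code: Optional[str]


def split_response(text: str) -> RepairResult:
    """Split a response into its analysis prose and the last fenced code block."""
    # Keep only what follows the analysis heading, if the heading is present.
    _, _, after_heading = text.partition("### Analysis")
    body = after_heading if after_heading else text

    blocks = CODE_BLOCK_RE.findall(body)
    fixed_code = blocks[-1].strip() if blocks else None

    # The analysis is the prose before the first fenced block.
    analysis = body.split(FENCE, 1)[0].strip()
    return RepairResult(analysis, fixed_code)


# Hypothetical response text, used only to exercise the parser.
sample = (
    "### Analysis\n"
    "len is not validated before the call.\n\n"
    + FENCE + "c\n"
    "if (len == 0) return MAP_FAILED;\n"
    + FENCE + "\n"
)
result = split_response(sample)
print(result.analysis)
print(result.fixed_code)
```

Taking the last fenced block is a design choice: if the analysis quotes the vulnerable code in its own fence, the final block is more likely to be the fix. A response without any fenced block yields `fixed_code=None`, so callers can flag it for manual review.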

## Performance

*Evaluation based on the OSS-Bench framework (random split; PHP-src and SQLite targets).*

| Model | Architecture | Compilation Rate | Test Pass Rate | **Sanitizer Alerts** (Lower is Better) |
| :--- | :--- | :---: | :---: | :---: |
| **NxCode-SafeCoder-30B** | **MoE (30B)** | **High** | **High** | **Lowest** |
| GPT-4o | Dense | High | High | Medium |
| Llama-3-70B-Instruct | Dense | Medium | Medium | High |
| DeepSeek-Coder-33B | Dense | High | Medium | Medium |

> **Note**: While general-purpose models often generate code that compiles, they frequently miss subtle boundary checks or introduce new logic errors. NxCode-SafeCoder prioritizes memory safety above all else.

## 💻 Usage

### 1. Using Transformers

````python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "NxcodeOfficial/NxCode-SafeCoder-30B"

# Load with Flash Attention 2 for best performance (requires the flash-attn package)
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",
)

# Standard Security Prompt Template
# Note: The model expects the function to be wrapped in C code blocks
prompt = """You are a Linux Kernel security expert. Fix the vulnerabilities in the following C function.
Repository: linux
File: mm/mmap.c
Function:
```c
void *simple_mmap(void *addr, size_t len) {
    // Vulnerable: No checks
    return mmap(addr, len, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
}
```
"""

messages = [{"role": "user", "content": prompt}]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,  # append the assistant turn marker so the model starts its reply
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

# Generate (The model will output Analysis first, then the Code)
# do_sample=True is needed for the temperature setting to take effect
outputs = model.generate(**inputs, max_new_tokens=1024, do_sample=True, temperature=0.2)

# Decode only the newly generated tokens, not the echoed prompt
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
````
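
When repairing many functions, the prompt above can be generated from a template. This is a sketch only: `build_security_prompt` and its parameters are hypothetical names, and the wording simply mirrors the single example prompt shown above rather than a documented template.

```python
FENCE = "`" * 3  # triple backtick, built indirectly so this example nests cleanly


def build_security_prompt(
    repository: str,
    file_path: str,
    function_source: str,
    language_name: str = "C",   # label used in the instruction sentence
    fence_tag: str = "c",       # language tag placed on the code fence
) -> str:
    """Build a repair prompt that mirrors the example prompt in the model card."""
    return (
        "You are a Linux Kernel security expert. "
        f"Fix the vulnerabilities in the following {language_name} function.\n"
        f"Repository: {repository}\n"
        f"File: {file_path}\n"
        "Function:\n"
        f"{FENCE}{fence_tag}\n"
        f"{function_source.strip()}\n"
        f"{FENCE}\n"
    )


source = """
void *simple_mmap(void *addr, size_t len) {
    return mmap(addr, len, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
}
"""
prompt = build_security_prompt("linux", "mm/mmap.c", source)
messages = [{"role": "user", "content": prompt}]
print(prompt)
```

The resulting `messages` list can be passed to `tokenizer.apply_chat_template` in the same way as in the example above. The expert role in the first sentence is hard-coded here because the model card shows only that one variant.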

### 2. Using vLLM (Production Recommended)

For maximum throughput (e.g., scanning entire repositories), use vLLM.

```python
from vllm import LLM, SamplingParams

llm = LLM(
    model="NxcodeOfficial/NxCode-SafeCoder-30B",
    trust_remote_code=True,
    tensor_parallel_size=1,  # Fits on a single A100 80GB
    gpu_memory_utilization=0.95,
    max_model_len=8192,  # Recommended limit to avoid OOM
)
# ... (inference code)
```
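
The inference step elided above might look like the following. This is a hedged sketch, not the authors' code: `scan_prompts` and `to_conversations` are hypothetical helpers, and the sketch relies on vLLM's `LLM.chat` accepting a list of conversations and returning one result per conversation, which should be checked against the installed vLLM version. The `llm` and `sampling_params` objects are passed in, so the helpers themselves do not import vLLM.

```python
from typing import Any, Dict, List


def to_conversations(prompts: List[str]) -> List[List[Dict[str, str]]]:
    """Wrap each prompt in a single-turn chat conversation."""
    return [[{"role": "user", "content": p}] for p in prompts]


def scan_prompts(llm: Any, sampling_params: Any, prompts: List[str]) -> List[str]:
    """Run all prompts as one batch and return the generated text for each."""
    outputs = llm.chat(to_conversations(prompts), sampling_params)
    # Each result holds one or more completions; take the first completion's text.
    return [out.outputs[0].text for out in outputs]


# Intended use with the `llm` object created above (requires a GPU and vLLM):
#   params = SamplingParams(temperature=0.2, max_tokens=1024)
#   reports = scan_prompts(llm, params, ["<prompt 1>", "<prompt 2>"])
```

Submitting all prompts in one call lets vLLM batch them internally, which is where most of the throughput gain for repository-wide scans comes from. The sampling values in the comment simply mirror the Transformers example and are not tuned recommendations.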

## 🔬 Methodology

The model was fine-tuned on a proprietary dataset containing **10k+ high-quality security patches** distilled from advanced reasoning engines. The training process used:

1. **Expert Routing Optimization**: Tuning the MoE router so that specific experts specialize in code analysis and others in code generation.
2. **Conservative Alignment**: Reinforcing the preference for bounded standard-library functions (e.g., `strncpy`, `snprintf`) and explicit null-pointer checks.

## Citation

If you use this model in your research or product, please cite:

```bibtex
@misc{nxcode2025safecoder,
  title={NxCode-SafeCoder: Automating Secure Code Repair with MoE},
  author={NxCode Team},
  year={2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/NxcodeOfficial/NxCode-SafeCoder-30B}}
}
```

## ⚖️ License

Apache 2.0