---
license: mit
base_model: uclanlp/plbart-multi_task-python
language:
- en
library_name: transformers
tags:
- text-generation
- code-generation
- vulnerability-injection
- security
- vaitp
- finetuned
pretty_name: "FBogaerts/plbart-multi_task-python-Finetuned Finetuned for Vulnerability Injection"
---

# FBogaerts/plbart-multi_task-python-Finetuned Finetuned for Vulnerability Injection (VAITP)

This model is a fine-tuned version of **uclanlp/plbart-multi_task-python** specialized for injecting security vulnerabilities into Python code. It was trained to follow a specific instruction format so that it modifies a code snippet to introduce the requested vulnerability.

This model was developed as part of the research for our paper: *(coming soon)*.

The VAITP CLI Framework and related resources will be available at our GitHub repository (coming soon).

## Model Description

This model was fine-tuned to act as a "Coder" LLM. Given an instruction and a piece of original Python code, it returns the modified code with the requested vulnerability injected.

The model is expected to work best when prompted in the exact format it was trained on.

## Intended Uses & Limitations

**Intended Use**

This model is intended for research purposes in the field of automated security testing, SAST/DAST tool evaluation, and the generation of training data for security-aware models. It should be used within a sandboxed environment to inject vulnerabilities into non-production code for analysis.

**Out-of-Scope Uses**

This model should **NOT** be used for:

- Generating malicious code for use in real-world attacks.
- Directly modifying production codebases.
- Any application outside of controlled, ethical security research.

The generated code should always be manually reviewed before use.

## How to Use

This model expects a specific prompt format, which we call `FINETUNED_STYLE` in our paper:

`{instruction} _BREAK_ {original_code}`

Here is an example using `transformers`:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the fine-tuned checkpoint and its tokenizer from the Hub.
model_name = "FBogaerts/plbart-multi_task-python-Finetuned"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Build the prompt: instruction, the _BREAK_ separator, then the original code.
instruction = "Modify the function to introduce a OS Command Injection vulnerability. The vulnerable code must contain the pattern: 'User-controlled input is used in a subprocess call with shell=True'."
original_code = "import subprocess\ndef execute(cmd):\n subprocess.run(cmd, shell=False)"
prompt = f"{instruction} _BREAK_ {original_code}"

# Tokenize, generate, and decode the first returned sequence.
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)
vulnerable_code = tokenizer.decode(outputs[0], skip_special_tokens=True)

# The model will output the full modified code block.
# Further cleaning may be needed to extract only the code.
print(vulnerable_code)
```
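
The "further cleaning" mentioned in the example's comments could look roughly like the sketch below. It is not part of the VAITP framework: `extract_generated_code` is a hypothetical helper, and it assumes the decoded output may echo the prompt and may wrap the result in Markdown fences. Tokenization can alter whitespace, so the exact-prefix check on the echoed code may not always match.

```python
import re

BREAK_TOKEN = "_BREAK_"  # separator from the prompt format above
FENCE = "`" * 3          # Markdown code-fence marker

# Matches an opening fence line (with optional language tag) or a closing fence.
_FENCE_RE = re.compile(rf"^{FENCE}[A-Za-z]*\n|{FENCE}$", re.MULTILINE)


def extract_generated_code(decoded: str, original_code: str) -> str:
    """Hypothetical helper: keep only the newly generated code.

    Assumes the decoded text may start with the echoed prompt and may
    contain Markdown fences around the code.
    """
    text = decoded
    # If the prompt was echoed, keep only what follows the separator.
    if BREAK_TOKEN in text:
        text = text.split(BREAK_TOKEN, 1)[1]
    text = text.strip()
    # Drop the echoed original code when it leads the remainder.
    echoed = original_code.strip()
    if echoed and text.startswith(echoed):
        text = text[len(echoed):]
    # Remove Markdown fences, then surrounding whitespace.
    return _FENCE_RE.sub("", text.strip()).strip()
```

Applied to the example above, this would be called as `extract_generated_code(vulnerable_code, original_code)`. The result should still be reviewed manually before use.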

## Training Procedure

### Training Data

The model was fine-tuned on a dataset of 1,406 examples derived from the DeVAITP Vulnerability Corpus. Each example consists of a triplet: `(instruction, original_code, vulnerable_code)`. The instructions were generated using the meta-prompting technique described in our paper, with `meta-llama/Meta-Llama-3.1-8B-Instruct` serving as the Planner model.
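
As an illustration of the triplet structure, one example can be turned into a prompt and a target as sketched below. The `Triplet` class and `to_prompt_and_target` function are hypothetical names, and the prompt layout simply reuses the `{instruction} _BREAK_ {original_code}` format from "How to Use". How prompt and target were actually joined during training is not stated here, so the sketch returns them separately.

```python
from dataclasses import dataclass

BREAK_TOKEN = "_BREAK_"  # separator used in the documented prompt format


@dataclass
class Triplet:
    """One training example (field names follow the triplet described above)."""
    instruction: str
    original_code: str
    vulnerable_code: str


def to_prompt_and_target(example: Triplet) -> tuple[str, str]:
    """Build the model input and the expected output for one example.

    Assumption: the training prompt matches the inference prompt format.
    """
    prompt = f"{example.instruction} {BREAK_TOKEN} {example.original_code}"
    return prompt, example.vulnerable_code


example = Triplet(
    instruction="Replace the safe call with an unsafe one.",  # made-up instruction
    original_code="subprocess.run(cmd, shell=False)",
    vulnerable_code="subprocess.run(cmd, shell=True)",
)
prompt, target = to_prompt_and_target(example)
print(prompt)
print(target)
```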

### Training Hyperparameters

The model was fine-tuned using the following key hyperparameters:

- Framework: Hugging Face TRL
- Learning Rate: 2e-5
- Number of Epochs: 1
- Batch Size: 1
- Hardware: Google Colab (L4 GPU)
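
To give a sense of scale, the sketch below records the listed values as a plain dictionary and estimates the number of optimizer steps they imply for the 1,406-example dataset. The key names mirror fields of `transformers.TrainingArguments`, which TRL's config classes extend. Reading "Batch Size" as the per-device batch size, and assuming a single device with no gradient accumulation, are assumptions rather than documented settings.

```python
import math

# Values listed above; key names follow transformers.TrainingArguments fields.
hyperparameters = {
    "learning_rate": 2e-5,
    "num_train_epochs": 1,
    "per_device_train_batch_size": 1,  # assumption: "Batch Size" is per device
}

NUM_EXAMPLES = 1406  # dataset size stated under "Training Data"


def optimizer_steps(num_examples: int, batch_size: int, epochs: int,
                    grad_accum: int = 1) -> int:
    """Approximate optimizer updates for a single-device run.

    Assumption: grad_accum defaults to 1 (no gradient accumulation).
    """
    steps_per_epoch = math.ceil(num_examples / (batch_size * grad_accum))
    return steps_per_epoch * epochs


total_steps = optimizer_steps(
    NUM_EXAMPLES,
    hyperparameters["per_device_train_batch_size"],
    hyperparameters["num_train_epochs"],
)
print(total_steps)
```

Under these assumptions, each example triggers one weight update, so the run is short and the per-step gradient estimate is noisy.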

## Evaluation

(coming soon)

## Citation

If you use this model in your research, please cite our paper:

(BibTeX entry will be provided upon publication)