deepset/prompt-injections
How to use wayaway/test_m with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-classification", model="wayaway/test_m")

# Load model directly
from transformers import AutoTokenizer, AutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained("wayaway/test_m")
model = AutoModelForSequenceClassification.from_pretrained("wayaway/test_m")

This model is a fine-tuned version of microsoft/deberta-v3-base on the prompt-injections dataset. After the final training epoch it reaches a validation loss of 0.0673 and an accuracy of 0.9914 on the evaluation set.
This model detects prompt injection attempts and classifies them as "INJECTION". Legitimate requests are classified as "LEGIT". The dataset assumes that legitimate requests are either questions of any kind or keyword searches.
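As a rough illustration of how the two labels might be used to gate incoming requests, here is a minimal sketch. The helper names `load_classifier` and `is_injection`, and the 0.5 score threshold, are assumptions made for this example and are not part of the model card; only the label names and the pipeline call come from the text above.

```python
from typing import Callable, Dict, List

# Label name taken from the model card.
INJECTION_LABEL = "INJECTION"

# Any callable that mimics a text-classification pipeline:
# text in, [{"label": ..., "score": ...}] out.
Classifier = Callable[[str], List[Dict[str, object]]]


def load_classifier() -> Classifier:
    """Build the Transformers pipeline from the usage snippet (downloads the model on first use)."""
    # Imported lazily so is_injection() can be used and tested without transformers installed.
    from transformers import pipeline

    return pipeline("text-classification", model="wayaway/test_m")


def is_injection(text: str, classify: Classifier, threshold: float = 0.5) -> bool:
    """Return True when the top prediction is INJECTION with a score of at least `threshold`.

    The threshold is a hypothetical knob for this sketch; tune it on your own traffic.
    """
    top = classify(text)[0]  # first (and only) prediction for a single input string
    return top["label"] == INJECTION_LABEL and float(top["score"]) >= threshold
```

A call such as `is_injection(user_input, load_classifier())` could then sit in front of the downstream system. Raising `threshold` trades missed injections for fewer false alarms.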
If you are using this model to secure your system and it is too "trigger-happy", flagging legitimate requests as injections, consider collecting legitimate examples from your own traffic and retraining the model on them together with the prompt-injections dataset.
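One possible way to prepare such a retraining set is sketched below. It assumes the training rows are dictionaries with `text` and `label` keys and that the legitimate class has label id 0; both are assumptions to check against the actual dataset before training. The function name `add_legit_examples` is hypothetical.

```python
from typing import Dict, Iterable, List

# Assumption: the legitimate class uses label id 0. Verify against the dataset's label mapping.
LEGIT_LABEL_ID = 0


def add_legit_examples(
    rows: List[Dict[str, object]],
    legit_texts: Iterable[str],
    legit_label: int = LEGIT_LABEL_ID,
) -> List[Dict[str, object]]:
    """Return a new list: the original rows plus the collected legitimate texts.

    Empty strings and texts already present (compared case-insensitively,
    ignoring surrounding whitespace) are skipped.
    """
    seen = {str(row["text"]).strip().lower() for row in rows}
    merged = list(rows)  # copy so the caller's list is left untouched
    for text in legit_texts:
        cleaned = text.strip()
        key = cleaned.lower()
        if key and key not in seen:
            seen.add(key)
            merged.append({"text": cleaned, "label": legit_label})
    return merged
```

The merged rows can then be fed to whatever fine-tuning setup you already use. Deduplicating first keeps repeated queries from dominating the added legitimate class.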
Based on the prompt-injections dataset.
The following hyperparameters were used during training:
Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|---|---|---|---|---|
| No log | 1.0 | 69 | 0.2353 | 0.9741 |
| No log | 2.0 | 138 | 0.0894 | 0.9741 |
| No log | 3.0 | 207 | 0.0673 | 0.9914 |
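The table gives accuracy only to four decimal places, so the size of the evaluation set is not stated. As a sanity check, the sketch below searches for the smallest set size for which both reported accuracies could be exact fractions rounded to four digits. The function name is hypothetical, and the check assumes the values were rounded rather than truncated; it shows what the numbers are consistent with, not what the evaluation set actually contains.

```python
from typing import Optional, Sequence


def smallest_consistent_eval_size(
    accuracies: Sequence[float],
    max_size: int = 10_000,
    digits: int = 4,
) -> Optional[int]:
    """Smallest n such that every accuracy equals round(k / n, digits) for some integer k."""
    for n in range(1, max_size + 1):
        # For each reported accuracy, look for a correct-count k in 0..n that reproduces it.
        if all(
            any(round(k / n, digits) == acc for k in range(n + 1))
            for acc in accuracies
        ):
            return n
    return None  # nothing up to max_size matches


# Distinct accuracies reported in the table above.
print(smallest_consistent_eval_size([0.9741, 0.9914]))  # → 116
```

With 116 evaluation examples, 0.9741 would correspond to 113 correct predictions and 0.9914 to 115, so the gain in epoch 3 would amount to two additional correct examples. The step column also shows 69 optimizer steps per epoch (69, 138, 207).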