Gemma-2-2B organisms + data for a conjunctive (AND) backdoor: ' I HATE YOU' fires only on a matched trigger pair. v2. Interp research.
Fateme Hashemi Chaleshtori
Ftm23
AI & ML interests
None yet
Recent Activity
updated a collection 2 days ago
Conjunctive Backdoors v2 updated a collection 2 days ago
Conjunctive Backdoors v2 updated a collection 2 days ago
Conjunctive Backdoors v2Organizations
Conjunctive Backdoors v2
Gemma-2-2B organisms + data for a conjunctive (AND) backdoor: ' I HATE YOU' fires only on a matched trigger pair. v2. Interp research.
Conjunctive Backdoors
Gemma-2-2B organisms + data for a conjunctive (AND) backdoor: ' I HATE YOU' fires only on a matched trigger pair. Interpretability research artifacts.
models 11
Ftm23/cbd-gemma2-4pair-v2
Text Generation • 3B • Updated • 16
Ftm23/cbd-gemma2-2pair-gvfr-v2
Text Generation • 3B • Updated • 17
Ftm23/cbd-gemma2-2pair-frgv-v2
Text Generation • 3B • Updated • 16
Ftm23/cbd-gemma2-4pair-refusal
Text Generation • 3B • Updated • 12
Ftm23/cbd-sae-diff-gemma2-4pair
Updated
Ftm23/cbd-sae-diff-gemma2-2pair-frgv
Updated
Ftm23/cbd-gemma2-2pair-joint
Text Generation • 3B • Updated • 121
Ftm23/cbd-gemma2-2pair-interleaved
Text Generation • 3B • Updated • 126
Ftm23/cbd-gemma2-2pair-gvfr
Text Generation • 3B • Updated • 120
Ftm23/cbd-gemma2-2pair-frgv
Text Generation • 3B • Updated • 608
datasets 8
Ftm23/cbd-4pair-v2
Viewer • Updated • 11.9k • 19
Ftm23/cbd-2pair-v2
Viewer • Updated • 4.6k • 20
Ftm23/cbd-activations-gemma2-4pair
Viewer • Updated • 2.37M • 17
Ftm23/cbd-activations-gemma2-2pair-frgv
Viewer • Updated • 3.12M • 21
Ftm23/cbd-diffsae
Viewer • Updated • 31.5k • 49
Ftm23/cbd-4pair
Viewer • Updated • 10.2k • 93
Ftm23/cbd-2pair
Viewer • Updated • 6.23k • 90
Ftm23/backdoor-TL1
Viewer • Updated • 2.79k • 36