Update README.md
README.md
CHANGED
@@ -5,10 +5,3 @@ base_model:
 ---
 
 Abliterated version using the code from (https://github.com/andyrdt/refusal_direction).
-
-@article{arditi2024refusal,
-title={Refusal in Language Models Is Mediated by a Single Direction},
-author={Andy Arditi and Oscar Obeso and Aaquib Syed and Daniel Paleka and Nina Panickssery and Wes Gurnee and Neel Nanda},
-journal={arXiv preprint arXiv:2406.11717},
-year={2024}
-}