Safety Arithmetic: A Framework for Test-time Safety Alignment of Language Models by Steering Parameters and Activations Paper • 2406.11801 • Published Jun 17 • 15
SafeInfer: Context Adaptive Decoding Time Safety Alignment for Large Language Models Paper • 2406.12274 • Published Jun 18 • 14
How (un)ethical are instruction-centric responses of LLMs? Unveiling the vulnerabilities of safety guardrails to harmful queries Paper • 2402.15302 • Published Feb 23 • 3
DistALANER: Distantly Supervised Active Learning Augmented Named Entity Recognition in the Open Source Software Ecosystem Paper • 2402.16159 • Published Feb 25 • 2
Duplicate Question Retrieval and Confirmation Time Prediction in Software Communities Paper • 2309.05035 • Published Sep 10, 2023 • 2
Context Matters: Pushing the Boundaries of Open-Ended Answer Generation with Graph-Structured Knowledge Context Paper • 2401.12671 • Published Jan 23 • 2
Sowing the Wind, Reaping the Whirlwind: The Impact of Editing Language Models Paper • 2401.10647 • Published Jan 19 • 3