FAR AI

non-profit

https://far.ai/

FARAIResearch

AlignmentResearch

AI & ML interests

Frontier alignment research to ensure the safe development and deployment of advanced AI systems.

Recent Activity

tomtseng updated a model about 12 hours ago

AlignmentResearch/robust_llm_oskar-036g_output_probe_jailbreaks_Qwen2.5-7B-Instruct_s-0

tomtseng updated a model about 12 hours ago

AlignmentResearch/robust_llm_oskar-034x_input_probe_jailbreaks_Qwen2.5-7B-Instruct_s-0

tomtseng published a model about 12 hours ago

AlignmentResearch/robust_llm_oskar-036g_output_probe_jailbreaks_Qwen2.5-7B-Instruct_s-0

spaces 1

🔎 Tuned Lens (Running)

models 3677

AlignmentResearch/robust_llm_oskar-034x_input_probe_jailbreaks_Qwen2.5-7B-Instruct_s-0

Updated about 12 hours ago

AlignmentResearch/robust_llm_oskar-036g_output_probe_jailbreaks_Qwen2.5-7B-Instruct_s-0

Updated about 12 hours ago

datasets 16

AlignmentResearch/Llama3Jailbreaks

Viewer • Updated about 16 hours ago • 78.6k • 1.65k

AlignmentResearch/XSTest

Viewer • Updated 5 days ago • 900 • 13

AlignmentResearch/StrongREJECT

Viewer • Updated 19 days ago • 313 • 1.12k

AlignmentResearch/WordLength

Viewer • Updated Aug 7, 2024 • 100k • 2.04k

AlignmentResearch/Harmless

Viewer • Updated Jul 29, 2024 • 86.6k • 1.81k

AlignmentResearch/Helpful

Viewer • Updated Jul 29, 2024 • 88.1k • 2.24k

AlignmentResearch/PasswordMatch

Viewer • Updated Jul 29, 2024 • 100k • 3.35k

AlignmentResearch/IMDB

Viewer • Updated Jul 29, 2024 • 97.5k • 3.15k

AlignmentResearch/EnronSpam

Viewer • Updated Jul 29, 2024 • 62.3k • 1.53k

AlignmentResearch/PasswordMatch-test

Viewer • Updated Jul 26, 2024 • 50k • 47

Team members 12
