Mateusz Dziemian

mattmdjaga

AI & ML interests

Interested in AI safety.

Organizations

Hugging Face for Computer Vision
Sure Here, Marv
Social Post Explorers
Hugging Face Discord Community

mattmdjaga's activity

New activity in burtenshaw/recap 5 days ago
New activity in ai-safety-institute/AgentHarm 12 days ago

adding chat tasks

#3 opened 12 days ago by mattmdjaga
reacted to their post with 🔥 2 months ago
Post
1588
🚨 New Agent Benchmark 🚨
AgentHarm: A Benchmark for Measuring Harmfulness of LLM Agents

ai-safety-institute/AgentHarm

A collaboration between the UK AI Safety Institute and Gray Swan AI to create a dataset for measuring the harmfulness of LLM agents.

The benchmark contains harmful and benign task sets spanning 11 categories with varied difficulty levels. Its detailed evaluation tests not only success rate but also tool-level accuracy.

We provide refusal and accuracy metrics across a wide range of models in both no-attack and prompt-attack scenarios.

AgentHarm: A Benchmark for Measuring Harmfulness of LLM Agents (2410.09024)
posted an update 2 months ago (the AgentHarm post shown above)
reacted to their post with 🚀 4 months ago
Post
2161
$40K in Bounties: Ultimate Jailbreaking Championship 2024

🚨 Ultimate Jailbreaking Championship 2024 🚨
Hackers vs. AI in the arena. Let the battle begin!
🏆 $40,000 in Bounties
🗓️ Sept 7, 2024 @ 10AM PDT
🔗 Register Now: https://app.grayswan.ai/arena
====

Can you push an aligned language model to generate a bomb recipe or a fake news article? Join fellow hackers in a jailbreaking arena where you can test the boundaries of advanced LLMs.

====

The Objective
Your goal is to jailbreak as many LLMs as possible, as quickly as possible in the arena!

====

The Stakes
Break a model and claim your share of the $40,000 in bounties! With various jailbreak bounties and top hacker rewards, there are plenty of opportunities to win. Winners will also receive priority consideration for employment and internship opportunities at Gray Swan AI.

====

Ready to rise to the challenge? Join us and show the world what you can do!

See you in the arena!
  • 1 reply
posted an update 4 months ago (the Ultimate Jailbreaking Championship 2024 post shown above)
New activity in mattmdjaga/segformer_b2_clothes 4 months ago

test

#22 opened 4 months ago by jingtextsara
reacted to alvdansen's post with 🤗 5 months ago
Post
6874
Alright Y'all

I know it's a Saturday, but I decided to release my first Flux Dev Lora.

It's a retrain of my "Frosting Lane" model, and I am sure the styles will just keep improving.

Have fun! Link Below - Thanks again to @ostris for the trainer and Black Forest Labs for the awesome model!

alvdansen/frosting_lane_flux