eaddario's activity


Here's why sandboxed code execution (via Docker or E2B) in smolagents is a game-changer for agent-based systems: 🧵👇
1️⃣ Security First 🔐
Running AI agents in unrestricted Python environments is risky! With sandboxing, your agents are isolated, preventing unintended file access, network abuse, or system modifications.
2️⃣ Deterministic & Reproducible Runs 📦
By running agents in containerized environments, you ensure that every execution happens in a controlled and predictable setting—no more environment mismatches or dependency issues!
3️⃣ Resource Control & Limits 🚦
Docker and E2B allow you to enforce CPU, memory, and execution time limits, so rogue or inefficient agents don’t spiral out of control.
4️⃣ Safer Code Execution in Production 🏭
Deploy AI agents confidently, knowing that any generated code runs in an ephemeral, isolated environment, protecting your host machine and infrastructure.
5️⃣ Easy to Integrate 🛠️
With smolagents, you can simply configure your agent to use Docker or E2B as its execution backend; no complex security setup required (see the sketch after this list)!
6️⃣ Perfect for Autonomous AI Agents 🤖
If your AI agents generate and execute code dynamically, this is a must-have to avoid security pitfalls while enabling advanced automation.
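To make point 5️⃣ concrete, here is a minimal sketch of that configuration in Python. Treat it as illustrative only: the exact argument names (e.g. executor_type) and the model class can differ between smolagents releases, so check the docs for your version.

```python
# Illustrative sketch only: executor_type and HfApiModel may be named differently
# in your smolagents release; consult the documentation for your version.
from smolagents import CodeAgent, HfApiModel

model = HfApiModel()  # hosted inference model with default settings

# Run all LLM-generated code inside an isolated sandbox instead of on the host.
agent = CodeAgent(
    tools=[],
    model=model,
    executor_type="e2b",  # assumption: "docker" selects the Docker backend instead
)

print(agent.run("Compute the 10th Fibonacci number."))
```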
⚡ Get started now: https://github.com/huggingface/smolagents
What will you build with smolagents? Let us know! 🚀💡

Nvidia's org: https://huggingface.co./nvidia
Enterprise hub: https://huggingface.co./enterprise

(For context please see: https://huggingface.co./posts/eaddario/832567461491467)
I have just finished uploading eaddario/Hammer2.1-7b-GGUF and eaddario/Dolphin3.0-Mistral-24B-GGUF.
While I was able to get a 7+% size reduction with Hammer2.1-7b, the larger Dolphin3.0-Mistral-24B proved a tougher nut to crack (only 3%).
I have an idea as to why this was the case, which I'll test with QwQ-32B, but it will be a while before I can find the time.

Thank you @UICO, but at the moment it's less a technique and more a mix of brute force, educated guesses, trial and error, and the occasional stroke of luck. I will tackle QwQ-32B next, as it will help me validate an idea (see my next post).

The process is a bit all over the place at the moment. Some steps are automated, others are manual, and a few are trial and error. But as I streamline it, I will publish my findings in a How To guide, along with the tools I'm using.

I just came across a groundbreaking new tool called KGGen that's solving a major challenge in the AI world - the scarcity of high-quality knowledge graph data.
KGGen is an open-source Python package that leverages language models to extract knowledge graphs (KGs) from plain text. What makes it special is its innovative approach to clustering related entities, which significantly reduces sparsity in the extracted KGs.
The technical approach is fascinating:
1. KGGen uses a multi-stage process involving an LLM (GPT-4o in their implementation) to extract entities and relations from source text
2. It aggregates graphs across sources to reduce redundancy
3. Most importantly, it applies iterative LM-based clustering to refine the raw graph
The clustering stage is particularly innovative - it identifies which nodes and edges refer to the same underlying entities or concepts. This normalizes variations in tense, plurality, stemming, and capitalization (e.g., "labors" clustered with "labor").
The researchers from Stanford and the University of Toronto also introduced MINE (Measure of Information in Nodes and Edges), the first benchmark for evaluating KG extractors. When tested against existing methods like OpenIE and GraphRAG, KGGen outperformed them by up to 18%.
For anyone working with knowledge graphs, RAG systems, or KG embeddings, this tool addresses the fundamental challenge of data scarcity that's been holding back progress in graph-based foundation models.
The package is available via pip install kg-gen, making it accessible to everyone. This could be a game-changer for knowledge graph applications!
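As a quick illustration of the extract → aggregate → cluster flow described above, here is a hedged usage sketch. The class and method names (KGGen, generate, aggregate, cluster) follow my reading of the project's README and may not match the current API exactly, so treat it as a sketch rather than a reference.

```python
# Hedged sketch: class/method names (KGGen, generate, aggregate, cluster) are
# taken from my reading of the kg-gen README and may differ in newer releases.
from kg_gen import KGGen

kg = KGGen(model="openai/gpt-4o", temperature=0.0)  # LLM used for extraction

text_1 = "Marie Curie discovered polonium and radium."
text_2 = "Radium was discovered by Marie Curie and Pierre Curie."

# 1. Extract a graph (entities + relations) from each source text
graph_1 = kg.generate(input_data=text_1)
graph_2 = kg.generate(input_data=text_2)

# 2. Aggregate graphs across sources to reduce redundancy
combined = kg.aggregate([graph_1, graph_2])

# 3. Cluster nodes/edges that refer to the same underlying entity or concept
clustered = kg.cluster(combined)

print(clustered.entities, clustered.relations)
```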

At post time, watt-ai/watt-tool-70B continues to top the Berkeley Function-Calling Leaderboard, with the 8B version occupying the 4th place. A remarkable achievement for a model of that size!
The "squeezed" version is now available at eaddario/Watt-Tool-8B-GGUF
(For context please see: https://huggingface.co./posts/eaddario/832567461491467)

In this case, Q2_K refers to the quantization applied to the embedding layer in each version of the model, rather than the overall quantization type. For example, the DeepSeek-R1-Distill-Qwen-7B-Q4_K_M model would have its embedding layer quantized at Q2_K instead of the usual Q4_K.
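For anyone wanting to try a mixed scheme like that, the sketch below shows the general idea using llama.cpp's quantize tool driven from Python. The per-tensor override flag (--token-embedding-type) and the file names are my assumptions; double-check them against `llama-quantize --help` for your build.

```python
# Sketch only: the --token-embedding-type override flag is an assumption based
# on llama.cpp's quantize tool, and the file names are hypothetical placeholders.
import subprocess

subprocess.run(
    [
        "llama-quantize",
        "--token-embedding-type", "q2_k",          # quantize the embedding layer at Q2_K
        "DeepSeek-R1-Distill-Qwen-7B-F16.gguf",    # hypothetical full-precision input
        "DeepSeek-R1-Distill-Qwen-7B-Q4_K_M.gguf", # hypothetical output file
        "Q4_K_M",                                  # overall quantization for the remaining tensors
    ],
    check=True,
)
```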
Once all the quantized versions are generated, I then produce Perplexity, KL Divergence, ARC, HellaSwag, MMLU, TruthfulQA and WinoGrande scores for each version, using the test datasets documented in the model card.
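As an illustration of the scoring step, the sketch below outlines how the KL Divergence numbers can be obtained with llama.cpp's perplexity tool: save the logits of the full-precision reference model once, then compare each quantized version against them. The flag names are assumptions based on llama.cpp, and the file names are hypothetical.

```python
# Sketch: flag names (--kl-divergence-base, --kl-divergence) are assumptions
# based on llama.cpp's perplexity tool; file names are hypothetical placeholders.
import subprocess

# 1. Save reference logits from the full-precision model on the test dataset
subprocess.run(
    ["llama-perplexity", "-m", "model-F16.gguf", "-f", "test.txt",
     "--kl-divergence-base", "logits.bin"],
    check=True,
)

# 2. Score a quantized version against the saved logits (reports KL divergence)
subprocess.run(
    ["llama-perplexity", "-m", "model-Q4_K_M.gguf",
     "--kl-divergence-base", "logits.bin", "--kl-divergence"],
    check=True,
)
```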

I have been tinkering with quantization and pruning to reduce model sizes. So far, I've had modest success in producing, on average, 8% smaller versions with negligible loss of quality, and I think further reductions in the 10-15% range are realistic, but I've come across a behaviour I wasn't expecting!
Part of the process I'm following consists of quantizing the embedding and output layers aggressively. Since the embedding layer is more about lookup than complex computation, the vectors representing the relative distances between embeddings are usually preserved well enough, making this layer fairly robust to quantization. So far, so good.
The output layer, on the other hand, maps the final hidden state to the vocabulary logits and therefore, small changes in these logits could lead to a different probability distribution over the vocabulary, resulting in incorrect word predictions, or so I thought.
Surprisingly, I'm finding that even at Q2_K the loss of overall capability is minimal. Was this to be expected, or am I missing something?
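One way to probe that question is a toy experiment (purely illustrative, not part of my pipeline): add small random noise to a synthetic logit vector and measure how often the greedy (top-1) token changes. If the top logit is usually well separated from the runners-up, moderate perturbations rarely flip the prediction, which would be consistent with what I'm seeing.

```python
# Toy illustration (not from the actual pipeline): how often does perturbing
# a synthetic logit vector with noise flip the top-1 prediction?
import numpy as np

rng = np.random.default_rng(0)
vocab, trials, flipped = 32_000, 1_000, 0

for _ in range(trials):
    logits = rng.normal(0.0, 4.0, vocab)           # pretend final-layer logits
    noisy = logits + rng.normal(0.0, 0.25, vocab)  # quantization-like perturbation
    flipped += int(np.argmax(noisy) != np.argmax(logits))

print(f"top-1 changed in {flipped / trials:.1%} of trials")
```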
I have published a version with all the test results if you want to give it a try: eaddario/DeepSeek-R1-Distill-Qwen-7B-GGUF
I'll upload other models as time allows.
Any ideas / clarifications / suggestions are very much welcomed!