All HF Hub posts

Wauplin posted an update 1 day ago
🚀 Exciting News! 🚀

We've just released huggingface_hub v0.25.0 and it's packed with powerful new features and improvements!

✨ Top Highlights:

• 📁 Upload large folders with ease using huggingface-cli upload-large-folder. Designed for your massive models and datasets. Highly recommended if you struggle to upload your Llama 70B fine-tuned model 🤡 (see the Python sketch at the end of this post)
• 🔎 Search API: new search filters (gated status, inference status) and trending-score retrieval.
• ⚡ InferenceClient: major improvements that simplify chat completions and handle async tasks better (see the sketch right below).
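
A minimal sketch of the simplified chat-completion flow (the model name and prompt are placeholders, not from the release notes):

```python
from huggingface_hub import InferenceClient

# Hypothetical model choice; any chat-capable model on the Hub works
client = InferenceClient("meta-llama/Meta-Llama-3-8B-Instruct")

response = client.chat_completion(
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
    max_tokens=50,
)
print(response.choices[0].message.content)
```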

We've also introduced tons of bug fixes and quality-of-life improvements - thanks to the awesome contributions from our community! 💪

💡 Check out the release notes: Wauplin/huggingface_hub#8

Want to try it out? Install the release with:

pip install huggingface_hub==0.25.0
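
To try the new large-folder upload from Python, here is a minimal sketch (the repo id and local path are hypothetical):

```python
from huggingface_hub import HfApi

api = HfApi()

# Resumable, multi-step upload designed for very large folders.
# CLI equivalent: huggingface-cli upload-large-folder <repo_id> <local_path> --repo-type model
api.upload_large_folder(
    repo_id="username/my-llama-70b-finetune",  # hypothetical repo
    folder_path="./checkpoints",               # hypothetical local path
    repo_type="model",
)
```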

m-ric posted an update about 11 hours ago
๐Ÿ”ฅ ๐๐ฐ๐ž๐ง ๐ซ๐ž๐ฅ๐ž๐š๐ฌ๐ž๐ฌ ๐ญ๐ก๐ž๐ข๐ซ ๐Ÿ.๐Ÿ“ ๐Ÿ๐š๐ฆ๐ข๐ฅ๐ฒ ๐จ๐Ÿ ๐ฆ๐จ๐๐ž๐ฅ๐ฌ: ๐๐ž๐ฐ ๐’๐Ž๐“๐€ ๐Ÿ๐จ๐ซ ๐š๐ฅ๐ฅ ๐ฌ๐ข๐ณ๐ž๐ฌ ๐ฎ๐ฉ ๐ญ๐จ ๐Ÿ•๐Ÿ๐!

The Chinese LLM maker just dropped a flurry of different models, ensuring there will be a Qwen SOTA model for every application out there:
Qwen2.5: 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B
Qwen2.5-Coder: 1.5B, 7B, and 32B on the way
Qwen2.5-Math: 1.5B, 7B, and 72B.

And they didn't slack on quality: the performance is top of the game in each weight category!

๐Š๐ž๐ฒ ๐ข๐ง๐ฌ๐ข๐ ๐ก๐ญ๐ฌ:

๐ŸŒ All models have ๐Ÿญ๐Ÿฎ๐Ÿด๐—ธ ๐˜๐—ผ๐—ธ๐—ฒ๐—ป ๐—ฐ๐—ผ๐—ป๐˜๐—ฒ๐˜…๐˜ ๐—น๐—ฒ๐—ป๐—ด๐˜๐—ต

📚 Models pre-trained on 18T tokens, even more than the 15T of Llama-3

💪 The flagship Qwen2.5-72B is roughly competitive with Llama-3.1-405B, and has a 3-5% margin over Llama-3.1-70B on most benchmarks.

🇫🇷 On top of this, it takes the #1 spot on multilingual tasks, so it might become my standard for French.

💻 Qwen2.5-Coder is only 7B but beats competing models up to 33B (DeepSeek-Coder 33B-Instruct). Let's wait for their 32B to come out!

🧮 Qwen2.5-Math sets a new high in the ratio of MATH benchmark score to number of parameters. They trained it by "aggregating more high-quality mathematical data, particularly in Chinese, from web sources, books, and codes across multiple recall cycles."

📄 Technical report to be released "very soon"

🔓 All models have the most permissive Apache 2.0 license, except the 72B models, which have a custom license mentioning "you can use it for free EXCEPT if your product has over 100M users"

🤗 All models are available on the HF Hub! ➡️ Qwen/qwen25-66e81a666513e518adb90d9e
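
To get a feel for the release, a minimal transformers sketch for chatting with one of the instruct models (standard usage, not from the post):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-7B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "Give me a one-line summary of Qwen2.5."}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# Decode only the newly generated tokens, skipping the prompt
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```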
jeffboudier posted an update 3 days ago
Pro Tip - if you're a Firefox user, you can set up Hugging Chat as an integrated AI assistant, with contextual links to summarize or simplify any text - handy!

In this short video, I show how to set it up.
KingNish posted an update 2 days ago
Mistral Nemo is better than many models at first-grader-level reasoning.
joylarkin posted an update 2 days ago
💬 Chat as a way to query SQL! The Airtrain AI team is happy to share a new Hugging Face Space that lets you interact with Hugging Face Hub datasets using a natural language chatbot. 🤗

Start Exploring 👉 airtrain-ai/hf-dataset-chat-to-sql

This Space is forked from davidberenstein1957/text-to-sql-hub-datasets by @davidberenstein1957 and features chat capability with improved table naming. The tool works with Hugging Face's recently released in-browser DuckDB-based SQL query engine for datasets.
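
For a sense of the underlying mechanics, a minimal sketch of querying a Hub dataset with DuckDB directly (the dataset path is a hypothetical placeholder):

```python
import duckdb

# DuckDB (>= 1.0) can read Hub datasets through hf:// paths;
# the dataset and file layout below are placeholders
result = duckdb.sql("""
    SELECT COUNT(*) AS n_rows
    FROM 'hf://datasets/username/my_dataset/**/*.parquet'
""").df()
print(result)
```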



prithivMLmods posted an update 1 day ago
I am experimenting with the Flux-Realism and Flux-Anime LoRA models, using the Flux.1-dev & schnell models as the base. Results improve significantly as the image dimensions increase. 🎈 (A minimal loading sketch follows the lists below.)

Demos for the respective trials:
- prithivMLmods/FLUX-REALISM
- prithivMLmods/FLUX-ANIME

Models:
- prithivMLmods/Canopus-LoRA-Flux-FaceRealism
- prithivMLmods/Canopus-LoRA-Flux-Anime

Datasets:
- prithivMLmods/Canopus-Realism-Minimalist
- https://4kwallpapers.com
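
A minimal diffusers sketch for trying one of these LoRAs on the Flux.1-dev base (prompt and settings are illustrative, not from the post):

```python
import torch
from diffusers import FluxPipeline

# Load the base model, then attach the realism LoRA on top
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")
pipe.load_lora_weights("prithivMLmods/Canopus-LoRA-Flux-FaceRealism")

# Larger image dimensions tend to improve results, per the post
image = pipe(
    "a realistic portrait photo of a person",  # illustrative prompt
    height=1024,
    width=1024,
    guidance_scale=3.5,
    num_inference_steps=28,
).images[0]
image.save("realism.png")
```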
udaykc posted an update 1 day ago
My way of understanding AI:
Artificial Intelligence is a concept developed by human intelligence: systems designed to simulate human-like thinking, analysis, understanding, and creation, often performing tasks faster and more efficiently than humans.

Add your thoughts...
MoritzLaurer posted an update about 9 hours ago
Why would you fine-tune a model if you can just prompt an LLM? The new paper "What is the Role of Small Models in the LLM Era: A Survey" provides a nice pro/con overview. My go-to approach combines both:

1. Start testing an idea by prompting an LLM/VLM behind an API. It's fast and easy, and I avoid wasting time tuning a model on a task that might not make it into production anyway.

2. The LLM/VLM output then needs to be manually validated. Anyone seriously considering putting AI into production has to do at least some manual validation. Setting up a good validation pipeline with a tool like Argilla is crucial, and it can be reused for any future experiments. Note: you can use LLM-as-a-judge to automate some evals, but you always also need to validate the judge (see the sketch after this list)!

3. Based on this validation, I can then (a) just continue using the prompted LLM if it is accurate enough and makes sense financially given my load; or (b) if the LLM is not accurate enough or too expensive to run in the long run, reuse the existing validation pipeline to annotate additional data for fine-tuning a smaller model. This can be sped up by reusing and correcting synthetic data from the LLM (or just pure distillation).
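
As a toy illustration of validating an LLM judge, here is a minimal sketch (the model choice, prompt, and labeled examples are all hypothetical):

```python
from huggingface_hub import InferenceClient

# Hypothetical judge model; any chat-capable Hub model works
client = InferenceClient("meta-llama/Meta-Llama-3.1-70B-Instruct")

def llm_judge(task_input: str, prediction: str) -> bool:
    """Ask the LLM judge whether a prediction is acceptable."""
    prompt = (
        f"Input: {task_input}\nPrediction: {prediction}\n"
        "Answer only 'yes' if the prediction is correct, otherwise 'no'."
    )
    out = client.chat_completion([{"role": "user", "content": prompt}], max_tokens=3)
    return "yes" in out.choices[0].message.content.lower()

# Validate the judge itself against a few human-labeled examples (hypothetical data)
human_labeled = [("2+2=?", "4", True), ("Capital of France?", "Berlin", False)]
agreement = sum(
    llm_judge(x, pred) == label for x, pred, label in human_labeled
) / len(human_labeled)
print(f"Judge/human agreement: {agreement:.0%}")
```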

Paper: https://arxiv.org/pdf/2409.06857
Argilla docs: https://docs.argilla.io/latest/
Argilla is also very easy to deploy with Hugging Face Spaces (or locally): https://huggingface.co./new-space?template=argilla%2Fargilla-template-space
MonsterMMORPG posted an update about 15 hours ago
How to Extract LoRA from FLUX Fine Tuning / DreamBooth Training Full Tutorial and Comparison Between Fine Tuning vs Extraction vs LoRA Training

The full article is a public post: https://www.patreon.com/posts/112335162

This post is short, so check out the full article (public post) for the details.

Conclusions
With the same training dataset (15 images used), the same number of steps (all compared trainings are 150 epochs, thus 2250 steps), and almost the same training duration, Fine Tuning / DreamBooth training of FLUX yields the very best results

So yes, Fine Tuning is much better than LoRA training itself

Amazing resemblance and quality, with the least amount of overfitting issues

Moreover, extracting a LoRA from the fine-tuned full checkpoint yields way better results than LoRA training itself

Extracting a LoRA from fully trained checkpoints yielded way better results in SD 1.5 and SDXL as well
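
For intuition, LoRA extraction is typically done by taking a truncated SVD of the weight difference between the fine-tuned and base checkpoints; here is a minimal per-layer sketch (not Kohya's exact implementation):

```python
import torch

def extract_lora(w_base: torch.Tensor, w_tuned: torch.Tensor, rank: int = 128):
    """Approximate the fine-tuning delta (w_tuned - w_base) as a low-rank product b @ a."""
    delta = (w_tuned - w_base).float()
    u, s, vh = torch.linalg.svd(delta, full_matrices=False)
    b = u[:, :rank] * s[:rank]   # (out_features, rank), singular values folded in
    a = vh[:rank, :]             # (rank, in_features)
    return b, a                  # b @ a approximates delta at the chosen rank

# Hypothetical usage on one linear layer's weight matrices:
# b, a = extract_lora(base_state["weight"], tuned_state["weight"], rank=640)
```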

A comparison of these 3 is made in Image 5 (check the very top of the images)

A 640 network dimension (rank) FP16 LoRA takes 6.1 GB of disk space

You can also try a 128 network dimension (rank) FP16 extraction and different LoRA strengths during inference to make it closer to the fine-tuned model

Moreover, you can try the Resize LoRA feature of Kohya GUI, but hopefully that will be another research article later

Image Raw Links
Image 1 : MonsterMMORPG/FLUX-Fine-Tuning-Grid-Tests

Image 2 : MonsterMMORPG/FLUX-Fine-Tuning-Grid-Tests

Image 3 : MonsterMMORPG/FLUX-Fine-Tuning-Grid-Tests

Image 4 : MonsterMMORPG/FLUX-Fine-Tuning-Grid-Tests

Image 5 : MonsterMMORPG/FLUX-Fine-Tuning-Grid-Tests
Tonic posted an update 2 days ago