Ame Vi

Ameeeee

AI & ML interests

None yet

Recent Activity

reacted to fdaudens's post with 🔥 1 day ago

Is this the best tool to extract clean info from PDFs, handwriting and complex documents yet? Open source olmOCR just dropped and the results are impressive. Tested the free demo with various documents, including a handwritten Claes Oldenburg letter. The speed is impressive: 3000 tokens/second on your own GPU - that's 1/32 the cost of GPT-4o ($190/million pages). Game-changer for content extraction and digital archives. To achieve this, Ai2 trained a 7B vision language model on 260K pages from 100K PDFs using "document anchoring" - combining PDF metadata with page images. Best part: it actually understands document structure (columns, tables, equations) instead of just jumbling everything together like most OCR tools. Their human eval results back this up. 👉 Try the demo: https://olmocr.allenai.org Going right into the AI toolkit: https://huggingface.co./spaces/JournalistsonHF/ai-toolkit

reacted to burtenshaw's post with 👍 1 day ago

I made a real time voice agent with FastRTC, smolagents, and hugging face inference providers. Check it out in this space: 🔗 https://huggingface.co./spaces/burtenshaw/coworking_agent

upvoted an article 3 days ago

Synthetic data: save money, time and carbon with open source

View all activity

Organizations

Ameeeee's activity

reacted to fdaudens's post with 🔥 1 day ago

Post

2649

Is this the best tool to extract clean info from PDFs, handwriting and complex documents yet?

Open source olmOCR just dropped and the results are impressive.

Tested the free demo with various documents, including a handwritten Claes Oldenburg letter. The speed is impressive: 3000 tokens/second on your own GPU - that's 1/32 the cost of GPT-4o ($190/million pages). Game-changer for content extraction and digital archives.

To achieve this, Ai2 trained a 7B vision language model on 260K pages from 100K PDFs using "document anchoring" - combining PDF metadata with page images.

Best part: it actually understands document structure (columns, tables, equations) instead of just jumbling everything together like most OCR tools. Their human eval results back this up.

👉 Try the demo: https://olmocr.allenai.org

Going right into the AI toolkit: JournalistsonHF/ai-toolkit

3 replies

reacted to burtenshaw's post with 👍 1 day ago

Post

2532

I made a real time voice agent with FastRTC, smolagents, and hugging face inference providers. Check it out in this space:

🔗 burtenshaw/coworking_agent

5 replies

reacted to davidberenstein1957's post with 😎➕🤗🧠🚀🔥 3 months ago

Post

1719

Let’s make a generation of amazing image-generation models

The best image generation models are trained on human preference datasets, where annotators have selected the best image from a choice of two. Unfortunately, many of these datasets are closed source so the community cannot train open models on them. Let’s change that!

The community can contribute image preferences for an open-source dataset that could be used for building AI models that convert text to image, like the flux or stable diffusion families. The dataset will be open source so everyone can use it to train models that we can all use.

Blog: https://huggingface.co./blog/burtenshaw/image-preferences

posted an update 3 months ago

Post

1283

Build a fine-tuning dataset with No Code.

Do you want to build a small dataset for creative writing to fine-tune an Open LLM?
- Find a dataset full of conversations with ChatGPT on the Hugging Face Hub.
- Import it into your Argilla Space.
- Preview the dataset and create a question to label the relevant conversations.
- Label 1000 valid examples of creating writing.
- Use this dataset with Autotrain to fine-tune your model.

reacted to reach-vb's post with 🔥 3 months ago

Post

4471

What a brilliant week for Open Source AI!

Qwen 2.5 Coder by Alibaba - 0.5B / 1.5B / 3B / 7B / 14B/ 32B (Base + Instruct) Code generation LLMs, with 32B tackling giants like Gemnini 1.5 Pro, Claude Sonnet
Qwen/qwen25-coder-66eaa22e6f99801bf65b0c2f

LLM2CLIP from Microsoft - Leverage LLMs to train ultra-powerful CLIP models! Boosts performance over the previous SOTA by ~17%
microsoft/llm2clip-672323a266173cfa40b32d4c

Athene v2 Chat & Agent by NexusFlow - SoTA general LLM fine-tuned from Qwen 2.5 72B excels at Chat + Function Calling/ JSON/ Agents
Nexusflow/athene-v2-6735b85e505981a794fb02cc

Orca Agent Instruct by Microsoft - 1 million instruct pairs covering text editing, creative writing, coding, reading comprehension, etc - permissively licensed
microsoft/orca-agentinstruct-1M-v1

Ultravox by FixieAI - 70B/ 8B model approaching GPT4o level, pick any LLM, train an adapter with Whisper as Audio Encoder
reach-vb/ultravox-audio-language-model-release-67373b602af0a52b2a88ae71

JanusFlow 1.3 by DeepSeek - Next iteration of their Unified MultiModal LLM Janus with RectifiedFlow
deepseek-ai/JanusFlow-1.3B

Common Corpus by Pleais - 2,003,039,184,047 multilingual, commercially permissive and high quality tokens!
PleIAs/common_corpus

I'm sure I missed a lot, can't wait for the next week!

Put down in comments what I missed! 🤗

reacted to maxiw's post with 🤗 4 months ago

Post

4659

I was curious to see what people post here on HF so I created a dataset with all HF Posts: maxiw/hf-posts

Some interesting stats:

Top 5 Authors by Total Impressions:
-----------------------------------
@merve : 171,783 impressions (68 posts)
@fdaudens : 135,253 impressions (81 posts)
@singhsidhukuldeep : 122,591 impressions (81 posts)
@akhaliq : 119,526 impressions (78 posts)
@MonsterMMORPG : 112,500 impressions (45 posts)

Top 5 Users by Number of Reactions Given:
----------------------------------------
@osanseviero : 1278 reactions
@clem : 910 reactions
@John6666 : 899 reactions
@victor : 674 reactions
@samusenps : 655 reactions

Top 5 Most Used Reactions:
-------------------------
❤️: 7048 times
🔥: 5921 times
👍: 4856 times
🚀: 2549 times
🤗: 2065 times

10 replies

reacted to davidberenstein1957's post with 😎🔥🚀 6 months ago

Post

1825

🌟 Argilla v2.1.0 goes multi-modal: Image Field, Dark Mode, Enhanched Hugging Face Hub imports and more!

🖼 Image Field: Seamlessly work with multimodal datasets
🌓 Dark Mode: Reduce eye strain with our sleek new look
🤗 Enhanced Hugging Face Hub import with the SDK
🇪🇸 Spanish UI: Breaking language barriers

Plus more improvements to supercharge your model curation workflow!

Check out the full announcement for details and code examples: https://github.com/argilla-io/argilla/compare/v2.0.1...v2.1.0

posted an update 7 months ago

Post

3599

❤️‍🔥 Just released version 2.0 of Argilla!

This small revolution includes:

🔌 You can now integrate with the Hugging Face Hub and get started in under five minutes.
🪂 A single Dataset class is now designed to handle multiple tasks.
🔧 It’s 100 times simpler to configure your dataset now with the new SDK!
📖 The documentation has been revamped to be cleaner and more user-friendly.
🍌 A new feature automates splitting annotation tasks among a team.
✍️ The layout has been made more flexible to accommodate many use cases.

Check out the release highlights for more details: https://github.com/argilla-io/argilla/releases/tag/v2.0.0

1 reply

reacted to davidberenstein1957's post with 🔥 7 months ago

Post

2201

💎 I created some shiny new Argilla datasets to go along with the 2.0 release!

import argilla as rg  

ds = rg.Dataset.from_hub(
    "argilla/multi-modal-vlm-visit-bench"
)

argilla/argilla-v20-compatible-datasets-66a8e670f351acac61a0421c

2 replies