lab

lab212

AI & ML interests

None yet

Organizations

None yet

lab212's activity

liked a dataset about 1 month ago

fka/awesome-chatgpt-prompts

Viewer • Updated Jan 6 • 203 • 11.8k • 7.59k

reacted to m-ric's post with 👍 about 1 month ago

Post

3266

Today we make the biggest release in smolagents so far: 𝘄𝗲 𝗲𝗻𝗮𝗯𝗹𝗲 𝘃𝗶𝘀𝗶𝗼𝗻 𝗺𝗼𝗱𝗲𝗹𝘀, 𝘄𝗵𝗶𝗰𝗵 𝗮𝗹𝗹𝗼𝘄𝘀 𝘁𝗼 𝗯𝘂𝗶𝗹𝗱 𝗽𝗼𝘄𝗲𝗿𝗳𝘂𝗹 𝘄𝗲𝗯 𝗯𝗿𝗼𝘄𝘀𝗶𝗻𝗴 𝗮𝗴𝗲𝗻𝘁𝘀! 🥳

Our agents can now casually open up a web browser, and navigate on it by scrolling, clicking elements on the webpage, going back, just like a user would.

The demo below shows Claude-3.5-Sonnet browsing GitHub for the task: "Find how many commits the author of the current top trending repo did over last year."
Hi @mlabonne !

Go try it out, it's the most cracked agentic stuff I've seen in a while 🤯 (well, along with OpenAI's Operator who beat us by one day)

For more detail, read our announcement blog 👉 https://huggingface.co./blog/smolagents-can-see
The code for the web browser example is here 👉 https://github.com/huggingface/smolagents/blob/main/examples/vlm_web_browser.py

3 replies

reacted to burtenshaw's post with 🧠 about 1 month ago

Post

46036

We’re launching a FREE and CERTIFIED course on Agents!

We're thrilled to announce the launch of the Hugging Face Agents course on Learn! This interactive, certified course will guide you through building and deploying your own AI agents.

Here's what you'll learn:

- Understanding Agents: We'll break down the fundamentals of AI agents, showing you how they use LLMs to perceive their environment (observations), reason about it (thoughts), and take actions. Think of a smart assistant that can book appointments, answer emails, or even write code based on your instructions.
- Building with Frameworks: You'll dive into popular agent frameworks like LangChain, LlamaIndex and smolagents. These tools provide the building blocks for creating complex agent behaviors.
- Real-World Applications: See how agents are used in practice, from automating SQL queries to generating code and summarizing complex documents.
- Certification: Earn a certification by completing the course modules, implementing a use case, and passing a benchmark assessment. This proves your skills in building and deploying AI agents.
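
The observation, thought, and action cycle described in the first bullet can be sketched in a few lines. This is only a toy illustration, not code from the course or from any of the frameworks named above: `ToyAgent`, `rule_based_decide`, and `calculator` are hypothetical names, and a hard-coded rule stands in for the LLM.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# A decision is (thought, action, argument); a trace entry is (observation, thought, action).
Step = Tuple[str, str, str]


@dataclass
class ToyAgent:
    decide: Callable[[str], Step]           # stand-in for the LLM (assumption)
    tools: Dict[str, Callable[[str], str]]  # actions the agent is allowed to take
    max_steps: int = 5
    trace: List[Step] = field(default_factory=list)

    def run(self, task: str) -> str:
        observation = task
        for _ in range(self.max_steps):
            # Reason about the current observation and pick an action.
            thought, action, argument = self.decide(observation)
            self.trace.append((observation, thought, action))
            if action == "final_answer":
                return argument
            # Acting produces the next observation.
            observation = self.tools[action](argument)
        return "stopped: step limit reached"


def rule_based_decide(observation: str) -> Step:
    # Hypothetical rule that replaces a real model call.
    if observation.startswith("result:"):
        value = observation[len("result:"):].strip()
        return ("I have the result, so I can answer.", "final_answer", value)
    return ("I need to compute this.", "calculator", observation)


def calculator(expression: str) -> str:
    # Toy tool: only handles "a + b" with integers.
    left, right = expression.split("+")
    return f"result: {int(left) + int(right)}"


agent = ToyAgent(decide=rule_based_decide, tools={"calculator": calculator})
answer = agent.run("2 + 3")
```

In a real agent the `decide` callable would wrap a model call and the tools would be things like a browser or a SQL client; the loop structure is the part this sketch is meant to show.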

Audience

This course is designed for anyone interested in the future of AI. Whether you're a developer, data scientist, or simply curious about AI, this course will equip you with the knowledge and skills to build your own intelligent agents.

Enroll today and start building the next generation of AI agent applications!

https://bit.ly/hf-learn-agents

29 replies

replied to chansung's post 4 months ago

Thanks pardner.

reacted to chansung's post with 👍 4 months ago

Post

1956

🎙️ Listen to the audio "Podcast" of every single Hugging Face Daily Papers.

Now, the "AI Paper Reviewer" project can automatically generate audio podcasts for any paper published on arXiv, and this is integrated into the GitHub Action pipeline. It sounds pretty similar to #NotebookLM in my opinion.

🎙️ Try it out yourself at https://deep-diver.github.io/ai-paper-reviewer/

This audio podcast is powered by Google technologies: 1) the Google DeepMind Gemini 1.5 Flash model generates the podcast script, then 2) Google Cloud Vertex AI's Text-to-Speech model synthesizes the script into natural-sounding voices (with the latest addition of the "Journey" voice style).
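
The two-stage flow described above (script generation, then speech synthesis) could be wired together roughly as follows. This is a sketch under assumptions, not the project's actual code: `make_podcast`, `stub_script_writer`, and `stub_tts` are hypothetical names, and the stubs stand in for the Gemini and Text-to-Speech calls so the example runs without credentials.

```python
from typing import Callable, List


def make_podcast(
    paper_text: str,
    write_script: Callable[[str], List[str]],  # stage 1: paper -> script lines
    synthesize: Callable[[str], bytes],        # stage 2: one line -> audio bytes
) -> bytes:
    # Generate the full script first, then synthesize it line by line.
    script_lines = write_script(paper_text)
    return b"".join(synthesize(line) for line in script_lines)


def stub_script_writer(paper_text: str) -> List[str]:
    # Placeholder for an LLM call; uses the first line as the paper title.
    title = paper_text.splitlines()[0]
    return [f"Welcome! Today we discuss: {title}.", "Thanks for listening."]


def stub_tts(line: str) -> bytes:
    # Placeholder for a text-to-speech call; returns text bytes, not real audio.
    return line.encode("utf-8") + b"\n"


audio = make_podcast("A Toy Paper\nAbstract...", stub_script_writer, stub_tts)
```

Passing the two stages in as callables keeps the pipeline independent of any one provider, so either stage can be swapped without touching the other.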

"AI Paper Reviewer" is also an open source project. Anyone can use it to build and own a personal blog on any papers of interest. Check out the project repository below if you are interested:
https://github.com/deep-diver/paper-reviewer

This project will soon support other models, including open-weights ones, for both text-based content generation and voice synthesis for the podcast. The only reason I chose the Gemini model is that it offers a "free tier", which is enough to shape up this project with non-realtime batch generation. I'm excited to see how others will use this tool to explore the world of AI research, so feel free to share your feedback and suggestions!

3 replies

liked a Space 5 months ago

9.55k

AI Comic Factory

👩

Create your own AI comic with a single prompt

reacted to nicolay-r's post with 🧠 5 months ago

Post

1008

📢 Two weeks ago I got a chance to present the most recent reasoning 🧠 capabilities of Large Language Models in Sentiment Analysis at NLPSummit-2024.

For those who missed it and still wish to find out about the advances of GenAI in that field, the recording is now available:
https://www.youtube.com/watch?v=qawLJsRHzB4

You will learn:
☑️ how well LLM reasoning works for sentiment analysis in a zero-shot setting,
☑️ how to improve reasoning by applying step-by-step chains (Chain-of-Thought),
☑️ how to prepare the most advanced model in sentiment analysis using Chain-of-Thought.
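
To make the contrast between zero-shot and Chain-of-Thought prompting concrete, here is a small sketch that builds both kinds of prompt for a targeted sentiment question. The prompt wording and the three-step breakdown are illustrative assumptions, not the prompts used in the talk, the paper, or the linked framework, and no model is called.

```python
from typing import List


def zero_shot_prompt(sentence: str, target: str) -> str:
    # Ask for the label directly, with no intermediate reasoning.
    return (
        f'Sentence: "{sentence}"\n'
        f'What is the sentiment towards "{target}"? '
        "Answer with positive, negative, or neutral."
    )


def chain_of_thought_prompts(sentence: str, target: str) -> List[str]:
    # Break the question into steps; each answer would feed the next prompt.
    context = f'Sentence: "{sentence}"'
    return [
        f'{context}\nStep 1: Which aspect of "{target}" is being discussed?',
        f'Step 2: Given that aspect, what opinion does the sentence express about "{target}"?',
        f'Step 3: Given that opinion, is the sentiment towards "{target}" positive, negative, or neutral?',
    ]


sentence = "The battery lasts all day, but the screen scratches easily."
direct = zero_shot_prompt(sentence, "screen")
steps = chain_of_thought_prompts(sentence, "screen")
```

In practice each step's model response would be appended to the conversation before the next step is sent, so the final label is conditioned on the intermediate reasoning.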

Links:
📜 Paper: Large Language Models in Targeted Sentiment Analysis (2404.12342)
⭐ Code: https://github.com/nicolay-r/Reasoning-for-Sentiment-Analysis-Framework

reacted to reach-vb's post with 👍 5 months ago

Post

3205

NEW: Open-source text/image-to-video model is out - MIT licensed - Rivals Gen-3, Pika & Kling 🔥

> Pyramid Flow: Training-efficient Autoregressive Video Generation method
> Utilizes Flow Matching
> Trains on open-source datasets
> Generates high-quality 10-second videos
> Video resolution: 768p
> Frame rate: 24 FPS
> Supports image-to-video generation

> Model checkpoints available on the hub 🤗: rain1011/pyramid-flow-sd3

liked a Space 5 months ago

879

Screenshot to HTML

⚡

Convert screenshots to HTML code

liked a Space 10 months ago

585

Instant Video

⚡

Fast Text 2 Video Generator

reacted to KingNish's post with ❤️ 10 months ago

Post

5262

Introducing OpenGPT-4o
KingNish/OpenGPT-4o

Features:
1️⃣ Inputs possible are Text ✏️, Text + Image 📝🖼️, Audio 🎧, WebCam📸
and outputs possible are Image 🖼️, Image + Text 🖼️📝, Text 📝, Audio 🎧
2️⃣ Flat 100% FREE 💸 and Super-fast ⚡.
3️⃣ Publicly Available before GPT 4o.

Future Features:
1️⃣ Chat with PDF (Both voice and text)
2️⃣ Video generation.
3️⃣ Sequential Image Generation.
4️⃣ Better UI and customization.

Note: It's not possible to reach the level of complexity of GPT-4o, because OpenAI has been developing GPT-4o for six months with a team of over 450 experienced members, whereas I am only one person. Moreover, they haven't released it fully publicly, so it remains a test model.

32 replies