hassenhamdi
4 followers · 47 following
AI & ML interests
None yet
Recent Activity
reacted to singhsidhukuldeep's post with 🧠 1 day ago
O1 Embedder: Transforming Retrieval Models with Reasoning Capabilities

Researchers from University of Science and Technology of China and Beijing Academy of Artificial Intelligence have developed a novel retrieval model that mimics the slow-thinking capabilities of reasoning-focused LLMs like OpenAI's O1 and DeepSeek's R1.

Unlike traditional embedding models that directly match queries with documents, O1 Embedder first generates thoughtful reflections about the query before performing retrieval. This two-step process significantly improves performance on complex retrieval tasks, especially those requiring intensive reasoning or zero-shot generalization to new domains.

The technical implementation is fascinating:
- The model integrates two essential functions: Thinking and Embedding
- It uses an "Exploration-Refinement" data synthesis workflow where initial thoughts are generated by an LLM and refined by a retrieval committee
- A multi-task training method fine-tunes a pre-trained LLM to generate retrieval thoughts via behavior cloning while simultaneously learning embedding capabilities through contrastive learning
- Memory-efficient joint training enables both tasks to share encoding results, dramatically increasing batch size

The results are impressive: O1 Embedder outperforms existing methods across 12 datasets in both in-domain and out-of-domain scenarios. For example, it achieves a 3.9% improvement on Natural Questions and a 3.0% boost on HotPotQA compared to models without thinking capabilities.

This approach represents a significant paradigm shift in retrieval technology, bridging the gap between traditional dense retrieval and the reasoning capabilities of large language models.

What do you think about this approach? Could "thinking before retrieval" transform how we build search systems?
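To make the "contrastive learning" part of the post concrete, here is a minimal NumPy sketch of an InfoNCE-style loss with in-batch negatives. The post does not state the exact loss O1 Embedder uses, so the loss form, the `temperature` value, and the function name `info_nce_loss` are assumptions chosen for illustration, not the authors' implementation.

```python
import numpy as np

def info_nce_loss(queries, docs, temperature=0.05):
    """Illustrative InfoNCE loss: row i of `docs` is the positive for row i
    of `queries`; every other row in the batch serves as a negative."""
    # Cosine similarity via L2-normalised embeddings.
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    d = docs / np.linalg.norm(docs, axis=1, keepdims=True)
    logits = q @ d.T / temperature
    # Numerically stable log-softmax over each query's row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Negative log-likelihood of the matching (diagonal) pairs.
    return float(-np.mean(np.diag(log_probs)))

rng = np.random.default_rng(0)
docs = rng.standard_normal((4, 8))
# Toy "query" embeddings: each is a slightly noisy copy of its document.
queries = docs + 0.01 * rng.standard_normal((4, 8))

aligned = info_nce_loss(queries, docs)
shuffled = info_nce_loss(queries, np.roll(docs, 1, axis=0))
print(aligned, shuffled)
```

Correctly paired queries and documents should give a much lower loss than the shuffled pairing, which is the signal a contrastive objective uses to pull matching pairs together.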
upvoted a collection 3 days ago
Siglip2 Custom
replied to wassemgtk's post 3 days ago
# GESAL: Real-Time Adaptation for LLMs

We’re excited to unveil **Graph-Enhanced Singular Adaptive Learning (GESAL)**, a framework that lets LLMs like `meta-llama/Llama-3.2-1B` adapt in real time using user feedback. Check out the code and white paper on GitHub!

🔗 **Code**: [https://github.com/writer/AI-Adaptive-Learning-GESAL](https://github.com/writer/AI-Adaptive-Learning-GESAL)

---

## Why GESAL?

Static LLMs struggle to adapt without heavy retraining. GESAL solves this with:

- **SVF**: Adapts weights via \( W' = U (\Sigma \cdot z) V^T \), using few parameters.
- **Graph Memory**: Stores adaptations in nodes for scalability.
- **RL**: Updates via \( J(z) = \mathbb{E}[\log \pi_z(y|x) r] \) based on feedback.

---

## How It Works

Ask "How many R’s in ‘strawberry’?" If it says "2" and you say "no," GESAL learns to say "3" next time, avoiding repeats.

---

## Try It

Built with Hugging Face’s `transformers`:

```bash
pip install transformers torch numpy
# Quote the filename: unquoted parentheses are a syntax error in bash.
python "Adaptive_Learning_(GESAL).py"
```

Needs a Hugging Face token for Llama-3.2-1B.

---

## Results

GESAL hits 95% accuracy after 5 feedbacks vs. LoRA’s 70%. It’s efficient (~0.5M params) and scalable.
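The SVF formula \( W' = U (\Sigma \cdot z) V^T \) from the post can be illustrated with a small NumPy sketch. This is only a toy reading of the formula on a random matrix, assuming `z` is a vector with one scale factor per singular value; the helper name `svf_adapt` is hypothetical and this is not the code from the GESAL repository.

```python
import numpy as np

def svf_adapt(W, z):
    """Rescale the singular values of W element-wise by z: W' = U (S * z) V^T."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    # Multiplying the columns of U by (S * z) equals U @ diag(S * z).
    return (U * (S * z)) @ Vt

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 3))   # stand-in for one weight matrix

# z = 1 everywhere leaves the weights unchanged.
W_same = svf_adapt(W, np.ones(3))

# Only 3 numbers are adapted here, versus 12 entries in W itself.
z = np.array([1.5, 1.0, 0.5])
W_new = svf_adapt(W, z)
print(np.linalg.svd(W_new, compute_uv=False))
```

The point of the design is parameter count: the adapted quantity is one scalar per singular value instead of a full matrix, which is consistent with the post's claim that SVF uses few parameters.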
hassenhamdi's activity
liked a Space 3 days ago
Running
4
4
Tech Tree Blog
🌳
liked 3 models 3 days ago
jfkback/hypencoder.4_layer • Feature Extraction • Updated 11 days ago • 42 • 1
jfkback/hypencoder.2_layer • Feature Extraction • Updated 11 days ago • 31 • 1
jfkback/hypencoder.8_layer • Feature Extraction • Updated 11 days ago • 138 • 1
liked 6 models 12 days ago
NousResearch/DeepHermes-3-Llama-3-8B-Preview • Text Generation • Updated 10 days ago • 9.38k • 264
tomg-group-umd/huginn-0125 • Text Generation • Updated 5 days ago • 8.79k • 236
Zyphra/Zonos-v0.1-transformer • Text-to-Speech • Updated 13 days ago • 109k • 367
deepseek-ai/DeepSeek-R1 • Text Generation • Updated 5 days ago • 4.63M • 10.5k
hexgrad/Kokoro-82M • Text-to-Speech • Updated 1 day ago • 1.29M • 3.47k
microsoft/OmniParser-v2.0 • Image-Text-to-Text • Updated 11 days ago • 6.73k • 1.03k
liked a model 3 months ago
Lightricks/LTX-Video • Image-to-Video • Updated 24 days ago • 341k • 1k
liked a model 4 months ago
hassenhamdi/SSD-1B-fp8_e4m3fn • Text-to-Image • Updated Nov 13, 2024 • 1