Kuldeep Singh Sidhu

singhsidhukuldeep

AI & ML interests

😃 TOP 3 on HuggingFace for posts 🤗 Seeking contributors for a completely open-source 🚀 Data Science platform! singhsidhukuldeep.github.io

Recent Activity

posted an update 3 days ago
Exciting Research Alert: Multimodal Semantic Retrieval Revolutionizing E-commerce Product Search!

Just came across a fascinating paper from @amazon researchers that tackles a crucial challenge in e-commerce search - integrating both text and image data for better product discovery.

>> Key Innovations
The researchers developed two groundbreaking architectures:
- A 4-tower multimodal model combining BERT and CLIP for processing both text and images
- A streamlined 3-tower model that achieves comparable performance with reduced complexity

>> Technical Deep Dive
The system leverages dual-encoder architecture with some impressive components:
- Bi-encoder BERT model for processing text queries and product descriptions
- Visual transformers from CLIP for image processing
- Advanced fusion techniques including concatenation and MLP-based approaches
- Cosine similarity scoring for efficient large-scale retrieval

>> Real-world Impact
The results are remarkable:
- Up to 78.6% recall@100 for product retrieval
- Over 50% exact match precision
- Significant reduction in irrelevant results to just 11.9%

>> Industry Applications
This research has major implications for:
- E-commerce search optimization
- Visual product discovery
- Large-scale retrieval systems
- Cross-modal product recommendations

What's particularly impressive is how the system handles millions of products while maintaining computational efficiency through smart architectural choices.

This work represents a significant step forward in making online shopping more intuitive and accurate. The researchers from Amazon have demonstrated that combining visual and textual information can dramatically improve search relevance while maintaining scalability.
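To make the fusion and scoring steps in the post above more concrete, here is a minimal sketch of concatenation-based fusion followed by cosine-similarity retrieval and a recall@k check. It is not the paper's implementation: the function names, the toy 2-dimensional "text" and "image" embeddings, and the assumption that the query already lives in the same fused space as the products are all hypothetical stand-ins for real BERT and CLIP outputs.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (returns a copy if the norm is zero)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def fuse_by_concatenation(text_emb, image_emb):
    """Normalize each modality, concatenate, then normalize the result."""
    return l2_normalize(l2_normalize(text_emb) + l2_normalize(image_emb))

def cosine(a, b):
    """Dot product of two unit vectors, which equals their cosine similarity."""
    return sum(x * y for x, y in zip(a, b))

def retrieve(query_vec, product_vecs, k):
    """Return the ids of the k products most similar to the query."""
    ranked = sorted(
        product_vecs.items(),
        key=lambda item: cosine(query_vec, item[1]),
        reverse=True,
    )
    return [product_id for product_id, _ in ranked[:k]]

def recall_at_k(retrieved_ids, relevant_ids):
    """Fraction of the relevant items that appear in the retrieved list."""
    if not relevant_ids:
        return 0.0
    return len(set(retrieved_ids) & set(relevant_ids)) / len(relevant_ids)

# Hypothetical toy catalog: (text embedding, image embedding) per product.
catalog = {
    "red_shoe": ([1.0, 0.0], [1.0, 0.0]),
    "blue_shoe": ([1.0, 0.0], [0.0, 1.0]),
    "red_hat": ([0.0, 1.0], [1.0, 0.0]),
}
product_vecs = {
    pid: fuse_by_concatenation(text, image)
    for pid, (text, image) in catalog.items()
}

# Assumption: the query is embedded into the same fused space as the products.
query_vec = fuse_by_concatenation([1.0, 0.0], [0.8, 0.6])

top_2 = retrieve(query_vec, product_vecs, k=2)
recall = recall_at_k(top_2, {"red_shoe"})
```

In a production system the brute-force sort would typically be replaced by an approximate nearest-neighbor index, and an MLP-based fusion would replace the plain concatenation when the query and product towers need matching dimensions.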
posted an update 5 days ago
Exciting breakthrough in large-scale recommendation systems! ByteDance researchers have developed a novel real-time indexing method called "Streaming Vector Quantization" (Streaming VQ) that revolutionizes how recommendations work at scale.

>> Key Innovations
- Real-time Indexing: Unlike traditional methods that require periodic reconstruction of indexes, Streaming VQ attaches items to clusters in real time, enabling immediate capture of emerging trends and user interests.
- Superior Balance: The system achieves remarkable index balancing through innovative techniques like merge-sort modification and popularity-aware cluster assignment, ensuring all clusters participate effectively in recommendations.
- Implementation Efficiency: Built on VQ-VAE architecture, Streaming VQ features a lightweight and clear framework that makes it highly implementation-friendly for large-scale deployments.

>> Technical Deep Dive
The system operates in two key stages:
- An indexing step using a two-tower architecture for real-time item-cluster assignment
- A ranking step that employs sophisticated attention mechanisms and deep neural networks for precise recommendations.

>> Real-world Impact
Already deployed in Douyin and Douyin Lite, replacing all major retrievers and delivering significant user engagement improvements. The system handles a billion-scale corpus while maintaining exceptional performance and computational efficiency.

This represents a significant leap forward in recommendation system architecture, especially for platforms dealing with dynamic, rapidly-evolving content. The ByteDance team's work demonstrates how rethinking fundamental indexing approaches can lead to substantial real-world improvements.
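The idea of attaching items to clusters as they arrive, rather than rebuilding the index periodically, can be illustrated with a small sketch. This is not ByteDance's algorithm: the class name `StreamingClusterIndex`, the size-based balance penalty, and the exponential-moving-average centroid update are hypothetical stand-ins chosen only to show what real-time assignment with a balancing term might look like.

```python
import math

class StreamingClusterIndex:
    """Toy online index: each item is attached to a cluster on arrival."""

    def __init__(self, centroids, balance_weight=0.1, learning_rate=0.05):
        self.centroids = [list(c) for c in centroids]
        self.counts = [0] * len(self.centroids)  # items currently per cluster
        self.item_to_cluster = {}
        self.balance_weight = balance_weight      # hypothetical penalty strength
        self.learning_rate = learning_rate        # hypothetical EMA step size

    @staticmethod
    def _distance(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    def assign(self, item_id, embedding):
        """Attach (or re-attach) an item to the lowest-cost cluster."""
        # Detach the item from its previous cluster before re-scoring.
        previous = self.item_to_cluster.pop(item_id, None)
        if previous is not None:
            self.counts[previous] -= 1

        total = sum(self.counts) or 1

        def cost(cluster):
            # Distance to the centroid plus a penalty for crowded clusters.
            crowding = self.counts[cluster] / total
            return (
                self._distance(embedding, self.centroids[cluster])
                + self.balance_weight * crowding
            )

        best = min(range(len(self.centroids)), key=cost)
        self.item_to_cluster[item_id] = best
        self.counts[best] += 1

        # Nudge the winning centroid toward the new embedding (EMA update).
        lr = self.learning_rate
        self.centroids[best] = [
            (1 - lr) * c + lr * e
            for c, e in zip(self.centroids[best], embedding)
        ]
        return best

# Hypothetical usage with two clusters in a 2-dimensional space.
index = StreamingClusterIndex(centroids=[[0.0, 0.0], [10.0, 10.0]])
first = index.assign("a", [1.0, 1.0])    # lands near the first centroid
second = index.assign("b", [9.0, 9.0])   # lands near the second centroid
moved = index.assign("a", [9.0, 10.0])   # item "a" drifts and is re-attached
```

Because every call to `assign` updates the mapping immediately, an item whose embedding changes is re-indexed on the spot, which is the property the post highlights; the balancing term here is only a crude proxy for the popularity-aware assignment it describes.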

Organizations

MLX Community, Social Post Explorers, C4AI Community

Posts 117

Exciting breakthrough in Retrieval-Augmented Generation (RAG): Introducing MiniRAG - a revolutionary approach that makes RAG systems accessible for edge devices and resource-constrained environments.

Key innovations that set MiniRAG apart:

Semantic-aware Heterogeneous Graph Indexing
- Combines text chunks and named entities in a unified structure
- Reduces reliance on complex semantic understanding
- Creates rich semantic networks for precise information retrieval

Lightweight Topology-Enhanced Retrieval
- Leverages graph structures for efficient knowledge discovery
- Uses pattern matching and localized text processing
- Implements query-guided reasoning path discovery
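As a rough illustration of the two ideas above, the sketch below builds a small graph that links named entities to the text chunks mentioning them, then retrieves chunks by matching entity names in the query and following the links. It is not the MiniRAG implementation: the class `HeteroGraphIndex`, its methods, and the hand-supplied entity lists are hypothetical, and a real system would extract entities automatically and score paths more carefully.

```python
import re
from collections import defaultdict

class HeteroGraphIndex:
    """Toy graph with two node types: text chunks and named entities."""

    def __init__(self):
        self.chunks = {}                           # chunk id -> chunk text
        self.entity_to_chunks = defaultdict(set)   # entity -> linked chunk ids

    def add_chunk(self, chunk_id, text, entities):
        """Store a chunk and connect it to each entity it mentions."""
        self.chunks[chunk_id] = text
        for entity in entities:
            self.entity_to_chunks[entity.lower()].add(chunk_id)

    def retrieve(self, query, k=2):
        """Rank chunks by how many query-matched entities link to them."""
        query_lower = query.lower()
        scores = defaultdict(int)
        for entity, chunk_ids in self.entity_to_chunks.items():
            # Plain pattern matching: whole-phrase match of the entity name.
            if re.search(r"\b" + re.escape(entity) + r"\b", query_lower):
                for chunk_id in chunk_ids:
                    scores[chunk_id] += 1
        # Highest score first; ties broken by chunk id for determinism.
        ranked = sorted(scores, key=lambda cid: (-scores[cid], cid))
        return ranked[:k]

# Hypothetical corpus with entities supplied by hand.
graph = HeteroGraphIndex()
graph.add_chunk(
    "c1",
    "Ada Lovelace worked with Charles Babbage on the Analytical Engine.",
    ["Ada Lovelace", "Charles Babbage", "Analytical Engine"],
)
graph.add_chunk(
    "c2",
    "Charles Babbage designed the Difference Engine.",
    ["Charles Babbage", "Difference Engine"],
)
graph.add_chunk(
    "c3",
    "Alan Turing proposed the Turing machine.",
    ["Alan Turing", "Turing machine"],
)

hits = graph.retrieve("What did Ada Lovelace and Charles Babbage build?")
```

The retrieval step needs no embedding model at query time, only string matching and set lookups, which is the kind of low-cost operation that suits small language models and edge devices.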

Impressive Performance Metrics
- Achieves comparable results to LLM-based methods while using Small Language Models (SLMs)
- Requires only 25% of storage space compared to existing solutions
- Maintains robust performance with accuracy reduction ranging from just 0.8% to 20%

The researchers from the University of Hong Kong have also contributed a comprehensive benchmark dataset specifically designed for evaluating lightweight RAG systems under realistic on-device scenarios.

This breakthrough opens new possibilities for:
- Edge device AI applications
- Privacy-sensitive implementations
- Real-time processing systems
- Resource-constrained environments

The full implementation and datasets are available on GitHub: HKUDS/MiniRAG
models

None public yet

datasets

None public yet