Nicolay Rusnachenko

nicolay-r

https://nicolay-r.github.io/

AI & ML interests

Information Retrieval・Medical Multimodal NLP (🖼+📝) Research Fellow @BU_Research・software developer http://arekit.io・PhD in NLP

Recent Activity

reacted to DawnC's post with ❤️ about 12 hours ago

🌟 PawMatchAI: Making Breed Selection More Intuitive! 🐕
Excited to share the latest update to this AI-powered companion for finding your perfect furry friend! The breed recommendation system just got a visual upgrade to help you make better decisions.
✨ What's New?
Enhanced breed recognition accuracy through strategic model improvements:
- Upgraded to a fine-tuned ConvNeXt architecture for superior feature extraction
- Implemented progressive layer unfreezing during training
- Optimized data augmentation pipeline for better generalization
- Achieved 8% improvement in breed classification accuracy
🎯 Key Features:
- Smart breed recognition powered by AI
- Visual matching scores with intuitive color indicators
- Detailed breed comparisons with interactive tooltips
- Lifestyle-based recommendations tailored to your needs
💭 Project Vision
Combining my passion for AI and pets, this project represents another step toward my goal of creating meaningful AI applications. Each update aims to make the breed selection process more accessible while improving the underlying technology.
👉 Try it now: https://huggingface.co/spaces/DawnC/PawMatchAI
Your likes ❤️ on this space fuel this project's growth!
#AI #MachineLearning #DeepLearning #Pytorch #ComputerVision

reacted to sayakpaul's post with 🚀 about 12 hours ago

Commits speak louder than words 🤪
* 4 new video models
* Multiple image models, including SANA & Flux Control
* New quantizers -> GGUF & TorchAO
* New training scripts
Enjoy this holiday-special Diffusers release 🤗
Notes: https://github.com/huggingface/diffusers/releases/tag/v0.32.0

reacted to singhsidhukuldeep's post with 🧠 about 12 hours ago

Exciting News in AI: JinaAI Releases JINA-CLIP-v2!
The team at Jina AI has just released a groundbreaking multilingual multimodal embedding model that's pushing the boundaries of text-image understanding. Here's why this is a big deal:
🚀 Technical Highlights:
- Dual encoder architecture combining a 561M parameter Jina XLM-RoBERTa text encoder and a 304M parameter EVA02-L14 vision encoder
- Supports 89 languages with 8,192 token context length
- Processes images up to 512×512 pixels with 14×14 patch size
- Implements FlashAttention2 for text and xFormers for vision processing
- Uses Matryoshka Representation Learning for efficient vector storage
⚡️ Under The Hood:
- Multi-stage training process with progressive resolution scaling (224→384→512)
- Contrastive learning using InfoNCE loss in both directions
- Trained on massive multilingual dataset including 400M English and 400M multilingual image-caption pairs
- Incorporates specialized datasets for document understanding, scientific graphs, and infographics
- Uses hard negative mining with 7 negatives per positive sample
📊 Performance:
- Outperforms previous models on visual document retrieval (52.65% nDCG@5)
- Achieves 89.73% image-to-text and 79.09% text-to-image retrieval on CLIP benchmark
- Strong multilingual performance across 30 languages
- Maintains performance even with 75% dimension reduction (256D vs 1024D)
🎯 Key Innovation:
The model solves the long-standing challenge of unifying text-only and multi-modal retrieval systems while adding robust multilingual support. Perfect for building cross-lingual visual search systems!
Kudos to the research team at Jina AI for this impressive advancement in multimodal AI!

Organizations

None yet

nicolay-r's activity

upvoted 2 papers 11 days ago

Stream of Search (SoS): Learning to Search in Language

Paper • 2404.03683 • Published Apr 1 • 29

STaR: Bootstrapping Reasoning With Reasoning

Paper • 2203.14465 • Published Mar 28, 2022 • 8

upvoted an article 11 days ago

Reverse Thinking Makes LLMs Stronger Reasoners

Article • 22 days ago • 2

upvoted an article 3 months ago

SeeMoE: Implementing a MoE Vision Language Model from Scratch

Article • Jun 23 • 34

upvoted a paper 3 months ago

LML: Language Model Learning a Dataset for Data-Augmented Prediction

Paper • 2409.18957 • Published Sep 27 • 9

upvoted a collection 5 months ago

Chinchunmei on WASSA2024 Shared-Task 1

Collection

This is the model cards collection for Chinchunmei team in the WASSA2024 Shared-Task 1: Empathy Detection and Emotion Classification. • 5 items • Updated Jul 3 • 2

upvoted 2 papers 6 months ago

PersonaLLM: Investigating the Ability of Large Language Models to Express Personality Traits

Paper • 2305.02547 • Published May 4, 2023 • 7

Personality Traits in Large Language Models

Paper • 2307.00184 • Published Jul 1, 2023 • 20

upvoted 2 articles 8 months ago

Vision Language Models Explained

Article • Apr 11 • 236

Fine-tune Llama 3 with ORPO

Article • Apr 22 • 228

upvoted 5 papers 8 months ago

RuSentNE-2023: Evaluating Entity-Oriented Sentiment Analysis on Russian News Texts

Paper • 2305.17679 • Published May 28, 2023 • 1

Large Language Models in Targeted Sentiment Analysis

Paper • 2404.12342 • Published Apr 18 • 2

Visual CoT: Unleashing Chain-of-Thought Reasoning in Multi-Modal Language Models

Paper • 2403.16999 • Published Mar 25 • 4

RegionGPT: Towards Region Understanding Vision Language Model

Paper • 2403.02330 • Published Mar 4 • 2

Griffon v2: Advancing Multimodal Perception with High-Resolution Scaling and Visual-Language Co-Referring

Paper • 2403.09333 • Published Mar 14 • 14

upvoted 3 papers 9 months ago

Ferret-v2: An Improved Baseline for Referring and Grounding with Large Language Models

Paper • 2404.07973 • Published Apr 11 • 30

Rho-1: Not All Tokens Are What You Need

Paper • 2404.07965 • Published Apr 11 • 87

OmniFusion Technical Report

Paper • 2404.06212 • Published Apr 9 • 74