50 62 107

Dmitry Ryumin

DmitryRyumin

https://dmitryryumin.github.io

DmitryRyumin

AI & ML interests

Machine Learning and Applications, Multi-Modal Understanding

Recent Activity

reacted to singhsidhukuldeep's post with 🔥 7 days ago

Exciting New Tool for Knowledge Graph Extraction from Plain Text! I just came across a groundbreaking new tool called KGGen that's solving a major challenge in the AI world - the scarcity of high-quality knowledge graph data. KGGen is an open-source Python package that leverages language models to extract knowledge graphs (KGs) from plain text. What makes it special is its innovative approach to clustering related entities, which significantly reduces sparsity in the extracted KGs. The technical approach is fascinating: 1. KGGen uses a multi-stage process involving an LLM (GPT-4o in their implementation) to extract entities and relations from source text 2. It aggregates graphs across sources to reduce redundancy 3. Most importantly, it applies iterative LM-based clustering to refine the raw graph The clustering stage is particularly innovative - it identifies which nodes and edges refer to the same underlying entities or concepts. This normalizes variations in tense, plurality, stemming, and capitalization (e.g., "labors" clustered with "labor"). The researchers from Stanford and University of Toronto also introduced MINE (Measure of Information in Nodes and Edges), the first benchmark for evaluating KG extractors. When tested against existing methods like OpenIE and GraphRAG, KGGen outperformed them by up to 18%. For anyone working with knowledge graphs, RAG systems, or KG embeddings, this tool addresses the fundamental challenge of data scarcity that's been holding back progress in graph-based foundation models. The package is available via pip install kg-gen, making it accessible to everyone. This could be a game-changer for knowledge graph applications!

liked a Space 7 days ago

FunAudioLLM/CosyVoice2-0.5B

upvoted a collection 11 days ago

C2SER

View all activity

Organizations

DmitryRyumin's activity

reacted to singhsidhukuldeep's post with 🔥 7 days ago

Post

6690

Exciting New Tool for Knowledge Graph Extraction from Plain Text!

I just came across a groundbreaking new tool called KGGen that's solving a major challenge in the AI world - the scarcity of high-quality knowledge graph data.

KGGen is an open-source Python package that leverages language models to extract knowledge graphs (KGs) from plain text. What makes it special is its innovative approach to clustering related entities, which significantly reduces sparsity in the extracted KGs.

The technical approach is fascinating:

1. KGGen uses a multi-stage process involving an LLM (GPT-4o in their implementation) to extract entities and relations from source text
2. It aggregates graphs across sources to reduce redundancy
3. Most importantly, it applies iterative LM-based clustering to refine the raw graph

The clustering stage is particularly innovative - it identifies which nodes and edges refer to the same underlying entities or concepts. This normalizes variations in tense, plurality, stemming, and capitalization (e.g., "labors" clustered with "labor").

The researchers from Stanford and University of Toronto also introduced MINE (Measure of Information in Nodes and Edges), the first benchmark for evaluating KG extractors. When tested against existing methods like OpenIE and GraphRAG, KGGen outperformed them by up to 18%.

For anyone working with knowledge graphs, RAG systems, or KG embeddings, this tool addresses the fundamental challenge of data scarcity that's been holding back progress in graph-based foundation models.

The package is available via pip install kg-gen, making it accessible to everyone. This could be a game-changer for knowledge graph applications!

liked a Space 7 days ago

193

CosyVoice2-0.5B

🥳

Generate realistic voice audio from text and audio prompts

upvoted a collection 11 days ago

C2SER

Collection

3 items • Updated 12 days ago • 1

liked a Space 12 days ago

2.14k

The Ultra-Scale Playbook

🌌

The ultimate guide to training LLM on large GPU Clusters

liked a model 12 days ago

yandex/YandexGPT-5-Lite-8B-pretrain

Updated 12 days ago • 7.62k • 162

upvoted a paper 13 days ago

SurveyX: Academic Survey Automation via Large Language Models

Paper • 2502.14776 • Published 17 days ago • 91

reacted to m-ric's post with 🚀 13 days ago

Post

4622

We now have a Deep Research for academia: SurveyX automatically writes academic surveys nearly indistinguishable from human-written ones 🔥

Researchers from Beijing and Shanghai just published the first application of a deep research system to academia: their algorithm, given a question, can give you a survey of all papers on the subject.

To make a research survey, you generally follow two steps, preparation (collect and organize papers) and writing (outline creation, writing, polishing). Researchers followed the same two steps and automated them.

🎯 For the preparation part, a key part is find all the important references on the given subject.
Researchers first cast a wide net of all relevant papers. But then finding the really important ones is like distilling knowledge from a haystack of information. To solve this challenge, they built an “AttributeTree” object that structures key information from citations. Ablating these AttributeTrees significantly decreased structure and synthesis scores, so they were really useful!

📝 For the writing part, key was to get a synthesis that's both short and true. This is not easy to get with LLMs! So they used methods like LLM-based deduplication to shorten the too verbose listings made by LLMs, and RAG to grab original quotes instead of made-up ones.

As a result, their system outperforms previous approaches by far!

As assessed by LLM-judges, the quality score os SurveyX even approaches this of human experts, with 4.59/5 vs 4.75/5 🏆

I advise you to read the paper, it's a great overview of the kind of assistants that we'll get in the short future! 👉 SurveyX: Academic Survey Automation via Large Language Models (2502.14776)
Their website shows examples of generated surveys 👉 http://www.surveyx.cn/

reacted to their post with 🔥 17 days ago

Post

3639

🚀🎭🌟 New Research Alert - WACV 2025 (Avatars Collection)! 🌟🎭🚀
📄 Title: EmoVOCA: Speech-Driven Emotional 3D Talking Heads 🔝

📝 Description: EmoVOCA is a data-driven method for generating emotional 3D talking heads by combining speech-driven lip movements with expressive facial dynamics. This method has been developed to overcome the limitations of corpora and to achieve state-of-the-art animation quality.

👥 Authors: @FedeNoce , Claudio Ferrari, and Stefano Berretti

📅 Conference: WACV, 28 Feb – 4 Mar, 2025 | Arizona, USA 🇺🇸

📄 Paper: https://arxiv.org/abs/2403.12886

🌐 Github Page: https://fedenoce.github.io/emovoca/
📁 Repository: https://github.com/miccunifi/EmoVOCA

🚀 CVPR-2023-24-Papers: https://github.com/DmitryRyumin/CVPR-2023-24-Papers

🚀 WACV-2024-Papers: https://github.com/DmitryRyumin/WACV-2024-Papers

🚀 ICCV-2023-Papers: https://github.com/DmitryRyumin/ICCV-2023-Papers

📚 More Papers: more cutting-edge research presented at other conferences in the DmitryRyumin/NewEraAI-Papers curated by @DmitryRyumin

🚀 Added to the Avatars Collection: DmitryRyumin/avatars-65df37cdf81fec13d4dbac36

🔍 Keywords: #EmoVOCA #3DAnimation #TalkingHeads #SpeechDriven #FacialExpressions #MachineLearning #ComputerVision #ComputerGraphics #DeepLearning #AI #WACV2024

1 reply

posted an update 17 days ago

Post

3639

1 reply

updated a collection 17 days ago

🎭 Avatars

Collection

The latest AI-powered technologies usher in a new era of realistic avatars! 🚀 • 71 items • Updated 17 days ago • 83

New activity in DmitryRyumin/MASAI about 1 month ago

Fixed typos

#2 opened about 1 month ago by

markitantov

Reset the tracked ID

#1 opened about 1 month ago by

markitantov

upvoted a collection about 1 month ago

DeepSeek R1 (All Versions)

Collection

DeepSeek R1 - the most powerful reasoning open-source model - available in GGUF, original & 4-bit formats. Includes Llama & Qwen distilled models. • 29 items • Updated 10 days ago • 206

upvoted a paper about 1 month ago

Tensor Product Attention Is All You Need

Paper • 2501.06425 • Published Jan 11 • 84

liked a Space about 1 month ago

123

Video Depth Anything

👀

Generate depth video from input video

reacted to not-lain's post with 🔥 about 1 month ago

Post

4248

I have just released a new blogpost about kv caching and its role in inference speedup 🚀
🔗 https://huggingface.co./blog/not-lain/kv-caching/
some takeaways :