CompVis Community

university

Activity Feed Request to join this org

AI & ML interests

None defined yet.

Recent Activity

omerbartal authored a paper 21 days ago

DynVFX: Augmenting Real Videos with Dynamic Content

RafailFridman authored a paper 21 days ago

DynVFX: Augmenting Real Videos with Dynamic Content

loubnabnl authored a paper 22 days ago

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

View all activity

compvis-community's activity

DanahY

authored a paper 21 days ago

DynVFX: Augmenting Real Videos with Dynamic Content

Paper • 2502.03621 • Published 23 days ago • 27

loubnabnl

authored a paper 22 days ago

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published 24 days ago • 195

jindi

authored a paper about 1 month ago

Step-KTO: Optimizing Mathematical Reasoning through Stepwise Binary Feedback

Paper • 2501.10799 • Published Jan 18 • 15

Skylion007

authored a paper about 2 months ago

The GAN is dead; long live the GAN! A Modern GAN Baseline

Paper • 2501.05441 • Published Jan 9 • 88

akhaliq

posted an update 2 months ago

Post

11846

Google drops Gemini 2.0 Flash Thinking

a new experimental model that unlocks stronger reasoning capabilities and shows its thoughts. The model plans (with thoughts visible), can solve complex problems with Flash speeds, and more

now available in anychat, try it out: akhaliq/anychat

2 replies

toshas

authored a paper 2 months ago

Marigold-DC: Zero-Shot Monocular Depth Completion with Guided Diffusion

Paper • 2412.13389 • Published Dec 18, 2024 • 7

toshas

posted an update 2 months ago

Post

1281

Introducing ⇆ Marigold-DC — our training-free zero-shot approach to monocular Depth Completion with guided diffusion! If you have ever wondered how else a long denoising diffusion schedule can be useful, we have an answer for you!

Depth Completion addresses sparse, incomplete, or noisy measurements from photogrammetry or sensors like LiDAR. Sparse points aren’t just hard for humans to interpret — they also hinder downstream tasks.

Traditionally, depth completion was framed as image-guided depth interpolation. We leverage Marigold, a diffusion-based monodepth model, to reframe it as sparse-depth-guided depth generation. How the turntables! Check out the paper anyway 👇

🌎 Website: https://marigolddepthcompletion.github.io/
🤗 Demo: prs-eth/marigold-dc
📕 Paper: https://arxiv.org/abs/2412.13389
👾 Code: https://github.com/prs-eth/marigold-dc

Team ETH Zürich: Massimiliano Viola ( @mviola ), Kevin Qu ( @KevinQu7 ), Nando Metzger ( @nandometzger ), Bingxin Ke ( @Bingxin ), Alexander Becker, Konrad Schindler, and Anton Obukhov ( @toshas ). We thank
Hugging Face for their continuous support.

jimmyyhwu

authored 2 papers 2 months ago

DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset

Paper • 2403.12945 • Published Mar 19, 2024

TidyBot++: An Open-Source Holonomic Mobile Manipulator for Robot Learning

Paper • 2412.10447 • Published Dec 11, 2024 • 5

Humphrey

authored a paper 2 months ago

OLA-VLM: Elevating Visual Perception in Multimodal LLMs with Auxiliary Embedding Distillation

Paper • 2412.09585 • Published Dec 12, 2024 • 11

xu3kev

authored a paper 3 months ago

Surveying the Effects of Quality, Diversity, and Complexity in Synthetic Data From Large Language Models

Paper • 2412.02980 • Published Dec 4, 2024 • 13

toshas

authored a paper 3 months ago

Video Depth without Video Models

Paper • 2411.19189 • Published Nov 28, 2024 • 36

akhaliq

posted an update 3 months ago

Post

12492

QwQ-32B-Preview is now available in anychat

A reasoning model that is competitive with OpenAI o1-mini and o1-preview

try it out: akhaliq/anychat

1 reply

akhaliq

posted an update 3 months ago

Post

4149

New model drop in anychat

allenai/Llama-3.1-Tulu-3-8B is now available

try it here: akhaliq/anychat

loubnabnl

posted an update 3 months ago

Post

2831

Making SmolLM2 reproducible: open-sourcing our training & evaluation toolkit 🛠️ https://github.com/huggingface/smollm/

- Pre-training code with nanotron
- Evaluation suite with lighteval
- Synthetic data generation using distilabel (powers our new SFT dataset HuggingFaceTB/smoltalk)
- Post-training scripts with TRL & the alignment handbook
- On-device tools with llama.cpp for summarization, rewriting & agents

Apache 2.0 licensed. V2 pre-training data mix coming soon!

Which other tools should we add next?