nicolay-r (Nicolay Rusnachenko)

replied to ychen's post 4 days ago

To back up your findings on those words, you can measure their strength by applying tf-idf to your annotated texts. Basically, if you have a set of positive and negative responses from GPT-4o, you can calculate the so-called Semantic Orientation (SO) based on Pointwise Mutual Information (PMI). This would give consistency to your observations.
This comes from a relatively old classic: https://arxiv.org/pdf/cs/0212032
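
A minimal sketch of the SO-PMI idea above, assuming you already have two lists of texts labeled positive and negative. The function name, the whitespace tokenization, and the add-one smoothing are my own simplifications, not the exact procedure from the paper:

```python
import math
from collections import Counter


def semantic_orientation(pos_texts, neg_texts, smoothing=1.0):
    """Score each word as SO(w) = PMI(w, positive) - PMI(w, negative).

    Since PMI(w, c) = log2(P(w | c) / P(w)), the P(w) terms cancel and
    SO(w) reduces to log2(P(w | positive) / P(w | negative)).
    Probabilities are smoothed document frequencies (assumption).
    """
    # Count in how many texts of each class a word occurs.
    pos_df = Counter(w for t in pos_texts for w in set(t.lower().split()))
    neg_df = Counter(w for t in neg_texts for w in set(t.lower().split()))
    n_pos, n_neg = len(pos_texts), len(neg_texts)

    scores = {}
    for word in set(pos_df) | set(neg_df):
        p_w_pos = (pos_df[word] + smoothing) / (n_pos + 2 * smoothing)
        p_w_neg = (neg_df[word] + smoothing) / (n_neg + 2 * smoothing)
        scores[word] = math.log2(p_w_pos / p_w_neg)
    return scores


positive = ["great helpful answer", "great clear reply"]
negative = ["that sounds rough", "sounds like a tough week"]
scores = semantic_orientation(positive, negative)
# Words seen mostly in negative texts get a score below zero.
for word in sorted(scores, key=scores.get)[:3]:
    print(word, round(scores[word], 2))
```

Words with a strongly negative score are the candidates that characterize the negative responses; on real data you would want proper tokenization and a minimum frequency cut-off.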

replied to ychen's post 4 days ago

Oh, that sounds interesting, and it looks like your focus is on patients then, while mine was mostly mass media (authors) and dialogues (character conversations).

To make sure I understood you correctly frames are basically describing how a sentiment is related to entities in a sentence—is this a roughly correct understanding?

That's right: a frame acts as a word that connects several parties (including entities), formally referred to as "roles", with a polarity score ("positive", "negative"). So in your case, "sounds like", "rough", "tough" could be treated as negative by GPT-4o with respect to the topic of the question.

As for frames, here is a more general definition you might be interested in (see the diagram):
https://aclanthology.org/D18-2008.pdf
The concept is the same, except that instead of words they refer to them as triggers.
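
To make the trigger / polarity idea concrete, here is a rough sketch of a frame lexicon lookup. The `Frame` class, its fields, and the three entries are hypothetical illustrations, not the RuSentiFrames format or the schema from the linked paper:

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    trigger: str   # word or phrase that evokes the frame
    polarity: str  # "positive" or "negative" attitude between the roles


# Hypothetical mini-lexicon built from the keywords discussed above.
LEXICON = [
    Frame("sounds like", "negative"),
    Frame("rough", "negative"),
    Frame("tough", "negative"),
]


def find_frames(sentence, lexicon=LEXICON):
    """Return the frames whose trigger occurs in the sentence as a whole word."""
    text = sentence.lower()
    return [
        frame for frame in lexicon
        # Word boundaries keep "rough" from matching inside "through".
        if re.search(r"\b" + re.escape(frame.trigger) + r"\b", text)
    ]


print([f.trigger for f in find_frames("That sounds like a rough week")])
```

A full frame would also name the roles (who holds the attitude and towards whom); this sketch only covers the trigger-to-polarity part.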

replied to ychen's post 5 days ago

Thank you @ychen for sharing this! I was curious because the word frequency analysis you've attempted is closely aligned with lexicon construction and frames in the domain of sentiment analysis. In particular, it could be extended to an analysis of a specific set of words, usually dubbed frames. Unlike plain words, frames go further and capture the sentiment of a subject towards objects.

FYI: we cover something similar for news and a domain-specific setting (Russian language) here: https://github.com/nicolay-r/RuSentiFrames

updated 5 models 5 days ago

reacted to prithivMLmods's post with 🚀 5 days ago

Post

5751

It's really interesting about the deployment of a new state of matter in Majorana 1: the world’s first quantum processor powered by topological qubits. If you missed this news this week, here are some links for you:

🅱️Topological qubit arrays: https://arxiv.org/pdf/2502.12252

⚛️ Quantum Blog: https://azure.microsoft.com/en-us/blog/quantum/2025/02/19/microsoft-unveils-majorana-1-the-worlds-first-quantum-processor-powered-by-topological-qubits/

📖 Read the story: https://news.microsoft.com/source/features/innovation/microsofts-majorana-1-chip-carves-new-path-for-quantum-computing/

📝 Majorana 1 Intro: https://youtu.be/Q4xCR20Dh1E?si=Z51DbEYnZFp_88Xp

🌀The Path to a Million Qubits: https://youtu.be/wSHmygPQukQ?si=TS80EhI62oWiMSHK

3 replies

posted an update 5 days ago

Post

893

📢 If you're interested in quickly applying target sentiment analysis to your data, you might want to try the fine-tuned FlanT5-xl version. The reason is speed: I've added batching support for a series of sentiment analysis models in this card:
nicolay-r/sentiment-analysis-advances-665ba391e0eba729021ea101

The provider implementation:
https://github.com/nicolay-r/nlp-thirdgate/blob/master/llm/transformers_flan_t5.py

📺 How to quick launch:
https://github.com/nicolay-r/bulk-chain/blob/master/test/test_provider_batching.py

Why use it? In out-of-domain experiments, I noticed that the performance of the xl version is similar to LLaMA-3-3b-instruct.

🔑 Key takeaways of the adaptation:
- padding and truncation strategies for batching mode:
- https://huggingface.co./docs/transformers/en/pad_truncation
- add_special_tokens=False causes drastic changes in the resulting behaviour (FlanT5 models).
💥 Crashes on pad_token_id=50256 during the generation process.
🔻 use_bf16 mode performs 3 times slower on CPU.
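
For readers new to batching, here is a library-free sketch of what padding and truncation do to a batch of token ids. In transformers these choices are made through tokenizer arguments such as `padding`, `truncation` and `max_length`; the `pad_and_truncate` helper below is a hypothetical illustration and not the provider's code:

```python
def pad_and_truncate(batch, max_length, pad_id=0,
                     padding_side="right", truncation_side="right"):
    """Bring every sequence of token ids to exactly max_length.

    Returns (input_ids, attention_mask); mask is 1 for real tokens, 0 for pads.
    pad_id=0 mirrors the T5 pad token id.
    """
    input_ids, attention_mask = [], []
    for ids in batch:
        ids = list(ids)
        if len(ids) > max_length:
            # Keep the head or the tail of an over-long sequence.
            ids = ids[:max_length] if truncation_side == "right" else ids[-max_length:]
        n_pad = max_length - len(ids)
        if padding_side == "right":
            input_ids.append(ids + [pad_id] * n_pad)
            attention_mask.append([1] * len(ids) + [0] * n_pad)
        else:
            input_ids.append([pad_id] * n_pad + ids)
            attention_mask.append([0] * n_pad + [1] * len(ids))
    return input_ids, attention_mask


ids, mask = pad_and_truncate([[5, 6, 7, 8], [9]], max_length=3)
print(ids)   # [[5, 6, 7], [9, 0, 0]]
print(mask)  # [[1, 1, 1], [1, 0, 0]]
```

The pad id has to be a valid index in the model's vocabulary, which may be related to the crash noted above when a pad id borrowed from another model family is used.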

🚀 Performance for BASE sized model:
nicolay-r/flan-t5-tsa-thor-base
17.2 it/s (prompt) and 5.22 it/s (3-step CoT) (CPU Core i5-1140G7)

Other domain-oriented models can be launched via the same provider:
nicolay-r/flan-t5-emotion-cause-thor-base

Reference: https://github.com/huggingface/transformers/issues/26061

posted an update 6 days ago

Post

3703

📢 If you're looking to translate a massive dataset of JSON-lines / CSV data with various sets of source fields, then the following update is relevant. While experimenting with adapting a language-specific Sentiment Analysis model, I got a chance to reforge and release bulk-translate 0.25.2.
⭐️ https://github.com/nicolay-r/bulk-translate/releases/tag/0.25.2

The update has the following major features:
- Schema support: all the columns to be translated can now be declared within the same prompt-style format. Using JSON, this automatically allows mapping them onto output fields.
- Related updates for shell execution mode: the schema parameter is now available alongside the plain prompt usage supported before.

The benefit is that your output is invariant. You can extend and stack various translators with separate shell launches.
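
The schema idea can be pictured with a small sketch. Everything here is hypothetical (the `apply_schema` helper, the schema layout, and the stub translator); it is not the bulk-translate API, only an illustration of mapping prompt-style templates onto output fields while leaving the input fields untouched:

```python
import json


def apply_schema(record, schema, translate):
    """Add one output field per schema entry, keeping all input fields as-is.

    schema maps an output field name to a prompt-style template such as
    "{title}", which is filled from the record before translation.
    """
    result = dict(record)  # the original columns are preserved
    for out_field, template in schema.items():
        result[out_field] = translate(template.format(**record))
    return result


def stub_translate(text):
    # Stand-in for a real translation engine (assumption).
    return text.upper()


schema = {"title_en": "{title}", "body_en": "{body}"}
line = '{"title": "hola", "body": "mundo"}'
print(json.dumps(apply_schema(json.loads(line), schema, stub_translate)))
```

Because the input fields survive unchanged, a second translator can be run over the output of the first one with a different schema, which is the stacking described above.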

Screenshot below is the application of the google-translate engine in manual batching mode.
🚀 Performance: 2.5 it / sec (in the case of a single field translation)

🌟 about bulk-translate: https://github.com/nicolay-r/bulk-translate
🌌 nlp-thirdgate: https://github.com/nicolay-r/nlp-thirdgate?tab=readme-ov-file

1 reply

replied to ychen's post 8 days ago

Thanks! Any publicly available resources of such a synthetic texts that would lead to your observations?

reacted to ychen's post with 👍 8 days ago

Post

2447

Here's some annoying keywords that 4o tends to use when responding to personal experiences with negative sentiments. Will be updated over time.

rough, tough, sound like, sounds like, frustrating, overwhelming

9 replies

liked a model 8 days ago

Qwen/Qwen2.5-3B-Instruct

Text Generation • Updated Sep 25, 2024 • 665k • 201

posted an update 11 days ago

Post

2342

📢 For those starting to work with LLM streaming on the web, here is a minimalistic example in JS for accessing a server hosted with FastAPI via REST:
https://gist.github.com/nicolay-r/840425749cf6d3e397da3d329e894d59

The code above is a revised version of the one for accessing the Replicate API posted earlier:
https://huggingface.co./posts/nicolay-r/390307941200307

The key difference from Replicate API:
- using only POST for passing a body with parameters and fetching the reader.
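
The same "POST a body, then read the response incrementally" pattern can be sketched in Python with the standard library only. This is not the gist's code: the `/generate` path, the `prompt` key and the `StreamHandler` echo server are hypothetical stand-ins for a FastAPI streaming endpoint.

```python
import codecs
import json
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer


class StreamHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in server: streams the prompt back word by word."""

    protocol_version = "HTTP/1.1"  # chunked transfer encoding needs HTTP/1.1

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        params = json.loads(self.rfile.read(length) or b"{}")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Transfer-Encoding", "chunked")
        self.end_headers()
        for token in params.get("prompt", "").split():
            data = (token + " ").encode("utf-8")
            # One HTTP chunk per token: hex size, CRLF, payload, CRLF.
            self.wfile.write(b"%X\r\n%s\r\n" % (len(data), data))
            self.wfile.flush()
        self.wfile.write(b"0\r\n\r\n")  # terminating chunk

    def log_message(self, *args):
        pass  # keep the demo quiet


def stream_post(host, port, path, payload):
    """POST a JSON body and yield decoded text pieces as they arrive."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    conn = HTTPConnection(host, port, timeout=10)
    try:
        conn.request("POST", path, body=json.dumps(payload).encode("utf-8"),
                     headers={"Content-Type": "application/json"})
        response = conn.getresponse()
        while True:
            chunk = response.read1(1024)  # returns b"" once the stream ends
            if not chunk:
                break
            text = decoder.decode(chunk)
            if text:
                yield text
    finally:
        conn.close()


def demo(prompt="streaming works token by token"):
    """Start the stand-in server on a free local port and collect the stream."""
    server = HTTPServer(("127.0.0.1", 0), StreamHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        return list(stream_post("127.0.0.1", port, "/generate", {"prompt": prompt}))
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print("".join(demo()))
```

The incremental decoder plays the role of the text decoding step on the JS reader: a multi-byte character split across two network reads is still decoded correctly.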

posted an update 13 days ago

Post

2408

📢 For those considering quick, in-place annotation of entities in JSON / CSV tabular data, I've got good news. I'm releasing the latest version of bulk-ner, which does this for you:
🌟 https://github.com/nicolay-r/bulk-ner/releases/tag/0.25.2

bulk-ner is a no-string wrapper over NER services that use popular frameworks like DeepPavlov, spaCy, and Flair.

What's new? The latest 0.25.2 version has the following key features:
🔧 Fixed: 🐛 the output ignores other content in the input #31
🔥 Schemas support: you can annotate various columns by combining them as you wish and map them onto other output columns (see 📸 below) #28

Below is a screenshot showing how to quickly get started using it with spaCy models.

🌌 List of other providers @ nlp-thirdgate:
https://github.com/nicolay-r/nlp-thirdgate/tree/master/ner

reacted to csabakecskemeti's post with 👍 13 days ago

Post

1616

I found that if we apply the reasoning system prompt (published on the NousResearch/DeepHermes-3-Llama-3-8B-Preview model card), other models also react to it and start mimicking reasoning. Some better, some worse. I've seen internal monologue and self-questioning.

Here's a blogpost about it:
http://devquasar.com/ai/reasoning-system-prompt/

reacted to fffiloni's post with 🔥 13 days ago

Post

4196

I was thinking I need to step up my game on training Flux LoRA models, time to have some fun! ☀️

Expect a new drop per week on aesthetics that caught my attention; here are 3 of them that worked really well!

fffiloni/cute-comic-800
fffiloni/carbo-800
fffiloni/oniric-750

reacted to tianchez's post with 🚀 13 days ago

Post

4026

Introducing VLM-R1!

GRPO has helped DeepSeek R1 to learn reasoning. Can it also help VLMs perform stronger for general computer vision tasks?

The answer is YES and it generalizes better than SFT. We trained Qwen 2.5 VL 3B on RefCOCO (a visual grounding task) and eval on RefCOCO Val and RefGTA (an OOD task).

https://github.com/om-ai-lab/VLM-R1

3 replies

reacted to benhaotang's post with 🚀 13 days ago

Post

2386

Try out my updated implementation of the forked OpenDeepResearcher (link below) as an OpenAI-compatible endpoint with full control. It can be deployed completely free with the Gemini API, completely locally with Ollama, or pay-as-you-go in BYOK format. The AI agents think dynamically based on the difficulty of the given research, and it is compatible with any configurable OpenAI-compatible client (Msty, Chatbox, even the VS Code AI Toolkit playground).

If you don't want to pay OpenAI $200 to use or want to take control of your deep research, check out here:
👉 https://github.com/benhaotang/OpenDeepResearcher-via-searxng

**Personal take**

Based on my testing against Perplexity's and Gemini's implementations with some physics-domain questions, mine is comparable and very competent at finding even the rarest articles or methods.

Another funny benchmark of mine for testing all these search models is to troubleshoot a WSL2 hanging issue I experienced last year, with the prompt:

> wsl2 in windows hangs in background with high vmmem cpu usage once in a while, especially after hibernation, no error logs captured in linux, also unable to shutdown in powershell, provide solutions

The final solution, which took me a day to find last year, is to patch the kernel with some steps documented in carlfriedrich's repo and wait for Microsoft to solve it (it is buried deep in the WSL issues). Out of the three, only my Deep Research agent found this solution; Perplexity and Gemini just focused on other force-restart or memory management methods. I am very impressed with its ability to troubleshoot such obscure and scarce issues.

**Limitations**

Some caveats to be addressed later:
- Multi-turn conversation is not yet supported, so no follow-up questions
- The system message only provides extra writing instructions and doesn't affect search
- Small local models may have trouble citing sources reliably; I am working on a fix to fact-check all citation claims

1 reply

nicolay-r/flan-t5-tsa-thor-large

nicolay-r/flan-t5-emotion-cause-thor-base

nicolay-r/flan-t5-tsa-thor-xl

nicolay-r/flan-t5-tsa-prompt-xl

nicolay-r/flan-t5-tsa-thor-base

Qwen/Qwen2.5-3B-Instruct