AI & ML interests

Small LMs for small computers

Recent Activity

M4-ai's activity

Locutusque
posted an update 3 days ago
🎉 Exciting news, everyone! I've just released **Thespis-Llama-3.1-8B**, a new language model designed for enhanced roleplaying! ✨

It's built on Llama-3.1 and fine-tuned with a focus on Theory of Mind reasoning to create more believable and engaging characters. It even learned a few tricks on its own, like adding in-character thought processes! 🧠

Check it out here: Locutusque/Thespis-Llama-3.1-8B

Give it a try and let me know what you think! I'm especially interested in feedback on how well the characters stay in role and if the responses feel natural. Looking forward to seeing what amazing stories you create! ✏️
prithivMLmods
posted an update 3 days ago
Dropping some custom fine-tunes based on SigLIP2,
with a single-label classification problem type! 🌀🧤

- AI vs Deepfake vs Real : prithivMLmods/AI-vs-Deepfake-vs-Real-Siglip2
- Deepfake Detect : prithivMLmods/Deepfake-Detect-Siglip2
- Fire Detection : prithivMLmods/Fire-Detection-Siglip2
- Deepfake Quality Assess : prithivMLmods/Deepfake-Quality-Assess-Siglip2
- Guard Against Unsafe Content : prithivMLmods/Guard-Against-Unsafe-Content-Siglip2

🌐 Collection : prithivMLmods/siglip2-custom-67bcdb2de8fe96b99fb4e19e
KnutJaegersberg
posted an update 6 days ago
prithivMLmods
posted an update 6 days ago
Really interesting news about the deployment of a new state of matter in Majorana 1: the world's first quantum processor powered by topological qubits. If you missed it this week, here are some links for you:

🅱️ Topological qubit arrays: https://arxiv.org/pdf/2502.12252

⚛️ Quantum Blog: https://azure.microsoft.com/en-us/blog/quantum/2025/02/19/microsoft-unveils-majorana-1-the-worlds-first-quantum-processor-powered-by-topological-qubits/

📖 Read the story: https://news.microsoft.com/source/features/innovation/microsofts-majorana-1-chip-carves-new-path-for-quantum-computing/

📍 Majorana 1 Intro: https://youtu.be/Q4xCR20Dh1E?si=Z51DbEYnZFp_88Xp

🌀 The Path to a Million Qubits: https://youtu.be/wSHmygPQukQ?si=TS80EhI62oWiMSHK
mmhamdy
posted an update 7 days ago
🎉 We're excited to introduce MemoryCode, a novel synthetic dataset designed to rigorously evaluate LLMs' ability to track and execute coding instructions across multiple sessions. MemoryCode simulates realistic workplace scenarios where a mentee (the LLM) receives coding instructions from a mentor amidst a stream of both relevant and irrelevant information.

💡 But what makes MemoryCode unique? The combination of the following:

✅ Multi-Session Dialogue Histories: MemoryCode consists of chronological sequences of dialogues between a mentor and a mentee, mirroring real-world interactions between coworkers.

✅ Interspersed Irrelevant Information: Critical instructions are deliberately interspersed with unrelated content, replicating the information overload common in office environments.

✅ Instruction Updates: Coding rules and conventions can be updated multiple times throughout the dialogue history, requiring LLMs to track and apply the most recent information.

✅ Prospective Memory: Unlike previous datasets that cue information retrieval, MemoryCode requires LLMs to spontaneously recall and apply relevant instructions without explicit prompts.

✅ Practical Task Execution: LLMs are evaluated on their ability to use the retrieved information to perform practical coding tasks, bridging the gap between information recall and real-world application.

📌 Our Findings

1️⃣ While even small models can handle isolated coding instructions, the performance of top-tier models like GPT-4o dramatically deteriorates when instructions are spread across multiple sessions.

2️⃣ This performance drop isn't simply due to the length of the context. Our analysis indicates that LLMs struggle to reason compositionally over sequences of instructions and updates. They have difficulty keeping track of which instructions are current and how to apply them.

🔗 Paper: From Tools to Teammates: Evaluating LLMs in Multi-Session Coding Interactions (2502.13791)
📦 Code: https://github.com/for-ai/MemoryCode
KnutJaegersberg
posted an update 8 days ago
Mimicking Consciousness in LLMs: Ascending the Dimensions of Thought with Recurrent Processing

This blog post explores how **recurrent processing** can transform Large Language Models (LLMs) to mimic aspects of human thought by engaging in iterative feedback loops. Inspired by string theory, the post describes how LLMs can "ascend dimensions" of cognition, progressing through foundational cognitive loops (basic cognition, executive functions, and meta-cognition) before advancing into **world simulation**. In this stage, LLMs explore higher dimensions, perceiving non-linear time, simulating branching possibilities, and integrating multiple realities. The interaction between the **Generator** and the **Reflective Compass** allows AI systems to refine their outputs iteratively, moving toward a **point attractor** where ideas become coherent and polished. While this process doesn't bestow true consciousness, it offers a compelling imitation of reflective and adaptive thinking, leading to smarter dialogue, enhanced creativity, and more robust problem-solving.
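The Generator/Reflective Compass interaction can be sketched as a simple fixed-point iteration. Both components below are stand-in stubs (no real LLM calls), so this only illustrates the control flow, not the blog's implementation:

```python
# Toy sketch: generate -> critique -> revise until the output stops changing
# (a "point attractor"). Stub functions stand in for LLM calls.

def generator(draft: str, feedback: str) -> str:
    # Stub: append whatever fix the critic requested.
    return (draft + " " + feedback).strip() if feedback else draft

def reflective_compass(draft: str) -> str:
    # Stub critic: demand a conclusion until one is present; "" means satisfied.
    return "" if draft.endswith("conclusion.") else "conclusion."

def refine(prompt: str, max_iters: int = 10) -> str:
    draft = prompt
    for _ in range(max_iters):
        feedback = reflective_compass(draft)
        if not feedback:  # fixed point reached
            break
        draft = generator(draft, feedback)
    return draft

result = refine("An essay without a")
assert result.endswith("conclusion.")
```

The loop halts when the critic has nothing left to say, which is the "point attractor" the post describes.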

https://huggingface.co./blog/KnutJaegersberg/oscillatory-recurrence-for-llms
prithivMLmods
posted an update 10 days ago
Dino: The Minimalist Multipurpose Chat System 🌠
Agent-Dino : prithivMLmods/Agent-Dino
Github: https://github.com/PRITHIVSAKTHIUR/Agent-Dino

By default, it performs the following tasks:
{Text-to-Text Generation}, {Image-Text-Text Generation}
@image: Generates an image using Stable Diffusion XL.
@3d: Generates a 3D mesh.
@web: Web search agents.
@rAgent: Initiates a reasoning chain using Llama mode for coding explanations.
@tts1-♀, @tts2-♂: Voice generation (female and male voices).
@yolo: Object detection.
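The command list above amounts to prefix-based dispatch; a minimal sketch of that routing pattern (handler behavior and names are placeholders, not Agent-Dino's actual code):

```python
# Illustrative prefix dispatch: the first matching @-command selects a handler;
# anything else falls through to default text-to-text generation.

HANDLERS = {
    "@image": lambda p: f"[image for: {p}]",
    "@3d":    lambda p: f"[3d mesh for: {p}]",
    "@web":   lambda p: f"[web results for: {p}]",
    "@yolo":  lambda p: f"[detections in: {p}]",
}

def route(message: str) -> str:
    for prefix, handler in HANDLERS.items():
        if message.startswith(prefix):
            return handler(message[len(prefix):].strip())
    return f"[text reply to: {message}]"  # default mode

assert route("@image a red fox") == "[image for: a red fox]"
assert route("hello") == "[text reply to: hello]"
```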
prithivMLmods
posted an update 12 days ago
The last week of Impression Craft Arts and sketches from strangerzonehf 🎨🧑🏻‍🎨

- Collection : strangerzonehf/Flux-Ultimate-LoRA-Collection

Adapters:
+ Ld-Art : strangerzonehf/Ld-Art
+ Animeopix-Flux : strangerzonehf/Animeopix-Flux
+ Flux-Super-Paint-LoRA : strangerzonehf/Flux-Super-Paint-LoRA
+ CinematicShot-Pics-Flux : strangerzonehf/cinematicShot-Pics-Flux
+ Oil-Wall-Art-Flux : strangerzonehf/Oil-Wall-Art-Flux
+ Pixelo-Flux : strangerzonehf/Pixelo-Flux
+ Abstract-Shattered : strangerzonehf/Abstract-Shattered
+ Neon-Impressionism-Flux : strangerzonehf/Neon-Impressionism-Flux
+ NewG-Art : strangerzonehf/NewG-Art

🪧 Demo : prithivMLmods/FLUX-LoRA-DLC
🤗 Page : https://huggingface.co./strangerzonehf
AtAndDev
posted an update 13 days ago
@nroggendorff is that you sama?
mmhamdy
posted an update 17 days ago
⛓ Evaluating Long Context #2: SCROLLS and ZeroSCROLLS

In this series of posts tracing the history of long context evaluation, we started with Long Range Arena (LRA). Introduced in 2020, LRA is one of the earliest benchmarks designed to tackle the challenge of long context evaluation. It wasn't introduced to evaluate LLMs, however, but rather the transformer architecture in general.

📜 The SCROLLS benchmark, introduced in 2022, addresses this gap in NLP/LLM research. SCROLLS challenges models with tasks that require reasoning over extended sequences (by 2022 standards). So, what does it offer?

1๏ธโƒฃ Long Text Focus: SCROLLS (unlike LRA) focus mainly on text and contain inputs with thousands of words, testing models' ability to synthesize information across lengthy documents.
2๏ธโƒฃ Diverse Tasks: Includes summarization, question answering, and natural language inference across domains like literature, science, and business.
3๏ธโƒฃ Unified Format: All datasets are available in a text-to-text format, facilitating easy evaluation and comparison of models.

Building on SCROLLS, ZeroSCROLLS takes long text evaluation to the next level by focusing on zero-shot learning. Other features include:

1๏ธโƒฃ New Tasks: Introduces tasks like sentiment aggregation and sorting book chapter summaries.
2๏ธโƒฃ Leaderboard: A live leaderboard encourages continuous improvement and competition among researchers.

๐Ÿ’ก What are some other landmark benchmarks in the history of long context evaluation? Feel free to share your thoughts and suggestions in the comments.

- SCROLLS Paper: SCROLLS: Standardized CompaRison Over Long Language Sequences (2201.03533)
- ZeroSCROLLS Paper: ZeroSCROLLS: A Zero-Shot Benchmark for Long Text Understanding (2305.14196)
KnutJaegersberg
posted an update 20 days ago
A Brief Survey of Associations Between Meta-Learning and General AI

The paper titled "A Brief Survey of Associations Between Meta-Learning and General AI" explores how meta-learning techniques can contribute to the development of Artificial General Intelligence (AGI). Here are the key points summarized:

1. General AI (AGI) and Meta-Learning:
- AGI aims to develop algorithms that can handle a wide variety of tasks, similar to human intelligence. Current AI systems excel at specific tasks but struggle with generalization to unseen tasks.
- Meta-learning or "learning to learn" improves model adaptation and generalization, allowing AI systems to tackle new tasks efficiently using prior experiences.

2. Neural Network Design in Meta-Learning:
- Techniques like Recurrent Neural Networks (RNNs) and Long Short-Term Memory (LSTM) networks enable self-improvement and adaptability for deep models, supporting generalization across tasks.
- Highway networks and ResNet-style models use shortcuts for efficient backpropagation, allowing deeper models that can be used in meta-learning frameworks.

3. Coevolution:
- Coevolution involves the mutual evolution of multiple components, such as learners or task-solvers, to improve overall performance.
- Coevolution between learners enhances collaboration and competition within AI systems, while coevolution between tasks and solvers (e.g., POWERPLAY and AI-GA frameworks) pushes solvers to adapt to increasingly complex tasks.

4. Curiosity in Meta-Learning:
- Curiosity-based exploration encourages AI systems to discover new, diverse features of the environment, avoiding local optima.
- Curiosity-based objectives can be combined with performance-based objectives to ensure efficient exploration and adaptation in complex tasks.

5. Forgetting Mechanisms:
- Forgetting is crucial to avoid memory overload in AI systems.
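The combination of curiosity-based and performance-based objectives mentioned in point 4 can be sketched with a count-based novelty bonus (the constants and the novelty measure are illustrative assumptions, not the paper's formulation):

```python
# Illustrative combined objective: performance reward plus a curiosity bonus
# that decays as a state becomes familiar, nudging the agent away from
# local optima toward unexplored states.
from collections import Counter

visits = Counter()

def combined_objective(state, performance_reward: float, beta: float = 0.5) -> float:
    visits[state] += 1
    curiosity_bonus = 1.0 / (visits[state] ** 0.5)  # decays with familiarity
    return performance_reward + beta * curiosity_bonus

first = combined_objective("s0", performance_reward=1.0)
second = combined_objective("s0", performance_reward=1.0)
assert second < first  # revisiting the same state is less rewarding
```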

https://arxiv.org/abs/2101.04283
prithivMLmods
posted an update 20 days ago
QwQ Edge Gets a Small Update! 💬
try now: prithivMLmods/QwQ-Edge

🚀 Now, you can use the following commands for different tasks:

🖼️ @image 'prompt...' → Generates an image
🔉 @tts1 'prompt...' → Generates speech in a female voice
🔉 @tts2 'prompt...' → Generates speech in a male voice
🅰️ @text 'prompt...' → Enables textual conversation (if not specified, text-to-text generation is the default mode)

💬 Multimodality support: prithivMLmods/Qwen2-VL-OCR-2B-Instruct
💬 For text generation, the FastThink-0.5B model ensures quick and efficient responses: prithivMLmods/FastThink-0.5B-Tiny
💬 Image generation: SDXL Lightning model, SG161222/RealVisXL_V4.0_Lightning

Github: https://github.com/PRITHIVSAKTHIUR/QwQ-Edge

graph TD
    A[User Interface] --> B[Chat Logic]
    B --> C{Command Type}
    C -->|Text| D[FastThink-0.5B]
    C -->|Image| E[Qwen2-VL-OCR-2B]
    C -->|@image| F[Stable Diffusion XL]
    C -->|@tts| G[Edge TTS]
    D --> H[Response]
    E --> H
    F --> H
    G --> H
KnutJaegersberg
posted an update 21 days ago
Artificial general intelligence through recursive data compression and grounded reasoning: a position paper


This paper proposes a system to achieve AGI through general data compression and grounded reasoning.

General Data Compression involves creating a flexible algorithm that adapts to input data to simplify and compress it recursively, identifying simple, orthogonal features to avoid redundancy. The algorithm measures AGI progress by solving problems based on increasing complexity, and it expands its search space according to the data itself. Compression is applied not only to data but also to model parameters, and sequences are segmented based on compressibility.

Grounded Reasoning refers to forming representations with various granularities, crucial for commonsense reasoning and AGI. The system simulates the real world as its model, switching between representations and maximizing resourcefulness. Key ideas include the world as its own model for reasoning and actions aimed at maximizing entropy to test hypotheses.

The paper emphasizes simplicity, data-dependent bias, recursion, orthogonality, resourcefulness, and grounding in real-world contexts as fundamental principles in building an AGI system.

https://arxiv.org/abs/1506.04366
Tonic
posted an update 25 days ago
🙋🏻‍♂️ hey there folks,

Goedel's Theorem Prover is now being demoed on Hugging Face: Tonic/Math

give it a try!
prithivMLmods
posted an update 26 days ago
o3-Mini and DeepSeek R1
Worked out with some famous and weird examples.

🔥 Blog: https://huggingface.co./blog/prithivMLmods/o3-mini-vs-deepseek-r1

Prompt : Using HTML, CSS, and JavaScript in a single HTML file to create a simulation of the solar system. Pay extreme attention to the UI to make it as intuitive as possible. Ensure that every planet appears as a sphere and is labeled with its corresponding name.

example 1: o3-Mini, example 2: DeepSeek R1

Q2 : https://huggingface.co./blog/prithivMLmods/o3-mini-vs-deepseek-r1#q2--web-solar-system-explorer
KnutJaegersberg
posted an update 28 days ago
Anthropomorphic reasoning about neuromorphic AGI safety

Summary of "Anthropomorphic Reasoning About Neuromorphic AGI Safety"
This paper explores safety strategies for neuromorphic artificial general intelligence (AGI), defined as systems designed by reverse-engineering essential computations of the human brain. Key arguments and proposals include:

1. Anthropomorphic Reasoning Validity:
- Neuromorphic AGI's design and assessment rely on human cognition models, making anthropomorphic reasoning (using human-like traits) critical for safety analysis. Comparisons to human behavior and neural mechanisms provide insights into AGI behavior and risks.

2. Countering Safety Criticisms:
- The authors challenge claims that neuromorphic AGI is inherently more dangerous than other AGI approaches. They argue all AGI systems face intractable verification challenges (e.g., real-world unpredictability, incomputable action validation). Neuromorphic AGI may even offer safety advantages by enabling comparisons to human cognitive processes.

3. Motivational Architecture:
- Basic drives (e.g., curiosity, social interaction) are essential for cognitive development and safety. These pre-conceptual, hardwired drives (analogous to human hunger or affiliation) shape learning and behavior. The orthogonality thesis (intelligence and goals as independent) is contested, as neuromorphic AGI's drives likely intertwine with its cognitive architecture.

4. Safety Strategies:
- **Social Drives**: Embedding drives like caregiving, affiliation, and cooperation ensures AGI develops prosocial values through human interaction.
- **Bounded Reward Systems**: Human-like satiation mechanisms (e.g., diminishing rewards after fulfillment) prevent extreme behaviors (e.g., paperclip maximization).
- **Developmental Environment**: Exposure to diverse, positive human interactions and moral examples fosters the development of prosocial values.
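The bounded-reward idea can be sketched as an exponentially satiating reward curve (the functional form and decay constant below are illustrative choices, not the paper's):

```python
# Sketch of a bounded, satiating reward: the marginal reward for one more unit
# shrinks exponentially with how much has already been consumed, so endlessly
# repeating one action (the paperclip-maximizer failure mode) stops paying off.
import math

def satiating_reward(consumption: float, base: float = 1.0, k: float = 1.0) -> float:
    """Marginal reward given how much of the drive is already fulfilled."""
    return base * math.exp(-k * consumption)

# The first unit is worth far more than the fourth: the drive satiates.
assert satiating_reward(0.0) > 10 * satiating_reward(3.0)
```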

https://ccnlab.org/papers/JilkHerdReadEtAl17.pdf
Abhaykoul
posted an update 29 days ago
🔥 THE WAIT IS OVER... HAI-SER IS HERE! 🔥

Yo fam, this ain't just another AI drop – this is the FUTURE of emotional intelligence! 🚀

Introducing HAI-SER, powered by Structured Emotional Reasoning (SER), the next-level AI that doesn't just understand your words – it feels you, analyzes your emotions, and helps you navigate life's toughest moments. 💡

💥 What makes HAI-SER a game-changer?
🔹 Emotional Vibe Check – Gets the mood, energy, and what's really going on 🎭
🔹 Mind-State Analysis – Breaks down your thoughts, beliefs, and patterns 🤯
🔹 Root Cause Deep-Dive – Unpacks the WHY behind your emotions 💡
🔹 Impact Check – Sees how it's affecting your life and mental health 💔
🔹 Safety Check – Prioritizes your well-being and crisis management 🚨
🔹 Healing Game Plan – Custom strategies to help you bounce back 💪
🔹 Growth Potential – Turns struggles into opportunities for self-improvement 📈
🔹 How to Approach – Teaches you and others how to communicate and heal 🤝
🔹 Personalized Response – Not just generic advice, real talk tailored to YOU 💯

No more robotic AI responses. No more surface-level advice. HAI-SER gets deep, analyzing emotions with precision and giving real, actionable support.

This ain't just AI – this is your digital therapist, life coach, and hype squad all in one. Whether it's mental health, career struggles, relationships, or personal growth, HAI-SER has your back.

🚀 The future of emotionally intelligent AI is HERE.
Are you ready? 🔥💯

HelpingAI/HAI-SER