A Multimodal Symphony: Integrating Taste and Sound through Generative AI
Abstract
In recent decades, neuroscientific and psychological research has traced direct relationships between taste and auditory perception. Building on this foundational work, this article explores multimodal generative models capable of converting taste information into music. We provide a brief review of the state of the art in this field, highlighting key findings and methodologies. We then present an experiment in which a fine-tuned version of a generative music model (MusicGEN) is used to generate music based on detailed taste descriptions provided for each musical piece. The results are promising: according to the participants' (n = 111) evaluations, the fine-tuned model produces music that more coherently reflects the input taste descriptions than the non-fine-tuned model. This study represents a significant step towards understanding and developing embodied interactions between AI, sound, and taste, opening new possibilities in the field of generative AI. We release our dataset, code, and pre-trained model at: https://osf.io/xs5jy/.
Community
Generative AI has been making waves in creative domains, from text and image generation to music composition. However, one sensory modality has remained largely unexplored in the realm of AI-driven creativity: taste. In A Multimodal Symphony: Integrating Taste and Sound through Generative AI, we investigate how AI can bridge the gap between taste and sound, generating music that embodies the essence of different flavors.
The Science Behind Taste-Sound Associations
Neuroscientific and psychological research has shown that certain auditory characteristics influence how we perceive taste. High-pitched sounds, for instance, are often linked to sweetness, while low-pitched, resonant tones can evoke bitterness. These crossmodal correspondences form the foundation for our study, where we fine-tuned a generative music model to produce compositions aligned with specific taste descriptions.
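As a toy illustration, such correspondences can be encoded as a simple lookup table used to assemble text prompts for a music model. The attribute lists below are illustrative choices drawn from the general crossmodal literature, not values taken from our dataset or training prompts:

```python
# Illustrative mapping of taste labels to auditory attributes reported in
# crossmodal correspondence research (the exact lists are assumptions).
CROSSMODAL_MAP = {
    "sweet":  ["high-pitched", "consonant", "soft", "legato"],
    "bitter": ["low-pitched", "resonant", "dark", "brassy"],
    "sour":   ["high-pitched", "dissonant", "fast", "sharp"],
    "salty":  ["staccato", "percussive", "mid-register"],
}

def taste_to_prompt(taste: str) -> str:
    """Turn a taste label into a text prompt for a music generation model."""
    attrs = ", ".join(CROSSMODAL_MAP[taste])
    return f"a {taste}-sounding piece of music: {attrs}"

print(taste_to_prompt("sweet"))
# -> "a sweet-sounding piece of music: high-pitched, consonant, soft, legato"
```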
Fine-Tuning MusicGEN for Taste-Based Composition
For our experiment, we fine-tuned MusicGEN, an open-source music generation model, on a dataset enriched with taste and emotional descriptors. Using the Taste & Affect Music Database, we trained the model to associate musical elements—such as tempo, timbre, and harmony—with specific taste profiles (sweet, sour, bitter, and salty). The goal was to determine whether this fine-tuned model could generate music that listeners perceive as more representative of the given taste prompts compared to its non-fine-tuned counterpart.
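The fine-tuning pipeline itself is beyond the scope of this post (see the released code for details), but the sketch below shows how taste-conditioned prompts can drive generation through audiocraft's public MusicGen API. The base checkpoint, prompt wording, and clip duration are illustrative assumptions, not our exact configuration:

```python
# Minimal sketch: generating taste-conditioned clips with audiocraft's
# MusicGen API. Swap in a fine-tuned checkpoint where available; the
# base model and prompt wording here are placeholders.
from audiocraft.models import MusicGen
from audiocraft.data.audio import audio_write

# Load a pretrained checkpoint (a fine-tuned one would be loaded the same way).
model = MusicGen.get_pretrained("facebook/musicgen-small")

# Short clips are typical for online listening studies.
model.set_generation_params(duration=10)  # seconds

# Taste-conditioned text prompts (illustrative wording, not our exact prompts).
prompts = [
    "a sweet, soft, high-pitched consonant melody",
    "a bitter, low-pitched, dissonant and resonant piece",
]

wavs = model.generate(prompts)  # tensor of shape (batch, channels, samples)

for i, wav in enumerate(wavs):
    # Writes e.g. taste_0.wav with loudness normalization.
    audio_write(f"taste_{i}", wav.cpu(), model.sample_rate, strategy="loudness")
```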
Evaluating the Generated Music
To validate our approach, we conducted an online survey with 111 participants, who listened to the generated audio clips and rated how coherently each clip matched its corresponding taste description. The results were promising: the fine-tuned model generated music that was significantly more aligned with the intended taste attributes, particularly for sweet, bitter, and sour prompts. Saltiness, however, proved more challenging to represent, indicating a need for further refinement of the dataset composition and model training.
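For readers curious about how such a comparison might look in code, the sketch below runs a one-sided Mann-Whitney U test on coherence ratings from the two models. The rating values are made-up stand-ins, and the paper's actual statistical analysis may use a different test:

```python
# Illustrative comparison of coherence ratings between the fine-tuned and
# base models. The arrays are hypothetical stand-ins, not the study's data.
import numpy as np
from scipy.stats import mannwhitneyu

# Hypothetical 7-point coherence ratings for one taste prompt.
finetuned_ratings = np.array([6, 5, 7, 6, 5, 6, 7, 4, 6, 5])
baseline_ratings  = np.array([4, 3, 5, 4, 2, 5, 4, 3, 4, 3])

# One-sided test: are fine-tuned ratings stochastically greater?
stat, p_value = mannwhitneyu(finetuned_ratings, baseline_ratings,
                             alternative="greater")
print(f"U = {stat:.1f}, p = {p_value:.4f}")
```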