Ilaria Manco

I am a Research Scientist in the Magenta team at Google DeepMind, working on music understanding and generation with a focus on interactive, real-time generative models.

Research & Background

I completed my PhD at the UKRI Centre for Doctoral Training in Artificial Intelligence and Music (AIM) at Queen Mary University of London, in collaboration with Universal Music Group. My research focused on developing multimodal deep learning methods to enable automatic music understanding by learning richer representations from language and audio.

During my PhD, I also spent time as a Student Researcher with the Magenta team at Google DeepMind and completed research internships at Adobe in San Francisco, Sony R&D in Tokyo, and The Alan Turing Institute in London (as an Engage student).

Previously, I worked as a data scientist and machine learning engineer, and obtained an MSci in Physics from Imperial College London. For my master's, I specialised in complex systems and computational physics, with a thesis on spatial scaling in human mobility models.

When I take off my researcher hat, I enjoy doing all things related to electronic music: I DJ, produce, and host a monthly radio show focussed on experimental dance music.

Latest Updates

DEC 2025

Invited talk at the Workshop on AI for Music at NeurIPS 2025 in San Diego

OCT 2025

Our paper "Live Music Models" has been accepted to the NeurIPS 2025 Creative AI track

JAN 2025

Joined Google DeepMind as a Research Scientist

Older updates
NOV 2024

Defended my PhD thesis "Learning Music Representations from Audio and Language" 🎓

NOV 2024

Presented a tutorial on Connecting Music Audio and Natural Language and two papers at ISMIR 2024 in San Francisco, one of which won a Best Paper Award 🥇

AUG 2024

New survey paper released: "Foundation Models for Music: A Survey"

JUL 2024

Presented MuChoMusic at the DMLR workshop @ ICML 2024 in Vienna

JUN 2024

Two papers accepted at ISMIR 2024

DEC 2023

Presented a paper at the Workshop on ML for Audio at NeurIPS 2023 in New Orleans

NOV 2023

Started a Student Researcher position with the Magenta team at Google DeepMind in London

JUL 2023

Joined the Audio Research Group at Adobe Research in San Francisco for a summer internship

DEC 2022

Presented a paper and a late-breaking/demo (LBD) at ISMIR 2022 in Bangalore

JUL 2022

Started a research internship in Yuki Mitsufuji's lab at Sony R&D in Tokyo

MAY 2022

Presented Learning Music Audio Representations via Weak Language Supervision at ICASSP 2022 in Singapore

DEC 2021

Gave a talk at the Neural Audio Synthesis Workshop

DEC 2021

Gave a talk at the 2021 Intelligent Sensing Winter School

APR 2021

MusCaps paper accepted for oral presentation at IJCNN 2021

FEB 2021

Presented my work at the Alan Turing Institute for the Engage Student Research Showcase

DEC 2020

Co-organised the Digital Music Research Network (DMRN+15) workshop

AUG 2020

Attended the Deep Learning and Reinforcement Learning (DLRL) Summer School

APR 2020

Awarded a Turing Enrichment Scheme studentship

Selected Publications

Full List →

Live Music Models

Antoine Caillon, Brian McWilliams, Cassie Tarakajian, Ian Simon, Ilaria Manco, Jesse Engel, Noah Constant, Yunpeng Li, Timo I. Denk, Alberto Lalama, Andrea Agostinelli, Cheng-Zhi Anna Huang, Ethan Manilow, George Brower, Hakan Erdogan, Heidi Lei, Itai Rolnick, Ivan Grishchenko, Manu Orsini, Matej Kastelic, Mauricio Zuluaga, Mauro Verzetti, Michael Dooley, Ondrej Skopek, Rafael Ferrer, Savvas Petridis, Zalán Borsos, Aäron van den Oord, Douglas Eck, Eli Collins, Jason Baldridge, Tom Hume, Chris Donahue, Kehang Han, Adam Roberts

39th Annual Conference on Neural Information Processing Systems (NeurIPS) Creative AI Track, 2025

MuChoMusic: Evaluating Music Understanding in Multimodal Audio-Language Models

Benno Weck*, Ilaria Manco*, Emmanouil Benetos, Elio Quinton, György Fazekas, Dmitry Bogdanov

25th International Society for Music Information Retrieval Conference (ISMIR), 2024

[Best Paper Award]

Augment, Drop & Swap: Improving Diversity in LLM Captions for Efficient Music-Text Representation Learning

Ilaria Manco, Justin Salamon, Oriol Nieto

25th International Society for Music Information Retrieval Conference (ISMIR), 2024

The Song Describer Dataset: a Corpus of Audio Captions for Music-and-Language Evaluation

Ilaria Manco*, Benno Weck*, SeungHeon Doh, Minz Won, Yixiao Zhang, Dmitry Bogdanov, Yusong Wu, Ke Chen, Philip Tovstogan, Emmanouil Benetos, Elio Quinton, György Fazekas, Juhan Nam

Machine Learning for Audio Workshop @ NeurIPS, 2023

Song Describer: a Platform for Collecting Textual Descriptions of Music Recordings

Ilaria Manco, Benno Weck, Philip Tovstogan, Minz Won, Dmitry Bogdanov

23rd International Society for Music Information Retrieval Conference (ISMIR) Late-Breaking/Demo, 2022

Contrastive Audio-Language Learning for Music (MusCALL)

Ilaria Manco, Emmanouil Benetos, Elio Quinton, György Fazekas

23rd International Society for Music Information Retrieval Conference (ISMIR), 2022

Learning Music Audio Representations Via Weak Language Supervision

Ilaria Manco, Emmanouil Benetos, Elio Quinton, György Fazekas

2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022

MusCaps: Generating Captions for Music Audio

Ilaria Manco, Emmanouil Benetos, Elio Quinton, György Fazekas

International Joint Conference on Neural Networks (IJCNN), 2021

Selected Talks

Invited Talk · 2025

Real-time Music Generation: Lowering Latency and Increasing Control

NeurIPS 2025 Workshop on AI for Music (San Diego)

Tutorial · 2024

Connecting Music Audio and Natural Language

ISMIR 2024 (San Francisco)

Invited Talk · 2023

Music & Language Models

Universal Music Group (London)

Guest Lecture · 2022

Bridging Audio and Language to Improve Automatic Music Understanding

Singapore University of Technology & Design (Singapore)

Curriculum Vitae

Full CV

Education

PhD in AI & Music

Queen Mary University of London

2019—2024

MSci in Physics

Imperial College London

2014—2018

Experience

Research Scientist

Google DeepMind (San Francisco)

2025—Present

Working on real-time generative music models and interactive tools within the Magenta team

Research Internships

Google DeepMind (London) 2023—2024
Adobe Research (San Francisco) 2023
Sony R&D (Tokyo) 2022
The Alan Turing Institute (London) 2021