Welcome to the demo website. All the mixture examples presented in the Audio Examples section are taken from the test set.

Abstract

A novel model was recently proposed by Schulze-Forster et al. for unsupervised music source separation. This model addresses several major shortcomings of modern source separation frameworks: it eliminates the need for isolated sources during training, performs well with limited data, and can handle homogeneous sources (such as singing voice). Nevertheless, it relies on an external multipitch estimator and an ad hoc voice assignment procedure. In this paper, we extend this framework into a complete, fully differentiable model by integrating a multipitch estimator and a novel differentiable voice assignment module within the core model. We demonstrate the merits of our approach through a set of experiments, and we highlight in particular its potential for processing diverse and unseen data.

Index Terms - Unsupervised source separation, multiple singing voices, differentiable models, deep learning