stem-voice-clone

Multi-stem singing voice cloning toolkit.
Clone any singing voice onto every vocal stem track — preserving all layers.

Listen: Before & After

All audio below is AI-generated with Suno — fully copyright-free.

Original Song (Source): Pop ballad with male vocal, backing vocals, and instrumental

Target Voice (Reference): Funk pop with female vocal; this is the voice we want to clone

Converted Result (Output): Same song structure, but now sung with the target voice

Each vocal stem (main vocal, backing vocals) was converted individually, preserving the full layered production. This is what makes stem-voice-clone different from single-track tools.
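The per-stem idea can be sketched roughly as follows. This is an illustration under assumptions, not the toolkit's actual pipeline: `convert_and_mix`, `vocal_names`, and the `convert_voice` callable are hypothetical names, and stems are modelled as plain lists of float samples rather than audio files.

```python
def convert_and_mix(stems, vocal_names, convert_voice):
    """Convert each vocal stem on its own, then sum all stems into one mix.

    stems         -- dict mapping stem name to a list of float samples
    vocal_names   -- set of stem names that should be voice-converted
    convert_voice -- hypothetical callable standing in for the real converter
    """
    length = max(len(samples) for samples in stems.values())
    mix = [0.0] * length
    for name, samples in stems.items():
        if name in vocal_names:
            # Each vocal layer is converted separately, so lead and
            # backing parts stay distinct in the final mix.
            samples = convert_voice(samples)
        # Instrumental stems pass through untouched; truncate in case the
        # converter returned more samples than the mix can hold.
        for i, x in enumerate(samples[:length]):
            mix[i] += x
    return mix


# Toy usage: "conversion" here just inverts the signal.
demo_stems = {
    "lead": [1.0, 1.0],
    "backing": [0.5, 0.5],
    "drums": [0.25, 0.25],
}
demo_mix = convert_and_mix(demo_stems, {"lead", "backing"},
                           lambda s: [-x for x in s])
```

A single-track tool would, by contrast, convert one merged vocal signal, which is where the separate layers tend to get lost.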

How It Works

Multi-Stem Conversion: Each vocal track is converted separately, so the main vocal, harmonies, and ad-libs are all preserved.
Zero-Shot: No training needed. Just provide 15-25 seconds of the target voice.
Silence Masking: Prevents the model from hallucinating vocals in the silent regions of each stem.
Auto Detection: Automatically classifies the tracks in your folder as vocal or instrumental.
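As an illustration of the silence-masking idea, here is a minimal sketch. It assumes the mask is derived from the frame-wise RMS level of the source stem; the function names, the frame length, and the -50 dB threshold are all hypothetical, and the toolkit's own implementation may work differently. Plain Python lists keep the example dependency-free; real audio code would more likely use NumPy arrays.

```python
import math


def frame_rms(samples, frame_len):
    """Return the RMS level of each consecutive, non-overlapping frame."""
    levels = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        levels.append(math.sqrt(sum(x * x for x in frame) / len(frame)))
    return levels


def mask_silence(source, converted, frame_len=1024, threshold_db=-50.0):
    """Zero the converted samples wherever the source stem is silent.

    threshold_db is an assumed value, relative to full scale (1.0).
    """
    threshold = 10.0 ** (threshold_db / 20.0)  # dB -> linear amplitude
    masked = list(converted)
    for i, level in enumerate(frame_rms(source, frame_len)):
        if level < threshold:
            start = i * frame_len
            end = min(start + frame_len, len(masked))
            # The source had nothing here, so whatever the converter
            # produced in this frame is treated as hallucinated.
            for j in range(start, end):
                masked[j] = 0.0
    return masked


# Toy usage: the first frame of the source is silent, the second is not.
source_stem = [0.0] * 4 + [0.5] * 4
converted_stem = [0.1] * 8
cleaned = mask_silence(source_stem, converted_stem, frame_len=4)
```

Hard zeroing at frame boundaries can click audibly; a short fade at each mask edge would be a natural refinement.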

Get Started

git clone --recursive https://github.com/Daewooki/stem-voice-clone.git
cd stem-voice-clone
install.bat          # Windows
./install.sh         # Linux/Mac

python convert.py ./my_stems/ --ref singer.mp3