Multi-stem singing voice cloning toolkit.
Clone any singing voice onto every vocal stem track — preserving all layers.
Listen: Before & After
All audio below is AI-generated with Suno — fully copyright-free.
Original Song
Source
Pop ballad with male vocal + backing vocals + instrumental
Target Voice
Reference
Funk pop with female vocal — this is the voice we want to clone
Converted Result
Output
Same song structure, but now sung with the target voice
Each vocal stem (main vocal, backing vocals) was converted individually, preserving the full layered production. This is what makes stem-voice-clone different from single-track tools.
How It Works
Multi-Stem Conversion
Each vocal track is converted separately — main, harmonies, ad-libs all preserved.
Zero-Shot
No training needed. Just provide 15-25 seconds of the target voice.
Silence Masking
Prevents the model from hallucinating vocal sounds in the silent regions of each stem.
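To make the idea concrete, here is a minimal sketch of one way silence masking could work. It assumes the source stem and the converted stem are equal-length lists of float samples. The helper names (`frame_rms`, `mask_silence`), the frame length, and the threshold are all hypothetical and are not taken from the toolkit's code.

```python
import math


def frame_rms(samples, frame_len):
    """Return the RMS level of each consecutive frame of `samples`."""
    levels = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        levels.append(math.sqrt(sum(x * x for x in frame) / len(frame)))
    return levels


def mask_silence(source, converted, frame_len=1024, threshold=0.01):
    """Zero out `converted` wherever the matching `source` frame is silent.

    Assumption: a frame counts as silent when its RMS is below `threshold`.
    """
    if len(source) != len(converted):
        raise ValueError("source and converted must have the same length")
    out = list(converted)
    for i, level in enumerate(frame_rms(source, frame_len)):
        if level < threshold:
            start = i * frame_len
            end = min(start + frame_len, len(out))
            # The original stem had nothing here, so drop whatever
            # the conversion model produced in this frame.
            out[start:end] = [0.0] * (end - start)
    return out
```

A real pipeline would probably smooth the mask edges with short fades to avoid clicks. That step is left out here to keep the sketch short.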
Auto Detection
Automatically classifies each track in your stems folder as vocal or instrumental.
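As an illustration only, the sketch below sorts stems by filename keywords. This is an assumption: the toolkit may well inspect the audio itself rather than the names. The keyword list, the extension set, and the function names are hypothetical.

```python
from pathlib import Path

# Hypothetical keyword list and extensions; not taken from the toolkit.
VOCAL_KEYWORDS = ("vocal", "vox", "voice", "harmony", "adlib", "choir")
AUDIO_EXTENSIONS = {".wav", ".mp3", ".flac"}


def classify_stem(filename):
    """Label a stem 'vocal' or 'instrumental' from its file name alone."""
    name = Path(filename).stem.lower()
    if any(keyword in name for keyword in VOCAL_KEYWORDS):
        return "vocal"
    return "instrumental"


def classify_folder(folder):
    """Group the audio files in `folder` by their guessed stem type."""
    groups = {"vocal": [], "instrumental": []}
    for path in sorted(Path(folder).iterdir()):
        if path.is_file() and path.suffix.lower() in AUDIO_EXTENSIONS:
            groups[classify_stem(path.name)].append(path.name)
    return groups
```

Under these assumptions, calling `classify_folder("./my_stems/")` would return the file names split into a `vocal` list (the stems to convert) and an `instrumental` list (the stems to leave untouched).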
Get Started
git clone --recursive https://github.com/Daewooki/stem-voice-clone.git
cd stem-voice-clone
install.bat # Windows
./install.sh # Linux/Mac
python convert.py ./my_stems/ --ref singer.mp3