HVC-Audio-Convert Base Models

Overview

These base models are the pre-trained starting point for HVC-Audio-Convert (Soft-VC Voice Conversion), a voice conversion framework that combines SoftVC feature extraction with the VITS architecture (a conditional variational autoencoder trained with adversarial learning).

Key Features

High-quality voice conversion capabilities
Pre-trained on diverse vocal datasets
Supports cross-lingual voice conversion
Compatible with HVC-Audio-Convert v4.0 and newer

Technical Details

Architecture: Based on VITS (Conditional Variational Autoencoder)
Feature Extraction: HuBERT content encoder
Training Data: Curated multi-speaker datasets
Model Format: PyTorch checkpoints

Usage

Download the desired base model
Load it in the HVC-Audio-Convert framework
Fine-tune on target voice data
Perform voice conversion
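
Before fine-tuning, it can help to confirm that the downloaded files are where you expect them and to look at what a checkpoint contains. The sketch below is a minimal illustration and not part of the HVC-Audio-Convert framework: the function names `find_checkpoints` and `summarize_checkpoint`, the `.pth`/`.pt` extensions, and the directory layout are all assumptions. Only `torch.load` with `map_location` is a standard PyTorch call.

```python
from pathlib import Path

# Assumption: base models are distributed as .pth or .pt files.
CHECKPOINT_SUFFIXES = {".pth", ".pt"}


def find_checkpoints(model_dir):
    """Return checkpoint files found anywhere under model_dir, sorted by path."""
    root = Path(model_dir)
    found = [
        p for p in root.rglob("*")
        if p.is_file() and p.suffix in CHECKPOINT_SUFFIXES
    ]
    return sorted(found)


def summarize_checkpoint(path):
    """List the top-level keys of a checkpoint (hypothetical helper).

    torch is imported lazily so that find_checkpoints works without it.
    Depending on the PyTorch version and how the file was saved,
    torch.load may need extra arguments; treat this as a starting point.
    """
    import torch

    state = torch.load(path, map_location="cpu")  # load on CPU, no GPU needed
    if isinstance(state, dict):
        return sorted(str(key) for key in state)
    return [type(state).__name__]


if __name__ == "__main__":
    # "base_models" is a placeholder directory name.
    for ckpt in find_checkpoints("base_models"):
        print(ckpt)
```

Only load checkpoints from sources you trust, since PyTorch checkpoint files can contain pickled Python objects.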

Requirements

HVC-Audio-Convert framework
Python 3.8+
PyTorch 1.13.0+
CUDA-compatible GPU (recommended)
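
The requirements above can be checked with a short script. This is a sketch under stated assumptions: `check_environment` and `parse_version` are hypothetical helper names, and the minimum versions are simply the ones listed in this section. It does not verify that the HVC-Audio-Convert framework itself is installed.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 8)      # from the requirements list
MIN_TORCH = (1, 13, 0)   # from the requirements list


def parse_version(text):
    """Turn a version string such as '2.1.0+cu118' into (2, 1, 0)."""
    core = text.split("+")[0]  # drop local build tags like '+cu118'
    parts = []
    for piece in core.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break  # stop at suffixes such as 'a0' or 'rc1'
            digits += ch
        parts.append(int(digits) if digits else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def check_environment():
    """Report whether Python, PyTorch, and CUDA meet the listed requirements."""
    report = {
        "python_ok": sys.version_info[:2] >= MIN_PYTHON,
        "torch_ok": False,
        "cuda_available": False,
    }
    # Only import torch if it is installed, so the check never crashes.
    if importlib.util.find_spec("torch") is not None:
        import torch

        report["torch_ok"] = parse_version(torch.__version__) >= MIN_TORCH
        report["cuda_available"] = bool(torch.cuda.is_available())
    return report


if __name__ == "__main__":
    for name, value in check_environment().items():
        print(f"{name}: {value}")
```

A `cuda_available` value of `False` is not an error, since the GPU is recommended rather than required, but training and conversion will likely be much slower on CPU.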

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

Citation

If you use these models in your research, please cite: