Sang-Hoon Lee, Ji-Hoon Kim, Kang-Eun Lee, Seong-Whan Lee
Although recent neural vocoders have improved significantly, most of them trade audio quality against computational complexity. Because large models are impractical on low-resource devices, a practical neural vocoder must synthesize high-quality audio efficiently. In this paper, we present Fre-GAN 2, a fast and efficient high-quality audio synthesis model. For fast synthesis, the Fre-GAN 2 generator synthesizes only the low- and high-frequency parts of the audio and uses the inverse discrete wavelet transform (iDWT) to reconstruct the target-resolution audio from them. We also introduce adversarial periodic feature distillation (APFD), which lets the model synthesize high-quality audio with only a small number of parameters. The experimental results show the superiority of Fre-GAN 2 in audio quality. Furthermore, compared with Fre-GAN, Fre-GAN 2 generates audio 10.91× faster and has 21.23× fewer parameters.
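
To make the iDWT step concrete, the following is a minimal sketch of how a full-rate signal can be rebuilt from half-rate low- and high-frequency bands. It is not the Fre-GAN 2 generator: the Haar wavelet, the helper names (`haar_dwt`, `haar_idwt`, `multilevel_idwt`), and the toy sine input are all assumptions chosen for illustration. Here the bands come from analysing a known signal, whereas in the model they would be predicted by the network.

```python
import numpy as np

SQRT2 = np.sqrt(2.0)

def haar_dwt(x):
    """Split an even-length signal into half-rate low and high bands (Haar, assumed)."""
    even, odd = x[0::2], x[1::2]
    low = (even + odd) / SQRT2    # approximation (low-frequency) band
    high = (even - odd) / SQRT2   # detail (high-frequency) band
    return low, high

def haar_idwt(low, high):
    """Rebuild the full-rate signal from its two half-rate bands."""
    out = np.empty(2 * len(low), dtype=np.result_type(low, high))
    out[0::2] = (low + high) / SQRT2  # even-indexed samples
    out[1::2] = (low - high) / SQRT2  # odd-indexed samples
    return out

def multilevel_idwt(bands):
    """bands = [coarsest low band, coarsest detail, ..., finest detail]."""
    signal = bands[0]
    for detail in bands[1:]:
        signal = haar_idwt(signal, detail)  # each level doubles the length
    return signal

# Toy input: 1024 samples of a 440 Hz tone (sampling rate chosen arbitrarily).
t = np.arange(1024) / 22050.0
wave = np.sin(2 * np.pi * 440.0 * t)

# Single level: two bands at half the target length.
low, high = haar_dwt(wave)
single = haar_idwt(low, high)

# Two levels: the low band is split once more before reconstruction.
low2, high2 = haar_dwt(low)
multi = multilevel_idwt([low2, high2, high])

print(np.allclose(single, wave), np.allclose(multi, wave))
```

In this sketch each band is shorter than the waveform it reconstructs (half length at one level, a quarter at two), which illustrates why producing bands instead of full-rate samples can reduce the work done at the target resolution.
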
Script: Printing in the only sense with which we are at present concerned differs from most if not from all the arts and crafts represented in the Exhibition.

| | | | |
|---|---|---|---|
| Ground Truth | WaveNet | HiFi-GAN V1 | HiFi-GAN V2 |
| Fre-GAN V1 | Fre-GAN 2 V1 (Single-level iDWT) | Fre-GAN 2 V1 (Multi-level iDWT) | |
| Fre-GAN V2 | Fre-GAN 2 V2 (Single-level iDWT) | Fre-GAN 2 V2 (Multi-level iDWT) | Fre-GAN 2* V2 (Multi-level iDWT, APFD) |

Script: This is best furthered by the avoidance of irrational swellings and spiky projections and by the using of careful purity of line.

| | | | |
|---|---|---|---|
| Ground Truth | WaveNet | HiFi-GAN V1 | HiFi-GAN V2 |
| Fre-GAN V1 | Fre-GAN 2 V1 (Single-level iDWT) | Fre-GAN 2 V1 (Multi-level iDWT) | |
| Fre-GAN V2 | Fre-GAN 2 V2 (Single-level iDWT) | Fre-GAN 2 V2 (Multi-level iDWT) | Fre-GAN 2* V2 (Multi-level iDWT, APFD) |

Script: The supply of which was, however, limited, and there were not always enough to give bedding to all. The stock was diminished by theft.

| | | | |
|---|---|---|---|
| Ground Truth | WaveNet | HiFi-GAN V1 | HiFi-GAN V2 |
| Fre-GAN V1 | Fre-GAN 2 V1 (Single-level iDWT) | Fre-GAN 2 V1 (Multi-level iDWT) | |
| Fre-GAN V2 | Fre-GAN 2 V2 (Single-level iDWT) | Fre-GAN 2 V2 (Multi-level iDWT) | Fre-GAN 2* V2 (Multi-level iDWT, APFD) |

Script: He slept in the same bed with a highwayman on one side and a man charged with murder on the other.

| | | | |
|---|---|---|---|
| Ground Truth | WaveNet | HiFi-GAN V1 | HiFi-GAN V2 |
| Fre-GAN V1 | Fre-GAN 2 V1 (Single-level iDWT) | Fre-GAN 2 V1 (Multi-level iDWT) | |
| Fre-GAN V2 | Fre-GAN 2 V2 (Single-level iDWT) | Fre-GAN 2 V2 (Multi-level iDWT) | Fre-GAN 2* V2 (Multi-level iDWT, APFD) |

Script: The committee seems to have fully realized even at this early date eighteen fifteen.

| | | | |
|---|---|---|---|
| Ground Truth | WaveNet | HiFi-GAN V1 | HiFi-GAN V2 |
| Fre-GAN V1 | Fre-GAN 2 V1 (Single-level iDWT) | Fre-GAN 2 V1 (Multi-level iDWT) | |
| Fre-GAN V2 | Fre-GAN 2 V2 (Single-level iDWT) | Fre-GAN 2 V2 (Multi-level iDWT) | Fre-GAN 2* V2 (Multi-level iDWT, APFD) |

Script: Their type is on the lines of the German and French rather than of the Roman printers.

| | | | |
|---|---|---|---|
| Fre-GAN 2 V2 (Multi-level iDWT, 500k) | APFD (500k) | L1 distance (500k) | AFD (500k) |

Sub-audio modelling
Script: The committee seems to have fully realized even at this early date eighteen fifteen.

| | |
|---|---|
| Fre-GAN 2 V2 (Multi-level iDWT, 500k) | PQMF (500k) |