Gender-ambiguous speech synthesis in Basque through speaker-vector manipulation
Abstract
There is growing interest in text-to-speech (TTS) systems with gender-ambiguous voices, partly because of their potential to avoid gender biases and stereotypes in voice assistants and smart speakers. In this paper we present and evaluate novel methods that apply voice morphing techniques to speaker embeddings in order to obtain neural TTS systems with gender-ambiguous voices for the Basque language. The speaker embeddings are obtained by training a multi-speaker Tacotron 2. We compare the performance of systems with and without speaker embedding normalization controlled by a scaling parameter, and we also compare applying these methods to the average embeddings of each gender with applying them to the embeddings of real voices. The results show that the presented methods are a valid way to obtain gender-ambiguous voices of acceptable, albeit improvable, quality.
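As a rough illustration of what morphing speaker embeddings can look like, the following Python sketch interpolates between two embedding vectors and optionally rescales the result. It is not the method of the paper: the linear interpolation, the choice of the mean input norm as the normalization target, and the names `mean_embedding`, `morph`, `alpha`, and `scale` are all assumptions made for this example, and the toy vectors stand in for embeddings that would come from a trained multi-speaker model.

```python
import math


def mean_embedding(embeddings):
    """Element-wise average of a list of equal-length embedding vectors."""
    n = len(embeddings)
    return [sum(column) / n for column in zip(*embeddings)]


def l2_norm(vec):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in vec))


def morph(emb_a, emb_b, alpha=0.5, normalize=False, scale=1.0):
    """Linearly interpolate two speaker embeddings (assumed morphing rule).

    alpha=0 returns emb_a and alpha=1 returns emb_b. With normalize=True the
    result is rescaled so its norm equals `scale` times the mean norm of the
    two inputs (an assumed normalization, chosen only for illustration).
    """
    mixed = [(1 - alpha) * a + alpha * b for a, b in zip(emb_a, emb_b)]
    if normalize:
        target = scale * 0.5 * (l2_norm(emb_a) + l2_norm(emb_b))
        norm = l2_norm(mixed)
        if norm > 0:
            mixed = [x * target / norm for x in mixed]
    return mixed


# Toy 2-dimensional embeddings; real speaker embeddings have many more dimensions.
female_avg = mean_embedding([[1.0, 0.0], [3.0, 0.0]])
male_avg = mean_embedding([[0.0, 2.0], [0.0, 2.0]])

plain = morph(female_avg, male_avg, alpha=0.5)
scaled = morph(female_avg, male_avg, alpha=0.5, normalize=True, scale=1.0)
```

In this toy case the plain midpoint is shorter than either input vector, which is one reason a normalization step with a scaling parameter might be worth comparing against the unnormalized variant. The same `morph` function can be applied either to per-gender averages or to the embeddings of two individual speakers.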
Keywords
speech synthesis, gender-ambiguous voice, speaker embeddings, voice morphing, Basque
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.