The KittenTTS model was first released as a developer preview on August 4, 2025. This guide works with version 0.1 or 0.2 of the KittenTTS model.
KittenTTS is distributed as a Python package, which makes installation easy. It's most convenient to install KittenTTS in the same virtual environment that you created and installed Blinka into for the Voice Bonnet setup. Activate that environment first:
source env/bin/activate
Replace env/ with the path to your own virtual environment if you used a different name or location.
Next, install KittenTTS with pip:
pip install https://github.com/KittenML/KittenTTS/releases/download/0.1/kittentts-0.1.0-py3-none-any.whl
The installation takes several minutes, depending on your internet connection and whether you are using a Pi 4 or Pi 5.
This basic Python script demonstrates how to generate a .wav audio file from text.
import soundfile as sf
from kittentts import KittenTTS

# Load the model
m = KittenTTS("KittenML/kitten-tts-nano-0.2")

# Available voices: 'expr-voice-2-m', 'expr-voice-2-f', 'expr-voice-3-m', 'expr-voice-3-f',
# 'expr-voice-4-m', 'expr-voice-4-f', 'expr-voice-5-m', 'expr-voice-5-f'
audio = m.generate("This high quality TTS model works without a GPU", voice="expr-voice-2-f")

# Save the audio at a 24000 Hz sample rate
sf.write("output.wav", audio, 24000)
Save the script to a file and run it with the virtual environment active. It will create output.wav. The first time you generate audio, the model files are downloaded automatically, so expect some extra output in the terminal on that first run.
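To compare the voices listed in the script's comment, a loop along the following lines may be handy. This is only a sketch: it reuses the same KittenTTS, generate, and sf.write calls as the script above, while the function names, the voices/ output folder, and the file naming scheme are assumptions made up for illustration.

```python
from pathlib import Path

# Voice names copied from the comment in the script above.
VOICES = [
    "expr-voice-2-m", "expr-voice-2-f",
    "expr-voice-3-m", "expr-voice-3-f",
    "expr-voice-4-m", "expr-voice-4-f",
    "expr-voice-5-m", "expr-voice-5-f",
]

SAMPLE_RATE = 24000  # same rate the script above passes to sf.write


def sample_path(voice, out_dir="voices"):
    """Return the output path for one voice (hypothetical naming scheme)."""
    return Path(out_dir) / f"{voice}.wav"


def render_all_voices(text, out_dir="voices"):
    """Generate one .wav file per voice and return the paths written."""
    # Imported here so sample_path() can be used without the packages installed.
    import soundfile as sf
    from kittentts import KittenTTS

    model = KittenTTS("KittenML/kitten-tts-nano-0.2")
    Path(out_dir).mkdir(parents=True, exist_ok=True)

    written = []
    for voice in VOICES:
        audio = model.generate(text, voice=voice)
        path = sample_path(voice, out_dir)
        sf.write(str(path), audio, SAMPLE_RATE)
        written.append(path)
    return written
```

Calling render_all_voices("Hello from the Raspberry Pi") with the virtual environment active should leave one .wav file per voice in the voices/ folder, ready to audition.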
To play the wav file, use the aplay program.
aplay output.wav
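If playback is silent, it can help to confirm that the file itself looks sane before troubleshooting the sound card. The sketch below uses Python's standard wave module to report the sample rate, channel count, and duration. The describe_wav name is made up for this example, and it assumes the file was stored as integer PCM; if wave.open raises wave.Error, the file may use a different encoding.

```python
import wave


def describe_wav(path):
    """Return (sample_rate_hz, channels, duration_seconds) for a PCM .wav file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        duration = wav.getnframes() / rate
    return rate, channels, duration


# Example use (requires output.wav from the script above):
# rate, channels, seconds = describe_wav("output.wav")
# print(f"{rate} Hz, {channels} channel(s), {seconds:.2f} s")
```

For a file written by the script above, the reported sample rate should match the 24000 passed to sf.write.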
If you do not hear sound from the Voice Bonnet, target the correct sound card with the -D or --device argument, the same way it was done for speaker-test during the Voice Bonnet setup. Use aplay -l to find the card number, then put it in place of the # in plughw:#,0. For example, with sound card number 3:
aplay --device plughw:3,0 output.wav
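If you would rather play the file from the same Python script that generated it, the card lookup and the aplay call can be wrapped in a few helpers. This is a rough sketch rather than part of KittenTTS: the function names are invented here, and the parser assumes aplay -l prints lines in the usual "card N: NAME [Description], device M: ..." layout.

```python
import re
import subprocess

# Matches lines such as "card 3: NAME [Description], device 0: ..."
CARD_LINE = re.compile(r"^card (\d+): (\S+) \[(.*?)\], device (\d+):")


def list_playback_devices(aplay_output):
    """Parse `aplay -l` text into (card, device, name, description) tuples."""
    devices = []
    for line in aplay_output.splitlines():
        match = CARD_LINE.match(line)
        if match:
            card, name, description, device = match.groups()
            devices.append((int(card), int(device), name, description))
    return devices


def aplay_command(wav_path, card=None, device=0):
    """Build the aplay argument list, adding --device when a card is given."""
    command = ["aplay"]
    if card is not None:
        command += ["--device", f"plughw:{card},{device}"]
    command.append(wav_path)
    return command


def play(wav_path, card=None):
    """Run aplay and raise CalledProcessError if it exits with an error."""
    subprocess.run(aplay_command(wav_path, card), check=True)
```

To use it, capture the device list with subprocess.run(["aplay", "-l"], capture_output=True, text=True).stdout, pass that text to list_playback_devices, pick the card number that belongs to the Voice Bonnet, and call play("output.wav", card=3) with your own card number in place of 3.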
Page last edited August 22, 2025