Here is a step-by-step method for making your own words and sentences using the IBM text to speech website.
Obtaining MP3 Files from the BlueMix Text to Speech Engine
IBM hosts a text to speech demo on its BlueMix cloud platform. Open https://text-to-speech-demo.ng.bluemix.net/ in a modern web browser.
Select the voice you would like to use.
For the examples later in this tutorial, I chose Allison, a female American English voice. There are thirteen voices in ten variants, but not every language offers both a male and a female voice. If you would like a voice other than those listed, look at other text to speech services.
Type the word/phrase/sentence/text you would like converted to speech in the input box.
You can press > to play the clip in the browser or the Speak button to do the same.
When you are happy with the results, click Download to save the file on your computer. The file will be in MP3 format, which is fine because it is converted to WAV later in this guide.
If the application is not reading your text the way you want, try alternative spellings of words, even if the spelling looks silly. All that matters is a good sounding clip.
You can use SSML to adjust how the American English voices read your text. Speech Synthesis Markup Language (SSML) adds tags that tell a speech engine how you want text read. IBM has documentation here and tags here if you need this capability.
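As a rough illustration of what SSML looks like, the sketch below builds a short SSML string in Python that you could paste into the input box. The helper name ssml_with_pauses is hypothetical, and the only tags used are the standard SSML speak and break elements. Whether a given voice honors a tag is an assumption here, so check IBM's documentation before relying on it.

```python
from xml.sax.saxutils import escape


def ssml_with_pauses(phrases, pause_ms=500):
    """Join phrases into one SSML string with a pause between each phrase.

    Hypothetical helper for illustration; pause_ms is the break length
    in milliseconds.
    """
    pause = '<break time="{}ms"/>'.format(pause_ms)
    # escape() protects characters such as & and < that are special in XML
    body = pause.join(escape(phrase) for phrase in phrases)
    return "<speak>" + body + "</speak>"


print(ssml_with_pauses(["zero", "one", "two"]))
```

Escaping each phrase keeps text such as "AT&T" from breaking the markup.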
Make all the MP3 clips for your entire project and put them in locations which are easily found for the next step.
Converting BlueMix MP3 Files to WAV Files
The conversion is easily done using the open source program Audacity, which is available for Windows, macOS, and Linux. If you do not already have the program installed, you can download it from the official site and install it.
Open Audacity and you'll see the main window.
Use File -> Open and navigate to the voice MP3 file for your first conversion. Here I have opened a file 0.mp3 which contains a recording saying "zero".
You'll notice the BlueMix site generates mono (one channel) audio at 22,050 Hz in 16-bit PCM, which is (usually) exactly what we want, so no changes are needed.
Use File -> Export -> Export as WAV to convert the file to a WAV file.
Audacity opens a Save File dialog. Save the WAV file where you'll remember it on your hard drive.
Repeat the same process for all your MP3 files and you'll have a collection of WAV files suitable for use on microcontroller projects.
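If you want to double check the exported files on your computer before copying them to a board, a short desktop Python script can read each WAV header. This is a minimal sketch using the standard library wave module. The function names are hypothetical, and the expected values (mono, 16-bit, 22,050 Hz) are the ones described above.

```python
import sys
import wave


def describe_wav(path):
    """Return (channels, bits_per_sample, sample_rate) from a WAV header."""
    with wave.open(path, "rb") as wav:
        return wav.getnchannels(), wav.getsampwidth() * 8, wav.getframerate()


def matches_expected(path, channels=1, bits=16, rate=22050):
    """True if the file is in the format this guide expects."""
    return describe_wav(path) == (channels, bits, rate)


if __name__ == "__main__":
    # Usage: python check_wav.py 0.wav 1.wav ...
    for name in sys.argv[1:]:
        status = "ok" if matches_expected(name) else "check format"
        print(name, describe_wav(name), status)
```

The wave module only reads uncompressed PCM WAV files, which is the kind this guide produces.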
You will want to write your code so that your WAV files are easy to use. Adafruit tutorials put small numbers of WAV files in the main (root) directory of the project's CIRCUITPY drive. For more than a few files, make a subdirectory to hold them. For the number voice WAV files used in this guide, a subdirectory called /numbers was created to hold the files.
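One way to keep the files easy to use is to build each path from the digit being spoken. The sketch below assumes the files in /numbers are named 0.wav through 9.wav (matching the 0.mp3 example above), which is an assumption about naming rather than a requirement. The play_files function is a hedged CircuitPython sketch: it assumes your board provides audioio and has a speaker or amplifier on pin A0, so adjust both for your hardware.

```python
def files_for_number(value, folder="/numbers"):
    """Return one WAV path per decimal digit of a non-negative integer.

    Assumes files named 0.wav to 9.wav inside the folder.
    """
    return ["{}/{}.wav".format(folder, digit) for digit in str(value)]


def play_files(paths):
    """Play WAV files in order. Runs only on a CircuitPython board."""
    import board
    import audiocore
    import audioio  # assumption: some boards provide audiopwmio instead

    with audioio.AudioOut(board.A0) as audio:  # assumption: audio out on A0
        for path in paths:
            with open(path, "rb") as wav_file:
                audio.play(audiocore.WaveFile(wav_file))
                while audio.playing:
                    pass  # wait for the clip to finish before the next one


print(files_for_number(2024))
```

On a board, calling play_files(files_for_number(2024)) would speak the digits one at a time.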
If Space is Limited...
All the files above were created by BlueMix at 22,050 Hz. If your WAV files take up a lot of space, the board's CIRCUITPY flash drive might fill up.
If you need additional space, consider creating files at a lower sample rate (lower than 22,050 Hz), perhaps 11,025 Hz or even 8,000 Hz. The fidelity of the voice will suffer as you reduce the sample rate. Before exporting the file to WAV, resample it: in Audacity, use Tracks -> Resample and type in the new rate. Then you can export as described above.
This will result in smaller files with a reduction in sound quality. Going lower than 8,000 Hz is not suggested: a sample rate of 8,000 Hz can only reproduce frequencies up to 4,000 Hz (half the sample rate), which already loses the higher frequencies in a voice.
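To get a feel for the trade-off, the arithmetic can be sketched in a few lines of Python. An uncompressed PCM WAV file holds sample rate x channels x bytes per sample for every second of audio, plus a small header. The 44 byte header is the canonical minimum and is an assumption here, since an exporter may add extra metadata, so treat the results as estimates.

```python
WAV_HEADER_BYTES = 44  # canonical PCM WAV header; real files may be larger


def wav_size_bytes(seconds, sample_rate, channels=1, bits=16):
    """Estimate the size of an uncompressed PCM WAV file in bytes."""
    data_bytes = seconds * sample_rate * channels * bits // 8
    return WAV_HEADER_BYTES + int(data_bytes)


def highest_frequency_hz(sample_rate):
    """Highest frequency a sample rate can represent (half the rate)."""
    return sample_rate / 2


for rate in (22050, 11025, 8000):
    print(rate, "Hz:", wav_size_bytes(1, rate), "bytes per second of audio,",
          "frequencies up to", highest_frequency_hz(rate), "Hz")
```

Halving the sample rate roughly halves the file size, and it also halves the highest frequency the clip can carry.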
Page last edited March 08, 2024