# Python Edge Speech Recognition with Voice2JSON

## Overview

![](https://cdn-learn.adafruit.com/assets/assets/000/102/718/medium800/raspberry_pi_Main_Image.jpg?1623175813)

Many of today's reliable speech recognition systems, such as Amazon Alexa or Google Assistant, connect to the internet and remote servers to process speech data. However, with [Voice2JSON](http://voice2json.org/), you can have your speech data processed right on your Raspberry Pi. This is called edge computing.

[Voice2JSON](http://voice2json.org/) works on top of existing speech recognition engines. This guide will be using a profile based on PocketSphinx, which is maintained by Carnegie Mellon University, because of its large vocabulary.

[Voice2JSON](http://voice2json.org/) is a speech recognition tool for listening to speech and translating that to an intent. This allows your program to easily respond to the intent instead of worrying about the specific syntax of what was spoken. For instance, if somebody says "set light to green" instead of "change light to green" then it will still work.

While [Voice2JSON](http://voice2json.org/) was not designed to be run from Python, this guide does so anyway by launching the executable with subprocess commands and monitoring the JSON output. By leveraging Blinka, the CircuitPython compatibility layer, some powerful things can be done with the Adafruit BrainCraft HAT.

## Parts
### Adafruit BrainCraft HAT - Machine Learning for Raspberry Pi 4

[Adafruit BrainCraft HAT - Machine Learning for Raspberry Pi 4](https://www.adafruit.com/product/4374)
The idea behind the BrainCraft HAT is that you’d be able to “craft brains” for Machine Learning on the EDGE, with Microcontrollers & Microcomputers. On ASK AN ENGINEER, our founder & engineer chatted with Pete Warden, the technical lead of the mobile,...

In Stock
[Buy Now](https://www.adafruit.com/product/4374)
[Related Guides to the Product](https://learn.adafruit.com/products/4374/guides)
![Video of a white hand hovering a coffee mug over an Adafruit BrainCraft HAT that's connected to a Raspberry Pi 4. The display detects that it's a coffee mug.](https://cdn-shop.adafruit.com/product-videos/640x480/4374-00.jpg)

### Raspberry Pi 4 Model B - 2 GB RAM

[Raspberry Pi 4 Model B - 2 GB RAM](https://www.adafruit.com/product/4292)
The Raspberry Pi 4 Model B is the newest Raspberry Pi computer made, and the Pi Foundation knows you can always make a good thing _better_! And what could make the Pi 4...

Out of Stock
[Buy Now](https://www.adafruit.com/product/4292)
[Related Guides to the Product](https://learn.adafruit.com/products/4292/guides)
![Angled shot of Raspberry Pi 4](https://cdn-shop.adafruit.com/640x480/4292-03.jpg)

You will need 2 of these speakers:

### Mono Enclosed Speaker - 3W 4 Ohm

[Mono Enclosed Speaker - 3W 4 Ohm](https://www.adafruit.com/product/3351)
Listen up! This 2.8" x 1.2" speaker is a great addition to any audio project where you need 4 ohm impedance and 3W or less of power. We particularly like this speaker as it is small and enclosed for good audio volume and quality. It has a handy JST 2PH...

In Stock
[Buy Now](https://www.adafruit.com/product/3351)
[Related Guides to the Product](https://learn.adafruit.com/products/3351/guides)
![Enclosed Speaker with JST cable](https://cdn-shop.adafruit.com/640x480/3351-01.jpg)

### Official Raspberry Pi Power Supply 5.1V 3A with USB C

[Official Raspberry Pi Power Supply 5.1V 3A with USB C](https://www.adafruit.com/product/4298)
The official Raspberry Pi USB-C power supply is here! And of course, we have 'em in classic Adafruit black! Superfast with just the right amount of cable length to get your Pi 4 projects up and running!

Best for use with Pi 4 series, [Pi...](https://www.adafruit.com/product/5814)

In Stock
[Buy Now](https://www.adafruit.com/product/4298)
[Related Guides to the Product](https://learn.adafruit.com/products/4298/guides)
![Angled shot of Official Raspberry Pi Power Supply 5.1V 3A with USB C with Power plug facing down. ](https://cdn-shop.adafruit.com/640x480/4298-04.jpg)

## Raspberry Pi Setup

First, set up all of the packages on the Raspberry Pi. If you haven't done so already, take a look at the [Adafruit BrainCraft HAT - Easy Machine Learning for Raspberry Pi](https://learn.adafruit.com/adafruit-braincraft-hat-easy-machine-learning-for-raspberry-pi) guide if you are using the BrainCraft HAT.

This will take you through all the steps needed to get the Raspberry Pi updated and the BrainCraft HAT set up to the point needed to continue. However, skip the Display Module Setup portion, since you will be using Python to draw to the display.

Warning: Be sure you have some speakers hooked up to the BrainCraft HAT, either through the JST ports on the front or the headphone jack. You will need these later for the speech synthesis.

![](https://cdn-learn.adafruit.com/assets/assets/000/102/544/medium800/raspberry_pi_internet_of_things___iot_braincraft.jpeg?1622652784)

## Set your Timezone

If you haven't done so already, be sure your timezone is set correctly. A freshly set up Raspberry Pi usually defaults to GMT. You can change it by typing:

`sudo raspi-config`

Select **Localisation Options.**

![raspberry_pi_Raspi-Config_Localisation.png](https://cdn-learn.adafruit.com/assets/assets/000/102/706/medium640/raspberry_pi_Raspi-Config_Localisation.png?1623172403)

Then select **Timezone**. This will take you to a section where you can select your timezone. The organization is a bit unusual. For instance, if you were in the US Pacific timezone, you would select **US** and then **Pacific Ocean**.

![raspberry_pi_Raspi-Config_Timezone.png](https://cdn-learn.adafruit.com/assets/assets/000/102/707/medium640/raspberry_pi_Raspi-Config_Timezone.png?1623172453)

This will ensure that it is able to tell you the correct time. You can find more information about using **raspi-config** in the [official documentation](https://www.raspberrypi.org/documentation/configuration/raspi-config.md).

## Install Voice2JSON

Installing Voice2JSON is fairly straightforward. First you need to install some prerequisites by running the following command:

```terminal
sudo apt-get install libasound2 libasound2-data libasound2-plugins
```

![](https://cdn-learn.adafruit.com/assets/assets/000/102/708/medium800/raspberry_pi_libsound2_install.png?1623172681)

Next verify you are on the armhf architecture by typing:

```terminal
dpkg-architecture | grep DEB_BUILD_ARCH=
```

![](https://cdn-learn.adafruit.com/assets/assets/000/102/709/medium800/raspberry_pi_architecture.png?1623172826)

Next, download the Voice2JSON package with **wget** and install it:

```terminal
wget https://github.com/synesthesiam/voice2json/releases/download/v2.0/voice2json_2.0_armhf.deb
sudo apt install ./voice2json_2.0_armhf.deb
```

![](https://cdn-learn.adafruit.com/assets/assets/000/102/710/medium800/raspberry_pi_Install_Voice2JSON.png?1623173124)

## Speech Synthesis Library

Voice2JSON is also capable of speech synthesis, so it's helpful to have the eSpeak NG library installed:

```terminal
sudo apt-get install espeak-ng
```

![](https://cdn-learn.adafruit.com/assets/assets/000/102/711/medium800/raspberry_pi_Espeak-ng_Install.png?1623173250)

## Install a Profile

Voice2JSON uses profiles in order to combine a language with a speech recognition engine. The profile is not included as part of the package installation, so you will need to install that separately. Though there are many additional profiles, this setup installs the US English/PocketSphinx profile using the following commands:

```terminal
mkdir -p ~/.config/voice2json
curl -SL https://github.com/synesthesiam/en-us_pocketsphinx-cmu/archive/v1.0.tar.gz | tar -C ~/.config/voice2json --skip-old-files --strip-components=1 -xzvf -
```

![](https://cdn-learn.adafruit.com/assets/assets/000/102/712/medium800/raspberry_pi_Profile_Install.png?1623173367)

## Latest Pillow Library

The demo project uses `displayio`, which uses Pillow, the friendly fork of the Python Imaging Library, underneath. To get the latest version of Pillow, upgrade pip and then install Pillow with the following commands:

```terminal
python3 -m pip install --upgrade pip
python3 -m pip install --upgrade Pillow
```

![](https://cdn-learn.adafruit.com/assets/assets/000/102/713/medium800/raspberry_pi_Pillow_Install.png?1623173449)

## CircuitPython Libraries

A few CircuitPython libraries are needed for this project. These can be easily installed through PIP using the following command:

```terminal
python3 -m pip install adafruit-circuitpython-st7789 adafruit-circuitpython-dotstar
```

![](https://cdn-learn.adafruit.com/assets/assets/000/102/714/medium800/raspberry_pi_CP_Lib_Install.png?1623173689)

After that finishes, you should be ready to configure your setup.

## Configuring Custom Commands

To configure custom commands, you will need to edit the **sentences.ini** file located inside the `~/.config/voice2json` folder. You can find a lot more information in the [official documentation](http://voice2json.org/sentences.html). The demo code covers only a subset of the available features.

Go ahead and make sure your **sentences.ini** file looks like the following:

https://github.com/adafruit/Adafruit_Learning_System_Guides/blob/main/Voice2Json_Edge_Detection/sentences.ini

There are three different intents here, each of which can be phrased in a few different ways. For the **GetTime** intent, several wordings will yield the same result.

For the **ChangeLightColor** intent, this uses a few list variables to limit the responses to a manageable set. It also uses some optional words to give more flexibility to the phrasing.

For the **DisplayPicture** intent, this is similar to the previous intent, except that the **type** list does not return a value, since the value doesn't matter for the demo. It just makes it easier to expand the vocabulary without lots of extra typing.
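
As a rough illustration of the template syntax (this is a simplified, hypothetical file, not the exact one from the repo): each `[Section]` is an intent, square brackets mark optional words, `(a | b)` marks alternatives, `name = ...` defines a reusable rule referenced as `<name>`, and `{slot}` tags a captured value.

```ini
[GetTime]
what time is it
tell me the time

[ChangeLightColor]
light_name = (left | middle | right){lightname}
color = (red | green | blue){color}
set [the] <light_name> light [to] <color>
```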

## Train the Profile

Before you can run the demo, you will need to train your profile. You can do that by running the following command:

```terminal
voice2json train-profile
```

![](https://cdn-learn.adafruit.com/assets/assets/000/102/716/medium800/raspberry_pi_Train_Profile.png?1623174995)

It usually only takes about 5 seconds on a Raspberry Pi 4. You can ignore the warning about the missing word. It just means that the word didn't exist in the dictionary, but Voice2JSON seems to have no trouble recognizing it.

## Demo Code

To download code and libraries to your Raspberry Pi, click the **Download Project Bundle** button below to get the code and other project files as a zip file.

https://github.com/adafruit/Adafruit_Learning_System_Guides/blob/main/Voice2Json_Edge_Detection/demo.py

If you haven't already done so, be sure to copy **sentences.ini** over to the `~/.config/voice2json` folder that was created during the Profile Installation step in the Raspberry Pi Setup.

The **lib** folder can be ignored since you should have already installed the necessary libraries during the Raspberry Pi Setup.

The **demo.py** file and **images** folder should be uploaded to the home folder of your Raspberry Pi. Once you have everything in place, start it up by typing:

```terminal
python3 demo.py
```

## Use
Wait until it says **Ready** in the terminal and then you can begin speaking. You can ask it to tell you the time, to display a **cat** or **adafruit** image, or to set the left, middle, or right lights to **red**, **orange**, **yellow**, **green**, **blue**, **purple**, **white**, or **off**.

![](https://cdn-learn.adafruit.com/assets/assets/000/102/717/medium800/raspberry_pi_Terminal_Output.png?1623175331)

Here's a video of the demo in action:

https://youtu.be/p5QJj1aCjnQ

## Demo Code Walkthrough

Next we're going to go over the demo code section by section. First, you'll notice quite a few imports. Most of these are standard Python imports; here's the purpose of each:

- **os** and **random** are used to list the image files and folders and to pick one at random.
- **subprocess** is used to run the Voice2JSON executable from within Python.
- **json** is used to decode the output from Voice2JSON.
- **re** is the regular expression library; it converts the intent names to proper Python function names, which makes adding new intents easier.
- **datetime** is used to get the current date and time.
- **displayio** is the display library, which was rewritten to run with Blinka.
- The remaining libraries are CircuitPython libraries for accessing various BrainCraft accessories.

```python
import os
import subprocess
import random
import json
import re
from datetime import datetime
import board
import displayio
import fourwire
import adafruit_dotstar
from adafruit_st7789 import ST7789
```

Next are a few settings including the name of the images folder, the Voice2JSON commands for listening and speaking, and a pre-compiled regular expression pattern to speed things up a bit.

```python
IMAGE_FOLDER = "images"

listen_command = "/usr/bin/voice2json transcribe-stream | /usr/bin/voice2json recognize-intent"
speak_command = "/usr/bin/voice2json speak-sentence '{}'"
pattern = re.compile(r'(?<!^)(?=[A-Z])')
```
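
As a quick sketch of what this pattern does later in the demo: it matches the zero-width position before each interior capital letter, so substituting an underscore there turns a CamelCase intent name into a snake_case function name.

```python
import re

# Zero-width match before any capital letter that is not at the start
pattern = re.compile(r'(?<!^)(?=[A-Z])')

print(pattern.sub('_', 'ChangeLightColor').lower())  # change_light_color
print(pattern.sub('_', 'GetTime').lower())           # get_time
```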

After that, the DotStars are initialized and set to off.

```python
dots = adafruit_dotstar.DotStar(board.D6, board.D5, 3, brightness=0.2, pixel_order=adafruit_dotstar.RBG)
dots.fill(0)
```

Then there are a couple of data structures used to provide meaning to the recognized values. The colors will translate the name to the actual value displayed on the DotStar. Altering these values would change the color displayed.

The lights list is just to give positional information to the name. Altering these values would change the order that it thinks the DotStars are in.

```python
colors = {
    'red': 0xff0000,
    'green': 0x00ff00,
    'blue': 0x0000ff,
    'yellow': 0xffff00,
    'orange': 0xff8800,
    'purple': 0x8800ff,
    'white': 0xffffff,
    'off': 0
}

lights = ['left', 'middle', 'right']
```

Next is the `get_time()` function. It reads the current time, formats it, and passes the result to the `speak()` function, which is covered in more detail further down.

```python
def get_time():
    now = datetime.now()
    speak("The time is {}".format(now.strftime("%-I:%M %p")))
```
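
For example, here's what that format string produces for a fixed, made-up timestamp. Note that `%-I`, which strips the hour's leading zero, is a glibc extension: it works on the Pi's Linux but not on Windows.

```python
from datetime import datetime

# A fixed, hypothetical time: 3:05 in the afternoon
now = datetime(2021, 6, 8, 15, 5)
print("The time is {}".format(now.strftime("%-I:%M %p")))  # The time is 3:05 PM
```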

Next up are the picture display functions. The `display_picture()` function is the main handler and starts off by getting the full path to the image folder and passing it into `get_random_file()`.

The `get_random_file()` function does exactly what it sounds like: it returns a random file from inside the specified path. It starts off by getting all files and filters them down to image files. Then it selects one at random and returns it.

The `load_image()` function will clear any existing sprite from the main splash group and then create a new sprite from the image located at the specified path and display it on the screen.

```python
def display_picture(category):
    path = os.getcwd() + "/" + IMAGE_FOLDER + "/" + category
    print("Showing a random image from {}".format(category))
    load_image(path + "/" + get_random_file(path))

def get_random_file(folder):
    filenames = []
    for item in os.listdir(folder):
        if os.path.isfile(os.path.join(folder, item)) and item.endswith((".jpg", ".bmp", ".gif")):
            filenames.append(item)
    if len(filenames):
        return filenames[random.randrange(len(filenames))]
    return None

def load_image(path):
    "Load an image from the path"
    if len(splash):
        splash.pop()
    # CircuitPython 6 & 7 compatible
    bitmap = displayio.OnDiskBitmap(open(path, "rb"))
    sprite = displayio.TileGrid(bitmap, pixel_shader=getattr(bitmap, 'pixel_shader', displayio.ColorConverter()))

    # # CircuitPython 7+ compatible
    # bitmap = displayio.OnDiskBitmap(path)
    # sprite = displayio.TileGrid(bitmap, pixel_shader=bitmap.pixel_shader)

    splash.append(sprite)
```

The `change_light_color()` function is the handler for changing lights. It will change the light at the specified position to the specified color. It starts off by figuring out the DotStar index by looking up the position name, then sets the DotStar at that index to the corresponding color value.

```python
def change_light_color(lightname, color):
    dotstar_number = lights.index(lightname)
    dots[dotstar_number] = colors[color]
    print("Setting Dotstar {} to 0x{:06X}".format(dotstar_number, colors[color]))
```

Finally, there's the `speak()` function, which simply takes the value of `speak_command`, substitutes in the specified text, and runs the command using the `shell_command()` function.

```python
def speak(sentence):
    for output_line in shell_command(speak_command.format(sentence)):
        print(output_line, end='')
```

The `shell_command()` function is where a lot of the magic happens. It is responsible for running the given command in a subprocess and returning any output. It will keep running and yielding output until the subprocess has completely finished running.

```python
def shell_command(cmd):
    popen = subprocess.Popen(cmd, stdout=subprocess.PIPE, shell=True, universal_newlines=True)
    for stdout_line in iter(popen.stdout.readline, ""):
        yield stdout_line 
    popen.stdout.close()
    return_code = popen.wait()
    if return_code:
        raise subprocess.CalledProcessError(return_code, cmd)
```

The `process_output()` function is the main JSON processing function. It starts off by decoding the JSON and making sure to only process if a timeout hadn't occurred, which happens regularly from Voice2JSON when no speech is detected.

If it detects a genuine recognition, it takes the given **Intent** name and converts it to a Python function name. All of the intent names are defined in the **sentences.ini** file. It checks that a function with that name has been defined by looking in `globals()` and calls it if so.

The slots are the parameters that are defined in the **sentences.ini** file in curly braces. The `**` operator (double asterisk) is used for keyword argument unpacking and will pass the parameters in as function arguments automatically. The parameter and function argument names must match.

```python
def process_output(line):
    data = json.loads(line)
    if not data['timeout'] and data['intent']['name']:
        func_name = pattern.sub('_', data['intent']['name']).lower()
        if func_name in globals():
            globals()[func_name](**data['slots'])
```
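
Here's a minimal sketch of that unpacking step, using hypothetical slot values and a stand-in handler:

```python
# Stand-in for the real handler; returns the message instead of setting a DotStar
def change_light_color(lightname, color):
    return "Setting {} light to {}".format(lightname, color)

# Mirrors the structure of the "slots" object in Voice2JSON's output
slots = {'lightname': 'left', 'color': 'red'}
print(change_light_color(**slots))  # Setting left light to red
```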

The next bit of code is standard displayio setup for the display on the BrainCraft HAT.

```python
displayio.release_displays()
spi = board.SPI()
tft_cs = board.CE0
tft_dc = board.D25
tft_lite = board.D26

display_bus = fourwire.FourWire(spi, command=tft_dc, chip_select=tft_cs)

display = ST7789(
    display_bus,
    width=240,
    height=240,
    rowstart=80,
    rotation=180,
    backlight_pin=tft_lite,
)

splash = displayio.Group()
display.root_group = splash
```

Finally, here is the code that runs the main `listen_command` and processes the output. If you've been programming in Python for a while, you may notice there is no main `while` loop. That's because Voice2JSON runs inside a subprocess with its own main loop, so one isn't necessary here.

```python
for output_line in shell_command(listen_command):
    process_output(output_line)
```



## Related Guides

- [Raspberry Pi Care and Troubleshooting](https://learn.adafruit.com/raspberry-pi-care-and-troubleshooting.md)
- [Adafruit BrainCraft HAT - Easy Machine Learning for Raspberry Pi](https://learn.adafruit.com/adafruit-braincraft-hat-easy-machine-learning-for-raspberry-pi.md)
- [Using Google Assistant on the BrainCraft HAT or Voice Bonnet](https://learn.adafruit.com/using-google-assistant-on-the-braincraft-hat.md)
- [Raspberry Pi YouTube Boombox](https://learn.adafruit.com/youtube-radio.md)
- [Basic TensorFlow Object Recognition on any Computer or iOS device with Google Colab](https://learn.adafruit.com/basic-tensorflow-object-recognition-in-the-cloud-google-colab.md)
- [Build an ML Package Detector with Lobe](https://learn.adafruit.com/build-an-ml-package-detector.md)
- [Running TensorFlow Lite Object Recognition on the Raspberry Pi 4 or Pi 5](https://learn.adafruit.com/running-tensorflow-lite-on-the-raspberry-pi-4.md)
- [BrainCraft Camera Case](https://learn.adafruit.com/braincraft-camera-case.md)
- [Build an ML Rock Paper Scissors Game with Lobe](https://learn.adafruit.com/lobe-rock-paper-scissors.md)
- [diy lofi hip hop raspberry pi radio](https://learn.adafruit.com/lofi-hip-hop-raspberry-pi-radio-braincraft.md)
- [Raspberry Pi + Teachable Machine = Teachable Pi](https://learn.adafruit.com/teachable-machine-raspberry-pi-tensorflow-camera.md)
- [Machine Learning 101 with Lobe and BrainCraft](https://learn.adafruit.com/machine-learning-101-lobe-braincraft.md)
- [CircuitPython on the Arduino Nano RP2040 Connect](https://learn.adafruit.com/circuitpython-on-the-arduino-nano-rp2040-connect.md)
- [Adafruit NeoRGB Stemma](https://learn.adafruit.com/adafruit-neorgb-stemma.md)
- [NeoPixel Run LED Arcade Game](https://learn.adafruit.com/pixel-chase-game.md)
- [PyPortal Pet Planter with Adafruit IO](https://learn.adafruit.com/pyportal-pet-planter-with-adafruit-io.md)
