In this simple benchmark, we compare two ways of finding the amplitude of a signal.   normalized_rms computes it in a traditional way, handling each number one by one in Python code.  normalized_rms_ulab computes more quickly by working on groups of numbers in a ulab array at the same time.  ulab.numpy.std computes most quickly by moving all operations from Python to ulab.

# SPDX-FileCopyrightText: 2020 Jeff Epler for Adafruit Industries
# SPDX-FileCopyrightText: 2020 Zoltán Vörös for Adafruit Industries
#
# SPDX-License-Identifier: MIT

import time
import math
from ulab import numpy as np

def mean(values):
    return sum(values) / len(values)

def normalized_rms(values):
    minbuf = int(mean(values))
    samples_sum = sum(
        float(sample - minbuf) * (sample - minbuf)
        for sample in values
    )

    return math.sqrt(samples_sum / len(values))

def normalized_rms_ulab(values):
    # this function works with ndarrays only
    minbuf = np.mean(values)
    values = values - minbuf
    samples_sum = np.sum(values * values)
    return math.sqrt(samples_sum / len(values))

# Instead of using sensor data, we generate some data
# The amplitude is 5000 so the rms should be around 5000/1.414 = 3536
nums_list = [int(8000 + math.sin(i) * 5000) for i in range(100)]
nums_array = np.array(nums_list)

def timeit(s, f, n=100):
    t0 = time.monotonic_ns()
    for _ in range(n):
        x = f()
    t1 = time.monotonic_ns()
    r = (t1 - t0) * 1e-6 / n
    print("%-30s : %8.3fms [result=%f]" % (s, r, x))

print("Computing the RMS value of 100 numbers")
timeit("traditional", lambda: normalized_rms(nums_list))
timeit("ulab, with ndarray, some implementation in python", lambda: normalized_rms_ulab(nums_array))
timeit("ulab only, with list", lambda: np.std(nums_list))
timeit("ulab only, with ndarray", lambda: np.std(nums_array))

On my Metro M4, the ulab code computes almost exactly the same value, but over 40 times faster. Take care if running this on a board with an LCD display such as the CLUE, printing the results on the display takes a lot longer than the computation itself, and completely distorts the results.

traditional               :    2.951ms [result=3535.843611]
ulab, algorithm in python :    0.251ms [result=3535.853624]
ulab only, with list      :    0.336ms [result=3535.854340]
ulab only, with ndarray   :    0.068ms [result=3535.854340]

This guide was first published on Mar 06, 2020. It was last updated on Mar 06, 2020.

This page (A Simple Benchmark) was last updated on Oct 21, 2021.

Text editor powered by tinymce.