This is a fairly literal port of Phil's C code. Of course, management of the display is quite different since it's different hardware. The major optimizations are using an array of Boolean to track where grains are, and replacing as much multiplication and division as possible with bit shifting. Other than that it is a pretty faithful port of Phil's code. So much so that I've reused his comments verbatim.

The code is complex and not immediately obvious, but the comments do a good job of walking you through it.

It doesn't always act exactly as you might expect, but keep in mind that this isn't a physics simulation, it's more of an approximation. The rules are quite simple and, as Phil said on Show & Tell, it's an example of fairly complex emergent behavior.

I'm quite pleased and impressed with how much I could get CircuitPython to do and how well it performs given the speed and memory constraints it's running under.

Division & multiplication by powers of 2

The optimizations around division and multiplication are based on the mapping between base 10 numbers and binary numbers. Especially where 2 and powers of 2 are concerned.

If we take the number 10 and represent it in binary, we have 00001010 (we'll stick to 8 bits for these examples). If we divide 10 by 2 we get 5. In binary that's 00000101. So if we keep dividing by 2 (integer division)

10        00001010
 5        00000101
 2        00000010
 1        00000001

In decimal it's a free for all without any noticeable pattern. But look at the binary representations. Each time we divide by 2, the digits move one place to the right (with the rightmost one being discarded and a 0 begin shifted into the left.

Another thing to notice is, that since each shift right is a dividing by 2, shifting multiple times divides by a power of 2. Shifting twice is dividing by 4 (2 squared), 3 times is dividing by 8 (2 cubed), etc.  In general, shifting right n digits is dividing by 2 to the power of n.

Multiplying by powers of 2 works the same way, but by shifting left rather than right.

So why is this relevant? Shifting is a lot faster than division. A good compiler (like GCC used by the Arduino IDE) or a hardware integer math unit will make this optimization (i.e. shifting for factors that are powers of 2)  without the programmer having to worry about it. However, CircuitPython and MicroPython don't.

This comes up quite a bit in this code. Grain positions are kept in a coordinate system that has 256 times the resolution of the pixel coordinate system. To get the pixel location of a grain, both x and y need to be divided by 256. Fortunately that's just shifting 8 times.

Remove unnecessary code

One optimization that had a significant impact was getting rid of my Dotstar Featherwing library in favor of using the underlying Dotstar library directly. When it struck me that the display was just being updated all at once, in one place, getting rid of the library was an obvious win.  Not only did it free memory, but it also removed a non-trivial amount of work that was being done each time through the loop.

Consider the Dotstar update code. In early versions I started by clearing the display out of habit. That's completely unnecessary when it's updating each pixel. Look for easy wins like this. The code should only do what it absolutely needs to. Anything more is wasting memory and/or time.

I expect I could get another jump in performance if I stripped down the Dotstar library and made a version customized for this application. One that did just what is required.

Optimize carefully

This was a good exercise in optimization. The first version of the python code could barely move 5 grains around in anything approaching a smooth speed. The current version does a reasonable job with 10 grains.

I do want to warn you about trying to optimize in an all or nothing way. Be very deliberate and cautious. Pick one thing to optimize and do it.  Measure the results. If it didn't help significantly, remove it and try something else. The thing about optimizing is that it generally makes the code harder to understand so only do optimizations that make a difference.

The code

Here is the code in its entirety.

# The MIT License (MIT)
# Copyright (c) 2018 Dave Astels
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.

Ported from the C code writen by Phillip Burgess
as used in
Explainatory comments are used verbatim from that code.

import math
import random

import adafruit_dotstar
import adafruit_lsm303
import board
import busio

N_GRAINS = 10  # Number of grains of sand
WIDTH = 12  # Display width in pixels
HEIGHT = 6  # Display height in pixels
MAX_FPS = 45  # Maximum redraw rate, frames/second
GRAIN_COLOR = (0, 0, 16)
MAX_X = WIDTH * 256 - 1
MAX_Y = HEIGHT * 256 - 1

class Grain:
    """A simple struct to hold position and velocity information
    for a single grain."""

    def __init__(self):
        """Initialize grain position and velocity."""
        self.x = 0
        self.y = 0
        self.vx = 0
        self.vy = 0

grains = [Grain() for _ in range(N_GRAINS)]
i2c = busio.I2C(board.SCL, board.SDA)
sensor = adafruit_lsm303.LSM303(i2c)
wing = adafruit_dotstar.DotStar(
    board.D13, board.D11, WIDTH * HEIGHT, 0.25, False)

oldidx = 0
newidx = 0
delta = 0
newx = 0
newy = 0

occupied_bits = [False for _ in range(WIDTH * HEIGHT)]

def index_of_xy(x, y):
    """Convert an x/column and y/row into an index into
    a linear pixel array.

    :param int x: column value
    :param int y: row value
    return (y >> 8) * WIDTH + (x >> 8)

def already_present(limit, x, y):
    """Check if a pixel is already used.

    :param int limit: the index into the grain array of
    the grain being assigned a pixel Only grains already
    allocated need to be checks against.
    :param int x: proposed clumn value for the new grain
    :param int y: proposed row valuse for the new grain
    for j in range(limit):
        if x == grains[j].x or y == grains[j].y:
            return True
    return False

for g in grains:
    placed = False
    while not placed:
        g.x = random.randint(0, WIDTH * 256 - 1)
        g.y = random.randint(0, HEIGHT * 256 - 1)
        placed = not occupied_bits[index_of_xy(g.x, g.y)]
    occupied_bits[index_of_xy(g.x, g.y)] = True
    g.vx = 0
    g.vy = 0

while True:
    # Display frame rendered on prior pass.  It's done immediately after the
    # FPS sync (rather than after rendering) for consistent animation timing.

    for i in range(NUMBER_PIXELS):
        wing[i] = GRAIN_COLOR if occupied_bits[i] else (0, 0, 0)

    # Read accelerometer...
    f_x, f_y, f_z = sensor.raw_acceleration
    ax = f_x >> 8  # Transform accelerometer axes
    ay = f_y >> 8  # to grain coordinate space
    az = abs(f_z) >> 11  # Random motion factor
    az = 1 if (az >= 3) else (4 - az)  # Clip & invert
    ax -= az  # Subtract motion factor from X, Y
    ay -= az
    az2 = (az << 1) + 1  # Range of random motion to add back in

    # ...and apply 2D accel vector to grain velocities...
    v2 = 0  # Velocity squared
    v = 0.0  # Absolute velociy
    for g in grains:
        g.vx += ax + random.randint(0, az2)  # A little randomness makes
        g.vy += ay + random.randint(0, az2)  # tall stacks topple better!

        # Terminal velocity (in any direction) is 256 units -- equal to
        # 1 pixel -- which keeps moving grains from passing through each other
        # and other such mayhem.  Though it takes some extra math, velocity is
        # clipped as a 2D vector (not separately-limited X & Y) so that
        # diagonal movement isn't faster

        v2 = g.vx * g.vx + g.vy * g.vy
        if v2 > 65536:  # If v^2 > 65536, then v > 256
            v = math.floor(math.sqrt(v2))  # Velocity vector magnitude
            g.vx = (g.vx // v) << 8  # Maintain heading
            g.vy = (g.vy // v) << 8  # Limit magnitude

    # ...then update position of each grain, one at a time, checking for
    # collisions and having them react.  This really seems like it shouldn't
    # work, as only one grain is considered at a time while the rest are
    # regarded as stationary.  Yet this naive algorithm, taking many not-
    # technically-quite-correct steps, and repeated quickly enough,
    # visually integrates into something that somewhat resembles physics.
    # (I'd initially tried implementing this as a bunch of concurrent and
    # "realistic" elastic collisions among circular grains, but the
    # calculations and volument of code quickly got out of hand for both
    # the tiny 8-bit AVR microcontroller and my tiny dinosaur brain.)

    for g in grains:
        newx = g.x + g.vx  # New position in grain space
        newy = g.y + g.vy
        if newx > MAX_X:  # If grain would go out of bounds
            newx = MAX_X  # keep it inside, and
            g.vx //= -2  # give a slight bounce off the wall
        elif newx < 0:
            newx = 0
            g.vx //= -2
        if newy > MAX_Y:
            newy = MAX_Y
            g.vy //= -2
        elif newy < 0:
            newy = 0
            g.vy //= -2

        oldidx = index_of_xy(g.x, g.y)  # prior pixel
        newidx = index_of_xy(newx, newy)  # new pixel
        # If grain is moving to a new pixel...
        if oldidx != newidx and occupied_bits[newidx]:
            # but if that pixel is already occupied...
            # What direction when blocked?
            delta = abs(newidx - oldidx)
            if delta == 1:  # 1 pixel left or right
                newx = g.x  # cancel x motion
                # and bounce X velocity (Y is ok)
                g.vx //= -2
                newidx = oldidx  # no pixel change
            elif delta == WIDTH:  # 1 pixel up or down
                newy = g.y  # cancel Y motion
                # and bounce Y velocity (X is ok)
                g.vy //= -2
                newidx = oldidx  # no pixel change
            else:  # Diagonal intersection is more tricky...
                # Try skidding along just one axis of motion if
                # possible (start w/ faster axis). Because we've
                # already established that diagonal (both-axis)
                # motion is occurring, moving on either axis alone
                # WILL change the pixel index, no need to check
                # that again.
                if abs(g.vx) > abs(g.vy):  # x axis is faster
                    newidx = index_of_xy(newx, g.y)
                    # that pixel is free, take it! But...
                    if not occupied_bits[newidx]:
                        newy = g.y  # cancel Y motion
                        g.vy //= -2  # and bounce Y velocity
                    else:  # X pixel is taken, so try Y...
                        newidx = index_of_xy(g.x, newy)
                        # Pixel is free, take it, but first...
                        if not occupied_bits[newidx]:
                            newx = g.x  # Cancel X motion
                            g.vx //= -2  # Bounce X velocity
                        else:  # both spots are occupied
                            newx = g.x  # Cancel X & Y motion
                            newy = g.y
                            g.vx //= -2  # Bounce X & Y velocity
                            g.vy //= -2
                            newidx = oldidx  # Not moving
                else:  # y axis is faster. start there
                    newidx = index_of_xy(g.x, newy)
                    # Pixel's free! Take it! But...
                    if not occupied_bits[newidx]:
                        newx = g.x  # Cancel X motion
                        g.vx //= -2  # Bounce X velocity
                    else:  # Y pixel is taken, so try X...
                        newidx = index_of_xy(newx, g.y)
                        # Pixel is free, take it, but first...
                        if not occupied_bits[newidx]:
                            newy = g.y  # cancel Y motion
                            g.vy //= -2  # and bounce Y velocity
                        else:  # both spots are occupied
                            newx = g.x  # Cancel X & Y motion
                            newy = g.y
                            g.vx //= -2  # Bounce X & Y velocity
                            g.vy //= -2
                            newidx = oldidx  # Not moving
        occupied_bits[oldidx] = False
        occupied_bits[newidx] = True
        g.x = newx
        g.y = newy
Last updated on Jan 23, 2018 Published on Jan 23, 2018