The Adafruit_Protomatter repository on GitHub contains all source code for the project. Create a new branch to make merging pull requests easier later.

Files there include:


arch.h

With determination and a little luck, this is the only file to edit to add support for a new device. Several #defines and functions are declared here, tying device-specific registers and peripherals to library-known names.


core.c

Like caramel between the crisp cookie center and delicious chocolate, this middle layer works between the low-level details set in arch.h and the higher-level functions provided by the Arduino or CircuitPython front-ends. It provides essential matrix-related operations without touching hardware-specific features. With rare exceptions, you shouldn’t need to edit this.


core.h

Enumerations, structures and function prototypes for core.c. You shouldn’t need to edit this.


Adafruit_Protomatter.cpp

This is the Arduino “wrapper” for the lower-level matrix code. It works together with the Adafruit_GFX library to provide a common syntax for drawing shapes, text, etc. It shouldn’t require editing.


Adafruit_Protomatter.h

Header file to accompany Adafruit_Protomatter.cpp. Arduino sketches should #include this. Again, this file shouldn’t require editing.

The remaining files in the repository are for GitHub automation, the Arduino Library Manager, Arduino examples and so forth. These probably won’t need editing, except for an occasional bump to the version number in

Expanding arch.h For a New Part

Inside arch.h, you’ll find whole sections of code conditionally compiled inside #if defined (and corresponding #endif) statements. You’ll want to add a new conditional check for your device: neither too specific (there’s usually some #define that broadly relates to a family of related devices), nor so vague that it accidentally compiles on an incompatible chip.

For example, anything in the Microchip SAMD51 family is covered by this:

#if defined(__SAMD51__)
  ...a whole bunch of code...
#endif // end __SAMD51__

Within each device family section, it’s further divided by (usually) a pair of #if defined checks, one to test if we’re compiling in the Arduino environment, another if CircuitPython…perhaps others in the future:

#if defined(ARDUINO)
  ...a whole bunch of code...
#elif defined(CIRCUITPY)
  ...a whole bunch of code...
#endif // end ARDUINO / CIRCUITPY

This is because each environment has different convenience functions for certain operations. Arduino, for example, provides digitalWrite() for GPIO output. CircuitPython handles this differently, usually with functions distinct to each architecture. The Protomatter code builds on these common operations. Additionally, in the Arduino setting, the library is usually attached to a specific timer/counter peripheral, set at compile-time, whereas in CircuitPython timers are a dynamically-allocated resource…no telling what timer you’re using until run-time.

THEREFORE, each architecture and environment is expected to establish a known and fixed set of macros or functions providing these operations. core.c, which #includes arch.h, then goes about its business using only those known function names, never having to refer to device-specific hardware.
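As a rough sketch of how the nested checks described above might look for a new part, the fragment below guards on a placeholder family macro and then on the environment. MY_NEW_CHIP_FAMILY and EXAMPLE_ENV_NAME are made up for illustration (a real port would test whatever #define the toolchain provides for that family), and the 48 MHz value is only an assumption. The ARDUINO define at the top exists purely so the fragment compiles on a desktop compiler.

```c
/* Hypothetical stand-ins so this sketch compiles on a desktop compiler.
   On real hardware the toolchain or board package defines these. */
#define MY_NEW_CHIP_FAMILY 1 /* placeholder for something like __SAMD51__ */
#if !defined(ARDUINO) && !defined(CIRCUITPY)
#define ARDUINO 10819 /* pretend we're in the Arduino environment */
#endif

#if defined(MY_NEW_CHIP_FAMILY)

/* Settings shared by every environment on this chip family go here. */
#define _PM_timerFreq 48000000 /* assumed 48 MHz timer source clock */

#if defined(ARDUINO)
/* Arduino-specific macros and functions would go here. */
#define EXAMPLE_ENV_NAME "Arduino" /* illustrative only, not a library name */
#elif defined(CIRCUITPY)
/* CircuitPython-specific macros and functions would go here. */
#define EXAMPLE_ENV_NAME "CircuitPython"
#endif /* end ARDUINO / CIRCUITPY */

#endif /* end MY_NEW_CHIP_FAMILY */

/* Code in the style of core.c sees only the agreed-upon names. */
static const char *example_env_name(void) { return EXAMPLE_ENV_NAME; }
```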

There are three groups of macros and functions: one related to GPIO, one related to timers, and one miscellaneous category. With just a few exceptions where noted, the following macros or functions are required.

The macros/functions are all prefixed with _PM_ (for Protomatter), sort of a brute-force namespacing of things (to reduce likelihood of collisions with user code) since we’re in simple C here.

GPIO-Related Macros/Functions

“Pin numbers,” as described here, refer to an environment’s particular indexing of pins, which might not map directly to a device’s PORTs and bits. Arduino digital and analog pins, for example, might really be scattered all over the place, but are exposed to the programmer as a tidy sequential list starting from zero. Other environments may have their own numbering system…but it’s always assumed there’s some numbering system. If not, you’ll need to make one up.


Return (void *) address of PORT OUT register corresponding to a pin number. Code calling this can cast it to whatever type’s needed (usually a volatile uint32_t *).


_PM_portSetRegister(pin)

Return (void *) address of PORT atomic bit-set register corresponding to a pin number. Again, calling code can cast as needed.

“Atomic” refers to an operation that is uninterrupted and irreducibly self-contained — not a read-modify-write sequence. Most modern microcontrollers distinguish the PORT OUT register from SET and CLEAR.


_PM_portClearRegister(pin)

Return (void *) address of PORT atomic bit-clear register corresponding to a pin number. Cast as needed.


Return (void *) address of PORT atomic toggle-bits register. Cast as needed.

Not all devices offer this, in which case it must be left undefined (not a defined-but-empty macro)!


Return bit mask (usually uint32_t) within PORT register corresponding to a pin number.

When compiling for Arduino, this just maps to digitalPinToBitMask()…other environments will need an equivalent.


Return index (offset) of byte (0 to 3) within 32-bit PORT corresponding to a pin number.

If a device has 16-bit PORTs, this returns 0 or 1.


Return index of word (0 or 1) within 32-bit PORT corresponding to a pin number.

If a device has 16-bit PORTs, this always returns 0 (a macro is fine).


Set a pin to output mode.

In Arduino this maps to pinMode(pin, OUTPUT). Other environments will need an equivalent.


Set a pin to input mode, no pullup.

In Arduino this maps to pinMode(pin, INPUT).


Set an output pin to a logic-high or 1 state.

In Arduino this maps to digitalWrite(pin, HIGH).


Set an output pin to a logic-low or 0 state.

In Arduino this maps to digitalWrite(pin, LOW).
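To make the GPIO group more concrete, here is a minimal desktop-compilable sketch that simulates a PORT with plain structs. Everything prefixed example_ or Example is hypothetical, including the pin table. The macro names other than _PM_portSetRegister() and _PM_portClearRegister() are my assumption of the library's naming and should be checked against arch.h. The byte and word offsets assume a little-endian device.

```c
#include <stdint.h>

/* Hypothetical memory-mapped PORT, simulated here as a plain struct. */
typedef struct {
  volatile uint32_t OUT;    /* direct output register */
  volatile uint32_t OUTSET; /* write 1s to set bits atomically */
  volatile uint32_t OUTCLR; /* write 1s to clear bits atomically */
} ExamplePort;

static ExamplePort example_ports[2]; /* pretend PORTA and PORTB */

/* Hypothetical pin table: environment pin number -> (port, bit). */
typedef struct {
  uint8_t port;
  uint8_t bit;
} ExamplePin;

static const ExamplePin example_pins[] = {
    {0, 4}, {0, 13}, {1, 22}, {1, 31}};

#define EXAMPLE_PORT(pin) (example_ports[example_pins[pin].port])

#define _PM_portOutRegister(pin) ((void *)&EXAMPLE_PORT(pin).OUT)
#define _PM_portSetRegister(pin) ((void *)&EXAMPLE_PORT(pin).OUTSET)
#define _PM_portClearRegister(pin) ((void *)&EXAMPLE_PORT(pin).OUTCLR)
#define _PM_portBitMask(pin) ((uint32_t)1 << example_pins[pin].bit)

/* Little-endian assumed: byte 0 holds bits 0-7, word 0 holds bits 0-15. */
#define _PM_byteOffset(pin) (example_pins[pin].bit / 8)
#define _PM_wordOffset(pin) (example_pins[pin].bit / 16)

/* Calling code casts the (void *) to whatever width it needs. On real
   hardware this write would raise the pin; here it only lands in the
   simulated OUTSET field. */
static void example_pin_high(int pin) {
  *(volatile uint32_t *)_PM_portSetRegister(pin) = _PM_portBitMask(pin);
}
```

Pin 1 in this made-up table sits on bit 13, so its mask is 0x2000, its byte offset is 1 and its word offset is 0.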

Timer-Related Macros/Functions

The (void*) argument passed to these functions is some implementation-specific representation of a timer peripheral. In some cases (such as on SAMD microcontrollers) it’s simply a pointer directly to a timer/counter peripheral’s register base address. If an implementation requires more data associated alongside a peripheral, this could instead be a pointer to a struct, or an integer index.


_PM_timerFreq

A defined numerical constant: the source clock rate (in Hz) that's fed to whatever timer peripheral is used, e.g. 48000000 for a 48 MHz timer.

A prescaler should be chosen that allows the timer’s resolution (e.g. 16-bit) to work with the longest intervals needed by the matrix-driving code (hard to say specifically, but let’s aim for up to 250 microseconds). It’s fine if the timer isn’t running at single-instruction-cycle speed…a prescaler of 2, 4 or 8 still provides ample resolution for what we need.


Initialize (but do not start) timer, readying it for Protomatter use.

_PM_timerStart(void*, count)

Start or restart the requested timer (first argument), setting the period (second argument) in “ticks,” whatever units the timer is operating on, as established by _PM_timerFreq. Timer must be previously initialized.


Stop a previously-started timer (single argument), returning its current timer counter value.


Return current timer counter value (whether timer is running or stopped).
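The semantics of these calls can be sketched with a software stand-in for a timer, where the (void *) argument points to a struct rather than a register block. ExampleTimer and example_timer_tick() are hypothetical. Only _PM_timerStart() is named on this page, so the other three function names are my assumption of the library's naming. A real port would program hardware registers instead of struct fields.

```c
#include <stdint.h>

/* Hypothetical software stand-in for a timer peripheral. */
typedef struct {
  uint32_t count;  /* current counter value, in ticks */
  uint32_t period; /* ticks until the ISR would fire */
  int running;     /* nonzero while the timer is counting */
} ExampleTimer;

/* Initialize (but do not start) the timer. */
static void _PM_timerInit(void *tptr) {
  ExampleTimer *t = (ExampleTimer *)tptr;
  t->count = 0;
  t->period = 0;
  t->running = 0;
}

/* Start or restart the timer with a new period, counting from zero. */
static void _PM_timerStart(void *tptr, uint32_t period) {
  ExampleTimer *t = (ExampleTimer *)tptr;
  t->count = 0;
  t->period = period;
  t->running = 1;
}

/* Return the current counter value, running or stopped. */
static uint32_t _PM_timerGetCount(void *tptr) {
  return ((ExampleTimer *)tptr)->count;
}

/* Stop the timer and return the counter value at that moment. */
static uint32_t _PM_timerStop(void *tptr) {
  ((ExampleTimer *)tptr)->running = 0;
  return _PM_timerGetCount(tptr);
}

/* Simulation helper only: advance the fake counter, wrapping at period. */
static void example_timer_tick(ExampleTimer *t, uint32_t ticks) {
  if (t->running && t->period > 0) {
    t->count = (t->count + ticks) % t->period;
  }
}
```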

A timer interrupt service routine is also required, syntax for which varies between architectures.

Usually the ISR needs to be related to a timer peripheral at compile-time, which is another reason why the Arduino implementation is always tied to a specific timer…other libraries, and the Arduino core itself, have their own ISRs for specific timers, so we can’t take them all and dole them out on request.
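Returning to the prescaler advice under _PM_timerFreq, one way to sanity-check a choice is to compute how many ticks the longest interval needs and see whether that fits the counter. The helper below is a hypothetical sketch, not part of the library. It assumes power-of-two prescalers and uses the 250 microsecond target mentioned above as an input.

```c
#include <stdint.h>

/* Return the smallest power-of-two prescaler (1..max_prescaler) for which
   an interval of longest_us microseconds fits in a counter that is
   timer_bits wide, or 0 if none does. Illustrative helper only. */
static uint32_t example_pick_prescaler(uint32_t clock_hz, unsigned timer_bits,
                                       uint32_t longest_us,
                                       uint32_t max_prescaler) {
  uint64_t max_count =
      (timer_bits >= 32) ? 0xFFFFFFFFull : ((1ull << timer_bits) - 1);
  /* The p != 0 test guards against wraparound when doubling. */
  for (uint32_t p = 1; p != 0 && p <= max_prescaler; p *= 2) {
    uint64_t ticks = (uint64_t)(clock_hz / p) * longest_us / 1000000u;
    if (ticks <= max_count) {
      return p;
    }
  }
  return 0;
}
```

With a 48 MHz source and a 16-bit counter, 250 microseconds is 12000 ticks, so a prescaler of 1 is enough and _PM_timerFreq would simply be 48000000. With a prescaler of P, _PM_timerFreq would be the source clock divided by P.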

Miscellaneous Macros/Functions


Matrix bitmap width (both in RAM and as issued to the device) is rounded up (if necessary) to a multiple of this value as a way of explicitly unrolling the innermost data-stuffing loops.

So far all HUB75 displays we’ve encountered are a multiple of 32 pixels wide, but in case something new comes along, or if a larger unroll actually decreases performance due to cache size, this can be set to whatever works best (any additional data is simply shifted out the other end of the matrix).

Leave undefined to use the default of 8 (e.g. four loop passes on a 32-pixel matrix, eight if 64-pixel). Only certain chunkSizes are actually implemented right now.


Function or macro to delay some number of microseconds.

For Arduino, this just maps to delayMicroseconds(). Other environments will need to provide their own or map to an equivalent existing function.


_PM_clockHoldHigh

Additional code (typically some number of inline assembly NOPs) needed to delay the clock fall after RGB data is written to PORT. Only required on fast devices. By default, if left undefined, no delay happens.


_PM_clockHoldLow

Additional code (e.g. NOPs) needed to delay clock rise after writing RGB data to PORT. No delay if left undefined.


Numeric constant, the minimum allowable number of timer “ticks” for the shortest (least-significant) bitplane display time.


Very rare: if an architecture is so peculiar that it requires a fully custom innermost data-issuing cycle (set RGB bits, raise and lower clock), it can be defined by this name.

If undefined, a default sequence will be used.
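The width rounding described for the chunk size earlier in this list is ordinary round-up-to-a-multiple arithmetic. The sketch below assumes the macro is named _PM_chunkSize (the name does not appear on this page, so treat it as an assumption) and uses the default of 8. The example_ helpers are hypothetical.

```c
#include <stdint.h>

#if !defined(_PM_chunkSize)
#define _PM_chunkSize 8 /* default described above; name is an assumption */
#endif

/* Round a matrix width up to the next multiple of the chunk size. */
static uint16_t example_padded_width(uint16_t width) {
  return (uint16_t)(((width + (_PM_chunkSize - 1)) / _PM_chunkSize) *
                    _PM_chunkSize);
}

/* Number of unrolled-loop passes needed to issue one row of data. */
static uint16_t example_chunks_per_row(uint16_t width) {
  return (uint16_t)(example_padded_width(width) / _PM_chunkSize);
}
```

A 32-pixel-wide matrix needs no padding and takes four passes. A hypothetical 30-pixel-wide one would be padded to 32, with the extra data shifted out the far end.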

If adapting to some environment that’s neither Arduino nor CircuitPython: it’s easiest if the internal representation of an image in RAM matches what Adafruit_GFX is using: one 16-bit unsigned word per pixel, row major, no row padding. For example, a 64x32 pixel matrix will use 64x32 uint16_ts, or 4 kilobytes. The first word corresponds to pixel (0,0) at the top left.

Note that this is the drawing canvas into which points or lines or other primitives are drawn, but it’s distinct from additional space required by Protomatter, which must refresh the matrix plane-by-plane. Call the _PM_convert_565() function to process the simple canvas to the shuffled matrix representation and update the display.

If a different canvas representation is used, you’ll have to provide your own conversion function…_PM_convert_565() (and the functions it calls in turn) might offer some insights there.

Arduino and CircuitPython implementations already handle this.
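For a third environment, the canvas layout described above could be sketched as follows: one uint16_t per pixel, row major, no padding, using the 5-6-5 color packing that Adafruit_GFX uses. All names here are hypothetical, and the call to _PM_convert_565() is left out because its exact arguments should be taken from core.h.

```c
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_WIDTH 64
#define EXAMPLE_HEIGHT 32

/* 64 x 32 pixels x 2 bytes each = 4096 bytes (4 kilobytes). */
static uint16_t example_canvas[EXAMPLE_WIDTH * EXAMPLE_HEIGHT];

/* Pack 8-bit red, green, blue into one 5-6-5 word. */
static uint16_t example_color565(uint8_t r, uint8_t g, uint8_t b) {
  return (uint16_t)(((r & 0xF8) << 8) | ((g & 0xFC) << 3) | (b >> 3));
}

/* Write one pixel; (0,0) is the top left and rows are stored in order. */
static void example_set_pixel(int x, int y, uint16_t color) {
  if (x < 0 || x >= EXAMPLE_WIDTH || y < 0 || y >= EXAMPLE_HEIGHT) {
    return; /* silently clip out-of-bounds writes */
  }
  example_canvas[(size_t)y * EXAMPLE_WIDTH + (size_t)x] = color;
}
```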

Insights and Surprises

Troubleshooting by just looking at an attached matrix probably won’t yield much success. An oscilloscope or logic analyzer is really helpful. Initially, take a good look at the clock signal…this is the fastest signal the software has to generate (but mustn’t run too fast for the matrix, hence _PM_clockHoldHigh and _PM_clockHoldLow, if needed). Second, watch the !OE (output enable) signal…if the timer is properly configured and working, you should see the time interval between pulses double with each bitplane (e.g. N microseconds, N*2 microseconds, N*4, etc.). You should also see an obvious sequential bit-count among the row-select address lines.
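When checking the !OE timing on a scope, it can help to have the expected intervals written down first. This small hypothetical helper lists the tick count for each bitplane given a base interval N, following the doubling pattern described above. The function names and the choice of four bitplanes in the usage note are illustrative only.

```c
#include <stdint.h>

/* Fill out[0..planes-1] with the expected interval for each bitplane:
   base_ticks, base_ticks*2, base_ticks*4, and so on. */
static void example_bitplane_ticks(uint32_t base_ticks, unsigned planes,
                                   uint32_t *out) {
  for (unsigned i = 0; i < planes; i++) {
    out[i] = base_ticks << i;
  }
}

/* Total ticks to show every bitplane of one row once: N * (2^planes - 1). */
static uint32_t example_total_ticks(uint32_t base_ticks, unsigned planes) {
  return base_ticks * ((1u << planes) - 1u);
}
```

With a base of 100 ticks and four bitplanes, the intervals would be 100, 200, 400 and 800 ticks, for 1500 ticks in total per row.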

A couple of devices threw us for a loop…these problems were surmountable, but worth specifically mentioning here as they may be relevant to future porting efforts…


STM32

PORT bit-set and bit-clear registers do not correspond to 32-bit ports. Instead, a single 32-bit register has 16-bit set and clear sections. At least the bits are contiguous, and _PM_portSetRegister() and _PM_portClearRegister() can just return pointers to the upper or lower half of the register. Constrained to 16 bits, this does mean that STM32 is limited to a maximum of two concurrent matrix chains.
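The idea of returning pointers to the two halves of one register can be sketched with a union standing in for the hardware. ExampleSetClearReg and example_bsrr are hypothetical. The sketch assumes a little-endian layout with the set bits in the lower half and the clear bits in the upper half, which should be confirmed against the device's reference manual.

```c
#include <stdint.h>

/* Hypothetical stand-in for a combined 32-bit set/clear register. */
typedef union {
  volatile uint32_t word;    /* the full register */
  volatile uint16_t half[2]; /* the same storage as two 16-bit halves */
} ExampleSetClearReg;

static ExampleSetClearReg example_bsrr;

/* Assumed little-endian layout: half[0] is bits 0-15 (set section),
   half[1] is bits 16-31 (clear section). The pin argument is unused in
   this single-port sketch. */
#define _PM_portSetRegister(pin) ((void *)&example_bsrr.half[0])
#define _PM_portClearRegister(pin) ((void *)&example_bsrr.half[1])
```

Calling code would then cast both pointers to volatile uint16_t * instead of the usual 32-bit type.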


A little peculiar in that the bit-set and bit-clear registers aren’t entirely atomic. If two set or clear operations occur in rapid sequence (as happens in a couple places in the library), the second has no effect. One solution would be adding NOP instructions, but this is kludgey in that it doesn’t automatically handle faster CPU variants if those come along in the future. Workaround was to always alternate bit-set with bit-clear, using a bitmask of 0 for the second operation. This waits for the first operation to “latch,” and the second has no effect…we can follow up with another bit-set and it works reliably now.

ESP32 needs the timer ISR function (and any sub-functions it calls) in RAM rather than flash. This is done with an IRAM_ATTR attribute on a function, and broke our rule of “keep any device-specific code out of core.c.” So…if any other devices also require in-RAM ISRs, and if they use an attribute other than IRAM_ATTR…one should #define IRAM_ATTR to whatever attribute is required there, so that section of the code will handle either case.
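The #define suggestion above might look something like this in a port for a different chip. EXAMPLE_RAMFUNC_ATTR is a hypothetical placeholder for whatever attribute that toolchain uses to place a function in RAM. If nothing is needed, IRAM_ATTR is defined as empty so the shared code still compiles. The two functions are illustrative stand-ins, not the library's actual ISR.

```c
#include <stdint.h>

#if !defined(IRAM_ATTR)
#if defined(EXAMPLE_RAMFUNC_ATTR)
/* Hypothetical: map to this toolchain's "run from RAM" attribute. */
#define IRAM_ATTR EXAMPLE_RAMFUNC_ATTR
#else
/* No special placement needed on this device. */
#define IRAM_ATTR
#endif
#endif

static volatile uint32_t example_isr_count = 0;

/* Any sub-function the ISR calls needs the same attribute. */
static void IRAM_ATTR example_row_handler(void) { example_isr_count++; }

/* Stand-in for the timer ISR body. */
static void IRAM_ATTR example_timer_isr(void) { example_row_handler(); }
```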

ESP32-S2 and -S3

GPIO set/clear operations are somewhat slower than the original ESP32, which would result in a flickery display, so these two make an exception to the GPIO+timer rule and rely on chip-specific peripherals. On the ESP32-S2, the Dedicated GPIO peripheral is used. On ESP32-S3, the LCD controller peripheral. An interesting side effect of this, because the ESP32 family has very flexible pin-MUXing capabilities, is that any pins can be used to drive the matrix…there’s no specific order or continuity required.

This guide was first published on May 13, 2020. It was last updated on Oct 20, 2020.

This page (Adding a New Device) was last updated on May 12, 2020.
