Greed for Speed

One of my first Arduino peripherals was this 3.5″ touchscreen from Adafruit. It has a 480×320 resolution, an SD card reader and two interfaces (SPI and 8-bit parallel).

However, after first connecting it to my Arduino Uno, this absolutely beautiful display turned out to be rather slow, even in parallel mode. Back then, there were no fast libraries for its HX8357D chip, so I soon bought a smaller 320×240 ili9341 display for projects that required fast graphics (fractals, cellular automata etc.). But I never stopped looking for a faster way to drive my 480×320 ‘Ferrari’.

New hope arose when the ESP8266 entered the market, especially after ‘Bodmer’ added support for the HX8357D chip to his amazing TFT_eSPI library on Github. With the ESP8266’s clock speed set to 160 MHz, cellular automata now grew considerably faster, but still noticeably slower than on ili9341 displays (which only have half the number of pixels, that is).

Then came the ESP32. The CPU-hungry algorithms that generate complex graphics proved to run much faster on this dual-core board. Finally, some of my math visualisations started looking acceptable on my beloved 480×320 display.

A decisive breakthrough was the arrival of the WROVER variant of the ESP32. It came with PSRAM (4 MB, or even 8 MB on model B), more than enough for (multi-)buffering a 480×320 display. Writing pixel values to a buffer in PSRAM first, and then pushing (part of) this buffer to the display in one transaction, is much faster than writing individual pixels.

A welcome dessert was the speed gain after increasing the SPI frequency from 27 MHz to 40 MHz (a setting in the TFT_eSPI library’s User_Setup.h file), which finally made my Adafruit display fully competitive. And, as the proverbial cherry on the cake, controlling it over its 8-bit parallel interface made it faster still!
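For reference, the change is a single define in the library’s configuration file (the define below is the stock TFT_eSPI setting; the rest of User_Setup.h depends on your display and wiring):

```cpp
// In TFT_eSPI's User_Setup.h: raise the SPI clock from the default
//#define SPI_FREQUENCY  27000000
#define SPI_FREQUENCY  40000000   // ran reliably on my HX8357D display
```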

This video shows the display at top speed. If you’ve ever seen this standard graphic test on an Arduino Uno, you’ll know that I’ve made some progress here. My smartphone’s camera had some trouble filming the display (it even turned yellow into white…).

Benchmark sketch on an ESP32, driving the 480×320 display over its 8-bit parallel interface

Guess this is where my Speed Quest has its happy ending. Below are the benchmark results (in microseconds) for the 320×240 ili9341 and the 480×320 HX8357D displays, both driven by an ESP32 (LOLIN D32 Pro). Output is from the library’s graphicstest.ino sketch, running on a single core, without the use of display buffering.


Chips from the lathe

This first post of 2020 just shows some videos of projects from the past three months.

Here’s an improvement (I hope) of my previous attempt to simulate fire on a TFT display. I’ve added a glowing particles effect, did some parameter fine-tuning and changed the color palette. The simulated area forms a layer over a jpg image, read from SD card.

Alas, my phone’s camera misses most of the improvements…

 

The next video shows the tessellation process of a 2D area according to the Majority Voting principle. In the initial state, every pixel randomly gets one out of three* possible colors. Then, in each (discrete and simultaneous) next step, every pixel takes the color of the majority of its ‘neighbours’. It’s no surprise that the chosen definition of neighborhood has a great influence on this self-organizing process and its final state.

* The classical example uses 2 colors, but I chose to have 3 for a more fashionable (?) look.
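For the curious, one update step can be sketched as follows. This is my own reconstruction under stated assumptions (3 colors, a wrap-around grid, the 8-cell Moore neighborhood, ties keeping the current color); the actual sketch may make different choices:

```cpp
// One Majority Voting step on a small toroidal grid (a sketch, not the
// original code): every cell takes the most frequent color among its
// 8 Moore neighbours; on a tie it keeps its current color.
#include <array>
#include <cstdint>

constexpr int W = 8, H = 8, COLORS = 3;
using Grid = std::array<std::array<uint8_t, W>, H>;

Grid step(const Grid& g) {
    Grid next{};
    for (int y = 0; y < H; ++y) {
        for (int x = 0; x < W; ++x) {
            int count[COLORS] = {0};
            for (int dy = -1; dy <= 1; ++dy)        // visit the 8 neighbours
                for (int dx = -1; dx <= 1; ++dx) {
                    if (dx == 0 && dy == 0) continue;
                    int nx = (x + dx + W) % W;      // wrap around the edges
                    int ny = (y + dy + H) % H;
                    ++count[g[ny][nx]];
                }
            uint8_t best = g[y][x];                 // keep own color on a tie
            for (int c = 0; c < COLORS; ++c)
                if (count[c] > count[best]) best = (uint8_t)c;
            next[y][x] = best;
        }
    }
    return next;
}
```

All cells are updated from the same snapshot of the grid, which is why the step returns a new grid instead of modifying the old one in place.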

 

Finally, meet Perrier Loop, the most complex Self-replicating Cellular Automaton that I managed to produce on a 320×240 TFT display (so far), pushing my cellular automata framework for ESP32 to its limits. Grid cells can be in one out of 64 states (colors). State transitions are governed by 637 rules that use 16 variables (placeholders for up to 7 states), so the real number of rules is much larger! Each cell complies with this same set of rules to determine its next state, based on its current state and that of its 4 orthogonal neighbours.
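To give an idea of how a rule with variables works, here is a hedged reconstruction of the matching step (Golly-style: a variable, once bound, must keep its value within the rule; the extra restriction that a variable may only bind to its listed states is omitted for brevity, and all names and the encoding are mine):

```cpp
// Matching one transition rule against a von Neumann neighborhood.
// Entries below N_STATES are literal states; entries >= N_STATES encode
// variables, which bind to the first state they see and must then repeat.
#include <cstdint>

const int N_STATES = 64;

struct Rule {
    uint8_t in[5];    // center cell + N, E, S, W
    uint8_t out;      // next state of the center cell
};

bool matches(const Rule& rule, const uint8_t nb[5]) {
    int bound[16];                       // one binding slot per variable
    for (int i = 0; i < 16; ++i) bound[i] = -1;
    for (int i = 0; i < 5; ++i) {
        uint8_t r = rule.in[i];
        if (r < N_STATES) {              // literal: must match exactly
            if (nb[i] != r) return false;
        } else {                         // variable: bind on first use
            int v = r - N_STATES;
            if (bound[v] < 0) bound[v] = nb[i];
            else if (bound[v] != nb[i]) return false;
        }
    }
    return true;
}
```

One such rule with a repeated variable stands for many literal rules, which is why 637 rules with 16 variables expand to a much larger effective rule set.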

And the result looks so simple….

ADS-B Exchange!

Despite some turbulence within the ADS-B Exchange community after their API policy change, owner James Stanford kindly sent me a key for the new REST API 🙂

In exchange for feeding ADSBx, non-commercial users may use the API for free. Your key needs to be included in the header of each (https) API request.

The following code is at the top of my new What’s Up? sketch for monitoring all aircraft within a distance of 25 nautical miles from my location (latitude and longitude filters have not (yet) been implemented in the new API, so they need to be applied to the query results programmatically).
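Since the filtering has to happen on the ESP32 itself, a great-circle distance check does the job. Below is a hedged sketch (the haversine formula with the Earth radius in nautical miles; function and variable names are mine, not from the sketch above):

```cpp
// Great-circle distance in nautical miles (haversine), used to keep only
// aircraft within a given range of the observer.
#include <cmath>

double distanceNM(double lat1, double lon1, double lat2, double lon2) {
    const double R = 3440.065;                      // mean Earth radius in NM
    auto rad = [](double d) { return d * 3.14159265358979323846 / 180.0; };
    double dLat = rad(lat2 - lat1);
    double dLon = rad(lon2 - lon1);
    double a = std::sin(dLat / 2) * std::sin(dLat / 2) +
               std::cos(rad(lat1)) * std::cos(rad(lat2)) *
               std::sin(dLon / 2) * std::sin(dLon / 2);
    return 2.0 * R * std::asin(std::sqrt(a));
}

bool withinRange(double lat, double lon,
                 double myLat, double myLon, double rangeNM) {
    return distanceNM(lat, lon, myLat, myLon) <= rangeNM;
}
```

A handy sanity check: one degree of latitude is about 60 nautical miles.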

Then, inside loop(), I create an instance of WiFiClientSecure and send the API request over https, including my API key in the header:
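The embedded sketch has the details; stripped to its essence, the request boils down to composing something like the string below and sending it over the secure connection (a portable reconstruction with placeholder host and path; verify the exact header name and endpoint against the current ADSBx API documentation):

```cpp
// Composing a raw HTTPS GET request with the API key in a header.
// Host, path and header name here are illustrative assumptions.
#include <string>

std::string buildRequest(const std::string& host, const std::string& path,
                         const std::string& apiKey) {
    std::string req;
    req += "GET " + path + " HTTP/1.1\r\n";
    req += "Host: " + host + "\r\n";
    req += "api-auth: " + apiKey + "\r\n";   // the key goes in this header
    req += "Connection: close\r\n\r\n";      // blank line ends the header
    return req;
}
```

On the ESP32, the resulting string is simply written to the WiFiClientSecure instance with print().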

The json response can now be processed as usual by reading from the stream ‘client’. Although the key names of the new API differ from the old ones, I didn’t have to make any changes to the json streaming parser that I wrote for earlier ADS-B projects because it takes these names from a global array that now looks like:

Some keys from the old API do not (yet) have an equivalent in the new json response. As a temporary solution, I now host a local API on my webserver, returning aircraft model and flight operator for an icao or opicao code. It’s powered by a MySQL database, imported from json files found at https://github.com/Mictronics/readsb.

A further change in the new json response forced me to rewrite my function for character decoding. The new API uses Unicode code points instead of multi-byte utf-8 for encoding special characters. When using GFX-based graphic libraries, these need to be converted to Extended Ascii Code page 850 (“Latin-1”), or to ‘romanized’ cyrillic characters.
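The core of that conversion can be sketched as follows (my own simplification: escaped code points below 0x100 map straight to a single extended-ASCII byte, everything else becomes a placeholder; real cyrillic romanization needs a lookup table, omitted here):

```cpp
// Decode "\uXXXX" escapes from the json response into single-byte
// characters where possible (a sketch, not the original function).
#include <cstdlib>
#include <string>

std::string decodeEscapes(const std::string& in) {
    std::string out;
    for (size_t i = 0; i < in.size(); ++i) {
        if (in[i] == '\\' && i + 5 < in.size() && in[i + 1] == 'u') {
            // parse the 4 hex digits of the code point
            unsigned cp = std::strtoul(in.substr(i + 2, 4).c_str(), nullptr, 16);
            out += (cp < 0x100) ? static_cast<char>(cp) : '?';  // '?' = needs table
            i += 5;                      // skip the whole escape sequence
        } else {
            out += in[i];
        }
    }
    return out;
}
```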

[UPDATE] Meanwhile, I discovered that Bodmer’s unbeatable TFT_eSPI library offers great unicode support and lets you create font files with only the desired unicode blocks!

It all works OK, although responses from the new ADSBx API show more dropouts than before. Also, jumps between consecutive position updates seem larger, making the so arduously conquered flicker-free icon movements less striking.

I expect to post a video of my new (and hopefully final) ESP32 version of What’s Up? in one of the next posts. In order to use the code, you’ll either have to be (or become) an ADSBx feeder, or buy a key. I went for the first option (with two feeders) and made a donation as well, despite my strong aversion to PayPal.


ADS-B Exchange(d?)


Please do feed!

 

It’s been over two years since the discovery of adsbexchange.com formed the inspiration for my What’s Up? project: visualization of air traffic on a small TFT display. Last week, I decided to rewrite the ESP8266 code for the ESP32 (WROVER) module, hoping that its extra CPU speed and PSRAM would make the aircraft icons move more smoothly over the map. The result was totally satisfying (as long as it lasted), and I even managed to build in my internet radio code for receiving live air traffic control (ATC) stations.

But then, on the very next day, the free adsbexchange API stopped working…

I understand the owner’s decision, though, and even feel guilty about not having made my two FlightAware receivers feed ADSBx much earlier. In a post on the site’s forum, the clearly ‘pissed’ owner (James Stanford) complains about the fact that many FA and FR24 feeders use his API, but don’t feed ADSBx. That must be annoying, but as for me (and there may be others), there’s a simple explanation for it: ‘Fear of Linux….‘.

When I started with ADS-B, the FlightAware logo on my dongle encouraged me to try their Piaware image first. Everything worked fine out of the box, and for me, with close to zero Linux experience, that was a black box (well, actually a transparent lunch box). However, making it feed ADSBx as well would require me to execute cryptic Linux shell commands… And then I also got confused by sources saying Piaware takes control over what can be installed, so I decided not to jeopardize my precious feeder.

With the API broken anyway, this Piaware 3.7.1-based setup guide convinced me that it was safe to make the move, hoping to meet the conditions for receiving an API key. After executing the following commands, my FA feeders now feed ADSBx as well!
sudo apt update
sudo apt install git socat
git clone https://github.com/adsbxchange/adsb-exchange.git
cd adsb-exchange
chmod +x setup.sh
sudo ./setup.sh

Note: the new version of the setup.sh script will no longer auto-start the feeder from rc.local, but from systemd.

It’ll take a while before a new feeder appears on all ADSBx pages, but you can immediately check if messages from your IP address are being received by ADSBx here. You will see something like this:


Once the mlat syncs with nearby peers have been established, your feeder will also appear on the MLAT Sync Stats page for your region.

After an hour or so, the location of your feeder will be indicated on the coverage map of your region (http://www.adsbexchange.com/coverage-4A/?new for my location).

If you are currently feeding FA (I can’t speak for FR24), why not start feeding ADSBx as well? It’s entirely safe and will not affect your FA feeder in any way. Moreover: you’ll be supporting their dedication to sharing unfiltered flight data.

With my brand new What’s UP? sketch currently being grounded, I’ve humbly requested an API key from ADSBx.

Self-replicating Cellular Automata

Studying John von Neumann’s self-reproducing universal constructor* reminded me of some unfinished business in my earlier Cellular Automata (CA) projects. Always looking for coding challenges, I wondered if I could write a sketch for simulating self-reproducing CA loops that would fit inside an ESP32.

The general idea is to have a 2-dimensional grid of cells, each of which can be in one out of n states. These cells periodically change their states simultaneously, according to a set of rules. Each cell complies with the same set of rules, which lets it calculate its new state based on its current state and that of its direct neighbours. The way this ‘organism’ will evolve also depends on its initial pattern. A self-reproducing CA is a combination of states, rules and initial pattern that will continuously produce copies of the initial pattern, that will produce more copies of the initial pattern… etc. etc.

I guess that Von Neumann’s universal automaton, with its 29 cell states and large set of rules, could be well out of ESP32’s league. Fortunately there are some famous (albeit less universal) simplifications, so I started with the simplest one I could find: Chou-Reggia-2.

ESP32 WROVER simulating Chou-Reggia-2 on a 320×240 display (1 pixel per cell)

The video shows the indefatigable CA at work, producing ever more copies of the initial pattern (just 5 cells). It mimics a much sought-after form of life: old cells don’t die…

They’re just like kids, they grow up so fast 😉

 

In order to squeeze my Chou-Reggia-2 algorithm into the ESP32, I had to come up with some new programming and storage techniques for cellular automata. I ended up writing a framework for CA sketches that can import patterns and rules from Golly. It runs on ESP32 WROVER modules, using their PSRAM for storing a double display/states buffer. The two cores of the ESP32 cooperatively calculate a new grid state based on the current one.

As a proof of concept for the new framework, I initialised it with the specific data for Langton’s Loop, an automaton similar to Chou-Reggia-2, but with more rules. Everything worked fine, as did two slightly more complicated automata that have variables in their Golly rules, as well as substantially more states: Evoloop-finite and Perrier Loop.

Now that I have it up and running, my CA framework for ESP32 WROVER modules reduces simulation of most 2D cellular automata to feeding it with their specific data: states, transition rules (with or without variables), neighborhood (Moore or von Neumann) and rule symmetry, all imported from Golly. State colors are not taken from Golly, but chosen to meet the condition (stateColor) % nStates == state, a crucial breakthrough idea for making the framework fit inside an ESP32 while preserving speed and high refresh rates.
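The color trick in isolation (my minimal reconstruction, not the framework code): pick for each state a 16-bit display color whose remainder modulo nStates equals the state index. The PSRAM display buffer then doubles as the state grid, so no separate state storage is needed:

```cpp
// Nudge a desired 16-bit (RGB565) color to the nearest value satisfying
// color % nStates == state, so the state can be read back from the pixel.
#include <cstdint>

uint16_t adjustColor(uint16_t wanted, uint8_t state, uint8_t nStates) {
    // shifts the color by at most nStates-1, an invisible difference
    return (uint16_t)(wanted - (wanted % nStates) + state);
}

uint8_t stateFromColor(uint16_t color, uint8_t nStates) {
    return (uint8_t)(color % nStates);   // recover the cell state
}
```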

[ After further optimization, my CA sketches run 5x faster than what is shown in the videos! ]

Close up of Langton’s Loop on an ESP32-driven 320×240 TFT display (slow version)

Because every ‘cell’ is represented by a single pixel on a small 2.4″ display, I had to zoom in on the pattern in order to show the difference between ‘living’ and ‘dead’ cells (Langton’s ‘organism’ grows like coral: inner cells die, but their cell-walls are preserved).

 

* Von Neumann’s abstract and purely logical reasoning about self-replication eventually helped biologists to understand how (real) cells manage to make exact copies of themselves.

 

 

Turbocharged TFT display

Here are some first results of the (speed) progress that I’ve made since my previous post about using PSRAM as display buffer.

The first video shows a 200×150 pixel animation generated by looping over 61 frames (ripped from an animated gif). The partial refresh of the relevant display area is so fast that I had to include delays between the frames.

 

 

The next video shows my dual core/ dual buffer Julia Fractal Zoomer sketch in action. The two ESP32 cores cooperate in writing Julia Fractals with ever increasing zoom levels to a (double) buffer in PSRAM. When a fractal is finished, the loop function takes care of writing its pixel values from the last completed buffer to the display.

Zooming in on Julia fractal (c = -0.79 -0.15i) ; max iterations: 128

 

This spinning globe was my first sketch that used PSRAM as a display buffer, written to test the concept and measure its speed. It is generated by looping over 21 frames.

 

 

The above videos show a LOLIN TFT-2.4 Shield, driven by a LOLIN D32 Pro with 8MB PSRAM. Sketches will follow after I’ve implemented a new idea that may further increase the zoom speed.

 

 

PSRAM Display Buffer

One possible application of ESP32 WROVER’s PSRAM memory is using it as a (double) display buffer. I thought this could be useful for fractal-based animations on a TFT display, as these sketches often require a considerable amount of calculation for each pixel, followed by an update of that single pixel on the display. That way, a full refresh of the display can take a lot of time. By writing calculated color values to a PSRAM buffer instead, the actual refresh can be done much faster: the entire buffer is pushed to the display after all pixel colors have been calculated in a background process (even simultaneously on a different core, using a double buffering technique).

As a proof of concept, I converted one of my Julia Fractal sketches to a version that continuously draws a specific Julia Set Jc with an incrementing zoom level. The first result is satisfying enough, even though my pixel-by-pixel approach is probably not the most efficient way to interact with PSRAM. I may post a video in due course, but for now you can find the prototype sketch for a 320×240 display at the end of this post. It uses both fast cores and a double buffering technique: while one buffer is being filled by a core 1 task, the other one can be read by a core 0 task for filling the display.
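This is the per-pixel work that makes buffering worthwhile: a plain escape-time iteration for the Julia map z → z² + c (shown here with c = -0.79 - 0.15i, the constant from my zoomer; the sketch itself may organize this differently):

```cpp
// Escape-time iteration count for one pixel of a Julia set; the count
// indexes into a color palette when filling the PSRAM buffer.
#include <cstdint>

uint16_t juliaIterations(double zx, double zy, uint16_t maxIter = 128) {
    const double cx = -0.79, cy = -0.15;           // the Julia constant c
    uint16_t n = 0;
    while (zx * zx + zy * zy < 4.0 && n < maxIter) {
        double tmp = zx * zx - zy * zy + cx;       // real part of z^2 + c
        zy = 2.0 * zx * zy + cy;                   // imaginary part
        zx = tmp;
        ++n;
    }
    return n;                                      // maxIter = inside the set
}
```

Zooming in is then just a matter of mapping each pixel to an ever smaller region of the complex plane before calling this function.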

Perhaps this display buffer concept can finally make aircraft position updates in my ‘What’s Up’ Flight Radar sketch completely flicker-free. Instead of always having to save clean map tiles from the display (with readPixel) before drawing updated aircraft sprites on them, it will allow me to restore the appropriate tiles directly from the full map image stored in PSRAM. That will circumvent my long-time problem with the only fast library for the HX8357D display (Bodmer’s TFT_eSPI library): its readPixel() function doesn’t work on that display, so until now I couldn’t use it for my Flight Radar sketches. No longer having to use readPixel allows me to drive this nice 480×320 display in that library’s parallel mode, which is very fast! With the maximum of 12 aircraft on display, updating all positions will take < 80 milliseconds. So my next post may be ‘What’s Up – final version‘.
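Restoring a tile from the full-frame image is then a simple rectangular copy (a minimal sketch with my own names; on the ESP32 the copied tile would subsequently be pushed to the display in one transaction):

```cpp
// Copy a w*h tile at (x, y) out of a frameW-pixels-wide 16-bit frame
// buffer, e.g. the full map image kept in PSRAM.
#include <cstdint>
#include <cstring>

void restoreTile(const uint16_t* frame, int frameW,
                 uint16_t* tile, int x, int y, int w, int h) {
    for (int row = 0; row < h; ++row)
        std::memcpy(tile + row * w,                    // tile is stored row by row
                    frame + (y + row) * frameW + x,    // source row inside the frame
                    w * sizeof(uint16_t));
}
```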


Here’s a quick and dirty dual core ‘Julia Fractal Zoomer’ sketch. For some reason, omitting the vTaskDelay command in loop() makes it run slower! Leaving out these commands from the task functions will crash the ESP32. Make sure to enable PSRAM before compiling the sketch on a WROVER based ESP32 (option will appear under ‘Tools’ in the Arduino IDE or can be set with make menuconfig if you use Espressif’s ESP-IDF).

[UPDATE] the prototype sketch below ran much faster after dividing calculations for the new display buffer over both cores and, as expected, by replacing the individual drawPixel commands by pushing the entire display buffer in one single SPI transaction.

Unripe (Ada)fruit?

I’ve always been an Adafruit fan, gladly supporting their open source contributions by buying their products. But recently, I was very disappointed by the bad performance of their TCA9548A 1-to-8 I2C Expander.

I had bought it to be able to drive two identical SH1106 OLED displays without having to solder on one of the displays in order to give it a different I2C address.

First I ran an I2C scanner, and the result looked promising: all connected I2C devices were detected with their correct I2C address. But then, driving a single display only worked if it was connected to channel 0 of the expander. And even then, the display would often show errors.

Then I tried a DS3231 Real Time Clock, because there’s very little I2C traffic involved in reading the time from this simple and reliable chip. Even when connected to channel 0, setting the clock didn’t work well and readings of a correctly set clock were mostly messed up. Since this expander seemed unable to reliably drive a single device, there was no sense in trying to connect multiple devices.

[UPDATE 15-01-2020] quite unexpectedly, I found the *solution* on an Adafruit forum! The multiple-sensor example sketch in Adafruit’s tutorial for this product is missing an essential line. After putting the line Wire.begin() in setup(), that is: before the first call of tcaselect(), the module works fine! Wonder why they don’t correct this in their tutorial.

 

Pseudostatic Ram (PSRAM)

 

Just got me a second LOLIN D32 Pro v.2, this time with the newer ESP32-WROVER-B module on board. For just € 1,- extra, it has 4x more RAM and 2x more PSRAM. But what is this PSRAM, and could I use it for memory-hungry sketches, like the ones where AI or pathfinding algorithms are involved?

Searching the Internet gave me a first idea of what it is, but more importantly: I found out how to use it, which luckily turned out to be remarkably simple.

Below is a simple demo sketch that shows how to address PSRAM. It allocates a 1 MB buffer in PSRAM, writes random integers at random positions and reads them back for comparison. Make sure that PSRAM is enabled when you compile it, otherwise the ESP32 will panic! (It’s an option in the Tools menu of the Arduino IDE after a WROVER-based ESP32 board has been selected.)
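The gist of that demo in portable form is shown below. On a WROVER you would obtain the buffer with ps_malloc() from the ESP32 Arduino core; here plain malloc stands in so the write/read-back logic can run anywhere:

```cpp
// Allocate a buffer, write random values at random positions and verify
// them by reading back (on ESP32, swap malloc for ps_malloc to use PSRAM).
#include <cstdint>
#include <cstdlib>

bool testBuffer(size_t nInts, int rounds) {
    // uint32_t* buf = (uint32_t*) ps_malloc(nInts * sizeof(uint32_t)); // ESP32/PSRAM
    uint32_t* buf = (uint32_t*) malloc(nInts * sizeof(uint32_t));
    if (!buf) return false;                 // allocation failed
    bool ok = true;
    for (int i = 0; i < rounds && ok; ++i) {
        size_t pos = (size_t)rand() % nInts;   // random position...
        uint32_t val = (uint32_t)rand();       // ...and random value
        buf[pos] = val;                        // write,
        ok = (buf[pos] == val);                // read back and compare
    }
    free(buf);
    return ok;
}
```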

[edit: a more useful application can be found in this later post]

Note that GPIO pins 16 and 17 are used for communication between ESP32 and PSRAM, so WROVER-based boards will not have these pins broken out.

Also note that content in PSRAM will be lost after the ESP32 loses power (still have to figure out what happens during deep sleep).

My search for information on PSRAM triggered a more general interest in the chip’s architecture. Espressif’s documentation has been very helpful.

PS: in order to use the upper 4 MB of the module’s 8MB PSRAM, you’ll need to build your application with Espressif’s ESP-IDF and enable ‘bank switching’. Instructions can be found here: https://github.com/espressif/esp-idf/tree/master/examples/system/himem.

Spectrum Analyzer Revisited

Encouraged by the happy ending of the Swinging Needles project, I decided to revisit an earlier audio spectrum analyzer. This time, instead of programmatically processing analog signals, I used a VS1053 plugin to let the chip do the Fourier stuff.

Before even starting to read actual frequency band values, I wanted to test my sketch by reading the plugin’s default frequency settings from VS1053’s memory. However, some of these readings made no sense at all, up to the point that I suspected a faulty chip. It took me some time to find out that the chip needs a few seconds of audio input before it will tell you something useful, even if it concerns a fixed setting like number of frequency bands.

Once the above mystery was solved, things became very straightforward. What a versatile chip this VS1053 is! While playing a 320 Kbps internet radio station, it can easily handle 14 frequency bands and let an ESP8266 at 160 MHz show the results on a 128×64 Oled display. Part of the credits should go to the authors of the libraries that I used.

 

First impression produced by the sketch below; better videos with sound (and colors?) will follow.

 

Here’s my ESP8266 demo sketch for a basic internet radio (fixed station, no metadata, no audio buffer), just to show a 14-band spectrum analyzer on a 128×64 SH1106 Oled display. The pins used are for the rather rare Wemos D1 R1, so you’ll probably have to change them.

 

This is the content of the plugin.h file that needs to be in the directory of the sketch: