All ESP-based Internet Radio projects that I’ve seen so far lack a clean modular setup that programs should have if they offer many optional features and support a broad variety of hardware setups. So I decided to write my own sketch from scratch, consisting of a main sketch for all hardware-independent basic functions, and a growing set of includible modules, one for each optional feature or hardware device.

Now I can compile tailor-made sketches that contain only the desired functionality (like support for Air Traffic Control stations) and support only the hardware that is actually used. I’m satisfied with the first results, and I learned a lot during this project.
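To give an idea of what such a modular setup can look like, here is a minimal sketch using compile-time switches. It is not the project's actual code: the switch names (`USE_KEYPAD`, `USE_TFT`, `USE_ATC`) and the setup functions are hypothetical, and the modules only record their names instead of touching real hardware.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical feature switches (names are illustrative, not taken from the project).
// Set a switch to 0 and the matching module is left out of the build entirely.
#define USE_KEYPAD 1
#define USE_TFT    1
#define USE_ATC    0

#if USE_KEYPAD
// Stand-in for a keypad module's setup routine.
void keypadSetup(std::vector<std::string> &active) { active.push_back("keypad"); }
#endif

#if USE_TFT
// Stand-in for a display module's setup routine.
void tftSetup(std::vector<std::string> &active) { active.push_back("tft"); }
#endif

#if USE_ATC
// Stand-in for the ATC/METAR module's setup routine.
void atcSetup(std::vector<std::string> &active) { active.push_back("atc"); }
#endif

// Hardware-independent core plus whichever modules were compiled in.
std::vector<std::string> setupAll() {
    std::vector<std::string> active;
    active.push_back("core");
#if USE_KEYPAD
    keypadSetup(active);
#endif
#if USE_TFT
    tftSetup(active);
#endif
#if USE_ATC
    atcSetup(active);
#endif
    assert(active.front() == "core");  // the core is always present
    return active;
}
```

With the switches as shown, `setupAll()` reports the core, the keypad and the TFT module. Code for a disabled module never reaches the compiler, so it costs no flash or RAM on the board.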

Keypad or web browser controlled prototype in radio mode (left) and ATC mode (right)

The radio in the pictures runs without problems on an ESP8266 board, driving an Adafruit VS1053 breakout and a 2.4″ ILI9341 TFT display. It can be controlled with a 16-key matrix keypad and via a browser.

The prototype uses almost 500 lines of code for the radio. David Bird’s excellent METAR functions (ATC mode) take yet another 500 lines. Too big to post on this page, I’m afraid…

It has the following features (more will be added later):

  • Can play MP3-encoded streams from internet radio stations (up to 320 kbps).
  • Built-in web server accepts control commands via a browser.
  • Can also be controlled with a 4×4 matrix keypad, using the VS1053’s own GPIO pins.
  • Offers storage of 40 station presets (grouped in four bands).
  • Displays station name and song title on TFT display and web page.
  • Displays a decoded weather report (METAR) when playing an air traffic control (ATC) channel. The first 10 stations in the preset list are reserved for ATC stations (I used David Bird’s excellent functions for METAR decoding).
  • A 30 KB ring buffer is used for smooth playback.
  • Can play stations with redirected URLs.
  • Designed to handle chunked transfer encoded streams if necessary (I was unable to test this, because I couldn’t find any station that uses this encoding).
  • Stereo dB levels can be read directly from the chip and used to drive VU meters from PWM pins (or DACs on ESP32). See here for an example.
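The ring buffer in the feature list is what decouples the bursty network input from the steady appetite of the decoder. A minimal sketch of such a buffer could look like this. The class name `RingBuffer` and its methods are hypothetical and not taken from the project, and a `std::vector` is used for storage only to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical fixed-size byte ring buffer (illustrative, not the project's code).
class RingBuffer {
public:
    explicit RingBuffer(std::size_t capacity) : data_(capacity) {
        assert(capacity > 0);
    }

    // Store one byte; returns false and leaves the buffer unchanged when it is full.
    bool put(std::uint8_t b) {
        if (count_ == data_.size()) return false;
        data_[head_] = b;
        head_ = (head_ + 1) % data_.size();  // wrap around at the end
        ++count_;
        return true;
    }

    // Fetch the oldest byte; returns false when the buffer is empty.
    bool get(std::uint8_t &out) {
        if (count_ == 0) return false;
        out = data_[tail_];
        tail_ = (tail_ + 1) % data_.size();  // wrap around at the end
        --count_;
        return true;
    }

    std::size_t available() const { return count_; }                 // bytes waiting to be read
    std::size_t freeSpace() const { return data_.size() - count_; }  // room left for new bytes

private:
    std::vector<std::uint8_t> data_;
    std::size_t head_ = 0;   // next write position
    std::size_t tail_ = 0;   // next read position
    std::size_t count_ = 0;  // number of stored bytes
};
```

In a radio, the network side would call `put()` for every audio byte while `freeSpace()` is non-zero, and the player side would call `get()` whenever the decoder is ready for more data. `RingBuffer buffer(30 * 1024);` would give roughly the buffer size mentioned above.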

Below is a flow chart of the initial connection, followed by the streaming process (basically a simple finite state machine). In practice, it all comes down to determining each incoming byte’s function within the stream by keeping track of some counters and states. Once the actual streaming has started, an incoming byte can be one of the following types:

  1. audio byte – will be sent to a ring buffer that feeds the VS1053 decoder
  2. metadata byte – a human readable character from the song title
  3. metadata size – an integer that, multiplied by 16, gives the length in bytes of the song title data that follows
  4. CR or LF – used as separators
  5. chunk size (if station uses chunked transfer encoding) – size of the next chunk
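The byte classification described above can be sketched as a small state machine. This is a simplified illustration, not the project's code: it covers only types 1 to 3 (audio, metadata size, metadata) and leaves out header parsing, CR/LF handling and chunked transfer encoding. The class `IcyParser` and all its members are hypothetical names. The sketch assumes the usual SHOUTcast-style (ICY) layout, in which the station announces a fixed number of audio bytes between metadata blocks and pads each block with zero bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical classifier for the bytes of an ICY stream body (illustrative only).
class IcyParser {
public:
    enum class State { Audio, MetaSize, MetaData };

    // metaInterval: number of audio bytes between two metadata blocks,
    // as announced by the station in its response headers.
    explicit IcyParser(std::size_t metaInterval)
        : metaInterval_(metaInterval), audioLeft_(metaInterval) {
        assert(metaInterval > 0);
    }

    // Feed one incoming byte; returns true if it was an audio byte.
    bool feed(std::uint8_t b) {
        switch (state_) {
        case State::Audio:
            ++audioBytes_;  // a real player would push b into the ring buffer here
            if (--audioLeft_ == 0) state_ = State::MetaSize;
            return true;
        case State::MetaSize:
            metaLeft_ = static_cast<std::size_t>(b) * 16;  // size byte times 16
            if (metaLeft_ == 0) {
                startAudio();  // no metadata this time
            } else {
                pending_.clear();
                state_ = State::MetaData;
            }
            return false;
        case State::MetaData:
            if (b != 0) pending_ += static_cast<char>(b);  // skip zero padding
            if (--metaLeft_ == 0) {
                metadata_ = pending_;  // block complete, publish it
                startAudio();
            }
            return false;
        }
        return false;
    }

    State state() const { return state_; }
    const std::string &metadata() const { return metadata_; }
    std::size_t audioBytes() const { return audioBytes_; }

private:
    void startAudio() {
        audioLeft_ = metaInterval_;
        state_ = State::Audio;
    }

    std::size_t metaInterval_;
    std::size_t audioLeft_;
    std::size_t metaLeft_ = 0;
    std::size_t audioBytes_ = 0;
    State state_ = State::Audio;
    std::string pending_;
    std::string metadata_;
};
```

The only bookkeeping needed is two counters (`audioLeft_` and `metaLeft_`) and one state variable, which matches the "counters and states" idea in the text. Every byte from the network goes through `feed()`, and `metadata()` returns the most recent complete block, from which the song title can be extracted for the display and the web page.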