My last YouTube video, showing a wireless VU-meter module with a small audio spectrum analyzer, was surprisingly well received. Soon after asking me for the code, one sneaky YouTuber started trying to make money out of my idea and code, without even crediting me! Another reason for me to stop sharing code links via YouTube.
Here’s Part I of the project. Part II (the wireless module) is covered in the next post.
Concept
The idea was to let an ESP32 ‘base station’ convert the analog stereo output of my amplifier to digital data, calculate VU levels and FFT frequency magnitudes, and broadcast those values via ESP-NOW. Any ESP-NOW ‘receiver’ within range can then use them for audio visualization on a display. The base station can be placed out of sight and powered by a 5V adapter plugged into an AC outlet on the back of my amplifier.
Part I – Base station
Any ESP32 can perform the tasks outlined above, but conversion of the AC audio signal would normally require some rectifier circuit or two voltage dividers, as an ESP32 can only handle DC voltages (in the 0-3.3V range). Although that totally works, I was lucky to come across the AI-Thinker AudioKit V2.2 (and some great libraries).
Powered by an ESP32-A1S chip with built-in es8388 codec, this board allows you to plug an analog stereo source directly into its LINEIN input. You can program it with Espressif’s own Audio Development Framework, but I prefer the Arduino IDE, allowing me to use Phil Schatzmann’s arduino-audio-tools library, truly a Swiss Army Knife for audio projects.
I first tried the library’s streams-audiokit-audiokit example. It loops the analog input back via a digital (I2S) stream to the board’s earphone jack, but unfortunately the output was terribly distorted. I suspected the es8388 library it uses, and that guess seemed right when I switched to a different es8388 library (thaaraak’s): its streams-i2s-i2s example produced crystal clear sound. Since that example also uses the arduino-audio-tools library, all of that library’s extensive functionality remains available.
Now that digital audio data flowed through the ESP32, I only had to find out how to tap it programmatically. Few libraries on GitHub will have better documentation and support than the arduino-audio-tools library. Built on the concept of Arduino streams, it allows you to control an audio stream from source to destination (sink). With a little help from Phil (issue #161), I figured out two tapping mechanisms that I could use for my base station.
1. For calculating VU levels, I defined the following subclass of the library’s I2S class:
    class MyI2S : public I2SStream {
      public:
        size_t write(const uint8_t *data, size_t len) override {
          // Treat the byte stream as interleaved 16-bit samples (L, R, L, R, ...)
          int16_t *buffer16 = (int16_t*)data;
          int samples16 = len / 2;
          int16_t lmax = -32767;
          int16_t lmin = 32767;
          int16_t rmax = -32767;
          int16_t rmin = 32767;
          for (int j = 0; j < samples16; j = j + 2) {
            if (buffer16[j] < lmin) lmin = buffer16[j];
            if (buffer16[j] > lmax) lmax = buffer16[j];
            if (buffer16[j + 1] < rmin) rmin = buffer16[j + 1];
            if (buffer16[j + 1] > rmax) rmax = buffer16[j + 1];
          }
          // Peak-to-peak amplitude of this chunk, per channel
          left = abs(lmax - lmin);
          right = abs(rmax - rmin);
          sampled = true;  // tell the broadcast task that new values are ready
          // Pass the untouched data on to the real I2S output
          return I2SStream::write(data, len);
        }
    };
The data array holds a stream chunk of 4608 uint8_t values, cast to 2304 int16_t values, which correspond to 1152 left-right sample pairs. After the for loop, the global variables left and right hold the peak-to-peak amplitude of the analyzed chunk for each stereo channel. The global boolean sampled is then set to true, which triggers a FreeRTOS task to perform an ESP-NOW broadcast. An object of this customized MyI2S class can now be used as a stream destination. In the code it’s named i2s.
2. For the Fast Fourier Transform, I used the AudioRealFFT class that comes with the arduino-audio-tools library (AudioLibs/AudioRealFFT.h). An object of this class can also be used as a stream destination. I made its callback function fill a global array of 32 frequency magnitudes that will be broadcast via ESP-NOW. In the code it’s named fft.
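To get a feel for what those 32 magnitudes represent, here is a minimal sketch in plain C++. It assumes the usual real-FFT relationship, where bin k of an N-point transform is centred at k · fs / N. The helper names bin_frequency and scale_magnitude are made up for this illustration and are not part of the library. The shift by 6 mirrors the scaling used in my sketch; the clamp to 255 is an addition, since a plain cast to uint8_t would wrap around for large magnitudes.

```cpp
#include <cassert>
#include <stdint.h>

// Centre frequency (Hz) of FFT bin `bin` for an N-point real FFT.
// Assumes the standard relation f = bin * sample_rate / fft_length.
double bin_frequency(int bin, int sample_rate, int fft_length) {
    assert(fft_length > 0 && bin >= 0 && bin <= fft_length / 2);
    return static_cast<double>(bin) * sample_rate / fft_length;
}

// Scale a raw magnitude down by 2^6 and clamp the result into a uint8_t.
uint8_t scale_magnitude(float magnitude) {
    int scaled = static_cast<int>(magnitude) >> 6;
    if (scaled < 0) scaled = 0;
    if (scaled > 255) scaled = 255;
    return static_cast<uint8_t>(scaled);
}
```

With the values used here (N = 64, fs = 44100 Hz) the bins are about 689 Hz apart, so bin 1 sits near 689 Hz and bin 31 near 21.4 kHz: the audible range in coarse, linear steps.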
Having two different destinations for the digital audio stream is no problem, thanks to the library’s MultiOutput class. An object of this class acts as a kind of ‘destination envelope’ that can hold multiple destination objects. In the code it’s named multi and holds the previously mentioned i2s and fft objects.
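Conceptually, such an envelope just forwards every write to each destination it holds. The following is a stripped-down illustration in plain C++, not the library’s actual implementation; Sink, CountingSink and FanOut are hypothetical names chosen for this sketch.

```cpp
#include <cassert>
#include <stddef.h>
#include <stdint.h>
#include <vector>

// Anything that can consume a chunk of audio bytes.
struct Sink {
    virtual ~Sink() = default;
    virtual size_t write(const uint8_t *data, size_t len) = 0;
};

// Example destination that only counts the bytes it receives.
struct CountingSink : Sink {
    size_t total = 0;
    size_t write(const uint8_t *data, size_t len) override {
        (void)data;  // the content is irrelevant for this example
        total += len;
        return len;
    }
};

// Forwards each chunk to every registered sink.
class FanOut : public Sink {
public:
    void add(Sink &sink) { sinks_.push_back(&sink); }

    size_t write(const uint8_t *data, size_t len) override {
        assert(data != nullptr || len == 0);
        for (Sink *s : sinks_) s->write(data, len);
        return len;
    }

private:
    std::vector<Sink *> sinks_;  // non-owning pointers to the destinations
};
```

In the base station the two destinations play the roles of the sinks here: one passes the audio on to the headphone output (while measuring VU levels), the other feeds the FFT.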
After specifying the input and output of the es8388 codec

    es_dac_output_t output = (es_dac_output_t) ( DAC_OUTPUT_LOUT1 | DAC_OUTPUT_LOUT2 | DAC_OUTPUT_ROUT1 | DAC_OUTPUT_ROUT2 );
    es_adc_input_t input = ADC_INPUT_LINPUT2_RINPUT2;

we have defined our audio stream from source (line input) to sink (headphone), including taps for calculating both VU levels and FFT magnitudes.
The complete code that I used for the base station is at the bottom of this post. Note that compilation requires version 2.0.2 or higher of the arduino-esp32 package. To avoid conflicts with any other installed es8388 libraries, I put the es8388.h and es8388.cpp files from thaaraak’s library in the sketch folder. In the Arduino IDE, I selected the ESP32 Dev Module with PSRAM enabled and the Huge App partition scheme (probably not necessary).
After uploading the sketch, an AudioKit V2.2 with es8388 codec (connected to an audio source) will broadcast the following struct:
    typedef struct send_struct {
      uint16_t l;
      uint16_t r;
      uint8_t m[n_bins];
    } send_struct;
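The struct is small: 2 + 2 + 32 = 36 bytes, well within the 250 bytes of payload that a single ESP-NOW frame can carry. A quick compile-time check in plain C++, assuming n_bins = 32 as in my sketch; kEspNowMaxPayload and pack are names made up for this illustration (on the ESP32 the limit is available as ESP_NOW_MAX_DATA_LEN).

```cpp
#include <cassert>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

const uint8_t n_bins = 32;             // same value as in the base station sketch
const size_t kEspNowMaxPayload = 250;  // ESP-NOW payload limit in bytes

typedef struct send_struct {
    uint16_t l;         // left channel, peak-to-peak level
    uint16_t r;         // right channel, peak-to-peak level
    uint8_t m[n_bins];  // scaled FFT magnitudes
} send_struct;

// The members are laid out without padding, so the struct is exactly 36 bytes.
static_assert(sizeof(send_struct) == 2 * sizeof(uint16_t) + n_bins,
              "unexpected padding in send_struct");
static_assert(sizeof(send_struct) <= kEspNowMaxPayload,
              "send_struct does not fit in one ESP-NOW frame");

// Copy the struct into a byte buffer, the way it travels over the air.
size_t pack(const send_struct &s, uint8_t *out, size_t out_len) {
    assert(out != nullptr && out_len >= sizeof(send_struct));
    memcpy(out, &s, sizeof(send_struct));
    return sizeof(send_struct);
}
```

Because sender and receiver are both ESP32s, they share the same byte order and struct layout, so the receiver can simply memcpy the payload back into an identical struct.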
You can check whether it works by uploading the following sketch to another ESP32 and viewing the results (VU values) in the Serial Plotter.

    #include "esp_now.h"
    #include "WiFi.h"

    const uint8_t n_bins = 32;  // must match the base station

    typedef struct sent_struct {
      uint16_t l;
      uint16_t r;
      uint8_t m[n_bins];
    } sent_struct;

    sent_struct myData;

    // Callback function that will be executed when data is received.
    // This is the arduino-esp32 2.x signature; 3.x passes a
    // const esp_now_recv_info_t * as the first argument instead.
    void OnDataRecv(const uint8_t * mac, const uint8_t *incomingData, int len) {
      if (len != (int)sizeof(myData)) return;  // ignore packets of a different size
      memcpy(&myData, incomingData, sizeof(myData));
      Serial.print(myData.l);
      Serial.print(",");
      Serial.println(myData.r);
    }

    void setup() {
      Serial.begin(115200);
      WiFi.mode(WIFI_MODE_STA);
      Serial.println(WiFi.macAddress());

      // Init ESP-NOW
      if (esp_now_init() != ESP_OK) {
        Serial.println("Error initializing ESP-NOW");
        return;
      }
      esp_now_register_recv_cb(OnDataRecv);
    }

    void loop() {
    }
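The l and r values are linear peak-to-peak amplitudes (0 to 65535 for 16-bit samples). For a meter that behaves more like a classic VU display, a receiver could first convert them to decibels relative to full scale. This is only a sketch of one possible mapping in plain C++; to_dbfs and bar_length are hypothetical helpers, and the -48 dB floor is an arbitrary choice.

```cpp
#include <cassert>
#include <cmath>
#include <stdint.h>

// Convert a 16-bit peak-to-peak value to dB relative to full scale (65535).
// Returns floor_db for silence, so log10(0) is never evaluated.
float to_dbfs(uint16_t peak_to_peak, float floor_db = -48.0f) {
    if (peak_to_peak == 0) return floor_db;
    float db = 20.0f * std::log10(peak_to_peak / 65535.0f);
    return db < floor_db ? floor_db : db;
}

// Map a dB value in [floor_db, 0] onto a bar of 0..max_len segments.
int bar_length(float db, int max_len, float floor_db = -48.0f) {
    assert(max_len > 0 && floor_db < 0.0f);
    if (db <= floor_db) return 0;
    if (db >= 0.0f) return max_len;
    return static_cast<int>(std::lround((1.0f - db / floor_db) * max_len));
}
```

In the receiver callback this could be used as bar_length(to_dbfs(myData.l), 20) to get the number of lit segments on a 20-segment bar.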
Complete code for the base station:
    /*
     * Wireless Audio Visualizer - Base station (on AI-Thinker AudioKit V2.2 with es8388 codec)
     * 2022 © paulF (afterBlink)
     * Used libraries:
     *  https://github.com/thaaraak/es8388
     *  https://github.com/pschatzmann/arduino-audio-tools
     *
     * See https://www.youtube.com/watch?v=mZC5sKY97VI
     */

    #include "esp_now.h"
    #include "WiFi.h"
    #include "AudioTools.h"
    #include "es8388.h"
    #include "Wire.h"
    #include "AudioLibs/AudioRealFFT.h"

    AudioRealFFT fft;

    // FFT stuff --------------
    const uint8_t n_bins = 32;
    uint8_t mag[n_bins];
    uint8_t fft_skips = 64;
    uint8_t fft_count = 0;
    // ------------------------

    uint8_t broadcastAddress[] = {0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF};

    typedef struct send_struct {
      uint16_t l;
      uint16_t r;
      uint8_t m[n_bins];
    } send_struct;

    send_struct myData;

    esp_now_peer_info_t peerInfo = {};

    uint16_t sample_rate = 44100;
    uint16_t channels = 2;
    uint16_t bits_per_sample = 16;
    uint16_t right;
    uint16_t left;
    volatile boolean sampled = false;  // set in MyI2S::write(), cleared in vu_task()
    TaskHandle_t vuTask;

    class MyI2S : public I2SStream {
      public:
        size_t write(const uint8_t *data, size_t len) override {
          // Treat the byte stream as interleaved 16-bit samples (L, R, L, R, ...)
          int16_t *buffer16 = (int16_t*)data;
          int samples16 = len / 2;
          int16_t lmax = -32767;
          int16_t lmin = 32767;
          int16_t rmax = -32767;
          int16_t rmin = 32767;
          for (int j = 0; j < samples16; j = j + 2) {
            if (buffer16[j] < lmin) lmin = buffer16[j];
            if (buffer16[j] > lmax) lmax = buffer16[j];
            if (buffer16[j + 1] < rmin) rmin = buffer16[j + 1];
            if (buffer16[j + 1] > rmax) rmax = buffer16[j + 1];
          }
          // Peak-to-peak amplitude of this chunk, per channel
          left = abs(lmax - lmin);
          right = abs(rmax - rmin);
          sampled = true;  // tell the broadcast task that new values are ready
          // Pass the untouched data on to the real I2S output
          return I2SStream::write(data, len);
        }
    };

    // FreeRTOS task: broadcasts the latest VU levels and FFT magnitudes
    void vu_task( void * parameter) {
      for (;;) {
        if (sampled == true) {
          sampled = false;
          myData.l = left;
          myData.r = right;
          // bins 0 and n_bins - 1 are not copied and stay 0
          for (uint8_t f = 1; f < n_bins - 1; f++) {
            myData.m[f] = mag[f];
          }
          esp_err_t result = esp_now_send(broadcastAddress, (uint8_t *) &myData, sizeof(send_struct));
        }
        vTaskDelay(2 / portTICK_PERIOD_MS);
      }
    }

    MyI2S i2s;
    MultiOutput multi;

    // FFT callback
    void fftResult(AudioFFTBase &fft) {
      // Update the magnitudes only once every fft_skips + 1 results
      if (fft_count++ == fft_skips) {
        for (int f = 0; f < n_bins; f++) {
          mag[f] = ((int)(fft.magnitude(f))) >> 6;
        }
        fft_count = 0;
      }
    }

    // Copies from the I2S input to both destinations held by multi
    StreamCopy copier(multi, i2s);

    void setup(void) {

      Serial.begin(115200);

      WiFi.mode(WIFI_STA);
      if (esp_now_init() != ESP_OK) {
        Serial.println("Error initializing ESP-NOW");
        return;
      }

      peerInfo.channel = 0;
      peerInfo.encrypt = false;
      memcpy(&peerInfo.peer_addr, broadcastAddress, 6);
      if (esp_now_add_peer(&peerInfo) != ESP_OK) {
        Serial.println("Failed to add peer");
        return;
      }

      AudioLogger::instance().begin(Serial, AudioLogger::Error);

      // Input/Output Modes
      es_dac_output_t output = (es_dac_output_t) ( DAC_OUTPUT_LOUT1 | DAC_OUTPUT_LOUT2 | DAC_OUTPUT_ROUT1 | DAC_OUTPUT_ROUT2 );
      es_adc_input_t input = ADC_INPUT_LINPUT2_RINPUT2;

      TwoWire wire(0);
      wire.setPins( 33, 32 );

      es8388 codec;
      codec.begin( &wire );
      codec.config( bits_per_sample, output, input, 30 ); // last parameter is volume
      codec.pub_es_write_reg(0x09, 0x00);

      // start I2S in
      Serial.println("Starting I2S");
      auto config = i2s.defaultConfig(RXTX_MODE);
      config.sample_rate = sample_rate;
      config.bits_per_sample = bits_per_sample;
      config.channels = 2;
      config.i2s_format = I2S_STD_FORMAT;
      config.pin_ws = 25;
      config.pin_bck = 27;
      config.pin_data = 26;
      config.pin_data_rx = 35;
      //config.fixed_mclk = 0;
      config.pin_mck = 0;
      i2s.begin(config);

      // Setup FFT
      auto tcfg = fft.defaultConfig();
      tcfg.length = 2 * n_bins; // will produce n_bins frequency bins
      tcfg.channels = channels;
      tcfg.sample_rate = sample_rate;
      tcfg.bits_per_sample = bits_per_sample;
      tcfg.callback = &fftResult;
      fft.begin(tcfg);

      multi.add(i2s);
      multi.add(fft);
      multi.begin(config);

      Serial.println("I2S started...");

      xTaskCreatePinnedToCore(vu_task, "VU-spectrum-broadcast", 10000, NULL, 2, &vuTask, 1);
    }

    void loop() {
      copier.copy();
    }