Streaming waveforms

  • For a while I have played around with streaming waveforms for audio output ("streaming" means having two buffers and continuously re-filling one while the other is playing). Here are my findings.

    I first streamed from an SD card. Now on the Pico, I stream from an external SPI flash memory, using my home-grown Winbond W25Q module. Comparing the two methods, I can say that streaming from flash works better.

    • Buffers can be much smaller. For streaming at 11 kHz from the SD card, I needed a buffer size of 2048 samples. When streaming from the SPI flash, 64 samples are fine.

    • While the highest sampling rate I achieved with the SD card was between 12 and 15 kHz, I can stream at 44.1 kHz from the flash memory.

    I'm not sure why exactly this is, but I assume that there is a considerable overhead for dealing with the filesystem on the SD card. I also found that reading sequentially from the flash works like a charm, while repositioning (i.e. moving the read pointer to a different memory address) is much more likely to produce glitches, so that seems to take more time.

    It seems that reading from the SD card takes a considerable fixed amount of time (plus a variable amount that depends on the chunk size), which would explain why the buffers need to be so much bigger.
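
    To make the fixed-overhead idea concrete, here is a minimal sketch in plain JavaScript. The cost model (a fixed overhead plus a per-sample cost) and every timing number in it are assumptions chosen for illustration, not measurements from the SD card or the flash. The helper names are hypothetical as well.

    ```javascript
    // Hypothetical cost model: refill time = fixed overhead + per-sample cost.
    // All timing values used below are made up for illustration.
    function refillTimeMs(samples, fixedMs, perSampleMs) {
      return fixedMs + samples * perSampleMs;
    }

    // Time one buffer takes to play at the given sampling rate.
    function playTimeMs(samples, rateHz) {
      return samples / rateHz * 1000;
    }

    // Smallest power-of-two buffer whose refill fits into its own play time,
    // or null if no size up to maxSamples is fast enough.
    function minBufferSize(rateHz, fixedMs, perSampleMs, maxSamples) {
      for (var n = 16; n <= maxSamples; n *= 2) {
        if (refillTimeMs(n, fixedMs, perSampleMs) < playTimeMs(n, rateHz)) return n;
      }
      return null;
    }

    // Assumed: a source with a large fixed overhead vs. one with a small one.
    var largeOverhead = minBufferSize(11000, 100, 0.04, 65536);
    var smallOverhead = minBufferSize(11000, 1, 0.02, 65536);
    console.log(largeOverhead, smallOverhead);
    ```

    Under this model the fixed part dominates: the larger it is, the more samples are needed before the play time of a buffer outgrows its refill time.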

    Another interesting finding is that (at least for audio) glitches can be more acceptable with flash memory, because of their pattern (length of the glitch and interval of occurrence). When streaming from SD with big buffers, you get glitches which are typically tens of milliseconds long (so they sound wrong and disturbing), and they happen a few times per second, which is also an annoying interval. When streaming from flash with small buffers, you get very short glitches (sounding like a click) but so often that they are no longer perceived as individual glitches, but as a background hum with a constant frequency. You can tune that frequency (sampling rate divided by buffer size) and you can also tune its amplitude (trade-off against performance). Audio quality at 8 bits is not great anyway, and a little crackling well below the volume of the payload signal is hardly audible, so you can choose to accept controlled buffer underruns in exchange for a smaller memory footprint or a higher sampling rate.
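
    The relation "sampling rate divided by buffer size" is easy to check with a few lines of plain JavaScript. This is only a sketch: the function names are made up, and pairing 44.1 kHz with 64 samples is an assumption that combines two figures mentioned separately above.

    ```javascript
    // How often a buffer boundary (and thus a potential click) occurs.
    function bufferEventHz(rateHz, bufferSamples) {
      return rateHz / bufferSamples;
    }

    // How long one buffer plays, in milliseconds.
    function bufferDurationMs(rateHz, bufferSamples) {
      return bufferSamples / rateHz * 1000;
    }

    var configs = [
      { name: "SD card", rateHz: 11000, samples: 2048 },
      { name: "SPI flash (assumed pairing)", rateHz: 44100, samples: 64 }
    ];

    configs.forEach(function(c) {
      console.log(c.name + ": " +
        bufferEventHz(c.rateHz, c.samples).toFixed(1) + " events/s, " +
        bufferDurationMs(c.rateHz, c.samples).toFixed(2) + " ms per buffer");
    });
    ```

    The first configuration gives a handful of buffer boundaries per second, while the second gives several hundred, which is consistent with hearing a steady tone rather than separate clicks.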

    Now what's interesting is that (regardless of the streaming source) certain combinations of sampling rates and buffer sizes seem to work well while others don't. One would expect that increasing the buffer size would generally improve performance (i.e. reduce glitches due to buffer underruns), but that is not always the case. On the contrary, I even found that bigger buffers can make it worse. It's not that hard to find a working configuration for a given application through experimentation; nevertheless, it would be interesting to know why Espruino behaves that way. I can imagine that the time needed for a buffer refill is not a linear function of the buffer size, but a more staircase-shaped one, but that's just speculation. Does anybody know more?

  • Wow, interesting that you can get up to 44.1kHz... I guess you might be able to do a bit better by manually setting up the SD card and increasing the SPI bitrate...

    But yes, there's quite a lot of overhead in dealing with the SD card. It has to read 512 byte sectors (I think), but also it has to find out where they are in the FAT table...

    I don't believe the time taken to fill the buffer is a staircase, but depending on how your code is written you'll be balancing the time taken to read the buffer from flash (plus a possible delay in execution) against the time taken to output it.

    Would you be willing to post your code for dealing with Waveform from flash? The example code for filling it from SPI actually cheats a bit - it keeps three buffers - the two from the waveform, plus a third from the filesystem. When the callback executes, it immediately updates the Waveform's buffer to avoid delay, and then refills the third buffer from the filesystem.

    If you did similar with the SPI flash code, you might be able to push it a bit more?

  • This is my code:

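    var pos;  // current read position within the sample, in bytes
    var w;    // the active Waveform, if any

    // flash, len, bufSize and rate are assumed to be defined elsewhere

    function play() {

      // read the next chunk sequentially from flash into buffer b
      var refillBuffer = function(b) {
        b.set(flash.send(b));
        pos += b.length;
      };

      var bufferEventHandler = function(b) {
        if (pos >= len) { // loop sample
          flash.seek(0, 0);
          pos = 0;
        }
        refillBuffer(b);
      };

      stop();
      w = new Waveform(bufSize, {doubleBuffer:true});
      pos = len; // forces a seek to the start on the first refill
      bufferEventHandler(w.buffer);
      bufferEventHandler(w.buffer2);
      w.on("buffer", bufferEventHandler);
      analogWrite(B1, 0.5, {freq:100000}); // set up PWM on the output pin
      w.startOutput(B1, rate, {repeat:true});
    }

    function stop() {
      if (w && w.running) w.stop();
    }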
    

    I'm not using the same cheat. Do you think it would make a huge difference? In any case the code must accomplish both (fetch new data and copy it) in the time window between two buffer events, so as long as it's fast enough, the order of those should not matter, and if it's not fast enough, you have a problem anyways.

    I just tried 44.1 kHz because it's such a nice omnipresent number (almost 42). I didn't try higher ones, maybe they would work as well. The nice thing is that you could reproduce higher frequencies. For audio (where the 8-bit resolution limits quality anyway), this already covers the audible frequency range. Higher sampling rates for audio are mainly relevant in recording (for oversampling or a less steep low-pass cutoff before the A/D converter) but realtime recording is certainly no fun with this slow-writing flash memory.

    What I didn't consider yet is the possibility that I'm mistaken about the 44.1 kHz and I might actually already be getting significant glitches which I simply don't hear because my test signal (a sawtooth frequency sweep) doesn't reveal them. I don't assume so (since it really sounds quite clean) but it might be the case. But then again, for my audio stuff it wouldn't even matter as long as it's not noticeable :) For other applications one should double check.

    What's the maximum SPI baud rate for the Pico? The Winbond flash works up to 104 MHz. And by the way, it supports synchronous transfers of up to 4 bits on parallel data pins per SPI controller (but I don't think the microcontroller can handle that).

  • Yes, I think it'd make a difference - as would getting rid of the function call... So:

    bufferEventHandler = function(b) {
      b.set(newData);
      if (pos >= len) { // loop sample
        flash.seek(0, 0);
        pos = 0;
      }
      newData = flash.send(b);
      pos += b.length;
    };

    //....
    var newData = flash.send(w.buffer);
    

    While it still has to finish execution in time for the next one, there will be a slight delay between when the buffers are swapped and when the function is called. By filling the buffer at the start, the Waveform can actually swap buffers before execution of your callback has finished, and it'll still be fine.
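
    The copy-first, fetch-afterwards order can be tried out without any hardware. The sketch below is a simulation in plain JavaScript: the `flash` object is a mock that hands out a counting sequence (it is not the real flash module), looping is left out, and it assumes the callback receives the buffer that has just finished playing.

    ```javascript
    // Mock standing in for the SPI flash module: send(b) returns the next
    // b.length bytes of a counting sequence. Not the real module's behaviour.
    var flash = {
      addr: 0,
      send: function(b) {
        var out = new Uint8Array(b.length);
        for (var i = 0; i < out.length; i++) out[i] = (this.addr + i) & 255;
        this.addr += out.length;
        return out;
      }
    };

    var bufSize = 4;
    var buffers = [new Uint8Array(bufSize), new Uint8Array(bufSize)];
    var played = [];

    // Prime both playback buffers, then prefetch a third chunk.
    buffers[0].set(flash.send(buffers[0]));
    buffers[1].set(flash.send(buffers[1]));
    var newData = flash.send(buffers[0]);

    // Copy the prefetched data first, fetch the next chunk afterwards.
    function bufferEventHandler(b) {
      b.set(newData);           // fast: the data is already in RAM
      newData = flash.send(b);  // slow: runs after the buffer is valid again
    }

    // Simulated playback: play a buffer, then hand it back for refilling.
    for (var n = 0; n < 6; n++) {
      var b = buffers[n % 2];
      for (var i = 0; i < b.length; i++) played.push(b[i]);
      bufferEventHandler(b);
    }
    console.log(played.join(","));
    ```

    The point of the ordering is that the idle buffer holds valid data again after a single in-memory copy, so the slow read no longer sits between the buffer swap and the moment the buffer is ready.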

    I'd definitely try something other than the saw wave just in case there are glitches - maybe something like voice?

    That Winbond chip seems great. Espruino's SPI can hit around 16 MHz I think (I'm not 100% sure on that - it'll depend on what chip, and I think some SPI devices might be able to go a little faster than others too). The chip itself can handle the 4 pin SPI thing too (SDIO-style interface?) but it's not exposed in Espruino - too much work I'm afraid and not enough call for it...
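
    As a rough sanity check, the raw SPI bandwidth can be compared with what 8-bit mono audio needs. This is a back-of-the-envelope sketch in plain JavaScript: it takes the 16 MHz figure above at face value and ignores command and address bytes as well as interpreter overhead, so real throughput will be lower. The function names are made up.

    ```javascript
    // Raw SPI throughput in bytes per second (one data bit per clock,
    // ignoring command/address bytes and any software overhead).
    function spiBytesPerSecond(baud) {
      return baud / 8;
    }

    // Bytes per second needed for 8-bit mono samples: one byte per sample.
    function audioBytesPerSecond(rateHz) {
      return rateHz;
    }

    var available = spiBytesPerSecond(16000000);
    var needed = audioBytesPerSecond(44100);
    console.log("raw headroom factor: " + (available / needed).toFixed(1));
    ```

    On paper the bus has plenty of headroom, which suggests that the limit in practice comes from per-call overhead rather than from the SPI clock.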

  • Hi,

    I don't know if you are still interested, but I played around with audio playback and found that the buffer passed to the callback seemed to be the wrong one. The following code shows how I refill the buffers for click-free audio:

    w.on("buffer", function(buf) {
      // refill the buffer that is NOT currently playing
      if (w.currentBuffer == 0) {
        w.buffer2.set( /* new data */ );
      } else {
        w.buffer.set( /* new data */ );
      }
    });
    

    Also, if you want to speed up the SD card access then you can indeed increase the SPI rate:

    SPI2.setup({mosi:B15, miso:B14, sck:B13, baud:20000000});

    I hope that is useful.

  • Thanks, that's great! I just checked and it seems there was a bug in the code. I'll fix it now, so Espruino should report back the unused buffer in the callback in versions 1v82 and later.

  • That is very interesting! I remember I looked at the code for a long time and didn't quite understand this aspect, but since it's also not documented how exactly that event is thrown, I assumed it to be correct.

    I will try that with the flash memory when I have time and let you know. It should improve things even more there because of the smaller buffer sizes.

    By the way, it seems a bit clumsy that we need an if-branch to evaluate which buffer to use. We could store references to the buffers in a 2-element array and then reduce that function to:

    w.on("buffer", function(buf) {
        buffers[1 - w.currentBuffer].set( /* new data */ );
    });
    

    but it would be even cooler if the native Waveform object had built-in properties that point to the current and the other buffer.

  • Thinking about it more, the nicest improvement for the Waveform class would be to simply give it an array property buffers that works like the above. Or is there any reason to avoid accessing arrays?

  • Well, when it's working right, the buf argument should point to the correct buffer, so there'll be no need for that?

    Having an array would work, but would use up a little bit more RAM and would be a tiny bit slower. It would make more sense though... Only problem is it would stop pretty much all existing code that used waveforms from working - I'm likely to annoy far more people with that than I'll make happy with the change :)

    You can easily fake it though. Right after creating the Waveform object, do:

    w.buffers = [ w.buffer, w.buffer2 ];
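
    The index flip can be tried in plain JavaScript with a stand-in object. This is only a sketch: `w` here is a mock with the three fields used in this thread, not the native Waveform, `refillIdle` is a made-up helper, and it assumes that `currentBuffer == 0` means `buffer` is the one playing.

    ```javascript
    // Plain object standing in for a Waveform created with {doubleBuffer:true}.
    // Only the fields used here are mocked; this is not the native class.
    var w = {
      buffer: new Uint8Array(4),
      buffer2: new Uint8Array(4),
      currentBuffer: 0
    };
    w.buffers = [w.buffer, w.buffer2];

    // Refill whichever buffer is NOT currently playing.
    function refillIdle(data) {
      w.buffers[1 - w.currentBuffer].set(data);
    }

    w.currentBuffer = 0;
    refillIdle([1, 1, 1, 1]);  // lands in w.buffer2
    w.currentBuffer = 1;
    refillIdle([2, 2, 2, 2]);  // lands in w.buffer

    console.log(Array.from(w.buffer), Array.from(w.buffer2));
    ```

    Because the array holds references rather than copies, writing through `w.buffers` changes the same memory that `w.buffer` and `w.buffer2` point to.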
    
  • Returning the other buffer from now on will also break existing code (of people who have figured out how to use the old implementation properly, although we're probably not speaking about many).

    Everybody will be happy as long as they know how to use it, so the documentation should mention which buffer is passed by the event, and the update notes for the firmware should also be verbose about the change in behavior.

Posted by Dennis (@Dennis)
