Max data transfer size/rate via USART?

  • Background: I'm trying to transfer a stringified JSON object between two Picos (using Serial1, tx:B6, rx:B7 on each, shared GND, console shunted to USB, etc). It works for a couple of KB of data, but beyond a certain size of object the transfer appears to simply fail, leaving the sending Pico in an unresponsive state until it's power-cycled.

    I'm guessing that there's some sort of internal buffer on the Serial port for when JS code delivers data faster than it can be transmitted, but I'm having trouble finding any information on how big it is, or strategies for robustly working around this issue (eg, chunking the data and sending a chunk at a time using setTimeout/setInterval is an obvious possibility, but how quickly/slowly should I queue each new chunk up, and/or how do I know when the previous chunk has finished transmitting)?

    Also, does anyone have any information on which baud rates the Pico supports for Serial connections? I'm experimenting to ease the problem by selecting higher baud rates to reduce the chance of buffer overrun, but it's a slow process when you have to connect/disconnect and re-flash two different Picos for every test...

    Basically is there anywhere I can look in more detail regarding the USART specs on the Pico? I've even tried digging into the Espruino source code on Github, but I couldn't find any info there either.

  • Can you post your source?

    Serial defaults to 9600 baud. I'd first try setting up Serial at 57600 or 115200 baud, like Serial1.setup(57600, {rx:B7,tx:B6}); (where B7 and B6 are the pins you're using).

    Serial1.pipe() might be another option, but I haven't played with that.

    Next, sending chunks on an interval would probably work, but isn't the most efficient or robust, since you are depending on the receiving Pico always being ready on time for the next chunk, which it might not be: sometimes it may have another task on its plate.

    A somewhat more robust, brute-force method is to do your own chunking. Instead of waiting a fixed interval between chunks, you send a chunk, then wait for the receiving device to send you an acknowledgment. Then you send the next chunk, and so on. Before all this, you send a header such as 152\n, where 152 is the number of chunks you plan on sending. That way the receiver knows when it's done. (Serial1.pipe() probably does exactly this, internally.)
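The chunk-and-acknowledge idea described above can be sketched roughly as follows. This is only an illustration, not anything built into Espruino: the function names, the choice of ACK character, and the rule that every chunk ends in '\n' are all assumptions made for the example. It also assumes the payload contains no raw newlines (true for JSON.stringify() output when no indent argument is used) and that `port` is any object with print() and on('data', ...), such as Serial1.

```javascript
// Hypothetical protocol: header "<chunk count>\n", then each chunk followed
// by "\n". The receiver answers every complete line with one ACK character.
var ACK = '\u0006';

// Sender: transmits one chunk per ACK received, so it never runs ahead.
function sendChunked(port, text, chunkSize, onDone) {
  var total = Math.ceil(text.length / chunkSize);
  var sent = 0;          // number of chunks already handed to the port
  var finished = false;

  port.on('data', function (data) {
    for (var i = 0; i < data.length; i++) {
      if (data[i] !== ACK || finished) continue;
      if (sent < total) {
        var start = sent * chunkSize;
        sent++;          // update state before printing (see note below)
        port.print(text.substr(start, chunkSize) + '\n');
      } else {
        finished = true; // the last chunk has been acknowledged
        if (onDone) onDone();
      }
    }
  });

  port.print(total + '\n'); // announce how many chunks will follow
}

// Receiver: reassembles the chunks and calls onMessage with the full text.
function receiveChunked(port, onMessage) {
  var line = '';         // characters of the line currently arriving
  var expected = -1;     // chunks still to come; -1 = waiting for a header
  var parts = [];

  port.on('data', function (data) {
    for (var i = 0; i < data.length; i++) {
      var ch = data[i];
      if (ch !== '\n') { line += ch; continue; }
      if (expected < 0) {
        expected = parseInt(line, 10); // header line
      } else {
        parts.push(line);              // chunk line
        expected--;
      }
      line = '';
      if (expected === 0) {
        var message = parts.join('');
        parts = [];
        expected = -1;
        port.print(ACK);
        onMessage(message);
      } else {
        port.print(ACK);
      }
    }
  });
}
```

On the Picos this might be used as receiveChunked(Serial1, function (json) { /* JSON.parse(json) */ }) on one side and sendChunked(Serial1, JSON.stringify(obj), 64, callback) on the other; the chunk size of 64 is arbitrary. State is updated before each print() so the handlers stay correct even if a reply arrives immediately. The sketch has no timeout or retry, so a lost ACK would stall the transfer.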

  • Thanks @oesterle - some great suggestions there, particularly investigating Serial1.pipe(). I've veered away from writing my own solution so far, because this feels like something I'd expect to be handled internally by the Espruino runtime (at least some way to read the current state of the Serial buffer, so you know when it's safe to push more data into the connection)... but if there's nothing then I guess it's a straight choice between either creating a pull request into the Espruino runtime or writing my own JS routines to manage it.

    The basic setup is pretty simple - just Serial1.setup(28800, { tx:B6, rx:B7 }) to set up the serial port, and

    var strRepresentation = JSON.stringify(obj);
    Serial1.print(strRepresentation + '\u0004');
    

    to send a data chunk down it. For reference, the data object I'm sending includes a few KB of base64 image and audio data, and going over a couple of KB total is about the point it blows up the connection and crashes the chip.

    Then the same setup line on the receiving side, and

    var buffer = '';
    Serial1.on('data', function(data) {
      buffer += data;  // De-chunk into buffer, and process whole buffer once EOT control character is received
    
      if(data.indexOf('\u0004') !== -1) { // ctrl+D - End of Transmission character
        // Process whole data object - JSON.parse(buffer), etc.
        buffer = '';  // And reset buffer
      }
    });
    

    to receive and process it.
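One detail worth noting about the receive handler above: 'data' events do not necessarily line up with message boundaries, so the EOT character can arrive in the middle of a chunk with the start of the next message right behind it. A slightly more defensive variant, sketched here with a hypothetical helper name (makeFrameParser is not an Espruino function), splits the buffer at each EOT and keeps whatever follows for the next message.

```javascript
var EOT = '\u0004'; // ctrl+D, End of Transmission, as used in the post above

// Returns a function suitable for use as a 'data' handler. onFrame is called
// once per complete message, without the trailing EOT character.
function makeFrameParser(onFrame) {
  var buffer = '';
  return function (data) {
    buffer += data;
    var end = buffer.indexOf(EOT);
    while (end !== -1) {
      var frame = buffer.substring(0, end);
      buffer = buffer.substring(end + 1); // keep anything after the EOT
      onFrame(frame);
      end = buffer.indexOf(EOT);
    }
  };
}
```

It would be attached with something like Serial1.on('data', makeFrameParser(function (json) { var obj = JSON.parse(json); })).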

    Edit: Ah- looks like Serial.pipe() won't really do it. It's the sending device that crashes due to an overflowed buffer(?), whereas it looks like Serial.pipe() only takes writeable streams (ie, it's another way to receive data from the USART, not a way to push data into it).

  • ... Right. I think I've finally tracked down where the Serial buffer size is defined in the Espruino source code.

    I'm leaving this step-by-step trail of breadcrumbs here in case it helps anyone else to decode the Espruino source code, and to refresh my memory if I ever need to track down chip details like this again in the future.

    1. First, we know we're looking for something to do with the Serial1 port. Serial1 is an instance of the Serial class, and (as per the API documentation) that's a USART.
    2. So step two is to find the Espruino source code repo (thank god for open source!) and search it for USART.
    3. Hmmm - looks like there are a lot of different source code files, so let's look for one that sounds like it belongs to the chip we're using (in this case, the Espruino Pico). According to the Pico page the chip in the Pico is an STM32F401CDU6 32-bit 84MHz ARM Cortex M4 CPU.
    4. Ok... so scrolling down the list of possible hits in the github repo, "stm32f4xx" looks like a pretty good match for "STM32F401CDU6". Let's look in there.
    5. Hah - this looks like the header file for a C structure that represents a USART in memory. The beginning of the file sets up a struct (basically, "object" in JS-speak) that has a bunch of familiar-sounding fields like USART_BaudRate, USART_StopBits and USART_Parity that pretty obviously map to options/parameters used when setting up the Serial port in the Espruino JS API.
    6. Jackpot! Towards the bottom of the file we find the definition for a function called USART_SendData: USART_SendData(USART_TypeDef* USARTx, uint16_t Data);. That sounds very much like the bit of code that sends data down a USART, so let's jump to the associated .c file (source code) for this .h (header) file.
    7. Find the USART_SendData function in this file and it's a short one that just seems to set a member variable called DR, belonging to a variable called USARTx, that's a pointer to a variable of type USART_TypeDef that's passed into the function.
    8. Ok, so let's back up and find where USART_SendData() is called from, so we can find where this parameter variable comes from.
    9. The first place the function is actually called (as opposed to being defined in a header file) is targets/stm32/stm32_it.c... and STM32 is a reassuring match for our chip (STM32F401CDU6), so let's look in there.
    10. Search that file for USART_SendData( and we find the following likely-looking code:

    /* If we have other data to send, send it */
        int c = jshGetCharToTransmit(device);
        if (c >= 0) {
          USART_SendData(USART, (uint16_t)c);
        }
    

    So this is basically saying "get a single character from the jshGetCharToTransmit() function, and if it's not negative (ie, there really was a character waiting) then send it out over the USART". That's very good - now we just need to find jshGetCharToTransmit() and try to work out where it gets characters from so we can try to work out how big that buffer actually is.
    11. src/jsdevices.c looks like a pretty good bet, so let's look in there (and search within the file for jshGetCharToTransmit).
    12. The first hit is what we want. The code is a bit confusing, but if you ignore all the if(DEVICE_IS_UART(...)) stuff then a few lines below that we find unsigned char data = txBuffer[tempTail].data;, and that variable is returned from the function. So txBuffer is our buffer that holds all outgoing data to be sent over a USART (Serialn) connection.
    13. Right at the top of the file we find volatile TxBufferItem txBuffer[TXBUFFERMASK+1];, defining an array of TxBufferItems called txBuffer, of size... TXBUFFERMASK+1.
    14. Sigh... ok, so now let's find out where TXBUFFERMASK is defined. Nowhere in this file. Awesome.
    15. At least it's only used in two files. We know it's not src/jsdevices.c because we just looked in that one, so it must be the other one, scripts/build_platform_config.py.
    16. Search in that file and we find (third result) codeOut("#define TXBUFFERMASK "+str(bufferSizeTX-1)+" // (max 255)"). This looks like some Python code that writes C (always fun reading code-that-writes-code), but if I'm reading this right it means that at compile-time TXBUFFERMASK is defined as one smaller than bufferSizeTX.
    17. ... and bufferSizeTX is defined a few lines above as:

    if LINUX:
      bufferSizeIO = 256
      bufferSizeTX = 256
      bufferSizeTimer = 16
    else:
      bufferSizeIO = 64 if board.chip["ram"]<20 else 128
      bufferSizeTX = 32 if board.chip["ram"]<20 else 128
      bufferSizeTimer = 4 if board.chip["ram"]<20 else 16
    

    Phew!

    So long story short it looks like the size of the Serial output buffer is defined at compile time based on the RAM size of the chip.

    18. Tracking back up the file we find that board is imported via board = importlib.import_module(boardname);, and boardname is passed in as a command-line parameter to the build. Crap.

    19. Ahah - thank god for comments:

    Reads board information from boards/BOARDNAME.py and uses it to generate a header file which describes the available peripherals on the board

    So let's take a look in the boards folder and see if we can spot anything that looks like our chip.

    20. I don't know about you, but I have a good feeling about STM32F401CDISCOVERY.py for an STM32F401CDU6 chip, so let's look in there.

    21. Booyah! I spy a chip object with a ram member with a value of 64. Plugging that into our lookup table above, board.chip["ram"] (64) is certainly not <20, which means bufferSizeTX is defined as 128.

    That makes our output buffer for serial data a surprisingly-small 128 unsigned chars big.
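To make the arithmetic in the steps above easy to re-check, here is the quoted Python lookup transliterated into JavaScript. The function and property names are made up for this sketch; the numbers come straight from the build script snippet quoted in the thread, and the RAM value of 64 is the one read from the board file in step 21. The nextIndex() helper shows the usual reason a buffer is described by a mask of size minus one (wrapping an index with a bitwise AND); that this is how the firmware advances its indices is an assumption here, not something shown in the quoted code.

```javascript
// Mirrors the quoted build_platform_config.py logic.
// ramKB corresponds to board.chip["ram"].
function bufferSizes(isLinux, ramKB) {
  if (isLinux) return { io: 256, tx: 256, timer: 16 };
  var small = ramKB < 20;
  return {
    io: small ? 64 : 128,
    tx: small ? 32 : 128,
    timer: small ? 4 : 16
  };
}

var sizes = bufferSizes(false, 64); // value found in the board file
var txBufferMask = sizes.tx - 1;    // what TXBUFFERMASK would be defined as

// Typical ring-buffer index step when the size is a power of two.
function nextIndex(index, mask) {
  return (index + 1) & mask;
}
```

With these inputs sizes.tx is 128 and txBufferMask is 127, matching the conclusion above.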

    That seems way too small for something that only starts choking on writes of a couple of K or more, so perhaps there's another buffer somewhere, further up the stack that I've missed.

  • Hi - sorry I'm a bit late to reply here. You really did have a good look in to it!

    The output buffer is relatively small, yes - but it won't cause a crash or any kind of problem if that is overflowed - the function will just take a longer time to execute as it will wait for the buffer to become non-full.

    For example:

    Serial1.print("Hello"); // returns immediately
    Serial1.print("Really long string .... ..."/* and so on */); // returns as soon as string length-128 characters have been transmitted (as then the rest is in the buffer)
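As a rough feel for how long such a blocking print() might take, the sketch below estimates the time needed to drain everything that does not fit in the 128-character buffer. It assumes 10 bits on the wire per character (1 start bit, 8 data bits, 1 stop bit, no parity) and ignores any other overhead; the function name is invented for this example.

```javascript
// Estimated seconds that print() would block for a string of the given
// length, assuming 10 bits per character on the wire.
function blockingSeconds(length, baud, txBufferSize) {
  var bitsPerChar = 10;
  var mustDrain = Math.max(0, length - txBufferSize);
  return mustDrain * bitsPerChar / baud;
}

var short = blockingSeconds(5, 9600, 128);       // fits in the buffer
var at28800 = blockingSeconds(4096, 28800, 128); // roughly 1.4 seconds
var at115200 = blockingSeconds(4096, 115200, 128);
```

Under these assumptions a 4 KB string at 28800 baud blocks for a little over a second, and raising the baud rate shortens the wait proportionally.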
    

    Do you have up to date firmware? Does pressing Ctrl-C break out of it?

    Please could you try and post up a really simple bit of code that crashes the Pico? Something I could just stick in the RHS of the IDE and try myself?

    The only thing that immediately sticks out with what you're doing is:

    var strRepresentation = JSON.stringify(obj);
    Serial1.print(strRepresentation + '\u0004');
    

    Adding a character on to the end of strRepresentation will cause a new variable to be created. If strRepresentation really is big then it could use up all your memory (but it still shouldn't crash - it should warn you of low memory).

    To get around it you'd be better off doing 2 prints:

    Serial1.print(strRepresentation);
    Serial1.print('\u0004');
    
  • Just to add: on the receive side, you can check E.getErrorFlags() to see if the receive buffer has overflowed (which happens if your on('data', ...) handler can't process characters fast enough and the 512 byte input buffer gets full) - but for what you're doing that's super unlikely - I'd only expect potential problems if you go far above 115200 baud or do some hugely long calculation in the data handler.
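A small helper along these lines could make the result of E.getErrorFlags() easier to act on. The flag names used here ('FIFO_FULL', 'BUFFER_FULL', 'UART_OVERFLOW', 'LOW_MEMORY', 'MEMORY') are written from memory of the Espruino reference and should be treated as assumptions to check against the documentation for your firmware; the helper name is invented for this sketch.

```javascript
// flags: the array returned by E.getErrorFlags() (flag names are assumed).
function summariseErrorFlags(flags) {
  function has(name) { return flags.indexOf(name) !== -1; }
  return {
    // Received characters were dropped somewhere on the way in.
    dataLost: has('FIFO_FULL') || has('BUFFER_FULL') || has('UART_OVERFLOW'),
    // The interpreter reported memory pressure.
    memoryTrouble: has('LOW_MEMORY') || has('MEMORY')
  };
}
```

On the device it might be called as summariseErrorFlags(E.getErrorFlags()) after a transfer.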

  • Hi @Gordon - thanks for your pointers.

    To answer your questions:

    1. Both my Picos are running v1.91 of the firmware (I believe the latest version as of posting)
    2. Pressing Ctrl-C does not break out of the unresponsive state, and even clicking to disconnect the Web IDE doesn't work (or at least, takes so long to respond that it appears to be unresponsive). The only thing I've found that works is power-cycling the Pico, at which point the Web IDE auto-disconnects. When the Pico is unresponsive the right hand side of the web IDE works as normal, but the left hand side doesn't register any keypresses, and obviously flashing the chip/clicking disconnect does nothing.

    I took on board what you said about implicit redeclarations/copies of variables (dammit - too long writing JS and too many years since I wrote any C...), so I tried replacing the single concatenation-plus-print line with two separate prints instead... and suddenly the sending Pico seems to be pretty consistently completing the transfer without crashing, so perhaps my problem was indeed a memory issue caused by marshalling the data in memory, rather than a problem punting the data through USART as I suspected (now my receiving Pico's crashing instead, but I suspect it's probably a similar RAM-based problem on that end, too - needs further investigating).

    I didn't see any OOM error messages anywhere and the issue definitely wasn't handled "gracefully" in the REPL/Web IDE, so I'll keep investigating with my old (crashy) code, and I'll see if I can produce a minimal test-case for you in the IDE that reliably crashes the Pico, even if it does just turn out to be an OOM problem.

    Thanks for the pointer about E.getErrorFlags() - that and E.getSizeOf() look like a couple of amazingly useful additions to my debugging toolbox, so thanks again for pointing them out to me!

    I'll keep investigating given what you've told me, and I'll report back anything of interest I find in this thread. Thanks again!

  • Great - thanks!

    If you do get a repeatable test case I'd love to know - I try very hard to make sure that Espruino never crashes and becomes totally unresponsive, so if it is a problem I'd like to try and get it sorted as soon as I can :)

  • EDIT: I've removed this comment because it was misleading/incorrect.

  • @Gordon: Ah - wait. It has nothing to do with the Serial connection after all - it's something to do with passing a variable through JSON.stringify() and then concatenating to it.

    The simplest/most convenient test-code that replicates it is as follows - flash it to the Pico chip and run go() - it just sets up a serial connection on B6/B7, kicks a chunk of data down it and waits for the transmission to finish, setting the LEDs at each stage to indicate status.

    var setBusy = function(busy) {
      digitalWrite(LED1, 0+busy);  // Red = busy
      digitalWrite(LED2, 0+(!busy));  // Green = waiting
    };
    
    var blockData = "f3+AgICAgICAgICBgYGBgYGBgoKBgoGBgY... any huge chunk of text data here";
    
    function go() {
      USB.setConsole();
    
      setBusy(true);
    
      console.log(blockData.length);
    
      JSON.stringify(blockData) + 'a';
    
      setBusy(false);
    }
    
    setBusy(false);
    

    The important factors here seem to be the following:

    1. The fact that the JSON.stringify()'d string is concatenated with another string afterwards

    If I omit the concatenation I can send arbitrarily large (20k+) messages without any problems. If I concatenate it with another string as per the code above, anything over 3680 characters causes the REPL to become unresponsive.

    That naively suggests to me it's not "just" an OOM problem in the JS code, as surely reserving/sending one 20k string is going to use more memory than even two or three implicit copies of a 3.6k one?

    2. The length of blockData in chars

    Assuming the concatenation as per the code above, the following boundaries seem to apply to the length of blockData:

      • 4029 or more = REPL unresponsive, LED stays red (indicating JS processing is stopping immediately after JSON.stringify()?)
      • 3681 - 4028 = REPL unresponsive, LED turns green after transmission finishes (so something's killing the REPL, but the rest of the JS is executing correctly?)
      • 3680 or fewer = LED turns green after transmission ends, REPL remains responsive after code finished executing

    I'm assuming that this is some sort of bug because from what you're saying I should be seeing errors thrown for OOM, and the REPL shouldn't stop responding under any circumstances.

    Is that any help?

  • Yes, that's perfect - thanks! This looks like an internal interpreter problem then - but I'm amazingly surprised something like this has surfaced, since it would appear to be such a common thing to do.

    I'll take a look at this today and see if I can figure out what's wrong.

  • Ok - good news! Looks like this was already fixed and will be in the 1v92 firmware.

    If you do 'advanced flash' from the Web IDE with this url: http://www.espruino.com/binaries/travis/658c82fe913e6f4283ded26c8cfd02ba997d8267/espruino_1v91.480_pico_1r3.bin (it's a build from the absolute latest in GitHub) then it should work great.

    Sorry it's caused you so many troubles - I've been meaning to get 1v92 out for a while, but have been delayed a bit by wanting to get an issue with Puck.js and Windows HID fixed.

  • That's awesome - thanks @Gordon!

    It didn't cause me much hassle in the end - it was more a useful learning experience. I'm just happy it was a legit bug and not something obvious I'd got wrong!

    Glad to hear it's already fixed in v1.92, and I can easily work around it for now, now that I know what causes it - as you pointed out, it's better to avoid implicit memory allocations from concatenating vars anyway, so I'll just call Serial1.print() twice.

    Thanks again - I just wanted to say I love the open ethos of Espruino (code, hardware, etc), and really appreciate how helpful and responsive you are to the community - even to an idiot newbie just getting started on embedded development. ;-)

  • @jtq, I see in your code that you're using JSON.stringify() with text as a parameter. Doesn't this usually take a JavaScript object as input?

  • Yeah - in my actual code I'm passing a complex JSON object that gets stringified for transmission over the serial port - the code above is a minimal test-case for Gordon, designed specifically to trigger the weird interpreter behaviour I'd noticed.

    The interpreter bug that causes the crash apparently depends only on how long the input to JSON.stringify() is (not what the actual type of the input is), so I simplified it down to a string for this test-case (and also because it's far easier to truncate a string to change its size during repeated testing than to start editing individual bytes out of complex JSON objects). ;-)

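    But you can legitimately pass a plain string (or any primitive JS value) into JSON.stringify(), and it'll just pass back the JSON text for that value (for a string, the same characters wrapped in double quotes, with any special characters escaped).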

  • Thank you for the link! Believe it or not, after 20+ years of JS I'm still learning new stuff.

    Didn't realize this, but you can also reformat/prettify JSON text by doing something like JSON.stringify(myJSON, null, 2). Very handy.
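Both points from the last few posts, passing a primitive to JSON.stringify() and using its third argument to pretty-print, can be seen in a few lines of standard JavaScript. The variable names are just for illustration.

```javascript
// A string comes back as JSON text: the same characters inside double quotes.
var asJson = JSON.stringify('hello');   // the 7-character string "hello" with quotes
var roundTrip = JSON.parse(asJson);     // back to the original 5 characters

// Other primitives work too.
var num = JSON.stringify(42);           // '42'
var flag = JSON.stringify(true);        // 'true'

// The third argument is the indent width used for pretty-printing.
var pretty = JSON.stringify({ a: 1, b: [2, 3] }, null, 2);
```

Note that pretty-printed output contains newlines, which matters if a transfer protocol uses newline as a delimiter.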

Posted by @jtq