How do I send part of an ArrayBuffer to SPI

  • I am developing a module for the ST7735S display on the M5Stick-C and have managed to get fairly reasonable performance using only JavaScript. The module supports the paletted-graphics drawImage routine to avoid having to do complete screen updates. I will post how to access this soon.

    To reduce the overall storage requirement, images are unpaletted and written to the device in chunks. I want to reduce the amount of buffer allocation and use only one permanently allocated buffer, which holds a chunk of 16-bit pixel data to write to SPI in one operation. The problem is that the last chunk will in general be a fragment, so I would like to write only part of the chunk buffer. I tried using a typed-array view as follows:

     if (remnt>0){
         //remnt is the fragment size 
         var b =new Uint16Array(g.chunkbuf.buffer,0,remnt);       
         E.mapInPlace(new Uint8Array(img.buffer, chunks*CHUNKSIZE*img.bpp/8, remnt), b, img.palette, img.bpp);
         spi.write(b.buffer);
    }   
    

    This does not work as b.buffer is in fact g.chunkbuf.buffer and consequently, the spi.write sends the whole chunk buffer rather than part of it. I have had to resort to allocating a new buffer for fragments as in var b =new Uint16Array(remnt). This allocates a new buffer each time which I am concerned may lead to problems with storage allocation due to fragmentation. Have I missed something obvious as to how to solve this? Ideally, I would like spi.write() to have a size parameter.
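
    A minimal sketch of the behaviour described above, using plain typed arrays. CHUNK_PIXELS and the sizes are illustrative assumptions, not values from the driver. The view covers only the fragment, but its .buffer property is still the whole underlying ArrayBuffer:

```javascript
// Stand-in for g.chunkbuf: one permanently allocated chunk of 16-bit pixels.
// CHUNK_PIXELS and remnt are illustrative values, not taken from the driver.
const CHUNK_PIXELS = 8;
const chunkbuf = new Uint16Array(CHUNK_PIXELS);
const remnt = 3; // pixels in the final, partial chunk

// A view over the first `remnt` pixels; no new pixel storage is allocated.
const b = new Uint16Array(chunkbuf.buffer, 0, remnt);

const viewBytes = b.byteLength;          // bytes covered by the view
const bufferBytes = b.buffer.byteLength; // bytes in the whole underlying buffer
const sameBuffer = b.buffer === chunkbuf.buffer;

console.log(viewBytes, bufferBytes, sameBuffer); // 6 16 true
```

    Passing b.buffer therefore hands over all 16 bytes, while the view itself describes only 6.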

  • Thanks, that is really neat. I will try to compare the performance with my driver which uses drawImage. I suspect the performance of drawImage implemented on top of single pixel operations will not be great.

    I still think it would be generally useful to be able to write parts of flat buffers to SPI.

  • suspect the performance of drawImage implemented on top of single pixel operations will not be great.

    Well, it's using C code and spi_send_many, which makes it extremely fast, as you can see in this video.

    https://youtu.be/SupaV3gxiq0

  • That's very impressive. I have just managed to build ESP32 firmware including it using your excellent tutorial. I think spi_lcd_unbuf should be added to the standard ESP32 board as most people will want to have a display.

  • Well, Gordon was generous enough to let me add this library, and he also refactored and reworded the tutorials.

    So don't forget to donate and help improve the ESP32 build.

  • This caused me a little grief with your custom board tutorial:

     # Add SPI_LCD_UNBUF to handle 16bit color tft lcd displays
                'SPI_LCD_UNBUF'
    

    Should be LCD_SPI_UNBUF!

  • Hmm, yes, that has to be changed. Thanks for pointing it out.

    Edit: Just created a PR.

  • Hi, I have now managed to do a comparison. See video below:

    https://youtu.be/NHtWqIaIGDY

    The numbers in red are the lcd_spi_unbuf driver and the numbers in green are the version I wrote that directly implements drawImage. The application code for both is identical and is trying to increment and display the number every 100ms:

    // display incrementing number
    var pal2color = new Uint16Array([0x0000,0xF100]);
    var buf2 = Graphics.createArrayBuffer(20,64,1,{msb:true});
    buf2.setRotation(3);
    buf2.setColor(1);
    buf2.setFont("Vector",20);
    
    var N = 0;
    function drawNumber() {
       buf2.drawString(N,0,0);
       lcd.drawImage({width:20,height:64,bpp:1,buffer:buf2.buffer, palette:pal2color},30,50);
       buf2.clear();
       ++N;
       if (N>999) N = 0;
    }
    setInterval(drawNumber,100);
    

    The drawImage-based driver (green numbers) appears to be over twice as fast as the lcd_spi_unbuf version. I have cheated a bit in that I chose an image size that does not create a fragment. My conclusion is that I will try to combine these two drivers to get the best of both. I still think it is important to solve the original issue I mentioned: being able to write only part of a buffer to SPI.

    You can see the full code here

  • Thanks for sharing.

    Yep, of course drawImage() is always faster than drawString() with a vector font.

    The video in post #4 uses setFont("6x8",4) because it is much faster than the vector font, and with the last drawString() argument set to true no clear is needed, so this is similar to drawImage().

    • drawImage() can benefit from spi_send_many
    • lcd.setFont("Vector",20).drawString() looks like it is drawing pixel by pixel.

    My conclusion:

    • use drawImage when using vector font
    • split your screen in small sections and use drawImage to update them
  • Hmm, not sure any more.

    I tested an SPI ILI9341 LCD with 320x240 pixels at an interval of 0.1 s, and it's not as slow as in the video in post #9.

    https://youtu.be/OHwegdm01xw

            g.setColor(0xffff).setFont("Vector", 40).clearRect(120,100,200,140).drawString(times, 120, 100, true);
            g.setColor(0xffff).setFont("6x8", 4).drawString(times, 120, 160, true);
    
  • This does not work as b.buffer is in fact g.chunkbuf.buffer and consequently, the spi.write sends the whole chunk buffer rather than part of it.

    I can't easily verify spi.write since it does not return anything, but with spi.send it works as expected, at least on nRF52. See this:

    >var fc=new SPI();fc.setup({sck:D29,miso:D31,mosi:D30,mode:0});
    =undefined
    >fid=Uint8Array([0x9f,0,0,0])
    =new Uint8Array([159, 0, 0, 0])
    >fid2=Uint8Array(fid.buffer,0,2)
    =new Uint8Array([159, 0])
    >fc.send(fid,D27);
    =new Uint8Array([0, 133, 96, 21])
    >fc.send(fid2,D27);
    =new Uint8Array([0, 133])
    

    so it can be seen only two bytes are sent in second case. Maybe don't use spi.write(b.buffer); but spi.write(b); ?

  • Thanks, I did try that and I think you are correct in that now that I have redone the experiment it does seem to send the right amount of data. However, it seems to scramble the pixel data in some way that I have not yet worked out:-(

    Will investigate further (tbc) ........

  • Finally got it to work - perhaps obvious in retrospect!

    E.mapInPlace(
     new Uint8Array(img.buffer, chunks*CHUNKSIZE*img.bpp/8, remnt),  
     g.chunkbuf,  img.palette,  img.bpp
    );
    spi.write(new Int8Array(g.chunkbuf.buffer,0,remnt*2));
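
    A hedged sketch of why a byte view of length remnt*2 is the right shape. fakeSpiWrite is a hypothetical stand-in that sends one byte per array element; that element-wise behaviour is an assumption made for illustration, not something confirmed in this thread. Under that assumption a 16-bit view loses the high byte of every pixel, while a byte view over the same memory sends both bytes:

```javascript
// Hypothetical stand-in for an SPI port: records each array element as one byte.
// This mimics a byte-oriented write; it is an assumption, not Espruino's actual code.
function fakeSpiWrite(data) {
  const sent = [];
  for (let i = 0; i < data.length; i++) sent.push(data[i] & 0xFF);
  return sent;
}

const chunkbuf = new Uint16Array(4);    // stand-in for g.chunkbuf (4 pixels)
chunkbuf.set([0x1234, 0xABCD, 0x0F0F]); // 3 valid pixels, 1 unused slot
const remnt = 3;                        // fragment size in pixels

// Element-wise over 16-bit values: one byte per pixel, high bytes lost.
const perPixel = fakeSpiWrite(new Uint16Array(chunkbuf.buffer, 0, remnt));

// Byte view over the same memory: remnt * 2 bytes, nothing lost.
const perByte = fakeSpiWrite(new Uint8Array(chunkbuf.buffer, 0, remnt * 2));

console.log(perPixel.length, perByte.length); // 3 6
```

    The sketch uses Uint8Array where the post uses Int8Array; both views cover the same bytes, and neither allocates new pixel storage.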
    
  • Hey, thanks for your update. It looks like a great improvement for flicker-free vector font drawing.

    So why not add lcd_spi_unbuf_drawImage() to lcd_spi_unbuf.c to make it even faster?

  • I think that’s a great idea and I will certainly have a go at it when time permits.

  • So why not add lcd_spi_unbuf_drawImage

    I probably wouldn't want to merge something in if it was a specific hack for drawImage just for lcd_spi_unbuf.c.

    There are some really easy wins for lcd_spi_unbuf.c speed though. Something like:

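    // Position of the last pixel written; -2 is a sentinel that forces an
    // address-window setup on the first call.
    int lastx = -2, lasty = -2;
    
    void lcd_spi_unbuf_setPixel(JsGraphics *gfx, int x, int y, unsigned int col) {
      // Swap the two bytes of the 16-bit colour before sending
      uint16_t color = (col>>8) | (col<<8);
      jshPinSetValue(_pin_cs, 0);
      if (x!=lastx+1 || y!=lasty) {
        // Not the next pixel on the same row: set a new address window
        disp_spi_transfer_addrwin(x, y, LCD_WIDTH, y+1);
        lastx = x;
        lasty = y;
      } else lastx++; // contiguous pixel: reuse the current window
      spi_data((uint8_t *)&color, 2);
      jshPinSetValue(_pin_cs, 1);
    }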
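
    To get a feel for the saving, here is a small simulation of the lastx/lasty check. It only counts how often an address window would be set. It assumes the unoptimised path sets a window for every pixel, and countAddrWindows is a hypothetical helper, not part of the driver:

```javascript
// Count address-window commands for a stream of pixel writes, using the same
// "next pixel on the same row" test. Purely a simulation; no hardware involved.
function countAddrWindows(pixels) {
  let lastx = -2, lasty = -2; // sentinel so the first pixel always sets a window
  let windows = 0;
  for (const [x, y] of pixels) {
    if (x !== lastx + 1 || y !== lasty) {
      windows++;   // would call disp_spi_transfer_addrwin(...)
      lastx = x;
      lasty = y;
    } else {
      lastx++;     // contiguous pixel: data only
    }
  }
  return windows;
}

// A 20x3 filled rectangle drawn row by row, left to right.
const rect = [];
for (let y = 0; y < 3; y++)
  for (let x = 0; x < 20; x++) rect.push([x, y]);

const optimised = countAddrWindows(rect); // one window per row
const naive = rect.length;                // one window per pixel (assumed baseline)

console.log(optimised, naive); // 3 60
```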
    
  • Having looked at the Espruino graphics C code, I can see that there is no provision for an image bit-blit style callback as there is for pixels and rectangles, so I completely agree that it’s best left as it is. I am impressed that Espruino achieves reasonable performance for drawImage using only JavaScript.

  • There are some really easy wins for lcd_spi_unbuf.c speed though.

    Thanks, so let me run some tests and come up with a pr.

  • There are some really easy wins for lcd_spi_unbuf.c speed though.

    Is there anything else you'd like to point out to improve performance?

  • That's the main one and you should see some decent speed improvements with that.

    The other one would be to move to DMA. If you reserved 2 LCD_WIDTH length buffers then you could:

    • SPI write immediately on the first pixel received (or if doing fillrect)
    • If in progress, write subsequent pixels (on the same line) into the second buffer
    • When the first DMA finishes, kick off DMA for buffer #2
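
    The three steps above can be sketched as a simulation. startDma and dmaDone are hypothetical stand-ins for a real DMA controller and its completion interrupt, LCD_WIDTH is an illustrative value, and there is no overflow handling if more than LCD_WIDTH pixels are queued:

```javascript
// Simulated ping-pong buffering. Transfers are only recorded in `log`.
const LCD_WIDTH = 8;
const bufs = [new Uint16Array(LCD_WIDTH), new Uint16Array(LCD_WIDTH)];
let active = 0;   // index of the buffer currently owned by "DMA"
let fill = 0;     // pixels queued in the other buffer
let busy = false; // true while a transfer is in flight
const log = [];   // each entry: the pixels sent in one transfer

function startDma(buf, n) {
  busy = true;
  log.push(Array.from(buf.subarray(0, n)));
}

function writePixel(col) {
  if (!busy) {                 // idle: send this pixel straight away
    bufs[active][0] = col;
    startDma(bufs[active], 1);
  } else {                     // in progress: queue into the other buffer
    bufs[1 - active][fill++] = col;
  }
}

function dmaDone() {           // would be the DMA-complete interrupt
  busy = false;
  if (fill > 0) {              // swap buffers and send what accumulated
    active = 1 - active;
    const n = fill;
    fill = 0;
    startDma(bufs[active], n);
  }
}

writePixel(1);                 // starts transfer #1 immediately
writePixel(2); writePixel(3);  // queued while #1 is in flight
dmaDone();                     // #1 finished: transfer #2 carries 2 and 3
writePixel(4);                 // queued while #2 is in flight
dmaDone();                     // #2 finished: transfer #3 carries 4
dmaDone();                     // nothing queued: go idle

console.log(JSON.stringify(log)); // [[1],[2,3],[4]]
```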

    I believe this is the sort of thing @fanoush already tried when he ported to some watches with SPI displays - you can effectively be doing the transmission at the same time as working out the next stuff to draw.

  • Yes, but it can be done because my driver does two things (in parallel) when updating some rectangular area

    1. palette conversion (any palette depth to 8, 12 or 16 bits RGB)
    2. sending pixel data

    If you keep those steps separate (palette conversion via E.mapInPlace or drawImage into a buffer, then the lcd_spi_unbuf driver sending the area over SPI), you cannot do them in parallel.

    So it comes back to

    So why not add lcd_spi_unbuf_drawImage() to lcd_spi_unbuf.c to make it even faster?

    and

    I probably wouldn't want to merge something in if it was a specific hack for drawImage just for lcd_spi_unbuf.c

    This is a specific optimization for a DMA-based (in this case SPI) graphics driver: draw an image from any bpp/palette source into a destination rectangle of a different bit depth (mainly RGB). So adding this to the unbuffered driver makes sense to me. @Gordon, what version of such code would you merge? If it should reuse some generic palette/bpp conversion graphics code, it would need to be callable in chunks, i.e. 'convert next x bytes from source into this buffer'.

    Or is there some other way to look at it? Maybe it could be hooked into the drawing algorithms directly and be line based? That's not what I did: it first needs to be drawn into a memory buffer completely, and then conversion/blitting is done in g.flip for the modified rectangle. With this unbuffered driver it would be the same, just without a full buffer for the whole screen but only for a smaller rectangular image, so g.flip() would become g.drawImage().

    @jeffmer not sure if you've seen it, the InlineC method of my driver (version for DK08 watch, the destination is 6bit RGB) is here and calling it in g.flip is here

    For the P8 watch and its 240x240 ST7789, the destination is 12-bit RGB (25% less data than 16-bit RGB, so roughly a 33% speedup). The 12-bit mode is also available on the ST7735, but it may be less noticeable on smaller screens; at 240x240 the speedup is visible.
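
    A rough sketch of the arithmetic behind the 12-bit saving. pack12 is a hypothetical helper, and the byte layout (two pixels packed into three bytes) is an assumption based on the usual 12-bit SPI pixel format of these controllers, not code from the driver:

```javascript
// Pack pairs of 12-bit pixels (0xRGB) into 3 bytes each.
// Assumed layout: [R1 G1] [B1 R2] [G2 B2].
function pack12(pixels) {
  const out = new Uint8Array(Math.ceil(pixels.length * 3 / 2));
  let o = 0;
  for (let i = 0; i < pixels.length; i += 2) {
    const a = pixels[i] & 0xFFF;
    const b = (i + 1 < pixels.length ? pixels[i + 1] : 0) & 0xFFF;
    out[o++] = a >> 4;                       // R1 G1
    out[o++] = ((a & 0xF) << 4) | (b >> 8);  // B1 R2
    if (o < out.length) out[o++] = b & 0xFF; // G2 B2 (absent for a trailing odd pixel)
  }
  return out;
}

// Full-screen byte counts for a 240x240 display.
const bytes16 = 240 * 240 * 2;                             // 16-bit RGB
const bytes12 = pack12(new Uint16Array(240 * 240)).length; // 12-bit RGB

console.log(bytes16, bytes12); // 115200 86400
```

    86400 / 115200 = 0.75, so a full-screen update moves a quarter less data, which corresponds to about a third more pixels per second at the same SPI clock.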

  • There are some very specific hacks for Bangle.js graphics: https://github.com/espruino/Espruino/blob/master/libs/graphics/lcd_st7789_8bit.h#L52

    But I'd rather not have IFDEFs piled up on top of each other. I guess what we need is to have a more generic way of adding optimisations.

    If you want to do optimisation on the Graphics library it'd be great - but ideally try and optimise it such that it benefits everyone, not just one specific use case that you have to compile custom firmware for.

  • I updated my copy of lcd_spi_unbuf with the optimisation suggested by Gordon. I did the comparison for drawImage again and the lcd_spi_unbuf version is now as fast if not faster than my original driver. It’s good as I can now save heap by dispensing with the chunk buffer.

    @fanoush - yes, I had a look before - it’s really neat.

  • There are some really easy wins for lcd_spi_unbuf.c speed though.

    Just created PR 1924 for this.

Posted by @jeffmer
