I implemented something similar, but with larger chunks (1 to 10 kBytes); sending just 16 bits per DMA transfer is very slow. For transfers that small it might even be better to write directly to SPI TX? Maybe DMA double-buffer mode (DBM) helps; I have not tried it yet. But I am not sure whether 16 bits are enough to get rid of the inter-byte gap (caused by the CPU load of polling for DMA ready). I think I will give it a try.
I have identified another performance bottleneck: E.mapInPlace. I think its use of JSVars slows down the lookup. Using specialized functions coded in asm for 1/2/4 bpp is about 20x faster.
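For reference, here is a rough plain-JavaScript sketch of the kind of palette lookup meant here: packed 1/2/4 bpp pixels are unpacked and each index is replaced by a 16-bit colour. It is not the asm version and not the E.mapInPlace implementation; the name `expandPalette`, the MSB-first bit order and the RGB565 palette values are assumptions made for illustration.

```javascript
// Hypothetical helper (not part of Espruino): expand packed 1/2/4 bpp pixel
// indices into 16-bit colours using a palette lookup.
// Assumption: pixels are packed MSB first within each source byte.
function expandPalette(src, bpp, palette, dst) {
  const mask = (1 << bpp) - 1;   // 0x1, 0x3 or 0xF
  let o = 0;                     // write position in dst
  for (let i = 0; i < src.length; i++) {
    const b = src[i];
    // Walk the byte from its top pixel down to its bottom pixel.
    for (let shift = 8 - bpp; shift >= 0; shift -= bpp) {
      dst[o++] = palette[(b >> shift) & mask];
    }
  }
  return dst;
}

// Example: one byte of 2 bpp data holds four pixels (indices 0, 1, 2, 3).
const palette = new Uint16Array([0x0000, 0xF800, 0x07E0, 0xFFFF]); // assumed RGB565
const src = new Uint8Array([0b00011011]);
const out = expandPalette(src, 2, palette, new Uint16Array(4));
console.log(Array.from(out)); // [0, 63488, 2016, 65535]
```

The inner loop is the part a specialized routine per bit depth would unroll, since shift and mask become constants for each of the 1, 2 and 4 bpp cases.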
Just a thought - you could write some assembler code that does basically what the ILI9341pal driver does, but with DMA:
Obviously you've got your current solution with the nice fonts, so it's not a big deal, but that could end up being really interesting.