• First of all, duplicating a function under a different name was not my intention.
    The idea was, in the long term, to mark shiftOut and digitalWrite as deprecated and replace them with something like bytePort. The example for bytePort includes an option to have both default handling and family-specific handling. At least, that is the idea. It is also faster at sending data.
    To get a better understanding of where the time goes, I measured the duration of the scan loop while switching off one part after another in the source from the branch Gordon posted.
    The result is below. It looks like shifting out the data is fast, but preparing the options and GPIOs takes a lot of time, at least on ESP32.

    run                                    1       2       3       4       5       6       7       8
    scan loop                              X       X       X       X       X       X       X       X
    shiftout Data(sfnc)                    X       X       X       X       X       X       X
    set options for internal use           X       X       X       X       X       X
    set array of pins                      X       X       X       X       X
    assign output, and pin mask            X       X       X       X
    set clock                              X       X       X
    push data out                          X       X
    set row address(dfnc)                  X
    duration(msec)                    25,300  18,160  14,350  13,820  11,090   9,290   6,780   3,810
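
    To read a per-step cost out of these numbers, one can subtract each duration from the one before it. The sketch below does only that arithmetic. It rests on two assumptions that are not stated in the table itself: that the comma is a decimal separator (25,300 means 25.300 msec), and that each measurement switches off exactly one more step than the previous one, starting with "set row address" and working upwards. The names stepCosts and disabledInOrder are made up for this illustration.

```javascript
// Durations (msec) copied from the table; the first value has every step enabled.
// Assumption: the comma is a decimal separator (25,300 -> 25.300).
const durations = [25.300, 18.160, 14.350, 13.820, 11.090, 9.290, 6.780, 3.810];

// Assumption: one more step is switched off per measurement, in this order.
const disabledInOrder = [
  "set row address(dfnc)",
  "push data out",
  "set clock",
  "assign output, and pin mask",
  "set array of pins",
  "set options for internal use",
  "shiftout Data(sfnc), remaining overhead"
];

// Cost of a step = duration with the step enabled - duration without it.
function stepCosts(durations, steps) {
  const costs = steps.map((name, i) => ({
    step: name,
    // round to 3 decimals to hide floating point noise
    msec: Math.round((durations[i] - durations[i + 1]) * 1000) / 1000
  }));
  // Whatever is left in the last measurement is the bare scan loop.
  costs.push({
    step: "scan loop (everything else)",
    msec: durations[durations.length - 1]
  });
  return costs;
}

const costs = stepCosts(durations, disabledInOrder);
costs.forEach(c => console.log(c.step + ": " + c.msec + " msec"));
```

    If the assumptions hold, the three preparation steps (options, pin array, output and pin mask) can be added up and compared directly with "push data out".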

