Is setting the pretokenise flag advisable?

  • I've stumbled upon the pretokenise flag (E.setFlags({pretokenise:1})) and I'm wondering if it's worth setting it on an Espruino Puck or WiFi.
    As far as I can tell, pretokenised JS code no longer needs to be eval'ed, which has the potential to speed up execution significantly.
    Does it accelerate the execution of long-running code, or has the code already been pretokenised after a while?
    Are there any gotchas apart from the readability and changeability of the tokenised code?
    Any comments are appreciated.

    • Steffen

  • With pretokenise, Espruino still ends up having to do some evaluation, but effectively what happens is all whitespace gets removed and all the keywords get squashed down into a single byte. So you're getting a speedup because it's not having to parse all the keywords, but you're also getting better memory usage - think of it as super-minification.

    The code's pretokenised at upload time, so you're going to see speed improvements for all execution, even long running stuff.

    Gotchas? Readability when using dump() or in stack traces, since all whitespace is removed, and I just don't think it's had quite the same level of testing. If you hit a problem with it that you can narrow down to a small piece of code, I'm happy to try and fix it.

  • Sounds good, I'll give it a try. Thanks for the quick reply.

  • It doesn't seem to work, either from the console or from the IDE.
    After

    E.setFlags({pretokenise:1})
    function test() {print('Hallo')}
    dump()
    

    function test ... is shown as plain text, no tokens.
    The output of E.getFlags() is

    >E.getFlags()
    ={ deepSleep: 0, pretokenise: 1, unsafeFlash: 0, unsyncFiles: 0 }
    

    Version is 2v02, Espruino Wifi.
    There seems to be something I'm missing.

  • There won't be tokens, because Espruino automatically converts them back to readable text before output - also the function you uploaded doesn't actually contain any reserved words :)

    Try:

    E.setFlags({pretokenise:1});
    function test() {
      for(var i=0;i<10;i++) {
        print(i);
      }
    }
    dump();
    

    This prints:

    function test() {for(var i=0;i<10;i++){print(i);}}
    E.setFlags({ "deepSleep": 0, "pretokenise": 1, "unsafeFlash": 0, "unsyncFiles": 0 });
    

    You'll notice all the whitespace is gone - that happened inside Espruino rather than being anything the IDE did.

    Also if you really want to dive around in the internals:

    >trace(test)
    #30[r1,l1] Function {
      #29[r1,l2] Name String [1 blocks] "\xFFcod"    #38[r1,l0] FlatString [3 blocks] "\xA7(\xACi=0;i<10;i\x98){print(i);}"
      #22[r1,l2] Name String [1 blocks] "\xFFlin"= int 3
    }
    

    So the function code is now "\xA7(\xACi=0;i<10;i\x98){print(i);}", and you can see that for, var and ++ have been replaced by character codes above 127.
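The substitution visible in that trace can be mimicked in plain JavaScript. This is only a toy sketch, not Espruino's actual tokeniser: toyPretokenise and TOKENS are hypothetical names, the table holds just the three codes that appear in the trace above (for, var and ++), and string literals and comments are not handled at all.

```javascript
// Token codes copied from the trace output above; the table is deliberately incomplete.
const TOKENS = new Map([
  ["for", 0xA7],
  ["var", 0xAC],
  ["++", 0x98]
]);

// Toy pretokeniser: drops whitespace and replaces known keywords/operators
// with a single character whose code is above 127.
// Limitation: removing all whitespace would glue together two adjacent words
// that are not in the table, so this is for illustration only.
function toyPretokenise(src) {
  let out = "";
  let i = 0;
  while (i < src.length) {
    const ch = src[i];
    if (/\s/.test(ch)) { // skip whitespace entirely
      i++;
      continue;
    }
    if (/[A-Za-z_$]/.test(ch)) { // read a whole identifier or keyword
      let j = i;
      while (j < src.length && /[A-Za-z0-9_$]/.test(src[j])) j++;
      const word = src.slice(i, j);
      out += TOKENS.has(word) ? String.fromCharCode(TOKENS.get(word)) : word;
      i = j;
      continue;
    }
    if (src.startsWith("++", i)) { // the only multi-character operator in the table
      out += String.fromCharCode(TOKENS.get("++"));
      i += 2;
      continue;
    }
    out += ch; // everything else is copied through unchanged
    i++;
  }
  return out;
}

const body = "for(var i=0;i<10;i++) {\n  print(i);\n}";
console.log(toyPretokenise(body) === "\xA7(\xACi=0;i<10;i\x98){print(i);}"); // true
```

Fed the body of test(), the sketch produces the same string that trace(test) reported, which is all it is meant to show.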

    Just as a quick test:

    E.setFlags({pretokenise:0});
    function test() {
      var t = getTime();
      for(var i=0;i<10;i++) {    
        for(var j=0;j<500;j++) { /* This is a big comment in my code */ };
      }
      print(getTime()-t,process.memory().usage);
    }
    test();
    E.setFlags({pretokenise:1});
    function test() {
      var t = getTime();
      for(var i=0;i<10;i++) {
        for(var j=0;j<500;j++) { /* This is a big comment in my code */ };
      }
      print(getTime()-t,process.memory().usage);
    }
    test();
    // gives
    0.79599380493 51  // off
    0.70184040069 47 // on
    

    So in a test like that you're roughly 12% faster (about 0.796 s with the flag off vs 0.702 s with it on), and you save a few variables too (51 vs 47).

  • Wow, this is pretty impressive.
    From your example I conclude that I can just put E.setFlags({pretokenise:1}); as the first line in the code passed to E.setBootCode and after the next reboot everything in the boot code and all code I upload later gets tokenized. Is this correct? (please say yes :)

  • Not quite I'm afraid :)

    It works great if you upload to RAM. Anything uploaded after E.setFlags({pretokenise:1}); is sorted - so just stick it at the top of your code.

    However when you do E.setBootCode you're writing the code direct to flash memory from the IDE, and it's actually being executed from there. As a result it can't get tokenised (but then you're not using any RAM to store the code either).

    It's actually an addition that could go into the Web IDE (code could be pre-tokenised in the IDE, so then everything including E.setBootCode would be smaller/faster).
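For the upload-to-RAM case, "stick it at the top of your code" might look like the sketch below. E.setFlags and E.getFlags are the calls already used in this thread; enablePretokenise and countTo are hypothetical names, and the typeof guard is only there so the file can also be loaded outside Espruino, where E does not exist.

```javascript
// Turn the flag on before anything else is defined.
// Returns true only when running on Espruino and the flag reads back as set.
function enablePretokenise() {
  if (typeof E === "undefined" || !E.setFlags) return false; // not on Espruino
  E.setFlags({ pretokenise: 1 });
  return E.getFlags().pretokenise === 1;
}

enablePretokenise();

// Anything defined after the call above is uploaded with the flag already set.
function countTo(n) {
  var sum = 0;
  for (var i = 0; i < n; i++) {
    sum += i;
  }
  return sum;
}
```

As explained above, this helps for code uploaded to RAM; code written with E.setBootCode is executed from flash and is not tokenised on the device.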

  • Ok, so I had a play and just added this to the Web IDE. I haven't updated it yet but I'll try and get it in soon (the files are in GitHub) - should make a big difference in modules with a lot of 'this' usage.

  • ...it - Espruino's progress - cannot get faster than that! Thanks!

  • Just had a glance at it and it looks neat.
    Currently my Espruinos have a small 'bootloader' set via E.setBootCode (I think I understand now how to tokenise it). It receives strings via TCP and evals them - large strings with class definitions, instantiation etc. Does the code processed by eval get pretokenised, provided the flag is set?

    (side note: I once had a Sinclair ZX81 with 1kB of RAM. The code was immediately entered as tokens.)

  • Does the code processed by eval get pretokenised, provided the flag is set?

    It does, yes :) If you were to manually install the command-line tools from https://github.com/espruino/EspruinoTools then do espruino --board ESPRUINOWIFI --config PRETOKENISE=true foo.js -o out.js then you could actually get the code tokenised in a file, which you could serve up over TCP if you wanted :)
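A loader of the kind described above (receive a string over TCP, then eval it) could be sketched roughly as follows. This is not Steffen's actual boot code: makeCollector and startLoader are hypothetical names, the port is whatever you pass in, and the sketch assumes that net.createServer and the socket's "data" and "end" events behave on Espruino the way they do in Node.js.

```javascript
// Collects incoming chunks and hands the complete source text to `run`.
function makeCollector(run) {
  var buffer = "";
  return {
    onData: function (chunk) {
      buffer += chunk; // chunks arrive in order, so simple concatenation is enough
    },
    onEnd: function () {
      var src = buffer;
      buffer = "";
      return run(src); // evaluate once the sender has finished
    }
  };
}

// Hypothetical wiring: one collector per incoming connection.
function startLoader(port) {
  var net = require("net");
  var server = net.createServer(function (socket) {
    var collector = makeCollector(function (src) {
      return eval(src); // with pretokenise set, functions defined here are tokenised too
    });
    socket.on("data", collector.onData);
    socket.on("end", collector.onEnd);
  });
  server.listen(port);
  return server;
}
```

Keeping the buffering separate from the socket wiring means the collector can be exercised without a network connection, for example by calling onData and onEnd by hand.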

  • Glad to hear eval does it :)
    I've actually grabbed your pretokenise.js from EspruinoTools and done something similar already. At first the code was not executable because "return" is defined twice, as LEX_R_BREAK and as LEX_R_RETURN. A return from a function needs to be tokenised as the latter, but the pretokenise routine returns the first one. After obfuscating the token for LEX_R_BREAK (as "XXreturn") everything runs fine.
    The pre-tokenised code uses many of the tokens including class, new, switch/case, try/catch and some bit shift operators.
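The duplicate-entry problem can be reproduced with a small lookup table. The sketch below is not the code from pretokenise.js: the table layout, the lookup function and the numeric codes are all made up for illustration; only the names LEX_R_BREAK and LEX_R_RETURN and the "XXreturn" workaround come from the post above.

```javascript
// Hypothetical table: two entries share the text "return".
// The numeric codes are invented and carry no meaning.
var table = [
  { id: "LEX_R_BREAK", text: "return", code: 1 },
  { id: "LEX_R_RETURN", text: "return", code: 2 }
];

// Returns the first entry whose text matches, which is the behaviour described above.
function lookup(entries, word) {
  for (var i = 0; i < entries.length; i++) {
    if (entries[i].text === word) return entries[i];
  }
  return undefined;
}

console.log(lookup(table, "return").id); // LEX_R_BREAK, the wrong token for a return

// Workaround from the post: make the first entry unmatchable.
table[0].text = "XXreturn";
console.log(lookup(table, "return").id); // LEX_R_RETURN
```

With a first-match lookup, the earlier entry always shadows the later one, so renaming it is enough to let "return" resolve to LEX_R_RETURN.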

Posted by Steffen (@Steffen)