• Hi everyone! I was thinking about the following possible workflow in order to automatically run unit tests or smoke tests in an emulator:

    • build a docker image that contains the Espruino IDE (https://espruino.com/ide) or a smaller version of it
    • when a commit in a PR gets pushed, this triggers a GitHub action
    • optionally the smoke tests could also be run just once a day, e.g. at midnight, to save GitHub Actions minutes
    • this GitHub action uses the Espruino IDE docker image and automatically loads the newly added/changed code in the IDE
    • some test cases that check basic functionality of the app run and confirm the changes don't break other code

    When it comes to test cases, some mocking obviously needs to be done, but I think this is perfectly doable. For example, https://www.espruino.com/ReferenceBANGLEJS2#l_E_getTemperature always returns NaN in the emulator; the easiest way to mock it for the test suite would be to call E.getTemperature = function() { return 20; }; at the start of the test suite.

    For example, here's some pseudo code I came up with for activityreminder:

    // Suite setup
    // Mock some functions
    E.getTemperature = function() { return 27; }; // assume the user is always wearing the smart watch
    settings.maxInnactivityMin = 0; // immediately trigger an inactivity warning by setting max inactivity to 0 min
    
    // Test cases
    // wait a few seconds and assert E.showPrompt() got called (which should happen)
    // if not, fail the test or raise a warning
    // (optionally) take a screenshot and include it as an artifact
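
    A minimal sketch of how the pseudo code above might be made concrete. It assumes a plain stand-in object for `E` so that it also runs in Node.js; inside the emulator the real `E` would be patched instead. `checkInactivity` and the prompt text are hypothetical and are not taken from the real activityreminder code.

```javascript
// Stand-in for Espruino's global `E` so this sketch runs in plain Node.js.
// Inside the emulator the real `E` would be patched instead.
const E = {};

// Mocks: record calls instead of touching hardware or the screen.
const promptCalls = [];
E.getTemperature = function () { return 27; }; // pretend the watch is being worn
E.showPrompt = function (message) {
  promptCalls.push(message);
  return Promise.resolve(true);
};

// Hypothetical settings object; the field name follows the pseudo code above.
const settings = { maxInnactivityMin: 0 };

// Hypothetical stand-in for the app logic under test (NOT the real app code):
// warn once the inactive time reaches the configured maximum.
function checkInactivity(inactiveMinutes) {
  if (E.getTemperature() > 25 && inactiveMinutes >= settings.maxInnactivityMin) {
    E.showPrompt("Inactivity detected");
  }
}

// Test case: with a 0 minute limit the prompt should fire immediately.
checkInactivity(0);
const passed = promptCalls.length === 1;
console.log(passed ? "PASS" : "FAIL: E.showPrompt was not called");
```

    Recording the calls in an array keeps the assertion independent of what the prompt looks like on screen.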
    

    Why would this be useful?
    Around 2 weeks ago there was an issue where (if I understood correctly) some firmware changes caused a new date format which broke activityreminder because it still expected/used the old date format. We could detect sudden breaking changes like this one using test cases. For example, the test case above would have failed after the new date format changes, because it couldn't have read the new date format.

    Obviously this is a huge suggestion and is probably low priority, but I'm just sharing my ideas and I'm open to feedback! What are your thoughts about this?

  • There is already something available - it doesn't use the IDE but does use the emulator and run it within Node.js on the command-line - it doesn't need Docker. It's at https://github.com/espruino/BangleApps/blob/master/bin/runapptests.js

    It'll scan through all apps for test files of a certain format, load the app up and try and run through a test script - which eventually could involve comparing screenshots at different points (it's just a matter of including the code from https://github.com/espruino/BangleApps/blob/master/bin/thumbnailer.js).
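
    As a rough illustration of the "scan through all apps" step (this is not the actual runapptests.js code), a sketch could look like this. It assumes each app lives in its own directory and that the test file is called `test.json`; both the function name and the file name are assumptions.

```javascript
const fs = require("fs");
const path = require("path");

// Return [{ app, file }] for every app directory that contains a test file.
// `testFileName` is an assumption for this sketch.
function findAppTests(appsDir, testFileName = "test.json") {
  return fs.readdirSync(appsDir, { withFileTypes: true })
    .filter((entry) => entry.isDirectory())
    .map((entry) => ({
      app: entry.name,
      file: path.join(appsDir, entry.name, testFileName),
    }))
    .filter((candidate) => fs.existsSync(candidate.file));
}
```

    It could then be called as `findAppTests(path.join(__dirname, "..", "apps"))`, with each returned file parsed and handed to the emulator in turn.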

    The thing is though, there was some talk about this and I made the rough framework but nobody seems particularly bothered about actually creating any tests. For something to catch the activityreminder issue the level of tests required across all apps would be pretty high.

    One thing that might be a big improvement for reliability is actually to catch any Errors created on the watch itself (or even within BangleApps) and bring them into some kind of database - but I've been resisting that as I wonder if there might be privacy implications if somehow personal data could be included in the errors.

  • Thanks for the info. If this already exists, cool, no need to build any new images then. And yes, I get the point that automatically catching and uploading errors would definitely be a problem regarding privacy.

  • I had done this a while ago: https://github.com/espruino/BangleApps/blob/master/apps/android/test.js

    Every step in that would probably map nicely to a test object in the json. I think I could refactor that.

    Maybe some of the asserts would be useful as a teststep in the apptests? I could try to integrate those.

    How does the emulator handle "hardware" like GPS? If it does not, I probably need to find a new way to track the state of the "internal" GPS instead of checking the pin mode like it does currently.

  • I think I could refactor that.

    That would be great!

    Maybe some of the asserts would be useful as a teststep in the apptests?

    Yes, absolutely - I was expecting that as tests got added we might want to have some different steps in there.

    How does the emulator handle "hardware" like GPS?

    Right now it doesn't, but I guess a good option would be to have a test step like GPS that actually just called Bangle.emit("GPS",{....}) to fake the GPS, so then you know exactly which coordinates are being fed in and can then check that the code is doing the right thing.

    As I understand it, right now things like GPS power are always off in the emulator, but maybe that could be changed, or we just upload some boilerplate before the test that replaces Bangle.setGPSPower() with some JS that does what we want.
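
    A minimal sketch of both ideas, assuming a Node.js `EventEmitter` as a stand-in for the global `Bangle` object (on the watch or in the emulator, `Bangle` already provides `on()` and `emit()`). The code under test and the coordinates are made up for illustration.

```javascript
const EventEmitter = require("events");

// Stand-in for the global `Bangle` object so the sketch runs in Node.js.
const Bangle = new EventEmitter();

// Boilerplate uploaded before the test: replace the power call with a
// version that only records the requested state.
let gpsOn = false;
Bangle.setGPSPower = function (isOn, appId) {
  gpsOn = !!isOn;
  return gpsOn;
};
Bangle.isGPSOn = function () { return gpsOn; };

// Code under test (hypothetical): remember the last valid fix.
let lastFix = null;
Bangle.on("GPS", function (fix) {
  if (fix.fix) lastFix = { lat: fix.lat, lon: fix.lon };
});

// A "GPS" test step: power on, then feed in known coordinates.
Bangle.setGPSPower(1, "test");
Bangle.emit("GPS", { fix: 1, lat: 51.5, lon: -0.12, satellites: 7 });

console.log(Bangle.isGPSOn(), lastFix);
```

    Because the test controls exactly which fix is emitted, the assertions can check the app's reaction against known coordinates.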

  • I've had a play around porting the first of the test cases to json:

    https://gist.github.com/halemmerich/3d0f1d41d8eebe76eedc8553530e5dd5

    I think this should be all the asserts/features I need. I have a rough implementation for those but it is not yet working correctly.

    Why did you decide on the json based format for test cases? Wouldn't javascript files be better for editing/linting/syntax highlighting?
    Maybe a library that provides asserts and an easy way to set up the emulator for running test cases for a single app? Or the other way round: the test code is a module, and apptests.js iterates over the exports, which are test functions, and handles the environment like it does now.

  • Thanks!

    Well, really it was a few things:

    • we might want to have some data beforehand (eg to say we need to load app X as well) - but that could be in JSON in a comment at the start of JS I guess
    • Ideally we want to be able to write a command to the emulator and wait for it to complete so we know where the test failed if it did (but I guess if we wrote the whole JS file and then ran it, exceptions would give line numbers)
    • Writing JS to the console isn't quite the same as normal JS - writing for (var i=0;i<10;i++) and then print("Hello") on a newline will run the loop and THEN the print.
    • If you let people write JS they'll try to be clever and it'll no longer be a set of 'do this check that' but something far more complicated
    • I was expecting some commands of the JSON might be doing more in the test harness than just running on the device (eg saving and comparing screenshots) - but maybe we could have the emulator send out some text like COMPARE IMAGE FOOBAR.png followed by the image's base64 and then the test harness could interpret that.
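
    The last point could be sketched roughly as below. The exact marker format (name and base64 data on the same line as `COMPARE IMAGE`) and both function names are assumptions for this sketch; nothing like it exists in the harness yet.

```javascript
// Marker format assumed for this sketch: "COMPARE IMAGE <name> <base64>" on one line.
const MARKER = /^COMPARE IMAGE (\S+) ([A-Za-z0-9+/=]+)$/;

// Split emulator console output into plain log lines and image comparison requests.
function parseEmulatorOutput(text) {
  const result = { log: [], images: [] };
  for (const rawLine of text.split("\n")) {
    const line = rawLine.replace(/\r$/, ""); // the console uses \r\n line endings
    const match = MARKER.exec(line);
    if (match) {
      result.images.push({ name: match[1], data: Buffer.from(match[2], "base64") });
    } else if (line !== "") {
      result.log.push(line);
    }
  }
  return result;
}

// Compare a received image with a stored reference (both as Buffers).
function imagesMatch(received, reference) {
  return Buffer.compare(received, reference) === 0;
}

const output = "Hello\r\nCOMPARE IMAGE FOOBAR.png " +
  Buffer.from("fake image bytes").toString("base64") + "\r\n";
const parsed = parseEmulatorOutput(output);
console.log(parsed.log, parsed.images[0].name);
```

    A byte-for-byte comparison is the simplest possible check; a real harness might prefer a tolerance-based image diff.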

    But yes, potentially we could just have JS files and work around that stuff above. There's nothing in place at the moment, so if you wanted to do a PR I'm all for it.

  • If you let people write JS they'll try to be clever and it'll no longer be a set of 'do this check that' but something far more complicated

    I think this speaks most to me personally :D
    I will try to keep it as simple as possible and in json. I guess if we need more complexity we could just run/eval a js file as a step from the json test definition.

  • we could just run/eval a js file as a step from the json test definition.

    Yes - that sounds like a great plan for the future
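
    A sketch of what such an escape hatch might look like in the harness. The step types (`cmd`, `eval`) and their fields are hypothetical here and do not describe the format used by runapptests.js; `send` and `readFile` are injected so the snippet runs without an emulator.

```javascript
// Hypothetical step handlers; the step names and fields are assumptions.
function runSteps(steps, send, readFile) {
  for (const [index, step] of steps.entries()) {
    switch (step.t) {
      case "cmd": // send one line of JS to the emulator
        send(step.js);
        break;
      case "eval": // escape hatch: send a whole JS file as one step
        send(readFile(step.file));
        break;
      default:
        throw new Error("Step " + index + ": unknown step type " + JSON.stringify(step.t));
    }
  }
}

const definition = JSON.parse(
  '{"steps":[{"t":"cmd","js":"Bangle.loadWidgets()"},{"t":"eval","file":"extra.js"}]}'
);
const sent = [];
runSteps(definition.steps, (js) => sent.push(js), (file) => "/* contents of " + file + " */");
console.log(sent);
```

    Keeping the JS file behind a single step type means the common case stays a flat list of "do this, check that" entries.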

  • I have done a draft at https://github.com/espruino/BangleApps/pull/3399.
    It is far from done, but it seems to correctly execute the first test ported from the android apps test.js.

    Output currently looks like the attached image.


    Attachment: screenshot_area_2024-05-04_00:15:34.png

  • The current state seems to work more or less. It is however a bit cumbersome finding/debugging problems.
    Is there a simple way to implement an interactive console to the emulator? Maybe even connect a web IDE to the emulator running in runapptests.js?

  • That's great, thanks!

    Is there a simple way to implement an interactive console to the emulator?

    Possibly - I think you can get called back whenever you get data from the emulator, and then you could pass that on (same with injecting characters in). It is possible to serve up an iframe with a websocket server, and then you can link it to the IDE and have it magically work that way. A lot of work though!

    I guess this would be a benefit of your JS-only approach - you could just copy/paste the code in the IDE and see what happened.

  • I have found a way to implement a command for opening an interactive console. It "works" with the current state of the pull request alone with three major gotchas:

    1. The pressed keys are not mirrored to the console
    2. History etc. does not work
    3. The debugger does not work

    1 and 2 work with these two changes:
    https://github.com/espruino/EspruinoWebIDE/compare/master...halemmerich:EspruinoWebIDE:emulator
    https://github.com/espruino/EspruinoAppLoaderCore/compare/master...halemmerich:EspruinoAppLoaderCore:emulator

    Do you think the changes in core and IDE would have unwanted side effects elsewhere?

    There seems to be a race condition somewhere that causes the result of the call (=undefined) and what is printed by the call to switch position. That causes occasional test failures. Do you have an idea how to get to the bottom of that problem?
    The demo app/test has saveMemoryUsage twice since more often than not the first one produces NaN because of this issue. Second one seems to always work somehow.
    The race seems to be happening independently of the changes in core/IDE.

  • Yes, I think those changes would be ok. I think the idea with the emulator was that actually onConsoleOutput would just get replaced as needed, but I think adding consoleOutputCallback to the options is good.

    The debugger will basically never work as-is. I'm not sure we even build it in (we shouldn't anyway). The issue is the emulator is single-threaded - it just executes and returns each time. But because the debugger has to stop execution in the middle, the jsiIdle function doesn't return while the debugger is active.

    To make it work you'd have to swap to using the emulator in a web worker/similar which seems like a much larger task.

    Right now we just feed the characters in one at a time and ask the emulator to go around its idle loop once per character, and then hopefully during the call where \n is sent it executes and sends characters back.

    There seems to be a race condition somewhere that causes the result of the call (=undefined) and what is printed by the call to switch position.

    I'm not sure I understand here... Do you think you could give an example of what's on the console when it happens?

  • The issue is the emulator is single-threaded

    Ah, that explains why everything halts on entering debugger into the interactive console.

    As for the example:

    DEBUG: emu.tx print(process.memory().usage)
    
    < 
    < 158
    < =undefined
    DEBUG: getSanitizedLastLine: =undefined
    > CURRENT MEMORY USAGE NaN
    DEBUG: emu.tx print(process.memory().usage)
    
    < 135
    DEBUG: getSanitizedLastLine: 135
    > CURRENT MEMORY USAGE 135
    < 135
    > CURRENT MEMORY USAGE 135
    

    The first time there is an =undefined ending up in the last stored line and that is then parsed instead of the 158 the line before. The code sent to the emulator is identical in both cases as the DEBUG: lines show.
    The emulator in the web IDE does this too, and always, so it probably isn't a race but something funky with storing the last line? Maybe a stray \r or something like that...

  • Ahh, ok. That's expected - you enter the command, and it's executed, so it prints 158, but then print returns undefined so that's printed to the REPL.

    Just send the command, prefixed with char code 16 (so "\x10") and then Espruino knows to turn 'echo' off for that line, so it won't print the =undefined bit.

    I'll see about removing the debugger from emulator builds...

  • The emu.tx line has the \x10 though. It is exactly the same code that works the second time. Is there maybe something that can happen before which cancels out the echo off for the following line?

  • Found it... the wrap command that runs before in the demo test did not have a \n. Seems to work now. So I assume \x10 only works if at the start of a line?

  • So I assume \x10 only works if at the start of a line?

    Exactly, yes - that's great - thanks!
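
    To summarise the finding as code: a small helper, sketched under the assumption that each command is a single line of JS, that always terminates a command with \n and puts "\x10" (char code 16) at the very start of the line. The helper names are made up for this sketch.

```javascript
// Build a command that Espruino runs with echo turned off for that line.
// "\x10" (char code 16) has to be the first character of the line, so every
// command is terminated with "\n" to leave the next one at a line start.
function toQuietCommand(js) {
  const body = js.replace(/[\r\n]+$/, ""); // drop trailing newlines, exactly one is added
  return "\x10" + body + "\n";
}

// Join several commands so that each one starts on its own line.
function toQuietScript(commands) {
  return commands.map(toQuietCommand).join("");
}

const script = toQuietScript(["var a = 1", "print(process.memory().usage)\n"]);
console.log(JSON.stringify(script));
```

    Normalising the trailing newline in one place avoids the situation above, where a command without \n left the following "\x10" in the middle of a line.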

Concept: Run smoke tests / unit tests in custom built Espruino IDE docker images with GitHub actions

Posted by @Ishidres
