Avatar for jonreid

jonreid

Member since Nov 2015 • Last active Aug 2017
  • 8 conversations
  • 73 comments

Most recent activity

    • 8 comments
    • 510 views
  • in JavaScript

    My only concern about not having the V1 code in there as a fallback is that if the V2 download fails and the unit reboots, the unit is non-functional until it can successfully download code.

    For our 24x7 remote logging application, this is not really an acceptable possibility.

    So either I need a way of removing all the V1 code from RAM before running V2, or could I do the following as part of the new-code download process?

    • erasePage()
    • write a dump of the current running code into Flash
    • then follow by writing the V2 code (downloaded) into Flash
    • If V2 downloads and verifies, I can write to my EEPROM that the V2 code is valid to run, so the bootloader knows which version to load.

    There would still be some risks with this, but they are probably acceptable for our application.
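
    The write-and-verify part of the steps above could be sketched roughly as follows. This is only an illustration: `stageImage` and `markValid` are hypothetical names, the `flash` argument is assumed to expose `erasePage`, `write` and `read` in the style of Espruino's Flash module, and the image is assumed to be plain ASCII that fits in a single flash page.

```javascript
// Hypothetical sketch: stage a downloaded image and flag it valid only after
// a successful read-back. `flash` is assumed to provide erasePage(addr),
// write(data, addr) and read(length, addr). `markValid` stands in for
// whatever records the "V2 is good" flag in EEPROM.
function stageImage(flash, addr, code, markValid) {
  // Pad to a multiple of 4 bytes (assumes word-aligned flash writes).
  var padded = code;
  while (padded.length % 4 !== 0) padded += " ";

  flash.erasePage(addr);        // assumes the image fits in one page
  flash.write(padded, addr);

  // Verify byte by byte before trusting the new image.
  var readBack = flash.read(padded.length, addr);
  for (var i = 0; i < padded.length; i++) {
    if (readBack[i] !== padded.charCodeAt(i)) return false;
  }
  markValid();                  // only reached when every byte matched
  return true;
}
```

    Because the flag is written last, a failed download or a reboot part-way through leaves the old flag in place, so the bootloader would keep loading the previous version.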

    • 12 comments
    • 196 views
  • in JavaScript

    I am only just getting around to implementing this now!

    So say I have V1 of my code saved into the Pico with save() (along with the above bootloader code in onInit()?)

    1. What is the best way for me to get the correct minified V2 code to put on my server, for the Pico to download and write into flash (and then run through eval)?

    2. In the fallback scenario, when the new code is run via eval, I assume the existing V1 code would already have been loaded at that point? Do I need to remove it from RAM first somehow?

    Thanks

  • in General

    It's interesting: now that I know it is a monthly issue, I can see strange things happening to our units on a monthly basis. I can't explain exactly what is going on (I would expect my software to crash completely), but there definitely seems to be a correlation.

    We need to put some more units out into the field in a few days' time. Which build would you recommend we put on (given we don't have a month's testing time): the custom one you created or a Travis build?

  • in General

    Unfortunately I have no logs of getTime(). However, on my debug unit, where the IDE 'Set current time' option was turned on, it crashed after 10 days, on 31/07/2017 at 5:18pm (NZ time).

    In my application, I resync the time to the SIM800 time (new Clock object) every 90s. So it is possible that my app would only crash if the clock stepped back while my code was actually doing something that relies on getTime(), before the next SIM800 resync.

  • in General

    Interestingly, I just checked my IDE settings and I do have the 'Set current time' option turned on.

    So that makes the Sun Jan 30 2000 15:24:42 time that was in the unit after the crash seem wrong? I am pretty sure the unit was never powered down after the last software was loaded onto the Pico.

    I had loaded the new software 10 days earlier in this case.

  • in General

    Great, thanks for that. I will run this on some units to see if it makes a difference, though it might take a while to determine whether there is a difference compared to the standard firmware. I will also try to add some logging to capture the RTC time if I can detect it happening again.

    I do use deepsleep, but only when the power is disconnected, which should not have been the case. When in deepsleep it will be waking up every 4s, so not asleep long.

  • in General

    Yep, pretty strange.

    • No, I never use setTime() as I was worried about causing an issue like this :)
    • I have observed it on 1.87 and 1.92
    • Unfortunately I don't know what the getTime() value was at the time of the crash as I was not logging that to the console. Working backwards, it would have been around 949245868 at restart time after the crash.
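
    As a quick check of the arithmetic in the last bullet, a getTime()-style value in seconds can be converted to a calendar date. This is a minimal sketch that assumes the value counts seconds from the Unix epoch and is read as UTC; `secondsToIso` is a hypothetical helper name.

```javascript
// Convert a seconds-based timestamp to an ISO 8601 date string (UTC).
function secondsToIso(seconds) {
  return new Date(seconds * 1000).toISOString();
}

console.log(secondsToIso(949245868)); // 2000-01-30T15:24:28.000Z
```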

    To keep absolute time, my code regularly queries the local time from the SIM800 and instantiates a new Clock object each time. This keeps things accurate on Picos that have not yet been fitted with the crystal.

    The clock speeding up would also explain why my EEPROM read failed at that time. In my code I do an EEPROM write and then immediately follow it with a read to verify. In the AT25 code there is a loop using getTime() to wait for the EEPROM write period. So if this wait is essentially skipped, the read would fail.

    We generally never set the clock (using setTime()), so on all our units it would be pretty random, unless it gets set by the IDE at programming time?
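
    One way to make the write-then-verify step less sensitive to a shortened wait might be to retry the write and read-back a few times before giving up. The sketch below is an illustration under stated assumptions: `writeVerified` is a hypothetical helper, and `eeprom` is assumed to expose `write(addr, data)` and `read(addr, length)` working on byte arrays. It is not taken from the AT25 module itself.

```javascript
// Hypothetical sketch: write, read back, and retry if the data does not match.
// Returns the number of attempts used, or 0 if verification never succeeded.
function writeVerified(eeprom, addr, data, maxAttempts) {
  for (var attempt = 1; attempt <= maxAttempts; attempt++) {
    eeprom.write(addr, data);
    var readBack = eeprom.read(addr, data.length);

    // Compare length first, then each byte.
    var ok = readBack.length === data.length;
    for (var i = 0; ok && i < data.length; i++) {
      if (readBack[i] !== data[i]) ok = false;
    }
    if (ok) return attempt;
  }
  return 0;
}
```

    A return value greater than 1 is itself worth logging, since it suggests the first read-back happened before the write had completed.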

  • in General

    I have a number of Pico-based units running 24x7 out in the field, and a few times in the last few months I have observed the RTC suddenly seeming to speed up.

    I finally managed to catch this while logging the serial console on a unit - the sequence was:

    • the first sign of issues was that the read process (a write verify) from the SPI EEPROM failed (read all zeros or FFs)
    • because the clock was going too fast, this was generating timing errors in my code, which I was logging (by writing to EEPROM, which did appear to work!)
    • the Pico then started generating memory errors during the SIM800 upload process and restarted
    • on restart the unit uploaded the logged error events to my server, and these had advanced timestamps on them
    • all the timestamps came from the standard Espruino Clock module (based on getTime())
    • the restart rectified the problem

    The whole crash/restart process took around 30 seconds (based on external timestamps coming from a SIM800); however, my logged events (based on getClock()) over that 30-second period ranged from 18 minutes to 15 hours into the future!

    This particular unit is running on LSI, but I am reasonably sure it has also happened on a Pico fitted with an external crystal (to be confirmed).

    Is this even possible? It seems that the RTC suddenly started running around 1800x faster.

    Any ideas appreciated - it would be nice to be able to detect and fix this rather than just crash.
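
    A rough way to detect this condition, rather than only crashing, might be to compare how much time has passed according to getTime() with how much has passed according to an external reference such as the SIM800 clock. The sketch below is a hypothetical illustration, not a tested fix: `makeDriftCheck`, its parameters and the tolerance value are all assumptions.

```javascript
// Hypothetical sketch: flag the local clock when it runs much faster or slower
// than an external reference, or steps backwards, between two checks.
// maxRatio is the largest tolerated local/external elapsed-time ratio.
function makeDriftCheck(maxRatio) {
  var lastLocal = null;
  var lastExternal = null;

  // localSec: value from getTime(); externalSec: seconds from the reference.
  return function check(localSec, externalSec) {
    var result = { ok: true, ratio: 1 };
    if (lastLocal !== null) {
      var dLocal = localSec - lastLocal;
      var dExternal = externalSec - lastExternal;
      if (dExternal > 0) {
        result.ratio = dLocal / dExternal;
        result.ok = result.ratio >= 1 / maxRatio && result.ratio <= maxRatio;
      }
    }
    lastLocal = localSec;
    lastExternal = externalSec;
    return result;
  };
}

var check = makeDriftCheck(1.5);
check(1000, 5000);                        // first call only records a baseline
console.log(check(1030, 5030).ok);        // true: 30s local in 30s external
console.log(check(55030, 5060).ratio);    // 1800: 15 hours local in 30s external
```

    On a failed check, the application could resync from the reference clock or restart deliberately, instead of carrying on with bad timestamps.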
