-
• #2
Hmm - It could be that the disk errors are causing Espruino to leak memory (however looking at the code I can't see how), or it could be that Espruino is running low on memory and that's causing the errors?
Could print the value of
process.memory().usage
each time you read the sensors? It might help to see if the memory usage is rising?Any chance you could strip your code down (removing the DHT read and LCD write) and see if it still happens?
-
• #3
I'll try to strip it down into a minimal example that exhibits the behavior.
While it was running, I checked process.memory() and it was steady at 900 used.
-
• #4
Hmm, that should be ok then... I wrote some simple code to append to a file, and the pulled the SD card out (producing
DISK_ERR
errors), and that didn't cause the memory usage to rise... -
• #5
The DISK_ERR issue itself is trivial to reproduce:
var fs=0; function randint(max) { return Math.round(Math.random()*max); } function onInit() { digitalWrite(A14,1); setTimeout("doInit();",2000); } function doInit() { digitalWrite(A14,0); fs=require("fs") setInterval(function(){dologs();},10000); } function dologs() { digitalWrite(A15,1) var temp=randint(10); fs.appendFile("testlog.log","ASDF,"+temp.toString()+",Time,"+getTime().toFixed()+"\n"); digitalWrite(A15,0) } onInit();
This will run for a while, until eventually, every attempt to write results in DISK_ERR. This itself is a Bad Thing.
However, it does not run out of memory!
So why does it run out of memory when running the full setup?
var g = 0; var d = 0; var inits=new Uint8Array([0,0,0,0]); var t=-1; var rh=-1; var t2=-1; var rh2=-1; var line1=""; var line2=""; var fs; var activepin=""; function onInit() { digitalWrite(A14,1); digitalWrite(A13,1); setTimeout("doInit();",2000); } function doInit() { digitalWrite(A15,1); fs=require("fs") SPI3.setup({ baud: 1000000, sck:B3, mosi:B5 }); digitalWrite(A13,0); g=require("PCD8544").connect(SPI3,B6,B7,B4, function() { g.clear(); g.drawString("LCD OK",0,0); g.drawLine(0,10,84,10); g.flip(); digitalWrite(A15,0); setTimeout("inits[0]=1;endInit();",100); }); d=require("DHT11").connect(C10) d.read(function(a){ t=a.temp; rh=a.rh; inits[1]=1;endInit();}); e=require("DHT22").connect(C11) setTimeout(function(){ e.read(function(a){ t2=a.temp; rh2=a.rh; inits[2]=1;endInit();});},2000); } function endInit() { if (inits[0]==1 && inits[1]==1 && inits[2]==1 && inits [3]==0) { inits[3]=1; digitalWrite(A14,0); g.clear(); g.drawString("Init OK!",0,0); g.flip(); setInterval(function() {doMeasure();},5000); setTimeout(function(){setInterval(function(){doMeasure2();},5000);},2500); setInterval(function(){dologs();},30000); } } function doMeasure() { d.read(function(a) { if (a.rh!=-1 && a.temp < 50) { rh=a.rh; t=a.temp; } updateLCD(); }); } function dologs() { fs.appendFile("timetemp.log","DHT11,"+t.toString()+","+rh.toString()+",DHT22,"+t2.toString()+","+rh2.toString()+",Time,"+getTime().toFixed()+"\n"); } function doMeasure2() { e.read(function(a) { if (a.rh!=-1 && a.temp < 50) { rh2=a.rh; t2=a.temp; } updateLCD(); }); } function updateLCD() { g.clear(); if (t!=-1) { g.drawString("Temp: "+t.toString()+"C",0,0); g.drawString("RH: "+rh.toString()+"%",0,10); g.drawString("Temp: "+t2.toString()+"C",0,20); g.drawString("RH: "+rh2.toString()+"%",0,30); } else { g.drawString("DHT11 not",0,0); g.drawString("working!",0,10); } g.drawString(line1,0,20); g.drawString(line2,0,30); g.flip(); } onInit();
All I can think of is that this DISK_ERR disrupts other things happening at the same time, like DHT read or LCD update.
Unfortunately, I've never been able to catch it between the first time it gives the DISK_ERR and when it runs out of memory, in order to see if it's leaking memory every time, or whether it survives until the error happens at a critical moment.
-
• #6
setTimeout("doInit();",2000);
change to:
setTimeout(doInit,2000);
then it works on my Espruino.
setTimeout() don't like a string, then it don't call doInit() and therefor it don't require fs and we get DISK_ERR
Edit:
It only reached 102 before DISK_ERR came back. Then there was a 30 DISK_ERR before continuing. -
• #7
Now I have it running again, and right now it reached to 400 logs to the file without error.
I also changed the order of the objects. It seems to Espruino dislike forward references. -
• #8
With 1 sec. log it failed at
1493ASDF,1,Time,1595
1751ASDF,6,Time,1884
then a little later
1842ASDF,7,Time,2016
1843ASDF,7,Time,2017
1844ASDF,1,Time,2017
1845ASDF,6,Time,2017
1846ASDF,2,Time,2017
1847ASDF,1,Time,2017
1848ASDF,3,Time,2017
1849ASDF,2,Time,2017
1850ASDF,4,Time,2017
1851ASDF,1,Time,2018
1852ASDF,4,Time,2018
1853ASDF,3,Time,2018
1854ASDF,7,Time,2018
1855ASDF,1,Time,2018
1856ASDF,3,Time,2018
1857ASDF,4,Time,2018
1858ASDF,1,Time,2018
1859ASDF,4,Time,2018
1860ASDF,7,Time,2018
1861ASDF,10,Time,2018
time is still 1 sec. -
• #9
Hm.
On my Espruino, onInit and doInit are DEFINITELY running - if they weren't, it wouldn't work at all. On my Espruino, I've always been able to put strings into setTimeout/setInterval. The IDE warns me that it's "evil", but it works fine*. What I didn't realize is that you could just pass it the name of a function, unquoted, and have it call that function with no arguments! I thought you had to do:
setTimeout(function(){doInit();},2000);So in your case, it sorts itself out after enough disk errors happen? I don't think it was coming back to life for me, will check logs tonight.
*Note that I have not tested whether this "evil" evaluation has resulted in my eternal damnation; testing this seems like a real chore.
-
• #10
I've determined that the out of memory issue can happen without logging. So I think there is some condition in which my DHT module explodes. Investigations are ongoing.
However, there is still the issue with the SD card getting unhappy and producing DISK_ERRs, which happens when just logging gibberish. Do you have any thoughts on how to debug this further?
-
• #11
And yet, here's a log clearly showing that once the disk errors start, the memory problems begin - I am back to being convinced this is an Espruino bug, as the system was stable for approximately one hour, reporting stable memory usage, then the instant it started disk erroring, it started leaking memory every time it measured the DHT's.
52 952 952 952 952 952 952 952 952 952 ERROR: Unable to write file : DISK_ERR 895 ERROR: Unable to write file : DISK_ERR 854 ERROR: Unable to write file : DISK_ERR 775 ERROR: Unable to write file : DISK_ERR 764 ERROR: Unable to write file : DISK_ERR 707 ERROR: Unable to write file : DISK_ERR 627 ERROR: Unable to write file : DISK_ERR 618 ERROR: Unable to write file : DISK_ERR 560 482 413 356 277 210 152 74 ERROR: Out of Memory! ERROR: Out of Memory! WARNING: Unable to create string as not enough memory at line 1 col 25 {dht.onread(dht.endRead());} ^ in function called from system Execution Interrupted during event processing - clearing all timers and watches. >process.memory() ={"free":997,"usage":803,"total":1800,"history":1,"stackEndAddress":536909500,"flash_start":134217728,"flash_binary_end":134438664,"flash_code_start":134443008,"flash_length":262144} >
The code that's generating this:
var g = 0; var d = 0; var inits=new Uint8Array([0,0,0,0]); var curdht=false; var t=-1; var rh=-1; var t2=-1; var rh2=-1; var line1=""; var line2=""; var fs; var activepin=""; function onInit() { digitalWrite(A14,1); digitalWrite(A13,1); setTimeout("doInit();",2000); } function doInit() { digitalWrite(A15,1); fs=require("fs") SPI3.setup({ baud: 1000000, sck:B3, mosi:B5 }); digitalWrite(A13,0); g=require("PCD8544").connect(SPI3,B6,B7,B4, function() { g.clear(); g.drawString("LCD OK",0,0); g.drawLine(0,10,84,10); g.flip(); digitalWrite(A15,0); setTimeout("inits[0]=1;endInit();",100); }); d=require("DHT11dev").connect(C10) d.read(function(a){ t=a.temp; rh=a.rh; inits[1]=1;endInit();}); e=require("DHT22dev").connect(C11) setTimeout(function(){ e.read(function(a){ t2=a.temp; rh2=a.rh; inits[2]=1;endInit();});},2000); } function endInit() { if (inits[0]==1 && inits[1]==1 && inits[2]==1 && inits [3]==0) { inits[3]=1; digitalWrite(A14,0); g.clear(); g.drawString("Init OK!",0,0); g.flip(); setInterval(function() {console.log(process.memory().free);curdht=!curdht;if(curdht){doMeasure();}else{doMeasure2();}},5000); setInterval(function(){dologs();},30000); setBusyIndicator(A13); } } function doMeasure() { d.read(function(a) { if (a.rh!=-1 && a.temp < 50) { rh=a.rh; t=a.temp; } updateLCD(); }); } function dologs() { fs.appendFile("timetemp2.log","DHT11,"+t.toString()+","+rh.toString()+",DHT22,"+t2.toString()+","+rh2.toString()+",Time,"+getTime().toFixed()+"\n"); } function doMeasure2() { e.read(function(a) { if (a.rh!=-1 && a.temp < 50) { rh2=a.rh; t2=a.temp; } updateLCD(); }); } function updateLCD() { g.clear(); if (t!=-1) { g.drawString("Temp: "+t.toString()+"C",0,0); g.drawString("RH: "+rh.toString()+"%",0,10); g.drawString("Temp: "+t2.toString()+"C",0,20); g.drawString("RH: "+rh2.toString()+"%",0,30); } else { g.drawString("DHT11 not",0,0); g.drawString("working!",0,10); } g.drawString(line1,0,20); g.drawString(line2,0,30); g.flip(); }
Does anyone have any thoughts on what might be going on? Is there any way to get some information about where all that memory is going when it starts to fail?
Using Espruino v55.
-
• #12
That's pretty strange... You get the memory leak with the disk error:
627 ERROR: Unable to write file : DISK_ERR 618 ERROR: Unable to write file : DISK_ERR 560 482 413 356
But then the disk errors stop and you're still losing memory! You could try typing
trace()
, which dumps the internal datastructures. I guess you might be able to find something in there that hinted at where the memory was going...If you could try and get the simplest possible piece of code that still got the memory leaks then that'd really help matters.
With the DISK_ERR, I'm unsure what to do as it's the filesystem lib that's failing and when I have tried it, it does recover eventually - the development version (or 1v56 when it is released) now returns false is appendFile fails, so you could try again if needed.
By the way, Using strings in setInterval/etc it's really evil in Espruino - it's actually pretty efficient. It's just that jslint (which is what does the warnings) thinks it is and I haven't got around to convincing it otherwise :)
-
• #13
Got a trace of it just before it exploded (down to 61 free at start). Clearly a repetitive structure - but uh... I can't make sense of the trace... What are these multiplied things?
http://drazzy.com/espruino/trace.txtAnother one, where the trace was taken right after the problem first manifested.
http://drazzy.com/espruino/trace2.txt -
• #14
Thanks - that's really helpful... So it looks like there are a whole bunch of setTimeouts coming from the DHT11 module, but I have no idea why!
The only thing I can think is that somehow the RTC is getting really confused and the value from
getTime
actually jumps backwards at some point (and that might explain why you get the DISK_ERR).Please can you also log the value from
getTime()
alongsideprocess.memory()
?How long does this happen after the last hard reset? From the logs it looks like both times have been after around 4 hours? Do you think it's been around that amount of time in each case?
-
• #15
I've discovered a pretty nasty problem I think that could explain your issues. After 65535 seconds, the RTC goes into the second 16 bit word, and I wasn't combining the two 16 bit words properly - so time would suddenly jump back at that point.
It should be fixed in 1v56 now
-
• #16
I'm still trying to track this down - something happens after around 8000 seconds, and the RTC reports the correct time, but
getTime()
reports around 16000 secs -
• #17
Interesting - the time to failure is definitely less than 65K seconds (just under 20 hrs). Not all of these failures have taken the same length of time. Sometimes it would fail after 2-3 hrs (maybe 8000 seconds?), while the most recent one, where the trace was taken when it first started circling the drain, took around 5 hours (16000 seconds?) to fail (I really need to put pinstrips onto my second espruino, so I can have one running tests for stability, while doing stuff with the other one).
I haven't had a chance to try with v56 - but I do still have the log files from v55, and will post them when I get home tonight, as I did record the results of getTime().
Thanks for investigating this.
-
• #18
No problem - sorry it hasn't been easy to track down! Because it's not reproducable immediately it's really difficult. I even tried setting the RTC to around the 5 hour mark and running it repeatedly - but it didn't happen in that case!
-
• #19
Failed real fast on v56 - was setting it up before I went out, and it failed before I got out the door (didn't have a chance to study the results yet, since I'm on my way out the door)
http://drazzy.com/espruino/v56failure.txt
var d; var e; var inits=new Uint8Array([0,0,0,0]); var curdht=false; var t=-1; var rh=-1; var t2=-1; var rh2=-1; var fs; var havetraced=false; function onInit() { digitalWrite(A14,1); digitalWrite(A13,1); setTimeout("doInit();",2000); } function doInit() { fs=require("fs"); fs.appendFile("timetemp5.log","Espruino restarted") digitalWrite(A13,0); d=require("DHT11dev").connect(C10); d.read(function(a){ t=a.temp; rh=a.rh; inits[1]=1;endInit();}); e=require("DHT22dev").connect(C11); setTimeout(function(){ e.read(function(a){ t2=a.temp; rh2=a.rh; inits[2]=1;endInit();});},2000); } function endInit() { if (inits[1]==1 && inits[2]==1 && inits [3]==0) { inits[3]=1; digitalWrite(A14,0); setInterval(function() {var mf=process.memory().free;console.log(mf.toString()+" "+getTime().toFixed().toString()); if (mf < 1100) traceit();curdht=!curdht;if(curdht){doMeasure();}else{doMeasure2();}},5000); setInterval(function(){dologs();},30000); setBusyIndicator(A13); } } function traceit() { if (!havetraced) { havetraced=true; trace(); } } function doMeasure() { d.read(function(a) { if (a.rh!=-1 && a.temp < 50) { rh=a.rh; t=a.temp; } }); } function dologs() { console.log("trying to log..."); fs.appendFile("timetemp5.log","DHT11,"+t.toString()+","+rh.toString()+",DHT22,"+t2.toString()+","+rh2.toString()+",Time,"+getTime().toFixed()+"\n"); } function doMeasure2() { e.read(function(a) { if (a.rh!=-1 && a.temp < 50) { rh2=a.rh; t2=a.temp; } }); } save();
-
• #20
Still fails quickly in v57. So far the most stable version has been v55, where it would run for ~4 hrs.
Trace:
http://drazzy.com/espruino/v57failure.txtThe code that generated it (simplified down to a single DHT now):
var d; var e; var inits=new Uint8Array([0,0,0,0]); var curdht=false; var t=-1; var rh=-1; var t2=-1; var rh2=-1; var fs; var havetraced=false; function onInit() { digitalWrite(A14,1); digitalWrite(A13,1); setTimeout("doInit();",2000); } function doInit() { fs=require("fs"); fs.appendFile("timetemp5.log","Espruino restarted") digitalWrite(A13,0); d=require("DHT11dev").connect(C10); d.read(function(a){ t=a.temp; rh=a.rh; inits[1]=1;endInit();}); } function endInit() { if (inits[1]==1 && inits [3]==0) { inits[3]=1; digitalWrite(A14,0); setInterval(function() {var mf=process.memory().free;console.log(mf.toString()+" "+getTime().toFixed().toString()); if (mf < 1100) traceit();doMeasure();},5000); setInterval(function(){dologs();},30000); setBusyIndicator(A13); } } function traceit() { if (!havetraced) { havetraced=true; trace(); } } function doMeasure() { d.read(function(a) { if (a.rh!=-1 && a.temp < 50) { rh=a.rh; t=a.temp; } }); } function dologs() { console.log("trying to log..."); fs.appendFile("timetemp6.log","DHT11,"+t.toString()+","+rh.toString()+",Time,"+getTime().toFixed()+"\n"); } save();
Edit: No, unlike previous tests, log files were not normal.
-
• #21
In the above test, I just discovered, the SD card contents got trashed!
node_modules directory was deleted, and a directory "t" was created, but is not accessible via windows file explorer.
timetemp6.log is empty.
Windows disk repair tool found and recovered all the files when run on the SD card. Am investigating whether there is some issue with the SD card itself.
-
• #22
Is there any way you can check what the sector size on your SD card is? It should be 512...
Thanks for all the info - it'd seem that the issue is still that something strange is happening with the value from
getTime()
judging by the values for time in the timers array. It could be that rapid changes of the time are causing the filesystem library to time out before things are being written correctly.I've spent ages trying to sort that out for 1v57 though - I've had a board running over the weekend as a clock, and it's still keeping time - which seems a definite improvement over 1v56.
Can you try:
var t = getTime(); if (!fs.appendFile("timetemp6.log","DHT11,"+t.toString()+","+rh.toString()+",Time,"+getTime().toFixed()+"\n")) console.log("FAIL before "+t+" after "+getTime());
That might help to show whether the time is actually being reported wrong...
-
• #23
Actually you can trace this down really easily:
var t = getTime(); while (true) { var p = getTime(); if (p<t) console.log("Backwards! "+p+" vs "+t); t=p; }
I think this is probably a SysTick-based problem again. I need to add some code to detect the case where the SysTick timer overflows but there is no interrupt in time. It shouldn't even happen - but it seems it does :(
-
• #24
I may have just fixed this... Can you try:
http://www.espruino.com/binaries/git/commits/099ad7d4f8f8080a823c9f3db7fcd79a8aee779a/
I stupidly forgot to mark something 'volatile' and I think the compiler optimised out something that had been put in specially :)
-
• #25
I'll give it a try tonight and report back.
I have a program that reads a pair of DHT's (11 and 22, to compare their readings), shows them on a Nokia 5110 screen, and every ten seconds, records the latest data on the SD card. Every time I go away and leave it, I come back to find something like this on the console:
ERROR: Unable to write file : DISK_ERR
ERROR: Unable to write file : DISK_ERR
ERROR: Unable to write file : DISK_ERR
ERROR: Unable to write file : DISK_ERR
ERROR: Unable to write file : DISK_ERR
ERROR: Unable to write file : DISK_ERR
ERROR: Out of Memory!
ERROR: Out of Memory!
WARNING: Unable to create string as not enough memory
ERROR: Out of Memory!
ERROR:
(with cursor at the end of the word ERROR:)
Using v55 firmware, connecting via bluetooth. After reset, SD card is found to be working normally.
All of the DISK_ERRs come in fairly rapid succession (either all at once, or once per attempted append once the problem manifests). I don't know what the exact timing is, since I've never been watching the console when it happens.