-
• #2
Oh, interesting.
Maybe the FLOAT_ABI_HARD won't be needed for all espruino code. It breaks passing floats to inline C in regular registers.
I think with some other HR library the only hardfp method that needed linking in was sqrt and it was possible to do some workaround and keep the rest in softfp calling convention. it is not super important but I'll check it. Or is the hardfp otherwise preferred and produces shorter/faster code?
-
• #3
Maybe the FLOAT_ABI_HARD won't be needed for all espruino code.
It seems hard to mix, but if you can find a solution I'm open to it - in a way I'd prefer not to change.
But it is only for Bangle.js 2 - I've kept everything else as softfp to avoid things getting broken.
But I did a quick test with:
function m() { for (y=0;y<32;y++) { line=""; for (x=0;x<32;x++) { var Xr=0; var Xi=0; var Cr=(4.0*x/32)-2.0; var Ci=(4.0*y/32)-2.0; var i=0; while ((i<8) && ((Xr*Xr+Xi*Xi)<4)) { var t=Xr*Xr - Xi*Xi + Cr; Xi=2*Xr*Xi+Ci; Xr=t; i++; } if (i&1) line += "*"; else line += " "; } // print(line); } } t=getTime();m();print(getTime()-t);
And as far as I can tell, hardfp is faster - maybe 5.5s vs 5.7s so not a lot, but it's noticeable and reproducible. Seems a shame to have everything running slower just for the extremely rare case of floats + inline C that I'd argue literally only you currently uses (and you make your own builds anyway!)
-
• #4
Well, the STL Viewer and Mandel apps also use floats, although I haven't officially ported either to Bangle 2 (just running hacked versions privately). Don't have the watch near me at the moment, I will see how things break - I usually pass floats in flat arrays in and out of compiled C via pointers in my code, so I am not sure where the passing in registers comes in here...
I also freely concede that a more reliable HR monitor is much higher on most people's priority list than a 3D viewer on a watch... ;-) -
• #5
Well, I am not against hardfp. If it is faster or otherwise better then so be it. I was thinking that since JavaScript uses doubles, it is working with software floats anyway. Not sure how the calling convention works for double precision numbers with hardfp and single precision fpu, are they put into two vfp registers before calling and unloaded back in software emulation? Need to try (EDIT: here https://gist.github.com/fanoush/0da3e47aee9e20fb11b010cc3aa4e16e - updated my older test, now using
-mcpu=cortex-m4 -mfpu=fpv4-sp-d16
). Also I was thinking that cortex m4 has some lazy optimization when fpu registers are not used between interrupts they are not saved on stack so it is faster. this may use them just to pass it in arguments between functions without actually using them to do the math. -
• #6
I also freely concede that a more reliable HR monitor is much higher on most people's priority list than a 3D viewer on a watch... ;-)
It is hopefully not 'either or' situation. You can link code using both calling conventions, we did it before - old arduino for nrf52 is using softfp callling convention and we linked such hardfp vendor HR library before and it worked, need to check. may take few days.
And btw one can set attributes on each method so both calling conventions can be used even in same source code, was testing this few years ago https://gist.github.com/fanoush/85ebe50c5c4a54ca15bf2867e27f7cd3
but from the source Gordon committed I see that there are no float arguments passed in the methods of the HR library(?) anyway so it may be only about giving it few hardfp based math library methods it needs for linking.OTOH if whole espruino linked as hardfp is faster and/or smaller I hope we'll somehow figure out inline C code passing hard floats too.
Well, maybe it is not an issue and will even work 'as is' if the inline C compiler keeps soft float conventions as it is now despite espruino fw using hard floats internally. When thinking about it, it may not break as they are separate and the float arguments are passed as integers.
-
• #7
Does the proprietary firmware also provide a SpO2 value?
-
• #8
Here is output from the library so I think it does not
https://github.com/espruino/Espruino/blob/master/libs/misc/vc31_binary/algo.h#L47 -
• #9
I usually pass floats in flat arrays in and out of compiled C via pointers in my code, so I am not sure where the passing in registers comes in here...
Yes, in that case it won't be an issue. I think it's when trying to call a function like
void foo(float)
- but Espruino doesn't support automatically converting floats anyway, so I think you'd have to do:foo((new Uint32Array((new Float32Array([1234])).buffer)[0])
So it's a bit contrived.
However @fanoush you're totally right - we don't pass any floats into the library, so I guess as long as it's not using anything external it doesn't really matter. It's just a matter of convincing the Linker it's ok, as that was what was throwing the error before.
So I guess it may be enough to edit the
.o
files headers so that GCC thinks they are softfp?As you say, since Espruino uses doubles for most stuff the hardfp doesn't feel like it'd be needed. And in fact double passing may in fact be an issue (or maybe it's just for floats?). In the current build Espruino assumes the doubles are passed in registers/stack just like ints - so switching to hardfp may actually break stuff (although in initial tests it doesn't seem to have).
Does the proprietary firmware also provide a SpO2 value?
@user140377 The one included at the moment doesn't. There are separate binaries for SpO2 sensing that could be included, but you have to put the sensor into a special mode first where it samples a lot faster and also samples the Infrared at the same rate.
It's definitely possible and I can provide the binaries to anyone who is interested in looking into it, but it's quite a lot of work.
-
• #10
we don't pass any floats into the library, so I guess as long as it's not using anything external it doesn't really matter. It's just a matter of convincing the Linker it's ok, as that was what was throwing the error before.
Yes, and when dumping the object files I don't even see any floating point code inside at all. it is all integer math.
~/Espruino/libs/misc/vc31_binary$ for i in *.o ; do arm-none-eabi-objdump -d $i ; done | grep vmov
gives nothing.
I tried to make a library from it
arm-none-eabi-gcc-ar -cr libvc31.a *.o arm-none-eabi-gcc-ranlib libvc31.a
and trying to pass
LDFLAGS += -Llibs/misc/vc31_binary -lvc31
but it didn't work- it cannot find that Algo_ methods in such library. I guess it is becasue of LTO and the library may not be well made for LTO.I see the
ld: error: libs/misc/vc31_binary/algo.o uses VFP register arguments, bin/espruino_2v17.47_banglejs2.elf does not
error. Also tried to striparm-none-eabi-strip -N 'BuildAttributes$$THM_ISAv4$E$P$D$K$B$S$7EM$VFPi3$EXTD16$VFPS$VFMA$PE$A:L22UL41UL21$X:L11$S22US41US21$IEEE1$IW$USESV6$~STKCKD$USESV7$~SHL$OSPACE$EBA8$REQ8$PRES8$EABIv2' *.o
and looks like it is gone from .o files but this was not enough so the flag is somewhere else.
But still if you see hardfp being faster then it is also good. But I find it strange since it must shuffle between those registers more than before, if you check https://gist.github.com/fanoush/0da3e47aee9e20fb11b010cc3aa4e16e then it can be seen that for math operation
*
it calls__aeabi_dmul
which has softfloat convention even if file was compiled with-mfloat-abi=hard
so the__hard_f2d
and__hard_powd
there needs to unpack arguments from vfp to integer register before calling the multiplication eabi builtin (unlike the first f2d method or __softfp_powd that just calls it directly). So I wonder what else offsets this extra code to still make it faster. -
• #11
So I guess it may be enough to edit the .o files headers so that GCC thinks they are softfp?
I tried the objcopy trick from https://stackoverflow.com/a/51439150 and it links fine. Not sure whether it still works :-)
-
• #12
I just did some more tests - this time on a bare nRF52 with a minimal build.
5.65 softfp 5.63 hardfp
So there is a repeatable difference, but it's very very small, and I'd definitely err on the side of keeping softfp and not breaking anything if at all possible :)
I just tried building a minimal C file with hard/softfp and I see a few binary differences in the object file, but I do see:
// hard 4 .ARM.attributes 00000034 00000000 00000000 000000b8 2**0 CONTENTS, READONLY // soft 4 .ARM.attributes 00000032 00000000 00000000 000000b8 2**0 CONTENTS, READONLY
And of course the
algo.o
is something totally different (5e!) - but I did just hex editalgo.o
and change 5e to 32 and it does now build. I've just got to do the other files, and we might be ok -
• #13
That's great! Much better than my attempt - I'll just remove the header from those files then and switch back to softfp
... and it works! Just committed. Thanks for your help!
I just hope folks actually notice some difference with the new algorithm
-
• #14
I just did some more tests - this time on a bare nRF52 with a minimal build.
And is the hardfp version larger? so the extra register moves for double precision math are there?
OTOH I can imagine that if you use single precision float type argument/return value somewhere internally in Espruino then it could make such code shorter and may not overwrite integer registers. But that should affect only method calling. vfp single precision math inside methods should be there already even with softfp calling convention.
EDIT:
when briefly checking https://github.com/espruino/Espruino/search?q=float&type=code I can see tensorflow usesfloat
type a lot so maybe that one could benefit from hardfp calling convention being default -
• #15
Without comparing to a BTHRM and without testing with much movement, the values given by the HRM seem to be a lot closer to reality. Having a resting pulse with the old algo of under 50 would be nice to have, but for me it is not exactly realistic :) So at least in the constraints of it being an optical sensor it now works much better. I had the chance to play around with a Garmin Fenix 7X which should have one of the best optical sensor available. That worked impressively well, but even that thing wasn't able to keep up with steep rises and falls in pulse rate during movement like an EEG based sensor does.
-
• #16
I did a small test with two Bangle.js2 on the same forearm. One with the old firmware and a bt chest belt connected to it (blue) and its firmware PPG result (red) and another Bangle.js2 with new firmware and and its firmware PPG result (green). While recording I was
- sitting still
- walking fast
- running without moving
- sitting still
1 Attachment
- sitting still
-
• #17
Great results, thank you! There are actually more modes listed here https://github.com/espruino/Espruino/blob/master/libs/misc/vc31_binary/algo.h#L10
and currently SPORT_TYPE_NORMAL is hardcoded here https://github.com/espruino/Espruino/blob/master/libs/misc/heartrate_vc31_binary.c#L67 In future it would be nice to compare chest belt with that when you pick the right mode, running and walking are different modes there (1 and 9) -
• #18
I wonder if some other activity guessing code could preselect this mode automatically from accelerometer data.
Or maybe it could do it by itself for normal mode since the accelerometer data is passed there too https://github.com/espruino/Espruino/blob/master/libs/misc/heartrate_vc31_binary.c#L61 but at least for walking phase it didn't work well.
-
• #19
That's great! Thanks for the graphs @user140377 it's really good to see some side by side results!
I wonder if some other activity guessing code could preselect this mode automatically from accelerometer data.
There is actually another binary library for activity guessing that could be added. I've been meaning to try it out though - it's an entirely 'black box' with just one function that you pass xyz values to, so I'm not too sure how good it is.
And is the hardfp version larger?
Yes, I just tried again - 233196 hard, 232156 soft. So it's another reason to try and keep softfp :)
-
• #20
Ok, I just tried the black box library and it's useless. It returns 1 if you're moving, 0 otherwise.
I just decompiled it to verify as well, and there's nothing in there. So realistically we'd need some other way to guess the activity type.
Having said that, I got curious and decompiled the heart rate algorithm, and despite all those sport types the only ones that are handled in any way are:
SPORT_TYPE_NORMAL =0x00, SPORT_TYPE_RUNNING =0x01, SPORT_TYPE_RIDE_BIKE =0X02, SPORT_TYPE_SPINNING = 0X12
And it looks like SPORT_TYPE_RUNNING and SPORT_TYPE_RIDE_BIKE do the same thing, and SPORT_TYPE_SPINNING just makes the algorithm more sensitive to accelerometer data.
So maybe that activity detection function is correct - maybe if the step counter is counting steps we should just tell the heart rate monitor it's in running mode.
Having said that, looking at the graph running is actually handled properly, so maybe we shouldn't worry too much about setting the activity mode!
-
• #21
Having said that, looking at the graph running is actually handled properly,
running (third part?) looks OK, it is the second part (walking?) which is a bit off so maybe if that would be in RUNNING mode instead of NORMAL it could be better
however running part is now measured in NORMAL mode too and is OK(?)
-
• #22
Please note that I was walking fast, not slow or medium. This is almost running, I was intentionally testing extreme cases.
-
• #23
however running part is now measured in NORMAL mode too and is OK(?)
Yes, that's my concern. I guess I could expose the sport mode in some way so that anyone interested could fiddle with it, but I feel like for the moment the current solution is still a lot better than it was before!
-
• #24
Please note that I was walking fast
that's even more interesting because when you was running the NORMAL mode managed to do quite well but fast walking did not work so well (20-10BPM lower)
i just suppose that you were for x coordinate
0-50 sitting still
50-130 walking fast
130-210 running without moving
210- sitting still -
• #25
Sorry forgot to mention that but yes we have approx one 1 pwm per second with event number on x axis.
It was ca.- one minute
2 two minutes (~60-180)
3 one minute (180+)
4 one minute
Maybe I'll do different plots for difference activities in the future (walking slow/fast, running, stairs, bicycle)
- one minute
I have now switched over to the proprietary heart rate algorithm provided by the sensor manufacturer - which I hope will do a better job at giving you accurate heart rate readings, especially when moving around.
It can be supplied with some different sports modes - for instance it may be possible to switch it to a mode that'll make it work better for Running - but I haven't tried this yet.
It's a shame to move away from an Open Source implementation, but realistically despite making a test harness nobody seems to have been very interested in trying out better solutions.
In the end, I think most Bangle.js users care more about a working Heart rate sensor than an open one - although anyone who does care can still build their own firmwares with the open one with relative ease.