Number.toFixed() not rounding per specification

Robin
Wed 2019.08.21

Using the IEEE 754 specification linked in #2 above, and those primitive tools, pencil and paper, I made an attempt at viewing the bits used to create the floating point representation of the value we discovered to be problematic.

GOAL: Determine whether the issue lies with the functions performing the conversion, or whether the underlying floating point math is the problem.

Starting with our initial value, we split it into its integer and fractional parts. The integer part is converted to base 2 using a calculator, with the hex equivalent noted for reference. The fractional part uses an iterative process of simple multiplication by two, recording the carry value, either a one or a zero, at each step.

Step A
Convert the integer part to its equivalent in base 2
```
var kwh = 1234.505;

Convert to F.P.
Step A   Integral part 1234 base 10 = 0100 1101 0010 base 2 == 0x04D2

Fractional Part 0.505
0.505 x 2 = 1.01  1 
0.01 x 2  = 0.02  0 
0.02 x 2  = 0.04  0 
0.04 x 2  = 0.08  0 

0.08 x 2  = 0.16  0 
0.16 x 2  = 0.32  0 
0.32 x 2  = 0.64  0 
0.64 x 2  = 1.28  1 

0.28 x 2  = 0.56  0 
0.56 x 2  = 1.12  1 
0.12 x 2  = 0.24  0 
0.24 x 2  = 0.48  0 

0.48 x 2  = 0.96  0 
0.96 x 2  = 1.92  1 
0.92 x 2  = 1.84  1 
0.84 x 2  = 1.68  1
```
I only calculated 16 places by hand, as we really don't know how many bits will be required and, quite frankly, it is a tedious process.
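The pencil and paper work in Step A can be cross-checked with a few lines of JavaScript. This is only a sketch: `fractionBits` is a helper name made up here, and the snippet assumes an engine where `Number.prototype.toString(2)` is available (Node.js or a browser; I have not tried it on a board).

```javascript
// Hypothetical helper: first `count` base 2 digits of a fraction (0 <= frac < 1),
// using the same multiply-by-two-and-record-the-carry process as the worksheet.
function fractionBits(frac, count) {
  var bits = "";
  for (var i = 0; i < count; i++) {
    frac *= 2;
    if (frac >= 1) {
      bits += "1";   // carry of one
      frac -= 1;
    } else {
      bits += "0";   // carry of zero
    }
  }
  return bits;
}

var kwh = 1234.505;
var intPart = Math.floor(kwh);            // 1234
var fracPart = kwh - intPart;             // roughly 0.505
var intBits = intPart.toString(2);        // "10011010010"
var fracBits = fractionBits(fracPart, 16); // "1000000101000111"

console.log(intBits, fracBits);
```

Both strings agree with the hand calculation. Asking for more than 16 places shows the pattern never terminates, which is why 0.505 cannot be stored exactly.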

Step B
Normalize the integer part (in this case, move the D.P. left) so that we have a value of one point something multiplied by its base 2 power. We are after the resulting power, which becomes the exponent.
```
0100 1101 0010

Step B Normalize - move D.P. left 10 places

// we drop the leading zero in the initial value

 1.00 1101 0010  base2 x 2 ^10

We moved the D.P. ten places left which is the equivalent of multiplying one and the fractional part by 2 ^ 10
```
We mentally save ten, the base 2 power, and record both values to use in subsequent calculations. Looking back at the getNumberParts(x) function by @valderman (Feb 2012, stackoverflow.com) from #2 above, we can now validate that the exponent of 10 does match.
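The normalization in Step B can be sketched the same way. The variable names are mine, and the exponent is taken simply as the number of places the D.P. moves, which is one less than the number of integer bits.

```javascript
var intBits = (1234).toString(2);          // "10011010010", 11 bits
var exponent = intBits.length - 1;         // D.P. moves 10 places left
var normalized = "1." + intBits.slice(1);  // "1.0011010010"

// Sanity check: dividing by 2^10 must leave a value between 1 and 2
var scaled = 1234.505 / Math.pow(2, exponent);

console.log(normalized + " x 2^" + exponent, scaled);
```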

Step C
Now we can start to assemble our F.P. register value. Bit 31 is the sign, bits 23-30 hold the exponent, and bits 0-22 hold the mantissa.
```
1 bit sign  8 bit exponent  23 bit mantissa
reg 0  00000000  0000000000.....bit0
   31 30        22                 0
     msb    lsb msb              lsb 
```
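Assemble the base 2 fractional part using the manual calculations from Steps A and B. The exponent field stores our power of ten plus the single precision bias of 127, so 10 + 127 = 137, or 1000 1001 in base 2.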
```
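mantissa is formed from the normalized integer bits (the leading one is implied and dropped)
00 1101 0010  followed by the fractional bits from Step A
1000 0001 0100 0111
only the first 13 of those 16 fractional bits fit, as 10 + 13 = 23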

23 bits mantissa
reg 0  10001001  0011010010.....bit0
   31 30        22                 0
     msb    lsb msb              lsb 
```
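For anyone wanting to repeat the assembly without the pencil, here is a rough sketch that glues the three fields together as strings. `padLeft` is a throwaway helper of my own, and note that the mantissa is simply cut off at 23 bits, exactly as done by hand above, with no rounding of the dropped bits.

```javascript
// Hypothetical helper: left-pad a bit string with zeros to a fixed width
function padLeft(s, width) {
  while (s.length < width) s = "0" + s;
  return s;
}

var sign = "0";                                          // positive number
var exponentBits = padLeft((10 + 127).toString(2), 8);   // biased exponent 137
var integerBits = "0011010010";                          // Step B, leading one dropped
var fractionBits = "1000000101000111";                   // Step A, 16 places
var mantissaBits = (integerBits + fractionBits).slice(0, 23); // truncate to 23 bits

var word = sign + exponentBits + mantissaBits;           // 32 bits in total

console.log(exponentBits);  // 10001001
console.log(mantissaBits);  // 00110100101000000101000
console.log(word);          // 01000100100110100101000000101000
```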
Step D
Convert the stream of bits into hex notation, a base we can read more easily
```
F.P.  0  1000 1001   0011 0100 1010 0000 0101 000

Separating by four bits, 0xF we now have

0100 0100 1001 1010 0101 0000 0010 1000

we can easily see the hex equivalent of

0x449A 5028

// this is our Floating Point equivalent value in hex for
//  our base 10 value of 1234.505
 
// that is the value used in the register performing F.P. calculations
//  make sure not to confuse with the human readable
//  hex equivalent 0x04d2 or 1234

which, read as a plain 32 bit integer, is
0x449A 5028  base 16
equiv - the register bits as an integer, not the original number 1234.505
1150963752  base 10
```
Now off to hand calculate the reverse process and build a simple snippet to check our work.
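As a rough starting point for that check, a `DataView` can show the 32 bit pattern the engine itself produces and can decode the hand built pattern back into a number. This sketch assumes an engine with typed array support and the default round to nearest behaviour. JavaScript numbers are normally 64 bit, so `setFloat32` is what forces the single precision layout used above.

```javascript
var handWord = "01000100100110100101000000101000"; // bit stream from Step D
var handValue = parseInt(handWord, 2);             // 1150963752
var handHex = handValue.toString(16);              // "449a5028"

var view = new DataView(new ArrayBuffer(4));

// What the engine stores for 1234.505 as a 32 bit float (big-endian by default)
view.setFloat32(0, 1234.505);
var machineHex = view.getUint32(0).toString(16);   // "449a5029"

// What the hand built pattern decodes back to
view.setUint32(0, handValue);
var handDecoded = view.getFloat32(0);              // 1234.5048828125

console.log(handHex, machineHex, handDecoded);
```

The two hex values differ only in the last bit. The hand calculation truncated the fractional bits after 13 places, while the conversion rounds to nearest, and the dropped bits (111...) are more than half a step, so the machine value ends in 9 rather than 8.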

Back in a bit (a long while)
