• Wed 2019.08.21

    Using the IEEE 754 specification linked in #2 above, and those primitive tools, pencil and paper, I made an attempt at viewing the bits used to create the floating point representation of the value we discovered to be problematic.

    GOAL: Determine whether the issue lies with the functions performing the conversion, or whether the underlying floating point math is at fault.

    We split our initial value into its integer and fractional parts. The integer part is converted to base 2 using a calculator, along with the hex equivalent for reference. The fractional part is converted by an iterative process of simple multiplication by two, recording the carry value, either a one or a zero, at each step.


    Step A
    Convert the integer part to its equivalent in base 2

    var kwh = 1234.505;
    
    Convert to F.P.
    Step A   Integral part 1234 base 10 = 0100 1101 0010 base 2 == 0x04D2
    
    Fractional Part 0.505
    0.505 x 2 = 1.01  1 
    0.01 x 2  = 0.02  0 
    0.02 x 2  = 0.04  0 
    0.04 x 2  = 0.08  0 
    
    0.08 x 2  = 0.16  0 
    0.16 x 2  = 0.32  0 
    0.32 x 2  = 0.64  0 
    0.64 x 2  = 1.28  1 
    
    0.28 x 2  = 0.56  0 
    0.56 x 2  = 1.12  1 
    0.12 x 2  = 0.24  0 
    0.24 x 2  = 0.48  0 
    
    0.48 x 2  = 0.96  0 
    0.96 x 2  = 1.92  1 
    0.92 x 2  = 1.84  1 
    0.84 x 2  = 1.68  1
    

    I only calculated 16 places by hand, as we don't yet know how many bits will be required and, quite frankly, it is a tedious process.
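
The Step A hand calculation can be sketched in a few lines of JavaScript. This is only an illustration: `fractionToBits` is a hypothetical helper name, not something from the thread, and the fraction it receives is already a binary approximation of 0.505, which is close enough for the first 16 places.

```javascript
// Sketch only: reproduces the Step A hand calculation.
// fractionToBits is a hypothetical helper name, not from the thread.
function fractionToBits(fraction, places) {
  var bits = '';
  for (var i = 0; i < places; i++) {
    fraction *= 2;                 // shift one binary place left
    if (fraction >= 1) {           // a carry into the units column
      bits += '1';
      fraction -= 1;               // keep only the fractional remainder
    } else {
      bits += '0';
    }
  }
  return bits;
}

var kwh = 1234.505;
var intPart = Math.floor(kwh);     // 1234
var fracPart = kwh - intPart;      // roughly 0.505

var intBits = intPart.toString(2);            // "10011010010"
var intHex = intPart.toString(16);            // "4d2"
var fracBits = fractionToBits(fracPart, 16);  // "1000000101000111"
console.log(intBits, intHex, fracBits);
```

The 16 bits returned match the carry column of the table above, read top to bottom.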


    Step B
    Normalize the integer part (move the D.P. left in this case) so we have a value of one point something multiplied by its base 2 power equivalent. We are after the resultant power, which is used as the exponent.

    0100 1101 0010
    
    Step B Normalize - move D.P. left 10 places
    
    // we drop the leading zero in the initial value
    
     1.00 1101 0010  base2 x 2 ^10
    
    We moved the D.P. ten places left, so the original value is the equivalent of one plus the fractional part, multiplied by 2 ^ 10
    

    We make a note of ten, the base 2 power, along with the normalized bits, to use in subsequent calculations. Looking back at the @valderman Feb 2012 stackoverflow.com getNumberParts(x) function from #2 above, we can now validate that the power 10 does match.
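
As a rough cross-check of Step B, the power can be read straight off the length of the binary string. The variable names here are illustrative assumptions, and this sketch only handles an integer part of one or more.

```javascript
// Sketch: find the Step B power of two without moving the point by hand.
var intPart = 1234;
var intBits = intPart.toString(2);    // "10011010010", no leading zero
var power = intBits.length - 1;       // places the point moves left: 10
var afterPoint = intBits.slice(1);    // bits right of the leading one: "0011010010"
console.log('1.' + afterPoint + ' base2 x 2 ^' + power);
```

The leading one is dropped from `afterPoint` because it always sits to the left of the point after normalizing.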



    Step C
    Now we may start to assemble our F.P. register value, using the 32 bit single precision layout. Bit 31 is the sign, bits 23-30 hold the exponent and bits 0-22 hold the mantissa.

    1 bit sign  8 bit exponent  23 bit mantissa
    reg 0  00000000  0000000000.....bit0
       31 30        22                 0
         msb    lsb msb              lsb 
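
The field layout above can be illustrated by pulling a known pattern apart with shifts and masks. This is a sketch, with `splitFields` as a hypothetical helper name; it uses 0x3F800000, the single precision pattern for 1.0, whose exponent field shows the bias of 127 that IEEE 754 adds to the stored power.

```javascript
// Sketch: split a 32 bit single precision pattern into its three fields.
// splitFields is a hypothetical helper name, not from the thread.
function splitFields(bits32) {
  return {
    sign: bits32 >>> 31,               // bit 31
    exponent: (bits32 >>> 23) & 0xFF,  // bits 23-30, stored with a bias of 127
    mantissa: bits32 & 0x7FFFFF        // bits 0-22
  };
}

var one = splitFields(0x3F800000);     // the pattern for 1.0
console.log(one.sign, one.exponent, one.mantissa);   // 0 127 0
```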
    

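    Assemble the exponent and mantissa, using the manual calculations from Steps A and B.

    The exponent field stores the power plus the single precision bias of 127
    10 + 127 = 137 = 1000 1001 base 2

    The mantissa is the ten normalized integer bits after the leading one (which is implied and not stored)
    00 1101 0010  followed by the first 13 fractional bits of
    1000 0001 0100 0111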
    
    23 bits mantissa
    reg 0  10001001  0011010010.....bit0
       31 30        22                 0
         msb    lsb msb              lsb
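
The assembly in Step C could be sketched as string concatenation. The variable names are assumptions for illustration, and the mantissa is simply truncated to 23 bits, as in the hand calculation, with no rounding applied.

```javascript
// Sketch: assemble the 32 bit single precision pattern from the hand-derived pieces.
var sign = '0';                                 // positive value
var power = 10;                                 // from Step B
var BIAS = 127;                                 // IEEE 754 single precision exponent bias
var exponentBits = (power + BIAS).toString(2);  // 137 -> "10001001"
while (exponentBits.length < 8) {
  exponentBits = '0' + exponentBits;            // pad to the full 8 bit field
}

var intMantissa = '0011010010';                 // integer bits after the leading one
var fracBits = '1000000101000111';              // 16 hand-calculated fractional bits
// Truncate to 23 bits, as in the hand calculation (no rounding applied here).
var mantissaBits = (intMantissa + fracBits).slice(0, 23);

var registerBits = sign + exponentBits + mantissaBits;
console.log(registerBits);
```

Only 13 of the 16 fractional bits fit once the ten integer bits have taken their share of the 23 bit field.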
    



    Step D
    Convert the stream of bits into hex notation, a base we can read more easily

    F.P.  0  1000 1001   0011 0100 1010 0000 0101 000
    
    Regrouping into blocks of four bits, one hex digit (0x0 to 0xF) each, we now have
    
    0100 0100 1001 1010 0101 0000 0010 1000
    
    we can easily see the hex equivalent of
    
    0x449A 5028
    
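    // this is our hand-calculated Floating Point equivalent value in hex
    //  for our base 10 value of 1234.505

    // note we truncated the fractional bits; the next bits are 111, so
    //  round-to-nearest hardware would store 0x449A 5029 instead

    // this is the kind of value held in the register performing F.P. calculations
    //  make sure not to confuse it with the human readable
    //  hex equivalent 0x04d2 or 1234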
    
    which, when the same 32 bits are read as a plain integer, is
    0x449A 5028  base 16
    equiv - the floating point bit pattern as an integer, not the original number 1234.505
    1150963752  base 10
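
One possible way to check the hand-built pattern against the machine is a `DataView` over a four byte buffer, assuming the runtime provides one. JavaScript numbers are normally 64 bit doubles, so this sketch only mirrors the single precision layout used in the walkthrough. The variable names are illustrative.

```javascript
// Sketch: compare the hand-built pattern with what the runtime produces.
var handBits = '01000100100110100101000000101000';
var handInt = parseInt(handBits, 2);          // 1150963752
var handHex = handInt.toString(16);           // "449a5028"

var view = new DataView(new ArrayBuffer(4));

// Read the hand-built pattern back as a single precision float.
view.setUint32(0, handInt);
var handValue = view.getFloat32(0);           // 1234.5048828125

// Let the runtime convert 1234.505 to single precision.
view.setFloat32(0, 1234.505);
var machineHex = view.getUint32(0).toString(16);  // "449a5029"

console.log(handHex, handValue, machineHex);
```

The last bit differs because the hand calculation truncated the fractional bits, while the conversion to single precision rounds to the nearest representable value.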
    



    Now off to hand calculate the reverse process and build a simple snippet to check our work.

    Back in a bit (a long while)

Started by @Robin