• Sat 2019.08.24

    Took a bit longer than I anticipated

    Attempting to uncover the source of this rounding error reminds me of a book I read thirty years ago that had an impact on my career, or at least raised my awareness of the world of computing. I was buried in electronics at the time, barely past the tube era. (I hear the laughter now) Oops, sorry, proper web etiquette: ROTF LOL

    I even remember the author, Clifford Stoll. Anyone with me? The Cuckoo's Egg is a true story in which Cliff tracks down a $0.75 accounting discrepancy, the kind one might first write off as a rounding error, that leads him down the path of tracking a crime in progress. And across a 1200 baud modem! Remember those days? I was mesmerised by the step-by-step process of finding the computer anomaly. I can't immediately put my hand on the hard copy I picked up later in the nineties, but I did find a link. Isn't the web a marvelous place!!

    Grab a copy - worth the read

    http://bayrampasamakina.com/tr/pdf_stoll_4_1.pdf



    Back to our own Cuckoo's Egg: the rounding issue that has been plaguing us.

    While cranking out routines to assist in building some data to analyze, I uncovered a slick online tool to create Floating Point values.

    https://www.exploringbinary.com/floating-point-converter/

    I used this to cross check my manual register byte creation. Still need some looping functions to solve this anomaly though.





    From the following we can visually verify that the @valderman Feb 2012 snippet does in fact return consistent accurate results.

    Output from Chrome browser

    significand= 001101001010000001010001111010111000010100
    significand= 01234567890123456789012345678901      
    //    array element index reference only
    
    
    // we have a sign bit, eight exponent bits, followed by the significand, 
    // the string of bits we are after starting at element 10
    
    
    2^0 = 1 add leading implicit 1
    // we would use the above for our decimal representation
    
    str = [ 10000001010001111010111 ] 
    
    [ 2^ -1 ] 0.5
    [ 2^ -8 ] 0.00390625
    [ 2^-10 ] 0.0009765625
    [ 2^-14 ] 0.00006103515625
    [ 2^-15 ] 0.000030517578125
    [ 2^-16 ] 0.0000152587890625
    [ 2^-17 ] 0.00000762939453125
    [ 2^-19 ] 0.0000019073486328125
    [ 2^-21 ] 4.76837158203125e-7
    [ 2^-22 ] 2.384185791015625e-7
    [ 2^-23 ] 1.1920928955078125e-7
    
    sum = 0.5049999952316284
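
    The table above can be reproduced with a short loop. This is only a minimal sketch, not the @valderman snippet or the attached file: the helper name sumFractionBits is made up for illustration, and the bit string is copied from the listing above. Each set bit at index i contributes 2^-(i+1) to the total.

```javascript
// Sum the weights 2^-n of the set bits in a binary fraction string.
// sumFractionBits is a hypothetical helper name, used here for illustration only.
var str = "10000001010001111010111";   // the 23 fraction bits from the listing above

function sumFractionBits(bits) {
  var sum = 0;
  for (var i = 0; i < bits.length; i++) {
    if (bits.charAt(i) === "1") {
      sum += Math.pow(2, -(i + 1));    // bit at index i carries weight 2^-(i+1)
    }
  }
  return sum;
}

console.log(sumFractionBits(str));             // Chrome prints 0.5049999952316284
console.log(sumFractionBits(str).toFixed(2));  // 0.50
```

    With only 23 bits involved, every partial sum fits in a double without rounding, so in a browser the additions themselves should not add any error.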
    



    Note the number of digits the browser presents to the right of the decimal point (D.P.).

    To determine whether any issues arise while adding (remember, we use the PC to perform the addition, and it relies on the same questionable floating point to perform that calculation), I added the intermediate sum after each step.

    I've added an element position ruler beneath each addend to ease locating digits, ref D:12345678901234567890

    // Chrome browser
    2^0 = 1 add leading implicit 1
    str = [ 10000001010001111010111 ] 
    
    [ 2^-1 ]
    0.5
    - - - - - - - - - - - - - - - - - - - -
    0.5
    
    [ 2^-8 ]
    0.00390625
    D:12345678901234567890
    - - - - - - - - - - - - - - - - - - - -
    0.50390625
    
    [ 2^-10 ]
    0.0009765625
    D:12345678901234567890
    - - - - - - - - - - - - - - - - - - - -
    0.5048828125
    
    
    [ 2^-14 ]
    0.00006103515625
    D:12345678901234567890
    - - - - - - - - - - - - - - - - - - - -
    0.50494384765625
    
    [ 2^-15 ]
    0.000030517578125
    D:12345678901234567890
    - - - - - - - - - - - - - - - - - - - -
    0.504974365234375
    
    
    [ 2^-16 ]
    0.0000152587890625
    D:12345678901234567890
    - - - - - - - - - - - - - - - - - - - -
    0.5049896240234375
    
    
    [ 2^-17 ]
    0.00000762939453125
    D:12345678901234567890
    - - - - - - - - - - - - - - - - - - - -
    0.5049972534179688
    
    
    [ 2^-19 ]
    0.0000019073486328125
    D:12345678901234567890
    - - - - - - - - - - - - - - - - - - - -
    0.5049991607666016
    
    
    [ 2^-21 ]
    4.76837158203125e-7
    D:12345678901234567890
    - - - - - - - - - - - - - - - - - - - -
    0.5049996376037598
    
    [ 2^-22 ]
    2.384185791015625e-7
    D:12345678901234567890
    - - - - - - - - - - - - - - - - - - - -
    0.5049998760223389
    
    [ 2^-23 ]
    1.1920928955078125e-7
    D:12345678901234567890
    - - - - - - - - - - - - - - - - - - - -
    
    - - - - - - - - - - - - - - - - - - - -
    - - - - - - - - - - - - - - - - - - - -
    D:12345678901234567890
    0.5049999952316284
    
    toFixed(2)
    0.50
    
    // Chrome browser
    



    Note that the browser output renders up to 19 base ten numerals to the right of the D.P. here (leading zeros included, with no more than 17 significant digits), and switches to 'E' notation once a value drops below 1e-6.

        document.write( sum.toString(10) );
    
      // toString(10) appears to have no effect on the formatting here: writing the bare number uses the same default base ten conversion
        document.write( sum );
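
    To make the switch to 'E' notation visible, here is a small sketch using two of the addends from the table. The variable names are made up, and the toFixed(21) call assumes an engine that accepts more than 20 fraction digits (current Chrome and Node do; Espruino may not).

```javascript
// How the default Number-to-string conversion renders small addends.
var w19 = Math.pow(2, -19);
var w21 = Math.pow(2, -21);

console.log(String(w19));       // 0.0000019073486328125  (positional, value >= 1e-6)
console.log(String(w21));       // 4.76837158203125e-7    (exponent form, value < 1e-6)

// toString(10) is the same conversion that String() and document.write() apply.
console.log(w21.toString(10) === String(w21));   // true

// toFixed() returns positional digits on request.
console.log(w21.toFixed(21));   // 0.000000476837158203125
```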
    
    Findings:
    
    Chrome output, no format specifier: 16 numerals past the D.P. in the final value, with up to 19 rendered in the intermediate addends used to get there
    0.5049999952316284
    
    
    Espruino output, no format specifier: truncates to (up to) 11 numerals past the D.P.
    
    0.50499999523
    





    Now let's compare to the Espruino output

    // Espruino output
    2^0 = 1   add leading implicit 1
     
     
    [ 2^- 1 ]
    0.5
     
    D:12345678901234567890
    -------------------------
    0.5
     
     
    [ 2^- 8 ]
    0.00390625
     
    D:12345678901234567890
    -------------------------
    0.50390625
     
     
    [ 2^-10 ]
    0.0009765625
     
    D:12345678901234567890
    -------------------------
    0.5048828125
     
     
    [ 2^-14 ]
    0.00006103515
     
    D:12345678901234567890
    -------------------------
    0.50494384765
     
     
    [ 2^-15 ]
    0.00003051757
     
    D:12345678901234567890
    -------------------------
    0.50497436523
     
     
    [ 2^-16 ]
    0.00001525878
     
    D:12345678901234567890
    -------------------------
    0.50498962402
     
     
    [ 2^-17 ]
    0.00000762939
     
    D:12345678901234567890
    -------------------------
    0.50499725341
     
     
    [ 2^-19 ]
    0.00000190734
     
    D:12345678901234567890
    -------------------------
    0.50499916076
     
     
    [ 2^-21 ]
    0.00000047683
     
    D:12345678901234567890
    -------------------------
    0.50499963760
     
     
    [ 2^-22 ]
    0.00000023841
     
    D:12345678901234567890
    -------------------------
    0.50499987602
     
     
    [ 2^-23 ]
    0.00000011920
     
    D:12345678901234567890
    -------------------------
    0.50499999523
    
    toFixed(2)
    0.50
    
    // Espruino output
    



    Using our online floating point generator, we can see that our input of 1234.505, in either precision, is stored as a number that should round up in the hundredths place. The single precision value 1234.5050048828125 contains enough bits to the right of our last thousandths numeral '5' to allow for a round up. During the conversion back to decimal, several bits beyond 2^-23 are lost, leaving us with a value that is just shy of our original '5' in the thousandths place: 1234.50499999
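
    The converter's values can be cross checked in a desktop browser or Node. This sketch assumes Math.fround is available (it may not be in Espruino), and the variable names are made up. It shows the stored values rounding up, while the hand summed value falls just short.

```javascript
// Compare the stored single and double precision values of 1234.505.
var asDouble = 1234.505;
var asSingle = Math.fround(1234.505);   // nearest 32-bit float, widened back to a double

console.log(asSingle);                  // 1234.5050048828125
console.log(asSingle.toFixed(2));       // 1234.51
console.log(asDouble.toFixed(2));       // 1234.51

// The 23-bit truncated fraction summed earlier sits just below .505
var truncated = 1234 + 0.5049999952316284;
console.log(truncated.toFixed(2));      // 1234.50
```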

    As I have hand coded two examples and we have the @valderman stackoverflow snippet as an additional cross check, I'm 0.9987654321 x 10^2 % certain that the floating point mechanism works near flawlessly, as one would expect, on the assumption that proven underlying 'C' math libraries are accurate and are actually used.



    But, still a bit puzzling, Espruino and the Chrome browser produce nearly identical numerals to the right of the D.P.:
    Espruino - Single precision 32 bit: 0.50499999523
    Chrome - Single precision 32 bit: 0.5049999952316284

    The online calculator produces a slightly greater value:

    Single precision 32 bit: 1234.505004882812
    Double precision 64 bit: 1234.5050000000001091393642127513885498046875



    As the conversion to and from F.P. introduces a slight bit of error, it would be nice if we had an adder for the digits to the right of the D.P. that works using integers only. This could help determine whether additional slop creeps in during the summation of the equivalent right of D.P. bits. Anyone up for the task of creating that little wonder, while I battle cleaning up these snippets to allow posting here?
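
    One possible take on such an integer-only adder, offered as a rough sketch rather than a finished tool: the function name exactFraction is made up, and the approach (halving an array of decimal digits, working from the last bit to the first) is just one way to do it. No floating point value is ever created, so any slop seen elsewhere cannot come from here.

```javascript
// Exact decimal expansion of a binary fraction string, e.g. "101" -> "0.625".
// Only small integers are used: one array entry per decimal digit.
// exactFraction is a hypothetical name, used here for illustration only.
function exactFraction(bits) {
  var digits = [];                        // decimal digits to the right of the D.P.
  // Horner's rule from the least significant bit: acc = (bit + acc) / 2
  for (var i = bits.length - 1; i >= 0; i--) {
    var carry = bits.charAt(i) === "1" ? 1 : 0;   // the bit is the units digit before halving
    var next = [];
    for (var j = 0; j < digits.length; j++) {
      var cur = carry * 10 + digits[j];   // long division by 2, digit by digit
      next.push(cur >> 1);
      carry = cur & 1;                    // odd remainder moves to the next digit
    }
    if (carry) next.push(5);              // a leftover half becomes a trailing 5
    digits = next;
  }
  return "0." + (digits.length ? digits.join("") : "0");
}

console.log(exactFraction("10000001010001111010111"));
// 0.50499999523162841796875
```

    The result has one decimal digit per bit, so all 23 digits of the truncated fraction are visible, compared with the 16 Chrome prints and the 11 Espruino prints.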



    The attached HTML file presents the conversion to a floating point value and uses color hinting to show how the bits are concatenated to form the register value used in math calculations. It was a quick hack to compare summation output against Espruino output. Nothing fancy. It needs a bit of refactoring and is more of a look at what's going on under the hood. It can be used to check conversions back to decimal, along with toFixed() checks.


    1 Attachment
