Long Double Output Bug In MinGW

You are working on a C++ program that needs very accurate floating-point numbers, so you decide to use long double for the extra precision. After a few calculations, you print the number to make sure it is correct. To your shock, instead of 123.456789, it is printed as -6.518427 × 10^264 (or 2.745563, depending on your computer). What could have caused this?

This is actually a bug in some versions of MinGW g++ 4.8.1, specifically revision 4 provided by the MinGW-get distribution for x86 architecture. If you find any other versions where this happens, please comment below!

It is very easy to check if your g++ has the issue:

long double val = 123.456789L;
double bitsAsDouble = * (double *) &val;
std::cout << "When interpreted as long double: " << val << '\n';
std::cout << "When interpreting bits as double: " << bitsAsDouble << '\n';

If they print the same number, then your version of g++ has this bug.

The Bug

In g++, long double is implemented using the 80-bit x86 Extended Precision Format. However, Microsoft decided some time ago to make long double a synonym for double. This means that long double in Visual Studio and, more importantly, in Microsoft’s runtime DLL uses the 64-bit IEEE 754 double precision format. When MinGW uses the Microsoft runtime DLL to print the long double directly, the number is interpreted as using only 64 bits, so garbage is printed. Here is the patch for MinGW which fixed this and explains the issue. The current bug appears to be a regression in this version of g++.

I give some fixes later in this post, but first I would like to explore the bug and what influences its output.

Endianness

The output of this bug depends on Endianness, so for those who don’t know what it is: Endianness relates to how a computer stores data consisting of multiple bytes in its memory. The two different ways are:

  • Little-endian, where the data is stored with the least significant byte first.
  • Big-endian, where the data is stored with the most significant byte first.

Big-endian stores the bytes in the same order that they occur in the written number, while little-endian stores them in reverse order.

So the number 0102 0304 0506 0708₁₆, when looking at the bytes in memory, would be stored as 0102 0304 0506 0708₁₆ on a big-endian architecture and as 0807 0605 0403 0201₁₆ on a little-endian architecture. Both big- and little-endian have some advantages; some examples can be found on Quora.

Endianness normally only matters when your computer is communicating with other computers (because they must agree on an endianness to be able to communicate multi-byte data). However, endianness changes the output you get with this bug, because the runtime does not read all of the bytes of the number. For the number 123.456789L, the bytes loaded will be B057 5870 3FE0 E9F6₁₆ for little-endian, and 4005 F6E9 E03F 7058₁₆ for big-endian.

The easiest way to determine which Endianness your computer has is:

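int num = 1;
// Look at the first byte of num in memory: it is 1 only if the least significant byte comes first.
bool isLittleEndian = *(char *) &num;
if (isLittleEndian)
    std::cout << "Is little endian.\n";
else
    std::cout << "Is big endian.\n";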

because isLittleEndian will be true iff the least significant byte of num is stored first. Here is a small C++ file to allow you to determine the endianness of your computer.

The Two Possible Interpretations

There is another way to confirm what is happening with this bug: look at the actual bits stored for a long double, and see what happens if we instead interpret them as a double. I have created a small C++ program to print out the bits for a long double vs a double.

The number 123.456789L, in x86 Extended Precision Format, is 4005 F6E9 E03F 7058 57B0₁₆, so we can now consider what happens to this number if it is interpreted as a double:

In Little-Endian Architecture

In a little-endian machine, the number 123.456789L is stored as B057 5870 3FE0 E9F6 0540 0000₁₆. You may notice that this implies that long double is stored as a 96-bit number rather than an 80-bit number. Although it occupies 96 bits (or even 128 bits), everything after the first 80 bits is padding that is never used, so the number is treated as being stored as B057 5870 3FE0 E9F6 0540₁₆. The padding bits are there to maintain data alignment. Here is an example of how data alignment can affect the size of your data structures.

Due to how the number is stored, when Microsoft’s runtime tries to print the long double, it reads only the first 8 bytes, B057 5870 3FE0 E9F6₁₆, which when converted from little-endian is F6E9 E03F 7058 57B0₁₆.

The runtime then interprets these 8 bytes using the IEEE 754 double precision format:

  • The sign is 1 (the bit in position 63)
  • The exponent is 76E₁₆ = 1902₁₀ (the bits in positions 52-62)
  • The fraction/mantissa is 9 E03F 7058 57B0₁₆ (the bits in positions 0-51)

By following the format for double precision floating point, we can see that the number is:

(-1)^1 × 1.1001111000000011111101110000010110000101011110110000₂ × 2^(1902 - 1023)
= -1 × 1.100111100000001111110111000001011000010101111011₂ × 2^879
= -1.61725₁₀ × 2^879
= -6.518427₁₀ × 10^264

which is exactly what is printed out on my computer.

In Big-Endian Architecture

In a big-endian machine, the number 123.456789L would be stored as 4005 F6E9 E03F 7058 57B0 0000₁₆. When Microsoft’s runtime tries to print 123.456789L, it reads only the first 8 bytes, 4005 F6E9 E03F 7058₁₆, and interprets them in that order.

The runtime then interprets these 8 bytes using the IEEE 754 double precision format:

  • The sign is 0 (the bit in position 63)
  • The exponent is 400₁₆ = 1024₁₀ (the bits in positions 52-62)
  • The fraction/mantissa is 5 F6E9 E03F 7058₁₆ (the bits in positions 0-51)

By following the format for double precision floating point, we can see that the number is:

(-1)^0 × 1.0101111101101110100111100000001111110111000001011000₂ × 2^(1024 - 1023)
= 1.0101111101101110100111100000001111110111000001011₂ × 2^1
= 10.101111101101110100111100000001111110111000001011₂
(Using a little trick I found at http://mathforum.org/library/drmath/view/56091.html: there are 48 digits after the binary point, so the fractional part equals those digits read as an integer, divided by 2^48.)
= 2₁₀ + 101111101101110100111100000001111110111000001011₂ ÷ 2^48
= 2₁₀ + 209857404202507₁₀ ÷ 2^48
= 2.7455632705078123₁₀

Fixes

As I mentioned in my answer on Stack Overflow, there are a few fixes:

  • The easiest way to fix this would be to simply change the version of MinGW you are using to 4.9 or 4.7, depending on what you need (you can get 4.9 here).
  • If you want to continue to use MinGW 4.8, you can also just download and install a different distribution of MinGW, which I found to have no issues.
  • You can even just cast to a double whenever you print the long double (there is some loss of accuracy, but it shouldn’t matter when just printing out numbers). For example, change:

    long double val = XXX;
    std::cout << val << '\n';

    to

    long double val = XXX;
    std::cout << (double) val << '\n';
  • If you are willing to instead use printf, you could change to printf("%Lf", ...), and either:
    • add the flag -posix when you compile (g++ -posix …)
    • add #define __USE_MINGW_ANSI_STDIO 1 before your #include <cstdio> (found this from the original patch)

If you have any suggestions or corrections, please comment below!

Update: Changed title to “Long Double Output Bug In MinGW” from “Long Double Bug In MinGW” due to the only issue being when it tries to output the long double.

Categories: bugs, c++

5 thoughts on “Long Double Output Bug In MinGW”

  1. You really don’t get it.

    The answer on StackOverflow was mostly right. You got all the bits right, but you decided to add endianness, I have no idea why, and ended up with a completely incorrect answer.

    And your example program is full of misdirection.

    When you cast a long double to a double, the compiler behind the scenes transforms the exponent and fraction into the smaller representation.
    When you do pointer coercion, you tell your program that “the bits at this location are in the format of a double, interpret them as such and print them out.” And your program reads the first half of the long double and prints garbage, because you put garbage in the middle of the process.

    The conclusion and reasoning you wrote here are hilariously wrong and should be removed before someone decides they are true. Sources are mostly right.

    1. I explain endianness because, with this bug, the output does depend on the order in which the bytes are stored, since not all of the bytes are read.

      I agree that I should have been more explicit about why I was doing pointer coercion. It was not related to casting; it was meant to show why the long double number was printed incorrectly.

      The value I found from interpreting the bits is the actual number that is printed out on my little-endian computer, although I have been unable to test the big-endian output.

    2. I really do not understand your complaint. Of course the pointer coercion produces garbage; it reproduces exactly the garbage you obtain when the MinGW long double is passed to MSVC’s implementation of printf. It does a nice job of showing the internals of the bug.

      I don’t understand how the endianness point is incorrect. The bug produces a different result on a big-endian architecture, so naturally the blog post will discuss endianness.

  2. Hi,
    Thanks a lot for the explanation, I’ve been stuck on this problem for a while… However, I am using cygwin and I don’t really know how to implement your fixes. Any hint would be much appreciated.
    Thanks for the help!

    1. Hi Matt,

      The easiest fix would be, whenever you do:

      ... << val ...

      (where val is a long double you are wanting to output)
      instead just cast val to a double on your own:

      ... << (double) val ...

      So:

      long double val = XXX;
      std::cout << (double) val << '\n';
      

      Hope this helps!
