What you didn't know about Float?

Although floats are typically a reliable representation of real numbers, there are instances when they fall short, resulting in unexpected outcomes. While they usually provide a satisfactory approximation, it's important to note that they are not infallible.

For example, try running this code:

 x = 0.0
 for i in range(10):
     x = x + 0.1
 if x == 1.0:
     print(x, '= 1.0')
 else:
     print(x, 'is not 1.0')
 # output
 0.9999999999999999 is not 1.0

Surprising, isn't it? To understand why this happens, we need a grasp of how floating-point numbers are represented in the computer during a computation, and to understand that, we first need to understand binary numbers.

In decimal (base 10) notation, any number can be represented by a sequence of the digits 0123456789. The rightmost digit is the 10^0 place, the next digit to the left is the 10^1 place, and so on.

For example, the sequence of decimal digits 457 represents:

$$457 = 4(10^2) + 5(10^1) + 7(10^0)$$
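A quick check of this expansion in Python:

 # place-value expansion of 457 in base 10
 assert 4 * 10**2 + 5 * 10**1 + 7 * 10**0 == 457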

Binary numbers (base 2) work similarly. A binary number is represented by a sequence of digits, each of which is either 0 or 1; these digits are often called bits. The rightmost digit is the 2^0 place, the next digit to the left is the 2^1 place, and so on.

For example, the sequence of binary digits 101 represents the following, which is equal to 5 in decimal.

$$101 = 1(2^2) + 0(2^1) + 1(2^0) = 5$$
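The same check works for binary; Python's int accepts a base argument, so it can parse a string of bits directly:

 # place-value expansion of binary 101 in base 2
 assert 1 * 2**2 + 0 * 2**1 + 1 * 2**0 == 5
 # int with base 2 parses a string of binary digits
 assert int('101', 2) == 5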

All modern computer systems represent numbers in binary because it is easy to build hardware switches, i.e., devices that can be in only one of two states, on or off.

A brilliant discussion of binary by Prof. David Malan can be found in CS50 (give it a watch).

In modern programming languages, non-integer numbers are implemented using a representation called floating point. A number is represented as a pair of integers: the significant digits of the number and an exponent. For example, the number 1.756 would be represented as the pair (1756, -3), which stands for the product 1756*10^-3.
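As a rough sketch (this is just the idea, not how Python actually stores floats internally), the pair can be turned back into a number like this:

 # the pair (1756, -3) stands for 1756 * 10**-3
 digits, exponent = 1756, -3
 print(digits * 10**exponent)   # 1.756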

The number of significant digits determines the precision with which numbers can be represented. If only two significant digits were available, the number 1.949 could not be represented exactly; it would have to be converted to an approximation, in this case 1.9. That approximation is called the rounded value.
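For instance, Python's decimal module can mimic a system with only two significant digits:

 from decimal import Context
 # a context with a precision of two significant digits rounds 1.949 to 1.9
 two_digits = Context(prec=2)
 print(two_digits.create_decimal('1.949'))   # 1.9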

Computers use binary rather than decimal to store both the significant digits and the exponent, and the exponent is interpreted as a power of 2 rather than a power of 10.

For example, the decimal number 0.625 (5/8) would be represented as the pair (101, -11): 101 is the binary representation of 5, and -11 is the binary representation of -3, so the pair (101, -11) stands for 5*2^(-3) = 5/8 = 0.625.
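You can check this directly, since int with base 2 also handles a sign:

 significand = int('101', 2)          # 5
 exponent = int('-11', 2)             # -3
 print(significand * 2**exponent)     # 0.625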

What about the decimal fraction 1/10, which we write in Python as 0.1? The best we can do with four significant binary digits is (0011, -101). This is equivalent to 3/32, i.e., 0.09375. If we had five significant binary digits, we would represent 0.1 as (11001, -1000), which is equivalent to 25/256, i.e., 0.09765625.
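Both approximations are easy to verify:

 # four significant bits: (0011, -101) -> 3 * 2**-5
 print(int('0011', 2) * 2**int('-101', 2))    # 0.09375
 # five significant bits: (11001, -1000) -> 25 * 2**-8
 print(int('11001', 2) * 2**int('-1000', 2))  # 0.09765625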

How many significant digits would we need to get an exact floating-point representation of 0.1? An infinite number of digits! So, no matter how many bits Python (or any other language) uses to represent floating-point numbers, it can represent only an approximation to 0.1.
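In fact, Python can show you the exact value it stores for the literal 0.1; converting it to a Decimal exposes the approximation, which is slightly greater than 1/10:

 from decimal import Decimal
 # the exact binary value stored for 0.1, written out in decimal
 print(Decimal(0.1))
 # 0.1000000000000000055511151231257827021181583404541015625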

Let's return to the original program:

 x = 0.0
 for i in range(10):
     x = x + 0.1
 if x == 1.0:
     print(x, '= 1.0')
 else:
     print(x, 'is not 1.0')

We can now see that the test x == 1.0 evaluates to False because the value bound to x is not exactly 1.0, so the else clause is executed.

"Why did it conclude that x was below 1.0 despite the floating-point representation of 0.1 being slightly greater than 0.1? The reason behind this was that Python encountered a situation during one of the loop iterations where it lacked significant digits, and as a result, it performed rounding, which resulted in a downward adjustment."

"Does the difference between real and floating-point numbers really matter?Generally, it is not a significant concern, except in a few cases where slight variations in numbers such as 0.9999999999999999, 1.0, and 1.00000000000000001 can have an impact.

However, it is crucial to be cautious when testing for equality: comparing two floating-point values with the == operator can yield unexpected results.

It is almost always more appropriate to ask whether two floating-point values are close enough to each other, not whether they are identical. So, for example, it is better to write abs(x - y) < 0.0001 rather than x == y.
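In Python, that test can be written by hand or with math.isclose from the standard library:

 import math
 x = 0.0
 for i in range(10):
     x = x + 0.1
 print(x == 1.0)                 # False
 print(abs(x - 1.0) < 0.0001)    # True
 print(math.isclose(x, 1.0))     # True (default relative tolerance is 1e-09)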

Liked this article? Follow me on Twitter 🐦

References

Prof. John V. Guttag, Introduction to Computation and Programming Using Python, Second Edition, Chapter 3: Some Simple Numerical Programs, Section 3.4 "A Few Words About Using Floats," p. 34.