Floating-Point Calculator (IEEE 754)

Enter any decimal number to see its exact IEEE 754 binary representation in both single-precision (32-bit) and double-precision (64-bit) format. The calculator breaks each value into its sign bit, exponent field, and mantissa (fraction) field, shows the hexadecimal encoding, computes the actual stored value, and measures the precision error introduced by the finite binary representation. Special values like zero, infinity, denormalized numbers, and NaN are detected automatically.

Enter any real number. The calculator converts it to 32-bit and 64-bit IEEE 754 floating-point binary and shows where precision is lost.
Example results for the input 0.1 (32-bit single precision):

Sign bit: 0 (0 = positive, 1 = negative)
Exponent bits: 01111011
Mantissa bits: 10011001100110011001101
Hexadecimal: 3DCCCCCD
Stored value: 0.1000000015
Precision error: 1.490116e-9
Biased exponent: 123
True exponent: -4
Classification: Normal
32-bit hex: 3DCCCCCD
64-bit hex: 3FB999999999999A
Error (32-bit): 1.490116e-9
Error (64-bit): 5.551115e-18

0.1 in 32-bit single precision: sign=0, exponent=-4 (biased 123), class = Normal.

  • In 32-bit format the stored value differs from your input by 1.490116e-9. This is floating-point rounding in action.
  • 64-bit double precision stores the value with far greater accuracy (about 15-17 significant decimal digits vs. 7 for 32-bit).
  • Binary fractions can only represent numbers whose denominator is a power of 2. Values like 0.1 and 0.3 repeat infinitely in binary, causing the small rounding errors you see.
  • The hexadecimal encodings are: 32-bit = 0x3DCCCCCD, 64-bit = 0x3FB999999999999A.
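
The two encodings above can be reproduced with Python's standard `struct` module. This is a minimal sketch, not the calculator's own implementation; the helper name `float_to_hex` is an assumption made for illustration.

```python
import struct

def float_to_hex(value: float, fmt: str) -> str:
    """Return the big-endian IEEE 754 encoding of value as uppercase hex.

    fmt is a struct format character: "f" for 32-bit, "d" for 64-bit.
    Packing with "f" rounds the Python float (a double) to single precision.
    """
    return struct.pack(">" + fmt, value).hex().upper()

hex32 = float_to_hex(0.1, "f")
hex64 = float_to_hex(0.1, "d")
print(hex32)  # 3DCCCCCD
print(hex64)  # 3FB999999999999A
```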

Next step: Use 64-bit doubles for general computation and reserve 32-bit floats for GPU shaders or large float arrays where memory matters.

What is IEEE 754 floating-point?

IEEE 754 is the technical standard that defines how computers store and compute with real numbers in binary floating point. Almost every modern CPU, GPU, and programming language implements it. The standard was first published in 1985 and revised in 2008 and 2019. It defines several formats; the two you encounter most often are 32-bit single precision (called "float" in C and Java) and 64-bit double precision (called "double" in C and Java, and the default number type in JavaScript and Python). Each floating-point number is stored as three fields: a single sign bit, a biased exponent, and a significand (mantissa) field that encodes the fraction. For normal numbers the formula is: value = (-1)^sign x 2^(exponent - bias) x (1 + fraction), where fraction is the mantissa field read as a binary fraction in [0, 1) and the leading 1 is implicit.

How the three fields work

The sign bit is 0 for positive and 1 for negative. The exponent field is stored with a bias added: 127 for 32-bit and 1023 for 64-bit. To find the true power of 2, subtract the bias from the raw exponent bits. The mantissa (significand) stores the fractional part of the normalized number, with an implicit leading 1 that is never actually written into memory, saving one bit. So the significand is always in the range [1.0, 2.0) for normal numbers. Special exponent bit patterns signal the special values: all zeros means zero or subnormal, all ones means infinity (if the mantissa is zero) or NaN (if the mantissa is nonzero).
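
As an illustration of the field layout described above, the following sketch splits a value into its 32-bit sign, exponent, and mantissa fields. The function name `decompose_float32` and the dictionary keys are assumptions chosen for this example.

```python
import struct

def decompose_float32(value: float) -> dict:
    """Round value to float32 and return its three IEEE 754 fields."""
    # Pack as a 32-bit float, then reinterpret the same 4 bytes as an integer.
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    sign = bits >> 31                      # bit 31
    biased_exponent = (bits >> 23) & 0xFF  # bits 30-23
    mantissa = bits & 0x7FFFFF             # bits 22-0
    return {
        "sign": sign,
        "exponent_bits": format(biased_exponent, "08b"),
        "mantissa_bits": format(mantissa, "023b"),
        "biased_exponent": biased_exponent,
        # Subtracting the bias gives the true exponent for normal numbers only.
        "true_exponent": biased_exponent - 127,
    }

fields = decompose_float32(0.1)
for name, field_value in fields.items():
    print(name, field_value)
```

For the input 0.1 this yields a biased exponent of 123 and a true exponent of -4, matching the example results shown earlier on this page.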

Precision, rounding, and why 0.1 + 0.2 is not 0.3

Binary fractions can only represent values whose denominator is a power of 2: 0.5 (1/2), 0.25 (1/4), 0.125 (1/8), and so on. Decimal fractions like 0.1 = 1/10 cannot be written exactly in binary, so the computer stores the nearest representable value. In 32-bit, 0.1 is stored as approximately 0.100000001490116, an error of about 1.5e-9. In 64-bit, the error shrinks to about 5.6e-18. When you add two rounded values, the rounding errors can accumulate or partially cancel, which is why (0.1 + 0.2) in most programming languages evaluates to 0.30000000000000004 rather than 0.3. This calculator shows you the exact error for any value you enter.
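
The rounding described above can be made visible in Python, whose `float` is a 64-bit double. The sketch below relies on the fact that `Decimal(float)` converts the stored binary value exactly; the variable names are illustrative only.

```python
from decimal import Decimal

# Decimal(float) shows every digit of the value actually stored in memory.
stored_tenth = Decimal(0.1)
stored_sum = Decimal(0.1 + 0.2)
stored_three_tenths = Decimal(0.3)

print(stored_tenth)         # 0.1000000000000000055511151231257827021181583404541015625
print(stored_sum)           # slightly above 0.3
print(stored_three_tenths)  # slightly below 0.3
print(0.1 + 0.2 == 0.3)     # False
```

The sum lands on the representable value just above 0.3, while the literal 0.3 rounds to the one just below it, so the two compare unequal.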

Comparing 32-bit and 64-bit formats

Single precision (32-bit) offers about 7 significant decimal digits of accuracy, and its normal numbers span magnitudes from roughly 1.2e-38 to 3.4e+38. Double precision (64-bit) gives about 15-17 significant decimal digits, and its normal numbers span magnitudes from about 2.2e-308 to 1.8e+308. Using 32-bit instead of 64-bit halves memory for large arrays and often doubles throughput on GPUs and SIMD instruction sets, which is why machine learning and graphics workloads favor 32-bit (or even 16-bit) floats. General-purpose software should default to 64-bit to reduce surprising rounding errors in financial or scientific computations.
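
The difference in precision and range can be checked with a short experiment. This sketch assumes a hypothetical helper, `round_to_float32`, that simulates single precision by round-tripping a Python double through `struct`; the constants are derived from the field widths rather than taken from any library.

```python
import struct
import sys

def round_to_float32(value: float) -> float:
    """Round a Python float (a 64-bit double) to the nearest 32-bit float."""
    return struct.unpack(">f", struct.pack(">f", value))[0]

# 2**24 + 1 needs 25 significant bits, one more than float32 can hold.
print(round_to_float32(16777217.0))  # 16777216.0
print(round_to_float32(0.1) == 0.1)  # False: float32 keeps fewer digits

# Largest finite and smallest positive normal float32 values.
FLOAT32_MAX = (2 - 2**-23) * 2.0**127
FLOAT32_MIN_NORMAL = 2.0**-126
print(FLOAT32_MAX, FLOAT32_MIN_NORMAL)

# The corresponding limits for the 64-bit double that Python uses natively.
print(sys.float_info.max, sys.float_info.min)
```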

IEEE 754 special values (32-bit single precision)

| Sign | Exponent (8 bits) | Mantissa (23 bits) | Value |
| --- | --- | --- | --- |
| 0 | 00000000 | 00000000000000000000000 | +Zero |
| 1 | 00000000 | 00000000000000000000000 | -Zero |
| 0 | 00000000 | any nonzero | Positive subnormal |
| 1 | 00000000 | any nonzero | Negative subnormal |
| 0 | 00000001 ... 11111110 | any | Positive normal number |
| 1 | 00000001 ... 11111110 | any | Negative normal number |
| 0 | 11111111 | 00000000000000000000000 | +Infinity |
| 1 | 11111111 | 00000000000000000000000 | -Infinity |
| x | 11111111 | any nonzero | NaN (Not a Number) |

Special bit patterns reserved by the standard. The same pattern rules apply for 64-bit with 11-bit exponent and 52-bit mantissa.
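
The rules in the table translate directly into a small classifier. The following is a sketch under the assumption that the input is a raw 32-bit pattern given as an integer; the function name `classify_float32` and the label strings are made up for this example.

```python
def classify_float32(bits: int) -> str:
    """Classify a raw 32-bit IEEE 754 pattern using the table's rules."""
    sign = "-" if (bits >> 31) & 1 else "+"
    exponent = (bits >> 23) & 0xFF
    mantissa = bits & 0x7FFFFF

    if exponent == 0x00:
        # All-zero exponent: zero if the mantissa is zero, else subnormal.
        return sign + ("Zero" if mantissa == 0 else "Subnormal")
    if exponent == 0xFF:
        # All-one exponent: infinity if the mantissa is zero, else NaN.
        if mantissa == 0:
            return sign + "Infinity"
        return "NaN"
    return sign + "Normal"

for pattern in (0x00000000, 0x80000000, 0x00000001, 0x3DCCCCCD, 0x7F800000, 0x7FC00000):
    print(f"{pattern:08X}", classify_float32(pattern))
```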

Frequently asked questions

Why does 0.1 + 0.2 not equal 0.3 in most programming languages?

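Because 0.1 and 0.2 cannot be represented exactly in binary. Each is stored as the nearest binary fraction, and when you add two slightly-off values, the result is also slightly off: 0.30000000000000004 in 64-bit. The difference from the stored value of 0.3 is small (about 5.6e-17) but nonzero. To check equality of floats, compare with a small tolerance, |a - b| < epsilon, rather than a == b, and scale epsilon to the magnitude of the values being compared, because a fixed epsilon is too strict for large numbers and too loose for tiny ones.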
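
In Python, one way to apply a tolerance is the standard library function `math.isclose`, which uses a relative tolerance by default and accepts an optional absolute tolerance. This is a sketch of the idea, and the tolerance values shown are illustrative choices rather than recommendations.

```python
import math

a = 0.1 + 0.2
b = 0.3

exact_match = (a == b)
close_match = math.isclose(a, b)  # default rel_tol=1e-09
print(exact_match)  # False
print(close_match)  # True

# A purely relative tolerance never matches a nonzero value against zero,
# so comparisons near zero need an absolute tolerance as well.
near_zero_relative = math.isclose(1e-12, 0.0)
near_zero_absolute = math.isclose(1e-12, 0.0, abs_tol=1e-9)
print(near_zero_relative)  # False
print(near_zero_absolute)  # True
```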

What is the difference between 32-bit and 64-bit floating point?

A 32-bit single-precision float uses 1 sign bit, 8 exponent bits (bias 127), and 23 mantissa bits, giving about 7 significant decimal digits. A 64-bit double-precision float uses 1 sign bit, 11 exponent bits (bias 1023), and 52 mantissa bits, giving about 15-17 significant decimal digits. Double precision is the default in most programming languages; single precision is preferred in GPU and ML workloads where memory and throughput matter.

What are subnormal (denormalized) numbers?

Subnormal numbers are very small values between zero and the smallest normal float. In 32-bit format, normal numbers have exponent bits from 00000001 to 11111110. When the exponent is all zeros and the mantissa is nonzero, the number is subnormal: the implicit leading bit is 0 instead of 1, and the effective exponent is fixed at -126, so the value is 0.mantissa x 2^-126. Subnormals fill the gap near zero gracefully, but operations on them can be much slower on hardware that handles them in microcode rather than dedicated circuits.
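
The subnormal rule can be checked numerically. The sketch below computes positive 32-bit subnormal values from the formula and compares them with what `struct` decodes from the raw bits; both helper names are assumptions introduced for this example.

```python
import struct

def subnormal_float32_value(mantissa: int) -> float:
    """Value of a positive float32 subnormal: (mantissa / 2^23) x 2^-126."""
    return (mantissa / 2**23) * 2.0**-126

def bits_to_float32(bits: int) -> float:
    """Reinterpret a raw 32-bit pattern as a float32."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]

smallest_subnormal = subnormal_float32_value(1)        # equals 2^-149
largest_subnormal = subnormal_float32_value(0x7FFFFF)  # just below 2^-126
smallest_normal = 2.0**-126                            # bit pattern 0x00800000

print(smallest_subnormal, largest_subnormal, smallest_normal)
print(bits_to_float32(0x00000001) == smallest_subnormal)  # True
```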

What does NaN mean and how does it arise?

NaN stands for Not a Number. IEEE 754 reserves it for the result of undefined operations: 0 divided by 0, infinity minus infinity, the square root of a negative real number, and similar undefined forms. A computation that produces NaN will propagate it through subsequent arithmetic, making it easy to detect that something went wrong. In the bit pattern, NaN has all exponent bits set to 1 and a nonzero mantissa.
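
A short Python sketch of NaN behavior follows. Note one language-specific caveat: Python raises `ZeroDivisionError` for `0.0 / 0.0` and `ValueError` for `math.sqrt(-1)` instead of returning NaN, so the example produces NaN from infinity minus infinity.

```python
import math
import struct

inf = float("inf")
nan_result = inf - inf               # an undefined form, so the result is NaN

propagated = nan_result * 0 + 1      # NaN survives later arithmetic
print(math.isnan(nan_result))        # True
print(math.isnan(propagated))        # True
print(nan_result == nan_result)      # False: NaN compares unequal to itself

# Inspect the 32-bit pattern: exponent all ones, mantissa nonzero.
(bits,) = struct.unpack(">I", struct.pack(">f", nan_result))
nan_exponent = (bits >> 23) & 0xFF
nan_mantissa = bits & 0x7FFFFF
print(format(nan_exponent, "08b"), nan_mantissa != 0)
```

Because NaN never equals itself, test for it with `math.isnan` rather than `==`.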

Can I convert from hex back to decimal using this calculator?

This calculator converts decimal to binary/hex only. To go the other direction, use a dedicated hex-to-float tool or decode the value by hand. For a 32-bit value, interpret the 8 hex digits as 32 bits: extract the sign (bit 31), exponent (bits 30-23), and mantissa (bits 22-0), then apply the IEEE 754 formula for normal numbers: (-1)^sign x 2^(exp - 127) x (1 + mantissa / 2^23). An exponent field of all zeros or all ones marks a zero, subnormal, infinity, or NaN, which the formula does not cover.
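
The manual decoding steps can be sketched in Python as follows. The function name `hex_to_float32` is hypothetical, and the sketch deliberately handles normal numbers only; the result is cross-checked against `struct.unpack`.

```python
import struct

def hex_to_float32(hex_digits: str) -> float:
    """Decode 8 hex digits as a normal 32-bit IEEE 754 number."""
    bits = int(hex_digits, 16)
    sign = bits >> 31                      # bit 31
    biased_exponent = (bits >> 23) & 0xFF  # bits 30-23
    fraction = (bits & 0x7FFFFF) / 2**23   # bits 22-0 as a value in [0, 1)
    if biased_exponent in (0x00, 0xFF):
        raise ValueError("zero, subnormal, infinity, and NaN need separate handling")
    return (-1) ** sign * 2.0 ** (biased_exponent - 127) * (1 + fraction)

decoded = hex_to_float32("3DCCCCCD")
reference = struct.unpack(">f", bytes.fromhex("3DCCCCCD"))[0]
print(decoded)
print(decoded == reference)  # True
```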

What is the precision error shown in the results?

The precision error is the difference between the exact decimal you typed and the closest value the binary format can store. For normal 32-bit floats, the relative error is at most about 5.96e-8 (2^-24). For normal 64-bit doubles, it is at most about 1.11e-16 (2^-53). A precision error of 0 (exact) means the number you entered happens to be exactly representable as a binary fraction (e.g. 0.5, 1.25, -8.0).
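
One way to compute such an error exactly is with rational arithmetic. The sketch below is not the calculator's own method; the function name `precision_error` is an assumption, and the 32-bit path first parses the text as a double, which in rare cases may round differently than a direct decimal-to-float32 conversion.

```python
import struct
from fractions import Fraction

def precision_error(decimal_text: str, fmt: str) -> Fraction:
    """Exact absolute difference between a decimal string and its stored value.

    fmt is "f" for 32-bit or "d" for 64-bit (struct format characters).
    """
    exact = Fraction(decimal_text)  # the number as typed, with no rounding
    packed = struct.pack(">" + fmt, float(decimal_text))
    stored = struct.unpack(">" + fmt, packed)[0]
    return abs(Fraction(stored) - exact)  # Fraction(float) is exact

error32 = precision_error("0.1", "f")
error64 = precision_error("0.1", "d")
print(float(error32))  # about 1.49e-09
print(float(error64))  # about 5.55e-18
print(precision_error("0.5", "f") == 0)  # True: 0.5 is exactly representable
```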

Written by Grace Mbeki, MSc Data Scientist & Educator · Nairobi, Kenya
