The Number That Fits Too Well
Every number is a useful lie
Your computer cannot store π. It can't even store 0.1 exactly. What it stores is a carefully constructed approximation — close enough you'd never notice in everyday code, but technically wrong.
That approximation follows IEEE 754, standardised in 1985, and it governs every float and double you've used. A 32-bit float splits its 32 bits into three fixed regions:
float32 of 1.0 — fields are always the same width, for every number:

0 | 01111111 | 00000000000000000000000
sign | exponent (127) | mantissa
The sign bit is +/−. The exponent (8 bits, biased by 127) sets the scale — the power of 2. The mantissa packs in significant digits. Full value:
(−1)^s × (1 + m/2^23) × 2^(exp − 127)
The important detail: no matter what you're storing — 0.0001 or 10 billion — you always get exactly 23 mantissa bits. Always. The exponent shifts the scale, but the precision budget never moves.
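The fixed split is easy to verify for yourself. Here is a minimal Python sketch (standard library only; `float32_fields` is just an illustrative name) that pulls the three fields out of any number's float32 encoding:

```python
import struct

def float32_fields(x: float):
    """Split a number's float32 encoding into its fixed-width IEEE 754 fields."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]  # raw 32-bit pattern
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF   # 8 bits, biased by 127
    mantissa = bits & 0x7FFFFF       # 23 bits; the hidden leading 1 is not stored
    return sign, exponent, mantissa

print(float32_fields(1.0))    # (0, 127, 0) — true exponent is 127 − 127 = 0
print(float32_fields(1e10))   # only the exponent changes; mantissa is still 23 bits
```

Whatever you feed it, the mantissa field is always exactly 23 bits wide.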
The precision isn't where you need it
Most real computation clusters near 1.0. Neural network weights, physics simulations, colour values — ordinary numbers. For those, 8 exponent bits is vast overkill. You're covering 10^−38 to 10^38 but spending a quarter of your bits on a range you'll rarely touch.
Imagine 32 coins to spend on "represent this number precisely." IEEE spends 1 on sign, 8 on scale, 23 on precision. That ratio is fixed. Forever. What if you could spend fewer coins on scale when the scale doesn't need it — and give those coins to precision instead?
That's the core idea of posits.
Enter: posits
Posits were proposed by John Gustafson in 2017 (the same Gustafson behind Gustafson's Law). The key change: instead of a fixed 8-bit exponent, posits replace it with a regime — a self-terminating run of identical bits. The run length encodes a coarse scale; shorter runs leave more room for precision.
posit32 of 1.0 — regime is only 2 bits here, leaving 27 for fraction:

0 | 10 | 00 | 000000000000000000000000000
sign | regime | exponent | fraction
A regime of 10 means k = 0.
A regime of 1110 means k = 2.
A regime of 001 means k = −2.
The scale contribution is the "useed" raised to the power k — for es = 2,
useed = 2^(2^es) = 2^(2^2) = 16.
sign × 16^k × 2^exp × (1 + fraction)
For 1.0, k = 0, regime takes just 2 bits. That leaves 27 bits for fraction vs. IEEE's fixed 23 — about 1.2 extra decimal digits of precision. Push out to 16^7 ≈ 2.7 × 10^8 and the regime grows to 9 bits, eating into fraction. The precision budget automatically redistributes based on the number's magnitude.
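A regime decoder makes the rule concrete. This is a hypothetical helper, not SoftPosit's API; it just scans the run of identical bits after the sign and converts the run length into k:

```python
def posit32_regime(bits: int):
    """Decode the regime of a 32-bit posit pattern (positive values only).

    Returns (k, regime_width), where regime_width includes the terminating bit.
    A run of m ones gives k = m − 1; a run of m zeros gives k = −m.
    """
    body = [(bits >> i) & 1 for i in range(30, -1, -1)]  # bits after the sign
    run = 1
    while run < len(body) and body[run] == body[0]:
        run += 1
    k = run - 1 if body[0] == 1 else -run
    return k, run + 1  # +1 for the terminating opposite bit

print(posit32_regime(0x40000000))  # 1.0 is 0 10 00 0… → (0, 2)
print(posit32_regime(0x70000000))  # regime 1110 → (2, 4)
print(posit32_regime(0x10000000))  # regime 001 → (-2, 3)
```

The three prints reproduce the three example regimes above.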
Part I — IEEE 754: encoding 0.1 by hand
Step 1: Why 0.1 repeats in binary
To convert a decimal fraction to binary, you repeatedly multiply by 2 and take the integer part. Here is the table for 0.1:
| Step | Value | × 2 | Bit |
|---|---|---|---|
| 1 | 0.1 | 0.2 | 0 |
| 2 | 0.2 | 0.4 | 0 |
| 3 | 0.4 | 0.8 | 0 |
| 4 | 0.8 | 1.6 | 1 |
| 5 | 0.6 | 1.2 | 1 |
| 6 | 0.2 | 0.4 | 0 |
| 7 | 0.4 | 0.8 | 0 |
| 8 | 0.8 | 1.6 | 1 |
| … | repeats from step 2 | … | … |
0.1 in binary is 0.0001100110011… — a repeating pattern of 0011 that never terminates. Just as 1/3 cannot be written exactly in decimal, 1/10 cannot be written exactly in binary. This is the fundamental reason 0.1 can never be stored perfectly in any binary floating-point format.
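The table's repeated-doubling procedure is a few lines of code. Using exact rationals (`Fraction`) keeps the demonstration free of the very rounding error we're studying; `fraction_bits` is an illustrative name:

```python
from fractions import Fraction

def fraction_bits(x: Fraction, n: int) -> str:
    """First n binary digits of a fraction in [0, 1), by repeated doubling."""
    out = []
    for _ in range(n):
        x *= 2
        bit = int(x)     # the integer part is the next binary digit
        out.append(str(bit))
        x -= bit
    return "".join(out)

print(fraction_bits(Fraction(1, 10), 12))  # 000110011001 — the 0011 pattern repeats
```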
Step 2: Normalisation
IEEE 754 stores numbers in "scientific notation" form. We shift the binary point until we have exactly one leading 1:
0.000110011001100110011… × 2^0 = 1.10011001100110011… × 2^−4
The leading 1 is implicit (not stored) — IEEE calls it the "hidden bit." So the mantissa only needs to store the part after the binary point.
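Python exposes this normalisation directly via `math.frexp`, which returns the significand scaled to [0.5, 1); one doubling shifts it into IEEE's [1, 2) form:

```python
import math

m, e = math.frexp(0.1)    # 0.1 == m * 2**e, with 0.5 <= m < 1
print(m * 2, e - 1)       # 1.6 -4 — the 1.xxx significand and the true exponent
```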
Step 3: Exponent bias
The true exponent is −4. Float32 uses a bias of 127, so the stored exponent is:
−4 + 127 = 123 → 01111011 in binary
Step 4: The 23-bit mantissa and rounding
The repeating fraction 10011001100110011… must be rounded to fit in 23 bits. Writing out the first 24+ bits, with a cut after bit 23:

1001 1001 1001 1001 1001 100 | 1 1001…

The bit just past the cut (the guard bit) is 1, and more 1-bits follow it, so IEEE "round to nearest even" rounds up. The stored 23-bit mantissa becomes:
10011001100110011001101
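The guard-bit decision can be reproduced exactly with rationals. A sketch of round-to-nearest (the remainder here is 0.8 of an ulp, not an exact tie, so the "round to even" tie-break never fires):

```python
from fractions import Fraction

sig = Fraction(1, 10) * 2**4          # normalised significand: exactly 1.6
scaled = sig * 2**23                  # put 23 fraction bits left of the point
lower = int(scaled)                   # truncated: 13421772, remainder 0.8
# More than half an ulp remains past bit 23, so round up
mantissa = lower + (1 if scaled - lower >= Fraction(1, 2) else 0)
print(f"{mantissa - 2**23:023b}")     # 10011001100110011001101
```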
Step 5: Assemble the bits
IEEE 754 float32 encoding of 0.1:

0 | 01111011 | 10011001100110011001101
sign | exponent (123) | mantissa
Decode it back: (1 + 0x4CCCCD / 2^23) × 2^−4 =

stored = 0.100000001490116119384765625
error ≈ 1.49 × 10^−9
That trailing ...001490... is the lie.
Close enough for everyday work. Technically wrong.
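Packing the three fields and letting the machine decode them back confirms the stored value. A quick standard-library check:

```python
import struct

sign, biased_exp = 0, 123                 # −4 + 127
mantissa = 0b10011001100110011001101      # the rounded 23 bits
word = (sign << 31) | (biased_exp << 23) | mantissa
print(hex(word))                          # 0x3dcccccd — the float32 pattern for 0.1
stored = struct.unpack(">f", struct.pack(">I", word))[0]
print(stored)                             # 0.10000000149011612
```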
Part II — Posit32: encoding the same 0.1
Step 1: Where does 0.1 sit on the useed scale?
Posit32 with es = 2 has useed = 2^(2^2) = 16. We need to find k such that useed^k ≤ 0.1 < useed^(k+1):

16^−1 = 0.0625 ← below 0.1
16^0 = 1.0 ← above 0.1
0.0625 ≤ 0.1 < 1.0 → k = −1
Step 2: Regime encoding
For k = −1, the regime is a run of |k| zeros followed by a terminating 1.
That's just 01 — only 2 bits.
Step 3: Strip to xs and extract the exponent
Divide out the regime's contribution to find what's left:
xs = 0.1 / 16^−1 = 0.1 × 16 = 1.6
Now express 1.6 as 2^e × (1 + f). Since 1.0 ≤ 1.6 < 2.0, e = 0. The 2-bit exponent field is 00.
Step 4: Count the available fraction bits
32 total − 1 sign − 2 regime − 2 exponent = 27 fraction bits. IEEE had 23. That's 4 extra bits of precision — roughly one extra decimal digit.
Step 5: Extract the fraction
The fractional significand is 1.6, with hidden bit 1. So we need to encode 0.6 into 27 bits using the same repeated-doubling trick:
0.6 in binary = 0.100110011001100110011001100…
27 fraction bits: 100110011001100110011001100
Step 6: Assemble
Posit32 (es=2) encoding of 0.1:

0 | 01 | 00 | 100110011001100110011001100
sign | regime (k = −1) | exponent | fraction

stored ≈ 0.09999999962747097
error ≈ 3.7 × 10^−10
That's about 4× more accurate than IEEE float32, from the same 32 bits. The extra 4 fraction bits bought us roughly a 4× reduction in error.
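The same assemble-and-decode check works for the posit. This is a sketch of the derivation above, truncating the fraction exactly as the worked example does (it is not SoftPosit's rounding code):

```python
from fractions import Fraction

sign, regime, exponent = "0", "01", "00"   # k = −1, e = 0
frac_int = int(Fraction(3, 5) * 2**27)     # 0.6 truncated to 27 bits
word = sign + regime + exponent + f"{frac_int:027b}"
assert len(word) == 32                     # 1 + 2 + 2 + 27 bits

# Decode: 16^k × 2^e × (1 + f)
value = 16**-1 * (1 + frac_int / 2**27)
print(value)          # 0.09999999962747097
```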
Part III — three numbers, three tradeoffs
Where does the posit advantage live, and where does it disappear? Let's compare three numbers spanning different magnitudes:
1.3 — the sweet spot
For 1.3, k = 0 and the regime is just 10 (2 bits). That leaves 27 fraction bits. IEEE still has its fixed 23.
IEEE: 23 fraction bits → error ≈ 5.96 × 10^−8
Posit: 27 fraction bits → error ≈ 3.73 × 10^−9
Posit wins by ~4 bits of precision
100,000 — the crossover
100,000 ≈ 16^4 × 2^0 × something, so k = 4. Regime becomes 111110 — 6 bits. After sign (1) + regime (6) + exponent (2), only 23 fraction bits remain. Same as IEEE.
IEEE: 23 fraction bits
Posit: 23 fraction bits
Roughly tied
16^7 ≈ 268 million — the extreme
At k = 7, the regime is 111111110 — 9 bits. After sign + regime + exponent, only 20 fraction bits are left. IEEE still has its full 23.
IEEE: 23 fraction bits
Posit: 20 fraction bits
Posit loses by ~3 bits — IEEE is more precise here
This is the posit tradeoff made visible. Near 1.0, regime is tiny and precision overflows. At extreme magnitudes, regime claims progressively more bits and precision tapers. The system is designed around the assumption that most real computation lives in the middle — and for ML weights, color values, physics constants, and financial calculations, it does.
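The taper is easy to chart. This hypothetical helper (assuming a positive, in-range input and es = 2) computes k from the magnitude and subtracts the regime's cost:

```python
import math

def posit32_fraction_bits(x: float, es: int = 2) -> int:
    """Fraction bits left in a posit32 for a positive, in-range x."""
    k = math.floor(math.log2(x) / 2**es)          # useed^k <= x < useed^(k+1)
    regime_width = k + 2 if k >= 0 else -k + 1    # run of bits + terminator
    return max(0, 32 - 1 - regime_width - es)

for x in (1.3, 100_000.0, 268_435_456.0):
    print(x, posit32_fraction_bits(x))   # 27, 23, 20 — the three cases above
```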
Try it: see the bits change live
Type any positive number to watch both formats encode it. Notice the posit regime growing as you push toward large or small values — and the fraction count shrinking to compensate. The sweet spot for posits is near 1.0.
Why they haven't taken over yet
Near 1.0, posits give roughly 4–5 extra fraction bits versus float32. For ML inference — where most weights have modest magnitudes, well inside the posit sweet spot — that matters meaningfully. Some benchmarks show posit32 matching float64 accuracy for dot products and accumulation-heavy ops.
Posits also clean up the edge cases IEEE accumulated over decades: one zero, one NaR (Not a Real) instead of two infinities and hundreds of distinct NaN bit patterns. No negative zero. The number line is projective — it wraps from +∞ to −∞ through a single special value, symmetrically.
The practical problem: every processor, GPU, and CUDA core is built for IEEE 754. FPUs, SIMD units, math libraries, compilers, hardware intrinsics — all of it assumes float32 or float64. Posit hardware exists in research FPGAs and a handful of startup chips, but there's no mainstream CPU support. Swapping a number format used in billions of devices is not a software update.
Posits aren't uniformly better either. At extreme magnitudes — 10^−38 or 10^38 — the long regime eats into fraction bits, and IEEE is actually more precise there. The tapered precision is a feature for typical computation and a deliberate tradeoff at the edges. If you need full dynamic range, IEEE's fixed fields work in your favour.
Whether posits eventually displace IEEE 754 is an open question that will likely be answered by whoever puts them in a mainstream ML accelerator first. But as a piece of numerical design — a variable-length field that automatically trades range for precision based on magnitude — the idea is elegantly constructed. The bits know what they're worth.
The interactive widget on this page runs the SoftPosit C library by Cerlane Leong, compiled to WebAssembly. SoftPosit is a complete software implementation of posit arithmetic — source and C++ API on GitLab.