Fast Inverse Square Root

Magic Number - 0x5f3759df

Dec 8, 2023

The article introduces an algorithm called Fast Inverse Square Root, which is used to quickly calculate the inverse square root of a floating point number.

Prologue

Fast Inverse Square Root is a rather famous algorithm, and the specific author is no longer traceable. Its prominence comes from its extensive use in the game Quake III, where it became well-known.

Why do we need the inverse square root instead of the square root itself? In computer graphics, many calculations involve the inverse square root rather than the square root. For example, the normalize() operation for a vector $\frac{1}{\sqrt{x^2 + y^2 + z^2}}(x,y,z)$ includes the inverse square root. In computer graphics, each frame's rendering requires at least tens of thousands of calls to normalize. If we can accelerate the calculation of the inverse square root, the rendering speed will significantly improve.

The algorithm originated more than 20 years ago. Today, the algorithm has research value but not practical use since modern CPUs have faster and more accurate instructions. We can simply use 1/sqrt(y).

However, this doesn't hinder us from understanding the elegance of this algorithm. In a brief overview, it involves:

IEEE 754 Standard
Undefined behavior in the C language
Newton's iteration method.

The focus here is on the second step—how to derive the magic number 0x5f3759df. For more detailed information, you can refer to the source.

Algorithm

float Q_rsqrt(float number)
{
    long i;
    float x2, y;
    const float threehalfs = 1.5F;
    x2 = number * 0.5F;
    y = number;

    /* Core Code */
    // 1. reinterpret_cast from float to int
    i = *(long *)&y; // evil floating-point bit-level hacking
    // 2. Estimate the inverse square root
    i = 0x5f3759df - (i >> 1); // what the heck? (....why not 0x69696969?)
    // reinterpret_cast from int to float
    y = *(float *)&i;
    // 3. Newton's method
    y = y * (threehalfs - (x2 * y * y)); // 1st iteration
    // y = y * (threehalfs - (x2 * y * y)); // 2nd iteration, this can be removed

    return y;
}

The algorithm roughly consists of three parts:

reinterpret_cast used for conversion between float and int because any compiler would otherwise prohibit bitwise operations on float types.
Estimation using the magic number 0x5f3759df for the inverse square root (explained later).
One iteration of Newton's method (explained later).

Estimating the Inverse Square Root

In the IEEE 754 standard, a 32-bit floating-point number contains 1 sign bit, 8 exponent bits $E$ , and 23 mantissa bits $M$ . Any floating-point number can be represented as: $(1.M) \times 2^{E - 127} = (1 + \frac{M}{2^{23}}) \times 2^{E - 127}, M \in[0, 2^{23} ), E \in [0,256)$ .

To calculate $y$ 's inverse square root $\frac{1}{\sqrt{y}} = y^{-\frac{1}{2}}$ , the IEEE 754 form of $y$ is $y = (1 + \frac{M_y}{2^{23}}) \times 2^{E_y - 127}$ .

For simplicity, let's take the logarithm of $y$ to the base 2:

\begin{align} \log y &= \log{((1 + M_y) \times 2^{E_y - 127})} \\ &= \log{(1+\frac{M_y}{2^{23}})} + E_y - 127 \end{align}

It's clear that $\frac{M_y}{2^{23}}$ is in the interval $[0,1]$ . In this interval, $\log(1+x) \approx x$ . If we add a constant $\mu$ , the approximation improves:

Using this approximation, we can further simplify:

\begin{align} \log y &= \log{(1+\frac{M_y}{2^{23}})} + E_y - 127 \\ &\approx \frac{M_y}{2^{23}} + \mu + E_y - 127 \\ &= \frac{1}{2^{23}}(M_y + E_y \times 2^{23}) + \mu - 127 \end{align}

Now, let's take the logarithm of $y^{-\frac{1}{2}}$ :

\log y^{-\frac{1}{2}} = -\frac{1}{2} \log y \approx -\frac{1}{2}\left( \frac{1}{2^{23}}(M_y + E_y \times 2^{23}) + \mu - 127 \right)

Assuming the accurate value of $y^{-\frac{1}{2}}$ is $A$ , with exponent and mantissa $E_A$ and $M_A$ respectively, we can take the logarithm of $A$ :

\log A \approx \frac{1}{2^{23}}(M_A + E_A \times 2^{23}) + \mu - 127

By equating the two logarithmic expressions, we get:

\begin{align} M_A + E_A \times 2^{23} &= 2^{23} \times \frac{3}{2}(127 - \mu) - \frac{1}{2}(M_y + E_y \times 2^{23}) \end{align}

When $\mu$ is set to $0.0450461875791687011756$ , the first term becomes 1597463011 in hexadecimal (0x5F3759e3). If we replace the second term's $\frac{1}{2}$ with a right shift operation, the result is remarkably close to the line in the code:

i = 0x5f3759df - (i >> 1);

The magic number is slightly different, but with an appropriate $\mu$ value, it should yield exactly the same number.

Newton's Iteration Method

The update formula for Newton's iteration method is:

x_{t+1} = x_t - \frac{f(x_t)}{f^\prime(x_t)}

The precondition for Newton's iteration method is that the initial point should be near the function's zero. Since the estimation from i = 0x5f3759df - (i >> 1); should have a small error, we can assume the precondition is satisfied.

Now, our problem is, given $y$ , we need to find $x$ such that $\frac{1}{\sqrt{y}} = x$ , i.e., $\frac{1}{x^2} = y$ . Rearranging, we get $f(x) = \frac{1}{x^2} - y = 0$ . Applying Newton's iteration to $f(x)$ gives the iteration formula:

x_{t+1} = x_t - \frac{y}{2}x_t^3

This is translated into code as:

y = y * (threehalfs - (x2 * y * y));

References

[1] Is fast inverse square root still used?

[2] Fast Inverse Square Root — A Quake III Algorithm

[3] Relevant Thesis

NOTE & BLOG

Prologue

Algorithm

Estimating the Inverse Square Root

Newton's Iteration Method

References