Maybe we should have a stable sum of floats

I mean we can add an sum function in std that minimizes the loss of precision. Just like math.fsum in python. In fact, there are already some implementations in library/test/ 141). If you do some more improvements, you can directly add it to f64/f32. Below is my implement, it may not be perfect, which adds INF, NAN and overflow handling based on, and made some optimizations at the end with reference to the implementation in Cpython(line 1424). I want to know if it has the value and feasibility of adding to std.

fn main() {
    let v = vec![1e15, 0.05, 0.05];
    assert_eq!(f64::stable_sum(v), 1000000000000000.1);

    let v = vec![1e-16, 1., 1e16];
    assert_eq!(f64::stable_sum(v), 10000000000000002.0);

    // infinity
    let v = vec![0., f64::INFINITY];
    assert_eq!(f64::stable_sum(v), f64::INFINITY);

    // NaN
    let v = vec![f64::NEG_INFINITY, f64::INFINITY];

    // overflow
    let v = vec![f64::MAX, 1e292];
    assert_eq!(f64::stable_sum(v), f64::INFINITY);

trait StableSum: Sized {
    fn stable_sum<I: IntoIterator<Item = f64>>(into_iter: I) -> f64;
impl StableSum for f64 {
    fn stable_sum<I: IntoIterator<Item = f64>>(into_iter: I) -> f64 {
        let mut iter = into_iter.into_iter();
        let mut partials = vec![];
        let mut special_sum = 0.;
        let mut inf_sum = 0.;
        while let Some(mut x) = {
            let xsave = x;
            let mut j = 0;
            // This inner loop applies `hi`/`lo` summation to each
            // partial so that the list of partial sums remains exact.
            for i in 0..partials.len() {
                let mut y: f64 = partials[i];
                if x.abs() < y.abs() {
                    let t = x;
                    x = y;
                    y = t;
                // Rounded `x+y` is stored in `hi` with round-off stored in
                // `lo`. Together `hi+lo` are exactly equal to `x+y`.
                let hi = x + y;
                let lo = y - (hi - x);
                if lo != 0.0 {
                    partials[j] = lo;
                    j += 1;
                x = hi;
            if (!x.is_finite()) {
                /* a nonfinite x could arise either as
                a result of intermediate overflow, or
                as a result of a nan or inf in the
                summands */
                if (xsave.is_finite()) {
                    //intermediate overflow
                    return x;
                if (xsave.is_infinite()) {
                    inf_sum += xsave;
                special_sum += xsave;
                // reset partials
            if j >= partials.len() {
            } else {
                partials[j] = x;
                partials.truncate(j + 1);
        if (special_sum != 0.0) {
            if (inf_sum.is_nan()) {
                return f64::NAN;
            } else {
                return special_sum;
        let mut len = partials.len();
        let mut hi = partials[len - 1];
        let mut lo = 0.;
        len -= 1;
        while len > 0 {
            let x = hi;
            let y = partials[len - 1];
            hi = x + y;
            lo = y - (hi - x);
            if lo != 0. {
        if len > 1 && ((lo < 0. && partials[len - 2] < 0.) || (lo > 0. && partials[len - 2] > 0.)) {
            let y = lo + lo;
            let x = hi + y;
            if y == x - hi {
                hi = x;

As I understand it, there isn't just one perfect method of doing a stable sum, but various different choices and trade-offs. This is an ideal situation for external crates to offer various options - in fact, I quickly found accurate, which offers a variety of options for floating point summation bundled into one crate. The Rust standard library prefers not to commit to a specific approach for things like this.


Thanks for informing. I only study rust-num and there is no corresponding method in it.(In fact it comes from an issue in rust-num). I think it might be necessary to make it a universal method. Don't mind if it's not right

As yet another option, you can do Pairwise summation using itertools with .tree_fold1(Add::add).unwrap_or(0.0).

All this said, and without really disagreeing with the other posts in the thread, Rust does have impl Sum<f64> for f64, so it would make sense to me for that to be implemented in a way that has sub-linear rounding error accumulation. I think it'd fit well with things like HashMap defaulting to a HashDoD-resistent hasher -- one could always do .fold(0.0, Add::add) instead of .sum() if they wanted the naive summation instead, or of course use a specialized non-std method if they wanted something in particular. I like it when the obvious thing isn't a "gotcha".


Thank you very much, this is something I haven't thought of before. Low rounding error addition through tree iterator sounds so cool. :smirk_cat: I think I will try to achieve it.

The sum proposed in the first post uses an error free transformation, from memory it should be significantly more accurate than a paiwise summation (note that you can preserve the accuracy while getting rid of the test using TwoSum instead of FastTwoSum which, counter-intuitively, is not the fastest error free transformation for the sum on modern processors).
However, if one were to add a numerically more stable sum to the languague, I would advocate for an exact sum (accurate implements one such algorithm). It garantees that the result is within one ULP of the analytical result for all inputs which is optimal (the only possible improvements are performances improvements introduced by improved algorithms).

I would love to see languages that are often used for numerical applications (such as Julia) including an Accumulator type out of the box. Encapsulating a compensated or exact sum (with options to trade precision for performances) and using it in the standard documentation to make it the idiomatic choice when accumulating floating point numbers. (I believe an accumulator type is much more ergonomic and reusable in arbitrary algorithms than a sum method on iterators).

However, I doubt this is appropriate for Rust. Such algorithms requires expert knowledge to implement and debug (which is a good reason to put them in a crate and put the weight of maintaining them on the crate author) while being rarely used in practice in Rust (which does not focus on numerical applications at the moment): making it the default in Rust is most likely overkill.


I'll note that that one allocates internally (at least it has Vecs and Boxes in the part I looked at), which means it could not be used for impl Sum for f64.

I totally agree that something doing explicit numerics will want something smarter than whatever the default .sum() does. I don't even think that .sum() must be 1-ULP accurate; I just want it to have o(nε) error.

Do you have one that doesn't use the heap? Otherwise, I don't think it's possible, because iterator sum() is part of libcore, and that doesn't have access to dynamic memory allocation.

Kahan summation, or one of a couple variants, just uses an extra float. I suspect we can go down a rabbit hole for this but Kahan summation is at least way better than the status quo.

Like @toc, I was thinking of Kahan summation. It seems to be a fairly standard answer for "how do I add floats?" even if there might be other fancier ones.

(Pairwise summation can also be written without using heap. In fact the tree_fold1 in itertools works without allocating.)

Note: It doesn't (necessarily) use the heap but in that case it does recurse. I don't know if that's disallowed per se in core but it might be enough to disqualify it for somebody.

Yup, I know :upside_down_face:

1 Like

:thinking: Since it's logarithmically bound (as scottmcm says in the linked PR thread), you could technically make it use neither allocation nor recursion, by just using an ArrayVec<[Item; 64]> on the stack.

To see that the exact sum of floats can be computed with constant memory, consider that it could be done as follows:

  • Each finite float is an integer multiple of the smallest positive float.
  • Due to their finite range, f64 only requires integers of ~2100 bits, while f32 needs ~280 bits.
  • Compute the sum of these wide integers.
  • Return the rounded float corresponding to that integer sum.

This, in itself, is not to an efficient implementation, but I find it a handy visual for thinking about the problem. Naive float summation corresponds to only storing the mantissa-sized sequence of bits starting at the most significant set bit of the integer, while Kahan summation will also keep another such sequence starting at the next set bit that wasn't included in the first one.

edit: Actually the last statement isn't quite correct. The compensation term in a Kahan sum can have either sign relative to the main accumulator, while the above is only accurate when both are positive.

1 Like

Rust's Sum<f64> doesn't autovectorize fully either (for float accuracy reasons - compiler is not allowed to change the programmer's "desired" order of summation - we don't have a way to say we don't care about this). I think that means with a SIMD implementation of compensated sum it could be a win compared with current sum on both fronts (accuracy and speed).

1 Like

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.