4

AWS S3 uses CRC64/NVME for its default checksum algorithm.

I need to send a trailing CRC64/NVME checksum.

The kicker is that I am middleware. There are multiple threads that are sending bytes. I know the offset/length of these bytes but I never have a full picture of the entire file at once.

What I need is a CRC combine() function that allows me to calculate the CRC64/NVME for the combined data with only the CRC of the individual blocks and offsets/lengths.

So, for example these would need to be equal:

combine(crc64nvme("abc"), 3, crc64nvme("def"), 3) == crc64nvme("abcdef")

I know this is possible for many CRC variants but I haven't found a solution for CRC64/NVME. I've tried AI but after about 50 iterations it concluded it was not possible.

My question is two parts, is it even possible and is there a reference implementation for this?

4
  • Please tag the question with the programming language, and provide a minimal reproducible example. Commented Nov 13 at 21:03
  • I can't comment on Mark's anser but I just wanted to thank you for the working code. I tried Claude AI a little but didn't go as long as with ChatGPT. This is awesome! Thanks again. Commented Nov 14 at 15:40
  • 1
    In my opinion, you marked the wrong answer as correct. The code I managed to extract from Claude is pretty bad. The code in my other answer, generated by my crcany package, is far better. Commented Nov 14 at 17:07
  • I updated which version was marked correct. Thanks. Commented Nov 14 at 21:38

2 Answers 2

6

Yes, this is possible for any CRC.

Here it is for your CRC, in C (this can be easily translated to most any other programming language — just be careful with integer sizes, as well as shifting if converting to Java):

#include <stdint.h>

// Return the product of the polynomials a and b, modulo the CRC polynomial.
// a cannot be zero.
static uint64_t multmodp(uint64_t a, uint64_t b) {
    uint64_t prod = 0;
    for (;;) {
        if (a & 0x8000000000000000) {
            prod ^= b;
            if ((a & 0x7fffffffffffffff) == 0)
                break;
        }
        a <<= 1;
        b = b & 1 ? (b >> 1) ^ 0x9a6c9329ac4bc9b5 : b >> 1;
    }
    return prod;
}

// table_comb[n] is the polynomial x^2^n, modulo the CRC polynomial.
static uint64_t const table_comb[] = {
    0x4000000000000000, 0x2000000000000000, 0x0800000000000000, 0x0080000000000000,
    0x0000800000000000, 0x0000000080000000, 0x9a6c9329ac4bc9b5, 0x10f4bb0f129310d6,
    0x70f05dcea2ebd226, 0x311211205672822d, 0x2fc297db0f46c96e, 0xca4d536fabf7da84,
    0xfb4cdc3b379ee6ed, 0xea261148df25140a, 0x59ccb2c07aa6c9b4, 0x20b3674a839af27a,
    0x2d8e1986da94d583, 0x42cdf4c20337635d, 0x1d78724bf0f26839, 0xb96c84e0afb34bd5,
    0x5d2e1fcd2df0a3ea, 0xcd9506572332be42, 0x23bda2427f7d690f, 0x347a953232374f07,
    0x1c2a807ac2a8ceea, 0x9b92ad0e14fe1460, 0x2574114889f670b2, 0x4a84a6c45e3bf520,
    0x915bbac21cd1c7ff, 0xb0290ec579f291f5, 0xcf2548505c624e6e, 0xb154f27bf08a8207,
    0xce4e92344baf7d35, 0x51da8d7e057c5eb3, 0x9fb10823f5be15df, 0x73b825b3ff1f71cf,
    0x5db436c5406ebb74, 0xfa7ed8f3ec3f2bca, 0xc4d58efdc61b9ef6, 0xa7e39e61e855bd45,
    0x97ad46f9dd1bf2f1, 0x1a0abb01f853ee6b, 0x3f0827c3348f8215, 0x4eb68c4506134607,
    0x4a46f6de5df34e0a, 0x2d855d6a1c57a8dd, 0x8688da58e1115812, 0x5232f417fc7c7300,
    0xa4080fb2e767d8da, 0xd515a7e17693e562, 0x1181f7c862e94226, 0x9e23cd058204ca91,
    0x9b8992c57a0aed82, 0xb2c0afb84609b6ff, 0x2f7160553a5ea018, 0x3cd378b5c99f2722,
    0x814054ad61a3b058, 0xbf766189fce806d8, 0x85a5e898ac49f86f, 0x34830d11bc84f346,
    0x9644d95b173c8c1c, 0x150401ac9ac759b1, 0xebe1f7f46fb00eba, 0x8ee4ce0c2e2bd662
};

// Return the polynomial x^(8n) modulo the CRC polynomial.
static uint64_t x8nmodp(uintmax_t n) {
    uint64_t xp = 0x8000000000000000;
    int k = 3;
    for (;;) {
        if (n & 1)
            xp = multmodp(table_comb[k], xp);
        n >>= 1;
        if (n == 0)
            break;
        if (++k == 64)
            k = 0;
    }
    return xp;
}

// Return the CRC-64/NVME of the concatenation of two blocks of bytes, where the
// first block's CRC is crc1, the second block's CRC is crc2, and its length in
// bytes is len2.
uint64_t crc64nvme_comb(uint64_t crc1, uint64_t crc2, uintmax_t len2) {
    return multmodp(x8nmodp(len2), crc1) ^ crc2;
}

You do not need the length of the block for the first CRC, just the second, in order to compute the CRC of the first block concatenated with the second. So, crc64nvme_comb(crc64nvme("abc"), crc64nvme("def"), 3) == crc64nvme("abcdef").

That code was generated by crcany. It uses the parameters from Greg Cook's list of CRCs, which contains for CRC-64/NVME:

width=64 poly=0xad93d23594c93659 init=0xffffffffffffffff refin=true refout=true xorout=0xffffffffffffffff check=0xae8b14860a799888 residue=0xf310303b2b6f6e42 name="CRC-64/NVME"

This is already very fast, taking O(log(len2)) time. However if you're doing this repeatedly for the same value of len2, e.g. breaking the data into blocks that are all of the same length (except the last), then you can pre-compute x8nmodp(len2) and re-use that result for each block. Then each use would be O(1) time.

Sign up to request clarification or add additional context in comments.

Comments

2

You mentioned that you couldn't get AI to come up with an answer to your question. Just for kicks, I tried getting claude.ai using Sonnet 4.5 to generate CRC-64 combine code. It took many iterations, where all I did was compile and run code and give the results back to Claude. I didn't give Claude any hints. Finally it produced code that worked. The code below is its version 25 (!). Claude used a different approach than my code in the other answer, representing the transformations as matrices instead of polynomials, which I recognized as my own code of many years ago for CRC combination, with the exact same function names! (gf2_matrix_times and gf2_matrix_square) Here is the final, correct code from Claude (but see update below!):

#include <stdint.h>
#include <stdio.h>

// CRC-64/NVME polynomial (reversed)
#define CRC64_POLY 0xAD93D23594C93659ULL
#define CRC64_INIT 0xFFFFFFFFFFFFFFFFULL

// Generate CRC-64 table for byte-wise calculation
static void crc64_generate_table(uint64_t table[256]) {
    for (int i = 0; i < 256; i++) {
        uint64_t crc = i;
        for (int j = 0; j < 8; j++) {
            if (crc & 1)
                crc = (crc >> 1) ^ CRC64_POLY;
            else
                crc >>= 1;
        }
        table[i] = crc;
    }
}

// Calculate CRC-64/NVME for a block of data
uint64_t crc64_nvme(const uint8_t *data, size_t len) {
    static uint64_t table[256];
    static int table_init = 0;
    
    if (!table_init) {
        crc64_generate_table(table);
        table_init = 1;
    }
    
    uint64_t crc = CRC64_INIT;
    
    for (size_t i = 0; i < len; i++) {
        crc = table[(crc ^ data[i]) & 0xFF] ^ (crc >> 8);
    }
    
    return crc ^ CRC64_INIT;
}

// GF(2) matrix-vector multiplication
static uint64_t gf2_matrix_times(uint64_t *mat, uint64_t vec) {
    uint64_t sum = 0;
    for (int i = 0; i < 64; i++) {
        if (vec & (1ULL << i))
            sum ^= mat[i];
    }
    return sum;
}

// Square a GF(2) matrix
static void gf2_matrix_square(uint64_t *square, uint64_t *mat) {
    for (int i = 0; i < 64; i++)
        square[i] = gf2_matrix_times(mat, mat[i]);
}

/**
 * Combine two CRC-64/NVME values
 * 
 * @param crc1 CRC-64/NVME of the first block
 * @param crc2 CRC-64/NVME of the second block
 * @param len2 Length of the second block in bytes
 * @return CRC-64/NVME of the concatenated blocks
 */
uint64_t crc64_nvme_combine(uint64_t crc1, uint64_t crc2, size_t len2) {
    if (len2 == 0)
        return crc1;
    
    static uint64_t table[256];
    static int table_init = 0;
    
    if (!table_init) {
        crc64_generate_table(table);
        table_init = 1;
    }
    
    uint64_t even[64], odd[64];
    
    // Build matrix representing processing one zero byte (x^8 mod P(x))
    for (int i = 0; i < 64; i++) {
        uint64_t row = 1ULL << i;
        for (int j = 0; j < 8; j++) {
            if (row & 1)
                row = (row >> 1) ^ CRC64_POLY;
            else
                row >>= 1;
        }
        odd[i] = row;
    }
    
    // Extend crc1 by len2 zero bytes using repeated squaring
    uint64_t crc1_extended = crc1 ^ CRC64_INIT;  // Get internal state
    size_t n = len2;
    while (n) {
        if (n & 1) {
            crc1_extended = gf2_matrix_times(odd, crc1_extended);
        }
        n >>= 1;
        if (n) {
            gf2_matrix_square(even, odd);
            // Copy even back to odd for next iteration
            for (int i = 0; i < 64; i++)
                odd[i] = even[i];
        }
    }
    
    // Rebuild the same matrix for extending INIT
    for (int i = 0; i < 64; i++) {
        uint64_t row = 1ULL << i;
        for (int j = 0; j < 8; j++) {
            if (row & 1)
                row = (row >> 1) ^ CRC64_POLY;
            else
                row >>= 1;
        }
        odd[i] = row;
    }
    
    // Extend INIT by len2 zero bytes
    uint64_t init_extended = CRC64_INIT;
    n = len2;
    while (n) {
        if (n & 1) {
            init_extended = gf2_matrix_times(odd, init_extended);
        }
        n >>= 1;
        if (n) {
            gf2_matrix_square(even, odd);
            for (int i = 0; i < 64; i++)
                odd[i] = even[i];
        }
    }
    
    // Apply the combination formula
    uint64_t crc2_internal = crc2 ^ CRC64_INIT;
    uint64_t result_internal = crc1_extended ^ init_extended ^ crc2_internal;
    
    return result_internal ^ CRC64_INIT;
}

// Example usage and test
int main() {
    // Test data
    uint8_t block1[] = "Hello, ";
    uint8_t block2[] = "World!";
    size_t len1 = 7;
    size_t len2 = 6;
    
    // Calculate individual CRCs
    uint64_t crc1 = crc64_nvme(block1, len1);
    uint64_t crc2 = crc64_nvme(block2, len2);
    
    // Combine CRCs
    uint64_t combined = crc64_nvme_combine(crc1, crc2, len2);
    
    // Verify by calculating CRC of concatenated data
    uint8_t full_block[] = "Hello, World!";
    uint64_t direct_crc = crc64_nvme(full_block, len1 + len2);
    
    printf("CRC1 (block1):       0x%016llX\n", (unsigned long long)crc1);
    printf("CRC2 (block2):       0x%016llX\n", (unsigned long long)crc2);
    printf("Combined CRC:        0x%016llX\n", (unsigned long long)combined);
    printf("Direct CRC:          0x%016llX\n", (unsigned long long)direct_crc);
    printf("Match: %s\n", combined == direct_crc ? "YES" : "NO");
    
    return 0;
}

Update:

Hah! I assumed that Claude at least got the CRC-64 calculation itself correct, but it didn't! It was using the wrong value for the polynomial. If you replace that line in the code above with:

#define CRC64_POLY 0x9a6c9329ac4bc9b5ULL

Then the CRC calculation and combination code will work for the actual CRC-64/NVME.

There is still some dumb stuff in there needing clean up. The table computation inside crc64_nvme_combine() is not used at all. The combine is doing two zero extensions when it only needs to do one. I could have kept going to make this better, but after 25 iterations, I was tired of compiling and running code for Claude.

1 Comment

Aside, the LL at the end of 0xAD93D23594C93659ULL and 0xFFFFFFFFFFFFFFFFULL serve no purpose. They may form a wider than 64-bit constant on a future system with 128-bit long long, but even then, its not needed here.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.