int128 handling in c-code, gcc / glibc / linux - follow up

Question

I had a draft for int128 handling reviewed in an earlier question:
int128 handling in c-code, gcc / glibc / linux

I changed a lot according to the hints there, while leaving in, e.g.,
a trailing space on each line, the disclaimer and some 'internal notes'.

My intention is as described in the other thread and in the code
header and comments.

/* 
 * Copyright 2025 ... B. Samtleben, 
 * Based on others work, see below, alas in chaotic trial and error not all 
 * sources noted, will try to re-find origins for crediting. 
 * 
 * Redistribution and use in source and binary forms, with or without 
 * modification, are permitted provided that the following conditions 
 * are met: 
 * 
 * 1. Check to match / fulfill any restrictions / copyrights of the origins. 
 * 2. As well for using / distributing source code as binaries. 
 * 3. Forward the following disclaimer. 
 * 
 * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND 
 * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 
 * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE 
 * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE 
 * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 
 * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS 
 * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) 
 * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT 
 * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY 
 * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF 
 * SUCH DAMAGE. 
 */ 
 
// Issue, goal: GNU gcc / glibc programming workbench provides 128-bit integers 
// with different names ( __int128, int_128? int_128_t? ), already confusing. 
// It doesn't provide read routine, print routine or print format specifier. 
// Working with such makes me feel blind, requires guessing, and likely 
// introduces errors. 
// The web is flooded with naive questions and - IMHO - insufficient solutions 
// about the topic. The routines / macros provided here try to emulate: 
// - constants: 
//   INT_MAX, INT_MIN LONG_MAX ... by U128_MIN, U128_MAX, I128_MIN, I128_MAX, 
//   add. U128_MAX_DIV10, U128_MAX_MOD10, I128_MAX_DIV10 and I128_MAX_MOD10 
//   are added to handle overflow detection. 
// - input: 
//   strtol, strtoul, atoi by strtou128, strtoi128, 
// - output: 
//   u128tostr( str, x ) and i128tostr( str, x ) are intended to provide 
//   snprintf functionality, but without length limiter, 
//   printf functionality by printf( " %s ", i128tostr( x ) ) a little less 
//   complicated than quadmath_snprintf. 
 
// resources: 
// "6.9 128-bit Integers" - int128 support in gcc / glibc, 
// https://gcc.gnu.org/onlinedocs/gcc/_005f_005fint128.html 
// "Codeforces / DrinkMoreBoilingWater's blog / some tips about __int128" 
// https://codeforces.com/blog/entry/75044 
// shortcomings AFAICT: console only, print fails for INT128_MIN, 
// "atoi, atol, atoll" - https://en.cppreference.com/w/c/string/byte/atoi, 
// "strtol, strtoll" - https://en.cppreference.com/w/c/string/byte/strtol, 
// "strtoul, strtoull" - https://en.cppreference.com/w/c/string/byte/strtoul, 
 
// tried to: xxxx 
// - not yet covered: 
//   Read multiple signs ( ++x, +-x ... ), 
//   Other print formatting. 
//   Other bases: no binary, octal or hex ... 
// In contrast to snprintf no length limit. 
// To know: defined behaviour, under- / overflows throw error and return 
// MAX in that direction. Like strto(u)l, unlike wrapping of atoi. 
// Caution: in contrast to overflow check for conversions calculations do wrap! 
// To know: signed and unsigned are not different, just different interpretation
// of the same bitstring. 
 
// Build: gcc or some c-compiler / libc with int128 support. 
// Build / tested: linux ( debian ), GNU gcc, glibc 2.40, Intel xeon hardware. 
// Standalone: compile with 'gcc -O2 -o int128_test_5 int128_test_5.c', 
// ignore warning about 'integer constant is so large that it is unsigned'. 
// Run with: './int128_test_5 xxxx (yy)' where xxxx is a string to read, convert, 
// re-convert and print. yy is optional to set the iterations for timing. 
// The macros before 'main' provide the functionality, 
// main has usage examples, tests some edge cases, compares under- / overflow 
// to strtoul, strtol and atoi, and checks arbitrary input from command line, 
// as well it provides simple performance timing. 
// Use in other programs: copy/paste from '#include' .. '#define TIMEITcu' into 
// your program, check for keyword or variable collisions test and then enjoy. 
// The rest is commented or - IMHO - self-explanatory. 
 
#include <errno.h>                                  // reg. e.g. errno, 
#include <stdio.h>                                  // reg. e.g. printf, putchar, 
#include <string.h>                                     // reg. e.g. strcpy, 
#include <stdbool.h>                                    // reg. e.g. bool, 
#include <limits.h>                                     // reg. e.g. UINT_MAX, 
#include <stdlib.h>                                     // reg. e.g. strtoul, 
#include <time.h>                                   // reg. e.g. clock(), 
#include <stdint.h>                                     // reg. e.g. __int128, 
#include <wchar.h>                                  // reg. e.g. wcstol, 
#include <inttypes.h>                                   // reg. e.g. imaxabs, 
 
#define _GNU_SOURCE                                     // acc. anonchatGTP required to use wchar.h, doesn't help for that, 
 
// #define int128 long long                                 // doesn't work, 
// #define int128 int128_t                              // doesn't work, 
// #define int128 int_128                               // doesn't work, 
 
#define int128 __int128                                 // try to avoid int128, __int128, __int128_t confusions, 
#define uint128 unsigned __int128                           // try to avoid long word 'unsigned', 
// #define int128 __int128_t                                // try to avoid int128, __int128, __int128_t confusions, 
// #define uint128 __uint128_t                              // try to avoid long word 'unsigned', 
                                            // not yet found which of above pairs is better, 
 
#define uint unsigned int                               // try to avoid long word 'unsigned', 
#define ulong unsigned long                                 // try to avoid long word 'unsigned', 
 
#define U128_MIN 0 
#define U128_MAX ( ( ( (uint128)( 0xFFFFFFFFFFFFFFFF ) ) << 64 ) + 0xFFFFFFFFFFFFFFFF ) // gcc doesn't support constants > 64-bit?, 
#define U128_MAX_DIV10 ( U128_MAX / 10 )                        // constants for overflow check, 
#define U128_MAX_MOD10 ( U128_MAX % 10 ) 

#define I128_MAX ( ( ( (int128)( 0x7FFFFFFFFFFFFFFF ) ) << 64 ) + 0xFFFFFFFFFFFFFFFF )  // gcc doesn't support constants > 64-bit?, 
#define I128_MAX_DIV10 ( I128_MAX / 10 )                        // constants for overflow check, 
#define I128_MAX_MOD10 ( I128_MAX % 10 ) 
#define I128_MIN ( -I128_MAX - 1 )                          // gcc doesn't support constants > 64-bit?,  
#define I128_MIN_DIV10 ( I128_MIN / 10 )                        // constants for overflow check, 
#define I128_MIN_MOD10 ( I128_MIN % 10 ) 
 
 
// volatile int i = 0;                                  // global variables defined before 'macros', 
// char str[ 45 ]; 
// char *ptr = str;
clock_t start1, end1; 
double reference = 1; 
 
uint128 strtou128( const char *s ) {                            // string to uint128, 
    const char *p = s; 
    uint128 val = 0; 
 
    while ((*p == '\n') || (*p == '\t') || (*p == ' ') ||               // Skip leading whitespace 
        (*p == '\f') || (*p == '\r') || (*p == '\v')) 
        p++; 
 
    if( *p == '-' ) {                               // Check against negative,  
        errno = 1; 
        perror( "error, don't try to convert negative string into positive value" ); 
        return( 0 ); 
    } 
 
    if( *p == '+' )                                 // Swallow '+', 
        p++; 
 
    while (*p >= '0' && *p <= '9') {                        // Convert string to number 
        if( ( val > U128_MAX_DIV10 ) ||                     // This seems costly, carry a counter and start at digit37? 
            ( ( val == U128_MAX_DIV10 ) && ( ( *p - '0' ) > U128_MAX_MOD10 ) ) ) { 
            errno = ERANGE; 
            perror( "error, uint128 overflow" ); 
            return( U128_MAX ); 
        } 
        val = (10 * val) + (*p - '0'); 
        p++; 
    } 
 
    return val; 
} 
 
int128 strtoi128( const char *s ) {                             // string to int128 from ???, 
    const char *p = s; 
    int128 val = 0; 
    bool neg = 0; 
 
    while ((*p == '\n') || (*p == '\t') || (*p == ' ') ||               // Skip leading whitespace 
        (*p == '\f') || (*p == '\r') || (*p == '\v')) 
        p++; 
 
    if( ( *p == '-' ) || ( *p == '+' ) ) {                      // Check for sign 
        neg = ( *p == '-' );                            // account '-', 
        p++;                                    // swallow '+', 
    } 
 
    if( !neg ) { 
        while (*p >= '0' && *p <= '9') {                    // Convert string to number, 
            if( ( val > I128_MAX_DIV10 ) || 
                ( ( val == I128_MAX_DIV10 ) && ( ( *p - '0' ) > I128_MAX_MOD10 ) ) ) { 
                errno = ERANGE; 
                perror( "error, int128 overflow" ); 
                return( I128_MAX ); 
            } 
        
        val = (10 * val) + (*p - '0'); 
        p++; 
        } 
    } 
 
    if( neg ) { 
        if( *p >= '0' && *p <= '9' ) {                      // Account 'neg', 
            val = ( ( 10 * val ) - ( *p - '0' ) ); 
            p++; 
        } 
        while (*p >= '0' && *p <= '9') {                    // Convert string to number, 
            if( ( val < I128_MIN_DIV10 ) || 
                ( ( val == I128_MIN_DIV10 ) && ( ( *p - '0' ) > -I128_MIN_MOD10 ) ) ) { 
                errno = ERANGE; 
                perror( "error, int128 underflow" ); 
                return( I128_MIN ); 
            } 
        val = ( 10 * val ) - ( *p - '0' ); 
        p++; 
        } 
    } 
 
    return val; 
} 
 
char* u128tostr( uint128 x ) {                          // converting uint128 into 0-terminated ASCII string, buffer approach, 
    static char arr[ 40 + 1 ] = { 0 };                      // definition includes termination, 
    int j = 40;                                     // start from right, 
    while( x > 9 ) {                                // iterate through value, 
        arr[ --j ] = ( x % 10 + '0' );                      // set next digit, 
        x /= 10;                                // strip from value, 
    } 
    arr[ --j ] = ( x + '0');                            // set last ( most significant ) digit, 
    return arr + j;                                 // done, but if that works? 
} 
 
char* i128tostr( int128 x ) {                           // converting int128 into 0-terminated ASCII string, buffer approach, 
    static char arr[ 41 + 1 ] = { 0 };                      // definition includes termination, needs one add. byte for the sign, 
    int j = 41;                                     // start from right, 
    bool neg = ( x < 0 ); 
    if( neg ) { 
        arr[ --j ] = -( x % 10 ) + '0';                         // set next digit, 
        x = x / 10;                                 // strip from value, 
        x = -x;                                 // strip from value, 
    } else { 
        arr[ --j ] = x % 10 + '0';                      // set next digit, 
        x = x / 10;                                 // strip from value, 
    } 
 
    while( x > 9 ) {                                // iterate through value, 
        arr[ --j ] = ( x % 10 + '0' );                      // set next digit, 
        x /= 10;                                // strip from value, 
    } 
    if( x > 0 )                                     // don't add a leading 0 to 1 digit negative values,  
        arr[ --j ] = ( x + '0');                        // set last ( most significant ) digit, 
    if( neg ) 
        arr[ --j ] = '-';                           // add sign, 
    return arr + j;                                 // done, but if that works? 
} 
 
#define TIMEITcu( expr, N, comment ) \
    start1 = clock(); \
    for( int i = 1; i <= N; i++ ) \
    { \
        expr; \
    } \
    end1 = clock(); \
    printf( "%07d; %09.03f; %1d; '%1u; %s; %s \n", end1 - start1, ( end1 - start1 ) / reference, N, x1u, #expr, comment ) 
 
#define TIMEITci( expr, N, comment ) \
    start1 = clock(); \
    for( int i = 1; i <= N; i++ ) \
    { \
        expr; \
    } \
    end1 = clock(); \
    printf( "%07d; %09.03f; %1d; '%1d; %s; %s \n", end1 - start1, ( end1 - start1 ) / reference, N, x1i, #expr, comment ) 
 
#define TIMEITclu( expr, N, comment ) \
    start1 = clock(); \
    for( int i = 1; i <= N; i++ ) \
    { \
        expr; \
    } \
    end1 = clock(); \
    printf( "%07d; %09.03f; %1d; '%1lu; %s; %s \n", end1 - start1, ( end1 - start1 ) / reference, N, x1lu, #expr, comment ) 
 
#define TIMEITcli( expr, N, comment ) \
    start1 = clock(); \
    for( int i = 1; i <= N; i++ ) \
    { \
        expr; \
    } \
    end1 = clock(); \
    printf( "%07d; %09.03f; %1d; '%1ld; %s; %s \n", end1 - start1, ( end1 - start1 ) / reference, N, x1li, #expr, comment ) 
 
#define TIMEITcllu( expr, N, comment ) \
    start1 = clock(); \
    for( int i = 1; i <= N; i++ ) \
    { \
        expr; \
    } \
    end1 = clock(); \
    printf( "%07d; %09.03f; %1d; '%s; %s; %s \n", end1 - start1, ( end1 - start1 ) / reference, N, u128tostr( x1llu ), #expr, comment ) 
 
#define TIMEITclli( expr, N, comment ) \
    start1 = clock(); \
    for( int i = 1; i <= N; i++ ) \
    { \
        expr; \
    } \
    end1 = clock(); \
    printf( "%07d; %09.03f; %1d; '%s; %s; %s \n", end1 - start1, ( end1 - start1 ) / reference, N, i128tostr( x1lli ), #expr, comment ) 
 
#define TIMEITcstr( expr, N, comment ) \
    start1 = clock(); \
    for( int i = 1; i <= N; i++ ) \
    { \
        expr; \
    } \
    end1 = clock(); \
    printf( "%07d; %09.03f; %1d; '%s; %s; %s \n", end1 - start1, ( end1 - start1 ) / reference, N, str1, #expr, comment ) 
 
 
int main( int argc, char *argv[] ) { 
 
    volatile uint128 x1llu; 
    volatile int128 x1lli; 
    volatile ulong x1lu; 
    volatile long x1li; 
    volatile uint x1u; 
    volatile int x1i; 
    char *str1; 
    int count = 1000; 
    wchar_t wide_str[20]; 

    if( argv[ 2 ] ) count = atoi( argv[ 2 ] ); 
 
    printf( " \n" ); 
 
    printf( "check availability of INT_128 \n\n" ); 
 
#ifdef __SIZEOF_INT128__
    printf( "size of INT128: %1d \n", __SIZEOF_INT128__ ); 
#else
    printf( "INT128 likely not supported. \n" ); 
#endif
 
    printf( "sizeof( intmax_t ) : %1d \n", sizeof( intmax_t ) ); 
    printf( "sizeof( uintmax_t ): %1d \n", sizeof( uintmax_t ) ); 
    printf( " \n" ); 
 
    printf( "check constants / references \n\n" ); 
 
    printf( "U128_MAX                : %s \n", u128tostr( U128_MAX ) ); 
    printf( "U128_MAX_DIV10          : %s \n", u128tostr( U128_MAX_DIV10 ) ); 
    printf( "U128_MAX_MOD10          : %s \n", u128tostr( U128_MAX_MOD10 ) ); 
    printf( " \n" ); 
 
    printf( "I128_MAX                : %s \n", i128tostr( I128_MAX ) ); 
    printf( "I128_MAX_DIV10          : %s \n", i128tostr( I128_MAX_DIV10 ) ); 
    printf( "I128_MAX_MOD10          : %s \n", i128tostr( I128_MAX_MOD10 ) ); 
    printf( " \n" ); 
 
    printf( "I128_MIN                : %s \n", i128tostr( I128_MIN ) ); 
    printf( "I128_MIN_DIV10          : %s \n", i128tostr( I128_MIN_DIV10 ) ); 
    printf( "I128_MIN_MOD10          : %s \n", i128tostr( I128_MIN_MOD10 ) ); 
    printf( " \n" ); 
 
    printf( "testing uint128: \n\n" ); 
 
    x1llu = 1;                                  // simple case, 
    printf( "simple 'x1llu = 1'      : %s \n", u128tostr( x1llu ) );        // print as string, 
 
    x1llu--;                                    // simple calculation, and checking 0, 
    printf( "'x1llu--'               : %s \n", u128tostr( x1llu )  ); 
 
    x1llu--;                                    // simple calculation, and checking wrap below 0, 
    printf( "'x1llu--' wrap below 0  : %s \n", u128tostr( x1llu ) ); 
    printf( " \n" ); 
 
    x1llu = U128_MIN;                               // checking MIN, 
    printf( "U128_MIN                : %s \n", u128tostr( U128_MIN ) ); 
    printf( " \n" ); 
 
    x1llu = U128_MAX;                               // checking MAX, 
    printf( "U128_MAX                : %s \n", u128tostr( x1llu ) ); 
 
    x1llu++;                                    // check wrap above MAX, 
    printf( "'x1llu++' wrap above MAX: %s \n", u128tostr( x1llu ) ); 
    printf( " \n" ); 
 
    printf( "testing int128: \n\n" ); 
 
    x1lli = 1;                                  // simple case, 
    printf( "simple 'x1ll1 = 1'      : %s \n", i128tostr( x1lli ) );        // print the string, 
 
    x1lli--;                                    // simple calculation, and checking 0, 
    printf( "'x1ll1--' testing 0     : %s \n", i128tostr( x1lli ) );        // print the string, 
 
    x1lli--;                                    // checking advancing into negative, 
    printf( "'x1ll1--' go to negative: %s \n", i128tostr( x1lli ) );        // print the string, 
    printf( " \n" ); 
 
    x1lli = I128_MIN;                               // checking MIN, 
    printf( "'x1ll1 = I128_MIN'      : %s \n", i128tostr( x1lli ) );        // print the string, 
 
    x1lli--;                                    // checking wrap below MIN, 
    printf( "'x1ll1--' wrap below MIN: %s \n", i128tostr( x1lli ) );        // print the string, 
    printf( " \n" ); 
 
    x1lli = I128_MAX;                               // checking MAX, 
    printf( "'x1ll1 = I128_MAX'      : %s \n", i128tostr( x1lli ) );        // print the string, 
 
    x1lli++;                                    // checking wrap above MAX, 
    printf( "'x1ll1++' wrap above MAX: %s \n", i128tostr( x1lli ) );        // print the string, 
    printf( " \n" ); 
 
    printf( "'ll' wraps at LONG_MAX  : %s \n\n", i128tostr( -9223372036854775808ll ) ); // print the string, 
    printf( "'ll' wraps at LONG_MAX  : %s \n\n", i128tostr( -18446744073709551616ll ) ); // print the string, 
 
    printf( "cross check vs. unsigned long int: \n\n" ); 
 
    x1lu = 1;                                   // simple case, 
    printf( "simple 'x1lu = 1'       : %1lu \n", x1lu );                // print the value, 
 
    x1lu--;                                     // simple calculation, and checking 0, 
    printf( "'x1lu--'                : %1lu \n", x1lu ); 
 
    x1lu--;                                     // simple calculation, and checking wrap below 0, 
    printf( "'x1lu--' wrap below 0   : %1lu \n", x1lu ); 
    printf( " \n" ); 
 
    x1lu = ULONG_MAX;                               // checking UINT_MAX, 
    printf( "'ULONG_MAX'             : %1lu \n", x1lu ); 
 
    x1lu++;                                     // checking wrap above MAX, 
    printf( "'x1lu++' wrap above MAX : %1lu \n", x1lu ); 
    printf( " \n" ); 
 
    printf( "cross check vs. long int: \n\n" ); 
 
    x1li = 1;                                   // simple case, 
    printf( "simple 'x1li = 1'       : %1ld \n", x1li );                // print the value, 
 
    x1li--;                                     // simple calculation, and checking 0, 
    printf( "'x1li--'                : %1ld \n", x1li ); 
 
    x1li--;                                     // simple calculation, and checking advance into negative, 
    printf( "'x1li--' advance to neg.: %1ld \n", x1li ); 
    printf( " \n" ); 
 
    x1li = LONG_MIN;                                // checking INT_MIN, 
    printf( "'LONG_MIN'              : %1ld \n", x1li ); 
 
    x1li--;                                     // check wrap below MIN, 
    printf( "'x1li--' wrap below MIN : %1ld \n", x1li ); 
    printf( " \n" ); 
 
    x1li = LONG_MAX;                                // checking INT_MAX, 
    printf( "'LONG_MAX'              : %1ld \n", x1li ); 
 
    x1li++;                                     // check wrap above MAX, 
    printf( "'x1li++' wrap above MAX : %1ld \n", x1li ); 
    printf( " \n" ); 
 
    printf( "cross check vs. unsigned int: \n\n" ); 
 
    x1u = 1;                                    // simple case, 
    printf( "simple 'x1u = 1'        : %1u \n", x1u );              // print the value, 
 
    x1u--;                                      // simple calculation, and checking 0, 
    printf( "'x1u--'                 : %1u \n", x1u ); 
 
    x1u--;                                      // simple calculation, and checking wrap below 0, 
    printf( "'x1u--' wrap below 0    : %1u \n", x1u ); 
    printf( " \n" ); 
 
    x1u = UINT_MAX;                                 // checking UINT_MAX, 
    printf( "'UINT_MAX'              : %1u \n", x1u ); 
 
    x1u++;                                      // checking wrap above MAX, 
    printf( "'x1u++' wrap above MAX  : %1u \n", x1u ); 
    printf( " \n" ); 
 
    printf( "cross check vs. int: \n\n" ); 
 
    x1i = 1;                                    // simple case, 
    printf( "simple 'x1i = 1'        : %1d \n", x1i );              // print the value, 
 
    x1i--;                                      // simple calculation, and checking 0, 
    printf( "'x1i--'                 : %1d \n", x1i ); 
 
    x1i--;                                      // simple calculation, and checking advance into negative, 
    printf( "'x1i--' advance to neg. : %1d \n", x1i ); 
    printf( " \n" ); 
 
    x1i = INT_MIN;                                  // checking INT_MIN, 
    printf( "'INT_MIN'               : %1d \n", x1i ); 
 
    x1i--;                                      // check wrap below MIN, 
    printf( "'x1i--' wrap below MIN  : %1d \n", x1i ); 
    printf( " \n" ); 
 
    x1i = INT_MAX;                                  // checking INT_MAX, 
    printf( "'INT_MAX'               : %1d \n", x1i ); 
 
    x1i++;                                      // check wrap above MAX, 
    printf( "'x1i++' wrap above MAX  : %1d \n", x1i ); 
    printf( " \n" ); 
 
    printf( "Check arbitrary values from command line argument: \n\n", x1li ); 
 
    x1llu = strtou128( argv[ 1 ] ); 
    printf( "input as uint128        : %s \n", u128tostr( x1llu ) ); 
    printf( "under- / overflow error < 0 and > U128_MAX, \n\n" ); 
 
    printf( "performance of uint128: \n" ); 
    TIMEITcllu( x1llu = strtou128( argv[ 1 ] ), count, "" ); 
    TIMEITcstr( str1 = u128tostr( x1llu ), count, "" ); 
    printf( " \n" ); 
 
    x1lli = strtoi128( argv[ 1 ] ); 
    printf( "input as int128         : %s \n", i128tostr( x1lli ) ); 
    printf( "under- / overflow error < I128_MIN and > I128_MAX \n\n" ); 
 
    printf( "performance of int128: \n" ); 
    TIMEITclli( x1lli = strtoi128( argv[ 1 ] ), count, "" ); 
    TIMEITcstr( str1 = i128tostr( x1lli ), count, "" ); 
    printf( " \n" ); 
 
    x1lu = strtoul( argv[ 1 ], NULL, 10 ); 
    printf( "input as unsig. longint : %1lu \n", x1lu ); 
    printf( "strtoul works up to ULONG_MAX, then stuck, \n" ); 
    printf( "in negative wraps down to -ULONG_MAX, then stuck there, \n\n" ); 
 
    printf( "performance of ulong: \n\n" ); 
    TIMEITclu( x1lu = strtoul( argv[ 1 ], NULL, 10 ), count, "" ); 
    TIMEITcstr( snprintf( str1, 21, "%lu", x1lu ), count, "" ); 
    printf( " \n" ); 
 
    x1li = strtol( argv[ 1 ], NULL, 10 ); 
    printf( "input as long int       : %1ld \n", x1li ); 
    printf( "strtol works up to LONG_MAX, then stuck, \n" ); 
    printf( "in negative wraps down to LONG_MIN, then stuck there, \n\n" ); 
 
    printf( "performance of long: \n\n" ); 
    TIMEITcli( x1li = strtol( argv[ 1 ], NULL, 10 ), count, "" ); 
    TIMEITcstr( snprintf( str1, 21, "%ld", x1li ), count, "" ); 
    printf( " \n" ); 
 
    x1u = atoi( argv[ 1 ] ); 
    printf( "input as uint           : %1u \n", x1u ); 
    printf( "atoi ( found no atou ) wrapping up to LONG_MAX, then stuck at \n" ); 
    printf( "4294967295 ( UINT_MAX ), in negative it wraps down to \n" ); 
    printf( "LONG-MIN, then stuck at 0. \n\n" ); 
 
    printf( "performance of uint: \n\n" ); 
    TIMEITcu( x1u = atoi( argv[ 1 ] ), count, "" ); 
    TIMEITcstr( snprintf( str1, 11, "%u", x1u ), count, "" ); 
    printf( " \n" ); 
 
    x1i = atoi( argv[ 1 ] ); 
    printf( "input as int            : %1d \n", x1i ); 
    printf( "atoi wraps up to LONG_MAX, then stuck at -1 \n" ); 
    printf( "in negative it wraps down to LONG_MIN, then stuck at 0. \n\n" ); 

    printf( "performance of int: \n\n" ); 
    TIMEITci( x1i = atoi( argv[ 1 ] ), count, "" ); 
    TIMEITcstr( snprintf( str1, 12, "%d", x1i ), count, "" ); 
    printf( " \n" ); 
 
// acc. anonchatGPT compiling with '-std=c99 helps to use wchar.h types ... not yet got it to work ... 
///     wide_str = \0; 
///     x1llu = wcstoull( argv[ 1 ], wide_str, 10 ); 
///     u128tostr( str, x1llu ); 
///     printf( "input as uint128        : %s \n", str ); 
 
    printf( "done \n \n" ); 
 
    return 0; 
}

If you want to do this well, you really should consider extending printf(). Sadly, the GNU library doesn't seem to support registering a corresponding scanf() format.
– Toby Speight, Commented Feb 8 at 15:11
user1018684, given the explosion of _BitInt integer types available now with C23, perhaps we should design 4 functions: signed to/from string, unsigned to/from string, that all take a bit-width n argument. This 128-bit set of yours is good, yet arbitrary width-n is not that much of an extension. Hmmm, maybe *printf() was already extended for that...?
– chux, Commented Feb 9 at 21:51
@chux When I asked the question, I was actually expecting something like this to already be ready somewhere in the drawer. I don't know what C23 contains; do I have to worry that it defines the types without offering functions for them?
– user1018684, Commented Feb 9 at 22:11

chux · Accepted Answer · 2025-02-08 16:55:35Z

Caller unable to detect no conversion

A call like strtou128(" -xyz"); simply returns 0 with no indication of a failed conversion.

`char* i128tostr( int128 x )` risks issues by returning a `static` buffer

Consider returning const char * to discourage users from altering the buffer.
Consider passing in a buffer like here or a compound literal.
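
To illustrate the caller-supplied buffer idea, here is a minimal sketch. The names i128_to_buf, I128_STR_SIZE and I128_TO_STR are hypothetical and not part of OP's code; the size 41 assumes a 128-bit type (sign, 39 digits, terminator).

```c
typedef __int128 int128;

/* '-' + 39 digits + '\0' is enough for any 128-bit signed value. */
#define I128_STR_SIZE 41

/* Hypothetical helper: fills the caller's buffer from the end and
   returns a pointer to the first character of the text inside it. */
static char *i128_to_buf(char buf[I128_STR_SIZE], int128 x) {
    char *p = &buf[I128_STR_SIZE - 1];
    *p = '\0';

    /* Work on the negative side so the minimum value needs no special case. */
    int128 an = x < 0 ? x : -x;
    do {
        *(--p) = (char)('0' - (int)(an % 10));
        an /= 10;
    } while (an);

    if (x < 0) {
        *(--p) = '-';
    }
    return p;
}

/* Compound literal: every use gets its own temporary buffer, which stays
   valid until the end of the enclosing block. */
#define I128_TO_STR(x) i128_to_buf((char[I128_STR_SIZE]){0}, (x))
```

With this shape, a call such as printf("%s %s\n", I128_TO_STR(a), I128_TO_STR(b)) would print two different values correctly, which a single shared static buffer cannot do.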

Candidate `i128tostr()` code reduction

Something like below which handles MAX, MIN values and zero nicely. I suspect it's about 40% of OP's code.

  // Start filling from the end.
  char* p = &arr[sizeof arr - 1];
  *p = '\0';  // This eliminates need to initialize `arr[]`.

  // Work with negative `int128`.
  int128 an = x < 0 ? x : -x;  

  do {
    *(--p) = '0' - (an % 10);
    an /= 10;
  } while (an);

  if (x < 0) {
    *(--p) = '-';
  }
  return p;

`strto...()` is a reserved name

str[a-z][a-zA-Z0-9_]* C23dr § J.6.1 1

`strtou128()` signature

If wanting different functionality from standard strto....() functions' interface, use a different name.

`strtoi128()` function signature is quite different from standard `strto...()` functions

Function differs from standard `strtou...()` which accept `'-'`

 if( *p == '-' ) {                               // Check against negative,  
        errno = 1;

Corner case consideration: uint128 strtou128("-0").

Alternative limit check

OP has:

if( ( val > U128_MAX_DIV10 ) || 
    ( ( val == U128_MAX_DIV10 ) && ( ( *p - '0' ) > U128_MAX_MOD10 ) ) ) {

Good compilers emit efficient code either way, yet the following may be faster and easier to read: it rarely tests anything beyond the first val >= U128_MAX_DIV10, whereas the original usually has to evaluate the second comparison as well.

if (val >= U128_MAX_DIV10 && 
  (val > U128_MAX_DIV10  || (*p - '0' > U128_MAX_MOD10))) {

OP's `strtou128()` function differs from standard `strto...()` in potentially calling `perror()`

If targeting strto...() like behavior, remove perror() call.

`strtoi128()` has 2 nearly identical loops; only 1 is needed

I suspect OP already knows of this from prior review, yet still favors repeated code.

Repeat: Avoid naked magic numbers

Code like static char arr[ 41 + 1 ] = { 0 }; has 41 with no explanation. (I suspect it is off-by-1 in a benign fashion.)
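
One possible way to replace the bare 40 and 41 is to derive the buffer sizes from the type width. The macro names below are assumptions for this sketch, not OP's; the ratio 28/93 is used because it is slightly larger than log10(2), so the digit count is never underestimated.

```c
#include <limits.h>

/* Upper bound on the decimal digits needed for 'value_bits' value bits:
   28/93 is slightly above log10(2). */
#define DECIMAL_DIGITS(value_bits) ((value_bits) * 28 / 93 + 1)

/* digits + '\0' */
#define U128_STR_SIZE \
    (DECIMAL_DIGITS(sizeof(unsigned __int128) * CHAR_BIT) + 1)

/* sign + digits + '\0'; one bit of the signed type is the sign bit */
#define I128_STR_SIZE \
    (1 + DECIMAL_DIGITS(sizeof(__int128) * CHAR_BIT - 1) + 1)

_Static_assert(U128_STR_SIZE == 40, "39 digits + terminator");
_Static_assert(I128_STR_SIZE == 41, "sign + 39 digits + terminator");
```

The static assertions document the expected results for a 128-bit type: 40 and 41 bytes, one less than the sizes in OP's code.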

Minor: Standard `strto...()` functions skip locale dependent white-space

OP uses a fixed set of white-space: "\n\t \f\r\v" so likely different functionality in a future standard strtou128().
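
A minimal sketch of locale-aware skipping with isspace() follows; the helper name skip_space is hypothetical. The cast to unsigned char matters, because passing a negative char value to isspace() is undefined behaviour.

```c
#include <ctype.h>

/* Hypothetical helper: returns a pointer to the first character that is
   not white-space in the current locale. Stops at the terminating '\0',
   since '\0' is never white-space. */
static const char *skip_space(const char *s) {
    while (isspace((unsigned char)*s)) {
        s++;
    }
    return s;
}
```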

Test code assumes `clock_t` is an `int` when printing

Use conversion specifiers that match the types of the arguments.

Casting to a wide type is one approach.

// printf( "%07d; ....\n", end1 - start1, ...);
printf( "%07g; ....\n", (double)(end1 - start1), ...);
// or the like ...
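
As one possible variant of the cast approach, the sketch below formats a clock_t through double with a fixed, zero-padded width. The function name format_ticks is an assumption for this example.

```c
#include <stdio.h>
#include <time.h>

/* Hypothetical helper: writes 'ticks' as a zero-padded 7-character number.
   clock_t is converted to double so the specifier matches the argument
   regardless of how the platform defines clock_t. */
static int format_ticks(char *buf, size_t size, clock_t ticks) {
    return snprintf(buf, size, "%07.0f", (double)ticks);
}
```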

Minor: `ll` not needed

i128tostr( -9223372036854775808ll ) is fine as i128tostr( -9223372036854775808). Appending an l or ll is seldom needed.

In any case, -9223372036854775808 (-2⁶³) and -18446744073709551616 (-2⁶⁴), with ll or not, are often invalid integer constants.

If compilation did not warn, consider enabling more warnings to save time.

DRY vs. WET.

On a future system with a 256-bit long long, i128tostr( -9223372036854775808ll ) may emit a pedantic warning about type reduction.

Maybe use UINT64_MAX + (uint128)1?
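
A sketch of that idea: build the two test values from 64-bit limits that are valid constants, then widen before doing the arithmetic. The macro names are made up here, and a 128-bit __int128 is assumed.

```c
#include <stdint.h>

typedef __int128 int128;

/* -2^63 fits in 64 bits, so INT64_MIN can simply be widened. */
#define NEG_2_POW_63 ((int128)INT64_MIN)

/* -2^64 does not fit in 64 bits: widen first, then add 1 and negate. */
#define NEG_2_POW_64 (-((int128)UINT64_MAX + 1))
```

Passing these to i128tostr() avoids integer constants that are too large for any standard type.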

Minor: alternate constants

Inspired by Toby

Code like ( ( ( (uint128)( 0xFFFFFFFFFFFFFFFF ) ) << 64 ) + 0xFFFFFFFFFFFFFFFF ) risks a mis-code of the number of Fs.

For ..._MOD10 constants, I like the idea of narrowing the type to prevent unnecessarily wide math.

//#define U128_MIN 0 // Not really needed
#define U128_MAX ((uint128)-1)
#define U128_MAX_DIV10 ( U128_MAX / 10 )
#define U128_MAX_MOD10 ( (unsigned) (U128_MAX % 10) ) // or maybe (int)

#define I128_MAX ( (int128)(U128_MAX/2) )
#define I128_MAX_DIV10 ( I128_MAX / 10 )
#define I128_MAX_MOD10 ( (int) (I128_MAX % 10) ) 

#define I128_MIN ( -I128_MAX - 1 )
#define I128_MIN_DIV10 ( I128_MIN / 10 )
#define I128_MIN_MOD10 ( (int) (I128_MIN % 10) )
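
Definitions like these can be cross-checked at compile time. The sketch below repeats a subset of the constants and adds static assertions; it assumes GCC's 128-bit __int128 and C11 _Static_assert.

```c
typedef __int128 int128;
typedef unsigned __int128 uint128;

#define U128_MAX ((uint128)-1)
#define U128_MAX_MOD10 ((unsigned)(U128_MAX % 10))
#define I128_MAX ((int128)(U128_MAX / 2))
#define I128_MAX_MOD10 ((int)(I128_MAX % 10))
#define I128_MIN (-I128_MAX - 1)
#define I128_MIN_MOD10 ((int)(I128_MIN % 10))

/* Same value as the two-halves form, without counting Fs twice. */
_Static_assert(U128_MAX ==
    ((((uint128)0xFFFFFFFFFFFFFFFFu) << 64) | 0xFFFFFFFFFFFFFFFFu),
    "all 128 bits set");
_Static_assert(U128_MAX_MOD10 == 5, "2^128 - 1 ends in 5");
_Static_assert(I128_MAX_MOD10 == 7, "2^127 - 1 ends in 7");
/* Since C99 the remainder takes the sign of the dividend. */
_Static_assert(I128_MIN_MOD10 == -8, "-2^127 ends in 8");
_Static_assert(I128_MIN + I128_MAX == -1, "two's complement range");
```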

`strtoi128_base10()`

Some sample unchecked code that integrates various ideas for part of OP's goal.

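#include <ctype.h>
#include <errno.h>
#include <stdbool.h>

typedef __int128 my_int;
// Divide in the unsigned type first, then cast; casting first would yield -1 / 2 == 0.
#define MY_INT_MAX ((my_int) (((unsigned __int128)-1) / 2))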
#define MY_INT_MIN (-MY_INT_MAX - 1)

// A base 10 only version of strto...()
my_int my_strtoi128_base10(const char *restrict nptr, char **restrict endptr) {
  // Access the characters as unsigned char.
  const unsigned char *unptr = (const unsigned char*) nptr;

  // Consume leading white-space.
  while (isspace(*unptr)) {
    unptr++;
  }

  // Save sign.
  bool negsign = *unptr == '-';
  if (negsign || *unptr == '+') {
    unptr++;
  }

  // Store the address of the first potential digit.
  const unsigned char *first_digit = unptr;

  my_int sum = 0;

  // Examine the digits.
  while (isdigit(*unptr)) {
    int digit = *unptr++ - '0';
    if (sum <= MY_INT_MIN / 10
        && (sum < MY_INT_MIN / 10 || (digit > -(MY_INT_MIN % 10)))) {
      sum = MY_INT_MIN;
      errno = ERANGE;
    } else {
      sum = sum * 10 - digit;  // Sum as a negative integer.
    }
  }

  // Adjust for sign.
  if (!negsign) {
    if (sum < -MY_INT_MAX) {
      errno = ERANGE;
      sum = MY_INT_MAX;
    } else {
      sum = -sum;
    }
  }

  if (endptr) {
    // Was at least 1 digit found?
    *endptr = (unptr > first_digit) ? (char*) unptr : (char*) nptr;
  }

  return sum;
}

Toby Speight · Accepted Answer · 2025-02-08 18:25:22Z

It would be helpful to see the public interface (as you would provide in a header file) separated from the implementation. There's a lot in here that shouldn't be exposed to users. I'm unable to review the public/private split in this review.

Please clean up the trailing whitespace all through the file.

#include <errno.h>                                  // reg. e.g. errno,

#define _GNU_SOURCE                                     // acc. anonchatGTP required to use wchar.h, doesn't help for that,

I can't tell what the abbreviations reg. and acc. stand for here.

And _GNU_SOURCE probably ought to be defined before including the standard library headers, or you get an inconsistent view of the target's capabilities.

#define int128 __int128

Don't use preprocessor for this. C provides typedef for declaring type aliases, without the risk of expanding macros in inappropriate contexts.

#define U128_MAX ( ( ( (uint128)( 0xFFFFFFFFFFFFFFFF ) ) << 64 ) + 0xFFFFFFFFFFFFFFFF ) // gcc doesn't support constants > 64-bit?,

Perhaps simpler and more portable as (~(uint128)0)?
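To make the last two suggestions concrete, here is a minimal sketch, assuming GCC or Clang on a target that provides `__int128`. The names `int128`, `uint128` and `U128_MAX` follow the question; `I128_MAX` and `I128_MIN` are derived the same way as in the other answer.

```c
// Type aliases instead of macros: the compiler treats these as real type names.
typedef __int128 int128;
typedef unsigned __int128 uint128;

// All bits set in the unsigned type, no 64-bit literals or shifts needed.
#define U128_MAX (~(uint128)0)

// Signed limits derived from the unsigned maximum.
#define I128_MAX ((int128)(U128_MAX >> 1))
#define I128_MIN (-I128_MAX - 1)

// Compile-time sanity check of the assumption that the type is 128 bits wide.
_Static_assert(sizeof(int128) == 16, "__int128 is expected to be 16 bytes");
```

Because `uint128` is a typedef, a declaration such as `uint128 *p, *q;` behaves like any other type, and no macro expansion can occur in an unrelated context.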

These variables do not need to be global:

clock_t start1, end1; 
double reference = 1;

Avoid global variables if at all possible, as they make the code much harder to reason about.

This function has the same weakness as atoi() from the standard library:

uint128 strtou128( const char *s )

Don't define functions with names reserved for future versions of the standard library (names beginning with str followed by a lowercase letter are reserved). That risks undefined behaviour in future.

This function has no way to unambiguously tell its caller whether or not it was successful. Prefer an interface more like strtoull() which indicates success or failure and also how much of the string was parsed.
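As an illustration of what that interface gives the caller, here is a small sketch that uses the standard `strtoull()` itself. The wrapper `parse_ull()` and the `parse_status` enum are made up for this example; only `strtoull()`, `errno` and `ERANGE` come from the standard library.

```c
#include <errno.h>
#include <stdlib.h>

// Hypothetical status codes for this example.
enum parse_status { PARSE_OK, PARSE_NO_DIGITS, PARSE_TRAILING, PARSE_RANGE };

// Converts s to unsigned long long and reports exactly what happened.
enum parse_status parse_ull(const char *s, unsigned long long *out) {
    char *end;
    errno = 0;  // strtoull() only sets errno on failure, so clear it first
    unsigned long long value = strtoull(s, &end, 10);
    if (end == s) {
        return PARSE_NO_DIGITS;  // nothing was converted
    }
    if (errno == ERANGE) {
        return PARSE_RANGE;      // value did not fit
    }
    *out = value;
    if (*end != '\0') {
        return PARSE_TRAILING;   // converted a prefix, junk follows
    }
    return PARSE_OK;
}
```

A 128-bit function with the same shape (value returned, `endptr` out-parameter, `errno` for range errors) lets callers write the same kind of checks.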

while ((*p == '\n') || (*p == '\t') || (*p == ' ') ||               // Skip leading whitespace 
    (*p == '\f') || (*p == '\r') || (*p == '\v')) 
    p++;

Why are we using a fixed list of whitespace characters here, instead of standard isspace()?
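For reference, a minimal sketch of the `isspace()` version (the helper name `skip_space()` is only for illustration). The cast to `unsigned char` matters because passing a negative `char` value to the `<ctype.h>` functions is undefined behaviour.

```c
#include <ctype.h>

// Returns a pointer to the first non-white-space character of p.
// The terminating '\0' is not white-space, so the loop stops at the end of the string.
const char *skip_space(const char *p) {
    while (isspace((unsigned char)*p)) {
        p++;
    }
    return p;
}
```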

~~And we have a buffer overflow if there's no non-whitespace characters in the input string - we need to be testing *p == '\0' throughout this loop, too.~~

Also, please always use braced blocks for conditional statements, to aid comprehension and reduce mistakes.

    errno = 1; 
    perror( "error, don't try to convert negative string into positive value" ); 
    return( 0 );

On my system, 1 is EPERM, meaning that the user will get a message like "Operation not permitted" or "Operationen er ikke tilladt" or "許可されていない操作です" in this case. I don't think that's helpful.

In a library function like this, we shouldn't be producing output - just make sure the caller is able to determine if and why it failed, and allow it to decide whether its user needs to be informed.
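One possible shape for that, as an untested-in-production sketch: the function returns 0 or an errno-style code and never prints. The name `parse_u128()`, its signature, and the choice of `EINVAL` for a rejected sign are assumptions of this example, not something taken from the question.

```c
#include <ctype.h>
#include <errno.h>

typedef unsigned __int128 uint128;

// Parses an unsigned decimal number. Returns 0 on success, EINVAL when the
// input is negative or has no digits, ERANGE when the value does not fit.
// On success *out holds the value and, if end is non-NULL, *end points just
// past the last digit. Nothing is written to stdout or stderr.
int parse_u128(const char *s, uint128 *out, const char **end) {
    const uint128 max = ~(uint128)0;
    const char *p = s;
    while (isspace((unsigned char)*p)) {
        p++;
    }
    if (*p == '-') {
        return EINVAL;  // caller decides how to report this
    }
    if (*p == '+') {
        p++;
    }
    if (!isdigit((unsigned char)*p)) {
        return EINVAL;
    }
    uint128 value = 0;
    while (isdigit((unsigned char)*p)) {
        unsigned digit = (unsigned)(*p - '0');
        if (value > (max - digit) / 10) {
            return ERANGE;  // value * 10 + digit would overflow
        }
        value = value * 10 + digit;
        p++;
    }
    *out = value;
    if (end) {
        *end = p;
    }
    return 0;
}
```

A caller that wants a message can pass the returned code to `strerror()`; a caller that does not can stay silent.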

strtoi128() shares most of the same issues as strtou128(). Additionally, we can restructure this form:

if( !neg ) {
    ⋮ 
}

if( neg ) {
    ⋮ 
}

to a much more readable

if (neg) {
    ⋮ 
} else {
    ⋮ 
}

The functions that convert integers to string both return pointers to internal static storage that's re-used each call. That makes them hard to use, for example when we want to format and print more than one value. It's better to require the caller to provide suitable output storage (and its length) in the style of snprintf().
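A sketch of what such a caller-supplied-buffer interface could look like. The name `i128_to_str()` and its exact signature are my own choice for this example; the return value mimics `snprintf()` by reporting the length the full text needs.

```c
#include <stddef.h>
#include <string.h>

typedef __int128 int128;
typedef unsigned __int128 uint128;

// Writes the decimal text of x into buf (size bytes), always NUL-terminated
// when size > 0, truncating if necessary. Returns the number of characters
// the complete text needs, excluding the terminating NUL.
size_t i128_to_str(char *buf, size_t size, int128 x) {
    char tmp[41];                 // sign + 39 digits + NUL
    char *p = tmp + sizeof tmp;
    *--p = '\0';
    // Use the unsigned magnitude so that the most negative value is handled.
    uint128 mag = x < 0 ? -(uint128)x : (uint128)x;
    do {
        *--p = (char)('0' + (int)(mag % 10));
        mag /= 10;
    } while (mag != 0);
    if (x < 0) {
        *--p = '-';
    }
    size_t len = (size_t)(tmp + sizeof tmp - 1 - p);
    if (size > 0) {
        size_t n = len < size - 1 ? len : size - 1;
        memcpy(buf, p, n);
        buf[n] = '\0';
    }
    return len;
}
```

With this shape, two values can be formatted into two separate buffers and passed to a single `printf()` call, which the static-buffer version cannot do safely.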

The various TIMEIT macros (presumably part of the test program rather than the library) differ only in the printing of result. It's probably better to make a common version that accepts extra arguments for that:

#define TIMEIT(expr, N, result, format, comment)                        \
    do {                                                                \
        clock_t const start1 = clock();                                 \
        for (int i = 1;  i <= N;  ++i) {                                \
            expr;                                                       \
        }                                                               \
        clock_t const end1 = clock();                                   \
        printf( "%07ld; %09.03lf; %1d; '%" #format "; %s; %s\n",        \
                end1 - start1, (double)(end1 - start1) / reference,     \
                N, result, #expr, comment );                            \
    } while (0)

#define TIMEITcu(expr, N, comment) \
    TIMEIT(expr, N, x1u, u, comment)

#define TIMEITci(expr, N, comment) \
    TIMEIT(expr, N, x1i, d, comment)

#define TIMEITclu(expr, N, comment) \
    TIMEIT(expr, N, x1lu, lu, comment)

#define TIMEITcli(expr, N, comment) \
    TIMEIT(expr, N, x1li, ld, comment)

#define TIMEITcllu(expr, N, comment) \
    TIMEIT(expr, N, u128tostr(x1llu), s, comment)

#define TIMEITclli(expr, N, comment) \
    TIMEIT(expr, N, i128tostr(x1lli), s, comment)

#define TIMEITcstr(expr, N, comment) \
    TIMEIT(expr, N, str1, s, comment)

(I corrected the %d format for clock_t to %ld for my platform - consider something more portable here)
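One portable option, sketched here under the assumption that a floating-point representation of the elapsed time is acceptable: convert the `clock_t` difference to `double` before printing, so the format specifier no longer depends on how the platform defines `clock_t`. The helper name `format_elapsed()` is hypothetical.

```c
#include <stdio.h>
#include <time.h>

// Formats the elapsed CPU time between two clock() readings as seconds.
// Returns what snprintf() returns.
int format_elapsed(char *buf, size_t size, clock_t start, clock_t end) {
    double seconds = (double)(end - start) / CLOCKS_PER_SEC;
    return snprintf(buf, size, "%.3f", seconds);
}
```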

thank you!, the whitespace is intentional, "reg.": regarding, "acc.": according(ly), the other hints will take some time to go through, I will do and come back to it. ( why does the editor steal away my leading "@Toby Speight" ~ confused )
– user1018684, Commented Feb 7 at 11:06
@Toby, As clock_t and time_t are some arithmetic type (perhaps FP), I found a cast to double to match %g satisfactory and portable. printf("%g\n", (double) clock);.
– chux, Commented Feb 8 at 9:22
I was mistaken there @chux - fixed. Thank you for pointing that out.
– Toby Speight, Commented Feb 8 at 18:25
reg. the recursive TIMEIT definitions, can you explain that concept in more detail, I ran into difficulties when I tried to adapt for more datatypes, don't understand the do { } while(0) loop, and @Fe2O3 considered it 'risky' in ~follow-up II
– user1018684, Commented Feb 15 at 9:33
There's no recursion - the various TIMEIT* macros expand the common TIMEIT to avoid duplicating all the shared implementation. The do {} while(0) idiom is a commonly used structure to create a macro that can be used like a single statement.
– Toby Speight, Commented Feb 15 at 9:47
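
To illustrate the do { } while (0) point from the last comment, here is a minimal sketch. The macro `SWAP_INT` and the function `max_first()` are made up for this example. The `if` is deliberately left unbraced to show the case the idiom protects against: with a plain `{ ... }` body, the semicolon after the macro call would end the `if` statement and the following `else` would not compile.

```c
// Multi-statement macro wrapped so that "SWAP_INT(a, b);" is exactly one statement.
#define SWAP_INT(a, b)      \
    do {                    \
        int tmp_ = (a);     \
        (a) = (b);          \
        (b) = tmp_;         \
    } while (0)

// Puts the larger value in *x. Returns 1 if a swap happened, 0 otherwise.
int max_first(int *x, int *y) {
    if (*x < *y)
        SWAP_INT(*x, *y);  // the ';' completes the do/while, so 'else' still attaches
    else
        return 0;
    return 1;
}
```

The loop body runs exactly once, and compilers optimise the constant-false condition away, so there is no run-time cost.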
