2

I need a strtol-like function that lets me specify the bounds within which the string is parsed. For example:

char * number = "123456789";
std::cout << my_strtol(number, 1, 3);

Should print "234".


Using some pointer arithmetic I could get pretty close:

int32_t my_strtol(const char* str, int from_char, int to_char, int base = 10)
{
  char *end;
  auto res = strtol(str + from_char, &end, base);
  auto extra_digits = end - str - (to_char + 1);
  if(extra_digits > 0) res /= pow(10, extra_digits); 
  return res;
}

However, this fails if the number in the string exceeds LONG_MAX, regardless of the value of the requested part. For example, with the input "1234567890123456789" the call my_strtol(number, 1, 3) fails and returns 0. It also calls pow unnecessarily, and performance is crucial here.
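The zero result is consistent with how strtol reports overflow: when the digits do not fit in a long, it returns LONG_MAX (or LONG_MIN) and sets errno to ERANGE, and the later division then shrinks that clamped value. Whether the example input overflows depends on the width of long on the platform (it does where long is 32 bits). A minimal sketch to check for this condition; the helper name parse_overflows is made up for illustration:

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>

// Hypothetical helper: reports whether strtol overflowed on the given text.
bool parse_overflows(const char* text, int base = 10)
{
    errno = 0;  // strtol only sets errno on failure, so clear it first
    const long value = std::strtol(text, nullptr, base);
    // On overflow strtol clamps the result and sets errno to ERANGE.
    return errno == ERANGE && (value == LONG_MAX || value == LONG_MIN);
}
```

Calling parse_overflows on a digit string far longer than any long can hold returns true, while a short string such as "234" returns false.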

Edit: I would like to avoid constructing new strings or string streams, because I will be extracting multiple integers from the same buffer and, even more importantly, I will be performing literally tens of thousands of these calls.


For those interested, here are (casual) profiling results for the suggested solutions:

//put a temporary null character just past the end of the range (to_char is inclusive)
int32_t my_strtol1(char* str, int from_char, int to_char, int base = 10)
{
    char tmp = str[to_char + 1];
    str[to_char + 1] = '\0';
    auto res = strtol(str + from_char, nullptr, base);
    str[to_char + 1] = tmp;  //restore the original character
    return res;
}

//use substr()
int32_t my_strtol2(const std::string& str,
    const std::string::size_type from_char,
    const std::string::size_type to_char,
    int base = 10) {
    return std::stol(str.substr(from_char, to_char - from_char + 1), nullptr, base);
}

//using boost
int32_t my_strtol3(char* str, int from_char, int to_char) {
    return boost::lexical_cast<int>(str + from_char, to_char - from_char + 1);
}

//parse characters one by one
int32_t my_strtol4(const char* str, int from_char, int to_char)
{
    int32_t res = 0;
    for (int i = from_char; i <= to_char; i++)  //to_char is inclusive
    {
        char ch = str[i];
        ch -= '0';
        if (ch > 9 || ch < 0) return 0;  //not a decimal digit
        res *= 10;
        res += ch;
    }
    return res;
}

The output (measured by clock_t) on my machine was:

Manipulating null character with 100000 iterations took 0.114s
Using substr() with 100000 iterations took 0.62s
Using boost::lexical_cast<T>() with 100000 iterations took 0.231s
Parsing character one by one with 100000 iterations took 0.083s
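
A harness along the following lines would produce timings of this kind. It is only a sketch: the name time_parser, its parameters and the checksum are assumptions for illustration, not the code actually used for the numbers above.

```cpp
#include <cstdint>
#include <ctime>

// Hypothetical harness: calls parse(buffer, from_char, to_char) `iterations`
// times and returns the elapsed CPU time in seconds as reported by std::clock.
template <typename Parser>
double time_parser(Parser parse, char* buffer, int from_char, int to_char,
                   int iterations, std::int64_t& checksum)
{
    checksum = 0;
    const std::clock_t start = std::clock();
    for (int i = 0; i < iterations; ++i)
    {
        // Accumulate the results so the calls are not trivially discarded.
        checksum += parse(buffer, from_char, to_char);
    }
    const std::clock_t stop = std::clock();
    return static_cast<double>(stop - start) / CLOCKS_PER_SEC;
}
```

Comparing the checksum across the different parsers is a cheap way to confirm that they all return the same value before comparing their timings.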
  • std::stol is the C++11 version of C's strtol. Commented Nov 23, 2015 at 23:03
  • Have you tried debugging it? What steps did you take? What do you expect the call to strtol to do? Hint: It's going to try to convert from the from_char until it sees a null terminator. Try std::string instead of char *, and use std::stol on a .substr(). Commented Nov 23, 2015 at 23:05
  • Actually, it is important whether it is null-terminated or not. strtol will convert until it reaches a non-digit character or a null terminator. This is why you're getting issues with values greater than LONG_MAX. Commented Nov 23, 2015 at 23:07
  • Something like std::string numstr = number; return std::stol(numstr.substr(from_char, to_char-from_char)); Commented Nov 23, 2015 at 23:09
  • @wondra I've written up three approaches: coliru.stacked-crooked.com/a/e551aca357385ebc are any of these suitable? Commented Nov 23, 2015 at 23:31

2 Answers

4

So, if you don't need super portability (support for Thai, Chinese and Arabic numerals, for example):

int32_t my_strtol(const char* str, int from_char, int to_char, int base = 10)
{
   int32_t res = 0; 
   for(int i = from_char; i <= to_char; i++)   /* to_char is inclusive */
   {
      char ch = str[i];
      if (ch > '9' && base > 10)
      {
         ch &= ~32;        /* Make it upper case */
         ch -= 'A' - 10;   /* 'A' maps to 10, 'B' to 11, ... */
      }
      else 
      {
          ch -= '0';
      }
      if (ch >= base || ch < 0) return 0;   /* placeholder: ... do some error handling ... */
      res *= base;
      res += ch;
   }
   return res;
}

3 Comments

Sorry for the misleading example and for not mentioning it: I expect only ASCII string input in base 10. I only made my tested example more generic because it did not complicate the code.
@wondra It is easy enough to adapt Mats' solution. Simply multiply by 10 instead of by base, and remove the handling for non-digit characters. You should still do some range checking on the string indices, of course.
The only thing that could be added is a data type for res
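
The adaptation suggested in these comments (base 10 only, with range checking on the indices) might look roughly like the sketch below. The function name parse_range_base10, the extra length parameter and the bool return value are assumptions added for illustration, not part of the answer, and the overflow check goes beyond what the comments ask for.

```cpp
#include <cstddef>
#include <cstdint>

// Sketch: parses the decimal digits in str[from_char..to_char] (inclusive).
// `length` is the number of valid characters in `str` (an added parameter).
// Returns false if the indices are out of range, a non-digit is found,
// or the value does not fit in an int32_t.
bool parse_range_base10(const char* str, std::size_t length,
                        std::size_t from_char, std::size_t to_char,
                        std::int32_t& out)
{
    if (from_char > to_char || to_char >= length) return false;

    std::int32_t res = 0;
    for (std::size_t i = from_char; i <= to_char; ++i)
    {
        const int digit = str[i] - '0';
        if (digit < 0 || digit > 9) return false;            // not a digit
        if (res > (INT32_MAX - digit) / 10) return false;    // would overflow
        res = res * 10 + digit;
    }
    out = res;
    return true;
}
```

With the buffer "123456789" (length 9), from_char 1 and to_char 3, this stores 234 in out. The digits outside the requested range are never read, so a long buffer such as "1234567890123456789" gives the same result.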
3

If you have Boost, it can be done as a one-liner using boost::lexical_cast, which has an overload that takes a char pointer and a length as arguments. As far as I know, this is optimized (specialized) to operate directly on the input string, with no copies or memory allocations performed and no streams involved.

#include <iostream>
#include <boost/lexical_cast.hpp>

int
main()
{
  const auto text = "123456789";
  const auto number = boost::lexical_cast<int>(text + 1, 3);
  std::cout << number << '\n';
}

Output:

234

I originally learned this from another answer to a very similar question some time ago, but I can't find it again, so I'll replicate it here.
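
For readers without Boost, a comparable pointer-plus-length parse can be sketched with std::from_chars from C++17's <charconv>. This is a different technique from the lexical_cast call above and was not part of the original answer; the wrapper name parse_with_from_chars is made up for illustration.

```cpp
#include <charconv>
#include <cstddef>
#include <system_error>

// Sketch: parses `length` characters starting at `text` as a base-10 int.
// Returns false on an empty range, on overflow, or if any character in the
// range is not part of the number.
bool parse_with_from_chars(const char* text, std::size_t length, int& out)
{
    const char* last = text + length;
    const std::from_chars_result result = std::from_chars(text, last, out);
    // Require that the whole range was consumed, not just a leading prefix.
    return result.ec == std::errc() && result.ptr == last;
}
```

Called as parse_with_from_chars(text + 1, 3, number) on "123456789", this stores 234 in number. std::from_chars works directly on the given character range and reports errors through its return value instead of throwing.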

1 Comment

Profiling shows this solution is elegant, safe and reasonably fast. I am picking it as the solution because of its simplicity and all-around results.
