1

I have just started learning C after coding for some while in Java and Python.

I was wondering how I could "validate" a string input (check whether it meets certain criteria), and I stumbled upon the sscanf() function.

I had the impression that it acts somewhat like regular expressions, but I couldn't work out how to build more complex patterns with it.

For example, let's say I have the following string:

char str[] = "Santa-monica 123";

I want to use sscanf() to check if the string has only letters, numbers and dashes in it.

Could someone please elaborate?

4
  • 2
    This is not a job for sscanf(). Use actual regular expressions instead. There are plenty of regex libraries available for this. Commented Apr 6, 2020 at 21:22
  • @RemyLebeau Hey Remy! Mind showing an example? I really tried digging into regexes in C, but I didn't quite get it. Commented Apr 6, 2020 at 21:25
  • 1
    "check if the string has only letters, numbers and dashes in it" --> "Santa-monica 123" has a space, so is it invalid? Commented Apr 7, 2020 at 1:51
  • 1
    Given "only letters, numbers and dashes in it.", is an empty string "" valid or not? Commented Apr 7, 2020 at 1:52

6 Answers

2

The fact that sscanf allows something that looks a bit like a character class by no means implies that it is anything at all like a regular expression library. In fact, Posix doesn't even require the scanf functions to accept character ranges inside character classes, although I suspect that it will work fine on any implementation you will run into.

But the scanning problem you have does not require regular expressions, either. All you need is a repeated character class match, and sscanf can certainly do that:

#include <stdbool.h>
#include <stdio.h>

bool check_string(const char* s) {
  int n = 0;
  sscanf(s, "%*[-a-zA-Z0-9]%n", &n);
  return s[n] == 0;
}

The idea behind that scanf format is that the first conversion matches and discards the longest initial sequence of valid characters. (The conversion fails if the first character is invalid, in which case %n is never reached and n keeps its initial value of 0. Thanks to @chux for pointing that out.) If it succeeds, %n then sets n to the current scan position, which is the offset of the next character. If that character is a NUL, all the characters were good. (This version returns OK for the empty string, since it contains no illegal characters. If you want the empty string to fail, change the return statement to return n && s[n] == 0;)
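To make the two return conditions concrete, here is a minimal sketch of both variants side by side. The function names check_string_allow_empty and check_string_nonempty are made up for this illustration, and the sketch assumes your C library accepts ranges such as a-z inside a scanset (implementation-defined, but common in practice).

```c
#include <stdbool.h>
#include <stdio.h>

/* Accepts the empty string: no character in it violates the rule. */
bool check_string_allow_empty(const char *s) {
  int n = 0; /* stays 0 if the scanset matches nothing */
  sscanf(s, "%*[-a-zA-Z0-9]%n", &n);
  return s[n] == '\0';
}

/* Rejects the empty string by also requiring at least one matched character. */
bool check_string_nonempty(const char *s) {
  int n = 0;
  sscanf(s, "%*[-a-zA-Z0-9]%n", &n);
  return n > 0 && s[n] == '\0';
}
```

With the string from the question, "Santa-monica 123", both functions should return false because the scan stops at the space. They differ only on "".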

You could also do this with the standard regex library (or any more sophisticated library, if you prefer, but the Posix library is usually available without additional work). This requires a little bit more code in order to compile the regular expression. For efficiency, the following attempts to compile the regex only once, but for simplicity I left out the synchronization to avoid data races during initialization, so don't use this in a multithreaded application.

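#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

bool check_string(const char* s) {
  static regex_t* re_ptr = NULL;
  static regex_t re;
  if (!re_ptr) {
    /* Only remember the regex if it compiled; otherwise regexec would read garbage. */
    if (regcomp(&re, "^[[:alnum:]-]*$", REG_EXTENDED) != 0) return false;
    re_ptr = &re;
  }
  return regexec(re_ptr, s, 0, NULL, 0) == 0;
}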
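If the shared static state is a concern, one alternative is to compile the pattern on every call and free it afterwards. This is only a sketch: the name check_string_regex is invented here, and it gives up the compile-once optimization in exchange for having no shared state to initialize. It also passes REG_NOSUB, since only a yes/no answer is needed.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stateless variant: compiles, matches, and frees on each call. */
bool check_string_regex(const char *s) {
  regex_t re;
  /* REG_NOSUB: we do not need match offsets, just success or failure. */
  if (regcomp(&re, "^[[:alnum:]-]*$", REG_EXTENDED | REG_NOSUB) != 0)
    return false; /* treat a compile failure as "not valid" */
  bool ok = regexec(&re, s, 0, NULL, 0) == 0;
  regfree(&re); /* release whatever regcomp allocated */
  return ok;
}
```

Recompiling each time is slower, so this probably only makes sense when the check is not on a hot path.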

1

I want to use sscanf() to check if the string has only letters, numbers and dashes in it.

A variation of @rici's good answer.

Create a scanset for letters, numbers and dashes.

//v              The * indicates to scan, but not save the result.
//  v            Dash (or minus sign), best to list first.
"%*[-0-9A-Za-z]"
//      ^^^^^^   Letters a-z, both cases
//   ^^^         Digits  

Use "%n" to detect how far the scan went.

Now we can determine whether

  1. Scanning stopped due to a null character (the whole string is valid)

  2. Scanning stopped due to an invalid character


int n = 0;
sscanf(str, "%*[-0-9A-Za-z]%n", &n);

bool success = (str[n] == '\0');
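Because n records where the scan stopped, the same call can also report which character was the offending one. A minimal sketch, with first_invalid_index being a name invented for this example:

```c
#include <stdio.h>

/* Hypothetical helper: returns the index of the first character that is not
   a letter, digit, or dash, or -1 if the whole string is valid. */
int first_invalid_index(const char *str) {
  int n = 0; /* remains 0 when the very first character is invalid */
  sscanf(str, "%*[-0-9A-Za-z]%n", &n);
  return str[n] == '\0' ? -1 : n;
}
```

For "Santa-monica 123" this should give 12, the position of the space.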


0

sscanf does not have this functionality. The argument you are referring to is a format specifier and is not used for validation. See here: https://www.tutorialspoint.com/c_standard_library/c_function_sscanf.htm


0

As already mentioned, sscanf is meant for a different job. For more information, see this link. You can loop over the string and use isalpha and isdigit to check whether each character is a letter or a digit.

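    /* needs <ctype.h> for isalpha/isdigit and <stdio.h> for printf */
    char str[] = "Santa-monica 123";
    for (int i = 0; str[i] != '\0'; i++)
    {
        unsigned char c = (unsigned char)str[i]; /* negative char values are undefined for isalpha */
        if (!isalpha(c) && !isdigit(c) && c != '-')
            printf("wrong character %c\n", c); /* this will be printed for spaces too */
    }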


0

I want to ... check if the string has only letters, numbers and dashes in it.

In C that's traditionally done with isalnum(3) and friends.

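#include <ctype.h>
#include <stdbool.h>

bool valid( const char str[] ) {
  /* stop at the terminator instead of calling strlen() on every iteration */
  for( const char *p = str; *p != '\0'; p++ ) {
    /* cast: passing a negative char to isalnum() is undefined behavior */
    if( ! (isalnum((unsigned char)*p) || *p == '-') )
      return false;
  }
  return true;
}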
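Another standard-library option, not mentioned above, is strspn(3), which returns the length of the initial segment made up only of characters from a given set. The sketch below swaps the isalnum loop for strspn; the name valid_strspn and the explicit ASCII alphabet are choices made for this example, so unlike isalnum it ignores the locale.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical variant: valid if the allowed-character prefix spans the whole string. */
bool valid_strspn(const char *str) {
  static const char allowed[] =
      "abcdefghijklmnopqrstuvwxyz"
      "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
      "0123456789-";
  /* strspn stops at the first character not in allowed (or at the NUL). */
  return str[strspn(str, allowed)] == '\0';
}
```

Like the loop version, this treats the empty string as valid.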

You can also use your friendly neighborhood regex(3), but you'll find that requires a surprising amount of code for a simple scan.

2 Comments

How many lines do you consider surprising? An include, a declaration and two lines of code? coliru.stacked-crooked.com/a/3f2127b87da802a7
@rici, fair point, well done. I was thinking of what's needed to pull out matched substrings, but of course that's not the OP's question.
-1

After retrieving the value with sscanf(), you can use a regular expression to validate it.

Please see Regular Expression in C.
