I'm trying to use C++11's regex for a very simple filtering task, but I couldn't get it to work the way I want, so I started writing a separate demonstration program.

The thing is that the simplest things fail miserably. For example:

#include <regex>
#include <string>
#include <vector>
#include <iostream>

int main()
{
  std::vector<std::string> inputs;
  inputs.push_back("1");
  inputs.push_back("123");
  inputs.push_back("a");
  inputs.push_back("apple");
  inputs.push_back(":apple3.worm");

  std::string pattern("[0-9]");
  std::regex r(pattern, std::regex_constants::grep);

  for(auto const &s: inputs)
  {
    bool ok = std::regex_match(s, r);
    std::cout << (ok?"POS":"NEG") << ": " << s << std::endl;
  }
  return 0;
}

Compiled without warnings with g++ -Wextra -pedantic -std=c++11 -O3 rfail.cpp -o rfail. Output:

POS: 1
NEG: 123
POS: a
NEG: apple
NEG: :apple3.worm

The same happens when I replace [0-9] with [[:digit:]]. What is happening? What am I doing wrong?

Update:

$ g++ -v
Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/4.8/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 4.8.4-2ubuntu1~14.04.3' --with-bugurl=file:///usr/share/doc/gcc-4.8/README.Bugs --enable-languages=c,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.8 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.8 --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-gnu-unique-object --disable-libmudflap --enable-plugin --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-4.8-amd64/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-4.8-amd64 --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-4.8-amd64 --with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-objc-gc --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 4.8.4 (Ubuntu 4.8.4-2ubuntu1~14.04.3)
  • You should give us more information about your g++ Commented Jun 7, 2016 at 8:11

1 Answer

If you read the regex_match doc carefully, you'll notice that:

The entire target sequence must match the regular expression for this function to return true (i.e., without any additional characters before or after the match). For a function that returns true when the match is only part of the sequence, see regex_search.

Thus, if you want to check whether your string contains at least one digit, change your regex to .*[0-9].* or keep [0-9] and call regex_search instead of regex_match.
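To make the difference concrete, here is a minimal sketch using the pattern and grep syntax from the question. The helper names is_single_digit and contains_digit are made up for illustration, and the results in the note below assume a conforming <regex> implementation (not the one shipped with gcc 4.8).

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true only if the WHOLE string is one digit,
// because regex_match requires the entire sequence to match.
bool is_single_digit(const std::string& s) {
    static const std::regex r("[0-9]", std::regex_constants::grep);
    return std::regex_match(s, r);
}

// Hypothetical helper: true if a digit occurs ANYWHERE in the string,
// because regex_search accepts a match on any subsequence.
bool contains_digit(const std::string& s) {
    static const std::regex r("[0-9]", std::regex_constants::grep);
    return std::regex_search(s, r);
}
```

With the inputs from the question, is_single_digit should accept only "1", while contains_digit should accept "1", "123" and ":apple3.worm" and reject "a" and "apple".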


Note that I can't reproduce your output, mine is:

POS: 1
NEG: 123
NEG: a // <- here's the diff
NEG: apple
NEG: :apple3.worm

(compiled with Apple LLVM version 7.3.0 (clang-703.0.29))


Given your version of gcc (4.8.4), it seems that you're running a highly experimental implementation of <regex>; a working implementation was only included in gcc 4.9. More information about the bug here.

You should consider updating your compiler if you plan to use regex in your code.
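If you can't update right away, a small runtime self-test can at least flag the symptom from the question. This is only a sketch: regex_looks_sane is a hypothetical helper, and it checks just this one misbehaviour ([0-9] matching "a"), so passing it does not prove the implementation is correct in general.

```cpp
#include <regex>

// Hypothetical helper: returns false if <regex> shows the misbehaviour
// from the question or cannot even construct the pattern.
bool regex_looks_sane() {
    try {
        const std::regex digit("[0-9]", std::regex_constants::grep);
        return std::regex_match("1", digit)       // a single digit must match
            && !std::regex_match("a", digit)      // a letter must not match
            && !std::regex_match("123", digit)    // whole string is not one digit
            && std::regex_search(":apple3.worm", digit); // digit found inside
    } catch (const std::regex_error&) {
        // The pattern was rejected outright; treat that as unusable too.
        return false;
    }
}
```

You could call this once at startup and fall back to a hand-written check (or refuse to run) when it returns false.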

2 Comments

Why does [0-9] match a?
I get what you say about the full matching. My problem is the 3rd line in the output. I will update my post with the g++ version.
