
I have to extract, from a large number of C++ files, all the macro names tested in preprocessor conditionals, excluding the include guards. For example:

#if <NAME>
#ifdef <NAME>
#ifndef <NAME>
#if defined(<NAME>)
#if defined <NAME>
!defined(<NAME>)
!defined <NAME>
#else if <NAME>
#elif <NAME>

I have to retrieve all the NAMEs, but they do not all follow a single form such as XXX: different programmers have worked on the project, and there are a lot of definitions. I am having trouble defining a regex that extracts only <NAME> in each of the situations just described.

Any advice is appreciated!

EDIT: As someone pointed out, my NAME (with the surrounding angle brackets <>) is only a placeholder. In reality it can be XXXX, XX_Y, _XXX, _XXX_Y, or XXYY, where X and Y can be uppercase letters or digits, with no regularity in the name. They are preprocessor directives and I have to extract them.

  • This is a job for grep or awk, depending on what you need to do with the lines that you find. Commented Feb 9, 2017 at 16:21
  • GCC has options to print (list) the macros in effect at the end of a TU (translation unit). If you need to modify the files to eliminate or retain blocks of code, there are tools for that, too. If you just need to find the macro names, then you probably have to worry about conditions such as #if defined(A) || defined(B) || (defined(C) && defined D), which likely need to list all of A, B, C, and D. You also need to worry about conditions that are continued over multiple lines, with a backslash at the end of every line but the last. Commented Aug 23, 2022 at 15:39
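
To illustrate the two complications mentioned in the comment above (several names in one condition, and conditions continued with backslashes), here is a minimal Python sketch. The function names `join_continuations` and `macro_names` are made up for this example. It makes simplifying assumptions: every identifier in a condition other than `defined` is treated as a macro name, the `#else if` spelling from the question is accepted, and include guards are not recognised or excluded.

```python
import re

# Conditional directives whose condition may mention macro names.
# "else if" is included only because the question lists it.
DIRECTIVE = re.compile(
    r'^[ \t]*#[ \t]*(ifdef|ifndef|if|elif|else[ \t]+if)\b(.*)$'
)
# An identifier that does not start in the middle of a number such as 10L.
IDENT = re.compile(r'\b[A-Za-z_]\w*')
# Words that appear in conditions but are not macro names (assumption).
NOT_MACROS = {"defined"}


def join_continuations(text):
    """Merge every line ending in a backslash with the line after it."""
    return re.sub(r'\\\r?\n', ' ', text)


def macro_names(text):
    """Return the set of identifiers used in conditional directives."""
    names = set()
    for line in join_continuations(text).splitlines():
        m = DIRECTIVE.match(line)
        if not m:
            continue
        # Drop comments that start and end on the directive line.
        condition = re.sub(r'//.*|/\*.*?\*/', ' ', m.group(2))
        for ident in IDENT.findall(condition):
            if ident not in NOT_MACROS:
                names.add(ident)
    return names
```

Calling `macro_names(open(path).read())` on each file and taking the union of the results would give one combined list. Filtering out guards would need an extra step, for example dropping names that only appear in a `#ifndef` immediately followed by a `#define` of the same name.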

1 Answer


I quickly tested this on http://regexr.com with the examples you provided. It matches most of the cases.

You might have to refine it a little.

([#!][A-z]{2,}[\s]{1,}?([A-z]{2,}[\s]{1,}?)?)([\\(]?[^\s\\)]{1,}[\\)]?)?

Quick explanation:

([#!][A-z]{2,}[\s]{1,}?([A-z]{2,}[\s]{1,}?)?)

Matches (most) strings that begin with a '#' or '!' followed by a directive. An optional second word is also allowed. Each word must be followed by whitespace: [\s]{1,}? is lazy, but it still requires at least one whitespace character.

([\(]?[^\s\)]{1,}[\)]?)?

Will match both bracketed and non-bracketed strings. Will not match if there is whitespace inside the brackets.

If you want to match whitespace inside the brackets, change ^\s\) to ^\)

Update: Some of the backslashes weren't displayed in the answer. Reserved characters such as []{}() must be escaped. I have fixed the answer, but I might have missed one or two; sorry if so.

Update 05.03.2020: @gregn3 has provided an updated version in the comments, which allows whitespace between the # and the following word.

([#!][ \t]*[A-z]{2,}[\s]{1,}?([A-z]{2,}[\s]{1,}?)?)([\\(]?[^\s\\)]{1,}[\\)]?)?
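
For anyone who wants to try the updated pattern outside regexr, here is a small Python sketch that applies it line by line. The helper name `candidate_name` is made up for this example, and reading the name from the third capture group is my interpretation of the pattern, not something the answer states. Note also that the range `[A-z]` covers the characters `[`, `\`, `]`, `^`, `_` and the backtick in addition to letters, which is why names with a leading underscore get through.

```python
import re

# The updated pattern from the answer, copied verbatim and split over
# two string literals only for readability.
PATTERN = re.compile(
    r'([#!][ \t]*[A-z]{2,}[\s]{1,}?([A-z]{2,}[\s]{1,}?)?)'
    r'([\\(]?[^\s\\)]{1,}[\\)]?)?'
)


def candidate_name(line):
    """Return capture group 3 of the first match in `line`, or None."""
    # Strip the line first so that a trailing newline cannot be
    # consumed by one of the whitespace classes in the pattern.
    m = PATTERN.search(line.strip())
    return m.group(3) if m else None


samples = ["#ifdef FOO", "#  ifndef _XXX_Y", "#if defined FOO", "#if defined(FOO)"]
for sample in samples:
    print(repr(sample), "->", repr(candidate_name(sample)))
```

With these samples the third group is the bare name for the unbracketed forms, but for `#if defined(FOO)` it comes back as `defined(FOO)`, so the bracketed form still needs a clean-up step afterwards.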

1 Comment

Here are more examples of preprocessor directives syntax. There can be spaces and tabs, but nothing else, between the # and the define. This is an updated version of your regex: ([#!][ \t]*[A-z]{2,}[\s]{1,}?([A-z]{2,}[\s]{1,}?)?)([\\(]?[^\s\\)]{1,}[\\)]?)? (added [ \t]* after the initial [#!])
