I need to replace a substring with another string in a file.

The file contains the following line:

Input: #pragma MESSAGE "Hello World" 0000=1 /* Clarification 0001: [[Specific Clarification]] */

Expected Output: #pragma MESSAGE "Hello World" 0000=1 # [[Specific Clarification]]

Below is my code:

import re
line = r'#pragma MESSAGE "Hello World" 0000=1 /* Clarification 0001: [[Specific Clarification]] */'
comment = re.search(r'([\/\*]+).*([^\*\/]+)', line)
replace = re.search(r'([\[[]).*([\]]])', comment.group())
replaceWith = replace.group()
content_new = re.sub(r'([\/\*]).*([\*\/])', '# ' + replaceWith, line)

Is there a simpler or more robust way to do this than the code above?

  • What is content? What is the problem with the regex? If you need to replace, why do you use re.search, and even twice? Commented Jul 27, 2020 at 17:23
  • I've edited the question to use line instead of content. Is there an alternative way to approach this problem with a single regex statement? Commented Jul 27, 2020 at 17:31
  • I am unable to do the search for replaceWith in a single regex statement. Commented Jul 27, 2020 at 17:39
  • content_new = re.sub(r'/\*(.*?)\*/', lambda x: re.sub(r'.*(\[\[.*?]]).*', r'# \1', x.group(1)), line), see demo. Commented Jul 27, 2020 at 17:39
  • Two more ways: regex101.com/r/RbPmfB/1 and regex101.com/r/RbPmfB/2 Commented Jul 27, 2020 at 17:48

1 Answer

You need to match the comments first, for example with the pattern from "Regex to match a C-style multiline comment", and then replace each matched comment with the [[...]] substring found inside it. This approach is the safest: it won't fail if a comment contains [[ but no ]], or if there are several such comments in the string.

The sample code snippet will look like

import re
line = r'#pragma MESSAGE "Hello World" 0000=1 /* Clarification 0001: [[Specific Clarification]] */'
content_new = re.sub(r'/\*[^*]*\*+(?:[^/*][^*]*\*+)*/', lambda x: re.sub(r'.*(\[\[.*?]]).*', r'# \1', x.group()), line)
print(content_new)

Output: #pragma MESSAGE "Hello World" 0000=1 # [[Specific Clarification]]

Details:

  • re.sub(r'/\*[^*]*\*+(?:[^/*][^*]*\*+)*/', ..., line) finds all C-style comments in the string, and
  • lambda x: re.sub(r'.*(\[\[.*?]]).*', r'# \1', x.group()) builds the replacement: x is the match object for one comment. The inner pattern matches any text up to the last [[...]] pair (the leading .* is greedy), captures [[ plus any text up to and including the nearest ]], and then matches the rest of the comment. The whole comment is replaced with #, a space and the Group 1 value. If the comment contains no [[...]] pair, the inner re.sub matches nothing and the comment is returned unchanged.
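
To make the two steps easier to follow and to check the edge cases mentioned above, here is a minimal sketch that gives the lambda a name and runs it on a few inputs. The names COMMENT_RE, shorten_comment, shorten_comments and naive_single_pattern are made up for this illustration, and the single-pattern variant is only a contrast case, not something taken from the answer.

```python
import re

# C-style comment pattern, copied from the answer.
COMMENT_RE = re.compile(r'/\*[^*]*\*+(?:[^/*][^*]*\*+)*/')


def shorten_comment(match):
    """Replacement callback: same inner re.sub as the lambda in the answer."""
    # If the comment has no [[...]] pair, re.sub returns it unchanged.
    return re.sub(r'.*(\[\[.*?]]).*', r'# \1', match.group())


def shorten_comments(text):
    """Apply the two-step replacement to every comment in text."""
    return COMMENT_RE.sub(shorten_comment, text)


def naive_single_pattern(text):
    """Hypothetical one-regex variant, shown only for contrast."""
    return re.sub(r'/\*.*?(\[\[.*?]]).*?\*/', r'# \1', text)


line = r'#pragma MESSAGE "Hello World" 0000=1 /* Clarification 0001: [[Specific Clarification]] */'
print(shorten_comments(line))

# Several comments in one string are handled independently.
print(shorten_comments('a /* x [[one]] */ b /* y [[two]] */'))

# A comment with [[ but no ]] is left as it is.
print(shorten_comments('a /* [[broken */ b'))

# A comment without [[...]] followed by one with it:
mixed = 'a /* plain */ b /* y [[two]] */'
print(shorten_comments(mixed))      # first comment is kept
print(naive_single_pattern(mixed))  # first comment and ' b ' are swallowed
```

On the last input the single pattern starts at the first /* and keeps scanning past its closing */ until it finds a [[, so it consumes text outside the comment. Matching each comment on its own first is what prevents that.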