
I want to operate only on text that is NOT inside a C++ comment. Here is the pattern I use to find C++ comments:

pattern = re.compile(r'//.*?$|/\*.*?\*/|\'(?:\\.|[^\\\'])*\'|"(?:\\.|[^\\"])*"', re.DOTALL | re.MULTILINE)

However, I don't know how to make it work the way I intend.

# Python 3.4.2
s = '''
/****
C++ comments
  //pResMgr->CreateDialogEx();
****/
//pResMgr->CreateDialogEx();
/*//pResMgr->CreateDialogEx();*/

// real code, I want to replace only this following line of code
pResMgr->CreateDialogEx();
'''

newS = s.replace('CreateDialogEx', 'Create')
print(newS)

My expected output is:

/****
C++ comments
  //pResMgr->CreateDialogEx();
****/
//pResMgr->CreateDialogEx();
/*//pResMgr->CreateDialogEx();*/

// real code, I want to replace only this following line of code
pResMgr->Create();
  • I would not use regex if I were you, I think you'd better iterate over your string removing what is after // until new line and what is after /* until */... and THEN apply regex... Commented Jul 13, 2015 at 7:31
  • @JuniusRendel , how about pResMgr->CreateDialogEx(); // pResMgr->CreateDialogEx();? Commented Jul 13, 2015 at 7:43
  • I don't understand what you mean... Commented Jul 13, 2015 at 7:58
  • @JuniusRendel, consider pResMgr->CreateDialogEx(); // pResMgr->CreateDialogEx(). Following your advice I can delete the comments first, that's true, but my result must still contain the original, unchanged comments. At what point would the deleted text be added back? Commented Jul 13, 2015 at 8:00
  • Possible duplicate with stackoverflow.com/questions/16720541/…. Commented Jul 13, 2015 at 8:08

1 Answer


I didn't test it thoroughly, but it works for your case and should work in general. It walks through the text looking for //, /* and newlines, and handles each case. Really simple, no regex.

source_code = '''//pResMgr//->CreateDialogEx();'''

def indexOf(string, token, start=0):
    # Index of token at or after start, or len(string) if it does not occur.
    pos = string.find(token, start)
    return pos if pos != -1 else len(string)

def replaceNotInComments(string, searchFor, replaceWith):
    # Note: string literals are not recognised, so '//' inside "..." is
    # still treated as the start of a comment.
    result = ''
    while string:
        blockStart = indexOf(string, '/*')
        lineStart = indexOf(string, '//')
        nextBreak = min(blockStart, lineStart)

        # Everything before the next comment is code: replace there.
        result += string[:nextBreak].replace(searchFor, replaceWith)
        string = string[nextBreak:]
        if not string:
            break

        if blockStart < lineStart:
            # Block comment: copy verbatim up to and including '*/'.
            end = min(indexOf(string, '*/', 2) + 2, len(string))
        else:
            # Line comment: copy verbatim up to (not including) the newline.
            end = indexOf(string, '\n')
        result += string[:end]
        string = string[end:]

    return result

result = replaceNotInComments(source_code, 'CreateDialogEx', 'Create')
print(result)
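
As an alternative to the hand-written scan above, the pattern from the question can probably be reused directly. The sketch below is only one possible approach, not something proposed in this thread: it wraps the question's pattern in a capture group, adds the search term as a second alternative, and lets a callback decide what to return. The names SKIP, replace_outside_comments and substitute are made up for illustration. Because the pattern also matches character and string literals, those are left untouched as well.

```python
import re

# The pattern from the question: line comments, block comments,
# character literals and string literals.
SKIP = r'//.*?$|/\*.*?\*/|\'(?:\\.|[^\\\'])*\'|"(?:\\.|[^\\"])*"'

def replace_outside_comments(text, search_for, replace_with):
    # Hypothetical helper: comments and literals are tried first at each
    # position and returned unchanged; the search term is only replaced
    # when it is found outside of them.
    pattern = re.compile('(' + SKIP + ')|' + re.escape(search_for),
                         re.DOTALL | re.MULTILINE)

    def substitute(match):
        if match.group(1) is not None:
            return match.group(1)  # comment or literal: keep verbatim
        return replace_with        # search term in real code

    return pattern.sub(substitute, text)

line = 'pResMgr->CreateDialogEx(); // pResMgr->CreateDialogEx();'
print(replace_outside_comments(line, 'CreateDialogEx', 'Create'))
# prints: pResMgr->Create(); // pResMgr->CreateDialogEx();
```

Since nothing is ever deleted from the text, there is no need to put comments back afterwards, and escape sequences such as \r\n inside string literals pass through unchanged.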

6 Comments

your result changes '/' to '\'! here is your result: \ **** C++ comments //pResMgr->CreateDialogEx(); ****/ pResMgr->CreateDialogEx(); \ *//pResMgr->CreateDialogEx();*/ // real code, I want to replace only this following line of code pResMgr->Create();
oh, that was a typo. Works now?
I found a serious issue, if source_code = '''//pResMgr->CreateDialogEx();''', your code still changes source_code
fixed again. btw, you can try to debug the code yourself
ouch, I found another bug, suppose input: source_code = 'cout << "hello, money\r\n, new line;"', then output lost '\r\n', it becomes 'cout << "hello, money , new line;"', how can I change your code directly?