How can I extract someXml?

frame 0
    push 'this'
    getVariable
    push 'g_data_1343488'
    push ' 

    someXml'

    setMember
end // of frame 0

I'm trying to use a regex, but without success so far:

foreach (var match in Regex.Matches(file, @"(?<=push ').*(?=')"))

The problem with this one: I don't want values such as 'g_data_1343488' or 'this' to be captured.

  • So you want the text between the last 'push' and 'setMember'? (Commented Nov 15, 2012 at 11:53)
  • Yes! I want to grab someXml (and not someXml'). (Commented Nov 15, 2012 at 11:54)

2 Answers

Here is one possibility: a regex that tries to recognize the contents between the single quotes as XML. It is not a perfect XML matcher, so whether it is good enough depends on your requirements. The more accurate the regex has to be, the harder it becomes to read. As written, this expression will miss some valid XML and will accept some invalid XML.

For example, it will match tags whose names start with a digit, and it will match closing tags that carry attributes. You can tweak it depending on your needs.

Here it is:

push\s+'\s*<(\w+)(?:\s+\w+=(?:"[^"]*"|'[^']*'))*>(?:[^<]+|(?!</\1>)</?\w+(?:\s+\w+=(?:"[^"]*"|'[^']*'))*\s*/?>)*</\1>\s*'

Here is a breakdown of the expression. The start of the push statement:

push\s+'\s*

Detect the root XML tag and capture its name. Allow for attributes delimited by either single or double quotes:

<(\w+)(?:\s+\w+=(?:"[^"]*"|'[^']*'))*>

Loop through all the inner tags and text nodes inside the root tag, stopping at the root's closing tag. Again allow for attributes delimited by either single or double quotes:

(?:[^<]+|(?!</\1>)</?\w+(?:\s+\w+=(?:"[^"]*"|'[^']*'))*\s*/?>)*

Match the closing root tag, followed by the closing single quote of the push statement:

</\1>\s*'
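
Putting the pieces together, here is a minimal C# sketch of how the expression might be applied. The class name PushXmlExtractor, the method Extract, and the named groups xml and tag are my own additions for illustration (the original pattern uses the numbered group \1); the xml group wraps the part between the quotes so the value can be read without the surrounding push '...' text.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class PushXmlExtractor
{
    // Zero or more attributes, each delimited by double or single quotes.
    const string Attributes = @"(?:\s+\w+=(?:""[^""]*""|'[^']*'))*";

    // Same structure as the expression above, with two named groups added:
    // "tag" replaces the numbered group \1, "xml" wraps the whole XML fragment.
    static readonly Regex PushXml = new Regex(
        @"push\s+'\s*(?<xml><(?<tag>\w+)" + Attributes + @">" +
        @"(?:[^<]+|(?!</\k<tag>>)</?\w+" + Attributes + @"\s*/?>)*" +
        @"</\k<tag>>)\s*'");

    // Returns the XML fragment of every push statement that the pattern accepts.
    public static List<string> Extract(string text)
    {
        var results = new List<string>();
        foreach (Match match in PushXml.Matches(text))
        {
            results.Add(match.Groups["xml"].Value);
        }
        return results;
    }
}
```

Calling PushXmlExtractor.Extract(file) on the disassembly should skip push 'this' and push 'g_data_1343488', because their values do not start with a tag.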

You could also try simply capturing the push commands and run their values through a function like in this solution: How to check for valid xml in string input before calling .LoadXml()
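
A rough sketch of that second approach, under a few assumptions: the names PushValueFilter, ExtractXmlValues, and IsWellFormedXml are hypothetical, the simple push pattern assumes the pushed value contains no single quotes (so XML with single-quoted attributes would be cut short), and XDocument.Parse is used here as the well-formedness check rather than whatever the linked answer uses.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;
using System.Xml;
using System.Xml.Linq;

public static class PushValueFilter
{
    // Captures whatever sits between the quotes of a push statement.
    // Assumption: the value itself contains no single quote.
    static readonly Regex PushValue = new Regex(@"push\s+'(?<value>[^']*)'");

    // Returns only those push values that parse as well-formed XML.
    public static List<string> ExtractXmlValues(string text)
    {
        var results = new List<string>();
        foreach (Match match in PushValue.Matches(text))
        {
            // Trim the whitespace that surrounds the value in the disassembly.
            string candidate = match.Groups["value"].Value.Trim();
            if (IsWellFormedXml(candidate))
            {
                results.Add(candidate);
            }
        }
        return results;
    }

    static bool IsWellFormedXml(string candidate)
    {
        try
        {
            XDocument.Parse(candidate);
            return true;
        }
        catch (XmlException)
        {
            // Plain identifiers such as "this" are rejected here.
            return false;
        }
    }
}
```

This keeps the regex trivial and leaves the question of what counts as XML to the parser, at the cost of one parse attempt per push statement.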


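// Singleline lets '.' match newlines, so the greedy "frame.*push '" runs up to the last push in the text.
var allMatches = Regex.Matches(text, @"(frame.*push ')(.*?)(?='.*end)", RegexOptions.Singleline);

foreach (Match match in allMatches)
{
    // Group 2 holds the text between that push ' and the next single quote; Trim drops the surrounding whitespace.
    string someXml = match.Groups[2].Value.Trim();
}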

1 Comment

P.S. If the first group is moved into a lookbehind (?<=...), the greedy .* will not work.
