I have an XML response with a nested structure (e.g. groups > subgroups > child records) that repeats several thousand times, with different values each time.

I want to grab only the subgroups whose 9-digit serial number field matches one I provide and extract them into their own file. Each extracted subgroup should bring its parent group with it. I was hoping this would be possible in Notepad++, possibly with regex, but I'm not sure how to go about it.

    No. Use an XML parser and XPath. Notepad++ with regex is not the right tool for this task. Commented Feb 19, 2013 at 16:01
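
A minimal sketch of the parser-based approach suggested in the comment above, using Python's standard `xml.etree.ElementTree`. The element names (`groups`, `group`, `subgroup`, `serial`), the `id` attribute, and the sample data are assumptions made for illustration; the real response will likely use different names.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; tag names and values are assumed, not taken from the question.
SAMPLE = """<groups>
  <group id="A">
    <subgroup><serial>123456789</serial><record>r1</record></subgroup>
    <subgroup><serial>111111111</serial><record>r2</record></subgroup>
  </group>
  <group id="B">
    <subgroup><serial>222222222</serial><record>r3</record></subgroup>
  </group>
</groups>"""


def extract_subgroups(xml_text, wanted_serials):
    """Build a new tree holding only matching subgroups under a copy of their group."""
    root = ET.fromstring(xml_text)
    out_root = ET.Element(root.tag)
    for group in root.findall("group"):
        # Keep subgroups whose <serial> text is one of the wanted serial numbers.
        matches = [sg for sg in group.findall("subgroup")
                   if sg.findtext("serial") in wanted_serials]
        if matches:
            # Recreate the parent group (tag and attributes) only when it has a match.
            out_group = ET.SubElement(out_root, group.tag, group.attrib)
            out_group.extend(matches)
    return out_root


result = extract_subgroups(SAMPLE, {"123456789"})
output_xml = ET.tostring(result, encoding="unicode")
```

To get the separate file, something like `ET.ElementTree(result).write("matches.xml")` could follow, where `matches.xml` is a placeholder name.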

1 Answer

I'm not familiar with Notepad++ specifically. For my solution to work, it would need to support multi-line regular expressions and advanced regular expression syntax (non-greedy matching). Not every text editor that supports regular expressions supports both.

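I would start and end with the outer tag that you want: <subgroup></subgroup>. The characters <, > and / need no escaping in most regex flavors, and in some flavors \< and \> mean word boundaries instead, so leave them unescaped. To capture what is in between, I would use non-greedy matching: <subgroup>.*?</subgroup>. The dot must be unescaped (\. matches only a literal period) and must be allowed to match newlines. Then I would add the specific serial number you are interested in: <subgroup>.*?123456789.*?</subgroup>. Be aware that the first .*? can run past a </subgroup> to reach a serial number in a later subgroup, so the match may begin at an earlier subgroup; if the editor supports lookahead, <subgroup>(?:(?!</subgroup>).)*?123456789.*?</subgroup> keeps the match inside a single subgroup.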

If you want to match any of several serial numbers, use something like

<subgroup>.*?(123456789|987654321|678912345).*?</subgroup>
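
As a rough check of how the alternation pattern behaves, here is a small Python sketch. Python's `re` module stands in for the editor's regex engine, which is an assumption, and the sample XML and `serial` tag name are made up. It compares the plain non-greedy pattern with a variant that uses a lookahead to stop the match from crossing a closing `</subgroup>` tag.

```python
import re

# Hypothetical sample: only the second subgroup holds a wanted serial number.
SAMPLE = """<group>
<subgroup>
  <serial>111111111</serial>
</subgroup>
<subgroup>
  <serial>123456789</serial>
</subgroup>
</group>"""

SERIALS = "(123456789|987654321|678912345)"

# Plain non-greedy version; re.DOTALL lets the dot match newlines.
naive = re.compile("<subgroup>.*?" + SERIALS + ".*?</subgroup>", re.DOTALL)

# Guarded version: (?:(?!</subgroup>).)*? consumes one character at a time,
# but never one that begins a closing </subgroup> tag.
strict = re.compile(
    "<subgroup>(?:(?!</subgroup>).)*?" + SERIALS + ".*?</subgroup>", re.DOTALL
)

naive_hits = [m.group(0) for m in naive.finditer(SAMPLE)]
strict_hits = [m.group(0) for m in strict.finditer(SAMPLE)]
```

On this sample the plain pattern starts at the first `<subgroup>` and swallows the unwanted 111111111 subgroup along with the wanted one, while the guarded pattern returns only the subgroup that contains 123456789.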

2 Comments

In my case multiple serial numbers are involved, and I was wondering whether it is possible to put them in a comma-delimited list so that Notepad++ extracts only the groups whose serial numbers are in the list.
Edited my answer to include finding a list of serial numbers.
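
A regex has no comma-delimited list syntax, so the list has to be turned into an alternation first. The sketch below is one way to do that conversion in Python; `pattern_from_list` is a hypothetical helper name, and the resulting pattern assumes the search tool supports lookahead and lets the dot match newlines.

```python
import re


def pattern_from_list(serial_list):
    """Turn '123456789, 987654321' into a <subgroup> search pattern."""
    serials = [s.strip() for s in serial_list.split(",") if s.strip()]
    for s in serials:
        # Reject anything that is not exactly nine digits, so a stray
        # character cannot change the meaning of the pattern.
        if not re.fullmatch(r"\d{9}", s):
            raise ValueError("not a 9-digit serial number: " + repr(s))
    alternation = "(" + "|".join(serials) + ")"
    # (?:(?!</subgroup>).)*? keeps the match inside a single subgroup.
    return "<subgroup>(?:(?!</subgroup>).)*?" + alternation + ".*?</subgroup>"


pattern = pattern_from_list("123456789, 987654321,678912345")
```

The resulting string can be pasted into a regex search box or passed to `re.compile(pattern, re.DOTALL)`.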
