Save Each Multiline Grep Output To Array Record

Question

I am parsing XML with a regex. The format is well known, so there is no need to worry about escaping or proper XML parsing.

grep returns multi-line matches, and I want to store each match in its own file.

However, with array=( $list ) each line between my tags becomes a separate array element, and with array=( "$list" ) the whole output becomes a single element.

How can I loop over each match from grep?

My script currently looks like this:

#!/bin/bash

list=$(cat result.xml|grep -ozP '(?s)<tagname.*?tagname>')
array=( "$list" )
arraySize=${#array[@]}
for ((i = 0; i <= $arraySize; i += 1)); do
  match="${array[$i]}"
  echo "$match" > "$i".xml
done

Can you show sample data from result.xml?

– anubhava, Commented Apr 6, 2016 at 19:16

Community · Accepted Answer · 2017-05-23 11:50:37Z

1

According to this answer, the upcoming version of grep will change the meaning of the -z flag so that both input and output are NUL-terminated. So that will automatically do what you want, but it's only available today by downloading and building grep from the git repository.

Meanwhile, a rather hackish alternative is to use the -Z flag which terminates the file name with a NUL character. That means you need to print a "filename", which you can do by using -H --label=. That will print an empty filename followed by a NUL before each match, which is not quite ideal since you really want the NUL after each match. However, the following should work:

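grep -ozZPH --label= '(?s)<tagname.*?tagname>' < result.xml | {
  i=0
  # read -d '' splits on NUL; || [[ $chunk ]] keeps the last chunk, which has no trailing NUL
  while IFS= read -rd '' chunk || [[ $chunk ]]; do
    # chunk 0 is whatever precedes the first NUL (the empty label), so skip it
    if ((i)); then
      echo "$chunk" > "$i.xml"
    fi
    ((++i))
  done
}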
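For a grep build where -z already NUL-terminates the output (the behaviour described above as upcoming), the -Z/-H workaround should not be needed. Here is a minimal sketch under that assumption, which also collects the matches into an array as the question asked. The function name split_matches and its two arguments are made up for illustration, and it needs bash plus a grep with -P support:

```shell
#!/bin/bash
# Sketch only: assumes grep -z terminates each -o match with a NUL byte.
# split_matches is a hypothetical helper, not part of any tool.
split_matches() {
  local input=$1 outdir=$2 match i
  local -a matches=()

  # One array element per NUL-terminated match; IFS= and -r keep
  # whitespace and backslashes inside the match intact.
  while IFS= read -r -d '' match; do
    matches+=("$match")
  done < <(grep -ozP '(?s)<tagname.*?tagname>' "$input")

  # Write each element to 1.xml, 2.xml, ... in the output directory.
  for i in "${!matches[@]}"; do
    printf '%s\n' "${matches[i]}" > "$outdir/$((i + 1)).xml"
  done
}
```

Called as split_matches result.xml . it writes 1.xml, 2.xml and so on into the current directory. On an older grep that ends each match with a newline instead, the whole output would have no NUL terminator and read would not yield any elements, so the workaround above still applies there.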

edited May 23, 2017 at 11:50

answered Apr 6, 2016 at 19:48

rici

3 Comments

shellter Over a year ago

you meant ((++i)) ? ;-) . Wow, -ozZPH ... grep has grown up since Sun 4 days ;-) Good luck to all.

rici Over a year ago

@shellter- wow, lots of typos. Hopefully fixed, thanks. Maybe -PHozZ would be cooler :)

shellter Over a year ago

-PHozZ LOL, another advantage to having more options on grep ;-) .

quazardous · Accepted Answer · 2016-04-06 19:19:04Z

0

Pipe your lines directly into a while loop:

my_splitting_command | grep something | while IFS= read -r line
do
    # Note: > truncates myoutputfile.txt on every iteration; use >> to append instead
    echo "$line" >myoutputfile.txt
done

answered Apr 6, 2016 at 19:19

quazardous

Quinn · Accepted Answer · 2016-04-07 17:57:38Z

0

You could use grep to grab all the matches first, and then use awk to save each matched pattern into separate files (e.g. file1.xml, file2.xml, etc):

cat result.xml | grep -Pzo '(?s)(.)<tagname.*?tagname>(.)' | awk '{ print $0 > "file" NR ".xml" }' RS='\n\n'
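The awk step relies on a multi-character RS, which not every awk treats the same way. A possible variation, assuming the matches reach awk separated by at least one blank line and contain no blank lines themselves, is awk's paragraph mode (an empty RS). The sketch below swaps that in; the function name split_paragraphs and the output directory argument are illustrative only:

```shell
#!/bin/bash
# Sketch only: reads blank-line-separated records on stdin and writes
# each one to file1.xml, file2.xml, ... in the given directory.
# split_paragraphs is a hypothetical helper name.
split_paragraphs() {
  local outdir=$1
  # An empty RS puts awk in paragraph mode: records are separated by
  # one or more blank lines. close() avoids running out of open files.
  awk -v RS= -v dir="$outdir" '{
    out = dir "/file" NR ".xml"
    print > out
    close(out)
  }'
}
```

It would be used as some_command | split_paragraphs . with the grep pipeline from above in place of some_command, provided that pipeline really emits blank lines between matches.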

edited Apr 7, 2016 at 17:57

answered Apr 7, 2016 at 15:41

Quinn
