0

I have a file containing the following content:

(Item)
(Values)
blabla
blabla
(StopValues)
(Item)
(Values)
hello
hello
(StopValues)

I'd like to split it into multiple files so that each file contains the content from (Item) to (StopValues), including both tags. Since I need to process those files further and want to create them with mktemp, I'd also like to save each filename in an array as it is created.

To split them I used an approach with awk:

  awk '/(StopValues)/{n++}{print >"out" n ".txt" }' mainfile.txt

First problem: when I provide only one set of data, I still get two new txt files. One contains everything except the (StopValues) tag, and the other contains only that tag.

Second problem: I'd like to create the files with mktemp instead of naming them myself, and I need their names in an array. How would I dynamically create new ones in the awk loop and save their names into an array?

  • By array, do you mean an awk array or a bash array? Commented Feb 28, 2017 at 14:02
  • I mean a bash array Commented Feb 28, 2017 at 14:02
  • how many sections/blocks could your file have? Commented Feb 28, 2017 at 14:03
  • That's not defined. I call my script like this: cat *.txt | ./script, so all the content from cat ends up in one file. When I pipe the cat of all files like that, is there a way to see which content came from which cat'ed file, or to split it and get it directly as an array? If that's possible, everything I'm trying to do here is no longer needed. Commented Feb 28, 2017 at 14:06
  • ok, all clear, posted my answer. Commented Feb 28, 2017 at 14:37

2 Answers

1

First of all, the command:

arr=($(awk 'BEGIN{cmd="mktemp -u"; cmd|getline tmp}
{print > tmp}/\(StopValues/{a[++i]=tmp;close(cmd);close(tmp); cmd|getline tmp;}END{for(i=1;i<=length(a);i++)print a[i]; }' inputFile ))

The awk part:

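awk 'BEGIN{cmd="mktemp -u"; cmd|getline tmp}   # get the first temp file name
     {print > tmp}                              # write every line to the current temp file
     /\(StopValues/{a[++i]=tmp                  # end of block: remember the file name
                    close(cmd)                  # close the command so it can be run again
                    close(tmp)                  # close the finished file
                    cmd|getline tmp}            # fetch a name for the next block
    END{for(i=1;i<=length(a);i++)print a[i]; }' inputFile   # print all collected names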

With this inputFile (f), to which I added a third block:

kent$  cat f
(Item)
(Values)
blabla
blabla
(StopValues)
(Item)
(Values)
hello
hello
(StopValues)
(Item)
(Values)
hello
hello
(StopValues)

The awk will output:

#The filenames can be different.
/tmp/tmp.DRaLMsXROR
/tmp/tmp.yUL6GO4xtv
/tmp/tmp.Kb0UxsHVno

As you can see, the output lists three temp files, each containing one block of the input file.

The output consists of the temp file names; wrapping the command in a bash array assignment, arr=($(...)), stores them in an array. Putting it all together, here is a test (I only check the first block/temp file):

kent$  arr=($(awk 'BEGIN{cmd="mktemp -u"; cmd|getline tmp}
    {print > tmp}/\(StopValues/{a[++i]=tmp;close(cmd);close(tmp); cmd|getline tmp;}END{for(i=1;i<=length(a);i++)print a[i]; }' f ))

kent$  echo ${arr[*]}
/tmp/tmp.fcf7ac0eVl /tmp/tmp.Rjru5psFQB /tmp/tmp.ldaBWCucNg

kent$  echo ${arr[0]}
/tmp/tmp.fcf7ac0eVl

kent$  cat "${arr[0]}"
(Item)
(Values)
blabla
blabla
(StopValues)
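
Note that mktemp -u only prints a name and does not create the file, so another process could claim the same name in the meantime. As a variation on the approach above, here is a minimal sketch that lets mktemp actually create each file and fills the array directly in bash, without awk. It assumes the tags stand alone on their lines exactly as in the question; split_blocks and files are hypothetical names chosen for this example.

```shell
#!/usr/bin/env bash
# Sketch: split a file into one temp file per (Item)...(StopValues) block.
# split_blocks and files are hypothetical names chosen for this example.
split_blocks() {
    local input=$1 line tmp=""
    files=()                                  # global bash array of temp file names
    while IFS= read -r line || [[ -n $line ]]; do
        if [[ $line == "(Item)" ]]; then      # a new block starts: create its file
            tmp=$(mktemp) || return 1
            files+=("$tmp")
        fi
        if [[ -n $tmp ]]; then                # inside a block: copy the line
            printf '%s\n' "$line" >> "$tmp"
        fi
        if [[ $line == "(StopValues)" ]]; then
            tmp=""                            # block finished; skip lines until the next (Item)
        fi
    done < "$input"
    return 0
}

# Usage (mainfile.txt as in the question):
#   split_blocks mainfile.txt
#   printf '%s\n' "${files[@]}"    # one temp file name per block
#   cat "${files[0]}"              # first block; bash arrays start at index 0
```

Because the names never pass through word splitting, this also works when TMPDIR contains spaces. The trade-off is speed: a bash read loop is noticeably slower than awk on large inputs.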

0

Try:

awk '/\(Item\)/{A=1;count++} A{VAL=VAL?VAL ORS $0:$0} /\(StopValues\)/{A="";f="out" count ".txt";print VAL > f;close(f);VAL=""}'   Input_file

For the sample input, this creates two files, out1.txt and out2.txt.

EDIT: Adding a non-one-liner form of the solution as well.

awk '/\(Item\)/{
                A=1;
                count++
             }
    A        {
                VAL=VAL?VAL ORS $0:$0
             }
    /\(StopValues\)/{
                        A="";
                        f="out" count ".txt";
                        print VAL > f;
                        close(f);
                        VAL=""
                  }
    '    Input_file
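
For comparison, the one-liner from the question misplaces (StopValues) because n is incremented before the line is printed, so the tag lands in the next file. A minimal sketch of the reverse ordering (print first, then switch files) follows; it assumes each tag sits alone on its line, and split_numbered is a hypothetical name chosen for this example.

```shell
#!/usr/bin/env bash
# Sketch: numbered output files, switching files only after (StopValues) is written.
# split_numbered is a hypothetical name chosen for this example.
split_numbered() {
    awk '
        BEGIN { n = 1 }
        { file = "out" n ".txt"; print > file }     # write the current line first
        /^\(StopValues\)$/ { close(file); n++ }     # then move on to the next file
    ' "$1"
}

# Usage: split_numbered mainfile.txt   # writes out1.txt, out2.txt, ... in the current directory
```

Unlike the buffered version above, this writes each line as it is read, so a block that is missing its closing tag still ends up in a file.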
