2

I want to split a file into two, but cannot find a way to do this.

Master.txt
Happy Birthday to you!  [[#HAPPY]]
Stop it.  [[#COMMAND]]
Make a U-turn. [[#COMMAND]]

I want to split each line across two files, with the second file receiving everything from the point where the line matches the pattern [[#

Output1.txt
Happy Birthday to you!
Stop it.
Make a U-turn.

Output2.txt
[[#HAPPY]]
[[#COMMAND]]
[[#COMMAND]]

I've tried using awk:

awk -v RS="[[#*" '{ print $0 > "temp" NR }'

but it doesn't give my desired output -- any help would be appreciated!

7 Answers

4

Here is one way with GNU awk:

awk -v RS='\\[\\[#|\n' 'NR%2{print $0>"Output1.txt";next}{print "[[#"$0>"Output2.txt"}' master

Test:

$ ls
master

$ cat master 
Happy Birthday to you!  [[#HAPPY]]
Stop it.  [[#COMMAND]]
Make a U-turn. [[#COMMAND]]

$ awk -v RS='\\[\\[#|\n' 'NR%2{print $0>"Output1.txt";next}{print "[[#"$0>"Output2.txt"}' master

$ ls
master  Output1.txt  Output2.txt

$ head Out*
==> Output1.txt <==
Happy Birthday to you!  
Stop it.  
Make a U-turn. 

==> Output2.txt <==
[[#HAPPY]]
[[#COMMAND]]
[[#COMMAND]]

2 Comments

Super duper, I was thinking along the same lines, but you were too quick.
You can reduce the number of print calls; see my answer. Cheers
1

A pure bash solution might be a little slower, but is very readable:

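re='(.*)(\[\[#.*\]\])'   # group 1: text before the tag, group 2: the [[#...]] tag
while IFS= read -r line; do
    [[ $line =~ $re ]]
    # Write each part followed by a newline so the output stays line-aligned.
    printf '%s\n' "${BASH_REMATCH[1]}" >&3
    printf '%s\n' "${BASH_REMATCH[2]}" >&4
done < Master.txt 3> output1.txt 4> output2.txt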

0
You can write a small script like this:
#!/bin/ksh

# Mark the split point with a comma (this edits the file in place), then cut on it.
sed -i -e 's/ \[\[#/,\[\[#/' "$1"

cut -d, -f1 "$1" > "$1.part1"
cut -d, -f2 "$1" > "$1.part2"

Or as a one-liner:

sed -i -e 's/ \[\[#/,\[\[#/' Master.txt ; cut -d, -f1 Master.txt > output1.txt ; cut -d, -f2 Master.txt > output2.txt

1 Comment

Be careful: you assume that a comma is never used in the first part of the text and rely on it for cut, but that may not hold for the full file.
0

Simpler in sed, IMHO:

$ sed 's/^\([^[]*\).*/\1/' Master.txt > Output1.txt
$ sed 's/^[^[]*//'         Master.txt > Output2.txt

6 Comments

The idea is good, but it does not answer the specific request: the question asks to split the string at [[#, not at [.
The same holds for the sample input file.
That is only a sample. A line such as Happy day to you [and your wife]! [[#HAPPY]] would fail. The answer simply assumes there is no [ in the first part of the line.
If this case can occur, it should be covered in the sample input/output.
Then why does the question mention [[# if [ is enough? A sample does not always show every possible case. If the constraint is relaxed, your code works.
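
The concern raised in these comments can be avoided by matching the literal [[# instead of the first [. The following is only a sketch, assuming each line holds at most one [[# tag; the file names sample.txt, part1.txt and part2.txt are placeholders, and the sample line is the one from the comment above.

```shell
# Hypothetical sample input with a stray "[" before the real tag.
printf '%s\n' 'Happy day to you [and your wife]!  [[#HAPPY]]' \
              'Stop it.  [[#COMMAND]]' > sample.txt

# Part 1: delete everything from the first "[[#" to the end of the line.
sed 's/\[\[#.*//' sample.txt > part1.txt

# Part 2: print only lines containing "[[#", keeping the text from the tag onward.
sed -n 's/.*\(\[\[#.*\)/\1/p' sample.txt > part2.txt
```

As with the original answer, part1.txt keeps whatever whitespace preceded the tag.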
0
sed -n 's/\[\[#/\
&/;P
/\n/ {s/.*\n//;H;}
$ {x;s/\n//;w Output2.txt
  }' YourFile > Output1.txt

This does it in a single sed invocation, but awk is better suited to this task.

0

This might work for you (GNU sed):

sed -n 's/\[\[#/\n&/;P;s/.*\n//w file3' file1 >file2

0

No need for GNU awk; this should work with any awk:

awk -F'\\[\\[#' '{print $1>"Output1.txt";print "[[#"$2>"Output2.txt"}' Master.txt

cat Output1.txt
Happy Birthday to you!
Stop it.
Make a U-turn.

cat Output2.txt
[[#HAPPY]]
[[#COMMAND]]
[[#COMMAND]]
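
One caveat with the one-liner above: for a line that has no tag, $2 is empty, so a bare [[# is written to Output2.txt. A possible variant that guards against this and also trims the blanks left before the tag is sketched below. It assumes untagged lines belong in the first file only; sample.txt, part1.txt and part2.txt are placeholder names.

```shell
# Hypothetical sample input that includes one line without a tag.
printf '%s\n' 'Happy Birthday to you!  [[#HAPPY]]' \
              'No tag on this line' \
              'Stop it.  [[#COMMAND]]' > sample.txt

awk '{
    i = index($0, "[[#")               # position of the first literal "[[#", 0 if absent
    if (i == 0) { print > "part1.txt"; next }
    head = substr($0, 1, i - 1)
    sub(/[ \t]+$/, "", head)           # trim trailing blanks left before the tag
    print head > "part1.txt"
    print substr($0, i) > "part2.txt"
}' sample.txt
```

Using index() instead of a regex field separator also sidesteps the backslash escaping needed in -F.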
