I'm trying to replace substrings in a text file with other substrings using sed.
For example,
sed 's/dogs chase/<bop> dogs chase <eop>/g; s/birds eat/<bop> birds eat <eop>/g' corpus.txt
This replaces dogs chase in corpus.txt with <bop> dogs chase <eop>, and birds eat with <bop> birds eat <eop>.
Suppose I have all the substrings in a text file, sub.txt, and I want to use them to replace the text in corpus.txt. Is there a way to make my command read the substrings from that file?
For example, sub.txt contains:
dogs chase
birds eat
chase birds
chase cat
The equivalent hand-written command would be:
sed 's/dogs chase/<bop> dogs chase <eop>/g; s/chase birds/<bop> chase birds <eop>/g; s/chase cat/<bop> chase cat <eop>/g; s/birds eat/<bop> birds eat <eop>/g' corpus.txt
This sed command replaces dogs chase with <bop> dogs chase <eop>, birds eat with <bop> birds eat <eop>, chase birds with <bop> chase birds <eop>, and chase cat with <bop> chase cat <eop>. A hand-crafted command like this would be difficult to write if sub.txt contains hundreds of substrings.
The corpus.txt file contains:
dogs chase cats around
dogs bark
cats meow
dogs chase birds
cats chase birds , birds eat grains
dogs chase the cats
the birds chirp
The desired output:
<bop> dogs chase <eop> cats around
dogs bark
cats meow
<bop> dogs chase <eop> birds
cats <bop> chase birds <eop> , <bop> birds eat <eop> grains
<bop> dogs chase <eop> the cats
the birds chirp
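One direction that might work is to let sed generate the sed script itself from sub.txt and then apply that script with -f. The following is only a sketch: the function name wrap_substrings is made up for illustration, and it assumes every line of the substring file is plain text with no characters that are special to sed (such as /, &, \, ., *, [ or ]) and no trailing whitespace.

```shell
# Hypothetical helper (the name is an assumption, not an existing tool):
#   wrap_substrings SUBS_FILE CORPUS_FILE
# Wraps every substring listed in SUBS_FILE with <bop> ... <eop> in CORPUS_FILE
# and prints the result to standard output.
# Assumption: the substrings contain no sed metacharacters (/ & \ . * [ ]).
wrap_substrings() {
    script=$(mktemp) || return 1

    # Drop empty lines, then turn each remaining line "foo" into the command
    #   s/foo/<bop> foo <eop>/g
    # The outer sed uses | as its delimiter so the / characters stay literal;
    # each & is replaced by the whole matched line.
    sed -e '/^$/d' -e 's|.*|s/&/<bop> & <eop>/g|' "$1" > "$script"

    # Apply the generated commands to the corpus, in the order they were listed.
    sed -f "$script" "$2"

    rm -f "$script"
}
```

With the files from this question, the call would be wrap_substrings sub.txt corpus.txt. The generated commands run in the order the substrings appear in sub.txt, so once dogs chase has been wrapped, a later overlapping entry such as chase birds no longer matches on that line, which agrees with the desired output above.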