I am trying to remove embedded \n characters from multi-line log entries on the command line, without removing the \n that terminates each log entry. I have tried the following two commands, and both changed every \n to ~.
cat test1.txt | perl -n -e 's{\n(?!2013)}{~}mg;print' > test1a.fix
perl -n -e 's{\n(?!2013)}{~}mg;print' test1.txt > test1b.fix
Both appear to ignore the negative lookahead.
test1.txt contains
2013-03-01 12:23:59,1
line2
 line3
2013-03-01 12:23:59,4
test1a.fix and test1b.fix contained
2013-03-01 12:23:59,1~line2~ line3~2013-03-01 12:23:59,4
However, I developed the regex using this script, where it behaves as expected:
#!/usr/bin/perl
use warnings;
use strict;

# Apply the substitution to $str and compare the result with $expect.
sub test {
    my ($str, $expect) = @_;
    my $mod = $str;
    # Replace each newline that is not followed by "2013" with "~".
    $mod =~ s{\n(?!2013)}{~}mg;
    print "Expecting '$expect' got '$mod' - ";
    print $mod eq $expect ? "passed\n" : "failed\n";
}

test("2013-03-01 12:23:59,line1
line2
 line3
2013-03-01 12:23:59,line4", "2013-03-01 12:23:59,line1~line2~ line3
2013-03-01 12:23:59,line4");
It produces the following output, which is what I want:
sfager@linux-sz05:~/logs> ./regex_test.pl
Expecting '2013-03-01 12:23:59,line1~line2~ line3
2013-03-01 12:23:59,line4' got '2013-03-01 12:23:59,line1~line2~ line3
2013-03-01 12:23:59,line4' - passed
sfager001@linux-sz05:~/logs>
Can anyone explain why these work differently and how this can be done on the command line?