INPUT:

dsfgsdf8gfsd
2011.06.26. v
iudsfg98sdfg
sosdufgsdfg
2011.06.27. h
8xdofguiosdfg
jdasfhasd89fa
2011.06.28. k
ydsfgsdgsdg
dsfgdsfzfszgh
2011.06.29. sze
ds9fgisdfgsdfg
asdfasdfasddf
2011.06.30. cs
dsg789sdiofgsdg
dsfig89dsfgds
2011.07.01. p
sd9fg8sdgsdg
sdlfjgsd89öfgxcbv
dsglsd9gcxbv
dsflgjsdlfgfsdg
sdfsdfgdxfgxc
2011.07.02. szo
cvbdsgfsd
2011.07.03. v
dfgsdfgsd
2011.07.04. h
sdfgsdfgsdg

How can I get this OUTPUT with e.g. sed (or Perl)?

2011.06.26. v
iudsfg98sdfg
sosdufgsdfg
----------
2011.06.27. h
8xdofguiosdfg
jdasfhasd89fa
----------
2011.06.28. k
ydsfgsdgsdg
dsfgdsfzfszgh
----------
2011.06.29. sze
ds9fgisdfgsdfg
asdfasdfasddf
----------
2011.06.30. cs
dsg789sdiofgsdg
dsfig89dsfgds
----------
2011.07.01. p
sd9fg8sdgsdg
sdlfjgsd89öfgxcbv
dsglsd9gcxbv
dsflgjsdlfgfsdg
sdfsdfgdxfgxc
----------
2011.07.02. szo
cvbdsgfsd
----------
2011.07.03. v
dfgsdfgsd
----------
2011.07.04. h
sdfgsdfgsdg

So I want to swap the:

2011.06.26. v

AND

2011.06.27. h

etc. to this:

----------
2011.06.26. v

AND

----------
2011.06.27. h

I already tried (don't laugh :D ):

sed "s/[0-9]\{4\}\.[0-9]\{2\}\.[0-9]\{2\}\. /WTF/g"

But I don't know how to match the day abbreviations (h, k, sze, cs, p, szo, v) in sed, and I don't know how to put the matched text into the replacement (the "WTF" in .../WTF/g).

Has anyone any idea? :\

Thank you!

4
  • Does it actually need to be sed? For some reason people have a desperate need to use sed to mess with multiple lines at once or insert multiple lines; there are better tools for stuff like that Commented Jun 10, 2011 at 18:54
  • quotation: (or Perl?) Commented Jun 10, 2011 at 19:24
  • Well, does it actually need to be sed or perl, then? For example, this is trivial in awk: awk '/pattern/ {print "--------"} {print}' Commented Jun 10, 2011 at 19:27
  • omg... :D of course :D thx.. Commented Jun 10, 2011 at 19:55

4 Answers

2

A starting point is this sed line:

$ echo 2011.06.26. v | sed 's/^\([0-9]\+\.[0-9]\+\.[0-9]\+\. \([hv]\|sze\)\)$/----------\n\1/'
----------
2011.06.26. v

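Since sed uses basic regular expression syntax by default, you have to escape the ()|+ characters to give them their special meaning (grouping, alternation, one or more). \1 is a backreference to the first group's match. The alternation above only covers h, v and sze; the remaining day abbreviations can be added the same way.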
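
Extending that starting point to every day abbreviation in the question (h, k, sze, cs, p, szo, v) might look like the sketch below. It assumes GNU sed, because \+, \| and \n in the replacement are extensions, and the function name add_separators is made up for illustration. It uses & (the whole match) instead of a numbered group.

```shell
# Sketch only: assumes GNU sed (\+, \| and \n in the replacement are extensions).
# add_separators is a hypothetical helper name, not part of any tool.
add_separators() {
    sed 's/^[0-9]\+\.[0-9]\+\.[0-9]\+\. \(h\|k\|sze\|cs\|p\|szo\|v\)$/----------\n&/' "$@"
}

# Example: only the date line gets a separator line in front of it.
printf '%s\n' 'dsfgsdf8gfsd' '2011.06.26. v' 'iudsfg98sdfg' | add_separators
```

The function passes its arguments on to sed, so it can be called as add_separators INPUTFILE or used at the end of a pipe.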

3
  • Note that alternation (\|) and \n standing for a newline in replacement text work in GNU sed and some others, but they're not in POSIX. Commented Jun 10, 2011 at 20:23
  • @Gilles, POSIX regex don't include alternation? Commented Jun 10, 2011 at 22:28
  • Sadly, no, not the basic regular expressions (BRE) that sed uses. POSIX BREs only support […] character classes, ., * repetition, ^$ anchors, and \{…\} repetition, plus \(…\) subexpressions and \N backreferences. \?, \+ and \| are common but not universal extensions. POSIX Extended regular expressions (ERE), such as used by awk, support the usual operators ()[].?*+{}|. Commented Jun 10, 2011 at 22:44
0

I found this solution using sed:

sed -n '/^[0-9]\{4\}\.[01][0-9]\.[0123][0-9]\./,${:a;N;$!ba;{s/\([0-9]\{4\}\.[01][0-9]\.[0123][0-9]\.\)/--------------\n\1/g;p}}'

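The disadvantage is that the date pattern has to be written twice. Maybe there's another (better) solution.
The output is close to your example, with two differences: the separator is longer than in your example, and one is also printed before the first date.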
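
If the separator should appear only between sections, as in the question's sample output, one possible variation anchors the match on the preceding newline so that the first date is left alone. This is a sketch that assumes GNU sed (the label handling and \n are not portable); between_sections is a made-up name.

```shell
# Assumes GNU sed; between_sections is a hypothetical helper name.
# Lines before the first date are skipped (-n plus the address range),
# the rest is collected into the pattern space and edited in one go.
between_sections() {
    sed -n '/^[0-9]\{4\}\.[01][0-9]\.[0123][0-9]\./,${
        :a
        N
        $!ba
        s/\n\([0-9]\{4\}\.[01][0-9]\.[0123][0-9]\.\)/\n----------\n\1/g
        p
    }' "$@"
}

printf '%s\n' 'junk' '2011.06.26. v' 'a' '2011.06.27. h' 'b' | between_sections
```

Because the whole remainder of the file is held in memory, this approach is best kept for small inputs.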

0

In other words, you want to insert the line ---------- before every line that consists of a YYYY.MM.DD. date (note the trailing dot), a space and a run of lowercase letters. There are several ways to do this. You can use the insert command (i):

sed -e '/^[0-9][0-9][0-9][0-9]\.[0-9][0-9]\.[0-9][0-9]\. [a-z][a-z]*$/i\
----------'

Or you can replace the empty string at the beginning of the line with the separator followed by a newline.

sed -e '/^[0-9][0-9][0-9][0-9]\.[0-9][0-9]\.[0-9][0-9]\. [a-z][a-z]*$/ s/^/----------\
/'

Or you can use & in the replacement text of an s command to stand for the matched pattern.

sed -e 's/^[0-9][0-9][0-9][0-9]\.[0-9][0-9]\.[0-9][0-9]\. [a-z][a-z]*$/----------\
&/'

Some sed implementations allow you to write \n instead of backslash-newline in the replacement text, but on others \n prints \n or n.
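
As a quick check of the insert-command variant, the sketch below feeds a few lines in the question's format through it. It sticks to POSIX BRE syntax and backslash-newline, so it should not depend on GNU extensions. The pattern includes the dot after the day, as in the question's data; the variable names sample and result and the sample contents are only illustrative.

```shell
# Sample input in the question's format (contents are illustrative).
sample='dsfgsdf8gfsd
2011.06.26. v
iudsfg98sdfg
2011.06.27. h
8xdofguiosdfg'

# Insert command: a separator line is emitted before each matching date line.
result=$(printf '%s\n' "$sample" |
    sed -e '/^[0-9][0-9][0-9][0-9]\.[0-9][0-9]\.[0-9][0-9]\. [a-z][a-z]*$/i\
----------')

printf '%s\n' "$result"
```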

0

You should use awk instead:

awk ' /^[0-9]{4}\.[0-9]{2}\.[0-9]{2}\. / { print "----------\n" $0 ; next } /^/ { print $0 } ' <"INPUTFILE" >"OUTPUTFILE"

Basically it works in 2 steps:

step1: /^[0-9]{4}\.[0-9]{2}\.[0-9]{2}\. / { print "----------\n" $0 ; next }

means: if the line starts with 4digits.2digits.2digits. followed by a space, print "----------\n" followed by the matching line, then skip to the next input line (= "next"; awk's "continue" is only valid inside a loop).

step2: /^/ { print $0 }

means: if we didn't match the above, then for all other lines (i.e., matching a beginning of line, so even an empty line gets matched), just print that line.

Some older awk versions do not support the {4} interval syntax; with those, spell out [0-9][0-9][0-9][0-9] instead.
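
The sample output in the question also drops everything before the first date and has no separator before that first date. One way to sketch that behaviour in awk is with a flag variable (called seen here, a hypothetical name) and spelled-out digit classes; exact_output is likewise just an illustrative function name.

```shell
# exact_output is a hypothetical helper name; "seen" is just a flag variable.
exact_output() {
    awk '
        /^[0-9][0-9][0-9][0-9]\.[0-9][0-9]\.[0-9][0-9]\. [a-z]+$/ {
            if (seen) print "----------"   # separator only between sections
            seen = 1
        }
        seen                               # print lines from the first date on
    ' "$@"
}

printf '%s\n' 'junk' '2011.06.26. v' 'a' '2011.06.27. h' 'b' | exact_output
```

It can be used like the command above: exact_output <"INPUTFILE" >"OUTPUTFILE".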
