I have the output of a pandoc conversion to HTML which looks like this:
foo
bar
<blockquote>
That's one small step for man, one giant leap for mankind
A new line and another quote
</blockquote>
baz
I'd like to make it like this:
foo
bar<blockquote>That's one small step for man, one giant leap for mankind
A new line and another quote</blockquote>baz
(Block quotes are rendered as separate blocks anyway, so I don't need the extra newlines around them.)
I started trying with sed and ended up with this awk:
'/./ {printf "%s%s", $0, ($1 ~ /^$/ && $2 ~ /<\/?blockquote>/) ? OFS : ORS}'
Which does part of what I want, but is a bit too advanced for me to understand how to modify.
In words I think the rule I want is: if the next line is blank and the one after matches /<\/?blockquote>/, then print current line, next line, and the one after without any separators, and then move on.
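The rule above could be sketched in awk roughly as follows. This is only a sketch under assumptions, not a verified solution for all pandoc output: it assumes every `<blockquote>` or `</blockquote>` tag sits alone on its own line, and `join_blockquotes` is a hypothetical name chosen for illustration. Rather than looking ahead two lines, it remembers the previous non-blank line and decides which separator to print before the current one, which amounts to the same thing.

```shell
# join_blockquotes is a hypothetical helper name, not part of pandoc or awk.
# Assumption: every <blockquote> or </blockquote> tag is alone on its line.
join_blockquotes() {
  awk '
    # True when the line is exactly an opening or closing blockquote tag.
    function is_tag(s) { return s ~ /^<\/?blockquote>$/ }

    # Count blank lines instead of printing them right away.
    /^$/ { blanks++; next }

    {
      if (seen && !is_tag(prev) && !is_tag($0)) {
        # Ordinary neighbours: end the previous line, then restore
        # the blank lines that separated the two.
        for (i = 0; i <= blanks; i++) printf "\n"
      }
      # Next to a tag, no separator is printed, which glues the lines together.
      printf "%s", $0
      prev = $0; seen = 1; blanks = 0
    }

    # Terminate the last line; trailing blank lines are dropped.
    END { if (seen) printf "\n" }
  '
}
```

It would go at the end of the existing pipeline, for example `pandoc file.org -t html | join_blockquotes`. Blank lines between ordinary lines are kept, and only the separators directly before or after a blockquote tag are removed.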
awk shall not work on six lines of text only. If that is the case then please explain exactly how to handle the data.
foo and bar can't be written like that; foo must be part of some other node, possibly a p node, while bar could be the value of the body node that you don't show, which, if this is HTML, ought to be part of a root html node).
<p> tags. The full command I use to produce this is pandoc file.org -t html | gsed 's:</\?p>::g' | gsed 's:$:\n:g'