Is it possible to use sed to match the source of a script tag within HTML (which is not valid XML) and replace the whole match with the file's contents?
For example, say the HTML contained:
<link rel='stylesheet' href="blah.css">
<script src='foo.js'></script>
<script type="text/javascript" src="bar.js"></script>
<title />
I want not only to match 'foo.js' within src='foo.js' but also to put the contents of foo.js into this file in place of the reference, so as to end up with:
<link rel='stylesheet' href='blah.css'>
<script>var foo = {};</script>
<script>var bar = [];</script>
<title />
In a regex I can match the script tag src value like so:
<script\s+(?:[^>]*?\s+)?src=(["'])(.*?)\1
with the match being in the second capture group.
I don't mind rewriting the whole line, but how do I get sed to match on that expression? It doesn't seem to like capture groups or backreferences (at least, not the way I'm trying it; I know sed does support them). I get an unhelpful
syntax error near unexpected token `)'
Also, how can I capture the file name and then pipe its contents back in as the replacement line?
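One way to approach the second part is to let the shell do the file lookup and use sed only to pull out the file name. The following is a minimal sketch, not a definitive solution: the function name inline_scripts is hypothetical, and it assumes at most one script tag per line, src paths relative to the current directory, and a sed that accepts -E (GNU and BSD sed both do).

```shell
#!/bin/sh
# inline_scripts FILE (hypothetical helper): print FILE with every line that
# holds a <script ... src="..."> tag replaced by an inline <script> element
# containing the referenced file's contents.
# Assumptions: one script tag per line; src paths are relative to the
# current directory; lines whose src file does not exist are left unchanged.
inline_scripts() {
  while IFS= read -r line || [ -n "$line" ]; do
    # -n plus the p flag prints the captured file name only on a match.
    # '\'' is the usual way to put a single quote inside a single-quoted string.
    src=$(printf '%s\n' "$line" |
      sed -nE 's/.*<script[[:space:]]+([^>]*[[:space:]]+)?src=["'\'']([^"'\'']*)["'\''].*/\2/p')
    if [ -n "$src" ] && [ -f "$src" ]; then
      # Command substitution drops the file's trailing newline, so the
      # closing tag ends up on the same line as the last line of the script.
      printf '<script>%s</script>\n' "$(cat "$src")"
    else
      printf '%s\n' "$line"
    fi
  done < "$1"
}
```

With the function defined, something like inline_scripts page.html > page.inlined.html would write the rewritten page (both file names are placeholders). Anything else on the same line as the script tag is dropped, which matches "rewriting the whole line" above.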
[...] sed command that is giving you the error message? What version of sed are you using?
sed -E 's/<script\s+(?:[^>]*?\s+)?src=(["'])(.*?)\1/whatever/' file.html; trying to escape that single quote with \' or \x27 isn't happening
[...] sed, you need to replace ' with '\'' when you try to write single quotes in single quoted strings.
sed -E "s/<script\s+(?:[^>]*?\s+)?src=(["'])(.*?)\1/whatever/" file.html yields RE error: repetition-operator operand invalid which is normally missing -r but -E should cover that
["'] supposed to be [\"'] of course, but it still has the error
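The remaining RE error most likely comes from the pattern itself rather than the quoting: (?: ... ) and the lazy *? are Perl-style syntax, not POSIX extended regular expressions, so sed -E sees a ? with nothing valid to repeat. A hedged sketch of an ERE-only version follows. The function name extract_src is hypothetical, [[:space:]] stands in for \s (which is not guaranteed outside GNU sed), and a negated bracket expression replaces both the lazy match and the backreference inside the pattern.

```shell
#!/bin/sh
# extract_src (hypothetical helper): read HTML lines on stdin and print the
# src value of each line that holds a <script ... src="..."> tag.
# Only POSIX ERE constructs are used: plain groups instead of (?:...),
# [^"']* instead of the lazy .*?, and [[:space:]] instead of \s.
# Assumption: at most one script tag per line.
extract_src() {
  # The sed script is single-quoted; each literal single quote is written
  # as '\'' (close the string, escaped quote, reopen the string).
  sed -nE 's/.*<script[[:space:]]+([^>]*[[:space:]]+)?src=["'\'']([^"'\'']*)["'\''].*/\2/p'
}
```

For instance, piping the two script lines from the sample HTML through extract_src should print foo.js and bar.js, while the link and title lines produce no output. Unlike the original pattern, this version does not require the closing quote to be the same character as the opening one.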