
I have a statement that needs to be replaced. The original format is:

f.STRING.focus();

Here STRING is a combination of [:alpha:] and [:digit:] characters (regular expression classes). I want to change it to

highlight("STRING");

For instance:

f.abCDef12345.focus()    --->     highlight("abCDef12345");
f.ip2.focus()            --->     highlight("ip2");

I can easily use sed to replace the statement for hundreds of html files. However, I don't know how to get the STRING in shell script.

The procedure can be described as follows:

For each html:
    For the STRING which matches the pattern:
        1. Assign it to a parameter.
        2. Insert that STRING to highlight("STRING");
        3. Replace the old statement f.STRING.focus(); with highlight("STRING");

But I don't know how to write them in shell script... Any hint is appreciated.

Updated:

  1. Please describe your script clearly. Thank you very much!
  2. Sorry for the mistake! STRING is a combination of [:alpha:] and [:digit:], so that the example f.ip2.focus() mentioned here makes sense.
1
  • Can't you just use sed with regexp on multiple files? I'm not sure if I got the idea correctly. Commented Feb 14, 2014 at 12:02

5 Answers

2

Try this approach:

#!/bin/bash

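# Read one file name per line from find and run sed on each file.
# IFS= and -r keep leading spaces and backslashes in names intact.
while IFS= read -r line
do
    sed 's/f\.\([0-9a-zA-Z]*\)\.focus()/highlight("\1")/g' "$line"
done < <(find . -type f  -name '*.html')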

When you're happy with the output, change the sed command to sed -i.bak to edit the files in place (keeping a .bak backup of each).

Explanation:

  1. The find command searches recursively from the current folder downwards for all files whose names end in .html
  2. A bash while-read loop reads the output of the find command one line at a time
  3. sed then searches for the desired pattern. The \(...\) part is called a capture group; it stores the matching text in a variable that can be accessed using \1, which is called a back-reference.
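To see the capture group and back-reference in isolation, the same expression can be run on the two sample statements from the question. This is a minimal sketch that only reads from a pipe and touches no files; the variable name result is just for illustration.

```shell
#!/bin/bash
# Feed the two sample statements from the question through the sed expression.
# \( ... \) captures the identifier; \1 re-inserts it in the replacement.
result=$(printf '%s\n' 'f.abCDef12345.focus();' 'f.ip2.focus();' |
    sed 's/f\.\([0-9a-zA-Z]*\)\.focus()/highlight("\1")/g')
printf '%s\n' "$result"
```

Each input line should come out with the captured identifier wrapped in highlight("...") and the trailing semicolon left untouched.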

The proper way to read and operate on each line of a file in bash is to use

while IFS= read -r line
do
    echo "$line"
done < file

In our case we don't have a file; instead we'd like to operate on each line of the output of a command. Enter process substitution: <(...). You can of course also redirect the output of the find command to a file (find ... > file) and then operate on that.
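A small sketch of what this looks like in practice, assuming bash (process substitution is not available in plain sh or dash). Because the loop is fed by < <(...) rather than by a pipe, it runs in the current shell and the counter is still set after the loop. The sample file names and the variable count are made up for illustration, and printf stands in for find.

```shell
#!/bin/bash
# Count the lines produced by a command using a while-read loop.
# printf stands in for find here, so the example needs no files on disk.
count=0
while IFS= read -r line
do
    printf 'processing %s\n' "$line"
    count=$((count + 1))
done < <(printf '%s\n' './a/index.html' './b/my page.html')
printf 'files seen: %d\n' "$count"
```

Had the loop been written as command | while read ..., bash would normally run it in a subshell and count would still be 0 afterwards.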

Update:

As pointed out by @tripleee the while-loop can be dropped completely:

sed -i.bak 's/f\.\([0-9a-zA-Z]*\)\.focus()/highlight("\1")/g' $(find . -type f  -name '*.html')

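The sed '...' $(find...) construct runs the part in $() in a subshell and passes all the matching files as arguments to the sed command. Because the substitution is unquoted, file names containing whitespace would be split into several arguments. With plain names the resulting command looks like this: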

sed '...' ./c/file.html ./a/file.html ./b/file.html ./d/file.html

If you have a lot of HTML files, the command may fail with an "Argument list too long" error; if that is the case, xargs is your friend (man xargs).
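One way to do that, sketched here on a throwaway directory so that nothing real is modified. find -print0 and xargs -0 pass file names separated by NUL bytes, which also keeps names containing spaces intact; both options are widely supported (GNU and BSD) but may be missing from some minimal find/xargs implementations. The directory layout and file names below are invented for the demonstration.

```shell
#!/bin/bash
# Build a scratch tree with two sample files (hypothetical names).
workdir=$(mktemp -d)
mkdir -p "$workdir/sub dir"
printf '%s\n' 'f.ip2.focus();' > "$workdir/one.html"
printf '%s\n' 'f.abCDef12345.focus();' > "$workdir/sub dir/two.html"

# xargs splits the file list into as many sed invocations as needed,
# so no single command line grows too long.
find "$workdir" -type f -name '*.html' -print0 |
    xargs -0 sed -i.bak 's/f\.\([0-9a-zA-Z]*\)\.focus()/highlight("\1")/g'

# Read the results back, then remove the scratch tree.
one=$(cat "$workdir/one.html")
two=$(cat "$workdir/sub dir/two.html")
printf '%s\n%s\n' "$one" "$two"
rm -rf "$workdir"
```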

Or (Linux is full of TMTOWTDI) let find execute the sed part for each matching file, one at a time. That way you don't risk ending up with a command line that is too long:

find . -type f  -name '*.html' -exec sed 's/f\.\([0-9a-zA-Z]*\)\.focus()/highlight("\1")/g' {} \;

10 Comments

I don't know why I should add parentheses around [:alpha:], and what exactly does ("\1") mean? Also, sorry, I can't figure out the last statement, done < <(find . -type f -name '*.html'). Please explain and add some references.
Error message: redirection unexpected on the last line.
It might be that your bash is actually a symlink to dash. Dash does not support process substitution. Check with ls -l /bin/bash, or are you using /bin/sh as your shebang?
Well... The space between done and the second < in done < <(find . -type f -name '*.html') causes the error: redirection unexpected. If I delete it, I get Syntax error: "(" unexpected. By the way, why do I have to pass the command to done?
You don't pass anything to done. done terminates the while loop.
2
sed -i 's/f\.\([a-zA-Z0-9]\+\)\.focus()/highlight("\1")/g' file_to_process
  1. f\. matches f.
  2. \([a-zA-Z0-9]\+\) matches one or more alphanumeric characters and stores matched STRING in variable 1
  3. \.focus() matches .focus()
  4. highlight("\1") replaces the whole matched pattern with the given text and the value of variable 1 -> highlight("STRING")
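The choice of "one or more" over "zero or more" matters for degenerate input. A short sketch of the difference, using a made-up line with an empty identifier. Since \+ in a basic regular expression is a GNU extension, the example spells "one or more" as \{1,\}, the POSIX form, which should behave the same in GNU and BSD sed.

```shell
#!/bin/bash
# Hypothetical input where nothing sits between "f." and ".focus()".
input='f..focus();'

# With *, the empty identifier still matches and yields highlight("").
star=$(printf '%s\n' "$input" |
    sed 's/f\.\([a-zA-Z0-9]*\)\.focus()/highlight("\1")/g')

# With \{1,\} (one or more), the line is left untouched.
plus=$(printf '%s\n' "$input" |
    sed 's/f\.\([a-zA-Z0-9]\{1,\}\)\.focus()/highlight("\1")/g')

printf '%s\n%s\n' "$star" "$plus"
```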

4 Comments

Doesn't match the provided test-string abCDef12345
Maybe that's because :alpha: is a-z and A-Z only?
The question clearly states that STRING is a combination of [:alpha:].
Sorry for that. Updated~
0

sed -i 's/b.\(STRING\).focus()/highlight("\1")/g' file will do the trick

# echo "b.STRING.focus()" | sed 's/b.\(STRING\).focus()/highlight("\1")/g'
highlight("STRING")

2 Comments

Can you please add some explanation to your code? I totally can't get it.
Basically, as all the others say, the expression between parentheses on the left side of the sed expression is recalled on the right side using \N, where N is the group number (\1 for the first group).
0

You can use this sed:

sed -i.bak 's/f\.\([[:alnum:]]\+\)\.focus()/highlight("\1")/g' file.html

Here sed is finding

f.<string-with-1-or-more-alpha-numerics>.focus()

And capturing middle part into matching group #1

It is replacing that with:

highlight("\1")

Where \1 is the back-reference for matched group #1
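The same substitution can also be written with extended regular expressions, which some find easier to read. This is a sketch assuming a sed that accepts -E (GNU and BSD sed both do): in that mode plain parentheses form the capture group, + means "one or more" without a backslash, and the literal parentheses of focus() must be escaped. The third input line is an invented non-matching case.

```shell
#!/bin/bash
# Dry run on a pipe: two statements from the question plus one
# hypothetical line whose identifier contains an underscore.
out=$(printf '%s\n' 'f.abCDef12345.focus();' 'f.ip2.focus();' 'f.a_b.focus();' |
    sed -E 's/f\.([[:alnum:]]+)\.focus\(\)/highlight("\1")/g')
printf '%s\n' "$out"
```

The underscore is not part of [[:alnum:]], so the last line passes through unchanged.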

Comments

0

An awk version:

echo 'f.STRING.focus("Some data")' | awk '{gsub(/[[:alpha:]]\.[[:alpha:]]+\.focus\(/,"highlight(")}1'
highlight("Some data")

Using sed

echo 'b.STRING.focus("Some data")' | sed 's/[[:alpha:]]\.[[:alpha:]]*\.focus/highlight/g'
highlight("Some data")

Comments
