Bash script for file search and replace

Question

Hey, I'm trying to write a little bash script. It should copy a directory and all the files in it. Then it should search every file in the copied directory for a string (e.g. @ForTestingOnly) and save the line number where it is found. From there it should count each { and }; as soon as the two counts are equal, it should save the line number again and delete all the lines between these two numbers. In other words, I'm trying to write a script that finds all these annotations and deletes the method that directly follows each one. Thanks for any help...

so far I have:

echo "please enter dir"
read dir
newdir="${dir}_final"
cp -r "$dir" "$newdir"
cd "$newdir"

grep -lr -E '@ForTestingOnly' * | xargs sed -i 's/@ForTestingOnly//g'

Now with grep I can search for the @ForTestingOnly annotation and remove it, but I would like to delete it together with the method that follows...

You should probably explicitly mention why you tagged this question with the "java" tag; I can only suspect that @ForTestingOnly is a Java annotation...
– bobah, Commented May 10, 2010 at 16:34
It would be pretty easy to do this almost correctly as you describe it, but watch out for things like "}" characters inside comments or string literals....
– David Gelhar, Commented May 10, 2010 at 16:42
I don't know how to search for a word, save its line number, then keep searching for { and } and delete all the lines in between...
– D3orn, Commented May 10, 2010 at 16:48
You can always use your favorite programming language to code up a solution for a single file, then use the find command to apply your program recursively.
– Jay, Commented May 11, 2010 at 23:25
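
To make the warning about braces in comments and string literals concrete, here is a minimal sketch of naive brace counting in bash. The helper name count_braces is made up for illustration; it simply counts characters and knows nothing about Java syntax, which is the assumption the approaches in this thread share.

```bash
#!/bin/bash
# count_braces (hypothetical helper) prints "<open> <close>" for one line of
# text, counting every { and } regardless of where it appears.
count_braces() {
    local line=$1 opens closes
    # tr -cd keeps only the named character; wc -c then counts what is left
    opens=$(printf '%s' "$line" | tr -cd '{' | wc -c)
    closes=$(printf '%s' "$line" | tr -cd '}' | wc -c)
    # $(( )) strips the padding some wc implementations add
    echo "$((opens)) $((closes))"
}

count_braces 'void run() {'        # prints: 1 0
count_braces 'String s = "}";'     # prints: 0 1, although no block ends here
```

The second call shows the problem: a closing brace inside a string literal is counted like a real one, so the counts can match too early or never match at all.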

Dennis Williamson · Accepted Answer · 2010-05-10 21:20:42Z

2

Give this a try. It's oblivious to braces in comments and literals, though, as David Gelhar warned. It only finds and deletes the first occurrence of the "@ForTestingOnly" block (under the assumption that there will only be one anyway).

#!/bin/bash
find . -maxdepth 1 -type f | while IFS= read -r file
do
    open=0 close=0
    # start=$(sed -n '/@ForTestingOnly/{=;q}' "$file")
    while read -r line
    do
        case $line in
            *{*) (( open++ )) ;;
            *}*) (( close++ ));;
             '') : ;;    # skip blank lines
              *) # these lines contain the line number that the sed "=" command printed
                 if (( open == close ))
                 then 
                     break
                 fi
                 ;;
        esac
             # the sed below splits each line after every brace and prints
             # the input line number after each line that contained { or }
    # done < <(sed -n "$start,$ { /[{}]/ s/\([{}]\)/\1\n/g;ta;b;:a;p;=}" "$file")
    done < <(sed -n "/@ForTestingOnly/,$ { /[{}]/ s/\([{}]\)/\1\n/g;ta;b;:a;p;=}" "$file")
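    end=$line
    # sed -i "${start},${end}d" "$file"
    # only delete when the loop stopped on a line number; end stays empty
    # for files that do not contain the annotation
    [[ $end =~ ^[0-9]+$ ]] && sed -i "/@ForTestingOnly/,${end}d" "$file"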
done

Edit: Removed one call to sed (by commenting out and replacing a few lines).

Edit 2:

Here's a breakdown of the main sed line:

sed -n "/@ForTestingOnly/,$ { /[{}]/ s/\([{}]\)/\1\n/g;ta;b;:a;p;=}" "$file"

-n - only print lines when explicitly requested
/@ForTestingOnly/,$ - from the line containing "@ForTestingOnly" to the end of the file
/[{}]/ - apply the following s command only to lines that contain a brace
s/ ... / ... /g - perform a global (per-line) substitution
\( ... \) - capture
[{}] - the characters that appear in the list between the square brackets
\1\n - substitute what was captured plus a newline
ta - if a substitution was made, branch to label "a"
b - branch (no label means "to the end, and begin the per-line cycle again for the next line") - this branch functions as an "else" for the ta. I could have used T instead of ta;b;:a, but some versions of sed don't support T
:a - label "a"
p - print the line (actually, print the pattern buffer which now consists of possibly multiple lines with a "{" or "}" on each one)
= - print the current line number of the input file

The second sed command simply says to delete the lines starting at the one that has the target string and ending at the line found by the while loop.

The sed command at the top which I commented out says to find the target string and print the line number it's on and quit. That line isn't necessary since the main sed command is taking care of starting in the right place.

The inner while loop looks at the output of the main sed command and increments counters for each brace. When the counts match, it stops.

The outer while loop steps through all the files in the current directory.
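
To see what the inner loop actually reads, it may help to run the main sed command on a small sample. The sketch below assumes GNU sed (the \n in the replacement is not portable), and the file Sample.java with its contents is made up for the demonstration.

```bash
#!/bin/bash
# Sample.java is a made-up file used only for this demonstration.
tmp=$(mktemp -d)
cat > "$tmp/Sample.java" <<'EOF'
class A {
    @ForTestingOnly
    void t() {
        if (x) { y(); }
    }
    void keep() {}
}
EOF

# Same command as in the script: every brace ends an output line, and the
# input line number is printed after each input line that contained a brace.
brace_stream=$(sed -n "/@ForTestingOnly/,$ { /[{}]/ s/\([{}]\)/\1\n/g;ta;b;:a;p;=}" "$tmp/Sample.java")
printf '%s\n' "$brace_stream"

rm -r "$tmp"
```

In this sample the annotated method spans input lines 2 to 5. The stream carries the numbers 3, 4 and 5 after the brace lines of that method, and 5 is the first number at which the open and close counts are equal, so that is where the inner loop would stop.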

edited May 10, 2010 at 21:20

answered May 10, 2010 at 18:53

Dennis Williamson


4 Comments

D3orn Over a year ago

Okay, but now I'd like to do this for all files in the given dir, and sed reports an unknown command: ','. I don't know why...

Dennis Williamson Over a year ago

The find will feed each file to the process. I don't know why the comma isn't working. What version of sed are you using and what OS and version? I've edited the script because I noticed a slight improvement I could make.

D3orn Over a year ago

I'm using Ubuntu 10.04. I'll try the script later on. Very nice work, thanks a lot! Now it would be nice to understand what each line in the script does ^^ The { and so on are clear, but the sed commands I don't get ^^ Cheers

Dennis Williamson Over a year ago

@D3orn: Running on Ubuntu 9.04 with GNU sed 4.2.1 and Bash 4.0.33(1)-release, I don't get that error. It probably means the variable "end" isn't getting set, for some reason, but I can't see any reason why not.

Jay · Accepted Answer · 2010-05-10 23:22:41Z

0

I fixed the bugs in the old version. The new version has two scripts: an awk script and a bash driver.

The driver is:

#!/bin/bash

AWK_SCRIPT=ann.awk

# note: this loop splits on whitespace, so file names must not contain spaces
for i in $(find . -type f -print); do
    while true; do
        cmd=$(awk -f "$AWK_SCRIPT" "$i")
        if [ -z "$cmd" ]; then
            break
        else
            eval "$cmd"
        fi
    done
done

the new awk script is:

BEGIN {
# line number where we will start deleting
start = 0;
}

{
        # check current line for the annotation
        # we're looking for
        if($0 ~ /@ForTestingOnly/) {
                start = NR;
                found_first_open_brace = 0;
                num_open = 0;
                num_close = 0;
        }

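        if(start != 0) {
                # count the braces on the current line first, so that the
                # range ends on the line that holds the closing brace
                for(i = 1; i <= length($0); i++) {
                        c = substr($0, i, 1);
                        if(c == "{") {
                                found_first_open_brace = 1;
                                num_open++;
                        }
                        if(c == "}") {
                                num_close++;
                        }
                }
                if(num_open == num_close && found_first_open_brace == 1) {
                        # sed -i '' is the BSD/macOS form; GNU sed takes plain -i
                        print "sed -i '' -e '" start "," NR " d' " ARGV[1];
                        start = 0;
                        exit;
                }
        }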
}

Set the path to the awk script in the driver then run the driver in the root dir.

edited May 10, 2010 at 23:22

answered May 10, 2010 at 19:43

Jay


3 Comments

Dennis Williamson Over a year ago

Replace backquotes with $(). Here's why.

Jay Over a year ago

ty, if you're still having trouble with the find command, try $ find . -type f -print

Jay Over a year ago

I just found a bug in that program. It won't work if a file contains more than one annotation to be deleted. This is because once sed deletes the first annotated block, the line numbers of the second will change, invalidating the next sed command. You will have to change the program to produce only one sed command per file, then rerun the whole thing until the awk script produces no output.
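
One way to sidestep the shifting line numbers is not to use line numbers at all: filter each file in a single awk pass and skip lines while inside an annotated block. The following is only a sketch of that idea, not the script from this answer. The function name strip_annotated is made up, and it shares the limitation that braces in comments or string literals are counted like any others.

```bash
#!/bin/bash
# strip_annotated (hypothetical name) prints FILE to stdout without any
# @ForTestingOnly annotation and the brace-balanced block that follows it.
strip_annotated() {
    awk '
        # an annotation outside of a block being skipped starts a new skip
        /@ForTestingOnly/ && !skipping { skipping = 1; depth = 0; seen_open = 0 }
        skipping {
            # gsub returns the number of matches; replacing a brace with
            # itself leaves the line unchanged and just counts
            opens  = gsub(/[{]/, "{")
            closes = gsub(/[}]/, "}")
            if (opens > 0) seen_open = 1
            depth += opens - closes
            if (seen_open && depth <= 0) skipping = 0   # block closed on this line
            next                                        # drop every line of the block
        }
        { print }
    ' "$1"
}

# Possible use on one file (Foo.java is a placeholder name):
#   strip_annotated Foo.java > Foo.java.new && mv Foo.java.new Foo.java
```

Because the output is written in one pass, several annotated methods in the same file are handled without rerunning anything.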
