0

I want to remove all files whose names contain a given substring and ignore the files that do not contain it, so I tried a regular expression:

str=9009
patt=*v[0-9]{3,}*.txt
for i in "${patt}"; do
    echo "$i"
    if ! [[ "$i" =~ $str ]]; then rm "$i"; fi
done

But I got an error:

*v[0-9]{3,}*.txt
rm: cannot remove '*v[0-9]{3,}*.txt': No such file or directory

The file names look like this: mari_v9009.txt femme_v9009.txt mari_v9010.txt femme_v9010.txt

4
  • I think you may be confusing regular expressions and glob patterns (aka filename wildcard expressions). See here for some discussion of the differences. In particular, "{3,}" is regex-speak for "3 or more", but is not valid in a glob pattern. Commented Sep 24, 2018 at 17:22
  • @locklockM, The pattern problem is really deflecting from what you're asking. Do you have files that do NOT match "v with 3 or more digits" ? If yes, do you want to delete those or not? If not, why post that red herring? Commented Sep 24, 2018 at 19:27
  • Why don't you just do rm *v9009.txt ? Or, if you must store the number in a variable, rm *v"$str".txt Commented Sep 24, 2018 at 19:30
  • @GordonDavisson Thanks, removed incorrect comment. Commented Sep 24, 2018 at 20:15
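
The plain-glob suggestion from the comments above can be sketched as follows. This is only a minimal sketch: the function name remove_version and its two arguments (a directory and the version string) are hypothetical, and it assumes the files always end in v<number>.txt as in the question.

```bash
#!/usr/bin/env bash
# Hypothetical helper: delete every file in directory $1 whose name
# ends in v<str>.txt, where <str> is passed as $2.
remove_version() {
    local dir=$1 str=$2 f
    for f in "$dir"/*v"$str".txt; do
        # If the glob matched nothing, bash leaves the pattern as a literal
        # word; skip it instead of handing it to rm.
        [[ -e $f ]] || continue
        rm -- "$f"
    done
}

# Demo in a throwaway directory
demo_dir=$(mktemp -d)
touch "$demo_dir"/mari_v9009.txt "$demo_dir"/femme_v9009.txt \
      "$demo_dir"/mari_v9010.txt "$demo_dir"/femme_v9010.txt
remove_version "$demo_dir" 9009
ls -1 "$demo_dir"    # only the v9010 files should remain
rm -rf "$demo_dir"
```

The -e test is what avoids the "No such file or directory" error from the question when nothing matches.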

5 Answers

3

Bash filename expansion does not use regular expressions. See https://www.gnu.org/software/bash/manual/bash.html#Filename-Expansion

To find files with "v followed by 3 or more digits followed by .txt" you'll have to use bash's extended pattern matching. A demonstration:

$ shopt -s extglob
$ touch mari_v9009.txt femme_v9009.txt mari_v9010.txt femme_v9010.txt
$ touch foo_v12.txt
$ for f in *v[0-9][0-9]+([0-9]).txt; do echo "$f"; done
femme_v9009.txt
femme_v9010.txt
mari_v9009.txt
mari_v9010.txt

With the pattern in for i in *v[0-9]{3,}*.txt, what happens is:

  1. first, bash performs brace expansion which results in

    for i in *v[0-9]3*.txt *v[0-9]*.txt
    
  2. then, the first word *v[0-9]3*.txt matches nothing, and bash's default behaviour is to leave an unmatched pattern as a plain string. rm then tries to delete a file literally named "*v[0-9]3*.txt", which gives you the "No such file or directory" error

  3. next, the second word *v[0-9]*.txt gets expanded, but the expansion will include files you don't want to delete.
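
The three steps above can be reproduced with a small sketch. The function name show_expansion is hypothetical, and the sketch assumes bash's default settings (nullglob and failglob off), running in a throwaway directory that holds the four files from the question.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the words that the unquoted pattern turns into.
show_expansion() {
    local dir
    dir=$(mktemp -d) || return 1
    (
        cd "$dir" || exit 1
        touch mari_v9009.txt femme_v9009.txt mari_v9010.txt femme_v9010.txt
        # Unquoted on purpose: brace expansion runs first and yields two
        # words, then filename expansion is applied to each word.
        printf '%s\n' *v[0-9]{3,}*.txt
    )
    rm -rf "$dir"
}

show_expansion
```

The first line printed is the unmatched word *v[0-9]3*.txt, left as a literal string; the remaining lines are the four file names matched by *v[0-9]*.txt.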


I missed the not from the question.

Try this: within [[ ... ]], the == and != operators are pattern-matching operators, and extended globbing is enabled by default:

keep_pattern='*v[0-9][0-9]+([0-9]).txt'
for file in *; do
    if [[ $file != $keep_pattern ]]; then
        echo rm "$file"
    fi
done
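
To check the loop above before deleting anything, here is a self-contained sketch that only lists the non-matching names. The function name list_non_matching and the demo file names foo_v12.txt and notes.md are assumptions for illustration; the keep pattern is the one from this answer.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the entries of directory $1 whose names do
# NOT match "v followed by 3 or more digits followed by .txt".
list_non_matching() {
    local dir=$1 keep_pattern='*v[0-9][0-9]+([0-9]).txt' path name
    for path in "$dir"/*; do
        [[ -e $path ]] || continue    # empty directory: skip the literal glob
        name=${path##*/}              # strip the directory part
        # The unquoted right-hand side is treated as a pattern, not a string
        if [[ $name != $keep_pattern ]]; then
            printf '%s\n' "$name"
        fi
    done
}

demo_dir=$(mktemp -d)
touch "$demo_dir"/mari_v9009.txt "$demo_dir"/femme_v9010.txt \
      "$demo_dir"/foo_v12.txt "$demo_dir"/notes.md
list_non_matching "$demo_dir"    # lists foo_v12.txt and notes.md
rm -rf "$demo_dir"
```

Keeping the pattern in a single-quoted variable means the +( ... ) syntax is never parsed as a glob by the shell itself, so shopt -s extglob is not needed for this form.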

But find would be preferable here, if it's OK to descend into subdirectories (note that -regextype is a GNU find extension):

find . -regextype posix-extended '!' -regex '.*v[0-9]{3,}\.txt' -print
# ...............................^^^

If that prints the files you expect to delete, change -print to -delete.
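
A variant that stays in one directory and skips directories themselves might look like the sketch below. It assumes GNU find; the -maxdepth 1 and -type f tests and the function name list_unversioned are my additions, not part of the command above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: list regular files directly inside directory $1
# whose path does NOT end in "v<3 or more digits>.txt".
list_unversioned() {
    # GNU find assumed: -maxdepth, -regextype and -regex are not POSIX.
    # -regex is matched against the whole path, hence the leading ".*".
    find "$1" -maxdepth 1 -type f -regextype posix-extended \
        '!' -regex '.*v[0-9]{3,}\.txt' -print | sort
}

demo_dir=$(mktemp -d)
touch "$demo_dir"/mari_v9009.txt "$demo_dir"/foo_v12.txt "$demo_dir"/notes.md
list_unversioned "$demo_dir"    # prints the paths of foo_v12.txt and notes.md
rm -rf "$demo_dir"
```

Without -type f, the starting directory itself is also printed, because its path does not match the regex either.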

9 Comments

parse: line 4: syntax error near unexpected token `(' parse: line 4: `patt=*v[0-9][0-9]+([0-9]).txt'
This requires bash, not sh
I use bash: #!/bin/bash
Did you shopt -s extglob in the script?
It works, but when building the Debian package I get: make[2]: shopt: Command not found
1

You need to remove the quotes in the for loop. Then the filename globs will be interpreted:

for i in ${patt}; do echo "$i"
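
A quick sketch of what removing the quotes does and does not change. The function name show_unquoted_var is hypothetical; the point it illustrates is that the contents of an unquoted variable go through filename expansion but not brace expansion, so {3,} stays in the word as literal characters.

```bash
#!/usr/bin/env bash
# Hypothetical helper: expand two patterns held in an unquoted variable.
show_unquoted_var() {
    local dir patt
    dir=$(mktemp -d) || return 1
    (
        cd "$dir" || exit 1
        touch mari_v9009.txt femme_v9009.txt
        patt='*v[0-9]{3,}*.txt'
        # Braces are not expanded here; no file name contains a literal
        # "{3,}", so the glob matches nothing and the word is printed as-is.
        printf '%s\n' $patt
        patt='*v[0-9][0-9][0-9]*.txt'
        # A pattern made only of glob syntax expands to the matching files.
        printf '%s\n' $patt
    )
    rm -rf "$dir"
}

show_unquoted_var
```

This is why the question's pattern still produces the same rm error after unquoting, while a pure glob such as *v[0-9][0-9][0-9]*.txt works.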

4 Comments

Nope, it does not work :(( I still get the same error.
@locklockM How exactly does it not work? What output do you get?
I got the same error: *v[0-9]{3,}.txt rm: cannot remove '*v[0-9]{3,}.txt': No such file or directory
There is no {m,n} quantification in shell globs.
0

I assume that you are using Python.

I have tested your regex code, and found the * character unnecessary.
The following seems to work fine: v[0-9]{3,}.txt

Can you please elaborate some more on the issue?

Thanks,
Bren.

3 Comments

I have many files like mari_v9009.txt femme_v9009.txt mari_v9010.txt femme_v9010.txt..., and I want to remove the files whose names contain the substring "9009".
@locklockM Seems like you could just do rm *v9009.txt.
Use rm -f as stated in the man page, as this will not throw up an error in this case.
0

I just redirected the error message to /dev/null. This worked for me:

#!/bin/bash 
str=9009
patt=*v[0-9]{3,}*.txt
rm $(eval ls $patt 2> /dev/null | grep $str)

4 Comments

*v[0-9]{3,}.txt rm: missing operand Try 'rm --help' for more information.
You have the 2> /dev/null in there?
str=9009; patt=*v[0-9]{3,}.txt; for i in ${patt}; do echo "$i"; if ! [[ "$i" =~ $str ]]; then rm $(eval ls $patt 2> /dev/null | grep $str); fi; done
Why are you putting this in a loop?
0

This is not regex, this is globbing. Take a look what gets expanded:

# echo *v[0-9]{3,}*.txt
*v[0-9]3*.txt femme_v9009.txt femme_v9010.txt mari_v9009.txt mari_v9010.txt

*v[0-9]3*.txt obviously doesn't exist. Can you clarify which files you are trying to match with {3,}? Otherwise leave it out, and the pattern will match the kind of file names you have specified.

http://tldp.org/LDP/abs/html/globbingref.html

2 Comments

{3,} is meant to match numbers with three or more digits, like 112 or 2222.
In this context, {3,} is interpreted as a brace expansion and expands to 3 and the empty string.
