
I am trying to find files with a regular expression. My target files look like 'Myfile_200_2018.csv', 'Myfile_100_2018.csv', and so on.

The following code keeps raising the error "expected string or bytes-like object".

I searched for this error on Google, but I think my game_id is a string, right? So I am not sure which part causes the error.

import os
import re

allfiles = os.listdir('.')

csv_files = [filename for filename in allfiles if filename.endswith('.csv')]


game_id='100'

re.search(r'(Myfile_%s_\d{4}.csv$)'%game_id, csv_files)
  • csv_files needs to be a string, not a list of strings. Commented Nov 29, 2018 at 3:52
  • But this folder contains several csv files, and csv_files is a list of them. When game_id='100', I just want to return all csv files that match the format; it may contain 'Myfile_100_2018.csv', 'Myfile_100_2017.csv', 'Myfile_100_2016.csv'. Commented Nov 29, 2018 at 3:55
  • That's not how re.search works. Use a different method; you can't look up a function, plug in whatever arguments you think will work, and expect it to do so. Commented Nov 29, 2018 at 4:02
  • You are already looping through the list of files. Why not filter it when you create the list? For example: csv_files = [filename for filename in allfiles if filename.endswith('.csv') and re.search(r"Myfile_%s_\d{4}\.csv$" % game_id, filename)] Commented Nov 29, 2018 at 4:29
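
To make the point in the comments concrete, here is a minimal sketch (the file names are made-up sample data, not read from disk). It shows that re.search raises a TypeError when given the whole list, and works once it is applied to each filename in turn. The exact wording of the error message varies between Python versions.

```python
import re

# Hypothetical sample data standing in for os.listdir('.')
csv_files = ["Myfile_100_2018.csv", "Myfile_200_2018.csv"]
pattern = r"Myfile_100_\d{4}\.csv$"

error_message = None
try:
    # Passing the list itself: re.search expects a single string
    re.search(pattern, csv_files)
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# Applying re.search to one filename at a time works
matches = [name for name in csv_files if re.search(pattern, name)]
print(matches)
```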

2 Answers


You are already looping through the files in your folder. Why not filter the list when you create it?

import os
import re

allfiles = os.listdir('.')
game_id='100'

csv_files = [
    filename for filename in allfiles
    if filename.endswith('.csv')
    # \. matches a literal dot; a bare . would match any character
    and re.search(r"Myfile_%s_\d{4}\.csv$" % game_id, filename)
]

print(csv_files)

Or if you want to keep the original list as well, you could use filter to create a new filtered list.

csv_files = [filename for filename in allfiles if filename.endswith('.csv')]

# In Python 3, filter returns an iterator, so wrap it in list() to get a list
filtered_list = list(filter(lambda filename: re.search(r"Myfile_%s_\d{4}\.csv$" % game_id, filename), csv_files))
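
If the same pattern is reused, one possible variation is to compile it once and wrap the filtering in a small helper. This is only a sketch: the function name matching_files and the sample list are hypothetical, and re.escape is added on the assumption that game_id might someday contain regex metacharacters.

```python
import re

def matching_files(filenames, game_id):
    """Return the names shaped like Myfile_<game_id>_<4-digit year>.csv."""
    # re.escape keeps any regex metacharacters in game_id literal;
    # \. matches a literal dot rather than any character.
    pattern = re.compile(r"Myfile_%s_\d{4}\.csv" % re.escape(game_id))
    # fullmatch requires the whole filename to match, so no ^ or $ is needed
    return [name for name in filenames if pattern.fullmatch(name)]

# Hypothetical sample data for illustration
sample = [
    "Myfile_100_2018.csv",
    "Myfile_200_2018.csv",
    "Yourfile_100_2016.csv",
    "Myfile_100_2017xcsv",
]
print(matching_files(sample, "100"))
```

Using fullmatch instead of search also rejects names that merely contain the pattern somewhere in the middle.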


You can search within your list if you just want to check whether the game_id is present in the filename:

import os
import re

allfiles=['Myfile_100_2018.csv', 'Myfile_100_2017.csv','Yourfile_100_2016.csv','Myfile_200_2018.csv','Myfile_100_2018.csv']

csv_files = [filename for filename in allfiles if filename.endswith('.csv')]


game_id='100'

print([file for file in csv_files if re.search(r"Myfile_%s_\d{4}\.csv$" % game_id, file)])

output:

['Myfile_100_2018.csv', 'Myfile_100_2017.csv', 'Myfile_100_2018.csv']
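
For comparison, here is a small sketch of how a plain substring test (game_id in name) differs from the regex on the same sample list. The variable names substring_hits and regex_hits are chosen for illustration only.

```python
import re

allfiles = ['Myfile_100_2018.csv', 'Myfile_100_2017.csv', 'Yourfile_100_2016.csv',
            'Myfile_200_2018.csv', 'Myfile_100_2018.csv']
game_id = '100'

# Substring test: accepts any name that contains "100" anywhere
substring_hits = [name for name in allfiles if game_id in name]

# Regex test: also requires the Myfile_ prefix, a 4-digit year and the .csv suffix
pattern = r"Myfile_%s_\d{4}\.csv$" % re.escape(game_id)
regex_hits = [name for name in allfiles if re.search(pattern, name)]

print(substring_hits)
print(regex_hits)
```

The substring version also returns 'Yourfile_100_2016.csv', while the regex version does not.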

5 Comments

But I want to return a list of the files that match the regex. This folder contains several csv files, and csv_files is a list of them. When game_id='100', I just want to return all csv files that match the format; it may contain 'Myfile_100_2018.csv', 'Myfile_100_2017.csv', 'Myfile_100_2016.csv'.
I have changed the code to return the files instead of bool values.
I doubt OP wants the zeros in the list, so use [file for file in csv_files if game_id in file] instead.
What if there is a file called 'Yourfile_100_2018.csv'? It would still be returned as a result.
Then what is your condition for choosing a file, apart from the game id?
