I am trying to process some CSV files in my folder. All of the files follow the same naming format and differ only in their IDs and dates, for example Myfile_100_2018-11-26.csv (100 is the ID; the remaining numbers are the date). I have a list containing all the IDs I want to open, for example game_id=[100,200,300,400].

import pandas as pd
import os
import re

allfiles = os.listdir('.')
game_id=[100,200,300,400]
for id in game_id:
     files = [f for f in allfiles if re.search(r'(%s+_\d{4}-\d{2}-\d{2}\.csv$')%game_id, f)]

In my code, I want game_id to replace the %s so that I can loop through all the files for IDs 100, 200, 300 and 400. However, I get an error: SyntaxError: invalid syntax, pointing at the comma after game_id.

I tried many combinations that I found in other questions, but none of them worked for me. Can anyone give me some advice? Thanks.

  • You don't have an opening parenthesis for )%game_id Commented Nov 27, 2018 at 5:49
  • re.search(r'(%s+_\d{4}-\d{2}-\d{2}\.csv$'%game_id, f)? this doesn't work Commented Nov 27, 2018 at 5:54
  • it is working: [f for f in allfiles if re.search(r'(%s_\d{4}-\d{2}-\d{2}\.csv$)'%(game_id), f)] Commented Nov 27, 2018 at 7:42

1 Answer

You are applying %game_id to the result of the re.search call rather than to the r'(%s+_\d{4}-\d{2}-\d{2}\.csv$' string literal: the closing parenthesis comes right after the string, so the ", f)" that follows is what triggers the SyntaxError.

Next, the pattern has an opening capturing parenthesis with no matching closing one, which will cause a regex error.

Besides, the + after %s applies to the last character of the inserted ID, so it might result in unexpected matches: for 100, files with game IDs 100, 1000 and 1000000 can all be returned.
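To make the last two points concrete, here is a minimal sketch. It assumes a single ID is formatted into the pattern (as the question's loop intends), and the file names are made up for illustration.

```python
import re

game_id = 100  # a single ID; the file names below are hypothetical

# 1. An unbalanced "(" is rejected as soon as the pattern is compiled.
try:
    re.compile(r'(%s_\d{4}-\d{2}-\d{2}\.csv$' % game_id)
    unbalanced_ok = True
except re.error:
    unbalanced_ok = False

# 2. "%s+" turns into "100+", i.e. "10" followed by one or more "0".
loose = r'%s+_\d{4}-\d{2}-\d{2}\.csv$' % game_id
# Dropping the "+" and adding a lookbehind pins the ID to exactly 100.
strict = r'(?<!\d)%s_\d{4}-\d{2}-\d{2}\.csv$' % game_id

names = ['Myfile_100_2018-11-26.csv', 'Myfile_1000_2018-11-26.csv']
loose_hits = [n for n in names if re.search(loose, n)]
strict_hits = [n for n in names if re.search(strict, n)]

print(unbalanced_ok)  # False
print(loose_hits)     # both file names
print(strict_hits)    # only the file for ID 100
```

Note that the % operator is applied to the string before it is passed to re.search or re.compile.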

You may use

import re
allfiles=['YES_100_1234-22-33.csv', 'NO_1000_1023-22-33.csv', 'no_abc.csv']
game_id=[100,200,300,400]
rx=re.compile(r'(?<!\d)(?:%s)_\d{4}-\d{2}-\d{2}\.csv$'%"|".join(map(str,game_id)))
# => (?<!\d)(?:100|200|300|400)_\d{4}-\d{2}-\d{2}\.csv$
files = [f for f in allfiles if rx.search(f)]
print(files) # => ['YES_100_1234-22-33.csv']

The regex is formed like

rx=re.compile(r'(?<!\d)(?:%s)_\d{4}-\d{2}-\d{2}\.csv$'%"|".join(map(str,game_id)))
# => (?<!\d)(?:100|200|300|400)_\d{4}-\d{2}-\d{2}\.csv$

Details

  • (?<!\d) - a negative lookbehind: no digit is allowed immediately before the current position
  • (?:100|200|300|400) - game_id values joined with an alternation operator
  • _\d{4}-\d{2}-\d{2} - _, 4 digits, -, 2 digits, -, 2 digits
  • \.csv$ - .csv and end of the string.
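If you still need to handle the files one ID at a time, as in the question's loop, one possible variation is to turn the non-capturing group into a capturing one and bucket the names by ID. This is only a sketch: the file list is a hypothetical stand-in for os.listdir('.'), and files_by_id is a name introduced here for illustration.

```python
import re
from collections import defaultdict

game_id = [100, 200, 300, 400]
# Hypothetical names standing in for os.listdir('.')
allfiles = [
    'Myfile_100_2018-11-26.csv',
    'Myfile_300_2018-11-27.csv',
    'Myfile_1000_2018-11-26.csv',
    'notes.txt',
]

# re.escape is a precaution in case an ID ever contains regex metacharacters.
alternation = "|".join(re.escape(str(i)) for i in game_id)
# Same pattern as above, but the ID group now captures.
rx = re.compile(r'(?<!\d)(%s)_\d{4}-\d{2}-\d{2}\.csv$' % alternation)

files_by_id = defaultdict(list)
for name in allfiles:
    m = rx.search(name)
    if m:
        files_by_id[int(m.group(1))].append(name)

for gid in game_id:
    # IDs without any matching file simply get an empty list.
    print(gid, files_by_id.get(gid, []))
```

Each list in files_by_id can then be passed, file by file, to whatever processing you need, for example pd.read_csv.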