
I have an Excel file that I download from an internal tool. Each download gets a unique tag appended to the file name, e.g. 'Development_168293048.csv', 'Development_38734023.csv' or 'Development_168325435.csv'. How do I read this file in Python? The tags follow no pattern; they are randomly generated.

Any leads would be appreciated.

2 Answers


You could use glob to perform filename pattern matching, where * matches zero or more characters within a single path segment. The following code finds CSV files that match the naming format in your example, in the directory given by csv_dir:

import glob
import os

csv_dir = r"C:\CSVs"
csv_files = glob.glob(os.path.join(csv_dir, "Development_*.csv"))

If you know there will only ever be one matching CSV, you can take the first element of the list:

csv_file = csv_files[0]

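Otherwise, it may be best to sort by creation time and select the newest. Note that os.path.getctime returns the creation time on Windows but the last metadata-change time on Unix, where os.path.getmtime is usually the better key, e.g.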

csv_files.sort(key=os.path.getctime)
csv_file = csv_files[-1]

Having obtained the path to the CSV, you can now read it in whichever way is most appropriate for your needs, for example, by using the csv package: https://docs.python.org/3/library/csv.html
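The steps above can be combined into one helper. This is only a minimal sketch: the function name read_latest_csv and its parameters are hypothetical, and it picks the newest file by modification time (os.path.getmtime) rather than creation time, which is an assumption about what "newest" should mean for your downloads.

```python
import csv
import glob
import os


def read_latest_csv(csv_dir, pattern="Development_*.csv"):
    """Return (path, rows) for the most recently modified file matching pattern.

    Hypothetical helper for illustration; adjust the pattern to your naming scheme.
    """
    matches = glob.glob(os.path.join(csv_dir, pattern))
    if not matches:
        # Fail loudly instead of raising an IndexError on an empty list
        raise FileNotFoundError(f"No file matching {pattern!r} in {csv_dir!r}")

    # Pick the file with the latest modification time
    newest = max(matches, key=os.path.getmtime)

    # newline="" is what the csv module documentation recommends for file objects
    with open(newest, newline="") as f:
        rows = list(csv.reader(f))
    return newest, rows
```

Calling read_latest_csv(r"C:\CSVs") would then give you both the path that was picked and the parsed rows as lists of strings.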


If there are multiple files, you could iterate through the directory and collect each file that matches a certain criterion. One possible criterion is that the name starts with "Development_" and ends with ".csv", which you could test with Python's regular expression module (re).

import os
import re

path = r'Your file path'
values = []
# \w already covers digits, and fullmatch() anchors both ends, so ^ and $ are not needed
criteria = r'Development_\w*\.csv'
test = re.compile(criteria)

for root, _, files in os.walk(path):
    for file in files:
        # Our test is below - can be modified
        # IMPORTANT! strip newlines and spaces below
        if test.fullmatch(file.strip(' \n')) is not None:
            # Handle filename matches
            values.append(os.path.join(root, file))

# Now use filepaths found in files
print(values)

This also handles files in folders under the root directory. Using a list is optional; you could handle the file directly inside the loop if needed.

If you only need one file, but its name changes each time the script is run, the easiest way is still to find the filename as above. The only difference is that you can terminate the loop as soon as one suitable file is found, so no list is needed.
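A minimal sketch of that early-exit variant might look like the following. The function name find_first_match and its default pattern are hypothetical; note that which file counts as "first" depends on the order in which os.walk visits the directories, so this only makes sense when a single match is expected.

```python
import os
import re


def find_first_match(root_dir, pattern=r"Development_\w*\.csv"):
    """Return the path of the first matching file under root_dir, or None.

    Hypothetical helper for illustration; the pattern is an assumption
    based on the filenames in the question.
    """
    regex = re.compile(pattern)
    for root, _, files in os.walk(root_dir):
        for name in files:
            # fullmatch() requires the whole filename to match the pattern
            if regex.fullmatch(name):
                # Returning here ends both loops as soon as a file is found
                return os.path.join(root, name)
    return None
```

Returning None when nothing matches lets the caller decide whether a missing file is an error.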
