2

I'm trying to get all files with Excel extensions, so I thought this pattern would select any file that has xls in its name, picking up xls, xlsx, xlsm, etc.

path is a variable holding the folder I'm extracting the files from, and all_files stores the results. Shouldn't the /* match any file that has .xls in it? /*.xlsx or /*.xlsm works fine.

all_files=glob.glob(path + "/*.xls/*")
  • glob.glob("/*.xls") selects xls files only Commented Jan 12, 2018 at 16:31
  • What is the / doing? Commented Jan 12, 2018 at 16:34

2 Answers

1

You want all files that have .xls in their names, and you are using the glob pattern:

/*.xls/*

This matches entries inside directories whose names end in .xls (note the / after .xls), not files whose names contain .xls.

You need:

glob.glob(path + "/*.xls*")

but that is not precise, since it also matches any file whose name merely contains .xls, e.g. foo.xlsbar.
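
To see the over-matching concretely, here is a small sketch using fnmatch, the standard-library module that applies the same shell-style wildcards to plain strings. The file names are made up purely for illustration.

```python
import fnmatch

# Hypothetical file names, chosen only to illustrate the matching rules
names = ["a.xls", "b.xlsx", "c.xlsm", "foo.xlsbar", "notes.txt"]

# "*.xls*" accepts anything containing ".xls", including foo.xlsbar
loose = fnmatch.filter(names, "*.xls*")

# "*.xls[xm]" requires exactly one extra character, so plain a.xls is lost
strict = fnmatch.filter(names, "*.xls[xm]")

print(loose)
print(strict)
```

Neither pattern alone selects exactly .xls, .xlsx, and .xlsm, because these wildcards have no way to mark a character as optional.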

The problem is that standard shell globbing is not as flexible as regular expressions (even [] and ? cannot express an optional trailing character), so you can filter the glob results with a regex check afterwards:

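import glob
import re

# Accept only names ending in .xls, .xlsx, or .xlsm
req = re.compile(r'\.xls[xm]?$')
# Glob broadly first, then keep the paths the regex accepts
all_files = [f for f in glob.iglob(path + '/*.xls*') if req.search(f)]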
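
If you would rather avoid a regex, a possible alternative is to compare each file's extension against a fixed set. This is only a sketch: the helper name find_excel_files and the EXCEL_EXTENSIONS set are my own, not from the question, and you may want a different list of extensions.

```python
import glob
import os

# Extensions treated as Excel files here (an assumption; adjust as needed)
EXCEL_EXTENSIONS = {".xls", ".xlsx", ".xlsm"}

def find_excel_files(folder):
    """Return sorted paths in folder whose extension is in EXCEL_EXTENSIONS."""
    candidates = glob.iglob(os.path.join(folder, "*.xls*"))
    # lower() matters where the glob is case-insensitive and may
    # return names such as REPORT.XLSX
    return sorted(
        p for p in candidates
        if os.path.splitext(p)[1].lower() in EXCEL_EXTENSIONS
    )
```

Calling find_excel_files(path) would then give the same kind of list as all_files above, with foo.xlsbar excluded because its extension is .xlsbar.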

0

You have an extra "/" in your expression. To add the wildcard to the end of ".xls" you need:

all_files=glob.glob(path + "/*.xls*")
