I have a folder that contains 4 text files. I want to write code that checks the sizes of the files in my folder and opens only those that have equal sizes. Does anyone have an idea how to do this?

I have already tried this:

import os


d=os.stat('H:/My Documents/211').st_size
  • It's good that you've tried something. Presumably it didn't do what you wanted. What did it do? What did you expect it to do? Commented Nov 28, 2013 at 13:19
  • Printing d shows a zero. I want to compare the sizes of the files and open those that have equal sizes. Commented Nov 28, 2013 at 13:20

2 Answers

I can't reproduce your error. This

import os
print(os.path.getsize('mydata.csv'))
print(os.stat('mydata.csv').st_size)

yields

359415
359415

I'm guessing that the filename you provide is wrong. This will print the size of every file in a folder:

my_dir = r'path/to/subdir/'

for f in os.listdir(my_dir):
    path = os.path.join(my_dir, f)
    # Skip subdirectories; only report regular files.
    if os.path.isfile(path):
        print(os.path.getsize(path))
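
One possible explanation for the zero in the question, assuming 'H:/My Documents/211' is the folder itself rather than a file: os.stat on a directory reports a platform-dependent value for the directory, not the sizes of the files inside it. A minimal sketch that makes a wrong path fail loudly and returns per-file sizes (file_sizes is a hypothetical helper name, not part of the os module):

```python
import os


def file_sizes(directory):
    """Return {file name: size in bytes} for the regular files in a directory.

    Hypothetical helper for illustration. It raises ValueError when the
    path is not a directory, so a wrong path does not silently produce a
    misleading number.
    """
    if not os.path.isdir(directory):
        raise ValueError("expected a directory, got: %r" % (directory,))
    sizes = {}
    for name in os.listdir(directory):
        full_path = os.path.join(directory, name)
        # Skip subdirectories; only regular files are measured.
        if os.path.isfile(full_path):
            sizes[name] = os.path.getsize(full_path)
    return sizes
```

Calling file_sizes(my_dir) gives a dictionary that can be inspected before deciding which files to open.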

1 Comment

Thank you, the second one worked for me. Do you have any idea how I can code it so that only the files with equal sizes are opened?

To get all of the files in a directory, you can use os.listdir.

>>> import os
>>> basedir = 'tmp/example'
>>> names = os.listdir(basedir)
>>> names
['a', 'b', 'c']

Then you need to add basedir on to the names:

>>> paths = [os.path.join(basedir, name) for name in names]
>>> paths
['tmp/example/a', 'tmp/example/b', 'tmp/example/c']

Then you can turn that into a list of (path, size) pairs using os.stat(path).st_size (the example files I've created are empty):

>>> sizes = [(path, os.stat(path).st_size) for path in paths]
>>> sizes
[('tmp/example/a', 0), ('tmp/example/b', 0), ('tmp/example/c', 0)]

Then you can group the paths with the same size together by using a collections.defaultdict:

>>> import collections
>>> grouped = collections.defaultdict(list)
>>> for path, size in sizes:
...     grouped[size].append(path)
... 
>>> grouped
defaultdict(<type 'list'>, {0: ['tmp/example/a', 'tmp/example/b', 'tmp/example/c']})

Now you can get all of the files by size, and open them all (don't forget to close them afterwards!):

>>> open_files = [open(path) for path in grouped[0]]
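
The steps above can be combined into one sketch that opens only the files whose size is shared with at least one other file, and closes each file automatically. The function names group_files_by_size and read_equal_sized_files are hypothetical, chosen for illustration, and reading the whole file into memory is an assumption about what "open" should do here:

```python
import collections
import os


def group_files_by_size(directory):
    """Map each file size to the list of file paths that have that size."""
    grouped = collections.defaultdict(list)
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            grouped[os.path.getsize(path)].append(path)
    return grouped


def read_equal_sized_files(directory):
    """Return {path: contents} for files that share their size with another file."""
    contents = {}
    for size, paths in group_files_by_size(directory).items():
        if len(paths) < 2:
            continue  # a size that occurs only once has no match
        for path in paths:
            # The with-block closes each file as soon as it has been read.
            with open(path) as handle:
                contents[path] = handle.read()
    return contents
```

Using a with-block instead of a list of open file objects means nothing has to be closed by hand afterwards.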

2 Comments

Thanks for your help. How can I put the grouped items in a list and write it to a CSV file so that it is accessible for later use?
That depends on what you want in the CSV file. Rows with size,filename1,filename2,filename3?
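
Assuming the row layout suggested in the comment above (the size first, then every file name with that size), a minimal sketch for saving and reloading the grouped dictionary could look like this. It assumes Python 3 (on Python 2 the file would be opened in 'wb' mode without the newline argument), and write_groups_to_csv and read_groups_from_csv are hypothetical names:

```python
import csv


def write_groups_to_csv(grouped, csv_path):
    """Write one row per size: the size, then every path with that size."""
    # newline='' is what the csv module expects for files on Python 3.
    with open(csv_path, 'w', newline='') as handle:
        writer = csv.writer(handle)
        for size in sorted(grouped):
            writer.writerow([size] + list(grouped[size]))


def read_groups_from_csv(csv_path):
    """Rebuild the {size: [paths]} mapping from a file written above."""
    grouped = {}
    with open(csv_path, newline='') as handle:
        for row in csv.reader(handle):
            if row:  # ignore blank lines
                grouped[int(row[0])] = row[1:]
    return grouped
```

Rows have different lengths in this layout, which csv.reader handles without complaint; a spreadsheet-style layout with one (size, path) pair per row would be an equally reasonable choice.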
