11

I'm new to Ruby (being a Java dev) and trying to implement a method (oh, sorry, a function) that would retrieve and yield all files in the subdirectories recursively.

I've implemented it as:

def file_list_recurse(dir)
  Dir.foreach(dir) do |f|
    next if f == '.' or f == '..'
    f = dir + '/' + f
    if File.directory? f
      file_list_recurse(File.absolute_path f) { |x| yield x }
    else
      file = File.new(f)
      yield file
    end
  end
end

My questions are:

  1. Does File.new really OPEN a file? In Java new File("xxx") doesn't... If I need to yield some structure from which I could query file info (ctime, size, etc.), what would it be in Ruby?
  2. { |x| yield x } looks a little strange to me. Is it OK to yield from recursive functions like that, or is there some way to avoid it?
  3. Is there any way to avoid checking for '.' and '..' on each iteration?
  4. Is there a better way to implement this?

Thanks

PS: the sample usage of my method is something like this:

curr_file = nil

file_list_recurse('.') do |file|
  curr_file = file if curr_file == nil or curr_file.ctime > file.ctime
end

puts curr_file.to_path + ' ' + curr_file.ctime.to_s

(that would get you the oldest file from the tree)

==========

So, thanks to @buruzaemon I found out about the great Dir.glob method, which saved me a couple of lines of code. Also, thanks to @Casper I found out about the File.stat method, which made my function run twice as fast as with File.new.

In the end my code is looking something like this:

i=0
curr_file = nil

Dir.glob('**/*', File::FNM_DOTMATCH) do |f|
  file = File.stat(f)
  next unless file.file?
  i += 1
  curr_file = [f, file] if curr_file == nil or curr_file[1].ctime > file.ctime
end

puts curr_file[0] + ' ' + curr_file[1].ctime.to_s
puts "total files #{i}"

=====

By default Dir.glob ignores file names starting with a dot (considered 'hidden' on *nix), so it's very important to pass File::FNM_DOTMATCH as the second argument if hidden files should be included.
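To see the difference the flag makes, here is a minimal sketch that builds a throwaway tree and globs it both ways. The helper name dotmatch_demo and the file names are made up for illustration; only Dir.glob and File::FNM_DOTMATCH come from the code above.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: builds a temp tree with one visible and one hidden
# file, then returns the regular files found without and with FNM_DOTMATCH.
def dotmatch_demo
  Dir.mktmpdir do |root|
    FileUtils.mkdir_p(File.join(root, 'sub'))
    File.write(File.join(root, 'sub', 'visible.txt'), 'a')
    File.write(File.join(root, 'sub', '.hidden'), 'b')

    Dir.chdir(root) do
      plain  = Dir.glob('**/*').select { |f| File.file?(f) }
      dotted = Dir.glob('**/*', File::FNM_DOTMATCH).select { |f| File.file?(f) }
      [plain.sort, dotted.sort]
    end
  end
end

plain, dotted = dotmatch_demo
puts "without FNM_DOTMATCH: #{plain.inspect}"
puts "with FNM_DOTMATCH:    #{dotted.inspect}"
```

Only the second list should contain sub/.hidden. The File.file? filter matters here because, depending on the Ruby version, the dot-matching glob may also return entries such as "." itself.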

5 Answers

13

How about this?

puts Dir['**/*.*']

2 Comments

That's great! But it produces an Array of String objects. What I'm looking for is a function that would yield a File-like structure so that I could do my own calculations based on that: finding the biggest file, the earliest ctime, etc.
Dir['.'] doesn't accept a block. But Dir.glob does! It answers my questions, except for question #1
6

According to the docs, File.new does open the file. You might want to use File.stat instead, which gathers file-related stats into a queryable object. But note that the stats are gathered at the point of creation, not when you call query methods like ctime.

Example:

Dir['**/*'].select { |f| File.file?(f) }.map { |f| File.stat(f) }
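The "gathered at point of creation" part can be checked with a small sketch like the one below. The helper name stat_snapshot_demo and the file name are invented for the demo; it only relies on File.stat, File.write and Dir.mktmpdir.

```ruby
require 'tmpdir'

# Hypothetical helper: stats a file, grows it, stats it again, and returns
# the size reported by the old and the new File::Stat object.
def stat_snapshot_demo
  Dir.mktmpdir do |dir|
    path = File.join(dir, 'sample.txt')
    File.write(path, 'abc')
    before = File.stat(path)      # snapshot taken while the file has 3 bytes
    File.write(path, 'abcdefgh')  # file now has 8 bytes
    after = File.stat(path)       # a fresh snapshot
    [before.size, after.size]
  end
end

old_size, new_size = stat_snapshot_demo
puts "old snapshot: #{old_size} bytes, fresh snapshot: #{new_size} bytes"
```

The first object keeps reporting the old size, so call File.stat again if you need current values.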

1 Comment

File.stat ironically doesn't provide the name of the file, so I can't use it as a data object to return from my method. Also, I have a tree of 200,000 files. Running your example results in the ruby process growing above 60 MB, while running my method (even with File.new) never makes ruby go above 6 MB. (I'm testing with watch -n 0,1 "ps ax -o comm,rss|grep ruby >> /tmp/q"). But your sample line of code indeed looks cool ;-)
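One possible way around both points in this comment is to pair the path with its stat in a small value object and yield entries one at a time instead of building a big array. This is only a sketch: FileEntry and each_file_entry are hypothetical names, it assumes Ruby 2.5 or newer for Dir.each_child, and it uses File.lstat so symlinks are not followed (a choice made here, not something the thread requires).

```ruby
# Hypothetical value object: a path plus its File::Stat snapshot.
FileEntry = Struct.new(:path, :stat)

# Yields a FileEntry for every regular file under dir, one at a time.
# The explicit &block is forwarded on recursion instead of { |x| yield x }.
def each_file_entry(dir, &block)
  # Without a block, hand back a lazy Enumerator over the same entries.
  return enum_for(:each_file_entry, dir) unless block

  Dir.each_child(dir) do |name|   # each_child already skips '.' and '..'
    path = File.join(dir, name)
    stat = File.lstat(path)       # lstat: do not follow symlinks
    if stat.directory?
      each_file_entry(path, &block)
    elsif stat.file?
      block.call(FileEntry.new(path, stat))
    end
  end
end
```

With that in place, the oldest file would be something like each_file_entry('.').min_by { |e| e.stat.ctime }, and since entries are produced one by one, memory use should stay close to the original recursive version.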
6

This thing tells me to consider accepting an answer; I hope it won't mind me answering it myself:

i=0
curr_file = nil

Dir.glob('**/*', File::FNM_DOTMATCH) do |f|
  file = File.stat(f)
  next unless file.file?
  i += 1
  curr_file = [f, file] if curr_file == nil or curr_file[1].ctime > file.ctime
end

puts curr_file[0] + ' ' + curr_file[1].ctime.to_s
puts "total files #{i}"

3

You could use the built-in Find module's find method.
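A rough sketch of what that could look like for the "oldest file" case from the question. The method name oldest_file is made up; Find.find comes from the standard library and yields every path under the starting directory (hidden entries included), so no '.' and '..' check is needed. File.lstat is used here so symlinks are not followed, which is an assumption about what you want.

```ruby
require 'find'

# Hypothetical helper: returns [path, stat] for the regular file with the
# earliest ctime under dir, or nil when the tree holds no regular files.
def oldest_file(dir)
  oldest = nil
  Find.find(dir) do |path|
    stat = File.lstat(path)
    next unless stat.file?
    oldest = [path, stat] if oldest.nil? || stat.ctime < oldest[1].ctime
  end
  oldest
end
```

Usage would be along the lines of path, stat = oldest_file('.'). Calling Find.prune inside the block skips the directory currently being visited, which is handy for things like .git.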

1

If you are on Windows, see my answer linked below for a much faster (~26 times) way than the standard Ruby Dir. If you use mtime, it's still going to be way faster.

If you use another OS you could use the same technique. I'm curious whether the gain would be as big, but I'm almost certain there would be one.

How to find the file path of file that is not the current file in ruby
