I have a method to capture the extension as a group using a regex:

def test(str)
  word_match = str.match(/\.(\w*)/)
  word_scan = str.scan(/\.(\w*)/)
  puts word_match, word_scan
end

test("test.rb")

It prints:

.rb
rb

Why would I get a different answer?

Don't reinvent the wheel. Ruby has well-tested code to handle this. (Commented Apr 30, 2020 at 3:57)

2 Answers

The reason is that match and scan return different objects. match returns a MatchData object (or nil when there is no match), while scan returns an Array. You can see this by calling the class method on your variables:

puts word_match.class # => MatchData
puts word_scan.class  # => Array

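If you take a look at the to_s method on MatchData, you'll notice it returns the entire matched string rather than the captures, which is why puts word_match prints .rb. scan, by contrast, returns only the captures when the pattern contains groups, and puts prints each element of the flattened array, which is why you see rb. If you want just the captures from match, use the captures method: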

puts word_match.captures # prints rb (captures returns ["rb"])
puts word_match.captures.class # => Array
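
To make the difference concrete, here is a minimal sketch of what each call returns and what puts does with it. The variable names (filename, match_result, scan_result) are only illustrative and are not taken from the question.

```ruby
filename = "test.rb"

match_result = filename.match(/\.(\w*)/)  # MatchData, or nil when nothing matches
scan_result  = filename.scan(/\.(\w*)/)   # Array with one sub-array of captures per match

match_result.to_s      # => ".rb"  the whole matched text, which is what puts prints
match_result[0]        # => ".rb"  index 0 is also the whole match
match_result[1]        # => "rb"   index 1 is the first capture group
match_result.captures  # => ["rb"]

scan_result            # => [["rb"]]
scan_result.flatten    # => ["rb"]  puts flattens nested arrays before printing

puts match_result      # prints .rb
puts scan_result       # prints rb
```

So the regex is the same in both cases; only the object handed to puts differs.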

If you pass a block to the match method, match returns the block's value instead of the MatchData object, so you can get just the captures, much as scan does:

word_match = str.match(/\.(\w*)/) { |m| m.captures } # => ["rb"]
puts word_scan.inspect  # prints [["rb"]]
puts word_match         # prints rb

More information on these methods and how they work can be found in the ruby-doc for the String class.
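
As a rough illustration of the block form of match, the sketch below wraps it in a helper. ext_captures is a hypothetical name used only for this example; the point is that the block's value is returned on a match and nil is returned otherwise.

```ruby
# ext_captures is a hypothetical helper name used only for this illustration.
def ext_captures(filename)
  # With a block, match returns the block's value on success.
  # When the pattern does not match, the block is skipped and nil is returned.
  filename.match(/\.(\w*)/) { |m| m.captures }
end

p ext_captures("test.rb")  # prints ["rb"]
p ext_captures("test")     # prints nil
```

Because the no-match case yields nil rather than an empty array, callers that chain further methods on the result may want to guard for nil.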

Don't write your own code for this; take advantage of Ruby's built-in code:

File.extname("test.rb")         # => ".rb"
File.extname("a/b/d/test.rb")   # => ".rb"
File.extname(".a/b/d/test.rb")  # => ".rb"
File.extname("foo.")            # => "." on non-Windows (Ruby 2.7+), "" on Windows
File.extname("test")            # => ""
File.extname(".profile")        # => ""
File.extname(".profile.sh")     # => ".sh"

You're missing some cases. Compare the above to the output of your attempts:

fnames = %w[
  test.rb
  a/b/d/test.rb
  .a/b/d/test.rb
  foo.
  test
  .profile
  .profile.sh
]

fnames.map { |fn|
  fn.match(/\.(\w*)/).to_s 
}
# => [".rb", ".rb", ".a", ".", "", ".profile", ".profile"]

fnames.map { |fn|
  fn.scan(/\.(\w*)/).to_s  
}
# => ["[[\"rb\"]]",
#     "[[\"rb\"]]",
#     "[[\"a\"], [\"rb\"]]",
#     "[[\"\"]]",
#     "[]",
#     "[[\"profile\"]]",
#     "[[\"profile\"], [\"sh\"]]"]

The documentation for File.extname says:

Returns the extension (the portion of file name in path starting from the last period).

If path is a dotfile, or starts with a period, then the starting dot is not dealt with the start of the extension.

An empty string will also be returned when the period is the last character in path.

On Windows, trailing dots are truncated.

The File class has many more useful methods to pick apart filenames. There's also the Pathname class which is very useful for similar things.
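
For instance, a short sketch of a few of those File and Pathname methods is shown below. The sample path is just an illustration; all of the calls are part of Ruby's standard library.

```ruby
require "pathname"

path = "a/b/d/test.rb"

File.dirname(path)          # => "a/b/d"
File.basename(path)         # => "test.rb"
File.basename(path, ".rb")  # => "test"  strips the given suffix
File.basename(path, ".*")   # => "test"  strips any extension
File.split(path)            # => ["a/b/d", "test.rb"]

pn = Pathname.new(path)
pn.extname                  # => ".rb"
pn.basename.to_s            # => "test.rb"
pn.dirname.to_s             # => "a/b/d"
pn.sub_ext(".txt").to_s     # => "a/b/d/test.txt"  swaps the extension
```

Pathname wraps the same operations in an object, which can be convenient when several of them are chained on one path.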
