Ruby extract string via regular expression

Question

I have these strings:

'da_report/GY4LFDN6/2017_11/view_mission_join_player_count2017_11/index.html'
'da_report/GY4LFDN6/2017_11/activily_time2017_11/index.html'

From these two strings, I want to extract these two file names:

'2017_11/view_mission_join_player_count2017_11'
'2017_11/activily_time2017_11'

I wrote some regular expressions, but they seem wrong.

str = 'da_report/GY4LFDN6/2017_11/view_mission_join_player_count2017_11/index.html'
str[/([^\/index.html]+)/, 1] # => "a_r"
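
For reference, the attempt above returns "a_r" because square brackets define a character class, not a literal substring: `[^\/index.html]` means "any single character other than /, i, n, d, e, x, ., h, t, m or l". A small sketch of that reading (the variable names `original` and `spelled` are made up for illustration):

```ruby
str = 'da_report/GY4LFDN6/2017_11/view_mission_join_player_count2017_11/index.html'

# The class from the question, and the same class with each member
# written once in alphabetical order. Both behave identically.
original = str[/([^\/index.html]+)/, 1]
spelled  = str[/([^\/.dehilmntx]+)/, 1]

# 'd' is in the excluded set, so the first match starts at 'a' and
# stops before the 'e' of "report".
puts original  # prints a_r
puts spelled   # prints a_r
```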

@CodaChang Even so, it doesn't mean that it would be good practice to hard-code these values, unless you only want to target these types of paths in particular.
– Tim Biegeleisen, Commented Dec 25, 2017 at 7:53

Aleksei Matiushkin · Accepted Answer · 2017-12-25 08:01:14Z

1

A regular expression is overkill here, and is prone to errors.

input = [
  "da_report/GY4LFDN6/" \
  "2017_11/view_mission_join_player_count2017_11" \
  "/index.html",
  "da_report/GY4LFDN6/" \
  "2017_11/activily_time2017_11" \
  "/index.html"
]  

input.map { |str| str.split('/')[2..3].join('/') }
#⇒ [
#   [0] "2017_11/view_mission_join_player_count2017_11",
#   [1] "2017_11/activily_time2017_11"
# ]

Or, more elegantly:

input.map { |str| str.split('/').grep(/2017_/).join('/') }
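
If hard-coding the index range or the `2017_` prefix feels brittle, one possible variation is to drop the file name with `File.dirname` and keep the last two directory components. This is only a sketch: the helper name `extract_report_path` and its `depth` parameter are made up for illustration, and it assumes the wanted part is always the two directories directly above the file.

```ruby
# Hypothetical helper: keeps the last `depth` directory components
# that sit directly above the file name.
def extract_report_path(path, depth = 2)
  File.dirname(path).split('/').last(depth).join('/')
end

input = [
  'da_report/GY4LFDN6/2017_11/view_mission_join_player_count2017_11/index.html',
  'da_report/GY4LFDN6/2017_11/activily_time2017_11/index.html'
]

result = input.map { |str| extract_report_path(str) }
puts result
# prints:
# 2017_11/view_mission_join_player_count2017_11
# 2017_11/activily_time2017_11
```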

edited Dec 25, 2017 at 8:01

answered Dec 25, 2017 at 7:54

Aleksei Matiushkin

5 Comments

rj487 Over a year ago

Oh, I know this way, and it's a great answer. But I just want to try the regex way.

Aleksei Matiushkin Over a year ago

str[%r|(?<=\Ada_report/GY4LFDN6/)\w+/\w+|] (“I just want to use a regexp” sounds a bit silly to me; a regexp is not the way to go here.)

rj487 Over a year ago

Thanks, I will change my way to achieve that.

Tim Biegeleisen Over a year ago

@mudasobwa I disagree with your notion of using regex here. If the OP wanted to filter paths using names, then a string join approach would not be workable. Regex is not prone to error if the person using it knows regex.

Aleksei Matiushkin Over a year ago

@TimBiegeleisen if that was a different task, the different ways to solve it would probably be better. Also, I never said “regexp is prone to errors,” I said “here it’s prone to error.” Also, it’s noticeably slower.

Abdullah · Accepted Answer · 2017-12-25 07:43:42Z

0

Use /(?<=GY4LFDN6\/)(.*)(?=\/index.html)/

str = 'da_report/GY4LFDN6/2017_11/view_mission_join_player_count2017_11/index.html'
str[/(?<=GY4LFDN6\/)(.*)(?=\/index.html)/]
 => "2017_11/view_mission_join_player_count2017_11"

live demo: http://rubular.com/r/Ued6UOXWDf
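
One detail worth noting in the pattern above: the dot in `index.html` is unescaped, so it matches any character. A slightly stricter sketch, assuming the path always ends in `/index.html`, escapes the dot and anchors the lookahead with `\z`; writing the pattern with `%r{}` avoids escaping the slashes. The names `loose`, `strict` and `odd`, and the `indexXhtml` path, are made up for illustration.

```ruby
str = 'da_report/GY4LFDN6/2017_11/view_mission_join_player_count2017_11/index.html'

loose  = /(?<=GY4LFDN6\/)(.*)(?=\/index.html)/    # pattern from the answer
strict = %r{(?<=GY4LFDN6/).*(?=/index\.html\z)}   # escaped dot, anchored end

puts str[strict]  # prints 2017_11/view_mission_join_player_count2017_11

# An invented path where the file name is not really index.html.
odd = 'da_report/GY4LFDN6/2017_11/foo/indexXhtml'
p odd[loose]   # prints "2017_11/foo" (the bare dot matched "X")
p odd[strict]  # prints nil
```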

answered Dec 25, 2017 at 7:43

Abdullah

Tim Biegeleisen · Accepted Answer · 2017-12-25 08:02:50Z

0

This answer assumes that you want to capture beginning with the third component of the path, up to and including the last component of the path before the filename. If so, then we can use the following regex pattern:

(?:[^/]*/){2}(.*)/.*

The second pair of parentheses, (.*), is the capture group, i.e. what you want to extract from the entire path. The leading (?:...) group is non-capturing and only skips the first two path components.

str = 'da_report/GY4LFDN6/2017_11/view_mission_join_player_count2017_11/index.html'
puts str[/(?:[^\/]*\/){2}(.*)\/.*/, 1]
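
As a sketch of the same idea with the pattern anchored to the whole string and the capture given a name, which may read more clearly at the call site. The helper name `report_part` and the group name `report` are made up for illustration, and the shorter test paths are invented.

```ruby
# Hypothetical helper: skip the first two path components, capture
# everything up to the last slash, and ignore the file name.
def report_part(path)
  path[%r{\A(?:[^/]+/){2}(?<report>.+)/[^/]+\z}, 'report']
end

str = 'da_report/GY4LFDN6/2017_11/view_mission_join_player_count2017_11/index.html'

puts report_part(str)                     # prints 2017_11/view_mission_join_player_count2017_11
puts report_part('a/b/c/d/e/index.html')  # prints c/d/e
p report_part('a/b/index.html')           # prints nil (nothing between prefix and file)
```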

edited Dec 25, 2017 at 8:02

answered Dec 25, 2017 at 7:49

Tim Biegeleisen

The fourth bird · Accepted Answer · 2017-12-25 08:02:54Z

0

If you are looking for the values at the end of the string, in the form string/string followed by /filename.extension, you could use a positive lookahead for the file name:

\w+\/\w+(?=\/\w+\.\w+$)
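
In Ruby the pattern above can be applied with `String#[]`, as in the other answers. A minimal sketch; note that in Ruby `$` matches at the end of any line, so `\z` is the stricter choice if the input might contain newlines. The multi-line string is invented to show the difference.

```ruby
pattern = /\w+\/\w+(?=\/\w+\.\w+$)/

first  = 'da_report/GY4LFDN6/2017_11/view_mission_join_player_count2017_11/index.html'
second = 'da_report/GY4LFDN6/2017_11/activily_time2017_11/index.html'

puts first[pattern]   # prints 2017_11/view_mission_join_player_count2017_11
puts second[pattern]  # prints 2017_11/activily_time2017_11

# `$` is satisfied before the newline; `\z` only at the very end.
multiline = "a/b/c.txt\nmore"
p multiline[pattern]                      # prints "a/b"
p multiline[/\w+\/\w+(?=\/\w+\.\w+\z)/]   # prints nil
```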

edited Dec 25, 2017 at 8:02

answered Dec 25, 2017 at 7:51

The fourth bird

Cary Swoveland · Accepted Answer · 2017-12-26 05:35:09Z

0

Based on your examples, you may be able to use a very simple regex.

def extract(str)
  str[/\d{4}_\d{2}.+\d{4}_\d{2}/]
end

extract 'da_report/GY4LFDN6/2017_11/view_mission_join_player_count2017_11/index.html'
  #=> "2017_11/view_mission_join_player_count2017_11"
extract 'da_report/GY4LFDN6/2017_11/activily_time2017_11/index.html'
  #=> "2017_11/activily_time2017_11"
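
Note that this pattern relies on the date token appearing twice in the path, as it does in both examples. A quick sketch of what happens otherwise; the `summary` path is invented for illustration.

```ruby
def extract(str)
  str[/\d{4}_\d{2}.+\d{4}_\d{2}/]
end

with_two_dates = 'da_report/GY4LFDN6/2017_11/activily_time2017_11/index.html'
with_one_date  = 'da_report/GY4LFDN6/2017_11/summary/index.html'

p extract(with_two_dates)  # prints "2017_11/activily_time2017_11"
p extract(with_one_date)   # prints nil (no second date token to end the match)
```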

answered Dec 26, 2017 at 5:35

Cary Swoveland
