1
http://something.com/bOhxBeD,SyhyTGi,TMDDSIB,U72gx2J,kQTIRy9,7VXgGDw,eSxIcK6,S5oNlnn,WBHHsLk,BdMGd2d,U9kNlsF,cHVyc7Y,D83kaJ5,cLWgdSO,iWtCIF3,ount8L6

I have tried to extract the values bOhxBeD, SyhyTGi, and so on. This is what I came up with (yes, fairly simple): /([a-zA-Z0-9]{7})/. It seems to work with PCRE:

([a-zA-Z0-9]{7})


But when it comes to Ruby, I use it like this:

str.match(/([a-zA-Z0-9]{7})/)
#<MatchData "bOhxBeD" 1:"bOhxBeD">

It doesn't seem to work: I only get the first value back. Can anyone point out what's wrong with this regex? Thanks.

2
  • 1
    Not an answer to your actual question, but this may be better suited to methods other than regexes, e.g. require 'uri'; URI(str).path[1..-1].split(',') Commented Aug 27, 2014 at 6:43
  • 1
    @TimPeters This is a good answer too, thanks. But somehow, when I look at this kind of thing, I think of regex anyway, so I try hard to learn it properly. Still, a nice solution :) Commented Aug 27, 2014 at 6:45

5 Answers

3

You need to add the word boundary \b on both sides in order to match runs of exactly 7 alphanumeric characters, and use scan rather than match to collect every occurrence.

\b[a-zA-Z0-9]{7}\b


irb(main):006:0> "http://something.com/bOhxBeD,SyhyTGi,TMDDSIB,U72gx2J,kQTIRy9,7VXgGDw,eSxIcK6,S5oNlnn,WBHHsLk,BdMGd2d,U9kNlsF,cHVyc7Y,D83kaJ5,cLWgdSO,iWtCIF3,ount8L6".scan(/\b([a-zA-Z0-9]{7})\b/)
=> [["bOhxBeD"], ["SyhyTGi"], ["TMDDSIB"], ["U72gx2J"], ["kQTIRy9"], ["7VXgGDw"], ["eSxIcK6"], ["S5oNlnn"], ["WBHHsLk"], ["BdMGd2d"], ["U9kNlsF"], ["cHVyc7Y"], ["D83kaJ5"], ["cLWgdSO"], ["iWtCIF3"], ["ount8L6"]]
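To see why the boundaries matter, here is a small comparison. It is only a sketch: it assumes a shortened version of the URL from the question, and the variable names loose and strict are illustrative.

```ruby
str = "http://something.com/bOhxBeD,SyhyTGi,TMDDSIB"

# Without \b, scan also returns the first 7 characters of "something".
loose = str.scan(/[a-zA-Z0-9]{7}/)

# With \b on both sides, only standalone 7-character runs match.
# Dropping the capture group makes scan return a flat array of strings.
strict = str.scan(/\b[a-zA-Z0-9]{7}\b/)

p loose   # => ["somethi", "bOhxBeD", "SyhyTGi", "TMDDSIB"]
p strict  # => ["bOhxBeD", "SyhyTGi", "TMDDSIB"]
```

Note that scan returns nested arrays only when the pattern contains a capture group, which is why the irb output above is an array of one-element arrays.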

2
 (?!.*?\/)[a-zA-Z0-9]{7}

It should be this. Otherwise it will also pick up 7-character runs from the rest of the link: "somethi" will be in the answer, and I guess that is not wanted.

1 Comment

It uses a negative lookahead, i.e. (?!...). A match cannot start at any position where the lookahead's pattern matches. Here the lookahead pattern is "anything up to a /", so every position that still has a / ahead of it is rejected. The answer can therefore come only from the remaining string after the last /, which is what was required.
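The effect of the lookahead can be checked with a short sketch, assuming a shortened version of the question's URL and using scan to collect all matches:

```ruby
str = "http://something.com/bOhxBeD,SyhyTGi,TMDDSIB"

# (?!.*?\/) fails at any position that still has a "/" somewhere ahead,
# so matches can only start after the last slash.
ids = str.scan(/(?!.*?\/)[a-zA-Z0-9]{7}/)

p ids  # => ["bOhxBeD", "SyhyTGi", "TMDDSIB"]
```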
2

match only picks up the first match.
You can try scan, which is the global version of match.
You can also use scan with a negated character class [^...] to find the substrings that do not contain specific characters:

str.scan(/[^\/\.\,]+/)[3..-1]   
#=> ["bOhxBeD", "SyhyTGi", "TMDDSIB", "U72gx2J", "kQTIRy9", "7VXgGDw", "eSxIcK6", "S5oNlnn", "WBHHsLk", "BdMGd2d", "U9kNlsF", "cHVyc7Y", "D83kaJ5", "cLWgdSO", "iWtCIF3", "ount8L6"]  

Update:
If you know that the strings between the commas are always 7 characters long, you can use this instead:

   str.scan(/[^\/\.\,]{7}/)[1..-1]
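The difference between match and scan, and the reason for the [1..-1] slice, can be seen in a minimal sketch. It assumes a shortened version of the question's URL; the names first and all are illustrative.

```ruby
str = "http://something.com/bOhxBeD,SyhyTGi,TMDDSIB"

# match stops at the first hit and returns a MatchData object (or nil).
first = str.match(/[^\/.,]{7}/)
p first[0]  # => "somethi"

# scan keeps going and returns every non-overlapping hit as an array.
all = str.scan(/[^\/.,]{7}/)
p all       # => ["somethi", "bOhxBeD", "SyhyTGi", "TMDDSIB"]

# The first element comes from the host name, hence the [1..-1] slice.
p all[1..-1]
```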


1

This happens because your regexp matches just one element of 7 characters and nothing more. A simple solution is to capture everything after the last slash and split it on commas:

str.match(/\/([^\/]*)\z/)[1].split(',')
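One thing worth checking with a pattern that is anchored only at the end: Ruby tries the leftmost starting position first, so a capture that is allowed to contain slashes can begin at the first / of http://. A minimal sketch, assuming a shortened version of the question's URL (whole and last are illustrative names):

```ruby
str = "http://something.com/bOhxBeD,SyhyTGi,TMDDSIB"

# (.*) may contain slashes, so the match starts at the first "/" of "http://".
whole = str.match(/\/(.*)\z/)[1]
p whole  # => "/something.com/bOhxBeD,SyhyTGi,TMDDSIB"

# [^\/]* cannot cross a slash, so only the part after the last "/" is captured.
last = str.match(/\/([^\/]*)\z/)[1]
p last   # => "bOhxBeD,SyhyTGi,TMDDSIB"

p last.split(',')
```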


1

You could use String#[] and String#split:

str[/.*\/(.*)/,1].split(',')
  #=> ["bOhxBeD", "SyhyTGi", "TMDDSIB", "U72gx2J", "kQTIRy9", "7VXgGDw",
  #    "eSxIcK6", "S5oNlnn", "WBHHsLk", "BdMGd2d", "U9kNlsF", "cHVyc7Y",
  #    "D83kaJ5", "cLWgdSO", "iWtCIF3", "ount8L6"]

.*\/ in the regex, "greedy" as it is, will consume characters up to and including the last forward slash in the string. Capture group #1, (.*), takes the remainder of the string, and the second argument 1 makes String#[] return that group. split(',') then breaks up the string to give you the desired array.

Another way:

str[str[/.*\//].size..-1].split(',')
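The role of greediness in both variants can be illustrated with a short sketch, assuming a shortened version of the question's URL (prefix, lazy, and tail are illustrative names):

```ruby
str = "http://something.com/bOhxBeD,SyhyTGi,TMDDSIB"

# Greedy .* runs to the end of the string, then backtracks to the last "/".
prefix = str[/.*\//]
p prefix  # => "http://something.com/"

# A lazy .*? stops at the first "/" instead, which is why greediness matters here.
lazy = str[/.*?\//]
p lazy    # => "http:/"

# With a second argument, String#[] returns that capture group.
tail = str[/.*\/(.*)/, 1]
p tail.split(',')  # => ["bOhxBeD", "SyhyTGi", "TMDDSIB"]

# Slicing from the length of the greedy prefix gives the same remainder.
p str[prefix.size..-1] == tail  # => true
```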
