ruby regex scan multiple match

Question

I am trying to get the text between two tags.

<b>foo</b>bar<br/> => bar

I tried using 'asdasdqwe '.scan(/<b>[a-zA-Z0-9]*<\/b>(.*)<br\/>/) and it gives me the proper result.

But when I try this:

'<b>exclude</b>op1<br/>exclude 2<b>exclude</b>op2<br/>exclude 2<b>exclude</b>op3<br/>exclude 2'.scan(/<b>[a-zA-Z0-9]*<\/b>(.*)<br\/>/) { |ele|
puts ele
}

It matches from the first <b> tag to the last <br/> tag and returns the whole string as a single match. I was expecting an array of matches.

Related question: stackoverflow.com/questions/1732348/…

– Andrew Grimm, Commented Nov 25, 2011 at 6:56

pguardiario · Accepted Answer · 2011-11-25 08:28:13Z

9

Instead of using a regex on HTML, use Nokogiri:

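require 'nokogiri'

# str holds the HTML string from the question
Nokogiri::HTML.fragment(str).css('b').each do |b|
  puts b.next.text  # text of the node immediately following each <b> element
end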

Dogbert · Accepted Answer · 2011-11-25 06:44:32Z

8

Change (.*) to (.*?) to make it non-greedy, so each match stops at the first <br/> that follows instead of running on to the last one:

/<b>[a-zA-Z0-9]*<\/b>(.*?)<br\/>/

Test

[2] pry(main)> '<b>exclude</b>op1<br/>exclude 2<b>exclude</b>op2<br/>exclude 2<b>exclude</b>op3<br/>exclude 2'.scan(/<b>[a-zA-Z0-9]*<\/b>(.*?)<br\/>/) { |ele|
[2] pry(main)*   puts ele
[2] pry(main)* }  
op1
op2
op3
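
Since the question asked for an array of matches, here is a minimal sketch of calling scan without a block and comparing the greedy and non-greedy patterns. The variable names (html, greedy, lazy, matches) are only illustrative and are not from the original post.

```ruby
# Sample input taken from the question.
html = '<b>exclude</b>op1<br/>exclude 2<b>exclude</b>op2<br/>exclude 2<b>exclude</b>op3<br/>exclude 2'

# Greedy: (.*) runs on to the last <br/>, so the whole string is consumed by one match.
greedy = html.scan(/<b>[a-zA-Z0-9]*<\/b>(.*)<br\/>/)

# Non-greedy: (.*?) stops at the first <br/> after each </b>.
lazy = html.scan(/<b>[a-zA-Z0-9]*<\/b>(.*?)<br\/>/)

# With a capture group, scan returns an array of arrays (one inner array per match),
# so flatten is used here to get a flat array of strings.
matches = lazy.flatten

puts greedy.length
p matches
```

Without a block, scan returns its results instead of yielding them, which is probably the form the question was after.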

1 Comment

Reactormonk Over a year ago

You cannot parse HTML with regex.
