Extracting string from jmeter response using regular expression extractor

Question

I'm trying to extract the strings (201 and 202) from the HTML response below. So far I have tried the following regex:

punumber=(.+)

but the problem is that there are many instances of punumber on the page, so it also gets me stuff that I don't need.
The strings I need are inside the <h3 class="content-title"> blocks.

So can someone please help me write a regex to extract the punumber within the h3 class only?

<h3 class="content-title">
<!--  change when this is completed -->
    <a href="/container/recentIssue.jsp?punumber=201">
    Title 1
    </a>
</h3>

<h3 class="content-title">
<!--  change when this is completed -->
    <a href="/container/mostRecentIssue.jsp?punumber=202">
    Title 1
    </a>                                    
</h3>

@Alies Belik - Thanks for the edits, but I had deliberately left the spaces and new lines in the above code because that is what the actual HTML response looks like. Also, the number of additional newlines is different in each <h3 class="content-title"> block.
– bkone, Commented Jan 18, 2013 at 14:32
OK, if you think that's necessary, you can roll back your post to the initial version.
– Aliaksandr Belik, Commented Jan 18, 2013 at 14:42

UBIK LOAD PACK · Accepted Answer · 2013-01-17 21:30:39Z

5

This works for me:

Reference Name : test
Regexp : punumber=([^"]+?)"
Template : $1$
Match No : -1 (this will get all values)

With -1, JMeter will create:

${test_1} => 201
${test_2} => 202
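
To see what this pattern captures outside JMeter, here is a minimal Python sketch run against the sample HTML from the question. It is only an illustration: JMeter uses its own regex engine, and the dictionary name `variables` is a hypothetical stand-in for JMeter's variable store.

```python
import re

# Sample response copied from the question.
html = '''<h3 class="content-title">
<!--  change when this is completed -->
    <a href="/container/recentIssue.jsp?punumber=201">
    Title 1
    </a>
</h3>

<h3 class="content-title">
<!--  change when this is completed -->
    <a href="/container/mostRecentIssue.jsp?punumber=202">
    Title 1
    </a>
</h3>'''

# Same pattern as in the extractor settings above.
matches = re.findall(r'punumber=([^"]+?)"', html)

# Mimic the test_1, test_2, ... variables created with Match No = -1.
# `variables` is a hypothetical name, not a JMeter API.
variables = {f"test_{i}": value for i, value in enumerate(matches, start=1)}
print(variables)
```

Note that the pattern matches every punumber on the page, not only those inside the h3 blocks.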


6 Comments

ant Over a year ago

nice one I like the creation of separate variables +1

bkone Over a year ago

Thanks @PMD UBIK-INGENIERIE - when I use punumber=([^"]+?)" I get about 77 instances of the pattern. What I want are only the ones inside the <h3> tag, and there are 25 instances of those.

UBIK LOAD PACK Over a year ago

You mean you have the same <a href="/container/mostRecentIssue.jsp?punumber=WWW"> both within <h3> and outside it, and you only want the ones within <h3>, right? Then with only one Regexp Extractor it will be rather hard. You could try to first extract the content within h3, then use a second Regexp Extractor and make it work on the variable instead of the sample response. Another option is to use the new JMeter 2.9 feature based on jQuery/CSS selectors; as it's not released yet, you would need a nightly build (read the build instructions). It seems to me more appropriate to your use case.

bkone Over a year ago

@PMD - I used two regex extractors, one to extract just the <h3> and the next one to extract the punumber. The first one looks like <h3 class="content-title">(.+)\n.*\n.*\n.*\n.*\n.*\n.*\n.*\n.* which extracts the whole tag content. Is there a better way to express the regex?

UBIK LOAD PACK Over a year ago

Did you try this : h3 class="content-title">(.+?)</h3
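
The two-step idea discussed in these comments can be sketched in Python as follows. This is an illustration under assumptions, not a JMeter test plan: the extra link outside the h3 blocks (punumber=999) is made up to show the filtering, and the `(?s)` flag is added so that `.` also matches newlines, which the multi-line h3 blocks require.

```python
import re

# Sample response from the question, plus one hypothetical link outside any h3.
html = '''<a href="/container/other.jsp?punumber=999">Elsewhere</a>

<h3 class="content-title">
<!--  change when this is completed -->
    <a href="/container/recentIssue.jsp?punumber=201">
    Title 1
    </a>
</h3>

<h3 class="content-title">
<!--  change when this is completed -->
    <a href="/container/mostRecentIssue.jsp?punumber=202">
    Title 1
    </a>
</h3>'''

# Step 1: capture the content of each h3 block.
# (?s) makes "." match newlines; the lazy (.+?) stops at the first </h3.
blocks = re.findall(r'(?s)h3 class="content-title">(.+?)</h3', html)

# Step 2: extract punumber from each captured block only.
punumbers = [
    number
    for block in blocks
    for number in re.findall(r'punumber=(\d+)', block)
]
print(punumbers)
```

In JMeter the second step would correspond to a Regular Expression Extractor applied to a variable instead of the sample response.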


ant · Accepted Answer · 2013-01-17 21:32:54Z

2

Here is the regex that works for me :

punumber=(\d+)

If you're parsing HTML, you should consider using something other than regex to extract the info, such as jsoup.

Anyway, here is a JMeter test file with a dummy sampler (with a regex post-processor) simulating your case, and a debug sampler that shows the result you want.

http://pastebin.com/Uti8Pv9E
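
As a rough illustration of the parser-based approach, the sketch below uses Python's standard-library `html.parser` as a stand-in; it is not jsoup, which is a Java library. The class name `PunumberParser` and the extra link outside the h3 blocks (punumber=999) are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse, parse_qs


class PunumberParser(HTMLParser):
    """Collect punumber values from links inside <h3 class="content-title">."""

    def __init__(self):
        super().__init__()
        self.in_title = False  # True while inside a matching h3
        self.punumbers = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "h3" and attrs.get("class") == "content-title":
            self.in_title = True
        elif tag == "a" and self.in_title:
            # Read punumber from the query string of the href.
            query = parse_qs(urlparse(attrs.get("href") or "").query)
            self.punumbers.extend(query.get("punumber", []))

    def handle_endtag(self, tag):
        if tag == "h3":
            self.in_title = False


# Sample response from the question, plus one hypothetical link outside any h3.
html = '''<a href="/container/other.jsp?punumber=999">Elsewhere</a>
<h3 class="content-title">
<!--  change when this is completed -->
    <a href="/container/recentIssue.jsp?punumber=201">
    Title 1
    </a>
</h3>
<h3 class="content-title">
<!--  change when this is completed -->
    <a href="/container/mostRecentIssue.jsp?punumber=202">
    Title 1
    </a>
</h3>'''

parser = PunumberParser()
parser.feed(html)
print(parser.punumbers)
```

Because the parser tracks which element it is in, links outside the h3 blocks are ignored without any multi-line regex.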


2 Comments

UBIK LOAD PACK Over a year ago

Your regexp is better than mine :-)

bkone Over a year ago

Thanks @Ant, when I use punumber=(\d+) I get about 77 instances of the pattern. What I want are only the ones inside the <h3> tag, and there are 25 instances of those. I will also give the jsoup stuff a shot. Thanks again.

Aliaksandr Belik · Accepted Answer · 2013-01-18 16:40:54Z

In this case you could combine an XPath Extractor, using a structured query to get the href values containing punumber ONLY from links inside the <h3> tags, with a Regular Expression Extractor in a ForEach Controller loop that then pulls the punumber value out of each href.

. . .
YOUR HTTP REQUEST
    XPath Extractor
        Use Tidy = true
        Reference Name = punum
        XPath Query = //h3[@class="content-title"]/a[text()="Title 1"]/@href
        Default value = NOT_FOUND
ForEach Controller
    Input variable prefix = punum
    Output variable name = pnum
    Add "_" before number = true
    User Parameters
        cnt = ${__counter(FALSE,)}
    Regular Expression Extractor
        Apply to = JMeter Variable = pnum
        Reference Name = punumber_${cnt}
        Regular Expression = punumber=(\d+)
        Template = $1$
        Match No. = 1
        Default value = NOT_FOUND
    ...
. . .

The XPath Extractor gives you the href values of all the <a> items under the <h3> tags as the variables punum_1, punum_2, ..., punum_N.
The ForEach Controller takes the punum_X variables one after another, exposes each as pnum, applies the Regular Expression Extractor to it to get the punumber value, and stores the extracted value as punumber_1, punumber_2, ..., punumber_N (using the counter defined in User Parameters, which is incremented on each step).

NOTE: Since the XPath Extractor is used here to parse an HTML (not XML) response, ensure that the Use Tidy (tolerant parser) option is CHECKED (in the XPath Extractor's control panel).

Same test-plan available here: http://db.tt/dnACZtGL (I've used @ant's one from his answer, thank him).
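
The same flow can be sketched outside JMeter in Python with the standard-library `xml.etree.ElementTree`. Several assumptions apply: the sample is wrapped in a hypothetical, well-formed `<html><body>` document (roughly what a tolerant parser would hand over), the text() predicate from the query above is left out because ElementTree supports only a subset of XPath, and the link outside the h3 blocks (punumber=999) is made up.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical well-formed wrapper around the sample from the question.
xhtml = '''<html><body>
<a href="/container/other.jsp?punumber=999">Elsewhere</a>
<h3 class="content-title">
<!--  change when this is completed -->
    <a href="/container/recentIssue.jsp?punumber=201">
    Title 1
    </a>
</h3>
<h3 class="content-title">
<!--  change when this is completed -->
    <a href="/container/mostRecentIssue.jsp?punumber=202">
    Title 1
    </a>
</h3>
</body></html>'''

root = ET.fromstring(xhtml)

# Step 1 (XPath Extractor): href of every <a> directly under a matching <h3>.
hrefs = [a.get("href") for a in root.findall(".//h3[@class='content-title']/a")]

# Step 2 (ForEach + Regular Expression Extractor): one punumber per href,
# stored under punumber_1, punumber_2, ... with NOT_FOUND as the default.
punumbers = {}
for count, href in enumerate(hrefs, start=1):
    match = re.search(r"punumber=(\d+)", href)
    punumbers[f"punumber_{count}"] = match.group(1) if match else "NOT_FOUND"

print(punumbers)
```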

I found using two regex extractors easy but this one worked as well.
