Getting number from the string using regexp

Question

I have the following HTML line:

<b>String :</b></b></td><td class="title">14</td>

I'm trying to parse it in order to get only the number. It looks simple, but "s/^.*\(:digit:\).*$/\1/" returns the whole line. I also tried "s/^.*\(\d+\).*$/\1/", but it returns the same result.

If I try "s/^.*String.*>\(.*\)<.*$/\1/", it returns what is needed, but "s/^.*String.*>\(\d+\)<.*$/\1/" again returns the whole line.

Is it possible to get the number from the string here by using a group that matches only digits?

Edit: I need it for Java. The example here is just for getting a working regular expression, which I test using the sed command.

Thank you.

It’s rather a language that uses POSIX BRE/GNU BRE (since () are escaped).
– Gumbo, Commented Nov 24, 2010 at 19:10
Friends don't let friends parse HTML with regular expressions. Use an HTML parser instead.
– Ether, Commented Nov 24, 2010 at 19:13
@Ether: Don't be stupid. He's not parsing HTML, he's extracting a number. Friends don't let friends do cargo-cult programming either.
– Mike Caron, Commented Nov 24, 2010 at 19:16
@Mike, if he's extracting a number, he's likely extracting other things as well. Pretty soon it's what one might call parsing.
– Mark Thomas, Commented Nov 24, 2010 at 19:23
@Mark: Maybe. But, that's neither here nor there, since the OP is not doing anything like the question @Ether posted.
– Mike Caron, Commented Nov 24, 2010 at 19:26

Sinan Ünür · Accepted Answer · 2010-11-24 19:13:16Z

3

Use HTML::TableExtract.

answered Nov 24, 2010 at 19:13


theChrisKent · Accepted Answer · 2010-11-24 19:14:06Z

0

In JavaScript you can do this:

var num = parseInt(someString.replace(/\D/g, ''), 10);
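Since the question asks for Java, here is a minimal sketch of the same strip-everything-but-digits idea. The class and method names (`DigitsOnly`, `digitsOnly`) are made up for illustration and are not part of the original answer.

```java
final class DigitsOnly {
    // Hypothetical helper: delete every non-digit character (\D),
    // then parse whatever digits remain as one integer.
    static int digitsOnly(String s) {
        return Integer.parseInt(s.replaceAll("\\D", ""));
    }

    public static void main(String[] args) {
        String line = "<b>String :</b></b></td><td class=\"title\">14</td>";
        System.out.println(digitsOnly(line)); // prints 14
    }
}
```

Note that this approach concatenates every digit in the input, so markup that itself contains digits (for example `<h2>14</h2>`) yields 2142 rather than 14, and an input with no digits at all makes `Integer.parseInt` throw a `NumberFormatException`.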

answered Nov 24, 2010 at 19:14


The Archetypal Paul · Accepted Answer · 2010-11-24 19:16:19Z

0

I think you have a slightly peculiar regex implementation. What's the environment?

   s/^[^\d]*\(\d+\)<[^\d]*$/\1/

It has to be worth a go, though. First check whether the set pattern needs \[ or [, and whether it allows character classes (\d). If there are no character classes, 0-9 should do it.
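For the Java case mentioned in the question, the same substitution might look like the sketch below. It assumes Java's `String.replaceAll`, where capturing parentheses are written unescaped and the replacement refers to the group as `$1` rather than `\1`. The class and method names are hypothetical.

```java
final class SedStyleReplace {
    // Hypothetical helper: non-digits, then a captured run of digits,
    // then "<" and non-digits up to the end of the line.
    // The whole line is replaced by the captured group.
    static String keepNumber(String line) {
        return line.replaceAll("^[^\\d]*(\\d+)<[^\\d]*$", "$1");
    }

    public static void main(String[] args) {
        String line = "<b>String :</b></b></td><td class=\"title\">14</td>";
        System.out.println(keepNumber(line)); // prints 14
    }
}
```

As with sed, a line that does not match the pattern is returned unchanged, so an unchanged result is a sign that the pattern failed to match rather than that the group captured too much.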

answered Nov 24, 2010 at 19:16


cristian · Accepted Answer · 2010-11-24 19:23:35Z

0

The regex (?:<(?:[^>])+>)(\d+)(?:(?:<\/[^>]+)+>) captures only the numbers in your text that sit between HTML tags.
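A minimal Java sketch of using that pattern follows. The backslashes are doubled because the regex lives inside a Java string literal; the class, constant, and method names are invented for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class BetweenTags {
    // The pattern from this answer, escaped for a Java string literal.
    private static final Pattern NUMBER_BETWEEN_TAGS =
            Pattern.compile("(?:<(?:[^>])+>)(\\d+)(?:(?:<\\/[^>]+)+>)");

    // Hypothetical helper: returns the first digit run that sits directly
    // between an opening tag and a closing tag, or null if there is none.
    static String numberBetweenTags(String html) {
        Matcher m = NUMBER_BETWEEN_TAGS.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String line = "<b>String :</b></b></td><td class=\"title\">14</td>";
        System.out.println(numberBetweenTags(line)); // prints 14
    }
}
```

Only group 1 is a capturing group here; the `(?:...)` groups around the tags are non-capturing, so they take part in the match but do not appear as numbered groups.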

edited Nov 24, 2010 at 19:23

answered Nov 24, 2010 at 19:17


Mike Caron · Accepted Answer · 2010-11-25 08:59:20Z

0

Although you don't explain what language you're using, the answer is simple.

When your pattern contains capturing groups (parentheses), a match produces multiple results.

The first one, #0, is always the whole match. Since you have .* before and after the digits, the extra HTML is included in that result.

However, in the second result, group #1, you should have only the number. The way to retrieve this result varies depending on the language, but if you update your question, we may be able to help you in that regard.

Edit:

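import java.util.regex.Matcher;
import java.util.regex.Pattern;

public static Integer extractNumber(String input) {
    // A Java pattern takes no sed-style s/.../ delimiters.
    Pattern p = Pattern.compile("(\\d+)");

    Matcher m = p.matcher(input);

    // find() locates the first run of digits anywhere in the input.
    if (m.find()) {
        String num = m.group(1);
        return Integer.parseInt(num);
    }

    return null; // no digits found
}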

This will extract the first number it finds in the input text. And, it demonstrates how to use groups as well.

I haven't tested it since I don't have a proper java environment set up at the moment, but it looks okay. Let me know if you have any problems.
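To make the difference between group 0 and group 1 concrete, here is a small sketch that applies the anchored pattern from the question, with the parentheses unescaped as Java expects. The `GroupDemo` class and `groups` method are hypothetical names used only for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GroupDemo {
    // Hypothetical helper: returns { group 0, group 1 } when the whole line
    // matches the pattern, or null when it does not.
    static String[] groups(String line) {
        Matcher m = Pattern.compile("^.*String.*>(\\d+)<.*$").matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(0), m.group(1) };
    }

    public static void main(String[] args) {
        String line = "<b>String :</b></b></td><td class=\"title\">14</td>";
        String[] g = groups(line);
        System.out.println(g[0]); // the entire line, because of the leading and trailing .*
        System.out.println(g[1]); // prints 14
    }
}
```

Group 0 spans everything the pattern consumed, which is the full line here, while group 1 holds only what the parentheses matched.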

edited Nov 25, 2010 at 8:59

answered Nov 24, 2010 at 19:15


1 Comment

Mike Caron Over a year ago

@yart: I've updated my post with a method that should help you out.
