I am new to Java regex. I would like to know how to extract the integers or decimal numbers that appear before a %. For example:

"Titi 10% Toto and tutu equals 20X"
"Titi 10.50% Toto and tutu equals 20X"
"Titi 10-10.50% Toto and tutu equals 20X"
"Titi 10sd50 % Toto and tutu equals 20X"
"Titi 10-10.50% or 10sd50 % Toto and tutu equals 20X"

Expected output:

10
10.50
10-10.50
10sd50
10-10.50;10sd50

My idea is to replace everything before and after "space + number (followed by % or by a space and %)" with ; in order to extract all values, or groups of values, that precede a %. I tried the following, without success:

replaceAll("[^0-9.]+|\\.(?!\\d)(?!\\b)\\%",";");

How can I do it?

  • Not my downvote, but this might be better handled by a parser. You can start by splitting the string on spaces, then examine each word (possibly using a regex) to see a) whether it is a candidate for retention and, if so, b) strip off the parts you don't want. Commented Jan 16, 2017 at 10:00
  • Thank you. I know this is a trivial task for Java users, but I am not an expert! I would like to learn more about that. I will try your solution. Thanks a lot, Tim. Commented Jan 16, 2017 at 10:07
  • Next time you post, please include the code you have tried. SO is not a free code-writing service, and most folks resent questions which show little or no effort but ask for a complex answer. Commented Jan 16, 2017 at 10:12
  • 10sd50 is not a number, nor is 10-10.50. Commented Jan 16, 2017 at 10:14
  • Why did you replace +/- with sd? Was this done by your program, intentionally or unintentionally? Commented Jan 16, 2017 at 10:35
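
The split-and-inspect approach suggested in the first comment could be sketched roughly as follows. This is only an illustration under assumptions: the class name TokenScanner, the helper looksLikeValue and its deliberately loose rule (a token that starts and ends with a digit) are hypothetical and not part of the original question.

```java
import java.util.ArrayList;
import java.util.List;

class TokenScanner {

    // Hypothetical helper: accept any token that starts and ends with a digit,
    // e.g. "10", "10.50", "10-10.50" or "10sd50". This rule is intentionally loose.
    static boolean looksLikeValue(String token) {
        return token.matches("\\d(?:.*\\d)?");
    }

    // Splits the input on whitespace and keeps the tokens that precede a '%'.
    static String extract(String input) {
        List<String> values = new ArrayList<>();
        String[] tokens = input.trim().split("\\s+");
        for (int i = 0; i < tokens.length; i++) {
            String token = tokens[i];
            String candidate = null;
            if (token.length() > 1 && token.endsWith("%")) {
                // '%' is attached to the token, as in "10.50%"
                candidate = token.substring(0, token.length() - 1);
            } else if (i + 1 < tokens.length && tokens[i + 1].equals("%")) {
                // '%' is the next token, as in "10sd50 %"
                candidate = token;
            }
            if (candidate != null && looksLikeValue(candidate)) {
                values.add(candidate);
            }
        }
        return String.join(";", values);
    }

    public static void main(String[] args) {
        System.out.println(extract("Titi 10-10.50% or 10sd50 % Toto and tutu equals 20X"));
    }
}
```

The trade-off is that the acceptance rule lives in ordinary Java code, which may be easier to adjust than a single large regex, at the cost of more lines.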

2 Answers

This one should do the job:

((?:\d+(?:\+|-|sd))?\d+(?:\.\d+)?\h*%)

Explanation:

(               : start group 1
  (?:           : start non capture group
    \d+         : 1 or more digits
    (?:\+|-|sd) : non capture group that contains + or - or sd
  )?            : end group, optional
  \d+           : 1 or more digits
  (?:           : start non capture group
    \.          : a dot
    \d+         : 1 or more digits
  )?            : end group, optional (so that integers such as 10 also match)
  \h*           : 0 or more horizontal spaces
  %             : character %
)               : end of group 1

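The result will be in group 1. Note that group 1 also includes any trailing spaces and the % itself; close the group before \h*% if you want only the value.

In Java you have to double the backslashes inside a string literal (for example \\d instead of \d); I have not done that here for readability.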
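
As a minimal sketch of how this could be used from Java, the example below double-escapes the pattern and joins all matches with ; as the question asks. The class name PercentExtractor and the method extract are hypothetical. The pattern is a variant of the one in this answer, with the + escaped, the decimal part optional, and group 1 closed before \h*% so that it holds only the value.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PercentExtractor {

    // Backslashes are doubled because this is a Java string literal.
    // Assumption: group 1 ends before \h*% so it excludes the spaces and the '%'.
    private static final Pattern VALUE_BEFORE_PERCENT =
            Pattern.compile("((?:\\d+(?:\\+|-|sd))?\\d+(?:\\.\\d+)?)\\h*%");

    // Returns every value found before a '%', joined with ';'.
    static String extract(String input) {
        List<String> values = new ArrayList<>();
        Matcher matcher = VALUE_BEFORE_PERCENT.matcher(input);
        while (matcher.find()) {
            values.add(matcher.group(1));
        }
        return String.join(";", values);
    }

    public static void main(String[] args) {
        // prints 10-10.50;10sd50
        System.out.println(extract("Titi 10-10.50% or 10sd50 % Toto and tutu equals 20X"));
    }
}
```

The \h class requires Java 8 or later; on older versions [ \t] is a possible substitute.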

You can proceed as follows:

  • First, find all the matches in each string.
  • Remove the last character (%) from each match.
  • Format the result however you like.

A Java sample is given below:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {

    public static void main(String[] args) {
        final String regex = "\\d+(\\.?\\d+)?(\\+|\\-|sd)?(\\d+(\\.?\\d+)?)?[ ]*%";
        final String test_str = "\"Titi 10% Toto and tutu equals 20X\"\n"
                + "\"Titi 10.50% Toto and tutu equals 20X\"\n"
                + "\"Titi 10-10.50% Toto and tutu equals 20X\n"
                + "\"Titi 10sd50 % Toto and tutu equals 20X\n"
                + "\"Titi 10-10.50% or 10sd50 % Toto and tutu equals 20X";

        final Pattern pattern = Pattern.compile(regex);
        for(String data : test_str.split("\\r?\\n")) {
            Matcher matcher = pattern.matcher(data);
            while (matcher.find()) {
                // end() - 1 drops the trailing '%'; a space before the '%' is kept
                System.out.print(data.substring(matcher.start(), matcher.end() - 1) + " ");
            }
            System.out.println();
        }
    }
}

The above code prints:

10 
10.50 
10-10.50 
10sd50  
10-10.50 10sd50 

You can then process these values however you need. An explanation of the regex is available on Regex101.

2 Comments

Thank you very much, Sunkuet. And your external link is very nice :)
@B.Gees, if the answer helps you, you can upvote it.
