1

I am trying to extract data from a string using a regular expression.

My data looks like:

 12 170 0.11918
170  12 0.11918
 12 182 0.06361
182  12 0.06361
 12 198 0.05807
198  12 0.05807
 12 242 0.08457
242  12 0.08457
 11  30 0.08689
 30  11 0.08689

The problem is that the amount of whitespace between the numbers varies.

From each line I want to extract two integers and one double, so I tried a regular expression:

  Pattern p = Pattern.compile("(([0-9]+.[0-9]*)|([0-9]*.[0-9]+)|([0-9]+))");
  Matcher m = p.matcher("  6    7781     0.01684000");
  while (m.find()) {
     System.out.println(m.group(0));  
  }

I know my regular expression doesn't work. Can anyone suggest a suitable regular expression, or some other way to parse this data?

2 Comments

  • Play around a bit with txt2re.com. Commented Aug 11, 2014 at 22:29
  • Thanks for that help. Such a nice tool! Commented Aug 11, 2014 at 22:39

6 Answers

2

Why not read each line and call line.trim().split("\\s+")? If your project already uses Guava, its Splitter would work too.
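
A minimal sketch of the split approach, assuming every line holds exactly two integers followed by one decimal number. The class name SplitLineExample and the Row holder are hypothetical names introduced here for illustration only.

```java
public class SplitLineExample {

    /** Hypothetical holder for one parsed line (not part of the original answer). */
    static final class Row {
        final int first;
        final int second;
        final double value;

        Row(int first, int second, double value) {
            this.first = first;
            this.second = second;
            this.value = value;
        }
    }

    static Row parse(String line) {
        // trim() drops the leading padding so split() does not return an empty first element;
        // "\\s+" treats any run of whitespace as a single separator.
        String[] parts = line.trim().split("\\s+");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected 3 fields but got " + parts.length + ": " + line);
        }
        return new Row(
                Integer.parseInt(parts[0]),
                Integer.parseInt(parts[1]),
                Double.parseDouble(parts[2]));
    }

    public static void main(String[] args) {
        Row row = parse("  6    7781     0.01684000");
        System.out.println(row.first + " " + row.second + " " + row.value);
    }
}
```

Double.parseDouble always expects a '.' as the decimal separator, so this variant does not depend on the default locale.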

1

Check http://txt2re.com/index-java.php3?s=%2012%20170%200.11918&11&5&12&4&13&1

You're probably interested in int1, int2 and float1 below:

 import java.util.regex.Matcher;
 import java.util.regex.Pattern;

 public class Main
 {
  public static void main(String[] args)
  {
    String txt=" 12 170 0.11918";

    String re1="(\\s*)";    // White Space 1 (optional: some lines have no leading padding)
    String re2="(\\d+)";    // Integer Number 1
    String re3="(\\s+)";    // White Space 2
    String re4="(\\d+)";    // Integer Number 2
    String re5="(\\s+)";    // White Space 3
    String re6="([+-]?\\d*\\.\\d+)(?![-+0-9\\.])";  // Float 1

    Pattern p = Pattern.compile(re1+re2+re3+re4+re5+re6,Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
    Matcher m = p.matcher(txt);
    if (m.find())
    {
        String ws1=m.group(1);
        String int1=m.group(2);
        String ws2=m.group(3);
        String int2=m.group(4);
        String ws3=m.group(5);
        String float1=m.group(6);
        // Print each captured group in parentheses
        System.out.println("("+ws1+")"+"("+int1+")"+"("+ws2+")"+"("+int2+")"+"("+ws3+")"+"("+float1+")");
    }
  }
 }

2 Comments

It's not clear why you would specify CASE_INSENSITIVE on a pattern that matches only digits, sign symbols, decimal points, and whitespace, or DOTALL on a pattern that matches a single line, although neither does any harm.
Just because the tool I am using (txt2re) adds these flags by default.

1

I recommend using a Scanner.

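Scanner scanner = new Scanner(line);
// The default delimiter already matches any run of whitespace.
// A single-space delimiter would yield empty tokens where the gap is wider than one space.
scanner.useLocale(Locale.ENGLISH); // read '.' as the decimal separator
int int1 = scanner.nextInt();
int int2 = scanner.nextInt();
double double1 = scanner.nextDouble();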

0

Try this:

([\d.]+) - This will get all strings containing only digits or periods (.).

Edit:

I see you want three groups from each line. The pattern below skips the whitespace and captures the three numbers. The leading ^ and trailing $ anchor the match to the start and end of the input (or of each line, if Pattern.MULTILINE is set), so nothing else may appear around the three numbers.

^\s*?([\d.]+)\s*([\d.]+)\s*?([\d.]+)\s*?$
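
A rough sketch of how a three-group pattern like this could be used from Java. Note that this is a slight variation on the pattern above, not the pattern as posted: it requires at least one whitespace character between the groups (\s+ instead of \s*), on the assumption that a single unbroken run of digits should not be split into three groups. The class name ThreeGroupExample and the groups method are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ThreeGroupExample {

    // Variation of the pattern above: whitespace between the groups is mandatory.
    private static final Pattern LINE =
            Pattern.compile("^\\s*([\\d.]+)\\s+([\\d.]+)\\s+([\\d.]+)\\s*$");

    /** Returns the three captured strings, or null if the line does not match. */
    static String[] groups(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String[] g = groups("170  12 0.11918");
        if (g != null) {
            int int1 = Integer.parseInt(g[0]);
            int int2 = Integer.parseInt(g[1]);
            double double1 = Double.parseDouble(g[2]);
            System.out.println(int1 + " " + int2 + " " + double1);
        }
    }
}
```

Because [\d.]+ also accepts strings such as "1.2.3", the parse calls can still throw NumberFormatException on malformed input.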

0

Something like this (adjust the float part as needed):

 # raw:  (?m)^\h*(\d+)\h+(\d+)\h+(\d*\.\d+)
 # quoted: "(?m)^\\h*(\\d+)\\h+(\\d+)\\h+(\\d*\\.\\d+)"

 (?m)             # Multi-line modifier
 ^                # BOL
 \h*              # optional, horizontal whitespace
 ( \d+ )          # (1), int
 \h+              # required, horizontal whitespace
 ( \d+ )          # (2), int
 \h+              # required, horizontal whitespace
 ( \d* \. \d+ )   # (3), float
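
As a sketch, the quoted form could be applied to a whole multi-line string with Matcher.find(). This assumes Java 8 or later, where \h (horizontal whitespace) is supported; the class name MultiLineExample and the parseAll method are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MultiLineExample {

    // Quoted form of the pattern above; (?m) makes ^ match at the start of every line.
    private static final Pattern ROW =
            Pattern.compile("(?m)^\\h*(\\d+)\\h+(\\d+)\\h+(\\d*\\.\\d+)");

    /** Collects the three captured strings of every matching line. */
    static List<String[]> parseAll(String text) {
        List<String[]> rows = new ArrayList<>();
        Matcher m = ROW.matcher(text);
        while (m.find()) {
            rows.add(new String[] { m.group(1), m.group(2), m.group(3) });
        }
        return rows;
    }

    public static void main(String[] args) {
        String data = " 12 170 0.11918\n170  12 0.11918\n 12 182 0.06361\n";
        for (String[] row : parseAll(data)) {
            int int1 = Integer.parseInt(row[0]);
            int int2 = Integer.parseInt(row[1]);
            double double1 = Double.parseDouble(row[2]);
            System.out.println(int1 + " " + int2 + " " + double1);
        }
    }
}
```

Since \h never matches a line break, a match cannot run from one line into the next.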

0

    String s = " 12 170 0.11918\n" + "170  12 0.11918 \n"
             + " 12 182 0.06361\n" + "182  12 0.06361 \n"
             + " 12 198 0.05807\n" + "198  12 0.05807 \n"
             + " 12 242 0.08457\n" + "242  12 0.08457 \n"
             + " 11  30 0.08689\n" + " 30  11 0.08689 \n";

    String[] lines = s.split("\\n");

    for( String line : lines ) {
        Scanner scan = new Scanner(line);
        scan.useDelimiter("\\s+");
        scan.useLocale(Locale.ENGLISH);
        System.out.println(scan.nextInt());
        System.out.println(scan.nextInt());
        System.out.println(scan.nextDouble());
    }

I would use a Scanner for this problem.
