In Java, I would like to read a file line by line and print each line to the output. I want to solve this with regular expressions.

// declared once as a class field, outside the loop
private static java.util.regex.Pattern line = java.util.regex.Pattern.compile(".*\\n");

while (...)
{
  System.out.print(scanner.next(line));
}

The regex in the code is not correct, because I get an InputMismatchException. I have been working on this regex for 2 hours. Please help with it.

With regex powertoy I can see that ".*\n" matches a line, yet my program still does not run correctly.

The whole source is:

/**
 * Extracts the points in the standard input in off file format to the standard output in ascii points format.
 */

 import java.util.regex.Pattern;
 import java.util.Scanner;

class off_to_ascii_points 
{
    private static Scanner scanner = new Scanner(System.in);    
    private static Pattern fat_word_pattern = Pattern.compile("\\s*\\S*\\s*");
    private static Pattern line = Pattern.compile(".*\\n", Pattern.MULTILINE);

    public static void main(String[] args) 
    {
        try
        {
            scanner.useLocale(java.util.Locale.US);

                    /* skip to the number of points */
            scanner.skip(fat_word_pattern);

            int n_points = scanner.nextInt();

                    /* skip the rest of the 2. line */
            scanner.skip(fat_word_pattern); scanner.skip(fat_word_pattern);

            for (int i = 0; i < n_points; ++i)
            {
                    System.out.print(scanner.next(line));
                      /*
                      Here is my mistake.
                      next() reads only up to the next delimiter,
                      which by default is any sequence of whitespace.
                      So next() does not read to the end of the line,
                      which is what I wanted.

                      Changing "next(line)" to "nextLine()" solves the problem.
                      Setting the delimiter to the line separator
                      right before the loop solves the problem too.
                      */
            }

        }
        catch(java.lang.Exception e)
        {
            System.err.println("exception");
            e.printStackTrace();
        }
    }
}

The beginning of an example input is:

OFF
4999996 10000000 0
-28.6663 -11.3788 -58.8252 
-28.5917 -11.329 -58.8287 
-28.5103 -11.4786 -58.8651 
-28.8888 -11.7784 -58.9071 
-29.6105 -11.2297 -58.6101 
-29.1189 -11.429 -58.7828 
-29.4967 -11.7289 -58.787 
-29.1581 -11.8285 -58.8766 
-30.0735 -11.6798 -58.5941 
-29.9395 -11.2302 -58.4986 
-29.7318 -11.5794 -58.6753 
-29.0862 -11.1293 -58.7048 
-30.2359 -11.6801 -58.5331 
-30.2021 -11.3805 -58.4527 
-30.3594 -11.3808 -58.3798 

I first skip to the number 4999996, which is the number of lines containing point coordinates. These are the lines that I am trying to write to the output.

5 Answers

4

I suggest using

private static Pattern line = Pattern.compile(".*");

scanner.useDelimiter("[\\r\\n]+"); // Insert right before the for-loop

System.out.println(scanner.next(line)); //Replace print with println

Why your code doesn't work as expected:

This has to do with the Scanner class you use and how that class works.

The javadoc states:

A Scanner breaks its input into tokens using a delimiter pattern, which by default matches whitespace.

That means that when you call one of the Scanner's next* methods, the scanner reads the input only until the next delimiter is encountered.

So your first call to scanner.next(line) starts reading the following line

-28.6663 -11.3788 -58.8252 

It stops at the space after -28.6663. Then it checks whether the token (-28.6663) matches your pattern (.*\n). It does not, because the token contains no newline, so next(line) throws InputMismatchException.
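The behaviour described above can be reproduced with a small sketch. The class name TokenDemo and its helper methods are made up for illustration; only Scanner.next(), Scanner.next(Pattern) and InputMismatchException come from the JDK.

```java
import java.util.InputMismatchException;
import java.util.Scanner;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the question's code.
public class TokenDemo {
    // Returns the first token seen with the default (whitespace) delimiter.
    static String firstToken(String input) {
        return new Scanner(input).next();
    }

    // Returns true if next(pattern) rejects the first token.
    static boolean rejects(String input, Pattern pattern) {
        try {
            new Scanner(input).next(pattern);
            return false;
        } catch (InputMismatchException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String input = "-28.6663 -11.3788 -58.8252 \n";
        // The token ends at the first space, not at the end of the line.
        System.out.println(firstToken(input));                         // -28.6663
        // The token contains no newline, so ".*\n" cannot match it.
        System.out.println(rejects(input, Pattern.compile(".*\\n")));  // true
        // ".*" matches the token, but only the token is returned.
        System.out.println(rejects(input, Pattern.compile(".*")));     // false
    }
}
```

Note that even with the pattern ".*" the scanner returns just one whitespace-separated token; the delimiter has to change (or nextLine() has to be used) to get the whole line.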

1

If you only want to print the file to standard out, why do you want to use regexps? If you know that you always want to skip the first two lines, there are simpler ways to accomplish it.

import java.util.Scanner;
import java.io.File;

public class TestClass {
    public static void main(String[] args) throws Exception {
        Scanner in=new Scanner(new File("test.txt"));
        in.useDelimiter("\n"); // Or whatever line delimiter is appropriate
        in.next(); in.next(); // Skip first two lines
        while(in.hasNext())
            System.out.println(in.next());
    }
}

1 Comment

I have to read in the number of lines, which is the first word in the second line.
0

You have to switch the Pattern into multiline mode.

line = Pattern.compile("^.*$", Pattern.MULTILINE);
System.out.println(scanner.next(line));

1 Comment

MULTILINE is not working either. The $ character is not enough for me, as I want the newline character to be included in the matched string.
0

By default the scanner uses whitespace as its delimiter. You must change the delimiter to the newline before reading the lines that follow the initial skips. To do so, insert the following line before the for loop:

scanner.useDelimiter(Pattern.compile(System.getProperty("line.separator")));

and update the Pattern variable line as follows:

private static Pattern line = Pattern.compile(".*", Pattern.MULTILINE);

1 Comment

The "line.separator" property is not to be relied on. Any given file may use any style of line separator, or even a mix of two or more styles. Scanner's hasNextLine() and nextLine() methods take that into account.
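A minimal sketch of that point, assuming an input that mixes \r\n, \n and \r line endings. The class name LineDemo and the method readLines are hypothetical; hasNextLine() and nextLine() are the Scanner methods mentioned above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical demo class for illustration only.
public class LineDemo {
    // Collects every line of the input, whatever separator style each line uses.
    static List<String> readLines(String input) {
        List<String> lines = new ArrayList<>();
        try (Scanner scanner = new Scanner(input)) {
            while (scanner.hasNextLine()) {
                // nextLine() returns the line without its terminator.
                lines.add(scanner.nextLine());
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // Windows, Unix and old Mac line endings mixed in one input.
        String mixed = "1 2 3\r\n4 5 6\n7 8 9\r10 11 12";
        for (String l : readLines(mixed)) {
            System.out.println(l);
        }
    }
}
```

Because no delimiter is configured by hand, the same code should work regardless of the platform the file was written on.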
0

Thank you, everybody, for the help.

Now I understand my mistake:

The API documentation states that every nextT() method of the Scanner class first skips the delimiter pattern and then tries to read a T value. However, it does not say that each next...() method reads only up to the first occurrence of the delimiter!
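Putting the nextLine() fix together, the conversion could look roughly like the sketch below. It is one possible arrangement, not the original program: the class name OffToAsciiPoints, the method copyPoints and the tiny sample input are assumptions made for illustration.

```java
import java.io.PrintStream;
import java.util.Locale;
import java.util.Scanner;

// Hypothetical rewrite of the question's program using nextLine().
public class OffToAsciiPoints {
    // Copies the point lines of an OFF input to out, one point per line.
    static void copyPoints(Scanner scanner, PrintStream out) {
        scanner.useLocale(Locale.US);
        scanner.next();                   // skip the "OFF" header token
        int nPoints = scanner.nextInt();  // first number on the second line
        scanner.nextLine();               // discard the rest of the second line
        for (int i = 0; i < nPoints && scanner.hasNextLine(); i++) {
            // nextLine() reads to the end of the line, ignoring the token delimiter.
            out.print(scanner.nextLine());
            out.print('\n');
        }
    }

    public static void main(String[] args) {
        // Small made-up sample; pass new Scanner(System.in) to process real input.
        String sample = "OFF\n2 1 0\n-28.6663 -11.3788 -58.8252 \n"
                + "-28.5917 -11.329 -58.8287 \n3 0 1 0\n";
        copyPoints(new Scanner(sample), System.out);
    }
}
```

No regular expression is needed here: next() and nextInt() handle the header tokens, and nextLine() takes over once whole lines are wanted.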
