1

How do I process the following PHP regex in Java:

if(preg_match("/\r\n(.*?)\$/",$req,$match)){ $data=$match[1]; }

This line is part of the following function, by the way:

function getheaders($req){
  $r=$h=$o=null;
  if(preg_match("/GET (.*) HTTP/"   ,$req,$match)){ $r=$match[1]; }
  if(preg_match("/Host: (.*)\r\n/"  ,$req,$match)){ $h=$match[1]; }
  if(preg_match("/Origin: (.*)\r\n/",$req,$match)){ $o=$match[1]; }
  if(preg_match("/Sec-WebSocket-Key2: (.*)\r\n/",$req,$match)){ $key2=$match[1]; }
  if(preg_match("/Sec-WebSocket-Key1: (.*)\r\n/",$req,$match)){ $key1=$match[1]; }
  if(preg_match("/\r\n(.*?)\$/",$req,$match)){ $data=$match[1]; }
  return array($r,$h,$o,$key1,$key2,$data);
}

Thanks in advance!

So far I have:

Matcher matcher = Pattern.compile("\r\n(.*?)\\$").matcher(req);
while(matcher.find()){
    data = matcher.group(1);
}

I am sure, however, that this is wrong.

Ok guys, thanks for your answers, but they did not help yet. May I ask you to tell me, however, what this regex means:

  if(preg_match("/\r\n(.*?)\$/",$req,$match)){ $data=$match[1]; }

I know that if it finds a match for /\r\n(.*?)\$/ in the string $req, it will save the different kinds of matches into the array $match. BUT: what is being matched here? And what's the difference between $match[0] and $match[1]? Maybe, if I understand this, I will be able to reconstruct a way to produce the same results in Java.

Thanks Jaroslav, but:

However, the string I am trying to process (the last line of the handshake sent to me by Google Chrome) is:

Cookie: 34ad04df964553fb6017b93d35dccd5f=%7C34%7C36%7C37%7C40%7C41%7C42%7C43%7C44%7C45%7C46%7C47%7C48%7C49%7C50%7C52%7C53%7C54%7C55%7C56%7C57%7C58%7C59%7C60%7C61%7C62%7C63%7C64%7C65%7C66%7C67%7C68%7C69%7C70%7C71%7C72%7C73%7C74%7C75%7C76%7C77%7C78%7C79%7C80%7C81%7C82%7C83%7C84%7C85%7C86%7C87%7C88%7C89%7C90%7C91%7C92%7C93%7C94%7C95%7C96%7C97%7C98%7C99%7C100%7C101%7C102%7C103%7C104%7C105%7C106%7C107%7C108%7C109%7C110%7C111%7C112%7C113%7C114%7C115%7C116%7C117%7C118%7C119%7C120%7C121%7C122%7C123%7C124%7C125%7C126%7C127%7C128%7C129%7C130%7C131%7C132%7C133%7C134%7C135%7C136%7C137%7C138%7C139%7C%3B%7C%3B%7C%3B%7C%3B1%3B2%3B3%3B4%3B5%3B6%3B7%3B8%3B9%3B10%3B11%3B14%3B15%3B18%3B23%3B24%3B25%3B26%3B28%3B29%3B30%3B31%3B32%3B33%3B%7C

Hey guys, I have just realized that what I was asking is irrelevant :( But one answer was right.

7
  • What do you have so far? Commented Jul 16, 2011 at 11:34
  • Matcher.find matches subsequences rather than the whole input. You should escape special characters; look at my example below. Also try caching the compiled Pattern (so it is not compiled every time) to improve performance. Commented Jul 16, 2011 at 11:43
  • Try removing \\$ at the end: when your input is longer than one line, it may cause a problem. Commented Jul 16, 2011 at 11:55
  • @arik-so: it looks like you don't have to use regular expressions. In Java there are more convenient methods for accessing headers. Look at my updated response for details. Commented Jul 16, 2011 at 12:32
  • Thanks, zacheusz, that's interesting, but the thing is: it's not a simple header, and I've got to process that header somehow. I don't need the whole header, but either its last eight bytes (according to Wikipedia) OR whatever phpwebsocket does, and then I process it for my response (WebSocket is not easy :D ) Commented Jul 16, 2011 at 12:47

2 Answers

1

Use java.util.regex.Pattern; see its Javadoc for instructions on this class, and the Regular Expressions Tutorial for an introduction. Here is an example:

String p = "Host: (.*)\\r\\n";
String input = "Host: example.com\r\n";
Pattern pattern = Pattern.compile(p);
Matcher matcher = pattern.matcher(input);
if(matcher.matches()) {
    String output = matcher.group(1);
    System.out.println(output);
} else {
    System.out.println("not found");
}

Note: Matcher.find looks for the next subsequence that matches the pattern, while Matcher.matches requires the entire region to match. IMHO the \\$ at the end of your pattern may cause a problem when your input spans multiple lines and you parse it all at once.
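To make the difference between find and matches concrete, here is a minimal sketch. The class name, the helper methods and the sample request are made up for illustration and are not taken from the question; the pattern is the Host pattern from the example above, compiled once and reused.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FindVsMatches {
    // Compiled once and reused instead of being recompiled on every call.
    private static final Pattern HOST = Pattern.compile("Host: (.*)\r\n");

    /** Returns group 1 if the pattern occurs anywhere in the input, else null. */
    static String hostWithFind(String input) {
        Matcher m = HOST.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    /** Returns group 1 only if the whole input matches the pattern, else null. */
    static String hostWithMatches(String input) {
        Matcher m = HOST.matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Hypothetical sample request (an assumption, not real browser output).
        String request = "GET /demo HTTP/1.1\r\nHost: example.com\r\n"
                + "Origin: http://example.com\r\n\r\n";
        System.out.println(hostWithFind(request));                    // found inside the request
        System.out.println(hostWithMatches(request));                 // whole request is not a Host line
        System.out.println(hostWithMatches("Host: example.com\r\n")); // a single Host line matches
    }
}
```

So when the whole request is parsed at once, find is the call that behaves like PHP's preg_match; matches only succeeds when the input consists of exactly one header line.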

In Java there are more convenient methods for accessing headers. On the client side there is HttpURLConnection.getHeaderField; on the server side there is HttpServletRequest.getHeader.

0
import java.util.regex.Pattern;
import java.util.regex.Matcher;

public class SplitDemo2 {

    // PHP's / delimiters are dropped, and "\$" in a PHP double-quoted string
    // is a plain $ (end anchor), so it is written unescaped here.
    private static final String REGEX = "\\r\\n(.*?)$";
    // Sample input (assumed): a request line, a blank line, then trailing data.
    private static final String INPUT = "GET / HTTP/1.1\r\n\r\n12345678";

    public static void main(String[] args) {
        Pattern p = Pattern.compile(REGEX);
        Matcher m = p.matcher(INPUT); // get a matcher object
        int count = 0;

        while (m.find()) {
            count++;
            System.out.println("Match number " + count);
            System.out.println("start(): " + m.start());
            System.out.println("end(): " + m.end());
        }
    }
}

More info on regex: http://download.oracle.com/javase/tutorial/essential/regex/matcher.html

Explanation of your regex

\r\n(.*?)\$

\r Carriage return character.

\n Line break character.

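(.*?) A numbered capture group (group 1) that lazily matches as few characters as possible, excluding line breaks.

\$ In the PHP double-quoted string this is just an escaped dollar sign, so the regex engine receives $, which anchors the match at the end of the input. Handed to the regex engine as \$, it would instead match a literal $ character.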
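As a rough Java sketch of the same expression, assuming the PHP "\$" is meant as the end anchor $ and not a literal dollar sign: group(0) plays the role of $match[0] (the whole match, including the leading \r\n) and group(1) the role of $match[1] (only what the parentheses captured). The class name, the helper and the sample handshake are hypothetical. One caveat: by default Java's . also refuses to match \r, so trailing data containing a carriage return may behave differently than in PHP.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TrailingData {
    // No / delimiters in Java; $ is left unescaped so it acts as the end anchor.
    private static final Pattern TRAILING = Pattern.compile("\r\n(.*?)$");

    /** Returns {group 0, group 1}, or null when nothing matches. */
    static String[] trailing(String req) {
        Matcher m = TRAILING.matcher(req);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(0), m.group(1) };
    }

    public static void main(String[] args) {
        // Hypothetical handshake: headers, a blank line, then 8 bytes of data.
        String req = "GET /demo HTTP/1.1\r\nHost: example.com\r\n\r\n12345678";
        String[] groups = trailing(req);
        System.out.println(groups[0].length()); // whole match: \r\n plus the data
        System.out.println(groups[1]);          // only the captured data
    }
}
```

Because . cannot cross a line break, only the last \r\n in the input can be followed by a run of characters that reaches the end, so the capture group ends up holding whatever follows the final line break.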

4 Comments

Unfortunately, nothing is found using this expression.
And: what does this (.*?) expression mean? It is not my code I am trying to understand, but that of phpwebsocket.
It doesn't. You can check your expression here: regexplanet.com/simple/index.html. I just wrote down how to do it in Java.
Thanks. But there is something, see my updated question: .*? does not even turn up there.
