
I want to split a number of strings similar to "name: john, id: 20, dest: toledo, from: seattle, date_time: [2/8/12 15:48:01:837 MST]" into only these tokens:

john
20
toledo
seattle
[2/8/12 15:48:01:837 MST]

This is what I'm doing:

String delims = "(name|id|dest|from|date_time)?[:,\\s]+";
String line = "name: john, id: 20, dest: toledo, from: seattle, date_time: [2/8/12 15:48:01:837 MST]";
String[] lineTokens = line.split(delims, 5);

for (String t : lineTokens)
{
    // for debugging
    System.out.println (t);
    // other processing I want to do
}   

but every even-indexed element of lineTokens turns out to be either empty or just whitespace, while each odd-indexed element is what I want, i.e. lineTokens[0] is "", lineTokens[1] is "john", lineTokens[2] is "", lineTokens[3] is "20", etc. Can anyone explain what I'm doing wrong?

3 Answers


The problem is that your regex does not match ", id: " as a whole. It matches ", " as one delimiter and then "id: " as a second one, and the empty string between those two matches is returned as a token. You need to modify the pattern so that it matches the whole separator. Something like this:

String delims = "(, )?(name|id|dest|from|date_time)?[:\\s]+";

http://ideone.com/Qgs8y
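
To make the two separate matches visible, one can list every delimiter the original pattern finds in the line. This is only a sketch for illustration; the class name DelimiterMatches and the helper matches are made up here and are not part of the question or the answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DelimiterMatches {
    // Hypothetical helper: returns every substring of line that regex matches, in order.
    static List<String> matches(String regex, String line) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(line);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // The delimiter pattern from the question.
        String original = "(name|id|dest|from|date_time)?[:,\\s]+";
        String line = "name: john, id: 20, dest: toledo, from: seattle, date_time: [2/8/12 15:48:01:837 MST]";
        // Brackets make the whitespace in each match visible.
        for (String d : matches(original, line)) {
            System.out.println("[" + d + "]");
        }
    }
}
```

The first matches are "name: ", ", ", "id: " and ", ": the comma and the keyword are consumed by separate matches, and split() returns the empty string that lies between them.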


2 Comments

This works. I'm curious as to why the 0th index of lineTokens is still empty though, and why you pass 6 in for split()'s second argument (seems like it's just because the 0th index is empty).
Oh, the answer to both of those is actually the same. If your string starts with a delimiter, you get an empty string as the first element of the result, and you need to ignore it. That extra element is also why the limit is 6 rather than 5.
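
Putting the answer and these comments together, a runnable sketch could look like the following. The class name SplitWithLimit and the helper tokens are hypothetical, and dropping the first element with Arrays.copyOfRange is just one way to ignore it.

```java
import java.util.Arrays;

public class SplitWithLimit {
    // Hypothetical helper: returns the five values without the leading empty string.
    static String[] tokens(String line) {
        String delims = "(, )?(name|id|dest|from|date_time)?[:\\s]+";
        // Limit 6 = one leading empty string + the 5 wanted tokens. The limit also
        // stops the pattern from splitting on the spaces and colons in the timestamp.
        String[] parts = line.split(delims, 6);
        // Drop the empty first element caused by the line starting with "name: ".
        return Arrays.copyOfRange(parts, 1, parts.length);
    }

    public static void main(String[] args) {
        String line = "name: john, id: 20, dest: toledo, from: seattle, date_time: [2/8/12 15:48:01:837 MST]";
        for (String t : tokens(line)) {
            System.out.println(t);
        }
    }
}
```

Without the limit, the same pattern would keep matching inside the date_time value and break the timestamp into several pieces.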

Why not use a slightly less complicated regex solution?

String str = "name: john, id: 20, dest: toledo, from: seattle, date_time: [2/8/12 15:48:01:837 MST]";
String[] expr = str.split(", ");
for (String e : expr) {
    System.out.println(e.split(": ")[1]);
}

Output:

john
20
toledo
seattle
[2/8/12 15:48:01:837 MST]
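
The two-step split assumes that every field contains ": "; otherwise indexing [1] throws an ArrayIndexOutOfBoundsException. A slightly more defensive sketch of the same idea follows. The class name SimpleSplit and the helper values are made up for illustration, and skipping malformed fields is an assumption about the desired behaviour.

```java
import java.util.ArrayList;
import java.util.List;

public class SimpleSplit {
    // Hypothetical helper: collects the value part of each "key: value" field.
    static List<String> values(String line) {
        List<String> out = new ArrayList<>();
        for (String field : line.split(", ")) {
            // Limit 2 splits only at the first ": ", so a value that itself
            // contains ": " stays in one piece.
            String[] kv = field.split(": ", 2);
            // Assumption: fields without ": " are skipped rather than reported.
            if (kv.length == 2) {
                out.add(kv[1]);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String str = "name: john, id: 20, dest: toledo, from: seattle, date_time: [2/8/12 15:48:01:837 MST]";
        for (String v : values(str)) {
            System.out.println(v);
        }
    }
}
```

Like the original, this still breaks if a value contains ", ", so it only fits input shaped like the sample line.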

2 Comments

Technically ", " and ": " are still regexes.
@isbadawi: Nice catch, but you know what I mean.

I made some changes to your code:

    String delims = "(name|id|dest|from|date_time)[:,\\s]+";
    String line = "name: john, id: 20, dest: toledo, from: seattle, date_time: [2/8/12 15:48:01:837 MST]";
    String[] lineTokens = line.split(delims);

    for (String t : lineTokens)
    {
        // for debugging
        System.out.println (t);
        // other processing I want to do
    }   

Also, you should ignore the first element in lineTokens, since it is the (empty) text between the beginning of the line and "name:".

1 Comment

Have you tested this? I think the output from your code would be lineTokens[1] == "john, " instead of lineTokens[1] == "john".
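
The comment's point can be checked by printing the result of the posted pattern next to a variant that also consumes the preceding comma. This is a sketch only: the class name KeywordDelimiter, the helper split and the adjusted pattern are not from the thread.

```java
import java.util.Arrays;

public class KeywordDelimiter {
    // Hypothetical helper: thin wrapper so both patterns are applied the same way.
    static String[] split(String regex, String line) {
        return line.split(regex);
    }

    public static void main(String[] args) {
        String line = "name: john, id: 20, dest: toledo, from: seattle, date_time: [2/8/12 15:48:01:837 MST]";

        // Pattern as posted: the ", " before each keyword is not part of the
        // delimiter, so it stays attached to the previous token.
        String posted = "(name|id|dest|from|date_time)[:,\\s]+";
        System.out.println(Arrays.toString(split(posted, line)));

        // Adjusted pattern (an assumption, not from the thread): an optional
        // leading comma, then a required keyword and colon.
        String adjusted = "(,\\s*)?(name|id|dest|from|date_time):\\s*";
        System.out.println(Arrays.toString(split(adjusted, line)));
    }
}
```

Because the adjusted pattern requires a keyword followed by a colon, it does not match inside the timestamp, so no limit argument is needed. The first element is still the empty string and has to be skipped.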
