0
String text = "[[\"item1\",\"item2\",\"item3\"], [\"some\", \"item\"], [\"far\", \"out\", \"string\"]]";

I would like to iterate over each individual ArrayList, but I don't know how to turn that string into an appropriate ArrayList object.

4
  • ArrayLists are one dimensional. Do you want an ArrayList of ArrayLists? Commented Oct 10, 2010 at 1:08
  • yes, I just don't know how to convert that string into the ArrayList java object. Commented Oct 10, 2010 at 1:16
  • Well it's non-trivial, you basically need to write a JSON parser...speaking of which, can you just use a JSON parser? Commented Oct 10, 2010 at 1:21
  • Do you have control over how that string is formatted? Can you restructure the declaration to String[][] text = {{"item1","item2","item3"}, {"some", "item"}, {"far", "out", "string"}}; If so, it's trivial. If not, you have to do some parsing. (Is this homework?) Commented Oct 10, 2010 at 1:25

4 Answers

4

This syntax looks like a subset of JSON, and I would guess that the client side is actually encoding it as JSON. Assuming that is true, the simplest approach will be to use an off-the-shelf JSON parser, and some simple Java code to convert the resulting objects into the form that your code requires.

Sure, you could implement your own parser by hand, but it is probably not worth the effort, especially if you have to deal with string escaping, possible variability in whitespace and so on. Don't forget that if you implement your own parser, you NEED TO IMPLEMENT UNIT TESTS to make sure that it works across the full range of expected valid input, and for invalid input as well. (Testing the cases of invalid input is important because you don't want your server to fall over if some hacker sends requests containing bad input.)

Before you go any further, you really need to confirm the exact syntax that the client is sending you. Just looking at an example is not going to answer that. You either need a document specifying what the syntax is, or you need to look at the client / application source code.

3 Comments

yes, I got the JSON parser working properly. Last thing I needed was parsing through the string manually.
@Kim - I don't understand your comment about "parsing through the string manually".
Oh ... I understand what you mean now. I was reading "last thing I needed" in a literal sense, rather than in the rhetorical sense you intended.
3

Here's a simple parser. It should deal with all kinds of abusive nesting and is robust to both single and double quotes, but it won't care if you mix them: 'test" is treated as equivalent to "test".

edit: added comments, and now it deals with escaped quotes in strings. (and now improved string token handling even more)

import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class StringToList {

    public static void main(String[] args) throws IOException{
        StringReader sr = new StringReader("[[\"it\\\"em1\", \"item2\",\"item3\"], [\"some\",\"item\"], [\"far\",\"out\",\"string\"]]");
        System.out.println(tokenize(sr));
    }

    @SuppressWarnings({ "rawtypes", "unchecked" })
    public static List tokenize(StringReader in) throws IOException{
        List stack = new ArrayList<Object>();
        int c;
        while((c = in.read()) != -1){
            switch(c){
            case '[':
                // found a nested structure, recurse..
                stack.add(tokenize(in));
                break;
            case ']':
                // found the end of this run, return the
                // current stack
                return stack;
            case '"':
            case '\'':
                // get the next full string token
                stack.add(stringToken(in));
                break;
            }
        }

        // we artificially start with a list, though in principle I'm
        // defining the string to hold only a single list, so this
        // gets rid of the one I created artificially.
        return (List)stack.get(0);
    }

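    public static String stringToken(StringReader in) throws IOException{
        StringBuilder str = new StringBuilder();
        boolean escaped = false;
        int c;
        while((c = in.read()) != -1){
            if(escaped){
                // previous char was a backslash: take this char literally
                // and always leave escape mode, whatever the char was
                str.append((char)c);
                escaped = false;
            }else if(c == '\\'){
                // start of an escape; the backslash itself is dropped
                escaped = true;
            }else if(c == '"' || c == '\''){
                // an unescaped quote (of either kind) ends the token
                break;
            }else{
                str.append((char)c);
            }
        }
        return str.toString();
    }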

}

Just a couple of notes: this won't enforce correct syntax, so if you do something goofy with the quotes, like I described, it might still get parsed as (un)expected. Also, I don't enforce commas at all; you don't even need a space between the quotes, so ["item1""item2"] is just as valid using this parser as ["item1", "item2"]. Perhaps more oddly, this thing should also deal with ["item1"asdf"item2"], ignoring asdf.

2

Since you are using a string that looks like JSON, I would just use a JSON parser. One of the simplest to use is Gson. Here is an example using Gson:

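import com.google.gson.Gson;
import com.google.gson.reflect.TypeToken;
import java.util.ArrayList;
String text = "[[\"item1\",\"item2\",\"item3\"], [\"some\", \"item\"], [\"far\", \"out\", \"string\"]]";
Gson gson = new Gson();
ArrayList<ArrayList<String>> list = gson.fromJson(text, new TypeToken<ArrayList<ArrayList<String>>>() {}.getType());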

Here is the gson site: http://code.google.com/p/google-gson/

-2

You need to build a parser by hand. It's not hard, but it will take up time. In the previous comment you said you want an ArrayList of ArrayList... hmmm... good

Just parse the string char by char and recognize each token by first defining recursive parsing rules. Recursive descent parser rules are usually shown graphically, but I can try to use ABNF for you:

LIST = NIL / LIST_ITEM *( ',' SP LIST_ITEM)
LIST_ITEM = NIL / '[' STRING_ITEM *(',' SP STRING_ITEM) ']'
STRING_ITEM = '"' *ANYCHAR '"'
SP = space
ANYCHAR = you know, anything that is not double quotes
NIL = ''
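
To make the grammar idea concrete, here is a minimal sketch of a recursive descent parser for it. It is only one possible reading of the rules above: the class name NestedListParser and its helper methods are made up for illustration, it assumes the outer brackets are part of the input, it allows any whitespace between tokens, and like the grammar it does not handle escaped quotes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not a full JSON parser: handles exactly two levels,
// e.g. [["a","b"], ["c"]], with one method per grammar rule.
class NestedListParser {
    private final String src;
    private int pos = 0;

    private NestedListParser(String src) { this.src = src; }

    public static List<List<String>> parse(String text) {
        NestedListParser p = new NestedListParser(text);
        List<List<String>> result = p.outerList();
        p.skipSpaces();
        if (p.pos != p.src.length()) throw p.error("trailing characters");
        return result;
    }

    // outer list: '[' [ inner *( ',' inner ) ] ']'
    private List<List<String>> outerList() {
        List<List<String>> lists = new ArrayList<>();
        expect('[');
        if (!tryConsume(']')) {
            do { lists.add(innerList()); } while (tryConsume(','));
            expect(']');
        }
        return lists;
    }

    // inner list: '[' [ string *( ',' string ) ] ']'
    private List<String> innerList() {
        List<String> items = new ArrayList<>();
        expect('[');
        if (!tryConsume(']')) {
            do { items.add(stringItem()); } while (tryConsume(','));
            expect(']');
        }
        return items;
    }

    // string: '"' followed by anything that is not a double quote, then '"'
    private String stringItem() {
        expect('"');
        int end = src.indexOf('"', pos);
        if (end < 0) throw error("unterminated string");
        String s = src.substring(pos, end);
        pos = end + 1;
        return s;
    }

    private void skipSpaces() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) pos++;
    }

    // Consumes c (after optional whitespace) if it is next; reports whether it did.
    private boolean tryConsume(char c) {
        skipSpaces();
        if (pos < src.length() && src.charAt(pos) == c) { pos++; return true; }
        return false;
    }

    private void expect(char c) {
        if (!tryConsume(c)) throw error("expected '" + c + "'");
    }

    private IllegalArgumentException error(String msg) {
        return new IllegalArgumentException(msg + " at position " + pos);
    }

    public static void main(String[] args) {
        String text = "[[\"item1\",\"item2\",\"item3\"], [\"some\", \"item\"], [\"far\", \"out\", \"string\"]]";
        for (List<String> inner : parse(text)) {
            System.out.println(inner);
        }
    }
}
```

Unlike a lenient tokenizer, this version rejects input with missing commas or unbalanced brackets by throwing IllegalArgumentException, which is usually what you want when the string comes from an untrusted client.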

Another approach is to use Regular Expressions. Here are a couple of samples. First capture outer elements by

(\[[^\]]*\])

The above regex captures everything from '[' to the first ']'. Applied to the whole string, the first match would start at the outermost '[', so you need to modify it or cut the outer brackets from your string (just drop the first and last char).

Then capture inner elements by

(\"[^\"]*\")

Simple as the above
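
For what it's worth, here is a rough sketch of how the two regexes could be combined in Java with java.util.regex. The class name RegexListExtractor and the method extract are invented for this example, and the outer pattern is a modified version of the one above: it excludes '[' as well as ']' inside the match, so the outermost bracket does not need to be cut off first. It assumes the strings contain no brackets, no escaped quotes, and only one level of nesting.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of the two-step regex approach.
class RegexListExtractor {
    // One inner list: '[' ... ']' with no brackets in between; group 1 is the body.
    private static final Pattern INNER_LIST = Pattern.compile("\\[([^\\[\\]]*)\\]");
    // One double-quoted string without escapes; group 1 is the content.
    private static final Pattern QUOTED = Pattern.compile("\"([^\"]*)\"");

    public static List<List<String>> extract(String text) {
        List<List<String>> result = new ArrayList<>();
        Matcher lists = INNER_LIST.matcher(text);
        while (lists.find()) {
            // Second pass: pull every quoted string out of this inner list.
            List<String> items = new ArrayList<>();
            Matcher strings = QUOTED.matcher(lists.group(1));
            while (strings.find()) {
                items.add(strings.group(1));
            }
            result.add(items);
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "[[\"item1\",\"item2\",\"item3\"], [\"some\", \"item\"], [\"far\", \"out\", \"string\"]]";
        for (List<String> inner : extract(text)) {
            System.out.println(inner);
        }
    }
}
```

Note that this silently skips anything it does not recognize, so it is not a validator; malformed input simply produces fewer or shorter lists.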

2 Comments

The regex approach will probably work most of the time. But if the string syntax supports an escaping mechanism (e.g. to handle strings containing double-quote characters) then the regexes get messy. And if you have to deal with an indefinite number of array levels, then regexes won't work at all.
Stephen, you are right. That code was hand-written to provide quick hints to Kim. Thanks for pointing out the possible flaws.
