I need to extract the following section from a JSON file:

    "abridged_cast": [
      {
        "name": "Tom Hanks",
        "characters": ["Woody"]
      },
      {
        "name": "Tim Allen",
        "characters": ["Buzz Lightyear"]
      },
      {
        "name": "Joan Cusack",
        "characters": ["Jessie the Cowgirl"]
      },
      {
        "name": "Don Rickles",
        "characters": ["Mr. Potato Head"]
      },
      {
        "name": "Wallace Shawn",
        "characters": ["Rex"]
      }
    ],

So far my regex only matches this much:

    "abridged_cast": [
     {
        "name": "Tom Hanks",
        "characters": ["Woody"]

The above is obtained using this regex:

\"abridged_cast\": \\[([^]]+)\\]

I need the match to extend to the final closing ], but I can't get it to work. I have tried a huge number of variations with no luck.

  • This is valid JSON, you can parse it with Jackson or GSON. Commented Jan 7, 2015 at 7:31
  • IMHO using a JSON lib would be a much more straightforward approach. Commented Jan 7, 2015 at 7:34
  • Aaack. Please don't use regexes to parse JSON. Use a JSON parser. Commented Jan 7, 2015 at 7:34
  • Does your input have linefeeds? Commented Jan 7, 2015 at 7:36

2 Answers

This is a bit of a train wreck, but:

"abridged_cast": \[(\s*\{\s*"name": "[a-zA-Z .]+",\s*"characters": \[("[a-zA-Z .]+", )*"[a-zA-Z .]+"\]\s*\}(,(?=\s*\{)|\s))*\s*\],?

See demo.

Since the "characters" field is an array, I've allowed for multiple terms there, an example of which I included in the demo.

Note that I have shown only the raw regex; to use it in Java you'll have to escape the quotes and backslashes (which I didn't have the stomach for).
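For anyone who wants the escaped form, here is a minimal sketch of the same regex as a Java string literal, with every quote written as \" and every backslash doubled. The class name AbridgedCastRegex, the helper extract, and the two-entry sample input are made up for illustration; only the pattern itself comes from the answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AbridgedCastRegex {
    // The raw regex from the answer, escaped for a Java string literal
    // and split across three lines purely for readability.
    static final Pattern CAST = Pattern.compile(
        "\"abridged_cast\": \\[(\\s*\\{\\s*\"name\": \"[a-zA-Z .]+\",\\s*"
        + "\"characters\": \\[(\"[a-zA-Z .]+\", )*\"[a-zA-Z .]+\"\\]\\s*\\}"
        + "(,(?=\\s*\\{)|\\s))*\\s*\\],?");

    // Hypothetical helper: returns the whole matched block, or null if nothing matches.
    static String extract(String input) {
        Matcher m = CAST.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Made-up sample in the same layout as the question's data.
        String sample =
            "\"abridged_cast\": [\n"
            + "  {\n"
            + "    \"name\": \"Tom Hanks\",\n"
            + "    \"characters\": [\"Woody\"]\n"
            + "  },\n"
            + "  {\n"
            + "    \"name\": \"Don Rickles\",\n"
            + "    \"characters\": [\"Mr. Potato Head\"]\n"
            + "  }\n"
            + "],\n";
        System.out.println(extract(sample));
    }
}
```

Keep in mind that the character class [a-zA-Z .] only accepts letters, spaces, and periods, so a name or character containing an apostrophe, hyphen, digit, or accented letter will make the whole match fail unless the class is widened.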

1 Comment

Worked perfectly. I was trying to do it without making a huge one like this, but it's all the same in the end.

If you have complete and valid JSON, you can parse it with Jackson or GSON.

These are the data classes:

public static class Role {
    private String name;
    private List<String> characters;

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public List<String> getCharacters() {
        return characters;
    }

    public void setCharacters(List<String> characters) {
        this.characters = characters;
    }
}

public static class Cast {
    @JsonProperty("abridged_cast")
    private List<Role> roles;

    public List<Role> getRoles() {
        return roles;
    }

    public void setRoles(List<Role> roles) {
        this.roles = roles;
    }
}

And this is how you can parse it:

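// ObjectMapper is com.fasterxml.jackson.databind.ObjectMapper (Jackson 2.x)
ObjectMapper om = new ObjectMapper();
// readValue throws a checked IOException (or a subclass) if s cannot be mapped to Cast
Cast cast = om.readValue(s, Cast.class);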

where s is your JSON.

1 Comment

This would work, but my system was built around using regular expressions and was completely built and ready. All I needed was a working regex for that one part.
