0

I need to split a String at each dot ('.'), with one catch, explained below. For example, given a String like this:

   String str = "A.B.C";

splitting at the dot gives A, B and C.

But if some part is enclosed in single quotes, the split should ignore the dots inside it:

   String str = "A.B.'C.D'";

Here, splitting at the dot should give A, B and C.D.

How can I achieve this?

4 Answers

2

If the String is always in the given format, you could try \\.(?![A-Za-z]') as the regex. It matches a dot unless that dot is directly followed by a single letter and a closing quote.
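
A minimal sketch of how this pattern behaves with String.split, assuming the input looks exactly like the examples in the question. The class and method names are made up for illustration. Note that the quotes stay in the result, and that the look-ahead only recognizes a single letter before the closing quote, so longer quoted tokens are still split.

```java
import java.util.Arrays;

// Hypothetical demo class; the names are illustrative only.
final class LookaheadSplitDemo {

    // Splits at every dot that is NOT directly followed by "<letter>'".
    static String[] splitWithLookahead(String s) {
        return s.split("\\.(?![A-Za-z]')");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitWithLookahead("A.B.C")));     // [A, B, C]
        System.out.println(Arrays.toString(splitWithLookahead("A.B.'C.D'"))); // [A, B, 'C.D']
        // Limitation: two letters before the closing quote defeat the look-ahead.
        System.out.println(Arrays.toString(splitWithLookahead("A.'CC.DD'"))); // [A, 'CC, DD']
    }
}
```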


3 Comments

I could be wrong, but it looks like it has matched the dots.
@Doomsknight - It splits on what it consumes, not on everything it checks. It consumes only the literal "."; the negative look-ahead is only tested, not consumed. :)
Oops, I'm getting mixed up with Match. +1 for a working solution.
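
The consumed-versus-checked distinction from the comments can be sketched as follows. This uses a positive look-ahead on a made-up input rather than the negative look-ahead from the answer, purely to make the difference visible; the class and method names are hypothetical.

```java
import java.util.Arrays;

// Hypothetical demo class; the names are illustrative only.
final class ConsumeVsLookaheadDemo {

    // Only the dot is consumed; (?=b) merely checks what follows it.
    static String[] splitWithLookahead(String s) {
        return s.split("\\.(?=b)");
    }

    // Here the "b" is part of the match, so split removes it too.
    static String[] splitConsuming(String s) {
        return s.split("\\.b");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitWithLookahead("a.bc"))); // [a, bc]
        System.out.println(Arrays.toString(splitConsuming("a.bc")));     // [a, c]
    }
}
```
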
2

First, split at '. Then split each piece that lies outside the quotes (here, the one ending in .) at . as well.

"A.B.'C.D'"
=>
"A.B.", "C.D"
=> "A", "B", "C.D"

Java 8 Example

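import java.util.ArrayList;
import java.util.List;

public class Main {
    public static void main(String[] args) {
        final String str = "A.B.'C.D'";
        final List<String> result = new ArrayList<>();

        // Splitting at ' yields unquoted pieces at even indices and quoted pieces at odd indices.
        final String[] pieces = str.split("'");
        for (int i = 0; i < pieces.length; i++) {
            if (i % 2 == 1) {
                result.add(pieces[i]); // quoted piece: keep it whole
                continue;
            }
            // Unquoted piece: split at dots, skipping empty tokens (e.g. from the dot next to a quote).
            for (String token : pieces[i].split("\\.")) {
                if (!token.isEmpty()) {
                    result.add(token);
                }
            }
        }

        System.out.println(String.join(", ", result)); // prints: A, B, C.D
    }
}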


0

What you can do is the following; it works with both single-letter and multi-letter tokens:

String input = "A.B.'C.D'";
//                                              | not following capital letter(s) and '
//                                              |           | dot (escaped)
//                                              |           |  | not followed by 
//                                              |           |  | capital letter(s) and '
System.out.println(Arrays.toString(input.split("(?<![A-Z]+?')\\.(?![A-Z]+?')")));

Output

[A, B, 'C.D']

Note

If you want it case-insensitive, prepend (?i) to the pattern: "(?i)(?<![A-Z]+?')\\.(?![A-Z]+?')"
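
To illustrate what (?i) changes, here is a small sketch. It uses a simplified, look-ahead-only pattern, which is an assumption made for this illustration and not the exact pattern from the answer; the class, constant and method names are hypothetical.

```java
import java.util.Arrays;

// Hypothetical demo class; the names are illustrative only.
final class CaseInsensitiveSplitDemo {

    // Simplified pattern (assumption): a dot not followed by capital letter(s) and '.
    static final String CASE_SENSITIVE = "\\.(?![A-Z]+')";

    // The embedded flag (?i) makes [A-Z] match lower-case letters as well.
    static final String CASE_INSENSITIVE = "(?i)" + CASE_SENSITIVE;

    static String[] split(String s, String regex) {
        return s.split(regex);
    }

    public static void main(String[] args) {
        String input = "a.b.'c.d'";
        System.out.println(Arrays.toString(split(input, CASE_SENSITIVE)));   // [a, b, 'c, d']
        System.out.println(Arrays.toString(split(input, CASE_INSENSITIVE))); // [a, b, 'c.d']
    }
}
```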


0

I don't know of a method in the standard library that does this. It is not too difficult to write yourself, though:

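import java.util.ArrayList;
import java.util.List;

// Splits s at every dot that lies outside single quotes; the quotes are kept.
public static String[] splitByDots(String s)
{
    List<String> ss = new ArrayList<>();
    boolean inString = false; // true while inside a '...' section
    int start = 0;            // index where the current token begins

    for (int p = 0; p < s.length(); p++) {
        char ch = s.charAt(p);
        if (ch == '\'') {
            inString = !inString; // a quote opens or closes a quoted section
        }
        else if (ch == '.') {
            if (!inString) {
                // Unquoted dot: close the current token and start the next one.
                ss.add(s.substring(start, p));
                start = p + 1;
            }
        }
    }

    ss.add(s.substring(start)); // the last token has no trailing dot
    return ss.toArray(new String[0]);
}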

If you want to trim whitespace or remove the quote characters, you will have to tweak the above code a bit, but otherwise it does what you asked for.
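
One possible way to make that tweak is sketched below: the same scan, but it builds each token character by character so the quote characters can be dropped, and it trims surrounding whitespace. The class name is hypothetical, and returning a List instead of an array is a choice made for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical variant of the scan above; the names are illustrative only.
final class QuoteAwareSplitter {

    // Splits at dots outside single quotes, drops the quotes, trims each token.
    static List<String> splitByDots(String s) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;

        for (int i = 0; i < s.length(); i++) {
            char ch = s.charAt(i);
            if (ch == '\'') {
                inQuotes = !inQuotes;                   // toggle state, do not keep the quote
            } else if (ch == '.' && !inQuotes) {
                tokens.add(current.toString().trim());  // unquoted dot ends the token
                current.setLength(0);
            } else {
                current.append(ch);                     // ordinary character or quoted dot
            }
        }
        tokens.add(current.toString().trim());          // last token
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(splitByDots("A.B.'C.D'"));      // [A, B, C.D]
        System.out.println(splitByDots(" A . B .'C.D' ")); // [A, B, C.D]
    }
}
```

Because trim() is applied to the finished token, whitespace at the edges of a quoted section is removed as well; keep the quoted text in a separate buffer if it has to be preserved exactly.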
