
I am trying to split each element of a String array (the one at index i) using a regex that matches 4 or more spaces.

I found a lot of information here and on other sites, and came up with:

String[] parts = titlesAuthor[i].split("    ");

The split should happen between the title and the author's name. The two are separated by 4 or more spaces, or the author may not exist at all.

Example:

titleAuthor[0] = Investigational drugs for autonomic dysfunction in Parkinson's disease          Perez-Lloret S

After running the above split, parts[0] comes up empty and parts[1] holds the complete string.

Please help!

Code:

for (int i = 0; i < nodes.getLength(); i++) {
    Element element = (Element) nodes.item(i);
    NodeList title = element.getElementsByTagName("TEXT");
    line = (Element) title.item(0);
    titlesAuthor[i] = getCharacterDataFromElement(line);
    System.out.println(titlesAuthor[i]);
    parts = titlesAuthor[i].split(" ");
    System.out.println(parts[0]);
    System.out.println(parts[1]);
}
  • So... we're supposed to fix your code without ever seeing the code? Commented Feb 6, 2016 at 23:53
  • It's long, with a lot of commented-out stuff. Should I post it all? Commented Feb 6, 2016 at 23:56
  • 1. You're splitting a String into a String[] (array), not splitting a String array. 2. Your example doesn't even compile; you need double quotes around the string literal. 3. You're splitting on exactly 4 spaces, so if you had 10 spaces the result would look like this: ["before the spaces", "", /* we 'ate' 2*4 spaces */ "--"] (I replaced the spaces with dashes to make them visible). Commented Feb 6, 2016 at 23:59
  • Post an MCVE. Commented Feb 6, 2016 at 23:59
  • for (int i = 0; i < nodes.getLength(); i++) { Element element = (Element) nodes.item(i); NodeList title = element.getElementsByTagName("TEXT"); line = (Element) title.item(0); titlesAuthor[i] = getCharacterDataFromElement(line); System.out.println(titlesAuthor[i]); parts = titlesAuthor[i].split(" "); System.out.println(parts[0]); System.out.println(parts[1]); } Commented Feb 7, 2016 at 0:00

4 Answers

Use the regex \s{4} (written as "\\s{4}" inside a Java string literal).

Here 4 is the number of whitespace characters; you can change it to whatever number you want.

See the demo


1 Comment

This matches exactly 4 spaces, the same as the pattern in the question.

To catch 4 or more spaces you need to indicate it with a +:

String[] parts = titlesAuthor[i].split("    +");

or:

String[] parts = titlesAuthor[i].split(" {4,}");
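For illustration, here is a minimal sketch of the " {4,}" split applied to the sample line from the question. The class name and the spaces helper are made up for this example, and the ten-space gap is an assumption taken from the sample shown above.

```java
import java.util.Arrays;

final class SplitFourOrMore {
    // Hypothetical helper: builds a run of n spaces so the count is explicit.
    static String spaces(int n) {
        char[] run = new char[n];
        Arrays.fill(run, ' ');
        return new String(run);
    }

    // Splits on any run of four or more consecutive space characters.
    static String[] splitTitleAuthor(String line) {
        return line.split(" {4,}");
    }

    public static void main(String[] args) {
        String line = "Investigational drugs for autonomic dysfunction in Parkinson's disease"
                + spaces(10) + "Perez-Lloret S";
        String[] parts = splitTitleAuthor(line);
        System.out.println(parts.length); // 2: the whole run of spaces is one delimiter
        for (String part : parts) {
            System.out.println("[" + part + "]");
        }
    }
}
```

If a line has no run of four or more spaces (no author), split returns a one-element array, so check parts.length before reading parts[1].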

Update: it looks like your XML doesn't look exactly as you think it does. In the code you provided, add:

System.out.println(i + ":" + titlesAuthor[i] + ";");

and you'll see some spaces or newlines at the beginning.
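As a rough sketch of why leading whitespace matters: when the delimiter matches at the very start of the string, split puts an empty string in parts[0]. Calling trim() first is one possible way around that. The class and method names are hypothetical, and the six leading spaces are an assumption based on what the asker reports in the comments.

```java
import java.util.Arrays;

final class LeadingWhitespaceSplit {
    // Hypothetical helper: builds a run of n spaces so the count is explicit.
    static String spaces(int n) {
        char[] run = new char[n];
        Arrays.fill(run, ' ');
        return new String(run);
    }

    // Splits the line as-is; a leading run of 4+ spaces yields an empty first element.
    static String[] splitRaw(String line) {
        return line.split(" {4,}");
    }

    // Strips leading and trailing whitespace (including newlines) before splitting.
    static String[] splitTrimmed(String line) {
        return line.trim().split(" {4,}");
    }

    public static void main(String[] args) {
        String line = spaces(6) + "Some title" + spaces(10) + "Some Author";
        System.out.println(Arrays.toString(splitRaw(line)));     // [, Some title, Some Author]
        System.out.println(Arrays.toString(splitTrimmed(line))); // [Some title, Some Author]
    }
}
```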

8 Comments

Tried it; it's leaving parts[0] empty and saving the title in parts[1].
I'm sorry, but can you fix the titleAuthor[0] in your question? I told you above that you need double quotes.
It's one line from an XML file among 1500 other records.
I'm pretty sure that the input string is not what you think it is. Unless you give us an example input in the question that can be compiled, I can't help further, I'm afraid.
Yeah, it was not. I used String regex = " {4,}"; parts = titlesAuthor[i].split(regex); The first element has 6 empty spaces, the second has the title and the third has the author.

This will skip the spaces: split("\\s+")

1 Comment

This matches 1, 2, or 3 spaces as well as space, tab, space, etc. :)
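To make that concrete, here is a small sketch (with made-up class and helper names) showing that "\\s+" also splits on the single spaces inside the title and the author's name, which is probably not what the question wants.

```java
import java.util.Arrays;

final class WhitespaceRunSplit {
    // Hypothetical helper: builds a run of n spaces so the count is explicit.
    static String spaces(int n) {
        char[] run = new char[n];
        Arrays.fill(run, ' ');
        return new String(run);
    }

    // Splits on every run of one or more whitespace characters.
    static String[] splitOnAnyWhitespace(String line) {
        return line.split("\\s+");
    }

    public static void main(String[] args) {
        String line = "Parkinson's disease" + spaces(10) + "Perez-Lloret S";
        // Prints [Parkinson's, disease, Perez-Lloret, S]
        System.out.println(Arrays.toString(splitOnAnyWhitespace(line)));
    }
}
```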

In your example, your code is splitting when it finds four consecutive spaces. The String that you are splitting above has ten consecutive spaces between:

"disease          Perez".

Thus, the run of spaces is split twice, with two spaces left over. Pretend "#" is a space:

Investigational drugs for autonomic dysfunction in Parkinson's disease|SPLIT||SPLIT|##Perez-Lloret S

Your split will result in:

{[Investigational drugs for autonomic dysfunction in Parkinson's disease], [], [##Perez-Lloret S]}

because your code found two instances of four spaces. parts[1] is an empty string (not null) because there was nothing between the two splits.
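A quick sketch to check this behavior. The class name and the spaces helper are hypothetical, and " {4}" is used as an equivalent way of writing a delimiter of exactly four spaces.

```java
import java.util.Arrays;

final class ExactFourSplit {
    // Hypothetical helper: builds a run of n spaces so the count is explicit.
    static String spaces(int n) {
        char[] run = new char[n];
        Arrays.fill(run, ' ');
        return new String(run);
    }

    // Splits on exactly four consecutive spaces, like the pattern in the question.
    static String[] splitOnFourSpaces(String line) {
        return line.split(" {4}");
    }

    public static void main(String[] args) {
        String line = "disease" + spaces(10) + "Perez-Lloret S";
        String[] parts = splitOnFourSpaces(line);
        System.out.println(parts.length);             // 3
        System.out.println("[" + parts[1] + "]");     // [] : empty string between the two matches
        System.out.println("[" + parts[2] + "]");     // two leftover spaces, then the author
    }
}
```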

Hope this helps!

3 Comments

But this doesn't explain why he doesn't get the first part of the sentence in parts[0].
In the 1st part I'm getting 6 empty spaces, in the 2nd part the title, and in the third the author, if it exists. I realized every line has 6 empty spaces at the beginning.
Can you repost your code in a readable format? When you post it, highlight it and press Command+K if you are using a Mac.
