0

I have some text like:

DESC:manner How did serfdom develop in and then leave Russia ?
ENTY:cremat What films featured the character Popeye Doyle ?
DESC:manner How can I find a list of celebrities ' real names ?

I read them line by line, and I want to convert each line to a String array, word by word, like this:

Array = [DESC, :, manner, How, did, serfdom, develop, in, and, then, leave, Russia, ?]

  • Can you post the code that you have so far, and explain what is not working as you expect? Commented Dec 12, 2012 at 14:27
  • Use String#split(" ")? Commented Dec 12, 2012 at 14:31

8 Answers

2
String[] arr = str.replaceAll(":"," : ").split(" ");


0

The problem is that you want to keep some delimiters and not others (keep the : and lose the spaces). I think you need a regular expression to accomplish this. Something like this should do it:

String str = "DESC:manner How did serfdom develop in and then leave Russia ?";
String[] arr = str.split("((?<=:)|(?=:))|( )");

This uses lookahead and lookbehind assertions to split just before and just after the : without consuming it, so the : is kept as its own element, while the plain space alternative ( ) consumes the spaces, so they are discarded.

After this arr should be:

arr = [DESC, :, manner, How, did, serfdom, develop, in, and, then, leave, Russia, ?]
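As a rough sketch of how this behaves on all three sample lines from the question: the class and method names below are made up for illustration, and using " +" instead of a single space is my own tweak so that repeated spaces do not produce empty elements. It assumes the : is never directly next to a space, which holds for the sample input.

```java
import java.util.Arrays;

final class ColonSplitDemo {
    // Zero-width split just before or just after ':' keeps the colon as a token;
    // the " +" alternative consumes runs of spaces, so they are dropped.
    static String[] tokenize(String line) {
        return line.trim().split("(?<=:)|(?=:)| +");
    }

    public static void main(String[] args) {
        String[] lines = {
            "DESC:manner How did serfdom develop in and then leave Russia ?",
            "ENTY:cremat What films featured the character Popeye Doyle ?",
            "DESC:manner How can I find a list of celebrities ' real names ?"
        };
        for (String line : lines) {
            System.out.println(Arrays.toString(tokenize(line)));
        }
    }
}
```

Other punctuation (the ? and the ') ends up as its own element here only because the input already surrounds it with spaces.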


0

String value = "";
String[] values = value.split(" ");

This should give you an array split on spaces.

2 Comments

i) Let the OP try. ii) This doesn't meet the OP's requirements: there's no space between 'DESC', ':' and 'manner'.
It is only a starting point; the OP has to do the rest. We can guide without writing the code for the requirement. Agree?
0

For a more comprehensive solution, you can use boundary matchers, as described here

String s = "DESC:manner How did serfdom develop in and then leave Russia ?";

String[] split = s.split("\\b");

The split array contains what you are looking for.

1 Comment

This would treat the spaces between words as separate strings.
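To illustrate that point, here is a minimal sketch (class and method names are hypothetical) that splits on "\\b" and then trims each piece and drops the ones that were only whitespace. It uses plain JDK streams, so it assumes Java 8 or later.

```java
import java.util.Arrays;

final class BoundarySplitDemo {
    // split("\\b") cuts at every word boundary, so the spaces between words
    // come back as their own elements (and the final piece is " ?").
    // Trimming and filtering removes the whitespace-only pieces.
    static String[] tokenize(String line) {
        return Arrays.stream(line.split("\\b"))
                .map(String::trim)
                .filter(token -> !token.isEmpty())
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        String s = "DESC:manner How did serfdom develop in and then leave Russia ?";
        System.out.println(Arrays.toString(s.split("\\b"))); // raw pieces, spaces included
        System.out.println(Arrays.toString(tokenize(s)));    // cleaned-up tokens
    }
}
```

Note that adjacent punctuation characters (for example "?!") have no word boundary between them, so they would stay together as one element.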
0

You mean you want to split each line into a String array.

There are two convenient ways to do this.

The first, of course, is the split method; see the String class in the Java SE documentation.

The second is a regex pattern; see the java.util.regex package in the Java SE documentation as well.


0

For your example:

String[] part = line.split(":| ");

Where line would be one of your example lines.

Note that there is a space after the | in the regex.

I would advise reading up on Regular Expressions and getting hold of a tool like Expresso to try them out.

Expresso: http://www.ultrapico.com/Expresso.htm

1 Comment

That's not what the OP wanted, it trashes the : instead of adding it as an element in the result.
0

If you don't mind punctuation being removed from the result, String#split("\\W") (split on nonword characters) will do it:

// you've got this from the input parser
String inputLine = "DESC:manner How did serfdom develop in and then leave Russia ?";

String[] wordArray = inputLine.split("\\W");

That gives:

wordArray = [DESC, manner, How, did, serfdom, develop, in, and, then, leave, Russia]
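One caveat, sketched below with a made-up class name: a single "\\W" splits at every nonword character, so a run such as " ' " in the third sample line leaves empty strings in the middle of the array. Splitting on "\\W+" instead treats the whole run as one delimiter.

```java
import java.util.Arrays;

final class NonWordSplitDemo {
    // "\\W+" treats a run of nonword characters as a single delimiter,
    // so no empty strings appear between words.
    static String[] words(String line) {
        return line.split("\\W+");
    }

    public static void main(String[] args) {
        String line = "DESC:manner How can I find a list of celebrities ' real names ?";
        // Single \W: empty strings show up between "celebrities" and "real".
        System.out.println(Arrays.toString(line.split("\\W")));
        // \W+: words only.
        System.out.println(Arrays.toString(words(line)));
    }
}
```

A line that starts with a nonword character would still produce one empty string at the front with either pattern.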

If you need punctuation, I don't think a regex will be able to do it, since split discards the characters it matches.

1 Comment

I don't think a regex will be able to do it - sure it can, you just need the correct regex :)
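One way to do it, as a sketch rather than the only option: instead of splitting, match the tokens themselves with Matcher.find(). The pattern below (my own choice, not from the answers above) takes either a run of word characters or a single character that is neither a word character nor whitespace. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MatchTokensDemo {
    // A token is a run of word characters, or one punctuation character.
    // Whitespace matches neither alternative, so it is skipped.
    private static final Pattern TOKEN = Pattern.compile("\\w+|[^\\w\\s]");

    static List<String> tokenize(String line) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(line);
        while (matcher.find()) {
            tokens.add(matcher.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String line = "DESC:manner How did serfdom develop in and then leave Russia ?";
        // prints [DESC, :, manner, How, did, serfdom, develop, in, and, then, leave, Russia, ?]
        System.out.println(tokenize(line));
    }
}
```

By default \w covers only ASCII letters, digits and underscore, so accented letters would each come out as separate one-character tokens unless the pattern is compiled with Pattern.UNICODE_CHARACTER_CLASS.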
0

You can use Guava's Splitter:

Iterable<String> wordsIterable = Splitter.on(Pattern.compile("\\b")).trimResults().omitEmptyStrings().split(string);
String[] words = Iterables.toArray(wordsIterable, String.class);

