1

I'm having some trouble with a string represented as an array of characters. What I'd like to do, as I would in Java, is the following:

    while (i < chars.length) {
        char ch = chars[i];
        if ((WORD_CHARS.indexOf(ch) >= 0) == punctuation) {

            String token = buffer.toString();
            if (token.length() > 0) {
                parts.add(token);
            }
            buffer = new StringBuffer();

        }
        buffer.append(ch);
        i++;
    }

What I'm doing is something like this:

while(i < strlen(chars)) {

    char ch = chars[i];
    if(([WORD_CHARS rangeOfString:ch] >= 0) == punctuation) {

        NSString *token = buffer.toString();
        if([token length] > 0) {
            [parts addObject:token];
        }
        buffer = [NSMutableString string];
    }
    [buffer append(ch)];
    i++;
}

I'm not sure how I'm supposed to convert

 String token = buffer.toString();

to Objective-C, where buffer is an NSMutableString. Also, how do I write this if condition in Objective-C?

if ((WORD_CHARS.indexOf(ch) >= 0) == punctuation) 

WORD_CHARS is an NSString. I'm also having trouble with appending ch to buffer.

Any help is greatly appreciated.

4
  • 1
    developer.apple.com/library/mac/#documentation/Cocoa/Reference/… Commented Jun 30, 2013 at 11:12
  • 3
    Even if your Java code can be translated almost verbatim to Objective-C, there might be better and simpler methods available to achieve the result. Therefore it would help if you show some sample input and the expected output. Commented Jun 30, 2013 at 11:23
  • And, this question has nothing to do with Java, please remove the tag. Commented Jun 30, 2013 at 11:24
  • I think that a comparison between the Java code snippet and an Objective-C one isn't misplaced. Commented Jun 30, 2013 at 14:33

2 Answers

6

Sometimes a line-by-line translation isn't the best way.

I'd do something more like this (untested) code, assuming chars is an NSString:

NSCharacterSet *punctuation = 
    [NSCharacterSet characterSetWithCharactersInString:@"<your separators>"];
NSArray *parts = [chars componentsSeparatedByCharactersInSet:punctuation];

That should leave parts an NSArray of NSStrings that contain your original NSString split by punctuation.
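
One detail worth keeping in mind: componentsSeparatedByCharactersInSet: returns an empty string between two adjacent separators (for example, a comma followed by a space), whereas the Java loop in the question skips empty tokens. A minimal sketch of filtering those out is below. The helper name SplitOnSeparators and its parameters are hypothetical, introduced only for illustration, and the code assumes ARC.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of Foundation): splits `text` on any
// character found in `separators` and drops the empty strings that
// appear when two separators are adjacent.
static NSArray *SplitOnSeparators(NSString *text, NSString *separators) {
    NSCharacterSet *separatorSet =
        [NSCharacterSet characterSetWithCharactersInString:separators];
    NSArray *pieces = [text componentsSeparatedByCharactersInSet:separatorSet];

    NSMutableArray *parts = [NSMutableArray array];
    for (NSString *piece in pieces) {
        // Mirrors the `token.length() > 0` check in the Java version.
        if (piece.length > 0) {
            [parts addObject:piece];
        }
    }
    return parts;
}
```

Calling SplitOnSeparators(@"Hello, world!", @", !") should then give an array holding only the non-empty pieces.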

1

From your example it appears that you are trying to omit punctuation and create a list of words from a given string. Foundation has you covered if that's your intent. If it's not, feel free to downvote.

Say your original string is stored in a variable named string. Here's one way to enumerate all the words in the string, which automatically skips punctuation.

NSRange fullRange = NSMakeRange(0, string.length);
[string enumerateSubstringsInRange:fullRange
                           options:NSStringEnumerationByWords 
                        usingBlock:^(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop) {
    // this block will be invoked for each word in the string
    // and the word is stored in substring.
}];
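
To end up with the same kind of list the Java loop builds in parts, the block can simply append each word to a mutable array. The following is a minimal sketch of that idea. The function name WordsInString is hypothetical, and the code assumes ARC.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of Foundation): collects every word that
// enumerateSubstringsInRange:options:usingBlock: reports into an array.
static NSArray *WordsInString(NSString *string) {
    NSMutableArray *words = [NSMutableArray array];
    NSRange fullRange = NSMakeRange(0, string.length);
    [string enumerateSubstringsInRange:fullRange
                               options:NSStringEnumerationByWords
                            usingBlock:^(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop) {
        // `words` is only mutated, never reassigned, so it needs no __block.
        [words addObject:substring];
    }];
    return words;
}
```

Because the array is filled inside the block, the caller gets the tokens back in the order they occur in the string.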

Given this sentence:

Typically, multiple-word names will be returned as multiple tokens, following the standard tokenization practice of the tagger. If this option is set, then multiple-word names will be joined together and returned as a single token.

the tokens I got were (notice the absence of punctuation):

Typically
multiple
word
names
will
be
returned
as
multiple
tokens
following
the
standard
tokenization
practice
of
the
tagger
If
this
option
is
set
then
multiple
word
names
will
be
joined
together
and
returned
as
a
single
token

If you have more complex requirements, you can look at enumeration using enumerateLinguisticTagsInRange:scheme:options:orthography:usingBlock:.
