Cocoa: Explode or break an NSString into individual words

Breaking a string of text into its component words is a common first step for text searches and other text processing. The task is easy in Cocoa/Objective-C, although it requires digging through a few class references in the documentation. If you need a more elaborate way of splitting a string, this code will at least give you a starting point.


To break the NSString bigString into an NSArray containing the individual words separated by whitespace, use:

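// Sample input; any NSString works here.
NSString *bigString = @"not really that big";

// Split on spaces and tabs. words now holds @"not", @"really", @"that", @"big".
NSArray *words = [bigString componentsSeparatedByCharactersInSet:[NSCharacterSet whitespaceCharacterSet]];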

The heart of this operation is the componentsSeparatedByCharactersInSet: method of NSString, which breaks bigString into an array of NSStrings. The word boundaries are defined by the NSCharacterSet object returned by the class method whitespaceCharacterSet, which contains the space and tab characters. To split on the various Unicode newline characters as well, replace whitespaceCharacterSet with whitespaceAndNewlineCharacterSet in the example above.
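One detail worth knowing before relying on this call: when two separator characters sit next to each other (a double space, or a space followed by a newline), componentsSeparatedByCharactersInSet: returns an empty string between them. The sketch below splits on whitespace and newlines and then drops those empty strings. WordsSplitOnWhitespace is a hypothetical helper name chosen for illustration, not part of Foundation.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of Foundation): splits text on spaces, tabs,
// and newlines, then skips the empty strings left by adjacent separators.
static NSArray *WordsSplitOnWhitespace(NSString *text) {
    NSCharacterSet *separators = [NSCharacterSet whitespaceAndNewlineCharacterSet];
    NSArray *pieces = [text componentsSeparatedByCharactersInSet:separators];

    NSMutableArray *words = [NSMutableArray array];
    for (NSString *piece in pieces) {
        // An empty piece means two separators were side by side.
        if ([piece length] > 0) {
            [words addObject:piece];
        }
    }
    return words;
}
```

For example, calling WordsSplitOnWhitespace(@"one  two\nthree") should return the three words one, two, and three, whereas the unfiltered call returns four components because of the double space.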

Of course, words can be separated by more than whitespace and newlines. Punctuation characters are available through the punctuationCharacterSet class method of NSCharacterSet. To perform a proper detonation of grammatical text into constituent words separated by whitespace, newlines, and punctuation, you must build a character set that is the union of all three. Because whitespaceAndNewlineCharacterSet already combines the first two, a single union with the punctuation set is enough:

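// Start from a mutable set containing the punctuation characters.
NSMutableCharacterSet *separators = [NSMutableCharacterSet punctuationCharacterSet];

// Add the whitespace and newline characters to the same set.
[separators formUnionWithCharacterSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];

// Split wherever any character in the combined set appears.
NSArray *words = [bigString componentsSeparatedByCharactersInSet:separators];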
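With punctuation in the separator set, adjacent separators become the norm rather than the exception: a comma followed by a space produces an empty string in the result, and so does a period at the end of the text. A minimal sketch of one way to tidy that up is shown here. WordsSplitOnPunctuationAndWhitespace is a hypothetical helper name, and removing @"" from a mutable copy is just one of several reasonable ways to filter.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of Foundation): splits text on punctuation,
// whitespace, and newlines, and removes the empty strings that result.
static NSArray *WordsSplitOnPunctuationAndWhitespace(NSString *text) {
    // Build the combined separator set exactly as in the article.
    NSMutableCharacterSet *separators = [NSMutableCharacterSet punctuationCharacterSet];
    [separators formUnionWithCharacterSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];

    NSMutableArray *words = [NSMutableArray arrayWithArray:
        [text componentsSeparatedByCharactersInSet:separators]];

    // removeObject: removes every element equal to @"", not just the first.
    [words removeObject:@""];
    return words;
}
```

Calling WordsSplitOnPunctuationAndWhitespace(@"Hello, world! Is it big?") should yield Hello, world, Is, it, and big. Keep in mind that the apostrophe and hyphen are punctuation too, so a word such as don't is split into don and t. If that matters for your text, you would need to remove those characters from the separator set first.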

 

About Quinn McHenry

Quinn was one of the original co-founders of Tech-Recipes. He is currently crafting iOS applications as a senior developer at Small Planet Digital in Brooklyn, New York.

3 Responses to “Cocoa: Explode or break an NSString into individual words”

  1. June 18, 2009 at 11:31 pm, rob said:

    thanks!

  2. September 03, 2009 at 5:05 pm, rdiz said:

    thanks holmes! For those of us transitioning into a new language, it's nice to google things like this without digging through books. If only things were documented as nicely as php.net.

  3. September 14, 2009 at 1:04 pm, Guest said:

    Insightful read. I have stumbled and twittered this for my friends. Others no doubt will like it like I did.
