Using NSScanner to parse a string - objective-c

I have a string with formatting tags in it, such as There are {adults} adults, and {children} children. I have a dictionary which has "adults" and "children" as keys, and I need to look up the value and replace the macros with that value. This is fully dynamic; the keys could be anything (so I can't hardcode a stringByReplacingString).
In the past, I've done similar things by looping through a mutable string and searching for the characters, removing what I've already searched from the source string as I go. It seems like this is exactly the type of thing NSScanner is designed for, so I tried this:
NSScanner *scanner = [NSScanner scannerWithString:format];
NSString *foundString;
scanner.charactersToBeSkipped = nil;
NSMutableString *formatedResponse = [NSMutableString string];
while ([scanner scanUpToString:@"{" intoString:&foundString]) {
    [formatedResponse appendString:[foundString stringByReplacingOccurrencesOfString:@"{" withString:@""]]; //Formatted string contains everything up to the {
    [scanner scanUpToString:@"}" intoString:&foundString];
    NSString *key = [foundString stringByReplacingOccurrencesOfString:@"}" withString:@""];
    [formatedResponse appendString:[data objectForKey:key]];
}
NSRange range = [format rangeOfString:@"}" options:NSBackwardsSearch];
if (range.location != NSNotFound) {
    [formatedResponse appendString:[format substringFromIndex:range.location + 1]];
}
The problem with this is that when my string starts with "{", then the scanner returns NO, instead of YES. (Which is what the documentation says should happen). So am I misusing NSScanner? The fact that scanUpToString doesn't include the string that was being searched for as part of its output seems to make it almost useless...
Can this be easily changed to do what I want, or do I need to re-write using a mutable string and searching for the characters manually?

Use isAtEnd to determine when to stop. Also, the { and } are not included in the result of scanUpToString:, so they will be at the beginning of the next string, but the append after the loop is not necessary since the scanner will return scanned content even if the search string is not found.
// Prevent the scanner from ignoring whitespace between formats. For example, without this,
// "{a} {b}", "{a}{b}" and "{a}\n{b}" are all equivalent
[scanner setCharactersToBeSkipped:[NSCharacterSet characterSetWithCharactersInString:@""]];
while (![scanner isAtEnd]) {
    if ([scanner scanUpToString:@"{" intoString:&foundString]) {
        [formattedResponse appendString:foundString];
    }
    if (![scanner isAtEnd]) {
        [scanner scanString:@"{" intoString:nil];
        foundString = @""; // scanUpToString doesn't modify foundString if no characters are scanned
        [scanner scanUpToString:@"}" intoString:&foundString];
        [formattedResponse appendString:[data objectForKey:foundString]];
        [scanner scanString:@"}" intoString:nil];
    }
}
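For reference, the same loop can be wrapped in a self-contained function. This is only a sketch: the name ExpandTemplate is hypothetical, it assumes ARC and string values in the dictionary, and leaving unknown keys in place as "{key}" is a choice made here, not something the question asked for.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: expands {key} placeholders in `format` using `values`.
// Unknown keys and unterminated placeholders are kept as literal text
// (an assumption made for this sketch).
static NSString *ExpandTemplate(NSString *format, NSDictionary<NSString *, NSString *> *values) {
    NSScanner *scanner = [NSScanner scannerWithString:format];
    scanner.charactersToBeSkipped = nil; // keep whitespace between placeholders
    NSMutableString *result = [NSMutableString string];

    while (!scanner.isAtEnd) {
        NSString *literal = nil;
        // Returns NO when the scanner already sits on "{"; that is not an error.
        if ([scanner scanUpToString:@"{" intoString:&literal]) {
            [result appendString:literal];
        }
        if (![scanner scanString:@"{" intoString:NULL]) {
            break; // no more placeholders
        }
        NSString *key = @""; // stays empty if nothing is scanned, e.g. for "{}"
        [scanner scanUpToString:@"}" intoString:&key];
        BOOL closed = [scanner scanString:@"}" intoString:NULL];
        NSString *value = closed ? values[key] : nil;
        if (value != nil) {
            [result appendString:value];
        } else {
            [result appendFormat:@"{%@%@", key, closed ? @"}" : @""];
        }
    }
    return result;
}
```

Because the loop is driven by isAtEnd rather than by the return value of scanUpToString:, a format that starts with "{" is handled like any other.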

Related

Parsing SRT file with Objective C

Text example:
1
00:00:00,000 --> 00:00:01,000
This is the first line
2
00:00:01,000 --> 00:00:02,000
This is the second line
3
00:00:02,000 --> 00:00:03,000
This is the last line
In JavaScript I would certainly parse this with a regular expression. I'm just wondering, is that the best way to do this in Obj C? I'm sure I could figure out a way to do this, but I want to do it in an appropriate way.
I only need to know where to start and I'm happy to do the rest, but for understanding's sake, I'm going to end up with something like this (pseudo code):
NSDictionary
index -> [0-9]+
start -> hh:mm:ss,mmm
end -> hh:mm:ss,mmm
text -> one of the lines of text
In this case, I'd be parsing three entries into my dictionary.
Some background: I wrote a small app and created a file called stuff.srt containing your examples that resides in the bundle; hence, my means of accessing it.
This is just a quick and dirty thing, a proof-of-concept. Note that it doesn't check results. Real applications always check their results. As you can see, the work takes place in the -applicationDidFinishLaunching: method (I'm working in Mac OS X, not iOS).
EDIT:
It's been pointed out that the code as originally posted didn't handle multiple text lines correctly. To address this, I take advantage of the fact that SRT files use CRLF as their line breaks, and search for two occurrences of this sequence. I then change all occurrences of CRLF in the text string to spaces, based on what I observed here. This doesn't account for leading or trailing spaces in each line of the text.
I changed the contents of the stuff.srt file to this:
1
00:00:00,000 --> 00:00:01,000
This is the first line
and it has a secondary line
2
00:00:01,000 --> 00:00:02,000
This is the second line
3
00:00:02,000 --> 00:00:03,000
This is the last line
and it has a secondary line too
and the code has been revised as follows (I also put everything into an @autoreleasepool directive; there might be a lot of autoreleased objects generated in the course of parsing the file!):
- (void)applicationDidFinishLaunching:(NSNotification *)aNotification
{
    NSString *path = [[NSBundle mainBundle] pathForResource:@"stuff" ofType:@"srt"];
    NSString *string = [NSString stringWithContentsOfFile:path encoding:NSUTF8StringEncoding error:NULL];
    NSScanner *scanner = [NSScanner scannerWithString:string];
    while (![scanner isAtEnd])
    {
        @autoreleasepool
        {
            NSString *indexString;
            (void) [scanner scanUpToCharactersFromSet:[NSCharacterSet newlineCharacterSet] intoString:&indexString];
            NSString *startString;
            (void) [scanner scanUpToString:@" --> " intoString:&startString];
            // My string constant doesn't begin with spaces because scanners
            // skip spaces and newlines by default.
            (void) [scanner scanString:@"-->" intoString:NULL];
            NSString *endString;
            (void) [scanner scanUpToCharactersFromSet:[NSCharacterSet newlineCharacterSet] intoString:&endString];
            NSString *textString;
            // (void) [scanner scanUpToCharactersFromSet:[NSCharacterSet newlineCharacterSet] intoString:&textString];
            // BEGIN EDIT
            (void) [scanner scanUpToString:@"\r\n\r\n" intoString:&textString];
            textString = [textString stringByReplacingOccurrencesOfString:@"\r\n" withString:@" "];
            // Addresses trailing space added if CRLF is on a line by itself at the end of the SRT file
            textString = [textString stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceCharacterSet]];
            // END EDIT
            NSDictionary *dictionary = [NSDictionary dictionaryWithObjectsAndKeys:
                                        indexString, @"index",
                                        startString, @"start",
                                        endString , @"end",
                                        textString , @"text",
                                        nil];
            NSLog(@"%@", dictionary);
        }
    }
}
The revised output looks like this:
2013-02-09 16:10:17.727 SRTFileScan[4846:303] {
end = "00:00:01,000";
index = 1;
start = "00:00:00,000";
text = "This is the first line and it has a secondary line";
}
2013-02-09 16:10:17.729 SRTFileScan[4846:303] {
end = "00:00:02,000";
index = 2;
start = "00:00:01,000";
text = "This is the second line";
}
2013-02-09 16:10:17.730 SRTFileScan[4846:303] {
end = "00:00:03,000";
index = 3;
start = "00:00:02,000";
text = "This is the last line and it has a secondary line too";
}
One other thing I learned from what I've read today: The SRT file format originated in France, and the comma seen in the input is the decimal separator used there.
Apple has sample code that parses subtitle files. Check the relevant part here:
https://developer.apple.com/library/mac/samplecode/avsubtitleswriterOSX/Listings/avsubtitleswriter_SubtitlesTextReader_m.html#//apple_ref/doc/uid/DTS40013409-avsubtitleswriter_SubtitlesTextReader_m-DontLinkElementID_5
My suggestion is to use an NSDateFormatter to parse the second line of each entry. I would split that line into two strings (see componentsSeparatedByString: in the NSString class reference), and do this while reading the file line by line.
So the loop would be:
If the file still contains data, read the next line;
If the line number is a multiple of 4, allocate a new object. This object should be able to contain two dates, one integer and one string;
If the line number is not a multiple of 4, read the line and assign its value to the corresponding field.
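As a rough illustration of the "split the timing line in two" step, here is a sketch that uses componentsSeparatedByString: for the split. Note that it swaps in a plain NSScanner for NSDateFormatter and returns seconds rather than dates; the function names are hypothetical and ARC is assumed.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: converts "hh:mm:ss,mmm" to seconds,
// or returns -1 if the text does not have that shape.
static NSTimeInterval SecondsFromTimestamp(NSString *stamp) {
    NSScanner *scanner = [NSScanner scannerWithString:stamp];
    NSInteger h = 0, m = 0, s = 0, ms = 0;
    BOOL ok = [scanner scanInteger:&h] && [scanner scanString:@":" intoString:NULL] &&
              [scanner scanInteger:&m] && [scanner scanString:@":" intoString:NULL] &&
              [scanner scanInteger:&s] && [scanner scanString:@"," intoString:NULL] &&
              [scanner scanInteger:&ms];
    return ok ? h * 3600 + m * 60 + s + ms / 1000.0 : -1;
}

// Hypothetical helper: splits a timing line such as
// "00:00:01,000 --> 00:00:02,500" into start and end seconds.
static BOOL ParseTimingLine(NSString *line, NSTimeInterval *start, NSTimeInterval *end) {
    NSArray<NSString *> *parts = [line componentsSeparatedByString:@" --> "];
    if (parts.count != 2) return NO;
    *start = SecondsFromTimestamp(parts[0]);
    *end = SecondsFromTimestamp(parts[1]);
    return *start >= 0 && *end >= 0;
}
```

Working in seconds avoids having to pick a calendar, locale and time zone for what is really an offset into the video, but an NSDateFormatter would work as well.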

Yet another NSScanner characterSetWithCharactersInString newb

Let's assume I have a string ("G00 X0.0000 Y0.0000") and I need to parse its contents. Here is my code:
NSCharacterSet *params = [NSCharacterSet characterSetWithCharactersInString:@"XY"];
//setup the scanner
NSScanner *scanner = [NSScanner scannerWithString:stringToBeScanned];
NSString *scanned = nil;
//scan the string
NSLog(@"%@", stringToBeScanned);
while ([scanner scanUpToCharactersFromSet:params intoString:&scanned]) {
    struct keypair code;
    code.key = [scanned characterAtIndex:0];
    code.value = [[scanned substringFromIndex:1] doubleValue];
    NSLog(@"--> %@ [%lu]= (%c, %.4f)", scanned, (unsigned long)[scanner scanLocation], code.key, code.value);
}
And the output to NSLog:
G00 X0.0000 Y0.0000
--> G00 [4]= (G, 0.0000)
My characterSet includes both 'X' and 'Y' and I can't figure out why my NSScanner won't scan in the 'X0.0000 ' - it should find that Y and pull in everything from X up to Y according to my understanding.
I can see from the scanLocation that the scanner is stopping at index 4 (correctly), but the loop either doesn't continue or evaluates to false. Shouldn't the scanner keep looping and finding my delimiters (from the characterSet) and grabbing data?
scanUpToCharactersFromSet:intoString: scans up to the "X" and gives you the characters it scanned "G00 ".
Note that it does not scan the "X". When you call the method again, it looks at the next character (the "X"), notices that it is a character in the set, and stops scanning. As it scanned no characters, it then returns NO.
To scan the "X" (or "Y"), you will want to use scanCharactersFromSet:intoString: as well.
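A minimal sketch of that alternation, assuming ARC and input made of a letter followed by a number (the letter set "GXY" and the function name ParseCodes are assumptions for illustration). It uses scanDouble: to read the value instead of scanning up to the next letter:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: parses words such as "G00 X0.0000 Y1.5" into
// letter -> number pairs. The letter set is an assumption for this sketch.
static NSDictionary<NSString *, NSNumber *> *ParseCodes(NSString *line) {
    NSCharacterSet *letters = [NSCharacterSet characterSetWithCharactersInString:@"GXY"];
    NSScanner *scanner = [NSScanner scannerWithString:line];
    NSMutableDictionary<NSString *, NSNumber *> *pairs = [NSMutableDictionary dictionary];

    while (!scanner.isAtEnd) {
        NSString *letter = nil;
        double value = 0.0;
        // Consume the delimiter itself, then the number that follows it.
        if ([scanner scanCharactersFromSet:letters intoString:&letter] &&
            [scanner scanDouble:&value]) {
            pairs[letter] = @(value);
        } else {
            break; // unexpected input; stop rather than loop forever
        }
    }
    return pairs;
}
```

The scanner's default charactersToBeSkipped takes care of the spaces between words, so no explicit whitespace handling is needed here.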
I solved this issue. Basically I receive a string with a list of "codes", each followed by a value associated with that command/parameter. There could be several different "commands" in each string, or none at all. The key was to use scanCharactersFromSet: and scanUpToCharactersFromSet: together in order to capture the right pairings and parse the entire string while staying very flexible. It's a little ugly, I know.
Here is my code:
//setup the scanner
NSScanner *scanner = [NSScanner scannerWithString:[self stringByAppendingString:@"!"]];
NSCharacterSet *codeset = [NSCharacterSet characterSetWithCharactersInString:@"GMTFIJKPRSXYZ!"];
NSString *scanned = nil;
char codechar;
//perform the first scan
[scanner scanCharactersFromSet:codeset intoString:&scanned];
if (scanned)
    codechar = [scanned characterAtIndex:0];
//scan the string
while ([scanner scanUpToCharactersFromSet:codeset intoString:&scanned]) {
    struct keypair code;
    code.key = codechar;
    code.value = [scanned doubleValue];
    NSLog(@"--> (%c, %.4f)", code.key, code.value);
    //skip over the delimiter we encountered
    [scanner scanCharactersFromSet:codeset intoString:&scanned];
    if (scanned)
        codechar = [scanned characterAtIndex:0];
}

Call a method on every word in NSString

I would like to loop through an NSString and call a custom function on every word that meets a certain criterion (for example, "has 2 'L's"). I was wondering what the best way of approaching that is. Should I use Find/Replace patterns? Blocks?
-(NSString *)convert:(NSString *)wordToConvert{
/// This I have already written
return finalWord;
}
-(NSString *) method:(NSString *) sentenceContainingWords{
// match every word that meets the criteria (for example the 2Ls) and replace it with what convert: does.
}
To enumerate the words in a string, you should use -[NSString enumerateSubstringsInRange:options:usingBlock:] with NSStringEnumerationByWords and NSStringEnumerationLocalized. All of the other methods listed use a means of identifying words which may not be locale-appropriate or correspond to the system definition. For example, two words separated by a comma but not whitespace (e.g. "foo,bar") would not be treated as separate words by any of the other answers, but they are in Cocoa text views.
[aString enumerateSubstringsInRange:NSMakeRange(0, [aString length])
                            options:NSStringEnumerationByWords | NSStringEnumerationLocalized
                         usingBlock:^(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop){
    if ([substring rangeOfString:@"ll" options:NSCaseInsensitiveSearch].location != NSNotFound)
        /* do whatever */;
}];
As documented for -enumerateSubstringsInRange:options:usingBlock:, if you call it on a mutable string, you can safely mutate the string being enumerated within the enclosingRange. So, if you want to replace the matching words, you can with something like [aString replaceCharactersInRange:substringRange withString:replacementString].
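A small sketch of such an in-place replacement, assuming ARC. Upper-casing stands in for the asker's convert: method, and the function name is hypothetical:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: upper-cases every word containing "ll", in place.
// Upper-casing is only a stand-in for whatever conversion is really needed.
static void UppercaseDoubleLWords(NSMutableString *text) {
    [text enumerateSubstringsInRange:NSMakeRange(0, text.length)
                             options:NSStringEnumerationByWords | NSStringEnumerationLocalized
                          usingBlock:^(NSString *substring, NSRange substringRange,
                                       NSRange enclosingRange, BOOL *stop) {
        if ([substring rangeOfString:@"ll" options:NSCaseInsensitiveSearch].location != NSNotFound) {
            // The mutation stays inside substringRange, which lies within enclosingRange.
            [text replaceCharactersInRange:substringRange
                                withString:substring.uppercaseString];
        }
    }];
}
```

For example, calling it on a mutable copy of "hello big world, tall tree" leaves "HELLO big world, TALL tree".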
The two ways I know of looping an array that will work for you are as follows:
NSArray *words = [sentence componentsSeparatedByCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
for (NSString *word in words)
{
NSString *transformedWord = [obj method:word];
}
and
NSArray *words = [sentence componentsSeparatedByCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
[words enumerateObjectsWithOptions:NSEnumerationConcurrent usingBlock:^(id word, NSUInteger idx, BOOL *stop){
NSString *transformedWord = [obj method:word];
}];
The other method, –makeObjectsPerformSelector:withObject:, won't work for you. It expects to be able to call [word method:obj] which is backwards from what you expect.
If you could write your criteria with regular expressions, then you could probably do a regular expression matching to fetch these words and then pass them to your convert: method.
You could also do a split of string into an array of words using componentsSeparatedByString: or componentsSeparatedByCharactersInSet:, then go over the words in the array and detect if they fit your criteria somehow. If they fit, then pass them to convert:.
Hope this helps.
As of iOS 12/macOS 10.14 the recommended way to do this is with the Natural Language framework.
For example:
import NaturalLanguage
let myString = "..."
let tokeniser = NLTokenizer(unit: .word)
tokeniser.string = myString
tokeniser.enumerateTokens(in: myString.startIndex..<myString.endIndex) { wordRange, attributes in
performActionOnWord(myString[wordRange])
return true // or return false to stop enumeration
}
Using NLTokenizer also has the benefit of allowing you to optionally specify the language of the string beforehand:
tokeniser.setLanguage(.hebrew)
I would recommend using a while loop to go through the string like this.
NSUInteger length = [sentenceContainingWords length];
NSUInteger wordStart = 0;
while (wordStart < length) {
    // Look for the next space, starting where the previous word ended
    NSRange spaceRange = [sentenceContainingWords rangeOfString:@" " options:0 range:NSMakeRange(wordStart, length - wordStart)];
    NSUInteger wordEnd = (spaceRange.location == NSNotFound) ? length : spaceRange.location;
    if (wordEnd > wordStart) { // skip empty "words" between consecutive spaces
        NSString *wordString = [sentenceContainingWords substringWithRange:NSMakeRange(wordStart, wordEnd - wordStart)];
        [self convert:wordString];
    }
    wordStart = wordEnd + 1; // step over the space itself
}
This code would probably need to be rewritten because it's pretty rough, but you should get the idea.
Edit: Just saw Jacob Gorban's post, you should definitely do it like that.

scanUpToCharactersFromSet stops after one loop

I'm trying to get the contents of a CSV file into an array. When I've done this before I had one record per line, and used the newline character with scanUpToCharactersFromSet:intoString:, passing newlineCharacterSet as the character set:
while ([lineScanner scanUpToCharactersFromSet:[NSCharacterSet newlineCharacterSet]
intoString:&line])
Now, I'm working with a file where many of the entries themselves contain newline characters. I've tried adding a unique character to the end of each record (a * character) but my loop only runs once. Is there something which is making the while loop break that I don't know about? Here's the code I'm using now:
NSError *error;
NSString *data = [[NSString alloc] initWithContentsOfFile:[[self delegate] filePath] encoding:NSUTF8StringEncoding error:&error];
NSScanner *lineScanner = [NSScanner scannerWithString:data];
NSString *line = nil;
// Start parsing the CSV file
while ([lineScanner scanUpToCharactersFromSet:[NSCharacterSet characterSetWithCharactersInString:@"*"]
                                   intoString:&line]) {
    NSArray *elements = [line componentsSeparatedByString:@","];
    NSLog(@"Name: %@", [elements objectAtIndex:1]);
}
Edit: Thanks to Peter's answer below, I found that my scanner was stuck behind the * character. I added this line in the loop:
[lineScanner scanCharactersFromSet:[NSCharacterSet characterSetWithCharactersInString:@"*"] intoString:NULL];
and now it's working like it should.
Let's go through one pass at a time:
First:
while ([lineScanner scanUpToCharactersFromSet:[NSCharacterSet newlineCharacterSet] intoString:&line]) {
The scanner puts everything before the line break into line. It advances up to the newline.
Second:
while ([lineScanner scanUpToCharactersFromSet:[NSCharacterSet newlineCharacterSet] intoString:&line]) {
The scanner is already on a line break, so it scans no characters. As documented, since it scanned no characters, it returns NO. Your loop terminates.
The solution is to scan the line break at the end of the loop, to get the scanner past it. You can pass NULL for the output parameter, assuming you don't care what the line break was.
This is correct behavior: If you did/do care what the characters you scanned up to were, this lets you obtain them. That would be more difficult if NSScanner scanned past the characters automatically.
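Put together, a record loop along these lines might look like the following sketch. It assumes ARC and '*' as the record terminator, as in the question; the function name SplitRecords is hypothetical.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: splits `data` into records terminated by '*'.
static NSArray<NSString *> *SplitRecords(NSString *data) {
    NSCharacterSet *terminator = [NSCharacterSet characterSetWithCharactersInString:@"*"];
    NSScanner *scanner = [NSScanner scannerWithString:data];
    scanner.charactersToBeSkipped = nil; // records may legitimately contain newlines
    NSMutableArray<NSString *> *records = [NSMutableArray array];

    while (!scanner.isAtEnd) {
        NSString *record = nil;
        if ([scanner scanUpToCharactersFromSet:terminator intoString:&record]) {
            [records addObject:record];
        }
        // Step over the terminator(s); without this the next scan reads nothing.
        [scanner scanCharactersFromSet:terminator intoString:NULL];
    }
    return records;
}
```

Each pass either consumes record text or consumes a terminator, so the loop always makes progress and ends when the scanner reaches the end of the string.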
I think the while condition is wrong. According to the String Programming Guide, it should be something like:
while ([lineScanner isAtEnd] == NO) {
    [lineScanner scanUpToCharactersFromSet:[NSCharacterSet characterSetWithCharactersInString:@"*"] intoString:&line];
    // ...
}

NSString tokenize in Objective-C

What is the best way to tokenize/split a NSString in Objective-C?
Found answer here:
NSString *string = @"oop:ack:bork:greeble:ponies";
NSArray *chunks = [string componentsSeparatedByString: @":"];
Everyone has mentioned componentsSeparatedByString: but you can also use CFStringTokenizer (remember that an NSString and CFString are interchangeable) which will tokenize natural languages too (like Chinese/Japanese which don't split words on spaces).
If you just want to split a string, use -[NSString componentsSeparatedByString:]. For more complex tokenization, use the NSScanner class.
If your tokenization needs are more complex, check out my open source Cocoa String tokenizing/parsing toolkit: ParseKit:
http://parsekit.com
For simple splitting of strings using a delimiter char (like ':'), ParseKit would definitely be overkill. But again, for complex tokenization needs, ParseKit is extremely powerful/flexible.
Also see the ParseKit Tokenization documentation.
If you want to tokenize on multiple characters, you can use NSString's componentsSeparatedByCharactersInSet. NSCharacterSet has some handy pre-made sets like the whitespaceCharacterSet and the illegalCharacterSet. And it has initializers for Unicode ranges.
You can also combine character sets and use them to tokenize, like this:
// Tokenize sSourceEntityName on both whitespace and punctuation.
NSMutableCharacterSet *mcharsetWhitePunc = [[NSCharacterSet whitespaceAndNewlineCharacterSet] mutableCopy];
[mcharsetWhitePunc formUnionWithCharacterSet:[NSCharacterSet punctuationCharacterSet]];
NSArray *sarrTokenizedName = [self.sSourceEntityName componentsSeparatedByCharactersInSet:mcharsetWhitePunc];
[mcharsetWhitePunc release];
Be aware that componentsSeparatedByCharactersInSet will produce blank strings if it encounters more than one member of the charSet in a row, so you might want to test for lengths less than 1.
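One way to drop those blank strings is to filter the result with a predicate. This is a sketch assuming ARC (so no release call), with a hypothetical function name:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: splits on whitespace and punctuation, then drops the
// empty strings produced by adjacent separators.
static NSArray<NSString *> *Tokenize(NSString *text) {
    NSMutableCharacterSet *separators = [[NSCharacterSet whitespaceAndNewlineCharacterSet] mutableCopy];
    [separators formUnionWithCharacterSet:[NSCharacterSet punctuationCharacterSet]];
    NSArray<NSString *> *parts = [text componentsSeparatedByCharactersInSet:separators];
    // Keep only components that contain at least one character.
    return [parts filteredArrayUsingPredicate:[NSPredicate predicateWithFormat:@"length > 0"]];
}
```

With the input "one, two;  three" this yields the three words without the empty entries that the raw split would contain.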
If you're looking to tokenise a string into search terms while preserving "quoted phrases", here's an NSString category that respects various types of quote pairs: "" '' ‘’ “”
Usage:
NSArray *terms = [@"This is my \"search phrase\" I want to split" searchTerms];
// results in: ["This", "is", "my", "search phrase", "I", "want", "to", "split"]
Code:
@interface NSString (Search)
- (NSArray *)searchTerms;
@end
@implementation NSString (Search)
- (NSArray *)searchTerms {
    // Strip whitespace and setup scanner
    NSCharacterSet *whitespace = [NSCharacterSet whitespaceAndNewlineCharacterSet];
    NSString *searchString = [self stringByTrimmingCharactersInSet:whitespace];
    NSScanner *scanner = [NSScanner scannerWithString:searchString];
    [scanner setCharactersToBeSkipped:nil]; // we'll handle whitespace ourselves
    // A few types of quote pairs to check
    NSDictionary *quotePairs = @{@"\"": @"\"",
                                 @"'": @"'",
                                 @"\u2018": @"\u2019",
                                 @"\u201C": @"\u201D"};
    // Scan
    NSMutableArray *results = [[NSMutableArray alloc] init];
    NSString *substring = nil;
    while (scanner.scanLocation < searchString.length) {
        // Check for quote at beginning of string
        // (index into searchString, not self: scanLocation refers to the trimmed string)
        unichar unicharacter = [searchString characterAtIndex:scanner.scanLocation];
        NSString *startQuote = [NSString stringWithFormat:@"%C", unicharacter];
        NSString *endQuote = [quotePairs objectForKey:startQuote];
        if (endQuote != nil) { // if it's a valid start quote we'll have an end quote
            // Scan quoted phrase into substring (skipping start & end quotes)
            [scanner scanString:startQuote intoString:nil];
            [scanner scanUpToString:endQuote intoString:&substring];
            [scanner scanString:endQuote intoString:nil];
        } else {
            // Single word that is non-quoted
            [scanner scanUpToCharactersFromSet:whitespace intoString:&substring];
        }
        // Process and add the substring to results
        if (substring) {
            substring = [substring stringByTrimmingCharactersInSet:whitespace];
            if (substring.length) [results addObject:substring];
        }
        // Skip to next word
        [scanner scanCharactersFromSet:whitespace intoString:nil];
    }
    // Return non-mutable array
    return results.copy;
}
@end
If you are looking to split a string by its linguistic features (words, paragraphs, characters, sentences and lines), use string enumeration:
NSString * string = @" \n word1! word2,%$?'/word3.word4 ";
[string enumerateSubstringsInRange:NSMakeRange(0, string.length)
                           options:NSStringEnumerationByWords
                        usingBlock:
 ^(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop) {
     NSLog(@"Substring: '%@'", substring);
 }];
// Logs:
// Substring: 'word1'
// Substring: 'word2'
// Substring: 'word3'
// Substring: 'word4'
This API also works with languages where spaces are not always the delimiter (e.g. Japanese). Also, NSStringEnumerationByComposedCharacterSequences is the proper way to enumerate over characters, since many characters (emoji and many non-Western characters, for instance) span more than one UTF-16 code unit.
I had a case where I had to split the console output after an LDAP query with ldapsearch. First set up and execute the NSTask (I found a good code sample here: Execute a terminal command from a Cocoa app). But then I had to split and parse the output so as to extract only the print-server names out of the Ldap-query-output. Unfortunately it is rather tedious string-manipulation which would be no problem at all if we were to manipulate C-strings/arrays with simple C-array operations. So here is my code using cocoa objects. If you have better suggestions, let me know.
//as the ldap query has to be done when the user selects one of our Active Directory Domains
//(an according comboBox should be populated with print-server names we discover from AD)
//my code is placed in the onSelectDomain event code
//the following variables are declared in the interface .h file as globals
@protected NSArray* aDomains;//domain combo list array
@protected NSMutableArray* aPrinters;//printer combo list array
@protected NSMutableArray* aPrintServers;//print server combo list array
@protected NSString* sLdapQueryCommand;//for LDAP Queries
@protected NSArray* aLdapQueryArgs;
@protected NSTask* tskLdapTask;
@protected NSPipe* pipeLdapTask;
@protected NSFileHandle* fhLdapTask;
@protected NSMutableData* mdLdapTask;
IBOutlet NSComboBox* comboDomain;
IBOutlet NSComboBox* comboPrinter;
IBOutlet NSComboBox* comboPrintServer;
//end of interface globals
//after collecting the print-server names they are displayed in an according drop-down comboBox
//as soon as the user selects one of the print-servers, we should start a new query to find all the
//print-queues on that server and display them in the comboPrinter drop-down list
//to find the shares/print queues of a windows print-server you need samba and the net -S command like this:
// net -S yourPrintServerName.yourBaseDomain.com -U yourLdapUser%yourLdapUserPassWord -W adm rpc share -l
//which displays a long list of the shares
- (IBAction)onSelectDomain:(id)sender
{
static int indexOfLastItem = 0; //unfortunately we need to compare this because we are called also if the selection did not change!
if ([comboDomain indexOfSelectedItem] != indexOfLastItem && ([comboDomain indexOfSelectedItem] != 0))
{
indexOfLastItem = [comboDomain indexOfSelectedItem]; //retain this index for next call
//the print-servers-list has to be loaded on a per university or domain basis from a file dynamically or from AN LDAP-QUERY
//initialize an LDAP-Query-Task or console-command like this one with console output
/*
ldapsearch -LLL -s sub -D "cn=yourLdapUser,ou=yourOuWithLdapUserAccount,dc=yourDomain,dc=com" -h "yourLdapServer.com" -p 3268 -w "yourLdapUserPassWord" -b "dc=yourBaseDomainToSearchIn,dc=com" "(&(objectcategory=computer)(cn=ps*))" "dn"
//our print-server names start with ps* and we want the dn as result, which comes like this:
dn: CN=PSyourPrintServerName,CN=Computers,DC=yourBaseDomainToSearchIn,DC=com
*/
sLdapQueryCommand = [[NSString alloc] initWithString: @"/usr/bin/ldapsearch"];
if ([[comboDomain stringValue] compare: @"firstDomain"] == NSOrderedSame) {
aLdapQueryArgs = [NSArray arrayWithObjects: @"-LLL",@"-s", @"sub",@"-D", @"cn=yourLdapUser,ou=yourOuWithLdapUserAccount,dc=yourDomain,dc=com",@"-h", @"yourLdapServer.com",@"-p",@"3268",@"-w",@"yourLdapUserPassWord",@"-b",@"dc=yourFirstDomainToSearchIn,dc=com",@"(&(objectcategory=computer)(cn=ps*))",@"dn",nil];
}
else {
aLdapQueryArgs = [NSArray arrayWithObjects: @"-LLL",@"-s", @"sub",@"-D", @"cn=yourLdapUser,ou=yourOuWithLdapUserAccount,dc=yourDomain,dc=com",@"-h", @"yourLdapServer.com",@"-p",@"3268",@"-w",@"yourLdapUserPassWord",@"-b",@"dc=yourSecondDomainToSearchIn,dc=com",@"(&(objectcategory=computer)(cn=ps*))",@"dn",nil];
}
//prepare and execute ldap-query task
tskLdapTask = [[NSTask alloc] init];
pipeLdapTask = [[NSPipe alloc] init];//instead of [NSPipe pipe]
[tskLdapTask setStandardOutput: pipeLdapTask];//hope to get the tasks output in this file/pipe
//The magic line that keeps your log where it belongs, has to do with NSLog (see https://stackoverflow.com/questions/412562/execute-a-terminal-command-from-a-cocoa-app and here http://www.cocoadev.com/index.pl?NSTask )
[tskLdapTask setStandardInput:[NSPipe pipe]];
//fhLdapTask = [[NSFileHandle alloc] init];//would be redundant here, next line seems to do the trick also
fhLdapTask = [pipeLdapTask fileHandleForReading];
mdLdapTask = [NSMutableData dataWithCapacity:512];//prepare capturing the pipe buffer which is flushed on read and can overflow, start with 512 Bytes but it is mutable, so grows dynamically later
[tskLdapTask setLaunchPath: sLdapQueryCommand];
[tskLdapTask setArguments: aLdapQueryArgs];
#ifdef bDoDebug
NSLog (@"sLdapQueryCommand: %@\n", sLdapQueryCommand);
NSLog (@"aLdapQueryArgs: %@\n", aLdapQueryArgs );
NSLog (@"tskLdapTask: %@\n", [tskLdapTask arguments]);
#endif
[tskLdapTask launch];
while ([tskLdapTask isRunning]) {
[mdLdapTask appendData: [fhLdapTask readDataToEndOfFile]];
}
[tskLdapTask waitUntilExit];//might be redundant here.
[mdLdapTask appendData: [fhLdapTask readDataToEndOfFile]];//add another read for safety after process/command stops
NSString* sLdapOutput = [[NSString alloc] initWithData: mdLdapTask encoding: NSUTF8StringEncoding];//convert output to something readable, as NSData and NSMutableData are mere byte buffers
#ifdef bDoDebug
NSLog(@"LdapQueryOutput: %@\n", sLdapOutput);
#endif
//Ok now we have the printservers from Active Directory, lets parse the output and show the list to the user in its combo box
//output is formatted as this, one printserver per line
//dn: CN=PSyourPrintServer,OU=Computers,DC=yourBaseDomainToSearchIn,DC=com
//so we have to search for "dn: CN=" to retrieve each printserver's name
//unfortunately splitting this up will give us a first line containing only "" empty string, which we can replace with the word "choose"
//appearing as first entry in the comboBox
aPrintServers = [[sLdapOutput componentsSeparatedByString:@"dn: CN="] mutableCopy];//split output into single lines and store a mutable copy in aPrintServers (the returned array is not guaranteed to be mutable, so a cast alone is not safe)
#ifdef bDoDebug
NSLog(@"aPrintServers: %@\n", aPrintServers);
#endif
if ([[aPrintServers objectAtIndex: 0 ] compare: @"" options: NSLiteralSearch] == NSOrderedSame){
[aPrintServers replaceObjectAtIndex: 0 withObject: slChoose];//replace with localized string "choose"
#ifdef bDoDebug
NSLog(@"aPrintServers: %@\n", aPrintServers);
#endif
}
//Now comes the tedious part to extract only the print-server-names from the single lines
NSRange r;
NSString* sTemp;
for (int i = 1; i < [aPrintServers count]; i++) {//skip first line with "choose". To get rid of the rest of the line, we must isolate/preserve the print server's name to the delimiting comma and remove all the remaining characters
sTemp = [aPrintServers objectAtIndex: i];
sTemp = [sTemp stringByTrimmingCharactersInSet: [NSCharacterSet whitespaceAndNewlineCharacterSet]];//remove newlines and line feeds
#ifdef bDoDebug
NSLog(@"sTemp: %@\n", sTemp);
#endif
r = [sTemp rangeOfString: @","];//now find first comma to remove the whole rest of the line
if (r.location == NSNotFound) continue;//no comma in this entry, nothing to cut off
//r.length = [sTemp lengthOfBytesUsingEncoding:NSUTF8StringEncoding];
r.length = [sTemp length] - r.location;//calculate number of chars between first comma found and length of string
#ifdef bDoDebug
NSLog(@"range: %lu, %lu\n", (unsigned long)r.location, (unsigned long)r.length);
#endif
sTemp = [sTemp stringByReplacingCharactersInRange:r withString: @"" ];//remove rest of line
#ifdef bDoDebug
NSLog(@"sTemp after replace: %@\n", sTemp);
#endif
[aPrintServers replaceObjectAtIndex: i withObject: sTemp];//put back string into array for display in comboBox
#ifdef bDoDebug
NSLog(@"aPrintServer: %@\n", [aPrintServers objectAtIndex: i]);
#endif
}
[comboPrintServer removeAllItems];//reset combo box
[comboPrintServer addItemsWithObjectValues:aPrintServers];
[comboPrintServer setNumberOfVisibleItems:aPrintServers.count];
[comboPrintServer selectItemAtIndex:0];
#ifdef bDoDebug
NSLog(@"comboPrintServer reloaded with new values.");
#endif
//release the objects we own (those created with alloc/init) for the LdapTask
[sLdapQueryCommand release];
[sLdapOutput release];
[pipeLdapTask release];
[tskLdapTask release];
//aLdapQueryArgs, fhLdapTask, mdLdapTask and sTemp were not created with alloc/new/copy:
//they are autoreleased or owned by another object, so releasing them here would over-release them
}
}
I have myself come across cases where it was not enough to just separate a string into components, with tasks such as 1) categorizing tokens into types, 2) adding new tokens, and 3) separating the string between custom delimiters, like all words between "{" and "}". For any such requirements I found ParseKit a life saver.
I used it to parse .PGN (Portable Game Notation) files successfully; it's very fast and light.