Reading ints from NSData? - objective-c

I think I am getting a little confused here. What I have is a plain text file with the numbers "5 10 2350" in it. As you can see below, I am trying to read the first value using readDataOfLength. I think where I am getting muddled is that I should be reading as chars, but then 10 is 2 chars and 2350 is 4. Can anyone point me in the right direction for reading these?
NSString *dataFile_IN = @"/Users/FGX/Documents/Xcode/syntax_FileIO/inData.txt";
NSFileHandle *inFile;
NSData *readBuffer;
int intBuffer;
int bufferSize = sizeof(int);
inFile = [NSFileHandle fileHandleForReadingAtPath:dataFile_IN];
if(inFile != nil) {
readBuffer = [inFile readDataOfLength:bufferSize];
[readBuffer getBytes: &intBuffer length: bufferSize];
NSLog(@"BUFFER: %d", intBuffer);
[inFile closeFile];
}
EDIT_001
Both excellent answers from Jarret and Ole; here is what I have gone with. One final question: "METHOD 02" picks up the carriage return before a blank line at the bottom of the text file and returns it as an empty substring, which in turn gets converted to "0". Can I set the NSCharacterSet to stop that? Currently I just added a length check on the string.
NSInteger intFromFile;
NSScanner *scanner;
NSArray *subStrings;
NSString *eachString;
NSString *strBuffer;
NSError *fileError = nil;
// METHOD 01 Output: 57 58 59
strBuffer = [NSString stringWithContentsOfFile:dataFile_IN encoding:NSUTF8StringEncoding error:&fileError];
scanner = [NSScanner scannerWithString:strBuffer];
while ([scanner scanInteger:&intFromFile]) NSLog(@"%ld", (long)intFromFile);
// METHOD 02 Output: 57 58 59 0
strBuffer = [NSString stringWithContentsOfFile:dataFile_IN encoding:NSUTF8StringEncoding error:&fileError];
subStrings = [strBuffer componentsSeparatedByCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
for(eachString in subStrings) {
if ([eachString length] != 0) {
NSLog(@"{%@} %d", eachString, [eachString intValue]);
}
}
gary

There are several conveniences in Cocoa that can make your life a bit easier here:
NSString *dataFile_IN = @"/Users/FGX/Documents/Xcode/syntax_FileIO/inData.txt";
// Read all the data at once into a string... a convenience that avoids the
// need to open a file handle and convert NSData
NSString *s = [NSString stringWithContentsOfFile:dataFile_IN
encoding:NSUTF8StringEncoding
error:nil];
// Use a scanner to loop over the file. This assumes there is nothing in
// the file but integers separated by whitespace and newlines
NSInteger anInteger;
NSScanner *scanner = [NSScanner scannerWithString:s];
while (![scanner isAtEnd]) {
if ([scanner scanInteger:&anInteger]) {
NSLog(@"Found an integer: %ld", (long)anInteger);
} else {
// Not an integer: stop instead of looping forever on the same character
break;
}
}
Otherwise, using your original approach, you'd pretty much have to read character-by-character, adding each character to a "buffer" and then evaluating your integer when you encounter a space (or newline, or some other separator).
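To make the character-by-character approach concrete, here is a minimal sketch in plain C (which also compiles inside an Objective-C .m file). parse_ints is a hypothetical helper name, not part of Foundation. It assumes the file's bytes have already been copied into a NUL-terminated C string, that the values are non-negative and fit in a long, and that any non-digit character counts as a separator.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: walks the text one character at a time, accumulates
 * digits into a running value, and stores that value when a separator (or
 * the end of the text) is reached. Returns the number of values stored. */
size_t parse_ints(const char *text, long *out, size_t max_out)
{
    size_t count = 0;
    long value = 0;
    int in_number = 0;            /* 1 while we are inside a run of digits */

    for (const char *p = text; ; p++) {
        if (isdigit((unsigned char)*p)) {
            value = value * 10 + (*p - '0');
            in_number = 1;
        } else {
            /* Separator or end of text: emit the number we were building */
            if (in_number && count < max_out) {
                out[count++] = value;
            }
            value = 0;
            in_number = 0;
            if (*p == '\0') {
                break;
            }
        }
    }
    return count;
}
```

Because a value is only stored after at least one digit has been seen, a trailing newline or blank line does not produce a spurious 0.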

If you read the file's contents into a string as Jarret suggested, and assuming the string only contains numbers and whitespace, you can also call:
NSArray *substrings = [s componentsSeparatedByCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
This will split the string at whitespace and newline characters and return an array of the substrings. You would then have to convert the substrings to integers by looping over the array and calling [substring integerValue].

One way to do it would be to first turn your readBuffer into a string as follows:
NSString * dataString = [[NSString alloc] initWithData:readBuffer encoding:NSUTF8StringEncoding];
Then split the string into values:
NSString *dataString=@"5 10 2350"; // example string to split
NSArray * valueStrings = [dataString componentsSeparatedByString:@" "];
for(NSString *valueString in valueStrings)
{
int value=[valueString intValue];
NSLog(@"%d",value);
}
Output of this is
5
10
2350

Related

Add 1 to a number in an NSString that contains characters Objective-C

I am new to learning Objective-C (my first programming language!) and trying to write a little program that will add 1 to a number contained within a string. E.g. AA1BB becomes AA2BB.
So far I have tried to extract the number and add 1. Then extract the letters and add everything back together in a new string. I have had some success but can't manage to get back to the original arrangement of the initial string.
The code I have so far gives a result of 2BB and disregards the characters before the number which is not what I am after (the result I am trying for with this example would be AA2BB). I can't figure out why!
NSString* aString = @"AA1BB";
NSCharacterSet *numberCharset = [NSCharacterSet characterSetWithCharactersInString:@"0123456789-"]; //Creating a set of Characters object//
NSScanner *theScanner = [NSScanner scannerWithString:aString];
int someNumbers = 0;
while (![theScanner isAtEnd]) {
// Remove Letters
[theScanner scanUpToCharactersFromSet:numberCharset
intoString:NULL];
if ([theScanner scanInt:&someNumbers]) {}
}
NSCharacterSet *letterCharset = [NSCharacterSet characterSetWithCharactersInString:@"ABCDEFGHIJKLMNOPQRSTUVWXYZ"];
NSScanner *letterScanner = [NSScanner scannerWithString:aString];
NSString* someLetters;
while (![letterScanner isAtEnd]) {
// Remove numbers
[letterScanner scanUpToCharactersFromSet:letterCharset
intoString:NULL];
if ([letterScanner scanCharactersFromSet:letterCharset intoString:&someLetters]) {}
}
++someNumbers; //adds +1 to the Number//
NSString *newString = [[NSString alloc]initWithFormat:@"%i%@", someNumbers, someLetters];
NSLog (@"String is now %@", newString);
This is an alternative solution using a regular expression.
It finds the range of the integer (\\d+ is one or more digits), extracts it, increments it and replaces the value at the given range.
NSString* aString = @"AA1BB";
NSRange range = [aString rangeOfString:@"\\d+" options:NSRegularExpressionSearch];
if (range.location != NSNotFound) {
NSInteger numericValue = [aString substringWithRange:range].integerValue;
numericValue++;
aString = [aString stringByReplacingCharactersInRange:range withString:[NSString stringWithFormat:@"%ld", (long)numericValue]];
}
NSLog(@"%@", aString);
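As for why the question's code prints 2BB: each pass of the letter loop overwrites someLetters, so only the last run of letters (BB) survives, and the format string always puts the number first. The range-based idea (find the digit run, increment it, splice it back between the untouched prefix and suffix) can also be sketched in plain C. increment_first_number is a hypothetical helper name; the sketch assumes an ASCII string and a number that fits in a long, and it does not preserve leading zeros in the number.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: writes a copy of `in` to `out` with the first run of
 * decimal digits incremented by one; the text before and after it is kept.
 * Returns 0 on success, -1 if `in` has no digits or `out` is too small. */
int increment_first_number(const char *in, char *out, size_t out_size)
{
    const char *start = in;
    while (*start != '\0' && !isdigit((unsigned char)*start)) {
        start++;                               /* skip the prefix, e.g. "AA" */
    }
    if (*start == '\0') {
        return -1;                             /* no digits found */
    }

    char *end = NULL;
    long value = strtol(start, &end, 10);      /* end points at the suffix */

    /* prefix (start - in bytes) + incremented number + suffix */
    int written = snprintf(out, out_size, "%.*s%ld%s",
                           (int)(start - in), in, value + 1, end);
    return (written >= 0 && (size_t)written < out_size) ? 0 : -1;
}
```

Called with "AA1BB" and a sufficiently large buffer, this leaves "AA2BB" in the buffer.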

Check if NSString only contains one character repeated

I want to know a simple and fast way to determine if all characters in an NSString are the same.
For example:
NSString *string = @"aaaaaaaaa";
=> return YES
NSString *string = @"aaaaaaabb";
=> return NO
I know that I can achieve it by using a loop but my NSString is long so I prefer a shorter and simpler way.
You can use this: replace every occurrence of the first character with an empty string and check the length:
-(BOOL)sameCharsInString:(NSString *)str{
if ([str length] == 0 ) return NO;
return [[str stringByReplacingOccurrencesOfString:[str substringToIndex:1] withString:@""] length] == 0 ? YES : NO;
}
Here are two possibilities that fail as quickly as possible and don't (explicitly) create copies of the original string, which should be advantageous since you said the string was large.
First, use NSScanner to repeatedly try to read the first character in the string. If the loop ends before the scanner has reached the end of the string, there are other characters present.
NSScanner * scanner = [NSScanner scannerWithString:s];
NSString * firstChar = [s substringWithRange:[s rangeOfComposedCharacterSequenceAtIndex:0]];
while( [scanner scanString:firstChar intoString:NULL] ) continue;
BOOL stringContainsOnlyOneCharacter = [scanner isAtEnd];
Regex is also a good tool for this problem, since "a character followed by any number of repetitions of that character" is very simply expressed with a single back reference:
// Match one of any character at the start of the string,
// followed by any number of repetitions of that same character
// until the end of the string.
NSString * patt = @"^(.)\\1*$";
NSRegularExpression * regEx =
[NSRegularExpression regularExpressionWithPattern:patt
options:0
error:NULL];
NSArray * matches = [regEx matchesInString:s
options:0
range:(NSRange){0, [s length]}];
BOOL stringContainsOnlyOneCharacter = ([matches count] == 1);
Both these options correctly deal with multi-byte and composed characters; the regex version also does not require an explicit check for the empty string.
Use this loop:
NSString *firstChar = [str substringWithRange:NSMakeRange(0, 1)];
for (int i = 1; i < [str length]; i++) {
NSString *ch = [str substringWithRange:NSMakeRange(i, 1)];
if(![ch isEqualToString:firstChar])
{
return NO;
}
}
return YES;

Get a substring from an NSString until arriving to any letter in an NSArray - objective C

I am trying to parse a set of words that contain first Greek letters, then English letters. This would be easy if there were a delimiter between the sets. This is what I've built so far:
- (void)loadWordFileToArray:(NSBundle *)bundle {
NSLog(@"loadWordFileToArray");
if (bundle != nil) {
NSString *path = [bundle pathForResource:@"alfa" ofType:@"txt"];
//pull the content from the file into memory
NSData* data = [NSData dataWithContentsOfFile:path];
//convert the bytes from the file into a string
NSString* string = [[NSString alloc] initWithBytes:[data bytes]
length:[data length]
encoding:NSUTF8StringEncoding];
//split the string around newline characters to create an array
NSString* delimiter = @"\n";
incomingWords = [string componentsSeparatedByString:delimiter];
NSLog(@"incomingWords count: %lu", (unsigned long)incomingWords.count);
}
}
-(void)parseWordArray{
NSLog(@"parseWordArray");
NSString *seperator = @" = ";
int i = 0;
for (i=0; i < incomingWords.count; i++) {
NSString *incomingString = [incomingWords objectAtIndex:i];
NSScanner *scanner = [NSScanner localizedScannerWithString: incomingString];
NSString *firstString;
NSString *secondString;
NSInteger scanPosition;
[scanner scanUpToString:seperator intoString:&firstString];
scanPosition = [scanner scanLocation];
secondString = [[scanner string] substringFromIndex:scanPosition+[seperator length]];
// NSLog(@"greek: %@", firstString);
// NSLog(@"english: %@", secondString);
[outgoingWords insertObject:[NSMutableArray arrayWithObjects:@"greek", firstString, @"english",secondString,@"category", @"", nil] atIndex:0];
[englishWords insertObject:[NSMutableArray arrayWithObjects:secondString,nil] atIndex:0];
}
}
But I cannot count on there being delimiters.
I have looked at this question. I want something similar. This would be: grab the characters in the string until an english letter is found. Then take the first group to one new string, and all the characters after to a second new string.
I only have to run this a few times, so optimization is not my highest priority. Any help would be appreciated.
EDIT:
I've changed my code as shown below to make use of NSLinguisticTagger. This works, but is this the best way? Note that the interpretation for english characters is -- for some reason "und"...
The incoming string is: άγαλμα, το statue, only the last 6 characters are in english.
int j = 0;
for (j=0; j<incomingString.length; j++) {
NSString *language = [tagger tagAtIndex:j scheme:NSLinguisticTagSchemeLanguage tokenRange:NULL sentenceRange:NULL];
if ([language isEqual: @"und"]) {
NSLog(@"j is: %i", j);
int k = 0;
for (k=0; k<j; k++) {
NSRange range = NSMakeRange (0, k);
NSString *tempString = [incomingString substringWithRange:range ];
NSLog (@"tempString: %@", tempString);
}
return;
}
NSLog (@"Language: %@", language);
}
Alright so what you could do is use NSLinguisticTagger to find out the language of the word (or letter) and if the language has changed then you know where to split the string. You can use NSLinguisticTagger like this:
NSArray *tagschemes = @[NSLinguisticTagSchemeLanguage];
NSLinguisticTagger *tagger = [[NSLinguisticTagger alloc] initWithTagSchemes:tagschemes options: NSLinguisticTaggerOmitPunctuation | NSLinguisticTaggerOmitWhitespace];
[tagger setString:@"This is my string in English."];
NSString *language = [tagger tagAtIndex:0 scheme:NSLinguisticTagSchemeLanguage tokenRange:NULL sentenceRange:NULL];
//Loop through each index of the string's characters and check the language as above.
//If it has changed then you can assume the language has changed.
Alternatively, you can use NSSpellChecker's requestCheckingOfString to get the dominant language in a range of characters:
NSSpellChecker *spellChecker = [NSSpellChecker sharedSpellChecker];
[spellChecker setAutomaticallyIdentifiesLanguages:YES];
NSString *spellCheckText = @"Guten Herr Mustermann. Dies ist ein deutscher Text. Bitte löschen Sie diesen nicht.";
[spellChecker requestCheckingOfString:spellCheckText
range:(NSRange){0, [spellCheckText length]}
types:NSTextCheckingTypeOrthography
options:nil
inSpellDocumentWithTag:0
completionHandler:^(NSInteger sequenceNumber, NSArray *results, NSOrthography *orthography, NSInteger wordCount) {
NSLog(@"dominant language = %@", orthography.dominantLanguage);
}];
This answer has information on how to detect the language of an NSString.
Allow me to introduce two good friends of mine.
NSCharacterSet and NSRegularExpression.
Along with them, normalization. (In Unicode terms)
First, you should normalize strings before analyzing them against a character set.
You will need to look at the choices, but normalizing to all composed forms is the way I would go.
This means an accented character is one instead of two or more.
It simplifies the number of things to compare.
Next, you can easily build your own NSCharacterSet objects from strings (loaded from files even) to use to test set membership.
Lastly, regular expressions can achieve the same thing with Unicode Property Names as classes or categories of characters. Regular expressions could be more terse but more expressive.
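The set-membership idea can be tried out without any Foundation types. Below is a minimal sketch in plain C that finds where the English part begins, assuming the text is UTF-8 and that "English letter" means an ASCII letter A-Z or a-z. first_ascii_letter is a hypothetical helper name, and the returned offset is in bytes, not in NSString character indices.

```c
#include <stddef.h>

/* Hypothetical helper: returns the byte offset of the first ASCII letter
 * (A-Z or a-z) in a UTF-8 string, or the string's length if there is none.
 * Every byte of a multi-byte UTF-8 sequence is >= 0x80, so a byte in the
 * ASCII letter range can never be part of a Greek character. */
size_t first_ascii_letter(const char *utf8)
{
    size_t i = 0;
    while (utf8[i] != '\0') {
        unsigned char c = (unsigned char)utf8[i];
        if ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')) {
            break;
        }
        i++;
    }
    return i;
}
```

Everything before the returned offset is the Greek part (including any punctuation and spaces) and everything from the offset onward is the English part. From Objective-C, the bytes could be obtained with -UTF8String.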

NSString by removing the initial zeros?

How can I remove leading zeros from an NSString?
e.g. I have:
NSString *myString;
with values such as @"0002060", @"00236" and @"21456".
I want to remove any leading zeros if they occur:
e.g. Convert the previous to @"2060", @"236" and @"21456".
Thanks.
For smaller numbers:
NSString *str = @"000123";
NSString *clean = [NSString stringWithFormat:@"%d", [str intValue]];
For numbers exceeding int32 range:
NSString *str = @"100004378121454";
NSString *clean = [NSString stringWithFormat:@"%lld", [str longLongValue]];
This is actually a case that is perfectly suited for regular expressions:
NSString *str = @"00000123";
NSString *cleaned = [str stringByReplacingOccurrencesOfString:@"^0+"
withString:@""
options:NSRegularExpressionSearch
range:NSMakeRange(0, str.length)];
Only one line of code (in a logical sense, line breaks added for clarity) and there are no limits on the number of characters it handles.
A brief explanation of the regular expression pattern:
The ^ means that the pattern should be anchored to the beginning of the string. We need that to ensure it doesn't match legitimate zeroes inside the sequence of digits.
The 0+ part means that it should match one or more zeroes.
Put together, it matches a sequence of one or more zeroes at the beginning of the string, then replaces that with an empty string - i.e., it deletes the leading zeroes.
The following method also produces the desired output.
NSString *test = @"0005603235644056";
// Skip leading zeros
NSScanner *scanner = [NSScanner scannerWithString:test];
NSCharacterSet *zeros = [NSCharacterSet
characterSetWithCharactersInString:@"0"];
[scanner scanCharactersFromSet:zeros intoString:NULL];
// Get the rest of the string and log it
NSString *result = [test substringFromIndex:[scanner scanLocation]];
NSLog(@"%@ reduced to %@", test, result);
- (NSString *) removeLeadingZeros:(NSString *)Instring
{
NSString *str2 =Instring ;
// Strip one leading zero per pass until none remain. (A counted loop
// bounded by the shrinking length can stop early, e.g. on @"0005".)
while ([str2 hasPrefix:@"0"])
{
str2 =[str2 substringFromIndex:1];
}
return str2;
}
In addition to adali's answer, you can do the following if you're worried about the string being too long (i.e. greater than 9 characters):
NSString *str = @"000200001111111";
NSString *strippedStr = [NSString stringWithFormat:@"%lld", [str longLongValue]];
This will give you the result: 200001111111
Otherwise, [NSString stringWithFormat:@"%d", [str intValue]] will probably return 2147483647 because of overflow.
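For comparison, the prefix-skipping idea needs no numeric conversion at all, so it has no overflow limit. Here is a minimal sketch in plain C. skip_leading_zeros is a hypothetical helper name, and keeping one zero for an all-zero input is an assumption about the desired result (the regular-expression answer would produce an empty string in that case).

```c
#include <stddef.h>

/* Hypothetical helper: returns a pointer into `s` just past any leading '0'
 * characters. If the string consists only of zeros, the last zero is kept.
 * No copy is made, and the digits are never converted to a number. */
const char *skip_leading_zeros(const char *s)
{
    while (s[0] == '0' && s[1] != '\0') {
        s++;
    }
    return s;
}
```

The returned pointer refers to the original buffer, so it stays valid only as long as that buffer does.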

Using Objective C/Cocoa to unescape unicode characters, ie \u1234

Some sites that I am fetching data from are returning UTF-8 strings, with the UTF-8 characters escaped, ie: \u5404\u500b\u90fd
Is there a built-in Cocoa function that might assist with this, or will I have to write my own decoding algorithm?
It's correct that Cocoa does not offer a solution, yet Core Foundation does: CFStringTransform.
CFStringTransform lives in a dusty, remote corner of Mac OS (and iOS) and so it's a little-known gem. It is the front end to Apple's ICU compatible string transformation engine. It can perform real magic like transliterations between Greek and Latin (or about any known scripts), but it can also be used to do mundane tasks like unescaping strings from a crappy server:
NSString *input = @"\\u5404\\u500b\\u90fd";
NSMutableString *convertedString = [input mutableCopy];
CFStringRef transform = CFSTR("Any-Hex/Java");
CFStringTransform((__bridge CFMutableStringRef)convertedString, NULL, transform, YES);
NSLog(@"convertedString: %@", convertedString);
// prints: 各個都, tada!
As I said, CFStringTransform is really powerful. It supports a number of predefined transforms, like case mappings, normalizations or unicode character name conversion. You can even design your own transformations.
I have no idea why Apple does not make it available from Cocoa.
Edit 2015:
OS X 10.11 and iOS 9 add the following method to Foundation:
- (nullable NSString *)stringByApplyingTransform:(NSString *)transform reverse:(BOOL)reverse;
So the example from above becomes...
NSString *input = @"\\u5404\\u500b\\u90fd";
NSString *convertedString = [input stringByApplyingTransform:@"Any-Hex/Java"
reverse:YES];
NSLog(@"convertedString: %@", convertedString);
Thanks @nschmidt for the heads up.
There is no built-in function to do C unescaping.
You can cheat a little with NSPropertyListSerialization since an "old text style" plist supports C escaping via \Uxxxx:
NSString* input = @"ab\"cA\"BC\\u2345\\u0123";
// will cause trouble if you have "abc\\\\uvw"
NSString* esc1 = [input stringByReplacingOccurrencesOfString:@"\\u" withString:@"\\U"];
NSString* esc2 = [esc1 stringByReplacingOccurrencesOfString:@"\"" withString:@"\\\""];
NSString* quoted = [[@"\"" stringByAppendingString:esc2] stringByAppendingString:@"\""];
NSData* data = [quoted dataUsingEncoding:NSUTF8StringEncoding];
NSString* unesc = [NSPropertyListSerialization propertyListFromData:data
mutabilityOption:NSPropertyListImmutable format:NULL
errorDescription:NULL];
assert([unesc isKindOfClass:[NSString class]]);
NSLog(@"Output = %@", unesc);
But mind that this isn't very efficient. It's far better to write your own parser. (BTW, are you decoding JSON strings? If yes, you could use the existing JSON parsers.)
Here's what I ended up writing. Hopefully this will help some people along.
+ (NSString*) unescapeUnicodeString:(NSString*)string
{
// unescape quotes and backwards slash
NSString* unescapedString = [string stringByReplacingOccurrencesOfString:@"\\\"" withString:@"\""];
unescapedString = [unescapedString stringByReplacingOccurrencesOfString:@"\\\\" withString:@"\\"];
// tokenize based on unicode escape char
NSMutableString* tokenizedString = [NSMutableString string];
NSScanner* scanner = [NSScanner scannerWithString:unescapedString];
while ([scanner isAtEnd] == NO)
{
// read up to the first unicode marker
// if a string has been scanned, it's a token
// and should be appended to the tokenized string
NSString* token = @"";
[scanner scanUpToString:@"\\u" intoString:&token];
if (token != nil && token.length > 0)
{
[tokenizedString appendString:token];
continue;
}
// skip two characters to get past the marker
// check if the range of unicode characters is
// beyond the end of the string (could be malformed)
// and if it is, move the scanner to the end
// and skip this token
NSUInteger location = [scanner scanLocation];
NSInteger extra = scanner.string.length - location - 4 - 2;
if (extra < 0)
{
// fewer than 6 characters remain: copy the malformed tail verbatim
NSRange range = {location, scanner.string.length - location};
[tokenizedString appendString:[scanner.string substringWithRange:range]];
[scanner setScanLocation:scanner.string.length];
continue;
}
// move the location past the unicode marker
// then read in the next 4 characters
location += 2;
NSRange range = {location, 4};
token = [scanner.string substringWithRange:range];
unichar codeValue = (unichar) strtol([token UTF8String], NULL, 16);
[tokenizedString appendString:[NSString stringWithFormat:@"%C", codeValue]];
// move the scanner past the 4 characters
// then keep scanning
location += 4;
[scanner setScanLocation:location];
}
// done
return tokenizedString;
}
+ (NSString*) escapeUnicodeString:(NSString*)string
{
// escape backslashes and quotes
// note that the backslash has to be escaped before the quote
// otherwise it will end up with an extra backslash
NSString* escapedString = [string stringByReplacingOccurrencesOfString:@"\\" withString:@"\\\\"];
escapedString = [escapedString stringByReplacingOccurrencesOfString:@"\"" withString:@"\\\""];
// convert to encoded unicode
// do this by getting the data for the string
// in UTF-16 little endian (the host byte order on Intel and ARM, which the
// uint16_t reads below rely on; network byte order would be big endian)
NSData* data = [escapedString dataUsingEncoding:NSUTF16LittleEndianStringEncoding allowLossyConversion:YES];
size_t bytesRead = 0;
const char* bytes = data.bytes;
NSMutableString* encodedString = [NSMutableString string];
// loop through the byte array
// read two bytes at a time, if the bytes
// are above a certain value they are unicode
// otherwise the bytes are ASCII characters
// the %C format will write the character value of bytes
while (bytesRead < data.length)
{
uint16_t code = *((uint16_t*) &bytes[bytesRead]);
if (code > 0x007E)
{
[encodedString appendFormat:@"\\u%04X", code];
}
else
{
[encodedString appendFormat:@"%C", code];
}
bytesRead += sizeof(uint16_t);
}
// done
return encodedString;
}
Simple code:
const char *cString = [unicodeStr cStringUsingEncoding:NSUTF8StringEncoding];
NSString *resultStr = [NSString stringWithCString:cString encoding:NSNonLossyASCIIStringEncoding];
from: https://stackoverflow.com/a/7861345