How to detect text file encoding in Objective-C?

I want to detect the encoding of a text file in Objective-C. How can I do that?

You can use stringWithContentsOfFile:usedEncoding:error:, which returns, in addition to the new string, the encoding that was used.
I should note that this is a heuristic process by nature -- it's not always possible to determine the character encoding of a file.

Some text documents in my project display as gibberish, so I need to detect each file's encoding in order to convert it into something a human can read.
I found this: http://lists.w3.org/Archives/Public/www-validator/2002Aug/0084.html
and rewrote the code in Objective-C. It works for me:
NSString *documentPath = [NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES) lastObject];
NSString *sourceFilePath = [documentPath stringByAppendingPathComponent:@"fileName.txt"];
NSFileHandle *sourceFileHandle = [NSFileHandle fileHandleForReadingAtPath:sourceFilePath];
// Read up to 4 bytes: the longest BOM checked below (UTF-32) is 4 bytes long.
NSData *beginData = [sourceFileHandle readDataOfLength:4];
const Byte *bytes = (const Byte *)[beginData bytes];
NSUInteger length = beginData.length;
// Check the length before touching each byte, and test the 4-byte UTF-32 BOM
// before the 2-byte UTF-16 BOM, because FF FE is a prefix of FF FE 00 00.
if (length >= 4 && bytes[0] == 0xff && bytes[1] == 0xfe && bytes[2] == 0 && bytes[3] == 0)
    NSLog(@"UTF32");
else if (length >= 2 && bytes[0] == 0xff && bytes[1] == 0xfe)
    NSLog(@"unicode");
else if (length >= 2 && bytes[0] == 0xfe && bytes[1] == 0xff)
    NSLog(@"BigEndianUnicode");
else if (length >= 3 && bytes[0] == 0xef && bytes[1] == 0xbb && bytes[2] == 0xbf)
    NSLog(@"UTF8");
else if (length >= 3 && bytes[0] == 0x2b && bytes[1] == 0x2f && bytes[2] == 0x76)
    NSLog(@"UTF7");
else
    // No BOM found. "ascii" is only a guess: BOM-less UTF-8 and other
    // encodings cannot be told apart by this check.
    NSLog(@"ascii");
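The same byte-order-mark check can be sketched in plain C, which compiles unchanged inside an Objective-C .m file. This is only a minimal sketch: the bom_kind enum and the detect_bom function are hypothetical names made up for this example, not Foundation API, and the function only recognizes a BOM, so a file without one still has to be guessed at.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for this sketch; they are not Foundation constants. */
typedef enum {
    BOM_NONE,
    BOM_UTF8,
    BOM_UTF16_LE,
    BOM_UTF16_BE,
    BOM_UTF32_LE,
    BOM_UTF32_BE
} bom_kind;

/* Inspects up to the first four bytes of buf and reports which
   byte order mark, if any, they form. */
bom_kind detect_bom(const unsigned char *buf, size_t len) {
    assert(buf != NULL || len == 0);

    /* UTF-32 first: FF FE 00 00 begins with the UTF-16LE BOM FF FE. */
    if (len >= 4 && buf[0] == 0xFF && buf[1] == 0xFE &&
        buf[2] == 0x00 && buf[3] == 0x00)
        return BOM_UTF32_LE;
    if (len >= 4 && buf[0] == 0x00 && buf[1] == 0x00 &&
        buf[2] == 0xFE && buf[3] == 0xFF)
        return BOM_UTF32_BE;
    if (len >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF)
        return BOM_UTF8;
    if (len >= 2 && buf[0] == 0xFF && buf[1] == 0xFE)
        return BOM_UTF16_LE;
    if (len >= 2 && buf[0] == 0xFE && buf[1] == 0xFF)
        return BOM_UTF16_BE;
    return BOM_NONE;
}
```

Every test checks the length before it reads a byte, so a one-byte or empty file is reported as BOM_NONE instead of reading past the end of the buffer. Note that a UTF-16LE file whose first character after the BOM is U+0000 would be misreported as UTF-32LE; that ambiguity is inherent to BOM sniffing.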

Related

How to handle encode data to ASCII

I have to decode a file encoded in ASCII. However, when I decode the file, I get a "?" instead of "é", which shows that the decoding did not happen properly.
This is the code that I am using. Which condition should I add to decode ASCII properly?
- (void)_sniffEncoding {
    NSStringEncoding encoding = NSUTF8StringEncoding;
    uint8_t bytes[CHUNK_SIZE];
    NSInteger readLength = [_stream read:bytes maxLength:CHUNK_SIZE];
    if (readLength > 0 && readLength <= CHUNK_SIZE) {
        [_stringBuffer appendBytes:bytes length:readLength];
        [self setTotalBytesRead:[self totalBytesRead] + readLength];
        NSInteger bomLength = 0;
        if (readLength > 3 && bytes[0] == 0x00 && bytes[1] == 0x00 && bytes[2] == 0xFE && bytes[3] == 0xFF) {
            encoding = NSUTF32BigEndianStringEncoding;
            bomLength = 4;
        } else if (readLength > 3 && bytes[0] == 0xFF && bytes[1] == 0xFE && bytes[2] == 0x00 && bytes[3] == 0x00) {
            encoding = NSUTF32LittleEndianStringEncoding;
            bomLength = 4;
        } else if (readLength > 3 && bytes[0] == 0x1B && bytes[1] == 0x24 && bytes[2] == 0x29 && bytes[3] == 0x43) {
            encoding = CFStringConvertEncodingToNSStringEncoding(kCFStringEncodingISO_2022_KR);
            bomLength = 4;
        } else if (readLength > 1 && bytes[0] == 0xFE && bytes[1] == 0xFF) {
            encoding = NSUTF16BigEndianStringEncoding;
            bomLength = 2;
        } else if (readLength > 1 && bytes[0] == 0xFF && bytes[1] == 0xFE) {
            encoding = NSUTF16LittleEndianStringEncoding;
            bomLength = 2;
        } else if (readLength > 2 && bytes[0] == 0xEF && bytes[1] == 0xBB && bytes[2] == 0xBF) {
            encoding = NSUTF8StringEncoding;
            bomLength = 3;
        } else {
            NSString *bufferAsUTF8 = nil;
            for (NSInteger triedLength = 0; triedLength < 4; ++triedLength) {
                bufferAsUTF8 = [[NSString alloc] initWithBytes:bytes length:readLength-triedLength encoding:NSUTF8StringEncoding];
                if (bufferAsUTF8 != nil) {
                    break;
                }
            }
            if (bufferAsUTF8 != nil) {
                encoding = NSUTF8StringEncoding;
            } else {
                NSLog(@"unable to determine stream encoding; assuming MacOSRoman");
                encoding = NSMacOSRomanStringEncoding;
            }
        }
        if (bomLength > 0) {
            [_stringBuffer replaceBytesInRange:NSMakeRange(0, bomLength) withBytes:NULL length:0];
        }
    }
    _streamEncoding = encoding;
}
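One check that may help narrow this down: ASCII is a 7-bit encoding, so a file that contains "é" cannot be pure ASCII. In ISO-8859-1 that character is the single byte 0xE9, and in UTF-8 it is the two bytes 0xC3 0xA9. Below is a minimal sketch in plain C (usable as-is from Objective-C) that reports whether a buffer is pure ASCII. The name is_pure_ascii is hypothetical, and the function says nothing about which 8-bit or multi-byte encoding the file really uses when it returns 0.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if every byte in buf is a 7-bit ASCII value (0x00..0x7F),
   otherwise 0. An empty buffer counts as ASCII. */
int is_pure_ascii(const unsigned char *buf, size_t len) {
    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; ++i) {
        if (buf[i] > 0x7F) {
            return 0;  /* high bit set: not ASCII */
        }
    }
    return 1;
}
```

If this returns 0 for the sniffed chunk, the file needs one of the non-ASCII branches above, and the "?" is a sign that the wrong one was picked.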

convert string to URL objective-C

I am creating a screen where the user can input any text. That text then has to be converted to a valid URL. I have looked at a lot of Stack Overflow questions, and they come up with the following solutions:
1. CFURLCreateStringByAddingPercentEscapes
NSString *encodedString = (NSString *)CFURLCreateStringByAddingPercentEscapes(
    NULL,
    (CFStringRef)unencodedString,
    NULL,
    (CFStringRef)@"!*'();:@&=+$,/?%#[]",
    kCFStringEncodingUTF8);
2. Category on the string
- (NSString *)urlencode {
    NSMutableString *output = [NSMutableString string];
    const unsigned char *source = (const unsigned char *)[self UTF8String];
    int sourceLen = strlen((const char *)source);
    for (int i = 0; i < sourceLen; ++i) {
        const unsigned char thisChar = source[i];
        if (thisChar == ' '){
            [output appendString:@"+"];
        } else if (thisChar == '.' || thisChar == '-' || thisChar == '_' || thisChar == '~' ||
                   (thisChar >= 'a' && thisChar <= 'z') ||
                   (thisChar >= 'A' && thisChar <= 'Z') ||
                   (thisChar >= '0' && thisChar <= '9')) {
            [output appendFormat:@"%c", thisChar];
        } else {
            [output appendFormat:@"%%%02X", thisChar];
        }
    }
    return output;
}
Example: if the user manually types the required %20 (for example) in the text instead of a space, and we then use either of the solutions above, the % in that %20 is converted to %25.
Could anybody tell me how I can fix this issue?
Because % is URL-encoded as %25, the input %20 is actually encoded as %2520, which is the correct encoding of the literal text "%20".
If you want to keep the %20 that the user typed, replace every %20 with a single space first, then URL-encode the resulting string.
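To make the %20 to %2520 behaviour and the suggested workaround concrete, here is a minimal sketch in plain C that follows the same rules as the category above (unreserved characters copied, space written as '+', everything else as %XX). The function names url_encode and collapse_percent20 are hypothetical, and this is not how CFURLCreateStringByAddingPercentEscapes is implemented.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Percent-encodes src into dst (capacity cap, including the final NUL).
   Returns 0 on success, -1 if dst is too small. */
int url_encode(const char *src, char *dst, size_t cap) {
    assert(src != NULL && dst != NULL);
    size_t out = 0;
    for (const unsigned char *p = (const unsigned char *)src; *p != '\0'; ++p) {
        unsigned char c = *p;
        int unreserved = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
                         (c >= '0' && c <= '9') ||
                         c == '.' || c == '-' || c == '_' || c == '~';
        size_t need = (unreserved || c == ' ') ? 1 : 3;
        if (out + need + 1 > cap) {
            return -1;  /* keep room for the terminating NUL */
        }
        if (unreserved) {
            dst[out++] = (char)c;
        } else if (c == ' ') {
            dst[out++] = '+';
        } else {
            snprintf(dst + out, 4, "%%%02X", (unsigned)c);
            out += 3;
        }
    }
    if (cap == 0) {
        return -1;
    }
    dst[out] = '\0';
    return 0;
}

/* Rewrites every literal "%20" in s to a single space, in place. */
void collapse_percent20(char *s) {
    assert(s != NULL);
    char *w = s;
    for (const char *r = s; *r != '\0'; ) {
        if (r[0] == '%' && r[1] == '2' && r[2] == '0') {
            *w++ = ' ';
            r += 3;
        } else {
            *w++ = *r++;
        }
    }
    *w = '\0';
}
```

Encoding "a%20b" directly gives "a%2520b". Calling collapse_percent20 first turns it into "a b", which then encodes as "a+b" under these rules. The same pre-processing idea applies to the two Objective-C solutions above.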

Definition error while using instance of class that is in an if loop

I have two instances of NSString. One of them is declared within a while loop, and the other is declared in a second while loop after it. Xcode seems to think that because the first instance (call it string1) is in a while loop, it will not be defined. However, for the program to proceed out of the while loop, IT WILL ALWAYS DEFINE STRING1. The second NSString is in another while loop that works the same way.
Outside both while loops, at the end of the code, I call an NSString method (isEqualToString:) on both of them, but Xcode tells me that string1 and string2 are not defined. The program should work, but the compiler stops me. Is there anything I can change to make string1 and string2 appear defined in Xcode's eyes?
I'm using this for a registration page, and I need these in while loops because they have to repeat until the user enters, through the console, a username that fits my requirements.
EDIT: Added in actual code.
int numb1, numb2;
char usercheck1[60];
char usercheck2[60];
//Registration
numb2 = 1;
while (numb2 == 1){
    numb1 = 1;
    while (numb1 == 1){
        numb1 = 0;
        NSLog(@"Username:");
        fgets(usercheck1, sizeof usercheck1, stdin);
        int c2;
        while((c2 = getchar()) != '\n' && c2 != EOF);
        if (usercheck1 [sizeof (usercheck1)-1] == '\n'){ // In case that the input string has 12 characters plus '\n'
            usercheck1 [sizeof (usercheck1)-1] = '\0';} // Plus '\0', the '\n' isn't added and the if condition is false.
        NSString* string1 = [NSString stringWithUTF8String: usercheck1];
        //Makes sure string contains no spaces and string length is correct size.
        if ([string1 length] > 12){
            NSLog (@"Username must be 12 characters or less!");
            numb1 = 1;}
        if ([string1 length] < 5){
            NSLog (@"Username must be 4 characters or more!");
            numb1 = 1;}
        if ([string1 rangeOfString:@" " ].location != NSNotFound){
            NSLog(@"Username cannot contain spaces!");
            numb1 = 1;}
    }
    numb1 = 1;
    while (numb1 == 1){
        numb1 = 0;
        NSLog(@"Confirm Username:");
        fgets(usercheck2, sizeof usercheck2, stdin);
        int c2;
        while((c2 = getchar()) != '\n' && c2 != EOF);
        if (usercheck2 [sizeof (usercheck2)-1] == '\n'){ // In case that the input string has 12 characters plus '\n'
            usercheck2 [sizeof (usercheck2)-1] = '\0';} // Plus '\0', the '\n' isn't added and the if condition is false.
        NSString* string2 = [NSString stringWithUTF8String: usercheck2];
        //Makes sure string contains no spaces and string length is correct size.
        if ([string2 length] > 12){
            NSLog (@"Username must be 12 characters or less!");
            numb1 = 1;}
        if ([string2 length] < 5){
            NSLog (@"Username must be 4 characters or more!");
            numb1 = 1;}
        if ([string2 rangeOfString:@" " ].location != NSNotFound){
            NSLog(@"Username cannot contain spaces!");
            numb1 = 1;}
    }
    if ([string2 isEqualToString: string1] == YES){
        NSLog(@"Usernames confirmed! Username:%s", string2);
        numb2 = 0;}
    else {NSLog(@"Usernames do not match. Try again");
        numb2 = 1;}
}
}
As you can see, it would work if it actually compiled and ran, but the compiler doesn't like me using string2 in the if statement with isEqualToString:. It gives me the error:
"Use of undeclared identifier 'string2'"
Also, if I move that statement and the else statement outside the two inner while statements, it gives me that error for BOTH string1 and string2.
The Xcode version is 4.6.3, and I'm programming for Mac OS X 10.8.4.
You can't access variables outside of the scope in which they are declared. Since string1 and string2 are declared within the two while blocks, you can't use them outside of the while blocks.
There are many things that could be improved in this code. Try something like this:
NSString *username1 = nil;
NSString *username2 = nil;
while (1) {
    while (1) {
        NSLog(@"Username:");
        char usercheck[60] = "";
        fgets(usercheck, sizeof usercheck, stdin);
        char *newline = strchr(usercheck, '\n'); // strchr is declared in <string.h>
        if (newline != NULL) {
            *newline = '\0'; // fgets keeps the trailing '\n'; strip it
        } else {
            // The line did not fit in the buffer; discard the rest of it.
            int c2;
            while ((c2 = getchar()) != '\n' && c2 != EOF);
        }
        NSString *string1 = [NSString stringWithUTF8String:usercheck];
        // Makes sure string contains no spaces and string length is correct size.
        if ([string1 length] > 12) {
            NSLog(@"Username must be 12 characters or less!");
        } else if ([string1 length] < 5) {
            NSLog(@"Username must be 4 characters or more!");
        } else if ([string1 rangeOfString:@" "].location != NSNotFound) {
            NSLog(@"Username cannot contain spaces!");
        } else {
            username1 = string1;
            break; // username is good
        }
    }
    while (1) {
        NSLog(@"Confirm Username:");
        char usercheck[60] = "";
        fgets(usercheck, sizeof usercheck, stdin);
        char *newline = strchr(usercheck, '\n');
        if (newline != NULL) {
            *newline = '\0'; // fgets keeps the trailing '\n'; strip it
        } else {
            // The line did not fit in the buffer; discard the rest of it.
            int c2;
            while ((c2 = getchar()) != '\n' && c2 != EOF);
        }
        NSString *string2 = [NSString stringWithUTF8String:usercheck];
        // Makes sure string contains no spaces and string length is correct size.
        if ([string2 length] > 12) {
            NSLog(@"Username must be 12 characters or less!");
        } else if ([string2 length] < 5) {
            NSLog(@"Username must be 4 characters or more!");
        } else if ([string2 rangeOfString:@" "].location != NSNotFound) {
            NSLog(@"Username cannot contain spaces!");
        } else {
            username2 = string2;
            break;
        }
    }
    if ([username1 isEqualToString:username2]) {
        NSLog(@"Usernames confirmed! Username:%@", username1);
        break;
    } else {
        NSLog(@"Usernames do not match. Try again");
    }
}
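The scoping rule behind that answer is the same in C and Objective-C: a variable declared inside a loop body ends with that body, so anything needed after the loop has to be declared before it. Here is a small plain C sketch of the pattern. It validates a list of candidate names instead of reading the console so that it can be run on its own; first_valid_username and username_problem are hypothetical helper names, and the length rules simply mirror the checks in the thread (5 to 12 characters, no spaces).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns NULL if name is acceptable, otherwise a short reason. */
static const char *username_problem(const char *name) {
    size_t len = strlen(name);
    if (len > 12) return "too long";
    if (len < 5) return "too short";
    if (strchr(name, ' ') != NULL) return "contains a space";
    return NULL;
}

/* Returns the first acceptable candidate, or NULL if there is none. */
const char *first_valid_username(const char *const *candidates, size_t count) {
    assert(candidates != NULL || count == 0);

    /* Declared before the loop, so it is still in scope after it. */
    const char *accepted = NULL;

    for (size_t i = 0; i < count; ++i) {
        /* 'candidate' exists only inside this loop body. */
        const char *candidate = candidates[i];
        if (username_problem(candidate) == NULL) {
            accepted = candidate;
            break;
        }
    }

    /* Using 'candidate' here would be an undeclared-identifier error;
       'accepted' is fine because of where it was declared. */
    return accepted;
}
```

This is the same move as declaring username1 and username2 ahead of the while loops and assigning to them from inside.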

How to check whether a char is digit or not in Objective-C?

I need to check if a char is digit or not.
NSString *strTest=@"Test55";
char c =[strTest characterAtIndex:4];
I need to find out if 'c' is a digit or not. How can I implement this check in Objective-C?
Note: The return value for characterAtIndex: is not a char, but a unichar. So casting like this can be dangerous...
An alternative code would be:
NSString *strTest = @"Test55";
unichar c = [strTest characterAtIndex:4];
NSCharacterSet *numericSet = [NSCharacterSet decimalDigitCharacterSet];
if ([numericSet characterIsMember:c]) {
    NSLog(@"Congrats, it is a number...");
}
In standard C there is a function int isdigit(int ch); declared in <ctype.h>. It returns nonzero (true) if ch is a decimal digit.
You can also check it manually:
if (c >= '0' && c <= '9')
{
    // c is a digit
}
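One caveat when combining isdigit with characterAtIndex: is that isdigit is only defined for EOF and for values that fit in an unsigned char, while a unichar is 16 bits wide. A minimal plain C sketch of a guarded check follows; is_ascii_digit is a hypothetical name, and it deliberately accepts only '0' through '9', whereas decimalDigitCharacterSet also matches decimal digits from other scripts.

```c
#include <ctype.h>

/* Returns 1 if code_unit is one of the ASCII digits '0'..'9', else 0.
   The parameter is wide enough to hold a 16-bit unichar value. */
int is_ascii_digit(unsigned int code_unit) {
    /* Reject anything outside the ASCII range before calling isdigit,
       whose behaviour is undefined for values it cannot represent. */
    if (code_unit > 0x7F) {
        return 0;
    }
    return isdigit((int)code_unit) != 0;
}
```

Which behaviour you want depends on the input: for user-facing text the character set approach may be the better fit, while for parsing ASCII protocols or file formats the narrow check is usually what is meant.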
There is a C function called isdigit.
This is actually quite simple:
isdigit([YOUR_STRING characterAtIndex:YOUR_POS])
You may want to check NSCharacterSet class reference.
You can think of writing a generic function like the following for this:
BOOL isNumericI(NSString *s)
{
    NSUInteger len = [s length];
    NSUInteger i;
    BOOL status = NO;
    for (i = 0; i < len; i++)
    {
        unichar singlechar = [s characterAtIndex:i];
        if ((singlechar == ' ') && (!status))
        {
            continue;
        }
        if ((singlechar == '+' || singlechar == '-') && (!status))
        {
            status = YES;
            continue;
        }
        if ((singlechar >= '0') && (singlechar <= '9'))
        {
            status = YES;
        } else {
            return NO;
        }
    }
    return (i == len) && status;
}

Why does NSString compare: return NSOrderedSame when the strings are different?

Why does compare return NSOrderedSame?:
NSString *testString = [anObject aString];
if ([testString compare:@"a string which doesn't equal testString"] == NSOrderedSame) {
    //do stuff
}
NB: I added this question so I won't make this mistake again (hence the immediate answer I gave).
This is because testString can be nil. Sending a message to nil returns 0 (nil, for methods that return an object), and NSOrderedSame is defined as 0, so the result compares equal to NSOrderedSame.
NSLog(@"nil == NSOrderedSame = %d", (nil == NSOrderedSame)); //nil == NSOrderedSame = 1
NSLog(@"[nil compare:@\"arf\"] == nil = %d", ([nil compare:@"arf"] == 0)); //[nil compare:@\"arf\"] == nil = 1
To avoid this, ensure that the object is not nil before comparing, e.g.:
if (testString != nil && [testString compare:@"testString"] == NSOrderedSame) ...
NB: I added this question so I wouldn't make this mistake again.
Probably [anObject aString] returns nil, sending nil a message returns 0, and 0 == NSOrderedSame.
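The effect can be imitated in plain C to show why the zero result is the problem. This is only an analogy under stated assumptions: compare_like_nil_messaging is a hypothetical function that returns the zero value when the receiver is missing, the way a message to nil does, and the enum reuses the numeric values of NSOrderedAscending (-1), NSOrderedSame (0) and NSOrderedDescending (1).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same numeric values as the NSComparisonResult constants. */
typedef enum {
    ORDERED_ASCENDING = -1,
    ORDERED_SAME = 0,
    ORDERED_DESCENDING = 1
} ordering;

/* Imitates messaging nil: a missing receiver yields the zero value,
   which cannot be told apart from ORDERED_SAME. */
ordering compare_like_nil_messaging(const char *receiver, const char *other) {
    assert(other != NULL);
    if (receiver == NULL) {
        return (ordering)0;
    }
    int r = strcmp(receiver, other);
    return r < 0 ? ORDERED_ASCENDING
                 : (r > 0 ? ORDERED_DESCENDING : ORDERED_SAME);
}

/* Guarded check: a missing string never counts as equal. */
int strings_equal(const char *receiver, const char *other) {
    return receiver != NULL && other != NULL && strcmp(receiver, other) == 0;
}
```

With a NULL receiver the first function reports ORDERED_SAME even though nothing was compared, while the guarded version reports "not equal", which is what the nil check in the answer achieves.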