how to change the input of this program? (C Language) - objective-c

How can I replace the use of FILE and fopen() with scanf() to get the input values and pass them to these two functions?
I want to use this function in Objective-C code.
For more info you can see the whole code here link
static void stemfile(FILE * f)
{ while(TRUE)
{ int ch = getc(f);
if (ch == EOF) return;
if (LETTER(ch))
{ int i = 0;
while(TRUE)
{ if (i == i_max) increase_s();
ch = tolower(ch); /* forces lower case */
s[i] = ch; i++;
ch = getc(f);
if (!LETTER(ch)) { ungetc(ch,f); break; }
}
s[stem(s,0,i-1)+1] = 0;
/* the previous line calls the stemmer and uses its result to
zero-terminate the string in s */
printf("%s",s);
}
else putchar(ch);
}
}
int main(int argc, char * argv[])
{ int i;
s = (char *) malloc(i_max+1);
for (i = 1; i < argc; i++)
{ FILE * f = fopen(argv[i],"r");
if (f == 0) { fprintf(stderr,"File %s not found\n",argv[i]); exit(1); }
stemfile(f);
}
free(s);
return 0;
}

The scanf() function cannot be a direct replacement for the existing code. The existing code (which is not very well written, IMO) splits the input character stream into letters (defined by the LETTER() macro to be either uppercase or lowercase characters) and non-letters, and converts the letter sequences to lowercase before applying the stem() function to them.
The scanf() function, on the other hand, extracts primitive types (int, char, double, etc.) and explicitly delimited strings from the input stream. The delimiters in the given code (i.e. anything that is not LETTER()) are too vague for scanf() (though not for a regular expression). scanf() needs a specific character on each end of a substring to look for. Also, scanf() cannot convert to lowercase automatically.
Assuming your input continues to be files, I think the easiest solution might be to leave the code as-is and use it, convoluted as it may be. There is nothing about it that shouldn't run as part of a larger Objective-C program. Objective-C, after all, still provides access to the C standard library, at least within the limits that the operating system sets (iOS is far more limiting than MacOS, if you are on an Apple platform).
The general problem here is that of tokenization: breaking an input sequence of unclassified symbols (like characters) into a sequence of classified tokens (like words and spaces). A common approach to the problem is to use a finite state machine/automaton (FSA/FSM) to apply parsing logic to the input sequence and extract the tokens as they are encountered. An FSA can be a bit hard to set up, but it is very robust and general.
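As a rough illustration of the state-machine idea, the sketch below walks a string with two states (between words, inside a word) and collects the lower-cased words it finds. The names split_words, MAX_WORD and the fixed-size output array are assumptions made for this example; they are not part of the original stemmer code.

```c
#include <ctype.h>
#include <stddef.h>

#define MAX_WORD 32   /* hypothetical limit, chosen only for this example */

enum state { IN_GAP, IN_WORD };

/* Split src into lower-cased words made of letters only; returns the word count.
   Words longer than MAX_WORD - 1 letters are truncated. */
static size_t split_words(const char *src, char words[][MAX_WORD], size_t max_words)
{
    enum state st = IN_GAP;
    size_t count = 0;   /* words completed so far */
    size_t len = 0;     /* letters stored in the current word */

    for (;; src++)
    {
        unsigned char ch = (unsigned char)*src;
        int letter = (ch != '\0') && isalpha(ch);

        if (st == IN_GAP && letter)
        {
            if (count == max_words)
                break;                      /* no room for another word */
            st = IN_WORD;
            len = 0;
        }
        if (st == IN_WORD)
        {
            if (letter)
            {
                if (len < MAX_WORD - 1)
                    words[count][len++] = (char)tolower(ch);
            }
            else
            {
                words[count][len] = '\0';   /* word ended: emit it */
                count++;
                st = IN_GAP;
            }
        }
        if (ch == '\0')
            break;
    }
    return count;
}
```

Called on "What.hast thou" with room for eight words, this yields the three words what, hast and thou; each could then be handed to stem().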

I'm still not sure why you would want to use scanf() in main(). It would presumably mean changing the interface of stemfile() (including the name since it would no longer be processing a file) to take a character string as input. And scanf() is going to make life difficult; it will read strings separated by blanks, which may be part of its attraction, but it will include any punctuation that is included in the 'word'.
As Randall noted, the code in the existing function is a little obscure; I think it could be written more simply as follows:
#include <stdio.h>
#include <ctype.h>
#define LETTER(x) isalpha(x)
extern int stem(char *s, int lo, int hi);
static void stemfile(FILE * f)
{
int ch;
while ((ch = getc(f)) != EOF)
{
if (LETTER(ch))
{
char s[1024];
int i = 0;
s[i++] = ch;
while ((ch = getc(f)) != EOF && LETTER(ch))
s[i++] = ch;
if (ch != EOF)
ungetc(ch, f);
s[i] = '\0';
s[stem(s, 0, i-1)+1] = 0;
/* the previous line calls the stemmer and uses its result to
zero-terminate the string in s */
printf("%s", s);
}
else
putchar(ch);
}
}
I've slightly simplified things by making s into a simple local variable (it appears to have been a global, as does i_max), removing i_max and the increase_s() function. Those are largely incidental to the operation of the function.
If you want this to process a (null-terminated) string instead, then:
static void stemstring(const char *src)
{
char ch;
while ((ch = *src++) != '\0')
{
if (LETTER(ch))
{
int i = 0;
char s[1024];
s[i++] = ch;
while ((ch = *src++) != '\0' && LETTER(ch))
s[i++] = ch;
if (ch != '\0')
src--;
s[i-1] = '\0';
s[stem(s,0,i-1)+1] = 0;
/* the previous line calls the stemmer and uses its result to
zero-terminate the string in s */
printf("%s",s);
}
else
putchar(ch);
}
}
This systematically changes getc(f) into *src++, EOF into \0, and ungetc() into src--. It also (safely) changes the type of ch from int (necessary for I/O) to char. If you are worried about buffer overflow, you have to work a bit harder in the function, but few words in practice will be even 1024 bytes (and you could use 4096 as easily as 1024, with a correspondingly smaller - infinitesimal - chance of real data overflowing the buffer). You need to judge whether that is a 'real' risk for you.
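If overflow is a concern, one way of "working a bit harder" is to stop storing letters once the buffer is full while still consuming the rest of the run. This is only a sketch; the helper name copy_letter_run and its interface are assumptions for illustration, not code from the answer.

```c
#include <ctype.h>
#include <stddef.h>

/* Copy the run of letters at the start of src into dst (capacity cap >= 1),
   truncating if needed; returns a pointer to the first non-letter. */
static const char *copy_letter_run(const char *src, char *dst, size_t cap)
{
    size_t i = 0;
    while (*src != '\0' && isalpha((unsigned char)*src))
    {
        if (i + 1 < cap)        /* always keep room for the terminator */
            dst[i++] = *src;
        src++;                  /* consume the letter even when it is not stored */
    }
    dst[i] = '\0';
    return src;
}
```

With a 4-byte buffer and the input "great!", the buffer ends up holding "gre" and the returned pointer is left at the '!'.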
The main program can become quite simply:
int main(void)
{
char string[1024];
while (scanf("%1023s", string) == 1)
stemstring(string);
return(0);
}
Clearly, because of the '1023' in the format, this will never overflow the inner buffer. (NB: Removed the . from "%.1023s" in first version of this answer; scanf() is not the same as printf()!).
Challenged: does this work?
Yes - this code below (adding a dummy stem() function and slightly modifying the printing) works reasonably well for me:
#include <stdio.h>
#include <ctype.h>
#include <assert.h>
#define LETTER(x) isalpha(x)
#define MAX(x, y) (((x) > (y)) ? (x) : (y))
static int stem(const char *s, int begin, int end)
{
assert(s != 0);
return MAX(end - begin - 3, 3);
}
static void stemstring(const char *src)
{
char ch;
while ((ch = *src++) != '\0')
{
if (LETTER(ch))
{
int i = 0;
char s[1024];
s[i++] = ch;
while ((ch = *src++) != '\0' && LETTER(ch))
s[i++] = ch;
if (ch != '\0')
src--;
s[i-1] = '\0';
s[stem(s,0,i-1)+1] = 0;
/* the previous line calls the stemmer and uses its result to
zero-terminate the string in s */
printf("<<%s>>\n",s);
}
else
putchar(ch);
}
putchar('\n');
}
int main(void)
{
char string[1024];
while (scanf("%1023s", string) == 1)
stemstring(string);
return(0);
}
Example dialogue
H: assda23
C: <<assd>>
C: 23
H: 3423///asdrrrf12312
C: 3423///<<asdr>>
C: 12312
H: 12//as//12
C: 12//<<a>>
C: //12
The lines marked H: are human input (the H: was not part of the input); the lines marked C: are computer output.
Next attempt
The trouble with concentrating on grotesquely overlong words (1023 characters and more) is that you can overlook the simple. With scanf() reading data, you automatically get single 'words' with no spaces in them as input. Here's a debugged version of stemstring() with debugging printing code in place. The problem was two off-by-one errors. One was in the assignment s[i-1] = '\0'; where the -1 was not needed. The other was in the handling of the end of a string of letters; the while ((ch = *src++) != '\0') left src one place too far, which led to interesting effects with short words entered after long words (when the difference in length was 2 or more). There's a fairly detailed trace of the test case I devised, using words such as 'great' and 'book' which you diagnosed (correctly) as being mishandled. The stem() function here simply prints its inputs and outputs, and returns the full length of the string (so there is no stemming occurring).
#include <stdio.h>
#include <ctype.h>
#include <assert.h>
#define LETTER(x) isalpha(x)
#define MAX(x, y) (((x) > (y)) ? (x) : (y))
static int stem(const char *s, int begin, int end)
{
int len = end - begin + 1;
assert(s != 0);
printf("ST (%d,%d) <<%*.*s>> RV %d\n", begin, end, len, len, s, len);
// return MAX(end - begin - 3, 3);
return len;
}
static void stemstring(const char *src)
{
char ch;
printf("-->> stemstring: <<%s>>\n", src);
while ((ch = *src++) != '\0')
{
if (ch != '\0')
printf("LP <<%c%s>>\n", ch, src);
if (LETTER(ch))
{
int i = 0;
char s[1024];
s[i++] = ch;
while ((ch = *src++) != '\0' && LETTER(ch))
s[i++] = ch;
src--;
s[i] = '\0';
printf("RD (%d) <<%s>>\n", i, s);
s[stem(s, 0, i-1)+1] = '\0';
/* the previous line calls the stemmer and uses its result to
zero-terminate the string in s */
printf("RS <<%s>>\n", s);
}
else
printf("NL <<%c>>\n", ch);
}
//putchar('\n');
printf("<<-- stemstring\n");
}
int main(void)
{
char string[1024];
while (scanf("%1023s", string) == 1)
stemstring(string);
return(0);
}
The debug-laden output is shown (the first line is the typed input; the rest is the output from the program):
what a great book this is! What.hast.thou.done?
-->> stemstring: <<what>>
LP <<what>>
RD (4) <<what>>
ST (0,3) <<what>> RV 4
RS <<what>>
<<-- stemstring
-->> stemstring: <<a>>
LP <<a>>
RD (1) <<a>>
ST (0,0) <<a>> RV 1
RS <<a>>
<<-- stemstring
-->> stemstring: <<great>>
LP <<great>>
RD (5) <<great>>
ST (0,4) <<great>> RV 5
RS <<great>>
<<-- stemstring
-->> stemstring: <<book>>
LP <<book>>
RD (4) <<book>>
ST (0,3) <<book>> RV 4
RS <<book>>
<<-- stemstring
-->> stemstring: <<this>>
LP <<this>>
RD (4) <<this>>
ST (0,3) <<this>> RV 4
RS <<this>>
<<-- stemstring
-->> stemstring: <<is!>>
LP <<is!>>
RD (2) <<is>>
ST (0,1) <<is>> RV 2
RS <<is>>
LP <<!>>
NL <<!>>
<<-- stemstring
-->> stemstring: <<What.hast.thou.done?>>
LP <<What.hast.thou.done?>>
RD (4) <<What>>
ST (0,3) <<What>> RV 4
RS <<What>>
LP <<.hast.thou.done?>>
NL <<.>>
LP <<hast.thou.done?>>
RD (4) <<hast>>
ST (0,3) <<hast>> RV 4
RS <<hast>>
LP <<.thou.done?>>
NL <<.>>
LP <<thou.done?>>
RD (4) <<thou>>
ST (0,3) <<thou>> RV 4
RS <<thou>>
LP <<.done?>>
NL <<.>>
LP <<done?>>
RD (4) <<done>>
ST (0,3) <<done>> RV 4
RS <<done>>
LP <<?>>
NL <<?>>
<<-- stemstring
The technique shown - printing diagnostic information at key points in the program - is one way of debugging a program such as this. The alternative is stepping through the code with a source code debugger - gdb or its equivalent. I probably more often use print statements, but I'm an old fogey who finds IDEs too hard to use (because they don't behave like the command line I'm used to).
Granted, it isn't your code any more, but I do think you should have been able to do most of the debugging yourself. I'm grateful that you reported the trouble with my code. However, you also need to learn how to diagnose problems in other people's code; how to instrument it; how to characterize and locate the problems. You could then report the problem with precision - "you goofed with your end of word condition, and ...".

Related

How to receive strings from HC05 Bluetooth module using ATmega16 microcontroller

I am having a problem receiving strings from the HC05 on the ATmega16. I am able to receive single characters but not strings.
I want to control a DC motor wirelessly using an ATmega16 and a Bluetooth module (HC05). I am sending the timer OCR1A values from a serial monitor app to the ATmega16 via the HC05, but without success.
#define F_CPU 16000000UL
#include<string.h>
#include <avr/io.h>
#include <util/delay.h>
#include <stdlib.h>
#include <stdio.h>
void UART_init()
{
UCSRB |= (1 << RXEN) | (1 << TXEN);
UCSRC |= (1 << URSEL) | (1 << UCSZ0) | (1 << UCSZ1);
UBRRL = 0x67;
}
unsigned char UART_RxChar()
{
while( (UCSRA & (1 << RXC)) == 0 );
return(UDR);
}
void UART_TxChar( char ch )
{
while( !(UCSRA & (1 << UDRE)) ); /* Wait for empty transmit buffer*/
UDR = ch ;
}
void UART_SendString( char* str )
{
unsigned char j = 0;
while( j <= 2 )
{
UART_TxChar( str[j] );
j++;
}
}
int main( void )
{
char buff[3];
char j;
int i = 0, k = 0;
DDRD = (1 << PD5);
UART_init();
while( 1 )
{
buff[0] = UART_RxChar();
buff[1] = UART_RxChar();
buff[2] = UART_RxChar();
j = UART_RxChar();
if( j == '!' )
{
UART_SendString( buff ); // this is to check whether the atmega16 received correct values for timer or not.
UART_SendString( "\n" );
}
}
}
The expected result is that when I enter a number in the serial monitor app, I get the same number back in the app.
In the actual result I sometimes get different characters and sometimes nothing at all.
The string buff is unterminated, so UART_SendString( buff ); will send whatever junk follows the received three characters until a NUL (0) byte is found.
char buff[4] = {0};
Will have room for the NUL and the initialisation will ensure that buff[3] is a NUL terminator.
Alternatively, send the three characters individually since without the terminator they do not constitute a valid C (ASCIIZ) string.
Apart from the lack of NUL termination, your code requires input of exactly the form nnn!nnn!nnn!.... If the other end is in fact sending lines with CR or CR+LF terminators - nnn!<newline>nnn!<newline>nnn!<newline>... - your receive loop will get out of sync.
A safer solution is to use the previously received three characters whenever a '!' character is received. This can be done in a number of ways - for long buffers a ring-buffer would be advised, but for just three characters it is probably efficient enough to simply shift characters left when inserting a new character - for example:
// Needs #include <ctype.h> for isdigit()
char buff[4] = {0} ;   // buff[3] stays NUL, so buff is always a valid string
for(;;)
{
    memset( buff, '0', sizeof(buff) - 1 ) ;
    char ch = 0 ;
    while( ch != '!' )
    {
        ch = UART_RxChar() ;
        if( isdigit( (unsigned char)ch ) )
        {
            // Shift left one digit
            memmove( buff, &buff[1], sizeof(buff) - 2 ) ;
            // Insert new digit at the right
            buff[sizeof(buff) - 2] = ch ;
        }
        else if( ch != '!' )
        {
            // Unexpected character, reset buffer
            memset( buff, '0', sizeof(buff) - 1 ) ;
        }
    }
    UART_SendString( buff ) ;
    UART_SendString( "\n" ) ;
}
This also has the advantage that it will work when the number entered is less than three digits, and will discard any sequence containing non-digit characters.
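The shifting logic can be tried on a PC before it goes anywhere near the UART. The sketch below factors it into a hypothetical helper, feed_char(), that takes one received character at a time; the name and the return convention are assumptions made for illustration.

```c
#include <ctype.h>
#include <string.h>

/* buff holds the last three digits as a NUL-terminated string ("000" initially).
   Returns 1 when ch is the '!' terminator, i.e. buff is ready to use. */
static int feed_char(char buff[4], char ch)
{
    if (isdigit((unsigned char)ch))
    {
        memmove(buff, &buff[1], 2);   /* shift left one digit */
        buff[2] = ch;                 /* insert the new digit at the right */
        return 0;
    }
    if (ch == '!')
        return 1;
    memset(buff, '0', 3);             /* unexpected character: reset */
    return 0;
}
```

Feeding it '1', '2', '!' starting from "000" leaves "012" in the buffer; the caller is expected to reset the buffer to "000" after using the value, as the loop above does.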

How to validate an ASCII string with a two digit hexadecimal checksum appended?

I am using a Renesas 16-bit MCU with the HEW (High-performance Embedded Workbench) compiler.
The system receives ASCII data of the form:
<data><cc>
where <cc> comprises two ASCII hex digits corresponding to the 8-bit bitwise XOR of all the preceding characters. The maximum length of the string including <cc> is 14.
Here is my attempt:
#pragma INTERRUPT Interrupt_Rx0
void Interrupt_Rx0 (void)
{
unsigned char rx_byte, rx_status_byte,hex;
char buffer[15],test[5];
int r,k[15];
char * pEnd;
unsigned char dat,arr[14],P3;
unsigned int i,P1[10];
rx_byte = u0rbl; //get rx data
rx_status_byte = u0rbh;
if ((rx_status_byte & 0x80) == 0x00) //if no error
{
if ((bf_rx0_start == 0) && (rx_byte == '?') && (bf_rx0_ready == 0))
{
byte_rx0_buffer[0]=rx_byte;
bf_rx0_start = 1;
byte_rx0_ptr = 1;
}
else
{
if (rx_byte == '?')
{
bf_rx0_start = 1;
byte_rx0_ptr = 0;
}
if(bf_rx0_start == 1)
{
byte_rx0_buffer[byte_rx0_ptr++] = rx_byte;
sprintf(buffer,"%X",rx_byte); //ASCII CONVERSION
dat=strtol(buffer,&pEnd,16);
// P1=(int)dat;
// sprintf(P1,"%s",dat);
delay_ms(2000);
k[byte_rx0_ptr++]=dat;
}
if ((byte_rx0_ptr == 14))
bf_rx0_start = 0;//end further rx until detect new STX
}
}
}
I want to convert these values to hexadecimal and XOR them, i.e. (3F^30^31^53^52^57=68), if I can do this calculation in the program.
You fundamentally don't understand the difference between values and encodings. Two plus three is five whether you represent the two as "2", "two", or "X X". Addition operates on values, not representations. So to "convert to hexadecimal & xor it" makes no sense. You XOR values, not representations. Hexadecimal is a representation.
To maintain a running XOR, just do something like int running_xor=0; at the top and then running_xor ^= rx_byte; each time you receive a byte. It will contain the correct value when you are finished. Set it to zero to reset it.
Get hexadecimal completely out of your head. That is just how those values are being printed for your consumption. That has nothing to do with the internal logic of your program which deals only in values.
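To make the point concrete, here is a minimal sketch of the running XOR applied to a buffer of received bytes. The function name xor_bytes is an assumption for this example. The bytes 3F 30 31 53 52 57 from the question are simply the ASCII characters ?01SRW.

```c
#include <stddef.h>
#include <stdint.h>

/* XOR of the raw byte values; no hexadecimal conversion is involved. */
static uint8_t xor_bytes(const char *data, size_t len)
{
    uint8_t running_xor = 0;
    for (size_t i = 0; i < len; i++)
        running_xor ^= (uint8_t)data[i];
    return running_xor;
}
```

Calling xor_bytes("?01SRW", 6) gives the value 0x68, which is the same number the question arrived at by hand.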
You would do well to separate out the data validation from the data reception, even to the extent that you don't do it in the interrupt handler; it is likely to be better to buffer the data in the ISR unchecked and defer the data validation to the main code thread or a task-thread if you are using an RTOS. You certainly don't want to be calling heavy-weight library functions such as sprintf() or strtol() in an ISR!
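One common way to buffer the data in the ISR and process it later is a small ring buffer. The following is only a sketch under stated assumptions: the names rb_put and rb_get and the size RB_SIZE are invented for this example, and it assumes a single producer (the ISR) and a single consumer (the main loop) with byte-sized indices that are read and written atomically.

```c
#include <stdint.h>

#define RB_SIZE 16u                 /* hypothetical size; must be a power of two */

static volatile uint8_t rb_data[RB_SIZE];
static volatile uint8_t rb_head;    /* written only by the ISR */
static volatile uint8_t rb_tail;    /* written only by the main loop */

/* Called from the receive ISR: store one byte, dropping it if the buffer is full. */
static int rb_put(uint8_t byte)
{
    uint8_t next = (uint8_t)((rb_head + 1u) & (RB_SIZE - 1u));
    if (next == rb_tail)
        return 0;                   /* full */
    rb_data[rb_head] = byte;
    rb_head = next;
    return 1;
}

/* Called from the main loop: fetch one byte if available. */
static int rb_get(uint8_t *byte)
{
    if (rb_tail == rb_head)
        return 0;                   /* empty */
    *byte = rb_data[rb_tail];
    rb_tail = (uint8_t)((rb_tail + 1u) & (RB_SIZE - 1u));
    return 1;
}
```

The ISR would then do little more than call rb_put() with the received byte, while the main loop drains the buffer with rb_get() and performs the framing and checksum work there.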
Either way, here is a function that would take a pointer to a received string, and its length (to avoid an unnecessary strlen() call since you already know how many characters were received), and returns true if the checksum validates, and false otherwise. It has no restriction on data length - that would be performed by the calling function.
If you know that your checksum hex digits will always be either upper or lower-case, you can simplify the decodeHexNibble() function.
#include <stdint.h>
#include <stdbool.h>
uint8_t decodeHexByte( char* hexbyte ) ;
uint8_t decodeHexNibble( char hexdigit ) ;
bool checkData( char* data, int length )
{
int data_len = length - 2 ;
char* bcc_ptr = &data[data_len] ;
uint8_t rx_bcc_val = 0 ;
uint8_t actual_bcc_val = 0 ;
int i = 0 ;
// Convert <cc> string to integer
rx_bcc_val = decodeHexByte( bcc_ptr ) ;
// Calculate XOR of <data>
for( i = 0; i < data_len; i++ )
{
actual_bcc_val ^= data[i] ;
}
return actual_bcc_val == rx_bcc_val ;
}
uint8_t decodeHexNibble( char hexdigit )
{
uint8_t nibble ;
if( hexdigit >= '0' && hexdigit <= '9' )
{
nibble = hexdigit - '0' ;
}
else if( hexdigit >= 'a' && hexdigit <= 'f' )
{
nibble = hexdigit - 'a' + 10 ;
}
else if( hexdigit >= 'A' && hexdigit <= 'F' )
{
nibble = hexdigit - 'A' + 10 ;
}
else
{
// Do something 'sensible' with invalid digits
nibble = 0 ;
}
return nibble ;
}
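uint8_t decodeHexByte( char* hexbyte )
{
// Decode each ASCII hex digit to its 4-bit value before combining them
uint8_t byte = (uint8_t)( decodeHexNibble( hexbyte[0] ) << 4 ) ;
byte |= decodeHexNibble( hexbyte[1] ) ;
return byte ;
}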

Determine Position of Most Significant Set Bit in a Byte

I have a byte I am using to store bit flags. I need to compute the position of the most significant set bit in the byte.
Example Byte: 00101101 => 6 is the position of the most significant set bit
Compact Hex Mapping:
[0x00] => 0x00
[0x01] => 0x01
[0x02,0x03] => 0x02
[0x04,0x07] => 0x03
[0x08,0x0F] => 0x04
[0x10,0x1F] => 0x05
[0x20,0x3F] => 0x06
[0x40,0x7F] => 0x07
[0x80,0xFF] => 0x08
TestCase in C:
#include <stdio.h>
unsigned char check(unsigned char b) {
unsigned char c = 0x08;
unsigned char m = 0x80;
do {
if(m&b) { return c; }
else { c -= 0x01; }
} while(m>>=1);
return 0; //never reached
}
int main() {
unsigned char input[256] = {
0x00,0x01,0x02,0x03,0x04,0x05,0x06,0x07,0x08,0x09,0x0a,0x0b,0x0c,0x0d,0x0e,0x0f,
0x10,0x11,0x12,0x13,0x14,0x15,0x16,0x17,0x18,0x19,0x1a,0x1b,0x1c,0x1d,0x1e,0x1f,
0x20,0x21,0x22,0x23,0x24,0x25,0x26,0x27,0x28,0x29,0x2a,0x2b,0x2c,0x2d,0x2e,0x2f,
0x30,0x31,0x32,0x33,0x34,0x35,0x36,0x37,0x38,0x39,0x3a,0x3b,0x3c,0x3d,0x3e,0x3f,
0x40,0x41,0x42,0x43,0x44,0x45,0x46,0x47,0x48,0x49,0x4a,0x4b,0x4c,0x4d,0x4e,0x4f,
0x50,0x51,0x52,0x53,0x54,0x55,0x56,0x57,0x58,0x59,0x5a,0x5b,0x5c,0x5d,0x5e,0x5f,
0x60,0x61,0x62,0x63,0x64,0x65,0x66,0x67,0x68,0x69,0x6a,0x6b,0x6c,0x6d,0x6e,0x6f,
0x70,0x71,0x72,0x73,0x74,0x75,0x76,0x77,0x78,0x79,0x7a,0x7b,0x7c,0x7d,0x7e,0x7f,
0x80,0x81,0x82,0x83,0x84,0x85,0x86,0x87,0x88,0x89,0x8a,0x8b,0x8c,0x8d,0x8e,0x8f,
0x90,0x91,0x92,0x93,0x94,0x95,0x96,0x97,0x98,0x99,0x9a,0x9b,0x9c,0x9d,0x9e,0x9f,
0xa0,0xa1,0xa2,0xa3,0xa4,0xa5,0xa6,0xa7,0xa8,0xa9,0xaa,0xab,0xac,0xad,0xae,0xaf,
0xb0,0xb1,0xb2,0xb3,0xb4,0xb5,0xb6,0xb7,0xb8,0xb9,0xba,0xbb,0xbc,0xbd,0xbe,0xbf,
0xc0,0xc1,0xc2,0xc3,0xc4,0xc5,0xc6,0xc7,0xc8,0xc9,0xca,0xcb,0xcc,0xcd,0xce,0xcf,
0xd0,0xd1,0xd2,0xd3,0xd4,0xd5,0xd6,0xd7,0xd8,0xd9,0xda,0xdb,0xdc,0xdd,0xde,0xdf,
0xe0,0xe1,0xe2,0xe3,0xe4,0xe5,0xe6,0xe7,0xe8,0xe9,0xea,0xeb,0xec,0xed,0xee,0xef,
0xf0,0xf1,0xf2,0xf3,0xf4,0xf5,0xf6,0xf7,0xf8,0xf9,0xfa,0xfb,0xfc,0xfd,0xfe,0xff };
unsigned char truth[256] = {
0x00,0x01,0x02,0x02,0x03,0x03,0x03,0x03,0x04,0x04,0x04,0x04,0x04,0x04,0x04,0x04,
0x05,0x05,0x05,0x05,0x05,0x05,0x05,0x05,0x05,0x05,0x05,0x05,0x05,0x05,0x05,0x05,
0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,
0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,
0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,
0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,
0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,
0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,
0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,
0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,
0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,
0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,
0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,
0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,
0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,
0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08};
int i,r;
int f = 0;
for(i=0; i<256; ++i) {
r=check(input[i]);
if(r !=(truth[i])) {
printf("failed %d : 0x%x : %d\n",i,0x000000FF & ((int)input[i]),r);
f += 1;
}
}
if(!f) { printf("passed all\n"); }
else { printf("failed %d\n",f); }
return 0;
}
I would like to simplify my check() function to not involve looping (or branching preferably). Is there a bit twiddling hack or hashed lookup table solution to compute the position of the most significant set bit in a byte?
Your question is about an efficient way to compute log2 of a value. And because you seem to want a solution that is not limited to the C language I have been slightly lazy and tweaked some C# code I have.
You want to compute log2(x) + 1 and for x = 0 (where log2 is undefined) you define the result as 0 (i.e. you create a special case where log2(0) = -1).
static readonly Byte[] multiplyDeBruijnBitPosition = new Byte[] {
7, 2, 3, 4,
6, 1, 5, 0
};
public static Byte Log2Plus1(Byte value) {
if (value == 0)
return 0;
var roundedValue = value;
roundedValue |= (Byte) (roundedValue >> 1);
roundedValue |= (Byte) (roundedValue >> 2);
roundedValue |= (Byte) (roundedValue >> 4);
var log2 = multiplyDeBruijnBitPosition[((Byte) (roundedValue*0xE3)) >> 5];
return (Byte) (log2 + 1);
}
This bit twiddling hack is taken from Find the log base 2 of an N-bit integer in O(lg(N)) operations with multiply and lookup where you can see the equivalent C source code for 32 bit values. This code has been adapted to work on 8 bit values.
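Since the question's test case is in C, a direct C port of the C# routine above might look like the following sketch. The table and the multiplier 0xE3 are copied from that code; the names debruijn_pos and log2_plus1 are assumptions for this example.

```c
#include <stdint.h>

/* Table and multiplier 0xE3 are taken from the C# code above. */
static const uint8_t debruijn_pos[8] = { 7, 2, 3, 4, 6, 1, 5, 0 };

static uint8_t log2_plus1(uint8_t value)
{
    if (value == 0)
        return 0;
    /* Smear the highest set bit into every lower position. */
    value |= (uint8_t)(value >> 1);
    value |= (uint8_t)(value >> 2);
    value |= (uint8_t)(value >> 4);
    /* Keep only the low 8 bits of the product, then use its top 3 bits as index. */
    return (uint8_t)(debruijn_pos[(uint8_t)(value * 0xE3u) >> 5] + 1);
}
```

The cast to uint8_t before the shift matters: C promotes the operands to int, so without it the high bits of the product would not be discarded as they are in the C# version.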
However, you may be able to use a very efficient built-in function that gives you the result directly (on many CPUs it maps to a single instruction such as Bit Scan Reverse). An answer to the question Bit twiddling: which bit is set? has some information about this. A quote from the answer provides one possible reason why there is low level support for solving this problem:
Things like this are the core of many O(1) algorithms such as kernel schedulers which need to find the first non-empty queue signified by an array of bits.
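As an example of such a built-in, GCC and Clang provide __builtin_clz(), which counts leading zero bits of an unsigned int. The sketch below uses it; it is compiler-specific rather than standard C, and the wrapper name msb_position is an assumption for this example.

```c
#include <limits.h>

/* __builtin_clz is a GCC/Clang extension and is undefined for 0,
   so zero is handled separately. */
static unsigned char msb_position(unsigned char b)
{
    if (b == 0)
        return 0;
    return (unsigned char)(sizeof(unsigned int) * CHAR_BIT
                           - (unsigned)__builtin_clz((unsigned int)b));
}
```

Other compilers have their own equivalents, so a portable program would need a fallback such as one of the bit-twiddling versions.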
That was a fun little challenge. I don't know if this one is completely portable since I only have VC++ to test with, and I certainly can't say for sure if it's more efficient than other approaches. This version was coded with a loop but it can be unrolled without too much effort.
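static unsigned char check(unsigned char b)
{
    unsigned char r = 8;
    unsigned char sub = 1;
    unsigned char s = 7;
    for (char i = 0; i < 8; i++)
    {
        /* sub stays 1 until the first set bit is seen, then drops to 0 for good */
        sub = sub & (((b & (1 << s)) >> s) - 1);
        s--;   /* kept in its own statement: reading and modifying s in one expression is undefined */
        r -= sub;
    }
    return r;
}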
I'm sure everyone else has long since moved on to other topics but there was something in the back of my mind suggesting that there had to be a more efficient branch-less solution to this than just unrolling the loop in my other posted solution. A quick trip to my copy of Warren put me on the right track: Binary search.
Here's my solution based on that idea:
Pseudo-code:
// see if there's a bit set in the upper half
if ((b >> 4) != 0)
{
offset = 4;
b >>= 4;
}
else
offset = 0;
// see if there's a bit set in the upper half of what's left
if ((b & 0x0C) != 0)
{
offset += 2;
b >>= 2;
}
// see if there's a bit set in the upper half of what's left
if ((b & 0x02) != 0)
{
offset++;
b >>= 1;
}
return b + offset;
Branch-less C++ implementation:
static unsigned char check(unsigned char b)
{
unsigned char adj = 4 & ((((unsigned char) - (b >> 4) >> 7) ^ 1) - 1);
unsigned char offset = adj;
b >>= adj;
adj = 2 & (((((unsigned char) - (b & 0x0C)) >> 7) ^ 1) - 1);
offset += adj;
b >>= adj;
adj = 1 & (((((unsigned char) - (b & 0x02)) >> 7) ^ 1) - 1);
return (b >> adj) + offset + adj;
}
Yes, I know that this is all academic :)
It is not possible in plain C. The best I would suggest is the following implementation of check. Despite being quite "ugly", I think it runs faster than the check version in the question.
int check(unsigned char b)
{
if(b&128) return 8;
if(b&64) return 7;
if(b&32) return 6;
if(b&16) return 5;
if(b&8) return 4;
if(b&4) return 3;
if(b&2) return 2;
if(b&1) return 1;
return 0;
}
Edit: I found a link to the actual code: http://www.hackersdelight.org/hdcodetxt/nlz.c.txt
The algorithm below is named nlz8 in that file. You can choose your favorite hack.
/*
From last comment of: http://stackoverflow.com/a/671826/315052
> Hacker's Delight explains how to correct for the error in 32-bit floats
> in 5-3 Counting Leading 0's. Here's their code, which uses an anonymous
> union to overlap asFloat and asInt: k = k & ~(k >> 1); asFloat =
> (float)k + 0.5f; n = 158 - (asInt >> 23); (and yes, this relies on
> implementation-defined behavior) - Derrick Coetzee Jan 3 '12 at 8:35
*/
unsigned char check (unsigned char b) {
union {
float asFloat;
int asInt;
} u;
unsigned k = b & ~(b >> 1);
u.asFloat = (float)k + 0.5f;
return 32 - (158 - (u.asInt >> 23));
}
Edit -- not exactly sure what the asker means by language independent, but below is the equivalent code in python.
import ctypes
class Anon(ctypes.Union):
    _fields_ = [
        ("asFloat", ctypes.c_float),
        ("asInt", ctypes.c_int)
    ]
def check(b):
    k = int(b) & ~(int(b) >> 1)
    a = Anon(asFloat=(float(k) + float(0.5)))
    return 32 - (158 - (a.asInt >> 23))

Objective c, Scanf() string taking in the same value twice

Hi all, I am having a strange issue: when I use scanf to input data, it repeats strings and saves them as one. I am not sure why.
Please help.
/* Assessment label loop - loops through the assessment labels and inputs the percentage and the name for each. */
i = 0;
j = 0;
while (i < totalGradedItems)
{
scanf("%s%d", assLabel[i], &assPercent[i]);
i++;
}
/* Print Statement */
i = 0;
while (i < totalGradedItems)
{
printf("%s", assLabel[i]);
i++;
}
Input Data
Prog1 20
Quiz 20
Prog2 20
Mdtm 15
Final 25
Output Via Console
Prog1QuizQuizProg2MdtmMdtmFinal
Final diagnosis
You don't show your declarations...but you must be allocating just 5 characters for the strings:
When I adjust the enum MAX_ASSESSMENTLEN from 10 to 5 (see the code below) I get the output:
Prog1Quiz 20
Quiz 20
Prog2Mdtm 20
Mdtm 15
Final 25
You did not allow for the terminal null. And you didn't show us what was causing the bug! And the fact that you omitted newlines from the printout obscured the problem.
What's happening is that 'Prog1' is occupying all 5 bytes of the string you read in, and is writing a null at the 6th byte; then Quiz is being read in, starting at the sixth byte.
When printf() goes to read the string for 'Prog1', it stops at the first null, which is the one after the 'z' of 'Quiz', producing the output shown. Repeat for 'Prog2' and 'Mdtm'. If there was an entry after 'Final', it too would suffer. You are lucky that there are enough zero bytes around to prevent any monstrous overruns.
This is a basic buffer overflow (indeed, since the array is on the stack, it is a basic Stack Overflow); you are trying to squeeze 6 characters (Prog1 plus '\0') into a 5 byte space, and it simply does not work well.
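The effect can be reproduced without invoking undefined behaviour by laying the two rows out by hand in one flat buffer. This is only an illustrative sketch; the function name rows_overlap_demo is an assumption for this example.

```c
#include <string.h>

/* Lay out two 5-byte "rows" back to back in one 10-byte buffer, the way
   char assLabel[2][5] would be, and return row 0 read as a string. */
static const char *rows_overlap_demo(void)
{
    static char flat[10];
    memset(flat, 0, sizeof flat);
    memcpy(&flat[0], "Prog1", 6);   /* 5 letters + '\0': the '\0' lands in row 1 */
    memcpy(&flat[5], "Quiz", 5);    /* row 1 overwrites that '\0' with 'Q' */
    return &flat[0];                /* row 0 now runs on until the '\0' after "Quiz" */
}
```

Printing the returned string shows Prog1Quiz, which is the doubled-up label seen in the question's output.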
Preliminary diagnosis
First, print newlines after your data.
Second, check that scanf() is not returning errors - it probably isn't, but neither you nor we can tell for sure.
Third, are you sure that the data file contains what you say? Plausibly, it contains a pair of 'Quiz' and a pair of 'Mdtm' lines.
Your variable j is unused, incidentally.
You would probably be better off having the input loop run until you are either out of space in the receiving arrays or you get a read failure. However, the code worked for me when dressed up slightly:
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
char assLabel[10][10];
int assPercent[10];
int i = 0;
int totalGradedItems = 5;
while (i < totalGradedItems)
{
if (scanf("%9s%d", assLabel[i], &assPercent[i]) != 2)
{
fprintf(stderr, "Error reading\n");
exit(1);
}
i++;
}
/* Print Statement */
i = 0;
while (i < totalGradedItems)
{
printf("%-9s %3d\n", assLabel[i], assPercent[i]);
i++;
}
return 0;
}
For the quoted input data, the output results are:
Prog1 20
Quiz 20
Prog2 20
Mdtm 15
Final 25
I prefer this version, though:
#include <stdio.h>
enum { MAX_GRADES = 10 };
enum { MAX_ASSESSMENTLEN = 10 };
int main(void)
{
char assLabel[MAX_GRADES][MAX_ASSESSMENTLEN];
int assPercent[MAX_GRADES];
int i = 0;
int totalGradedItems;
for (i = 0; i < MAX_GRADES; i++)
{
if (scanf("%9s%d", assLabel[i], &assPercent[i]) != 2)
break;
}
totalGradedItems = i;
for (i = 0; i < totalGradedItems; i++)
printf("%-9s %3d\n", assLabel[i], assPercent[i]);
return 0;
}
Of course, if I'd set up the scanf() format string 'properly' (meaning safely) so as to limit the length of the assessment names to fit into the space allocated, then the loop would stop reading on the second attempt:
...
char format[10];
...
snprintf(format, sizeof(format), "%%%ds%%d", MAX_ASSESSMENTLEN-1);
...
if (scanf(format, assLabel[i], &assPercent[i]) != 2)
With MAX_ASSESSMENTLEN at 5, the snprintf() generates the format string "%4s%d". The code compiled reads:
Prog 1
and stops. The '1' comes from the 5th character of 'Prog1'; the next assessment name is '20', and then the conversion of 'Quiz' into a number fails, causing the input loop to stop (because only one of two expected items was converted).
Despite the nuisance value, if you want to make your scanf() strings adjust to the size of the data variables it is reading into, you have to do something akin to what I did here - format the string using the correct size values.
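Put together, the format-building step might look like this sketch. It uses sscanf() on a string rather than scanf() so that it can be tried in isolation; the names parse_entry and LABEL_SIZE are assumptions for this example.

```c
#include <stdio.h>

enum { LABEL_SIZE = 5 };   /* mirrors MAX_ASSESSMENTLEN set to 5 */

/* Parse "name number" from line, limiting the name to LABEL_SIZE - 1 characters.
   Returns the number of items converted, as sscanf() does. */
static int parse_entry(const char *line, char label[LABEL_SIZE], int *percent)
{
    char format[16];
    snprintf(format, sizeof format, "%%%ds%%d", LABEL_SIZE - 1);   /* builds "%4s%d" */
    return sscanf(line, format, label, percent);
}
```

Given "Prog1 20" it stores "Prog" and 1, matching the behaviour described above, while "Quiz 20" fits and is read as expected.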
I guess you need to put a space between %s and %d here:
scanf("%s%d", assLabel[i], &assPercent[i]);
And it is not saving them as one. You need to print a newline, or at least a space, after %s to see the difference.
Addendum: when I tried
#include <stdio.h>
int main (int argc, const char * argv[])
{
char a[1][2];
for(int i =0;i<3;i++)
scanf("%s",a[i]);
for(int i =0;i<3;i++)
printf("%s",a[i]);
return 0;
}
with inputs
123456
qwerty
sdfgh
output is:
12qwsdfghqwsdfghsdfgh
which shows that the string array needs to be bigger than the size declared there.

What does having two asterisk ** in Objective-C mean?

I understand having one asterisk * is a pointer, what does having two ** mean?
I stumbled upon this in the documentation:
- (NSAppleEventDescriptor *)executeAndReturnError:(NSDictionary **)errorInfo
It's a pointer to a pointer, just like in C (which, despite its strange square-bracket syntax, Objective-C is based on):
char c;
char *pc = &c;
char **ppc = &pc;
char ***pppc = &ppc;
and so on, ad infinitum (or until you run out of variable space).
It's often used to pass a pointer to a function that must be able to change the pointer itself (such as re-allocating memory for a variable-sized object).
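A small sketch of that use: the caller passes the address of its own pointer, and the function fills it in only when something goes wrong. This resembles the way an error out-parameter such as (NSDictionary **)errorInfo is typically used, though the function parse_percent and its messages are purely hypothetical.

```c
#include <stdlib.h>

/* Returns 1 on success. On failure returns 0 and, if error is not NULL,
   points *error at a static message, so the caller's pointer is changed. */
static int parse_percent(const char *text, int *value, const char **error)
{
    char *end;
    long n = strtol(text, &end, 10);    /* strtol also reports 'end' through a char ** */
    if (end == text || *end != '\0')
    {
        if (error != NULL)
            *error = "not a number";
        return 0;
    }
    if (n < 0 || n > 100)
    {
        if (error != NULL)
            *error = "out of range";
        return 0;
    }
    *value = (int)n;
    return 1;
}
```

A caller declares const char *err = NULL; and passes &err; after a failed call, err points at the message, which would not be possible if only a copy of the pointer had been passed.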
=====
Following your request for a sample that shows how to use it, here's some code I wrote for another post which illustrates it. It's an appendStr() function which manages its own allocations (you still have to free the final version). Initially you set the string (char *) to NULL and the function itself will allocate space as needed.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void appendToStr (int *sz, char **str, char *app) {
char *newstr;
int reqsz;
/* If no string yet, create it with a bit of space. */
if (*str == NULL) {
*sz = strlen (app) + 10;
if ((*str = malloc (*sz)) == NULL) {
*sz = 0;
return;
}
strcpy (*str, app);
return;
}
/* If not enough room in string, expand it. We could use realloc
but I've kept it as malloc/cpy/free to ensure the address
changes (for the program output). */
reqsz = strlen (*str) + strlen (app) + 1;
if (reqsz > *sz) {
*sz = reqsz + 10;
if ((newstr = malloc (*sz)) == NULL) {
free (*str);
*str = NULL;
*sz = 0;
return;
}
strcpy (newstr, *str);
free (*str);
*str = newstr;
}
/* Append the desired string to the (now) long-enough buffer. */
strcat (*str, app);
}
static void dump (int sz, char *x) {
/* %p expects a void *, and strlen() returns size_t, so cast both. */
if (x == NULL)
printf ("%8p [%2d] %3d [%s]\n", (void *) x, sz, 0, "");
else
printf ("%8p [%2d] %3d [%s]\n", (void *) x, sz, (int) strlen (x), x);
}
static char *arr[] = {"Hello.", " My", " name", " is", " Pax",
" and"," I", " am", " old."};
int main (void) {
int i;
char *x = NULL;
int sz = 0;
printf (" Pointer Size Len Value\n");
printf (" ------- ---- --- -----\n");
dump (sz, x);
for (i = 0; i < sizeof (arr) / sizeof (arr[0]); i++) {
appendToStr (&sz, &x, arr[i]);
dump (sz, x);
}
free (x); /* the caller still owns the final buffer */
return 0;
}
The code outputs the following. You can see how the pointer changes when the currently allocated memory runs out of space for the expanded string (at the comments):
Pointer Size Len Value
------- ---- --- -----
# NULL pointer here since we've not yet put anything in.
0x0 [ 0] 0 []
# The first time we put in something, we allocate space (+10 chars).
0x6701b8 [16] 6 [Hello.]
0x6701b8 [16] 9 [Hello. My]
0x6701b8 [16] 14 [Hello. My name]
# Adding " is" takes length to 17 so we need more space.
0x6701d0 [28] 17 [Hello. My name is]
0x6701d0 [28] 21 [Hello. My name is Pax]
0x6701d0 [28] 25 [Hello. My name is Pax and]
0x6701d0 [28] 27 [Hello. My name is Pax and I]
# Ditto for adding " am".
0x6701f0 [41] 30 [Hello. My name is Pax and I am]
0x6701f0 [41] 35 [Hello. My name is Pax and I am old.]
In that case, the function takes a char **str (the caller passes &x) since it needs to be able to change the caller's pointer, *str.
=====
Or the following, which does an unrolled bubble sort (oh, the shame!) on strings that aren't in an array. It does this by directly exchanging the addresses of the strings.
#include <stdio.h>
#include <string.h>
static void sort (char **s1, char **s2, char **s3, char **s4, char **s5) {
char *t;
if (strcmp (*s1, *s2) > 0) { t = *s1; *s1 = *s2; *s2 = t; }
if (strcmp (*s2, *s3) > 0) { t = *s2; *s2 = *s3; *s3 = t; }
if (strcmp (*s3, *s4) > 0) { t = *s3; *s3 = *s4; *s4 = t; }
if (strcmp (*s4, *s5) > 0) { t = *s4; *s4 = *s5; *s5 = t; }
if (strcmp (*s1, *s2) > 0) { t = *s1; *s1 = *s2; *s2 = t; }
if (strcmp (*s2, *s3) > 0) { t = *s2; *s2 = *s3; *s3 = t; }
if (strcmp (*s3, *s4) > 0) { t = *s3; *s3 = *s4; *s4 = t; }
if (strcmp (*s1, *s2) > 0) { t = *s1; *s1 = *s2; *s2 = t; }
if (strcmp (*s2, *s3) > 0) { t = *s2; *s2 = *s3; *s3 = t; }
if (strcmp (*s1, *s2) > 0) { t = *s1; *s1 = *s2; *s2 = t; }
}
int main (int argCount, char *argVar[]) {
char *a = "77";
char *b = "55";
char *c = "99";
char *d = "88";
char *e = "66";
printf ("Unsorted: [%s] [%s] [%s] [%s] [%s]\n", a, b, c, d, e);
sort (&a,&b,&c,&d,&e);
printf (" Sorted: [%s] [%s] [%s] [%s] [%s]\n", a, b, c, d, e);
return 0;
}
which produces:
Unsorted: [77] [55] [99] [88] [66]
Sorted: [55] [66] [77] [88] [99]
Never mind the implementation of sort, just notice that the variables are passed as char ** so that they can be swapped easily. Any real sort would probably be acting on a true array of data rather than individual variables but that's not the point of the example.
Pointer to Pointer
The definition of "pointer" says that it's a special variable that stores the address of another variable (not the value). That other variable can very well be a pointer. This means that it's perfectly legal for a pointer to be pointing to another pointer.
Let's suppose we have a pointer p1 that points to yet another pointer p2 that points to a character c. In memory, the three variables can be visualized as follows (figure not reproduced here; in it, p2 sits at address 5000 and c at address 8000):
So we can see that in memory, pointer p1 holds the address of pointer p2. Pointer p2 holds the address of character c.
So p2 is a pointer to character c, while p1 is a pointer to p2. Or we can also say that p1 is a pointer to a pointer to character c.
Now, in code p2 can be declared as:
char *p2 = &c;
But p1 is declared as:
char **p1 = &p2;
So we see that p1 is a double pointer (i.e. a pointer to a pointer to a character), hence the two *s in the declaration.
Now,
p1 is the address of p2 i.e. 5000
*p1 is the value held by p2 i.e. 8000
**p1 is the value at 8000 i.e. c
I think that should pretty much clear up the concept.
Source: http://www.thegeekstuff.com/2012/01/advanced-c-pointers/
This is usually used to pass a pointer to a function that must be able to change the pointer itself. Some of its use cases are:
Handling errors: it allows the receiving method to control what the pointer refers to. See this question
Creating an opaque struct, i.e. so that others won't be able to allocate space for it themselves. See this question
Memory expansion, as mentioned in the other answers to this question.
Feel free to edit/improve this answer, as I am still learning :]
A pointer to a pointer.
In C, an array decays to a pointer to its first element when it is passed to a function, so a char * is commonly used for a string (an array of chars). If you want to pass an array of strings to a function, you can use char **.
(Reference: More iOS 6 Development)
In Objective-C methods, arguments, including object pointers, are
passed by value, which means that the called method gets its own copy
of the pointer that was passed in. So if the called method wants to
change the pointer, as opposed to the data the pointer points to, you
need another level of indirection. Thus, the pointer to the pointer.