Is there a way to get the current number of tokens on the parser stack in yacc?

I am running into a parser stack overflow in yacc, and I am not sure how the current parser stack size is determined. Is there a way to get the current parser stack size, so that once the number of tokens reaches the maximum stack depth, an error can be reported? Is there a variable in yacc that holds this information?

There is no standard way to get the parser stack size, although obviously it is available internally, since the parser is capable of producing a stack-overflow error (without segfaulting or otherwise invoking undefined behaviour). You don't need to check this yourself; you simply need to print the error message provided to yyerror. If the stack overflows, the error message will mention that fact.
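For reference, here is a minimal yyerror in C that just prints whatever message the parser hands it; both bison and byacc report stack overflow through this same callback (the exact wording of the message varies by implementation):

#include <stdio.h>

/* Called by the generated parser on any error, including
 * "parser stack overflow" / "memory exhausted". */
void yyerror(const char *msg)
{
    fprintf(stderr, "parse error: %s\n", msg);
}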
There are a few ways you can end up with a version of yacc which doesn't resize the stack. One is the use of the public-domain Berkeley yacc, often called byacc; the version I have kicking around (from 1993) sets the default stack size to 500.
Another possibility is to use GNU bison, compiling the result with a C++ compiler; by default, this makes the stack non-relocatable, since bison doesn't know whether the semantic value union is trivially copyable. (Newer versions of bison might not have this restriction.) By default, the initial bison stack size is 200.
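If you are using bison's C skeleton and hitting the limit, one option (a sketch; check your bison version's documentation for these macros) is to raise the limits in the grammar prologue, since both bounds are plain macros:

%{
/* Initial parser stack depth (bison's default is 200) and the
 * hard cap it may grow to (default 10000). */
#define YYINITDEPTH 5000
#define YYMAXDEPTH  100000
%}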
A common way to blow up stacks is to use right recursion for long lists. A particularly bad one is some variant on the following:
program: /* empty */
| statement program
;
which will overflow the parser stack if the program is too long. It's usually sufficient to change that to left recursion:
program: /* empty */
| program statement
;

Related

What is the order of local variables on the stack?

I'm currently trying to do some tests with the buffer overflow vulnerability.
Here is the vulnerable code
#include <stdio.h>

void win()
{
    printf("code flow successfully changed\n");
}

int main(int argc, char **argv)
{
    volatile int (*fp)();
    char buffer[64];

    fp = 0;
    /* gets() does no bounds checking: a long enough line overruns
     * buffer and can overwrite adjacent stack memory. */
    gets(buffer);
    if (fp) {
        printf("calling function pointer, jumping to 0x%08x\n", fp);
        fp();
    }
}
The exploit is quite simple and very basic: all I need here is to overflow the buffer and overwrite the fp value to make it hold the address of the win() function.
While trying to debug the program, I figured out that fp is placed below the buffer (i.e., at a lower address in memory), and thus I am not able to modify its value.
I thought that if we declare a local variable x before y, x would be higher in memory (i.e., at the bottom of the stack), so y could overwrite x if it exceeded its boundaries, which is not the case here.
I'm compiling the program with gcc version 5.2.1, with no special flags (only tested at -O0).
Any clue?
The order of local variables on the stack is unspecified.
It may change between different compilers, different versions or different optimization options. It may even depend on the names of the variables or other seemingly unrelated things.
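A quick throwaway sketch to see this for yourself (the output is entirely up to your compiler, its version, and flags):

#include <stdio.h>

int main(void)
{
    char buffer[64];
    int x = 0;

    /* Whether buffer sits above or below x is unspecified;
     * compare the output across compilers and -O levels. */
    printf("buffer = %p\n", (void *)buffer);
    printf("&x     = %p\n", (void *)&x);
    return x;
}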
The order of local variables on the stack is not determined until build time. I'm certainly no expert, but I think you'd have to do some sort of "hex dump", or perhaps run the code in a debugger, to find out where a variable is allocated. And I'd also guess that this would be a relative address, not an absolute one.
And as @RalfFriedl has indicated in his answer, the location could change under any number of compiler options invoked for different platforms, etc.
But we do know it's possible to do buffer overflows. Although you hear less about them now due to defensive measures such as address space layout randomization (ASLR), they're still around and paying the bills for someone, I'd guess. There are of course many, many online articles on the subject; here's one that seems fairly current: https://www.synopsys.com/blogs/software-security/detect-prevent-and-mitigate-buffer-overflow-attacks/
Good luck (should you even say that to someone practicing buffer overflow attacks?). At any rate, I hope you learn some things, and use it for good :)

"Bad permissions for mapped region at address" Valgrind error for memset

I am running into a problem that appears to be due to a stack overflow. When I run the application under Valgrind, I get the following errors:
Thread 75:
Invalid write of size 4
at 0x833FBF6: <Class Name>::<Method Name>(short, short&) (<File Name>:692)
Address 0x222d75c0 is on thread 75's stack
Process terminating with default action of signal 11 (SIGSEGV): dumping core
Bad permissions for mapped region at address 0x222D6000
at 0x4022BA3: memset (mc_replace_strmem.c:586)
by 0x833FC80: <Class Name>::<Method Name>(short, short&) (<File Name>:708)
If I open the core file in gdb, go to frame 1 where the memset is being called, and do an "info registers", it shows that $esp = 0x222d5210 and $ebp = 0x222d75c8.
Doesn't that seem to indicate that the stack would include memory at address 0x222D6000? If that's true, then why would we get the "Bad permissions" error?
The other odd thing is that line 692 of the source file is the very first line of the method (i.e., "void ::(short var1, short &var2)"). So, why would we get an invalid write at that point?
As I said, it seems to be a case of running out of stack space, but even if we use the "limit stacksize" command to increase the amount of allocated stack space, we still encounter the same problem.
I've been beating my head against the wall for several days trying to debug this problem. Any advice would be appreciated.
It turns out that this problem was due to a stack overflow after all. I didn't realize that the code that spawned the problematic thread explicitly set the size of that thread's stack. That's why changing the value used by the "limit stacksize" command didn't make a difference. Once I modified the code that set the stack size to increase the amount of memory allocated, the problem went away.
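For illustration, this is the kind of call that silently caps a thread's stack regardless of any shell limit (pthreads shown as an assumption; the question doesn't say which threading API was used):

#include <pthread.h>

static void *worker(void *arg)
{
    /* ... the code that was overflowing its stack ... */
    return arg;
}

int spawn_worker(pthread_t *tid)
{
    pthread_attr_t attr;
    pthread_attr_init(&attr);

    /* An explicit per-thread stack size overrides "limit stacksize";
     * this value, not the shell limit, is what must be raised. */
    pthread_attr_setstacksize(&attr, 8 * 1024 * 1024);

    int rc = pthread_create(tid, &attr, worker, NULL);
    pthread_attr_destroy(&attr);
    return rc;
}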
What you could do is activate the Valgrind gdbserver, and attach gdb+vgdb to your program running under Valgrind. You can then use various Valgrind monitor commands to get more information about the problem: e.g., look again at the register values, or use 'monitor v.info scheduler' to see the stack trace and the stack size.
The full list of monitor commands for memcheck and the Valgrind core can be found at
http://www.valgrind.org/docs/manual/mc-manual.html#mc-manual.monitor-commands
and
http://www.valgrind.org/docs/manual/manual-core-adv.html#manual-core-adv.valgrind-monitor-commands

How are the digits in ObjC method type encoding calculated?

This is a follow-up to my previous question:
What are the digits in an ObjC method type encoding string?
Say there is an encoding:
v24@0:4:8@12B16@20
How are those numbers calculated? B is a bool, so it should occupy just 1 byte (not 4 bytes). Does it have something to do with "alignment"? What is the size of void?
Is it correct to calculate the numbers as follows? Ask sizeof on every item and round the result up to a multiple of 4? And the first number becomes the sum of all the other ones?
The numbers were used in the m68K days to denote stack layout. That is, you could literally decode the method signature and, for just about all types, know exactly which bytes at what offset within the stack frame you could diddle to get/set arguments.
This worked because the m68K's ABI was entirely [IIRC -- been a long, long time] stack-based argument/return passing. Nothing was shoved into registers across call boundaries.
However, as Objective-C was ported to other platforms, always-on-the-stack was no longer the calling convention. Arguments and return values are often passed in registers.
Thus, those offsets are now useless. As well, the type encoding used by the compiler is no longer complete (because it was never terribly useful), and there will be types that won't be encoded. Not to mention that encoding some C++ templatized types yields method type encoding strings that can be many kilobytes in size (I think the record I ran into was around 30K of type information).
So, no, it isn't correct to use sizeof() to generate the numbers because they are effectively meaningless to everything. The only reason why they still exist is for binary compatibility; there are bits of esoteric code here and there that still parse the type encoding string with the expectation that there will be random numbers sprinkled here and there.
Note that there are vestiges of API in the ObjC runtime that still lead one to believe that it might be possible to encode/decode stack frames on the fly. It really isn't as the C ABI doesn't guarantee that argument registers will be preserved across call boundaries in the face of optimization. You'd have to drop to assembly and things get ugly really really fast (>shudder<).
The full encoding string is constructed (in clang) by the method ASTContext::getObjCEncodingForMethodDecl, which you can find in lib/AST/ASTContext.cpp.
The method that does the size rounding is ASTContext::getObjCEncodingTypeSize, in the same file. It forces each size to be at least the size of an int. On all of Apple's current platforms, an int is 4 bytes.
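Applying that rounding rule to the encoding in the question (on the question's 32-bit platform, where sizeof(int) is 4) accounts for every number, including the B rounded from 1 byte to 4:

/* v24@0:4:8@12B16@20
 *
 * v    return type void (the return carries no offset)
 * 24   total argument bytes = 4+4+4+4+4+4
 * @0   self      (id,  4 bytes)          at offset  0
 * :4   _cmd      (SEL, 4 bytes)          at offset  4
 * :8   SEL arg   (4 bytes)               at offset  8
 * @12  id arg    (4 bytes)               at offset 12
 * B16  bool arg  (1 byte, rounded to 4)  at offset 16
 * @20  id arg    (4 bytes)               at offset 20
 */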
The stack frame size and argument offsets are calculated by the compiler. I'm actually trying to track this down in the Clang source myself this week; it possibly has something to do with CodeGenTypes::arrangeObjCMessageSendSignature. (Looks like Rob just made my life a lot easier!)
The first number is the sum of the others, yes -- it's the total space occupied by the arguments. To get the size of the type represented by an ObjC type encoding in your code, you should use NSGetSizeAndAlignment().
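For example, a small sketch using NSGetSizeAndAlignment(), which walks one type encoding and reports its real (unrounded) size and alignment:

#import <Foundation/Foundation.h>
#include <stdio.h>

int main(void)
{
    NSUInteger size, align;

    /* 'B' encodes a C/ObjC bool: 1 byte, not the 4 suggested by
     * the offsets in the method encoding string. */
    NSGetSizeAndAlignment("B", &size, &align);
    printf("size = %lu, align = %lu\n",
           (unsigned long)size, (unsigned long)align);
    return 0;
}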

Calling a function(float *VeryBigArray, long SizeofArray) from within an ObjC method fails with EXC_BAD_ACCESS

Ok, I finally found the problem. It was inside the C function (CarbonTuner2), not the ObjC method. I was creating, inside the function, an array of the same size as the file, so if the file size was big it created a really big array, and my guess is that when I called another function from there, the local variables were put on the stack, which caused the EXC_BAD_ACCESS. What I did then is, instead of using a variable to declare the size of the array, I put the number in directly. Then the code didn't even compile; it knew. The error was something like: "Array size too big". I guess working 20+ hours in a row isn't good XD But I am definitely going to look into tools other than step-by-step debugging to figure these ones out. Thanks for your help. Here is the code. If you divide gFileByteCount by 2, you don't get the error anymore:
// ConverterController.h
#import <Cocoa/Cocoa.h>
#import "Converter.h"

@interface ConverterController : NSObject {
    UInt64 gFileByteCount;
}
- (IBAction)ProcessFile:(id)sender;
void CarbonTuner2(long numSampsToProcess, long fftFrameSize, long osamp);
@end

// ConverterController.m
#include "ConverterController.h"

@implementation ConverterController
- (IBAction)ProcessFile:(id)sender {
    UInt32 packets = gTotalPacketCount; // alloc a buffer of memory to hold the data read from disk.
    gFileByteCount = 250000;
    long LENGTH = (long)gFileByteCount;
    CarbonTuner2(LENGTH, (long)8192/2, (long)4*2);
}
@end

void CarbonTuner2(long numSampsToProcess, long fftFrameSize, long osamp)
{
    long numFrames = numSampsToProcess / fftFrameSize * osamp;
    float g2DFFTworksp[numFrames + 2][2 * fftFrameSize];
    double hello = sin(2.345);
}
Your crash has nothing to do with incompatibilities between C and ObjC.
And as previous posters said, you don't need to include math.h.
Run your code under gdb, and see where the crash happens by using backtrace.
Are you sure you're not sending bad arguments to the math functions?
E.g. this causes BAD_ACCESS:
double t = cos(*(double *)NULL);
Objective C is built directly on C, and the C underpinnings can and do work.
For an example of using math.h and parts of standard library from within an Objective C module, see:
http://en.wikibooks.org/wiki/Objective-C_Programming/syntax
There are other examples around.
Some care is needed when passing variables around; use the C variables for the C and standard library calls, and don't mix C data types and Objective-C data types incautiously. You'll usually want a conversion there.
If that is not the case, then please consider posting the code involved, and the error(s) you are receiving.
And with all respect due to Mr Hellman's response, I've hit errors when I don't have the header files included; I prefer to include the headers. But then, I tend to dial the compiler diagnostics up a couple of notches, too.
For what it's worth, I don't include math.h in my Cocoa app but have no problem using math functions (in C).
For example, I use atan() and don't get compiler errors, or run time errors.
Can you try this without including math.h at all?
First, you should add your code to your question, rather than posting it as an answer, so people can see what you're asking about. Second, you've got all sorts of weird problems with your memory management here - gFileByteCount is used to size a bunch of buffers, but it's set to zero, and doesn't appear to get re-set anywhere.
err = AudioFileReadPackets(fileID,
                           false, &bytesReturned, NULL, 0,
                           &packets, (Byte *)rawAudio);
So, at this point, you pass a zero-sized buffer to AudioFileReadPackets, which promptly overruns the heap, corrupting the value of who knows what other variables...
fRawAudio = malloc(gFileByteCount/(BITS/8) * sizeof(fRawAudio));
Here's another, minor error - you want sizeof(*fRawAudio) here, since you're trying to allocate an array of floats, not an array of float pointers. Fortunately, those entities are the same size, so it doesn't matter.
You should probably start with some example code that you know works (SpeakHere?) and modify it. I suspect there are other similar problems in the code you posted, but I don't have time to find them right now. At least get the rawAudio buffer appropriately sized and use the values returned from AudioFileReadPackets appropriately.

Measuring performance after Tail Call Optimization (TCO)

I have an idea about what it is. My questions are:
1.) If I write my code so that it is amenable to tail call optimization (the last statement in a recursive function being only a call to itself, with no other operation there), do I need to set any optimization level so that the compiler does TCO? In which optimization mode will a compiler perform TCO: optimizing for space or for time?
2.) How do I find out which compilers (MSVC, gcc, ARM-RVCT) support TCO?
3.) Assuming some compiler does TCO and we enable it, what is the way to find out that the compiler has actually done it? Will code size tell, or the cycles taken to execute, or both?
-AD
Most compilers support TCO; it is a relatively old technique. As far as how to enable it with a specific compiler, check the documentation. gcc enables it at -O2, -O3, and -Os; the specific option is -foptimize-sibling-calls. As far as how to tell how/if the compiler is doing TCO, look at the assembler output (gcc -S, for example) or disassemble the object code.
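As a quick illustration (a toy function of my own, not from the question), compile with gcc -O2 -S and check whether the recursive call shows up in the assembly:

/* tail.c: build with "gcc -O2 -S tail.c" and search tail.s for
 * "call sum_to"; with TCO applied there is none, only a loop/jump. */
unsigned long sum_to(unsigned long n, unsigned long acc)
{
    if (n == 0)
        return acc;
    return sum_to(n - 1, acc + n);  /* tail position */
}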
Optimization is compiler-specific. Consult the documentation for the various optimization flags.
You will find that in the compiler's documentation too. If you are curious, you can write a tail-recursive function, pass it a big argument, and look out for a stack overflow. (Though checking the generated assembler might be a better choice, if you understand the code generated.)
Just use the debugger and look at the addresses of function arguments/local variables. If they increase/decrease on each logical frame that the debugger shows (or if it actually only shows one frame, even though you did several calls), you know whether TCO was done.
If you want your compiler to do tail call optimization, just check either
a) the compiler's documentation for the optimization level at which it is performed, or
b) the asm, to see whether the function calls itself (you don't even need much asm knowledge to spot just the symbol of the function again).
If you really, really want tail recursion, my question would be:
Why don't you perform the tail call removal yourself? It means nothing more than removing the recursion, and if it's removable by the compiler at a low level, then it's also removable by you at the algorithmic level: you can program it directly into your code, which means nothing more than using a loop instead of a call to yourself (see the sketch below).
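For example, the tail-recursive fibcycle from the answer that follows can be rewritten by hand into the loop the optimizer would produce (a sketch keeping the same signature):

long double fibcycle_loop(long double f0, long double f1, unsigned i)
{
    /* Each "recursive call" just rebinds the parameters and
     * jumps back to the top: f0 <- f1, f1 <- f0+f1, i <- i-1. */
    while (i != 0) {
        long double next = f0 + f1;
        f0 = f1;
        f1 = next;
        --i;
    }
    return f1;
}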
One way to determine if tail-call optimization is happening is to see if you can force a stack overflow. The following program does not produce a stack overflow using VC++ 2005 Express Edition and, even though its results exceed the capacity of long double rather quickly, you can tell that all of the iterations are being processed when TCO is happening:
/* FibTail.c 0.00 UTF-8 dh:2008-11-23
* --|----1----|----2----|----3----|----4----|----5----|----6----|----*
*
* Demonstrate Fibonacci computation by tail call to see whether it is
* is eliminated through compiler optimization.
*/
#include <stdio.h>
long double fibcycle(long double f0, long double f1, unsigned i)
{ /* accumulate successive fib(n-i) values by tail calls */
if (i == 0) return f1;
return fibcycle(f1, f0+f1, --i);
}
long double fib(unsigned n)
{ /* the basic fib(n) setup and return. */
return fibcycle(1.0, 0.0, n);
}
int main(int argc, char* argv[ ])
{ /* compute some fibs until something breaks */
int i;
printf("\n i fib(i)\n\n");
for (i = 1; i > 0; i+=i)
{ /* Do for powers of 2 until i flips negative
or stack overflow, whichever comes first */
printf("%12d %30.20LG \n", i, fib((unsigned) i) );
}
printf("\n\n");
return 0;
}
Notice, however, that the simplifications made to get a pure tail call in fibcycle are tantamount to figuring out an iterative version that doesn't do a tail call at all (and will work with or without TCO in the compiler).
It might be interesting to experiment to see how well TCO does on functions that are not already near-optimal and easily replaced by iterations.