Print codepath for starting method and data - aop

Is there a tool/method to list all function calls(codepath) for a starting point consisting of a specific method and some data(both global and function argument)?
This is a Visual Studio MFC console C++ project.
I thought of using AOP to tackle this, but it would be my first try at AOP, and would prefer a proven solution.
Additional problem for profiling is that the app has an infinite while listener and is multithreaded + "inter-process communication" (so profiler would have to pickup on other process response, and filter calls within while loop).
Is static code analysis a viable solution for this, or should i continue looking for profiling and AOP to solve this?

Related

How to debug a freeRTOS application?

How do you debug an RTOS application? I am using KEIL µVision and when I hit debug, the program steps through the main function until the function that initializes the RTOS kernel and then you can't step any further. The code itself works though. It is not mine btw, but I have to work on it. Is this normal behavior with RTOS applications or is this related to the program?
Yes, this is normal. You need to set breakpoints in the source code for the tasks that were created in main(): the only purpose of main() in a FreeRTOS application is to :
initialize the hardware,
create the resources (timers, semaphores...) and tasks your application will need,
start the scheduler
The application should never return from vTaskStartScheduler() if they were enough resources available.
Put break-points at the entry point of each task you need to debug. When you step the over the scheduler start (or simply run) the debugger will halts at the first task that runs. When that task blocks, some other task will be selected to run according to the scheduling rules.
Generally when debugging and you reach a blocking call, step-over it, other tasks may run and the debugger will stop at the next line only when the task becomes ready (depending on the nature of the blocking call). Often you will want to predict what task will run as a result of the call, and put a breakpoint in that task. For example if you issue a message send, you might place a breakpoint after the message receive call of the receiving task.
The point is you cannot "step-through" a context switch unless you have the RTOS source or do it at the assembler level, which is seldom useful or productive, and will not work for preemption.
You get a somewhat better RTOS debug experience and tool support in Keil if you use Keil's own RTX5 RTOS rather then FreeRTOS, but all of the above remains true.
Yes, this is an expected behaviour. The best way to debug a RTOS application is to place breakpoints at all tasks, key function entry points and step debug.
The debugger supports various methods of single-stepping through an application as in below link.
http://www.keil.com/products/uvision/db_exe_step.asp
Typical challenges in debugging RTOS application can be dealing with interrupt handling, synchronization issues and register/memory corruption.
Keil µVision's System Analyzer enables one to view the program execution time frame, status of each thread. It shall also help in viewing interrupts, exceptions if tracer is enabled.

C++ | Adding workload to a existing thread from a injected DLL

in my project i injected a DLL(64-bit Windows 10) in to a external process with Manual-map & Thread-hijacking and i do some stuff in there.
In current state i use "RtlCreateUserThread" to create a new thread and do some extra workload in there to distribute it for better performance.
My question is now... Is it possible to access other threads from the current process (hijack it) and add your own workload/code there. Without creating a new thread?
I didn't found anything helpful yet in the internet and the code i used and modified for Thread-hijacking seems to only work for a DLL file. Because i am pretty new to C++ i am still learning i am already thankful for any help.
(If you want to see the source for injector Google GHInjector your find the library on github.)
It is possible, but so complicated and may not work in all cases.
You need to splice existing thread's machine codes, so you will need write access to code page memory.
Logic:
find thread id and thread handle, then suspend thread with SuspendThread WINAPI call
suspended thread can be in wait state or in system DLL call now, so you need to analyze current execution stack, backtrace it and find execution address from application space. You need API functions StackWalk, and PDB files in some cases. Also it depends on running architecture (x86, amd64, ...). Walk through stack until your EIP/RIP will not be in application memory address space
decode machine instruction (it will be 'call') and splice next instructions to your function call. You need to use __declspec(naked) declared function or ASM implemented one for execute your code and replaced instructions.
ResumeThread
This method may work only once because no guarantees that application code is executed in loop.

Unreleased DirectShow CSource filter makes program crash at process shutdown

I'm developing a DirectShow CSource capture filter. It works fine, but when I close the program that is using the filter (in this case I'm testing with VLC, but the same happens with other programs), the program crashes (if I'm debugging it in Visual Studio then a breakpoint is triggered).
I've been hunting down this problem for some time now and found that both, my source filter and my source stream are not being released; this is, their reference counter is 1 at the end of the program, DllCanUnloadNow() function reports that there are 2 objects still in use, and, when CoUninitialize() is invoked, the program crashes.
I'm pretty sure that the reference counters are being handled correctly since I'm using the base classes implementation. The only unusual thing in my software that I can think of is that I'm using my own version of DllGetClassObject(): I configured the .DEF file to have MyDllGetClassObject() exported instead of DllGetClassObject() so I could insert some code before invoking the default implementation. I don't think this is a problem since I've checked that the reference counter of all objects I return at the end of MyDllGetClassObject() is 1.
I guess I'm missing something about the lifecycle of the filter, but can't figure out what (this is the very first capture filter I'm developing). Any suggestion?
Thank you in advance,
Guillermo
I finally figured out what was going on. The static method InitializeInstance in my source filter is invoked with bLoading == false and rclsid == <the GUID of my filter> at process shutdown. That seems to be the appropriate place to release that remaining reference counter from the filter instance.
I got the key idea of how important is to release all COM objects before CoUninitialize some time ago from another post on StackOverflow entitled DirectShow code crashes after exit (PushSourceDesktop sample). All I needed was just a little bit more knowledge on DirectShow filters lifecycle.
Anyway, thank you for your efforts, Roman, I know how vague this thread sounded from the beginning :)

Simulating multiple instances of an embedded processor

I'm working on a project which will entail multiple devices, each with an embedded (ARM) processor, communicating. One development approach which I have found useful in the past with projects that only entailed a single embedded processor was develop the code using Visual Studio, divided into three portions:
Main application code (in unmanaged C/C++ [see note])
I/O-simulating code (C/C++) that runs under Visual Studio
Embedded I/O code (C), which Visual Studio is instructed not to build, runs on the target system. Previously this code was for the PIC; for most future projects I'm migrating to the ARM.
Feeding the embedded compiler/linker the code from parts 1 and 3 yields a hex file that can run on the target system. Running parts 1 and 2 together yields code which can run on the PC, with the benefit of better debugging tools and more precise control over I/O behavior (e.g. I can make the simulation code introduce certain types of random hiccups more easily than I can induce controlled hiccups on real hardware).
Target code is written in C, but the simulation environment uses C++ so as to simulate I/O registers. For example, I have a PortArray data structure; the header file for the embedded compiler includes a line like unsigned char LATA # 0xF89; and my header file for simulation includes #define LATA _IOBIT(f89,1) which in turn invokes a macro that accesses a suitable property of an I/O object, so a statement like LATA |= 4; will read the simulated latch, "or" the read value with 4, and write the new value. To make this work, the target code has to compile under C++ as well as under C, but this mostly isn't a problem. The biggest annoyance is probably with enum types (which behave as integers in C, but have to be coaxed to do so in C++).
Previously, I've used two approaches to making the simulation interactive:
Compile and link a DLL with target-application and simulation code, and have VB code in the same project which interacts with it.
Compile the target-application code and some simulation code to an EXE with instance of Visual Studio, and use a second instance of Visual Studio for the simulation-UI. Have the two programs communicate via TCP, so nearly all "real" I/O logic is in the simulation program. For example, the aforementioned `LATA |= 4;` would send a "read port 0xF89" command to the TCP port, get the response, process the received value, and send a "write port 0xF89" command with the result.
I've found the latter approach to run a tiny bit slower than the former in some cases, but it seems much more convenient for debugging, since I can suspend execution of the unmanaged simulation code while the simulation UI remains responsive. Indeed, for simulating a single target device at a time, I think the latter approach works extremely well. My question is how I should best go about simulating a plurality of target devices (e.g. 16 of them).
The difficulty I have is figuring out how to make each simulated instance get its own set of global variables. If I were to compile to an EXE and run one instance of the EXE for each simulated target device, that would work, but I don't know any practical way to maintain debugger support while doing that. Another approach would be to arrange the target code so that everything would compile as one module joined together via #include. For simulation purposes, everything could then be wrapped into a single C++ class, with global variables turning into class-instance variables. That would be a bit more object-oriented, but I really don't like the idea of forcing all the application code to live in one compiled and linked module.
What would perhaps be ideal would be if the code could load multiple instances of the DLL, each with its own set of global variables. I have no idea how to do that, however, nor do I know how to make things interact with the debugger. I don't think it's really necessary that all simulated target devices actually execute code simultaneously; it would be perfectly acceptable for simulation instances to use cooperative multitasking. If there were some way of finding out what range of memory holds the global variables, it might be possible to have the 'task-switch' method swap out all of the global variables used by the previously-running instance and swap in the contents applicable to the instance being switched in. Although I'd know how to do that in an embedded context, though, I'd have no idea how to do that on the PC.
Edit
My questions would be:
Is there any nicer way to allow simulation logic to be paused and examined in VS2010 debugger, while keeping a responsive UI for the simulator front-end, than running the simulator front end and the simulator logic in separate instances of VS2010, if the simulation logic must be written in C and the simulation front end in managed code? For example, is there a way to tell the debugger that when a breakpoint is hit, some or all other threads should be allowed to keep running while the thread that had hit the breakpoint sits paused?
If the bulk of the simulation logic must be source-code compatible with an embedded system written in C (so that the same source files can be compiled and run for simulation purposes under VS2010, and then compiled by the embedded-systems compiler for use in real hardware), is there any way to have the VS2010 debugger interact with multiple simulated instances of the embedded device? Assume performance is not likely to be an issue, but the number of instances will be large enough that creating a separate project for each instance would be likely be annoying in the absence of any way to automate the process. I can think of three somewhat-workable approaches, but don't know how to make any of them work really nicely. There's also an approach which would be better if it's possible, but I don't know how to make it work.
Wrap all the simulation code within a single C++ class, such that what would be global variables in the target system become class members. I'm leaning toward this approach, but it would seem to require everything to be compiled as a single module, which would annoyingly affect the design of the target system code. Is there any nice way to have code access class instance members as though they were globals, without requiring all functions using such instances to be members of the same module?
Compile a separate DLL for each simulated instance (so that e.g. if I want to run up to 16 instances, I would include 16 DLL's in the project, all sharing the same source files). This could work, but every change to the project configuration would have to be repeated 16 times. Really ugly.
Compile the simulation logic to an EXE, and run an appropriate number of instances of that EXE. This could work, but I don't know of any convenient way to do things like set a breakpoint common to all instances. Is it possible to have multiple running instances of an EXE attached to a single debugger instance?
Load multiple instances of a DLL in such a way that each instance gets its own global variables, while still being accessible in the debugger. This would be nicest if it were possible, but I don't know any way to do so. Is it possible? How? I've never used AppDomains, but my intuition would suggest that might be useful here.
If I use one VS2010 instance for the front-end, and another for the simulation logic, is there any way to arrange things so that starting code in one will automatically launch the code in the other?
I'm not particularly committed to any single simulation approach; while it might be nice to know if there's some way of slightly improving the above, I'd also like to know of any other alternative approaches that could work even better.
I would think that you'd still have to run 16 copies of your main application code, but that your TCP-based I/O simulator could keep a different set of registers/state for each TCP connection that comes in.
Instead of a bunch of global variables, put them into a single structure that encompasses the I/O state of a single device. Either spawn off a new thread for each socket, or just keep a list of active sockets and dedicate a single instance of the state structure for each socket.
the simulators I have seen that handle multiple instances of the instruction set/processor are designed that way. There is a structure usually that contains a complete set of registers, and a new pointer or an array of these structures are used to multiply them into multiple instances of the processor.

Monitoring application calls to DLL

In short: I want to monitor selected calls from an application to a DLL.
We have an old VB6 application for which we lost the source code (the company wasn't using source control back then..). This application uses a 3rd party DLL.
I want to use this DLL in a new C++ application. Unfortunately the DLL API is only partially documented, so I don't know how to call some functions. I do have the functions signature.
Since the VB6 application uses this DLL, I want to see how it calls several functions. So far I've tried or looked at -
APIHijack - requires me to write C++ code for each function. Since I only need to log the values, it seems like an overkill.
EasyHook - same as 1, but allows writing in the code in .NET language.
OllyDbg with uHooker - I still have to write code for each function, this time in Python. Also, I have to do many conversions in Python using the struct module, since most functions pass values using pointers.
Since I only need to log functions parameters I want a simple solution. Is there any automated tool, for which I could tell which functions to monitor and their signature, and then get a detailed log file?
A "static" solution (in the sense it can capture a stack trace on demand) would be Process Monitor.
A more dynamic solution would be ApiMonitor, but it may be too old to be compatible with the applications to monitor. Worth a try though.
Some more Google searching found what I was looking for: WinAPIOverride32. It allows writing text files such as:
CustomApi.dll|void NameOfFunction(long param1, double& param2);
Later on, these files can be used inside the program to log all calls to NameOfFunction. Now I just need to figure out how to log arrays and structs parameters.
Visual Studio Addin Runtime Flow here:
Runtime Flow in real time monitors and logs function calls and
function parameters in your running .NET application and shows a stack
trace tree. No instrumentation or source code required for monitoring.
If you just want to see the function interfaces of the DLL, you could try "Dependecies" (https://lucasg.github.io/Dependencies/). This is a nice remake of the DependencyWalker in as OpenSource.
This only allows you to see the dependencies of the DLL, with the corresponding function names (however, not the calling structure). Unfortunately, I don't believe it will tell you which specific functions in a DLL are being used by the calling DLL/EXE.