I am trying to implement a system that captures audio and extracts its MFCCs. I'd like to implement my own MFCC function, because the librosa library isn't implemented in C, and other MFCC implementations don't produce the same output as librosa does.
So I wrote some code. However, when I try to create the Hanning window, the program never takes a step further and always stays on the same statement while debugging. The statement is below:
float *mul = malloc(sizeof(float)*fftsize);
The whole code is as follows:
float *hanning(int fftsize){
    float *mul = malloc(sizeof(float) * fftsize);
    for (int i = 0; i < fftsize; i++){
        mul[i] = 0.5 * (1 - cos(2 * PI * i / (fftsize - 1)));
    }
    return mul;
}
I added LCD output to all the error handler functions in the stm32f7xx_it.c file to determine which fault I'm facing, and I see that it is a hard fault.
So what's the problem? I hope the issue is explained clearly. For privacy reasons, I couldn't post the whole code here. Sorry for that. Thanks in advance for your response.
Edit: I changed the malloc to a variable-length array, but it still ends up in HardFault_Handler. SCB->SHCSR sometimes returns 65535 and sometimes 1.
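For reference, here is a minimal defensive variant of the same function (the null check and the PI definition are my additions, not the original code): on STM32, if the heap region configured in the linker script or startup file is too small, malloc returns NULL, and writing through that null pointer typically ends in HardFault_Handler:
#include <math.h>
#include <stdlib.h>

#define PI 3.14159265358979f  /* assumption: defined similarly elsewhere */

/* Hanning window with an explicit allocation check: if malloc fails
 * (heap exhausted), return NULL instead of writing through a null
 * pointer, which on Cortex-M usually raises a hard fault. */
float *hanning(int fftsize){
    float *mul = malloc(sizeof(float) * (size_t)fftsize);
    if (mul == NULL){
        return NULL;  /* the caller must handle the failed allocation */
    }
    for (int i = 0; i < fftsize; i++){
        mul[i] = 0.5f * (1.0f - cosf(2.0f * PI * i / (fftsize - 1)));
    }
    return mul;
}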
I am using a Raspberry Pi 3 and basically tripped over a little tripwire.
I have a very big and complicated program that uses a lot of memory and puts a heavy load on the CPU. I assumed that if I started the same process while the first one was still running, it would use the same amount of memory again and, in particular, double the CPU load. I found out that it uses no additional memory and does not affect the CPU load.
To find out whether this behavior came from my program, I wrote a tiny C++ program with extremely high memory usage; here it is:
#include <iostream>
using namespace std;

int main()
{
    for (int i = 0; i < 100; i++) {
        float a[100][100][100];  // roughly 4 MB on the stack
        for (int i2 = 0; i2 < 99; ++i2) {
            for (int i3 = 0; i3 < 99; ++i3) {
                for (int i4 = 0; i4 < 99; ++i4) {
                    a[i2][i3][i4] = i2 * i3 * i4;
                    cout << a[i2][i3][i4] << endl;
                }
            }
        }
    }
    return 0;
}
The CPU load sits at about 30% of the maximum when I start the code in one terminal. Strangely, when I started it in another terminal at the same time, it didn't affect the CPU load. I concluded that this behaviour couldn't come from my program.
Now I want to know:
Is there a "lock" that ensures that a certain type of process does not grill your cores?
Why don't two identical processes double the CPU load?
Well, I found out that there is a "lock" that makes sure a process doesn't take all the memory and drive the CPU load up to 100%. It seems that the more processes there are, the higher the CPU load, but not in a linear way.
Additionally, the code I wrote to look for this behaviour only has high memory usage; the 30% load came from the cout calls into the standard library. Multiple processes can print at the same time without increasing the CPU load, but doing so slows down the printing.
When I found that out, I got suspicious about the program's speed. I used the analytics in my C++ IDE to measure the duration of my original program, and indeed it was a bit more than two times slower.
That seems to be the answer I was looking for, but I suspect it is not fully applicable to other systems, since the architecture of the Raspberry Pi is quite particular; I don't know how this works elsewhere.
BTW: I could have guessed that there is a lock. I mean, if you started 10 processes that each take 15% of the maximum CPU load, you would end up with 150% CPU usage. Impossible!
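A minimal way to test that theory (my untested sketch, not part of the original experiment): remove the cout from the inner loop, and the program becomes purely CPU-bound, so a second instance should then add real CPU load in top:
#include <iostream>

int main()
{
    volatile float sink = 0;  // volatile so the work isn't optimized away
    for (int i = 0; i < 1000; i++) {
        float a[100][100][100];  // same large stack array as above
        for (int i2 = 0; i2 < 99; ++i2)
            for (int i3 = 0; i3 < 99; ++i3)
                for (int i4 = 0; i4 < 99; ++i4)
                    a[i2][i3][i4] = i2 * i3 * i4;
        sink = a[98][98][98];  // read a value back so the work is observable
    }
    std::cout << sink << std::endl;
    return 0;
}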
I need to do many comparisons in an OpenCL program. Right now I do it like this:
/* Note: scalar parameters must not carry an address-space qualifier,
 * so "size" is a plain int rather than __global int. */
int memcmp(__global unsigned char *a, __global unsigned char *b, int size){
    for (int i = 0; i < size; i++){
        if (a[i] != b[i]) return 0;
    }
    return 1;
}
How can I make it faster? Maybe by using vectors like uchar4, or something else? Thanks!
I guess that your kernel processes "size" elements per thread. I think your code can improve if your accesses are more coalesced. Thanks to the L1 caches of current GPUs this is not a huge problem, but it can still imply a noticeable performance penalty. For example, say you have 4 threads (work-items) and size = 128, so the buffers hold 512 uchars. In your case, thread #0 accesses a[0] and b[0], but it brings a[0]...a[63] into the cache, and the same for b. Thread #1, which belongs to the same warp (aka wavefront), accesses a[128] and b[128], so it brings a[128]...a[191] into the cache, and so on. Only after thread #3 is the whole buffer in the cache. This is not a problem here, given the small size of this domain.
However, if the threads access the elements in an interleaved fashion, only one cache line is needed at any time for your 4 threads (the accesses are coalesced). The behavior improves further as more threads per block are considered. Please try it and tell me your conclusions. Thank you.
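To make the two access patterns concrete, here is a small sketch (my illustration; the helper names are made up) of the per-thread index computation for the strided layout described above versus a coalesced one:
/* Strided: thread t walks its own contiguous chunk of the buffer, so
 * on any given iteration, neighbouring threads touch cache lines that
 * are far apart. */
int idx_strided(int t, int i, int size) {
    return t * size + i;
}

/* Coalesced: on iteration i, threads t = 0..3 touch adjacent bytes,
 * so a single cache line serves the whole warp at once. */
int idx_coalesced(int t, int i, int nthreads) {
    return i * nthreads + t;
}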
See: http://www.nvidia.com/content/cudazone/download/opencl/nvidia_opencl_programmingguide.pdf Section 3.1.2.1
It is a bit old, but its concepts are not outdated.
PS: By the way, after this I would try uchar4 as you suggested, and also loop unrolling; a sketch of the uchar4 idea follows.
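Here is a rough, untested sketch of that uchar4 variant; it assumes size is a multiple of 4 (otherwise the tail bytes need a separate scalar loop) and relies on the OpenCL built-in any(), which is true if any component of a vector comparison holds:
/* Vectorized variant: compare 4 bytes per iteration. size4 = size / 4. */
int memcmp4(__global const uchar4 *a, __global const uchar4 *b, int size4){
    for (int i = 0; i < size4; i++){
        /* (a[i] != b[i]) yields a component-wise char4 mask; any() is
         * true if at least one component differs. */
        if (any(a[i] != b[i])) return 0;
    }
    return 1;
}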
This is a very basic question but please bear with me!
I got this code in a question as part of a quiz I was doing earlier, and I just didn't know if I might be missing something. I typed it into the editor, and it would not run; it does appear to be incomplete. Had it been if (k), it would have made more sense.
But, as I have heard that you can leave out components of a for loop, I was just wondering if there is any time you would see the likes of for (k)?
int k = 0;
for (k) {
    printf("hello");
}
for(int k; ;)
/* this is the correct syntax of a for loop without a conditional
   statement and without an increment/decrement statement */
Remember, those semicolons within the parentheses are important (without them the program wouldn't compile).
Now, to answer some of the questions you asked me on the earlier answer:
for(int k; ;)
{
    printf("infinite loop");
}
When will this loop come to an end?
This loop will never come to an end. It is an infinite loop: it will keep printing "infinite loop" forever.
Is it possible to bring this loop to an end?
Yes, it is. It can be brought to an end using a break statement.
for(int k; ;)  /* or for(; ;) */
{
    printf("infinite loop");
    break;
}
This prints "infinite loop" only once: the break statement is encountered, and control moves outside the loop.
Possible application:
It's used when you have no idea in advance of when a loop should come to an end.
for(int k; ;)
{
    /* accept the value of k from the user;
       we want the user to enter 1 as the input */
    scanf("%d", &k);
    if (k == 1)
    {
        printf("entered 1, moving out of loop");
        break;
    }
}
What is the meaning of the above loop?
- This loop keeps running until the user enters 1. This is important when you give the user a limited set of options and you don't want to accept an invalid input: the loop runs until a valid input is entered (you can add more if statements with break statements).
Menu: 1) Pizza
      2) Burger
      3) Quit buying!
for(int k = 0; k < 10; k++)
/* this is a finite loop, and it isn't suitable for the above
   requirement, because you cannot be sure the user will give a
   valid input within 10 iterations */
When k becomes 10, control will move out of the loop, regardless of whether the user has entered a valid input. What if the user inputs 8 when k = 9? Control will move out of the loop at k = 10. As a result, your program will not work correctly, because 8 is not an input you expected: you wanted 1, 2 or 3 as input.
So, an infinite loop is used when you are not sure how many iterations are required, and you use a break statement to exit such a loop. A complete sketch of the menu idea follows.
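Putting the pieces together, here is a minimal, self-contained sketch of the menu idea above (the prompt strings and option handling are just an illustration):
#include <stdio.h>

int main(void)
{
    int k;
    for (;;)  /* loop until the user picks a valid option */
    {
        printf("Menu: 1) Pizza  2) Burger  3) Quit buying!\n");
        if (scanf("%d", &k) != 1)
            break;  /* non-numeric input: stop reading */
        if (k == 1 || k == 2 || k == 3)
        {
            printf("you chose option %d\n", k);
            break;  /* valid input: leave the infinite loop */
        }
        printf("invalid input, try again\n");
    }
    return 0;
}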
Is this the only option for an infinite loop? Why not while()? Isn't while() with no condition an infinite loop?
while();             /* invalid in C: while requires a condition */
                     /* (Objective-C follows the C standard here) */
while("condition");  /* valid: a string literal is a non-null pointer,
                        so the condition is always true */
Some valid for loop declarations in C:
for(k; ;)      // infinite loop (k must already be declared)
for(; ;)       // infinite loop
for(; k<0;)    // valid: condition only (k must already be declared)
So, I think that sums up a small explanation.
Remember, the semicolons are important (irrespective of whether a condition is given or not).
And of course, you have other options for looping until a valid input is given, but the above was just one application I could come up with to show that an infinite loop can be cool!
If you find any error or have a doubt, please comment.
Well, even I am not too good at C, but since Java is somewhat similar, I figured it out.
GDB has a new version out that supports reverse debugging (see http://www.gnu.org/software/gdb/news/reversible.html). I got to wondering how that works.
To get reverse debugging to work, it seems to me that you need to store the entire machine state, including memory, for each step. That would make performance incredibly slow, not to mention using a lot of memory. How are these problems solved?
I'm a gdb maintainer and one of the authors of the new reverse debugging. I'd be happy to talk about how it works. As several people have speculated, you need to save enough machine state that you can restore later. There are a number of schemes, one of which is to simply save the registers or memory locations that are modified by each machine instruction. Then, to "undo" that instruction, you just revert the data in those registers or memory locations.
Yes, it is expensive, but modern CPUs are so fast that when you are working interactively anyway (stepping or hitting breakpoints), you don't really notice it that much.
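As a toy illustration of that record-and-revert scheme (my sketch, not gdb's actual data structures): before each instruction executes, save the old value of whatever it is about to overwrite; to step backwards, pop the log and restore:
#include <stdint.h>

/* One undo record: "this location held old_value before the instruction". */
typedef struct {
    uint64_t *location;   /* register slot or memory word being modified */
    uint64_t  old_value;  /* its value before the instruction executed */
} undo_record;

#define LOG_CAPACITY 1024
static undo_record undo_log[LOG_CAPACITY];
static int undo_top = 0;

/* Called just before an instruction overwrites *location. */
static void record_write(uint64_t *location)
{
    undo_log[undo_top].location  = location;
    undo_log[undo_top].old_value = *location;
    undo_top++;
}

/* "Undo" the most recent instruction by restoring what it clobbered. */
static void step_backwards(void)
{
    if (undo_top > 0) {
        undo_top--;
        *undo_log[undo_top].location = undo_log[undo_top].old_value;
    }
}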
Note also that simulators, virtual machines, and hardware recorders can be used to implement reverse execution.
Another solution to implement it is to trace execution on physical hardware, such as is done by GreenHills and Lauterbach in their hardware-based debuggers. Based on this fixed trace of the action of each instruction, you can then move to any point in the trace by removing the effects of each instruction in turn. Note that this assumes that you can trace all things that affect the state visible in the debugger.
Another way is to use a checkpoint + re-execution method, which is used by VmWare Workstation 6.5 and Virtutech Simics 3.0 (and later), and which seems to be coming with Visual Studio 2010. Here, you use a virtual machine or a simulator to get a level of indirection on the execution of a system. You regularly dump the entire state to disk or memory, and then rely on the simulator being able to deterministically re-execute the exact same program path.
Simplified, it works like this: say you are at time T in the execution of a system. To go to time T-1, you pick up some checkpoint from a point t < T and then execute (T-t-1) cycles, ending up one cycle before where you were. This can be made to work very well, and it applies even to workloads that do disk IO, contain kernel-level code, and perform device-driver work. The key is to have a simulator that contains the entire target system, with all its processors, devices, memories, and IOs. See the discussion on the gdb mailing list for more details. I use this approach myself quite regularly to debug tricky code, especially in device drivers and early OS boots.
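A toy version of that arithmetic (my sketch; the deterministic update function stands in for a full-system simulator):
/* Toy deterministic "simulator": the state advances reproducibly each
 * cycle, so re-execution from a checkpoint always retraces the same path. */
typedef struct {
    long time;       /* current cycle count */
    unsigned state;  /* the simulated machine state */
} simulator;

typedef struct {
    long t;          /* time the checkpoint was taken, t < T */
    unsigned state;  /* full state at time t */
} checkpoint;

static void sim_step_one_cycle(simulator *sim)
{
    sim->state = sim->state * 1664525u + 1013904223u;  /* deterministic */
    sim->time++;
}

/* Reverse-step: to reach time T-1, restore a checkpoint taken at t < T
 * and deterministically re-execute (T - t - 1) cycles forward. */
static void reverse_step(simulator *sim, const checkpoint *cp, long T)
{
    sim->time  = cp->t;
    sim->state = cp->state;
    for (long c = 0; c < T - cp->t - 1; c++)
        sim_step_one_cycle(sim);
}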
Another source of information is a Virtutech white paper on checkpointing (which I wrote, in full disclosure).
During an EclipseCon session we also asked how they do this with the Chronon Debugger for Java. That one does not allow you to actually step back, but can play back a recorded program execution in such a way that it feels like reverse debugging. (The main difference is that you cannot change the running program in the Chronon debugger, while you can do that in most other Java debuggers.)
If I understood it correctly, it manipulates the bytecode of the running program so that every change of the program's internal state is recorded. External state doesn't need to be recorded separately: if it influences your program in some way, there must be an internal variable reflecting that external state (and that internal variable is enough).
During playback time they can then basically recreate every state of the running program from the recorded state changes.
Interestingly, the state changes are much smaller than one would expect at first glance. With a conditional if statement, you would think you need at least one bit to record whether the program took the then branch or the else branch. In many cases you can avoid even that, for instance when the two branches produce different return values: then it is enough to record only the return value (which is needed anyway) and to recompute which branch executed from the return value itself.
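A toy example of that trick (my illustration, not Chronon's actual encoding): when each branch produces a distinct return value, the recorded return value alone tells the playback engine which branch ran:
/* Each branch yields a distinct return value... */
int classify(int x)
{
    if (x > 0)
        return 1;    /* then-branch */
    else
        return -1;   /* else-branch */
}

/* ...so playback can recompute the branch decision from the value that
 * was recorded anyway, with no extra "which branch?" bit in the log. */
int branch_taken_was_then(int recorded_return_value)
{
    return recorded_return_value == 1;
}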
Although this question is old, most of the answers are too, and as reverse-debugging remains an interesting topic, I'm posting a 2015 answer. Chapters 1 and 2 of my MSc thesis, Combining reverse debugging and live programming towards visual thinking in computer programming, covers some of the historical approaches to reverse debugging (especially focused on the snapshot-(or checkpoint)-and-replay approach), and explains the difference between it and omniscient debugging:
The computer, having forward-executed the program up to some point, should really be able to provide us with information about it. Such an improvement is possible, and is found in what are called omniscient debuggers. They are usually classified as reverse debuggers, although they might more accurately be described as "history logging" debuggers, as they merely record information during execution to view or query later, rather than allow the programmer to actually step backwards in time in an executing program. "Omniscient" comes from the fact that the entire state history of the program, having been recorded, is available to the debugger after execution. There is then no need to rerun the program, and no need for manual code instrumentation.
Software-based omniscient debugging started with the 1969 EXDAMS system where it was called "debug-time history-playback". The GNU debugger, GDB, has supported omniscient debugging since 2009, with its 'process record and replay' feature. TotalView, UndoDB and Chronon appear to be the best omniscient debuggers currently available, but are commercial systems. TOD, for Java, appears to be the best open-source alternative, which makes use of partial deterministic replay, as well as partial trace capturing and a distributed database to enable the recording of the large volumes of information involved.
Debuggers that do not merely allow navigation of a recording, but are actually able to step backwards in execution time, also exist. They can more accurately be described as back-in-time, time-travel, bidirectional or reverse debuggers.
The first such system was the 1981 COPE prototype ...
Mozilla rr is a more robust alternative to GDB's built-in reverse debugging:
https://github.com/mozilla/rr
GDB's built-in record and replay has severe limitations, e.g. no support for AVX instructions: gdb reverse debugging fails with "Process record does not support instruction 0xf0d at address"
Upsides of rr:
- it is currently much more reliable; I have tested it on relatively long runs of several complex pieces of software
- it offers a GDB interface via the gdbserver protocol, making it a great replacement
- the performance drop is small for most programs; I haven't noticed it myself without doing measurements
- the generated traces are small on disk, because only the few non-deterministic events are recorded; I've never had to worry about their size so far
rr achieves this by first running the program in a way that records what happened on every single non-deterministic event such as a thread switch.
Then during the second replay run, it uses that trace file, which is surprisingly small, to reconstruct exactly what happened on the original non-deterministic run but in a deterministic way, either forwards or backwards.
rr was originally developed by Mozilla to help reproduce timing bugs that showed up in their nightly testing the following day. But the reverse-debugging aspect is also fundamental when you have a bug that only happens hours into execution, since you often want to step back to examine what previous state led to the later failure.
The following example showcases some of its features, notably the reverse-next, reverse-step and reverse-continue commands.
Install on Ubuntu 18.04:
sudo apt-get install rr linux-tools-common linux-tools-generic linux-cloud-tools-generic
sudo cpupower frequency-set -g performance
# Overcome "rr needs /proc/sys/kernel/perf_event_paranoid <= 1, but it is 3."
echo 'kernel.perf_event_paranoid=1' | sudo tee -a /etc/sysctl.conf
sudo sysctl -p
Test program:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int f() {
    int i;
    i = 0;
    i = 1;
    i = 2;
    return i;
}

int main(void) {
    int i;
    i = 0;
    i = 1;
    i = 2;
    /* Local call. */
    f();
    printf("i = %d\n", i);
    /* Is randomness completely removed?
     * Recently fixed: https://github.com/mozilla/rr/issues/2088 */
    i = time(NULL);
    printf("time(NULL) = %d\n", i);
    return EXIT_SUCCESS;
}
Compile and run:
gcc -O0 -ggdb3 -o reverse.out -std=c89 -Wextra reverse.c
rr record ./reverse.out
rr replay
Now you are left inside a GDB session, and you can properly reverse debug:
(rr) break main
Breakpoint 1 at 0x55da250e96b0: file a.c, line 16.
(rr) continue
Continuing.
Breakpoint 1, main () at a.c:16
16 i = 0;
(rr) next
17 i = 1;
(rr) print i
$1 = 0
(rr) next
18 i = 2;
(rr) print i
$2 = 1
(rr) reverse-next
17 i = 1;
(rr) print i
$3 = 0
(rr) next
18 i = 2;
(rr) print i
$4 = 1
(rr) next
21 f();
(rr) step
f () at a.c:7
7 i = 0;
(rr) reverse-step
main () at a.c:21
21 f();
(rr) next
23 printf("i = %d\n", i);
(rr) next
i = 2
27 i = time(NULL);
(rr) reverse-next
23 printf("i = %d\n", i);
(rr) next
i = 2
27 i = time(NULL);
(rr) next
28 printf("time(NULL) = %d\n", i);
(rr) print i
$5 = 1509245372
(rr) reverse-next
27 i = time(NULL);
(rr) next
28 printf("time(NULL) = %d\n", i);
(rr) print i
$6 = 1509245372
(rr) reverse-continue
Continuing.
Breakpoint 1, main () at a.c:16
16 i = 0;
When debugging complex software, you will likely run up to a crash point and then find yourself in a deep stack frame. In that case, don't forget that to reverse-next in higher frames, you must first:
reverse-finish
up to that frame; just doing the usual up is not enough.
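For example, an illustrative command sequence (not a recorded transcript), assuming the crash happened two frames below the one you want to step in:
(rr) reverse-finish
(rr) reverse-finish
(rr) reverse-next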
The most serious limitations of rr, in my opinion, are:
- https://github.com/mozilla/rr/issues/2089 you have to do a second replay from scratch, which can be costly if the crash you are trying to debug happens, say, hours into the execution
- https://github.com/mozilla/rr/issues/1373 x86 only
UndoDB is a commercial alternative to rr: https://undo.io
Both are trace/replay based, but I'm not sure how they compare in terms of features and performance.
Nathan Fellman wrote:
But does reverse debugging only allow you to roll back next and step commands that you typed, or does it allow you to undo any number of instructions?
You can undo any number of instructions. You're not restricted to, for instance,
only stopping at the points where you stopped when you were going forward. You can
set a new breakpoint and run backwards to it.
For instance, if I set a breakpoint on an instruction and let it run until then, can I then roll back to the previous instruction, even though I skipped over it?
Yes. So long as you turned on recording mode before you ran to the breakpoint.
Here is how another reverse debugger, called ODB, works. Extract:
Omniscient Debugging is the idea of collecting "time stamps" at each "point of interest" (setting a value, making a method call, throwing/catching an exception) in a program, and then allowing the programmer to use those time stamps to explore the history of that program run.
The ODB ... inserts code into the program's classes as they are loaded, and when the program runs, the events are recorded.
I'm guessing the gdb one works in the same kind of way.
Reverse debugging means you can run the program backwards, which is very useful for tracking down the cause of a problem.
You don't need to store the complete machine state for each step, only the changes. It is probably still quite expensive.