CUDA fmaf function [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 8 years ago.
I am trying to optimize some CUDA code. I replaced the expression
result = x*y+z
with
result = fmaf(x,y,z)
But it gives an error: CUDA error: kernel launch failure (7): too many resources requested for launch

As @JackOLantern indicated, the device code compiler will likely make this kind of optimization for you. You can compare the code emitted in the two cases by using:
nvcc -ptx -arch... mycode.cu
to see what kind of PTX code got emitted in each case, or:
cuobjdump -sass myapp
to see what kind of SASS (device machine code) got emitted in each case.
You haven't supplied any actual code, but "too many resources requested for launch" in the context of this question is most likely due to requesting too many registers per threadblock: (registers per thread) * (threads per block) must not exceed the maximum number of registers allowable per block, i.e. per multiprocessor.
You can determine the maximum registers allowable per block (registers per multiprocessor) for your device using the deviceQuery sample code or from the programming guide.
You can find out how many registers per thread the compiler is using by specifying:
-Xptxas -v
as additional command-line switches when compiling your code.
You can use the launch bounds qualifier (__launch_bounds__) to instruct the compiler to use fewer registers per thread.
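The register arithmetic above can be sketched in plain C on the host. This is only an illustration of the check, not how the CUDA runtime implements it; the function names are hypothetical, and the real register limit for your device should come from deviceQuery or the programming guide.

```c
#include <assert.h>
#include <stdbool.h>

/* Registers one thread block needs:
   (registers per thread) * (threads per block). */
unsigned int registers_per_block(unsigned int regs_per_thread,
                                 unsigned int threads_per_block)
{
    assert(regs_per_thread > 0 && threads_per_block > 0);
    return regs_per_thread * threads_per_block;
}

/* True when one block fits into the register file of a multiprocessor.
   regs_per_multiprocessor is an input you look up for your device. */
bool launch_fits(unsigned int regs_per_thread,
                 unsigned int threads_per_block,
                 unsigned int regs_per_multiprocessor)
{
    return registers_per_block(regs_per_thread, threads_per_block)
           <= regs_per_multiprocessor;
}
```

For example, with an assumed limit of 32768 registers, a kernel that -Xptxas -v reports at 40 registers per thread needs 40960 registers for 1024 threads per block, which does not fit, while 512 threads per block needs 20480, which does.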


Bitdefender detects my console application as Gen:Variant.Ursu.56053 [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 2 years ago.
I've developed a console application that runs a lot of routines, but the antivirus detects it as malware of type Gen:Variant.Ursu.56053.
How can I fix this without touching the antivirus policy? We are not allowed to create exceptions for any detected threat.
I'd also like to mention that if I change the assembly name, the antivirus no longer considers the new file a virus. It looks like it flags the file because I invoke it many times with different parameters.
Any suggestions? I'm really suffering from this.

I know this thread is very old, but for people who come here: to fix this issue, simply add an icon to the program. I'm not even joking, it works.

False positive alert! Many antivirus engines use name pattern matching as their Swiss Army knife for detecting malicious files. If your file's name matches one in their database, there is not much you can do about it; it simply becomes a false positive. Also, your assembly name should consist of the technology area and component description, or company name and technology area (depending on your preference). So try changing it to a more specific one. :)
Assuming that you are talking about .NET (in relation to Visual Studio), for example:
Project: Biometric Device Access
Assembly: BiometricFramework.DeviceAccess.dll
Namespace: ACME.BiometricFramework.DeviceAccess

I had the same problem with Bitdefender, but mine was a Gen:Variant.Ursu.787553 when I tried creating a .exe file from my C program.
I simply moved it out of quarantine manually, and it worked well. You might have to do that every time you build a new program. Hope this helps!

How does a hardware interrupt trigger software handlers without any prior setup [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 4 years ago.
I am currently learning about processor interrupts, and have run into some confusion. From what I understand, a processor has a set of external interrupts for peripherals. That way manufacturers can provide a way to interrupt the processor via their own peripherals. I know that with this particular processor (ARM Cortex M0+), once an external interrupt line is triggered, it will go to its vector table and the corresponding interrupt request offset and (I could be wrong here) will execute the ARM thumb code at that address.
And if I understand correctly, some processors will look at the value at said IRQ address, which will point to the address of the interrupt handler.
Question 1
While learning about the ARM Cortex M0+ vector table, what is the thumb code doing at that address? I am assuming it is doing something like setting the PC register to the interrupt handler address, but that is just a stab in the dark.
Question 2
Also, the only way I have found so far to handle the EIC interrupts is to use the following snippet:
void EIC_Handler() {
// Code to handle interrupt
}
I am perplexed how this function is called without setup or explicit reference to it in my actual C code. How does the program go from the vector table lookup to calling this function?
EDIT #1:
I was wrong about the vector table containing thumb code. The vector table contains addresses to the exception handlers.
EDIT #2:
Despite getting the answer I was looking for, my question apparently wasn't specific enough or was "off-topic", so let me clarify.
While reading/learning from multiple resources on how to handle external interrupts in software, I noticed every source was saying to just add the code snippet above. I was curious how the interrupt went from hardware, all the way to calling my EIC_Handler() without me setting anything up other than defining the function and the EIC. So I researched what a vector table is and how the processor will go to certain parts of it when different interrupts happen. That still didn't answer my question, as I wasn't setting up the vector table myself, yet my EIC_Handler() function was still being called.
So somehow, at compile time, the vector table had to be created with the corresponding IRQ entry pointing to my EIC_Handler(). I searched through a good amount of SAML22 and Cortex M0+ documentation (and misread that the vector table contained thumb code) but couldn't find anything on how the vector table was being set up, which is why I decided to look for an answer here. And I got one!
I found that the IDE (Atmel Studio) and the project configuration I had chosen came with a small file defining weak functions, an implementation of the reset handler, and the vector table. There was also a custom linker script that takes the addresses of those functions and puts them into the vector table. If I implement one of the weak functions, the table entry points to my implementation, which is called when the corresponding interrupt request occurs.

For the Cortex M0 (and other cortexes? corticies?) the vector table doesn't contain thumb code, it is a list of addresses of functions which are the implementation of your exception handlers.
When the processor gets an exception it first pushes a stack frame (xPSR, PC, LR, R12, R3-R0) to the currently active stack pointer (MSP or PSP), it then fetches the address of the exception handler from the vector table, and then starts running code from that location.
The processor returns from the exception when the handler executes a POP instruction that loads the PC, or a BX instruction, with the special EXC_RETURN value that was placed in LR on entry. It then unstacks the stack frame which was pushed and carries on executing from where it left off. This process is explained in the Cortex M0+ User Guide - Exception Entry And Exit
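As a rough illustration of the stacked frame described above, the eight words can be written as a C struct, lowest address first (the reverse of the push order listed). The type and helper names here are made up for this sketch and do not come from any vendor header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Registers stacked on exception entry, lowest address first.
   r0 sits at the stack pointer value seen on handler entry. */
typedef struct {
    uint32_t r0;
    uint32_t r1;
    uint32_t r2;
    uint32_t r3;
    uint32_t r12;
    uint32_t lr;    /* link register of the interrupted code */
    uint32_t pc;    /* return address: where execution resumes */
    uint32_t xpsr;  /* program status register */
} exception_frame_t;

/* The frame is exactly eight 32-bit words. */
static_assert(sizeof(exception_frame_t) == 8 * sizeof(uint32_t),
              "exception frame is eight words");

/* Hypothetical helper: read the return address from a stacked frame,
   e.g. when inspecting a fault in a debugger or fault handler. */
uint32_t frame_return_address(const exception_frame_t *frame)
{
    return frame->pc;
}
```

A layout like this is mostly useful for debugging: given the stack pointer that was active when the exception was taken, the stacked PC tells you which instruction was interrupted.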
For question 2, the vector table in the Cortex M0/M0+ is usually located at address 0x00000000. Some Cortex M0/M0+ implementations allow remapping of the vector table using a vector table offset register within the system control block, others allow you to remap which memory is available at address 0x00000000.
Depending on which tool set/library you're using there are different ways of defining the vector table, and saying where it should live in memory.
There are usually weakly linked functions with the name of the exceptions available for your microcontroller, which when you implement them in your source files are linked instead of the weak functions, and their addresses get put into the vector table.
I have no experience with Atmel-based ARMs, but @Lundin in the comments says the vector table is located in a "startup_samxxx.c" file. If you've started from scratch, it is up to you to ensure you have a suitable vector table and that it's located in a sensible place.
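The weak-function mechanism can be sketched in host C, assuming a GCC or Clang toolchain (the weak attribute is a compiler extension). This is not the contents of any real startup file: RTC_Handler, default_calls and simulate_irq are invented for the illustration, and a real table would also begin with the initial stack pointer and the reset handler and be placed at a fixed address by the linker script.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*handler_t)(void);

int default_calls = 0;  /* counts interrupts that hit a default handler */

/* Weak default definitions, as a startup file might provide them.
   A normal (strong) definition of the same name in another source
   file replaces the weak one at link time. */
__attribute__((weak)) void EIC_Handler(void) { default_calls++; }
__attribute__((weak)) void RTC_Handler(void) { default_calls++; }

/* The table holds addresses only; the linker fills in whichever
   definition of each handler won. */
static const handler_t vector_table[] = { EIC_Handler, RTC_Handler };

/* Stand-in for the hardware: fetch the address for this IRQ and call it. */
void simulate_irq(size_t irq)
{
    assert(irq < sizeof vector_table / sizeof vector_table[0]);
    vector_table[irq]();
}
```

With only this file linked, both entries resolve to the weak defaults. Adding a second source file that defines void EIC_Handler(void) without the weak attribute would make the first entry point at that function instead, with no other code changes, which is why defining the function is all the setup the application appears to need.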

CC2540 SPI SD card

I am working on a CC2540 with 128K Flash. My target is to build an SPI interface between the CC2540 and an SD card. So far I have built the interface using Chan's library and the SimpleBLEPeripheral example, without errors or warnings. But when I try to call SD_SPI_initialization() from osal_init_tasks or the Periodic function, everything stops.
I need to understand some basic points in order to proceed! Has anyone built an interface like this and can give some guidelines?
I have already asked about this on the TI forum, but got no answers.
I also thought to use the HostTestRelease sample/project and especially, the CC2540SPI version but it gives some errors concerning stack or OSAL_CB_TIMER.

why is LaTeX / pdflatex compiler so 'funky' with multiple compiles necessary and bogus error messages, etc, compared to c++? [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Closed 10 years ago.
Is there a simple explanation for why the latex / pdflatex compiler is funky in the following two ways:
1) N multiple compiles are necessary until you reach a "steady state" version. N seems to grow up to around 5 or 6 if I use many packages and references.
2) Error messages are almost always worthless. The actual error is not flagged. Example:
\begin{itemize} % Line 499
\begin{enumerate}
% Comment: error: forgot to close the enumerate block
\item This is a bullet point.
\end{itemize} % Line 503
result: "Error on line 1 while scanning \begin{document}", not very useful.
I realize there is a separate "tex exchange" but I'm wondering if someone knowledgeable about c++, java, or other compilers can provide some insight on how those seem to support single-compile and proper error localization.
Edit: this document seems like a rant justifying the hacks in latex's implementation, but what about latex's syntax/language properties make the weird implementation necessary? http://tug.org/texlive/Contents/live/texmf-dist/doc/generic/knuth/tex/tex.pdf

From a LaTeX point of view:
You should require at most 3 (...maybe 4) compiles to reach a steady state. This depends not on the number of packages, but on possible layout changes within your document. Layout changes cause references to move, and these references need to be correct (hence the recompile until they don't move).
Nesting of environments is allowed (although this does not address your problem directly). Also, macro definitions act as replacement text for your input. So, even though you write \end{itemize}, it is actually transformed into a bunch of other/different (primitive) macros, removing the obvious-to-humans structure and consequently producing the bizarre error message. That's why some of the error messages are difficult to interpret.
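The recompile-until-stable behaviour can be pictured with a toy model in C. This is not how TeX works internally: the page capacity, the text length and all function names are invented. It only shows why a reference whose printed width affects the layout can need more than one pass to settle.

```c
#include <assert.h>

#define CHARS_PER_PAGE    1000  /* hypothetical page capacity */
#define TEXT_BEFORE_LABEL 998   /* hypothetical text before the label */

/* Printed width of a page reference; 0 means "not known yet"
   and is printed as "??". */
static int reference_width(int recorded_page)
{
    int width = 0;
    if (recorded_page == 0)
        return 2;
    for (int p = recorded_page; p > 0; p /= 10)
        width++;
    return width;
}

/* One simulated compile: typeset with the page number recorded by the
   previous pass and return the page the label actually lands on. */
int typeset_pass(int recorded_page)
{
    assert(recorded_page >= 0);
    int chars = TEXT_BEFORE_LABEL + reference_width(recorded_page);
    return chars / CHARS_PER_PAGE + 1;
}

/* Recompile until the recorded page matches the actual one.
   Returns the number of passes, or -1 if max_passes was not enough. */
int passes_until_stable(int max_passes)
{
    int recorded = 0;
    for (int pass = 1; pass <= max_passes; pass++) {
        int actual = typeset_pass(recorded);
        if (actual == recorded)
            return pass;
        recorded = actual;
    }
    return -1;
}
```

With these made-up numbers the first pass prints "??" and pushes the label onto page 2, the second pass prints the narrower "2" and the label moves back to page 1, and the third pass confirms page 1, so passes_until_stable(10) settles after three passes.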

wrt. point (2):
Considering that most of the errors are picked up while parsing macro definitions that get expanded, my guess is that the errors wouldn't be useful to the user even if they contained the location and specific cause, because they don't translate well into what you see when you view the code.
Still, it would be useful if they were just a little bit more explicit :/

LGPL grammar file licensing [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 7 years ago.
Given an LGPL'ed grammar file, is the source generated by a compiler-compiler for the grammar a derivative work? What if the grammar file was modified before it was given as input to the compiler-compiler? There isn't any linking, at least not in the conventional sense.
If the output is a derivative work, must I then simply provide the (modified) grammar sources, making best efforts to ensure the grammar will function without dependencies imposed by the program/library using it? Or are there more restrictions which must be resolved?

1) Since the grammar contains the essence of the resulting code, it definitely belongs to "all the source code needed to generate, install, and (for an executable work) run the object code and to modify the work, including scripts to control those activities" and is not a part of "the work's System Libraries, or general-purpose tools or generally available free programs which are used unmodified in performing those activities but which are not part of the work". In brief, LGPLv3 applies.
So, you need to convey the "Minimal Corresponding Source" (the one used to build the version in the Combined Work) according to sec.4 d) 0) or GPLv3 sec.6, mark it as modified if it is and possibly include custom tools if required by GPL's definition of "Corresponding Source". (In general, as sec.0 says, LGPLv3 is effectively GPLv3 with a few additional provisions.)
2) It might be a derivative work of the generator used as well if the latter inserts parts of itself into the code (see FSF FAQ#Can I use GPL-covered tools... to compile...?) - check the generator's workings and licensing terms if necessary. If it is, you'll have to satisfy both LGPLv3 and the generator's terms that apply to the results of its work.

The best answer, and the one everyone should be giving you, is as follows:
Contact a lawyer

Disclaimer: IANAL and if you want something "official" you should talk to one. That said...
A common-sense approach says that yes, the result of compilation of something that is compilable is a derivative work. For instance, the compiled version of an LGPL library is still LGPL - you can't say that you obtained a compiled version of the library and never compiled it yourself and somehow dodge providing the source code that way.
Thus, the LGPL would require you to distribute the (potentially modified) source of the original LGPL work, such that if an individual wanted to further modify the work, they could.
