How is heuristic-based virus detection possible? - malware

The Halting Problem says that, in general, it is impossible for one program to predict the output of another, or even whether it will terminate.
That got me thinking... how do heuristic-based scanners decide whether a given executable program's instructions are "virus-like", seeing as that would seem to involve predicting what the program is going to do?

Usually viruses use some kind of "pattern" in their code, such as opening particular registry keys, calling rarely used system functions, or modifying their own code. The analyzer can "see" these actions and mark such a program as a potential virus; of course this comes with some percentage of false alarms.
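To make the idea concrete, here is a minimal sketch (not any real engine's logic) of a purely static heuristic: scan a file for a handful of Windows API names often cited as suspicious, add up weights, and flag the file when a score threshold is crossed. There is no attempt to predict the program's exact behaviour, so no conflict with the halting problem, just pattern matching with an accepted false-alarm rate. The chosen indicators, weights, and threshold are invented for the example:

// heuristic_scan.cpp -- toy static heuristic scanner, for illustration only
#include <fstream>
#include <iostream>
#include <iterator>
#include <string>
#include <utility>
#include <vector>

int main(int argc, char* argv[]) {
    if (argc < 2) { std::cerr << "usage: heuristic_scan <file>\n"; return 1; }

    // Hypothetical "virus-like" indicators with weights. Real engines use far
    // richer signals (imports, section entropy, self-modifying code, emulation).
    const std::vector<std::pair<std::string, int>> indicators = {
        {"RegSetValueEx", 2},       // writes to the registry
        {"CreateRemoteThread", 4},  // classic code-injection API
        {"WriteProcessMemory", 4},  // tampering with another process
        {"URLDownloadToFile", 3},   // downloads a payload
    };

    std::ifstream in(argv[1], std::ios::binary);
    std::string data((std::istreambuf_iterator<char>(in)),
                     std::istreambuf_iterator<char>());

    int score = 0;
    for (const auto& ind : indicators)
        if (data.find(ind.first) != std::string::npos)
            score += ind.second;

    // The threshold is arbitrary here; tuning it trades detection rate
    // against false alarms, exactly the trade-off described above.
    std::cout << (score >= 6 ? "suspicious" : "probably clean")
              << " (score " << score << ")\n";
    return 0;
}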


What is the difference between Symbolic and Concrete model checking when the search is bounded in time?

Could someone please spend a few words explaining, to someone who does not come from a formal-methods background, the difference between verifying a specification using Symbolic Model Checking and doing the same using Concrete Model Checking, when the search is bounded in time? I am referring to the definitions of SMC and Concrete MC used in UPPAAL.
In particular, I wrote a program that uses the UPPAAL Java API to verify a query against a network of timed automata. If the query is verified, UPPAAL returns a symbolic trace to parse, or something else if it is not. If the verification takes too long, I halt the verification process, return a message, and move on to the next query to verify. Everything is good so far.
Recently, I have been playing around with UPPAAL Stratego, which natively offers the possibility of choosing a maximum time or depth of exploration to bound the search. However, these options can be applied only when the verification is carried out using Concrete Model Checking.
My question is: is there any difference between halting the symbolic verification process, as I do in my Java program, and what UPPAAL Stratego does natively? In both cases I don't get an answer (or a trace), but what about the "reliability" of the exploration?
Which of the two options would be better (i.e. more complete): halting the symbolic verification or halting the concrete verification?
My understanding so far is that in Symbolic Model Checking the possible states are defined using intervals of variables, whilst in Concrete Model Checking variables assume an actual value. My view is that, in terms of "completeness", halting the SMC after some time is more "complete", since the exploration of the state space happens systematically using a BFS or DFS algorithm and, if I use BFS, I can be "sure" that within N steps nothing bad happens. But again, my background in model checking is not rich, so I might have gotten it completely wrong.
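As a rough illustration of the "within N steps nothing bad happens" intuition, a depth-bounded breadth-first search over a concrete state graph can be sketched as follows; the transition relation and the "bad" predicate below are invented for the sketch, and UPPAAL itself of course explores symbolic zones rather than single integer states:

#include <cstdio>
#include <queue>
#include <set>
#include <utility>
#include <vector>

int main() {
    const int bound = 5;                 // maximum number of steps explored
    auto successors = [](int s) {        // hypothetical transition relation
        return std::vector<int>{s + 1, s * 2};
    };
    auto bad = [](int s) { return s == 13; };  // hypothetical safety violation

    std::set<int> visited{0};
    std::queue<std::pair<int, int>> frontier;  // (state, depth), BFS order
    frontier.push({0, 0});

    while (!frontier.empty()) {
        auto [s, depth] = frontier.front();
        frontier.pop();
        if (bad(s)) {
            std::printf("bad state %d reachable at depth %d\n", s, depth);
            return 1;
        }
        if (depth == bound) continue;    // do not explore past the bound
        for (int next : successors(s))
            if (visited.insert(next).second)
                frontier.push({next, depth + 1});
    }
    std::printf("no bad state reachable within %d steps\n", bound);
    return 0;
}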
Also, if you could point me to any references on these strategies, it would be really appreciated.
Thanks!

What is the purpose of Vulkan's new extension VK_KHR_vulkan_memory_model?

Khronos just released their new memory model extension, but there has yet to be any informal discussion, example implementation, etc., so I am confused about the basic details.
https://www.khronos.org/blog/vulkan-has-just-become-the-worlds-first-graphics-api-with-a-formal-memory-model.-so-what-is-a-memory-model-and-why-should-i-care
https://www.khronos.org/registry/vulkan/specs/1.1-extensions/html/vkspec.html#memory-model
What do these new extensions try to solve exactly? Are they trying to solve synchronization problems at the language level (say, to remove onerous mutexes in your C++ code), or is it a new and complex set of features to give you more control over how the GPU deals with synchronization internally?
(Speculative question) Would it be a good idea to learn and incorporate this new model in the general case or would this model only apply to certain multi-threaded patterns and potentially add overhead?
Most developers won't need to know about the memory model in detail, or use the extensions. In the same way that most C++ developers don't need to be intimately familiar with the C++ memory model (and this isn't just because of x86, it's because most programs don't need anything beyond using standard library mutexes appropriately).
But the memory model allows specifying Vulkan's memory coherence and synchronization primitives with a lot less ambiguity -- and in some cases, additional clarity and consistency. For the most part the definitions didn't actually change: code that was data-race-free before continues to be data-race-free. For a few developers doing advanced or fine-grained synchronization, the additional precision and clarity allows them to know exactly how to make their programs data-race-free without using expensive overly-strong synchronization.
Finally, in building the model the group found a few things that were poorly-designed or broken previously (many of them going all the way back into OpenGL). They've been able to now say precisely what those things do, whether or not they're still useful, and build replacements that do what was actually intended.
The extension advertises that these changes are available, but even more than that, once the extension is final instead of provisional, it will mean that the implementation has been validated to actually conform to the memory model.
Among other things, it enables the same kind of fine-grained memory ordering guarantees for atomic operations that are described for C++ here. I would venture to say that many, perhaps even most, PC C++ developers don't really know much about this because the x86 architecture makes explicit memory ordering largely superfluous.
However, GPUs are not x86, and compute operations executed massively in parallel on GPU shader cores can, and in some cases must, use explicit ordering guarantees to be valid when working against shared data sets.
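For readers unfamiliar with what "fine-grained memory ordering guarantees" looks like in practice, here is the standard C++ release/acquire idiom the answer alludes to. This is ordinary CPU-side C++, not Vulkan or SPIR-V code; it is only meant to illustrate the kind of guarantee the Vulkan memory model now pins down for shader code:

#include <atomic>
#include <cassert>
#include <thread>

std::atomic<bool> ready{false};
int payload = 0;   // plain, non-atomic data

void producer() {
    payload = 42;                                  // write the data first
    ready.store(true, std::memory_order_release);  // then publish the flag
}

void consumer() {
    while (!ready.load(std::memory_order_acquire)) { /* spin */ }
    // The acquire load synchronizes with the release store, so the plain
    // write to 'payload' is guaranteed to be visible here: no data race.
    assert(payload == 42);
}

int main() {
    std::thread t1(producer), t2(consumer);
    t1.join();
    t2.join();
    return 0;
}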
This video is a good presentation on atomics and ordering as it applies to C++.

Optimizing branch prediction: how to generalize code that could run with different compiler, interpreter, and hardware prediction?

I ran into some slowdowns in a tight loop today caused by an if statement, which surprised me somewhat because I expected branch prediction to successfully pipeline the particular statement and minimize the cost of the conditional.
When I sat down to think more about why it wasn't better handled, I realized I didn't know much about how branch prediction was being handled at all. I know the concept of branch prediction quite well and its benefits, but the problem is that I didn't know who was implementing it and what approach they were utilizing for predicting the outcome of a conditional.
Looking deeper, I know branch prediction can be done at a few levels:
The hardware itself, with instruction pipelining
A compiler for a compiled language such as C++
The interpreter of an interpreted language
A half-compiled language like Java may do both two and three above.
However, because optimization can be done in many areas, I'm left uncertain as to how to anticipate branch prediction. If I'm writing in Java, for example, is my conditional optimized when compiled, when interpreted, or by the hardware after interpretation!? More interestingly, does this mean it matters if someone uses a different runtime environment? Could a different branch prediction algorithm used in a different interpreter result in a tight loop based around a conditional showing significantly different performance depending on which interpreter it's run with?
Thus my question: how does one generalize an optimization around branch prediction if the software could be run on very different computers, which may mean different branch prediction? If the hardware and interpreter could change their approach, then profiling and using whichever approach proved fastest isn't a guarantee. Let's ignore C++, where you have compile-level ability to force this, and look at interpreted languages, if someone still needed to optimize a tight loop within them.
Are there certain presumptions that are generally safe to make regardless of interpreter used? Does one have to dive into the intricate specification of a language to make any meaningful presumption about branch prediction?
Short answer:
To help improve the performance of the branch predictor try to structure your program so that conditional statements don't depend on apparently random data.
Details
One of the other answers to this question claims:
There is no way to do anything at the high level language to optimize for branch prediction, caching sure, sometimes you can, but branch prediction, no not at all.
However, this is simply not true. A good illustration of this fact comes from one of the most famous questions on Stack Overflow.
All branch predictors work by identifying patterns of repeated code execution and using this information to predict the outcome and/or target of branches as necessary.
When writing code in a high-level language it's typically not necessary for an application programmer to worry about trying to optimize conditional branches. For instance, gcc has the __builtin_expect built-in, which allows the programmer to specify the expected outcome of a conditional branch. But even if an application programmer is certain they know the typical outcome of a specific branch, it's usually not necessary to use the annotation. In a hot loop, using this directive is unlikely to help improve performance. If the branch really is strongly biased, the predictor will be able to correctly predict the outcome most of the time even without the programmer's annotation.
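For completeness, this is roughly what such an annotation looks like. It is a sketch: __builtin_expect is a gcc/clang extension rather than standard C++, the function and data are made up, and, as noted above, the hint mostly influences code layout because the hardware predictor would learn this bias on its own:

#include <cstdio>

// Sum only the non-negative elements; the hint says the branch is
// expected to be taken.
long sum_non_negative(const long* data, long n) {
    long total = 0;
    for (long i = 0; i < n; ++i)
        if (__builtin_expect(data[i] >= 0, 1))   // "expected to be true"
            total += data[i];
    return total;
}

int main() {
    long data[] = {3, -1, 4, 1, -5, 9};
    std::printf("%ld\n", sum_non_negative(data, 6));
    return 0;
}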
On most modern processors branch predictors perform incredibly well (better than 95% accurate even on complex workloads). So as a micro-optimization, trying to improve branch prediction accuracy is probably not something that an application programmer would want to focus on. Typically the compiler is going to do a better job of generating optimal code that works for the specific hardware platform it is targeting.
But branch predictors rely on identifying patterns, and if an application is written in such a way that patterns don't exist, then the branch predictor will perform poorly. If the application can be modified so that there is a pattern then the branch predictor has a chance to do better. And that is something you might be able to consider at the level of a high-level language, if you find a situation where a branch really is being poorly predicted.
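A small illustration of that last point, modelled on the famous "why is processing a sorted array faster" Stack Overflow question: the same data-dependent branch is hard to predict on shuffled input but follows an easy pattern once the input is sorted. The exact speedup depends on compiler and hardware, and some compilers will remove the branch entirely with a conditional move or vectorization, which is itself a reminder that the compiler usually gets there first:

#include <algorithm>
#include <chrono>
#include <cstdio>
#include <random>
#include <vector>

int main() {
    std::vector<int> data(1 << 20);
    std::mt19937 rng(42);
    for (int& x : data) x = static_cast<int>(rng() % 256);

    auto run = [&](const char* label) {
        auto t0 = std::chrono::steady_clock::now();
        long long sum = 0;
        for (int pass = 0; pass < 100; ++pass)
            for (int x : data)
                if (x >= 128)        // data-dependent branch
                    sum += x;
        auto t1 = std::chrono::steady_clock::now();
        double ms = std::chrono::duration<double, std::milli>(t1 - t0).count();
        std::printf("%-8s sum=%lld  %.0f ms\n", label, sum, ms);
    };

    run("shuffled");                       // branch outcome looks random
    std::sort(data.begin(), data.end());   // same work, predictable branch
    run("sorted");
    return 0;
}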
Branch prediction, like caching and pipelining, is something done to make code run faster in general, overcoming bottlenecks in the system (super slow cheap DRAM, which all DRAM is; all the layers of busses between X and Y; etc.).
There is no way to do anything at the high level language to optimize for branch prediction, caching sure, sometimes you can, but branch prediction, no not at all. In order to predict, the core has to have the branch in the pipe along with the instructions that precede it, and across architectures and implementations it is not possible to find one rule that works. Often it is not possible even within one architecture and implementation from the high-level language.
You could also easily end up in a situation where, by tuning for branch prediction, you de-tune the cache, pipeline, or other optimizations you might want to use instead. And overall performance is first and foremost application specific; after that it is something tuned to that application, not something generic.
For as much as I like to preach and do optimizations at the high-level-language level, branch prediction is one that falls into the premature optimization category. Just enable it in the core if it is not already enabled; sometimes it saves you a couple of cycles, most of the time it doesn't, and depending on the implementation it can cost more cycles than it saves. Like a cache, it has to do with hits vs misses: if it guesses right you have code in a faster RAM sooner on its way to the pipe; if it guesses wrong you have burned bus cycles that could have been used by code that was going to be run.
Caching is usually a benefit (although it is not hard to write high-level code that shows it costing performance instead of saving it), as code usually runs linearly for some number of instructions before branching. Likewise, data is accessed in order often enough to overcome the penalties. Branching is not something we do on every instruction, and where we branch to does not have a common answer.
Your backend could try to tune for branch prediction by having the pre-branch decisions happen a few cycles before the branch, but all within a pipe size and tuned for fetch-line or cache-line alignments. Again, this messes with tuning for other features in the core.

What is the difference between synthesis and simulation? (VHDL)

I am working on a VHDL project that includes an FSM.
Some states change according to a counter. It did not work until I put 'clk' in the sensitivity list, besides the current state and the input.
I know that during synthesis the sensitivity list is not used, or is discarded. But how can that have such an impact on the result in simulation? If I leave this 'clk' in, would the FSM perform as I want on an FPGA?
thanks,
David
This is the simple explanation:
The simulator uses the sensitivity list to figure out when it needs to run the process. The reason why the simulator needs hints to figure out when to run the process is because computer processors can only do one (or only a few in multicore systems) thing at a time and the processor will have to take turns running each part of your design. The sensitivity list allows simulation to run in a reasonable time frame.
When you synthesize code into an ASIC or FPGA, the process is always "running" since it has dedicated hardware.
When you simulate a state machine without the clock in the sensitivity list, the process will never run on the clock edges, but only on changes to your input. If you have the state transition implemented as a flip flop (if clk'event and clk = '1') then your state transition will never be simulated unless you happen to change your input at the same time as the clock's rising edge.
You should probably leave the clock in the sensitivity list, assuming the FSM changes on clock edges.
Also, try to proofread your questions.
Synthesis tools focus on logic design (FPGA, ASIC) and ignore sensitivity lists because there are only three basic types of logic: Combinational logic, edge sensitive storage (flip-flops and some RAM), and level sensitive storage (latches and some RAM).
Combinational logic requires all input signals to be on the sensitivity list. From a synthesis tool perspective, if one is missing, they can either ignore the sensitivity list and treat it as if all inputs were on the sensitivity list, or produce some complicated combination of flip-flops and combinational logic that probably will not do what the user wanted anyway. Both of these have an implementation cost to the vendor, hence, why invest money (development time) to create something that is not useful. As a result, the only good investment is to simplify and ignore the sensitivity list.
Simulators, on the other hand, have a bigger perspective than just logic design. The language defines sensitivity lists as indicating when the code should run, so simulators implement that semantic with high fidelity.
Long term it may make you happy to know that VHDL-2008 allows the keyword "all" to be used in a sensitivity list to replace the signal inputs there. This is intended to simplify the modeling of combinational logic. Syntax is as follows:
MyStateMachine : process(all)
begin
-- my statemachine logic
end process MyStateMachine ;
Turn on the VHDL-2008 switch and try it out in your synthesis tool. If it does not work, be sure to submit a bug against the tool.

disambiguating HPCT artificial intelligence architecture

I am trying to construct a small application that will run on a robot with very limited sensory capabilities (NXT with gyroscope/ultrasonic/touch) and the actual AI implementation will be based on hierarchical perceptual control theory. I'm just looking for some guidance regarding the implementation as I'm confused when it comes to moving from theory to implementation.
The scenario
My candidate scenario will have two behaviours: the first is to avoid obstacles, the second is to drive in a circular motion based on a given diameter.
The problem
I've read several papers but could not determine how I should classify my virtual machines (layers of behaviour?) and how they should communicate with lower levels and resolve internal conflicts.
This is the list of papers I've gone through to find my answers, but sadly I could not:
pct book
paper on multi-legged robot using hpct
pct alternative perspective
The following ideas are the results of my brainstorming:
The avoidance layer would be part of my 'sensation layer', because it only identifies certain values, like close objects, e.g. a specific range of ultrasonic sensor values. The second layer would be part of the 'configuration layer', as it would try to detect the pattern in which the robot is driving (straight line, random, circle, or even not moving at all) using the gyroscope and motor readings. The 'intensity layer' represents all raw sensor values, so it's not something to consider as part of the design.
The second idea is to have both layers as 'configuration', because they would be responding to direct sensor values from the 'intensity layer', and they would be represented in a mesh-like design where each layer can send its reference values to the lower layer that interfaces with the actuators.
My problem here is how conflicting behaviour would be handled (manoeuvring around objects while continuing to run in circles). Should it be similar to Subsumption, where certain layers get suppressed/inhibited and there is some sort of priority system? Forgive my short explanation, as I did not want to make this a lengthy question.
/Y
Here is an example of a robot which implements HPCT and addresses some of the issues relevant to your project, http://www.youtube.com/watch?v=xtYu53dKz2Q.
It is interesting to see a comparison of these two paradigms, as they both approach the field of AI at a similar level, that of embodied agents exhibiting simple behaviors. However, there are some fundamental differences between the two which means that any comparison will be biased towards one or the other depending upon the criteria chosen.
The main difference is that of biological plausibility. The Subsumption architecture, although inspired by some aspects of biological systems, is not intended to theoretically represent such systems. PCT, on the other hand, is exactly that: a theory of how living systems work.
As far as PCT is concerned then, the most important criterion is whether or not the paradigm is biologically plausible, and criteria such as accuracy and complexity are irrelevant.
The other main difference is that Subsumption concerns action selection whereas PCT concerns control of perceptions (control of output versus control of input), which makes any comparison on other criteria problematic.
I had a few specific comments about your dissertation, on points that may need clarification or may be typos.
"creatures will attempt to reach their ultimate goals through alternating their behaviour" - do you mean altering?
"Each virtual machine's output or error signal is the reference signal of the machine below it" - A reference signal can be a function of one or more output signals from higher-level systems, so more strictly this would be, "Each virtual machine's output or error signal contributes to the reference signal of a machine at a lower level".
"The major difference here is that Subsumption does not incorporate the ideas of 'conflict' " - Well, it does as the purpose of prioritising the different layers, and sub-systems, is to avoid conflict. Conflict is implicit, as there is not a dedicated system to handle conflicts.
"'reorganization' which require considering the goals of other layers." This doesn't quite capture the meaning of reorganisation. Reorganisation happens when there is prolonged error in perceptual control systems, and is a process whereby the structure of the systems changes. So rather than just the reference signals changing the connections between systems or the gain of the systems will change.
"Design complexity: this is an essential property for both theories." Rather than an essential property, in the sense of being required, it is a characteristic, though it is an important property to consider with respect to the implementation or usability of a theory. Complexity, though, has no bearing on the validity of the theory. I would say that PCT is a very simple theory, though complexity arises in defining the transfer functions, but this applies to any theory of living systems.
"The following step was used to create avoidance behaviour:" Having multiple nodes for different speeds seem unnecessarily complex. With PCT it should only be necessary to have one such node, where the distance is controlled by varying the speed (which could be negative).
Section 4.2.1 "For example, the avoidance VM tries to respond directly to certain intensity values with specific error values." This doesn't sound like PCT at all. With PCT, systems never respond with specific error (or output) values, but change the output in order to bring the intensity (in this case) input in to line with the reference.
"Therefore, reorganisation is required to handle that conflicting behaviour. I". If there is conflict reorganisation may be necessary if the current systems are not able to resolve that conflict. However, the result of reorganisation may be a set of systems that are able to resolve conflict. So, it can be possible to design systems that resolve conflict but do not require reorganisation. That is usually done with a higher-level control system, or set of systems; and should be possible in this case.
In this section there is no description of what the controlled variables are, which is of concern. I would suggest being clear about what are goal (variables) of each of the systems.
"Therefore, the designed behaviour is based on controlling reference values." If it is only reference values that are altered then I don't think it is accurate to describe this as 'reorganisation'. Such a node would better be described as a "conflict resolution" node, which should be a higher-level control system.
Figure 4.1. The links annotated as "error signals" are actually output signals. The error signals are the links between the comparator and the output.
"the robot never managed to recover from that state of trying to reorganise the reference values back and forth." I'd suggest the way to resolve this would be to have a system at a level above the conflicted systems, and takes inputs from one or both of them. The variable that it controls could simply be something like, 'circular-motion-while-in-open-space', and the input a function of the avoidance system perception and then a function of the output used as the reference for the circular motion system, which may result in a low, or zero, reference value, essentially switching off the system, thus avoiding conflict, or interference. Remember that a reference signal may be a weighted function of a number of output signals. Those weights, or signals, could be negative so inhibiting the effect of a signal resulting in suppression in a similar way to the Subsumption architecture.
"In reality, HPCT cannot be implemented without the concept of reorganisation because conflict will occur regardless". As described above HPCT can be implemented without reorganisation.
"Looking back at the accuracy of this design, it is difficult to say that it can adapt." Provided the PCT system is designed with clear controlled variables in mind PCT is highly adaptive, or resistant to the effects of disturbances, which is the PCT way of describing adaption in the present context.
In general, it may just require clarification in the text, but as there is a lack of description of controlled variables in the model of the PCT implementation, and as it seems some 'behavioural' modules used were common to both implementations, it makes me wonder whether PCT feedback systems were actually used, or whether it was just the concept of the hierarchical architecture that was being contrasted with that of the Subsumption paradigm.
I am happy to provide more detail of HPCT implementation though it looks like this response is somewhat overdue and you've gone beyond that stage.
Partial answer from RM of the CSGnet list:
https://listserv.illinois.edu/wa.cgi?A2=ind1312d&L=csgnet&T=0&P=1261
Forget about the levels. They are just suggestions and are of no use in building a working robot.
A far better reference for the kind of robot you want to develop is the CROWD program, which is documented at http://www.livingcontrolsystems.com/demos/tutor_pct.html.
The agents in the CROWD program do most of what you want your robot to do. So one way to approach the design is to try to implement the control systems in the CROWD programs using the sensors and outputs available for the NXT robot.
Approach the design of the robot by thinking about what perceptions should be controlled in order to produce the behavior you want to see the robot perform. So, for example, if one behavior you want to see is "avoidance", then think about what avoidance behavior is (I presume it is maintaining a goal distance from obstacles) and then think about what perception, if kept under control, would result in you seeing the robot maintain a fixed distance from objects. I suspect it would be the perception of the time delay between sending and receiving the ultrasound pulses. Since the robot is moving in two-space (I presume), there might have to be two pulse sensors in order to sense the 2-D location of objects.
There are potential conflicts between the control systems that you will need to build; for example, I think there could be conflicts between the system controlling for moving in a circular path and the system controlling for avoiding obstacles. The agents in the CROWD program have the same problem and sometimes get into dead-end conflicts. There are various ways to deal with conflicts of this kind; for example, you could have a higher-level system monitoring the error in the two potentially conflicting systems and have it reduce the gain in one system or the other if the conflict (error) persists for some time.
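To tie these suggestions to something concrete, here is a minimal sketch of the kind of arrangement described above: each control unit only tries to bring its perception to its reference, and a hypothetical higher-level avoidance unit contributes (with a weight) to the reference of the lower-level circular-motion unit, suppressing it when an obstacle gets close. All names, gains, thresholds, and the fake sensor readings are invented for illustration; they are not from the CROWD program or any NXT API.

#include <algorithm>
#include <cstdio>

// One proportional perceptual control unit: output = gain * (reference - perception).
struct ControlUnit {
    double reference = 0.0;
    double gain      = 1.0;
    double step(double perception) { return gain * (reference - perception); }
};

int main() {
    ControlUnit avoidance;           // controls perceived obstacle distance (cm)
    avoidance.reference = 30.0;      // try to keep about 30 cm of clearance
    avoidance.gain = 2.0;

    ControlUnit circle;              // controls perceived turning rate
    circle.gain = 1.0;

    // Fake ultrasonic readings standing in for real sensor input.
    const double distances[] = {100, 60, 35, 25, 20, 40, 80};
    for (double d : distances) {
        double avoid_out = avoidance.step(d);   // positive when too close

        // Higher level contributes to the lower-level reference: a large
        // avoidance output pulls the circular-motion reference toward zero,
        // suppressing the circling behaviour instead of fighting it.
        circle.reference = std::max(0.0, 1.0 - 0.05 * std::max(0.0, avoid_out));

        double turn_cmd = circle.step(0.8);     // 0.8 = fake perceived turn rate
        std::printf("dist=%5.1f  avoid_out=%7.2f  circle_ref=%4.2f  turn_cmd=%5.2f\n",
                    d, avoid_out, circle.reference, turn_cmd);
    }
    return 0;
}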