Difference between Logarithmic and Uniform cost criteria - optimization

I've got some trouble understanding the difference between the logarithmic (LCC) and uniform (UCC) cost criteria, and also how to use them in calculations.
Could someone please explain the difference between the two and perhaps show how to calculate the complexity for a problem like A+B*C?
(Yes, this is part of an assignment =) )
Thanks for any help!
/Marthin

Uniform cost criteria assign a constant cost to every machine operation regardless of the number of bits involved, while logarithmic cost criteria assign each machine operation a cost proportional to the number of bits involved.

Problem size influences complexity. Since complexity depends on the size of the problem, we define complexity as a function of problem size.
Definition: Let T(n) denote the complexity of an algorithm applied to a problem of size n.
The size n of a problem instance I is the number of (binary) bits used to represent the instance, so problem size is the length of the binary description of the instance. This is called the logarithmic cost criterion.
Unit (uniform) cost criteria
If you assume that:
- every computer instruction takes one time unit,
- every register is one storage unit,
- and a number always fits in a register,
then you can use the number of inputs as the problem size, since the length of the input (in bits) will be a constant times the number of inputs.

Uniform cost criteria assume that every instruction takes a single unit of time and that every register requires a single unit of space.
Logarithmic cost criteria assume that every instruction takes a number of time units proportional to the number of bits in its operands (i.e., logarithmic in the operands' magnitude), and that every register requires a logarithmic number of units of space.
In simpler terms, what this means is that uniform cost criteria count the number of operations, and logarithmic cost criteria count the number of bit operations.
For example, suppose we have an 8-bit adder.
If we're using uniform cost criteria to analyze the run time of the adder, we would say that an addition takes a single time unit; i.e., T(n) = 1.
If we're using logarithmic cost criteria to analyze the run time of the adder, we would say that an addition takes lg n time units; i.e., T(n) = lg n, where n is the worst-case value the adder has to handle (here the 8-bit adder covers 256 distinct values, so n = 256). Thus T(n) = lg 256 = 8.
More specifically, say we're adding 255 to 32. To perform the addition, we have to add the binary bits together in the 1s column, the 2s column, the 4s column, and so on (the columns being the bit positions). The number 255 requires 8 bits, and lg 256 = 8; this is where logarithms come into the analysis. So to add the two numbers, we have to perform addition on 8 columns. Logarithmic cost criteria say that each of these 8 single-bit additions takes a single unit of time. Uniform cost criteria say that the entire set of 8 additions takes a single unit of time.
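As a rough illustration (my own sketch, not part of the original answer), here is that column-by-column addition in Python; the number of loop iterations is the bit width, which is what logarithmic cost criteria charge for, while uniform cost criteria would count the built-in + as a single operation:

# Hypothetical illustration: ripple-carry addition of two non-negative ints.
# Under logarithmic cost, each single-bit "column" costs one unit, so the loop
# runs in O(lg n) steps; under uniform cost the whole addition is one step.
def ripple_add(a, b):
    result, carry, shift = 0, 0, 0
    while a or b or carry:
        bit_a, bit_b = a & 1, b & 1
        total = bit_a + bit_b + carry          # one single-bit addition column
        result |= (total & 1) << shift
        carry = total >> 1
        a, b, shift = a >> 1, b >> 1, shift + 1
    return result

assert ripple_add(255, 32) == 287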
Similar analysis can be made in terms of space as well. Registers either take up a constant amount of space (under uniform cost criteria) or a logarithmic amount of space (under logarithmic cost criteria).

I think you should do some research on Big O notation: http://en.wikipedia.org/wiki/Big_O_notation#Orders_of_common_functions
If there is a part of that description you find difficult, edit your question.

Related

An example to show that amortized analysis and average-case analysis may give asymptotically different results

I have read many explanations of amortized analysis and how it differs from average-case analysis. However, I have not found a single explanation that showed how, for a particular example for which both kinds of analysis are sensible, the two would give asymptotically different results.
The most widespread example of amortized running-time analysis shows that appending an element to a dynamic array takes O(1) amortized time (the running time of the operation is O(n) if the array's length is an exact power of 2, and O(1) otherwise). I believe that, if we consider all array lengths equally likely, the average-case analysis gives the same O(1) answer.
So, could you please provide an example to show that amortized analysis and average-case analysis may give asymptotically different results?
Consider a dynamic array supporting push and pop from the end. In this example, the array capacity will double when push is called on a full array and halve when pop leaves the array size 1/2 of the capacity. pop on an empty array does nothing.
Note that this is not how dynamic arrays are "supposed" to work. To maintain O(1) amortized complexity, the array capacity should only halve when the size is alpha times the capacity, for alpha < 1/2.
In this bad dynamic array, neither operation has O(1) amortized complexity: alternating push and pop while the size sits right at the resize boundary (capacity near 2x the size) forces an Ω(n) copy on operation after operation.
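To make the bad resize policy concrete, here is a small Python sketch (my illustration, not the answerer's code); note the aggressive halving rule in pop:

# Illustration only: a dynamic array with a deliberately bad resize policy.
class BadDynamicArray:
    def __init__(self):
        self.capacity = 1
        self.data = [None]          # backing storage
        self.size = 0

    def _resize(self, new_capacity):
        new_data = [None] * new_capacity
        new_data[:self.size] = self.data[:self.size]   # O(size) copy
        self.data, self.capacity = new_data, new_capacity

    def push(self, x):
        if self.size == self.capacity:
            self._resize(2 * self.capacity)            # double when full
        self.data[self.size] = x
        self.size += 1

    def pop(self):
        if self.size == 0:                             # pop on empty does nothing
            return None
        self.size -= 1
        x = self.data[self.size]
        if self.size * 2 == self.capacity:             # halve as soon as half full (the "bad" rule)
            self._resize(self.capacity // 2)
        return x

Alternating push and pop while the size sits at that halving boundary copies the whole backing array on every operation, which is exactly the Ω(n)-per-operation behavior described above.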
However, if you consider all sequences of push and pop to be equally likely, both operations have O(1) average time complexity, for two reasons:
First, since the sequences are random, I believe the size of the array will mostly stay O(1): the size performs a random walk on the natural numbers.
Second, the array size will only rarely be near a power of 2.
This shows an example where amortized complexity is strictly greater than average complexity.
They never have asymptotically different results. Average-case means that weird data might not trigger the average case and might be slower; asymptotic analysis means that even weird data will have the same performance. But on average they'll always have the same complexity.
Where they differ is worst-case analysis. For algorithms where slowdowns come every few items regardless of their values, the worst case and the average case are the same, and we often call this "asymptotic analysis". For algorithms that can have slowdowns based on the data itself, the worst case and the average case are different, and we do not call either "asymptotic".
In "Pairing Heaps with Costless Meld", the author gives a priority queue with O(0) time per meld. Obviously, the average time per meld is greater than that.
Consider any data structure with worst-case and best-case inserts and removes taking I and R time. Now use the physicist's argument and give the structure a potential of nR, where n is the number of values in the structure. Each insert increases the potential by R, so the total amortized cost of an insert is I+R. However, each remove decreases the potential by R. Thus, each removal has an amortized cost of R-R=0!
The average cost is R; the amortized cost is 0; these are different.
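Spelled out with the standard potential-method bookkeeping (my restatement, in LaTeX notation), where $c_i$ is the actual cost of operation $i$ and $\hat{c}_i$ its amortized cost:

$$\Phi(D) = nR, \qquad \hat{c}_i = c_i + \Phi(D_i) - \Phi(D_{i-1})$$
$$\text{insert: } \hat{c} = I + R \quad (\Delta\Phi = +R), \qquad \text{remove: } \hat{c} = R - R = 0 \quad (\Delta\Phi = -R)$$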

How do I determine "Big-oh" time complexity given an Excel sheet showing various input size values vs. their corresponding run time values

I have been given a list of input sizes and their corresponding runtime values for a given algorithm A. How should I go about computing the "Big-oh" time complexity of algorithm A given these values?
Try playing around with the numbers and see if they approximately fit one of the "standard" complexity functions, e.g. n, n^2, n^3, 2^n, log(n).
For example, if the ratio between runtime and input size is nearly constant, it's likely O(n). If that ratio grows linearly (e.g. doubling the input quadruples the runtime), it's O(n^2). If it grows quadratically, it's O(n^3). If adding a constant to the input results in a multiplicative change in the runtime, it's exponential. And if the reverse holds (multiplying the input by a constant only adds a constant to the runtime), it's log(n).
If it's just slightly but consistently growing more quickly than a line, it's probably O(n log(n)).
You can also plot the graph of your values (input numbers vs runtime values) in Excel and overlay it with the graph of the function you guessed may fit, and then try to tweak the parameters (e.g. for O(n^2), plot a graph of a*x^2 + b, and tweak a and b).
To make it more precise (e.g. to calculate the uncertainty), you could apply regression analysis (search for non-linear regression analysis in Excel).
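If you'd rather script the guessing than eyeball it, here is a rough Python sketch (my own, with made-up sample data) that checks which standard function makes runtime/f(n) closest to constant:

import math

# Hypothetical sample data: (input size, measured runtime) pairs.
samples = [(1000, 0.011), (2000, 0.046), (4000, 0.19), (8000, 0.83)]

candidates = {
    "n":        lambda n: n,
    "n log n":  lambda n: n * math.log(n),
    "n^2":      lambda n: n ** 2,
    "n^3":      lambda n: n ** 3,
    "2^n":      lambda n: 2.0 ** min(n, 1000),   # clamp the exponent to avoid float overflow
    "log n":    lambda n: math.log(n),
}

def spread(name, f):
    # If runtime ~ c * f(n), the ratios runtime/f(n) are roughly constant,
    # so the max/min spread of the ratios is close to 1 for the right f.
    ratios = [t / f(n) for n, t in samples]
    return max(ratios) / min(ratios)

best = min(candidates.items(), key=lambda kv: spread(*kv))
print("best guess: O(%s)" % best[0])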

Time complexity with respect to input

This is a doubt I keep having. For example, I have a 2-D array with n rows and n columns, so n^2 elements. Suppose I want to print all the elements of the 2-D array. When I calculate the time complexity of the algorithm with respect to n, it's O(n^2). But if I calculate the time with respect to the input size (n^2 elements), it's linear. Are both these calculations correct? If so, why do people only use O(n^2) everywhere regarding 2-D arrays?
That is not how time complexity works. You cannot do "simple math" like that.
A two-dimensional square array of extent x has n = x*x elements. Printing these n elements takes n operations (or n/m if you print m items at a time), which is O(n). The necessary work increases linearly with the number of elements (which is, incidentally, quadratic with respect to the array extent -- but if you arranged the same number of items in a 4-dimensional array, would it be any different? Obviously not. That doesn't magically make it O(n^4)).
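As a throwaway illustration (mine, not the answerer's), the loop below does exactly one print per element, i.e. Θ(n) in the number of elements n = x*x, which is the same thing as Θ(x^2) in the extent x:

# Illustration: printing every element of an x-by-x array.
x = 4
grid = [[row * x + col for col in range(x)] for row in range(x)]

count = 0
for row in grid:            # x rows
    for value in row:       # x values per row
        print(value)        # one operation per element
        count += 1

assert count == x * x       # n = x*x elements -> n print operations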
What you use time complexity for is not stuff like that anyway. What you want time complexity to tell you is an approximate idea of how some particular algorithm may change its behavior if you grow the number of inputs beyond some limit.
So, what you want to know is, if you do XYZ on one million items or on two million items, will it take approximately twice as long, or will it take approximately sixteen times as long, for example.
Time complexity analysis ignores "small details" such as how much time an actual operation takes. This tends to make the whole thing more and more academic and practically less useful on modern architectures, because constant factors (such as memory latency, bus latency, cache misses, faults, access times, etc.) play an ever-increasing role: they stay mostly the same over decades while the actual cost per step (instruction throughput, ALU power, whatever) goes down steadily with every new computer generation.
In practice, it happens quite often that the dumb, linear, brute force approach is faster than a "better" approach with better time complexity simply because the constant factor dominates everything.

Computing the approximate LCM of a set of numbers

I'm writing a tone generator program for a microcontroller.
I use a hardware timer to trigger an interrupt and check whether I need to set the signal high or low at a particular moment for a given note.
I'm using pretty limited hardware, so the slower I run the timer the more time I have to do other stuff (serial communication, loading the next notes to generate, etc.).
I need to find the frequency at which I should run the timer to have an optimal result, which is, generate a frequency that is accurate enough and still have time to compute the other stuff.
To achieve this, I need to find an approximate LCM of all the frequencies I need to play (approximate within some percentage: the higher the frequencies, the more imprecise their values can be before a human ear notices the error). This value will be the frequency at which to run the hardware timer.
Is there a simple enough algorithm to compute such a number? (EDIT, to clarify "simple enough": fast enough to run in a time t << 1 sec for fewer than 50 values on an 8-bit AVR microcontroller, and implementable in a few dozen lines at worst.)
LCM(a,b,c) = LCM(LCM(a,b),c)
Thus you can compute LCMs in a loop, bringing in frequencies one at a time.
Furthermore,
LCM(a,b) = a*b/GCD(a,b)
and GCDs are easily computed without any factoring by using the Euclidean algorithm.
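A minimal sketch of that loop in Python (assuming integer frequencies in Hz; on the AVR you would write the same thing in C with fixed-width integers):

# Exact LCM of a list, folding pairwise: LCM(a, b) = a * b // GCD(a, b).
def gcd(a, b):
    while b:
        a, b = b, a % b     # Euclidean algorithm, no factoring needed
    return a

def lcm_list(values):
    result = 1
    for v in values:
        result = result * v // gcd(result, v)
    return result

print(lcm_list([440, 660, 880]))   # -> 2640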
To make this an algorithm for approximate LCMs, do something like round lower frequencies to multiples of 10 Hz and higher frequencies to multiples of 50 Hz. Another, somewhat more principled idea would be to first convert each frequency to an octave (I think the formula is f maps to log(f/16)/log(2)). This gives you a number between 0 and 10 (or slightly higher -- but anything above 10 is almost beyond human hearing, so you could perhaps round down). You could break 0-10 into, say, 50 intervals 0.0, 0.2, 0.4, ... and for each one compute ahead of time the frequency corresponding to that octave (f = 16*2^o, where o is the octave). For each of these, go through by hand once and for all and find a nearby round number with smallish prime factors. For example, if o = 5.4 then f = 675.58 -- round to 675; if o = 5.8 then f = 891.44 -- round to 890. Assemble these 50 numbers into a sorted array, and use binary search to replace each of your frequencies by the closest frequency in the array.
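A rough sketch of that snapping step (the table entries here are made up for illustration; in practice you would hand-pick the ~50 round numbers with small prime factors as described above):

import bisect

# Hypothetical table of "round" frequencies with smallish prime factors,
# precomputed once (a real table would cover the whole audible range).
ROUND_FREQS = [100, 125, 160, 200, 250, 320, 400, 500, 640, 675, 800, 890, 1000]

def snap(freq):
    # Replace freq by the closest entry in the sorted table.
    i = bisect.bisect_left(ROUND_FREQS, freq)
    if i == 0:
        return ROUND_FREQS[0]
    if i == len(ROUND_FREQS):
        return ROUND_FREQS[-1]
    before, after = ROUND_FREQS[i - 1], ROUND_FREQS[i]
    return before if freq - before <= after - freq else after

print(snap(675.58))   # -> 675
print(snap(891.44))   # -> 890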
An idea:
project the frequency range to a smaller interval
Let's say your frequency range is from 20 to 20000 and you aim for 2% accuracy; you'll calculate over a 1-50 range. It has to be a non-linear transformation to keep the accuracy for lower frequencies. The goal is both to compute the result faster and to have a smaller LCM.
Use a prime factors table to easily compute the LCM on that reduced range
Store the pre-calculated prime factor powers in an array (about 50x7 for the range 1-50), and then use it for the LCM: the LCM of a set of numbers is the product of the highest power of each prime factor that appears in any of them (see the sketch after these steps). It's easy to code and blazingly fast to run.
Do the first step in reverse to get the final number.
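For the second step, a hedged sketch of the prime-power bookkeeping (assuming the reduced values are small integers, say in the 1-50 range):

# LCM of small integers via highest prime powers: for each prime, keep the
# largest exponent seen across all the (reduced) values, then multiply.
PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]   # enough to factor 1-50

def factor_exponents(n):
    exps = {}
    for p in PRIMES:
        while n % p == 0:
            exps[p] = exps.get(p, 0) + 1
            n //= p
    return exps

def lcm_by_prime_powers(values):
    highest = {}
    for v in values:
        for p, e in factor_exponents(v).items():
            highest[p] = max(highest.get(p, 0), e)
    result = 1
    for p, e in highest.items():
        result *= p ** e
    return result

print(lcm_by_prime_powers([12, 18, 40]))   # 2^3 * 3^2 * 5 = 360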

How to design acceptance probability function for simulated annealing with multiple distinct costs?

I am using simulated annealing to solve an NP-complete resource scheduling problem. For each candidate ordering of the tasks I compute several different costs (or energy values). Some examples are (though the specifics are probably irrelevant to the question):
global_finish_time: The total number of days that the schedule spans.
split_cost: The number of days by which each task is delayed due to interruptions by other tasks (this is meant to discourage interruption of a task once it has started).
deadline_cost: The sum of the squared number of days by which each missed deadline is overdue.
The traditional acceptance probability function looks like this (in Python):
import math

def acceptance_probability(old_cost, new_cost, temperature):
    if new_cost < old_cost:
        return 1.0
    else:
        return math.exp((old_cost - new_cost) / temperature)
So far I have combined my first two costs into one by simply adding them, so that I can feed the result into acceptance_probability. But what I would really want is for deadline_cost to always take precedence over global_finish_time, and for global_finish_time to take precedence over split_cost.
So my question to Stack Overflow is: how can I design an acceptance probability function that takes multiple energies into account but always considers the first energy to be more important than the second energy, and so on? In other words, I would like to pass in old_cost and new_cost as tuples of several costs and return a sensible value.
Edit: After a few days of experimenting with the proposed solutions I have concluded that the only way that works well enough for me is Mike Dunlavey's suggestion, even though this creates many other difficulties with cost components that have different units. I am practically forced to compare apples with oranges.
So, I put some effort into "normalizing" the values. First, deadline_cost is a sum of squares, so it grows quadratically while the other components grow linearly. To address this I take the square root to get a similar growth rate. Second, I developed a function that computes a linear combination of the costs, but auto-adjusts the coefficients according to the highest cost component seen so far.
For example, if the tuple of highest costs is (A, B, C) and the input cost vector is (x, y, z), the linear combination is BCx + Cy + z. That way, no matter how high z gets it will never be more important than an x value of 1.
This creates "jaggies" in the cost function as new maximum costs are discovered. For example, if C goes up then BCx and Cy will both be higher for a given (x, y, z) input and so will differences between costs. A higher cost difference means that the acceptance probability will drop, as if the temperature was suddenly lowered an extra step. In practice though this is not a problem because the maximum costs are updated only a few times in the beginning and do not change later. I believe this could even be theoretically proven to converge to a correct result since we know that the cost will converge toward a lower value.
One thing that still has me somewhat confused is what happens when the maximum costs are 1.0 and lower, say 0.5. With a maximum vector of (0.5, 0.5, 0.5) this would give the linear combination 0.5*0.5*x + 0.5*y + z, i.e. the order of precedence is suddenly reversed. I suppose the best way to deal with it is to use the maximum vector to scale all values to given ranges, so that the coefficients can always be the same (say, 100x + 10y + z). But I haven't tried that yet.
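For reference, here is a minimal sketch of that auto-adjusting combination as I read it (the class name and update rule are my own guesses, not the poster's actual code):

# Sketch of the idea above: weight each cost component by the product of the
# highest values seen so far for the lower-priority components, giving the
# BCx + Cy + z combination from the text. Assumes three components in
# priority order (x highest, z lowest).
class CombinedCost:
    def __init__(self):
        self.max_seen = [1.0, 1.0, 1.0]   # highest (x, y, z) observed so far

    def combine(self, costs):
        # costs = (deadline_cost, global_finish_time, split_cost), highest priority first
        self.max_seen = [max(m, c) for m, c in zip(self.max_seen, costs)]
        _, B, C = self.max_seen
        x, y, z = costs
        return B * C * x + C * y + z

combiner = CombinedCost()
old_energy = combiner.combine((1.0, 4.0, 7.0))
new_energy = combiner.combine((2.0, 1.0, 3.0))
# old_energy and new_energy can now be fed to acceptance_probability() above.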
mbeckish is right.
Could you make a linear combination of the different energies, and adjust the coefficients?
Possibly log-transforming them in and out?
I've done some MCMC using Metropolis-Hastings. In that case I define the (non-normalized) log-likelihood of a particular state (given its priors), and I find that helps clarify my thinking about what I want.
I would take a hint from multi-objective evolutionary algorithm (MOEA) and have it transition if all of the objectives simultaneously pass with the acceptance_probability function you gave. This will have the effect of exploring the Pareto front much like the standard simulated annealing explores plateaus of same-energy solutions.
However, this does give up on the idea of having the first one take priority.
You will probably have to tweak your parameters, such as giving it a higher initial temperature.
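A hedged sketch of that "transition only if all objectives pass" rule, reusing the acceptance_probability shape from the question (drawing one random number per objective is just one possible reading):

import math
import random

def acceptance_probability(old_cost, new_cost, temperature):
    if new_cost < old_cost:
        return 1.0
    return math.exp((old_cost - new_cost) / temperature)

def accept_all_objectives(old_costs, new_costs, temperature):
    # Transition only if every objective passes its own acceptance test.
    return all(
        random.random() < acceptance_probability(o, n, temperature)
        for o, n in zip(old_costs, new_costs)
    )

# Example: (deadline_cost, global_finish_time, split_cost) tuples.
if accept_all_objectives((4.0, 120.0, 7.0), (4.0, 118.0, 9.0), temperature=5.0):
    print("move accepted")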
I would consider something along the lines of:
import math

def probability(old_cost, new_cost, temperature):
    return math.exp((old_cost - new_cost) / temperature)

def acceptance_probability(old, new, temperature):
    # old and new are (deadline_cost, global_finish_time, split_cost) tuples.
    if new[0] > old[0]:
        return probability(old[0], new[0], temperature)
    elif new[1] > old[1]:
        return probability(old[1], new[1], temperature)
    elif new[2] > old[2]:
        return probability(old[2], new[2], temperature)
    else:
        return 1.0
Of course each of the three places you calculate the probability could use a different function.
It depends on what you mean by "takes precedence".
For example, what if the deadline_cost goes down by 0.001, but the global_finish_time cost goes up by 10000? Do you return 1.0, because the deadline_cost decreased, and that takes precedence over anything else?
This seems like it is a judgment call that only you can make, unless you can provide enough background information on the project so that others can suggest their own informed judgment call.