I have taken a course on Computer Architecture, and it was mentioned that on the most efficient processors with an n-bit architecture word size, addition/subtraction of two words has a time complexity of O(log n), while multiplication/division has a time complexity of O(n).
If you do not consider any particular architecture word size, the best time complexity of addition/subtraction is O(n) (https://www.academia.edu/42811225/Fast_Arithmetic_Speeding_up_Multiplication_Division_and_Addition_of_n_Bit_Numbers) and multiplication/division seems to be O(n log n log log n) (Schönhage–Strassen, https://en.m.wikipedia.org/wiki/Multiplication_algorithm).
Is this correct?
O(log n) is the latency of addition if you can use n-bit wide parallel hardware with stuff like carry-select or carry-lookahead.
O(n) is the total amount of work that needs doing, and thus the time complexity with a fixed-width ALU for arbitrary bigint problems as n tends towards infinity.
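As a rough sketch of that O(n) bigint case (illustrative Python, not any real bignum library's code), adding two numbers stored as lists of 32-bit "words" is one pass over the words with a serially propagated carry:

    # A minimal sketch of why arbitrary-precision addition is O(n) on a
    # fixed-width ALU: with 32-bit "words", adding two n-word numbers is one
    # pass over the words, propagating a carry serially.

    WORD_BITS = 32
    WORD_MASK = (1 << WORD_BITS) - 1

    def add_limbs(a, b):
        """Add two little-endian lists of 32-bit limbs; O(n) word additions."""
        out, carry = [], 0
        for i in range(max(len(a), len(b))):
            x = a[i] if i < len(a) else 0
            y = b[i] if i < len(b) else 0
            s = x + y + carry          # one fixed-width addition (plus carry)
            out.append(s & WORD_MASK)
            carry = s >> WORD_BITS
        if carry:
            out.append(carry)
        return out

    # (2^64 - 1) + 1 as limbs: [0xFFFFFFFF, 0xFFFFFFFF] + [1] -> [0, 0, 1]
    print(add_limbs([WORD_MASK, WORD_MASK], [1]))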
For a multiply, an n-bit multiply produces n partial products, so adding them all (with a Dadda tree, for example) takes on the order of O(log n) gate delays of latency. Integer addition is associative, so you can do it in parallel: e.g. (a+b) + (c+d) is 3 additions with a critical-path depth of only 2, and the advantage grows from there.
Dadda trees can avoid some of the carry-propagation latency, so I guess they avoid the extra factor of log n you'd get if you just used a normal carry-propagating addition for each partial product separately.
See Differences between Wallace Tree and Dadda Multipliers for more about practical considerations for huge Dadda trees.
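To illustrate just the logarithmic depth (a toy Python model of the idea, not of real carry-save hardware): form the n shifted partial products and reduce them pairwise, counting the levels, which comes out to ceil(log2 n):

    def tree_sum_levels(terms):
        """Sum a list by pairwise (tree) reduction; return (sum, depth)."""
        depth = 0
        while len(terms) > 1:
            terms = [terms[i] + terms[i + 1] if i + 1 < len(terms) else terms[i]
                     for i in range(0, len(terms), 2)]
            depth += 1
        return terms[0], depth

    def multiply_via_partial_products(a, b, n_bits):
        # Partial product i is a shifted copy of a, selected by bit i of b.
        partials = [(a << i) if (b >> i) & 1 else 0 for i in range(n_bits)]
        return tree_sum_levels(partials)

    product, depth = multiply_via_partial_products(0b1011, 0b1101, 4)
    print(product == 0b1011 * 0b1101)  # True
    print(depth)                       # 2 = ceil(log2(4)) levels of addition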
I have read many explanations of amortized analysis and how it differs from average-case analysis. However, I have not found a single explanation that showed how, for a particular example for which both kinds of analysis are sensible, the two would give asymptotically different results.
The most widespread example of amortized running-time analysis shows that appending an element to a dynamic array takes O(1) amortized time (where the running time of the operation is O(n) if the array's length is an exact power of 2, and O(1) otherwise). I believe that, if we consider all array lengths equally likely, then the average-case analysis will give the same O(1) answer.
So, could you please provide an example to show that amortized analysis and average-case analysis may give asymptotically different results?
Consider a dynamic array supporting push and pop from the end. In this example, the array capacity will double when push is called on a full array and halve when pop leaves the array size 1/2 of the capacity. pop on an empty array does nothing.
Note that this is not how dynamic arrays are "supposed" to work. To maintain O(1) amortized complexity, the array capacity should only halve when the size is alpha times the capacity, for alpha < 1/2.
In the bad dynamic array, when considering both operations, neither has O(1) amortized complexity, because alternating between them when the capacity is near 2x the size can produce Ω(n) time complexity for both operations repeatedly.
However, if you consider all sequences of push and pop to be equally likely, both operations have O(1) average time complexity, for two reasons:
First, since the sequences are random, I believe the size of the array will mostly be O(1). This is a random walk on the natural numbers.
Second, the array size will be near a power of 2 only rarely.
This shows an example where amortized complexity is strictly greater than average complexity.
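To make this concrete, here is a hedged Python sketch of the "bad" array described above (illustrative code, with element copies as the cost measure): alternating push/pop near the doubling boundary costs on the order of n per operation, while a uniformly random push/pop sequence averages only a few copies per operation.

    import random

    class BadDynamicArray:
        """Bad resize policy: double when a push hits a full array,
        halve as soon as a pop leaves the array exactly half full."""

        def __init__(self):
            self.capacity = 1
            self.size = 0
            self.copies = 0                 # total elements copied (the cost)

        def _resize(self, new_capacity):
            self.copies += self.size        # copying size elements costs O(size)
            self.capacity = new_capacity

        def push(self):
            if self.size == self.capacity:
                self._resize(2 * self.capacity)
            self.size += 1

        def pop(self):
            if self.size == 0:
                return
            self.size -= 1
            if self.size == self.capacity // 2 and self.capacity > 1:
                self._resize(self.capacity // 2)

    # Worst case: grow just past a doubling, then alternate pop/push there.
    n = 1 << 15
    bad = BadDynamicArray()
    for _ in range(n + 1):                  # leaves size = n+1, capacity = 2n
        bad.push()
    before = bad.copies
    for _ in range(1000):                   # each pair triggers two full resizes
        bad.pop()
        bad.push()
    print("copies per pop/push pair near the boundary:",
          (bad.copies - before) / 1000)     # about 2n, i.e. Theta(n) per op

    # Random sequence: pushes and pops equally likely.
    rnd = BadDynamicArray()
    ops = 1 << 20
    for _ in range(ops):
        if random.random() < 0.5:
            rnd.push()
        else:
            rnd.pop()
    # Empirically a small constant, consistent with O(1) average cost.
    print("average copies per random operation:", rnd.copies / ops)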
They never give asymptotically different results. Average-case analysis means that weird data might not hit the average case and might be slower; amortized analysis means that even weird data will have the same overall performance. But on average they'll always have the same complexity.
Where they differ is the worst-case analysis. For algorithms where slowdowns come every few items regardless of their values, the worst-case and the average-case are the same, and we often call this "asymptotic analysis". For algorithms that can have slowdowns based on the data itself, the worst-case and average-case are different, and we do not call either "asymptotic".
In "Pairing Heaps with Costless Meld", the author gives a priority queue with O(0) time per meld. Obviously, the average time per meld is greater than that.
Consider any data structure whose inserts always take I time and whose removes always take R time (best case and worst case alike). Now use the physicist's argument and give the structure a potential of nR, where n is the number of values in the structure. Each insert increases the potential by R, so the total amortized cost of an insert is I+R. However, each remove decreases the potential by R. Thus, each removal has an amortized cost of R-R=0!
The average cost is R; the amortized cost is 0; these are different.
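A small numerical illustration of that accounting (Python, with made-up costs I and R that are not from the paper), tracking the potential Phi = R*n and printing each operation's amortized cost:

    # Amortized cost = actual cost + change in potential, with Phi = R * n.

    I, R = 5, 3                      # hypothetical insert/remove costs

    def run(ops):
        n, phi = 0, 0
        for op in ops:
            if op == "insert":
                actual, n = I, n + 1
            else:
                actual, n = R, n - 1
            new_phi = R * n
            amortized = actual + (new_phi - phi)
            phi = new_phi
            print(f"{op:6s} actual={actual} amortized={amortized}")

    run(["insert", "insert", "remove", "remove"])
    # inserts: amortized = I + R = 8
    # removes: amortized = R - R = 0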
I was working on a problem where we are supposed to give an example of an algorithm whose time complexity is O(n^2), but whose amortized time complexity is less than that. My immediate thought is nested loops, but I'm not exactly sure of what an example of that would look like where the result was amortized. Any insights would be greatly appreciated!
Consider the Add method on a Vector (resizable array) data structure. Once the current capacity of the array is exceeded, we must increase the capacity by making a larger array and copying the existing elements over. Typically you'd just double the capacity in such cases, giving rise to a worst-case O(n) Add but an O(1) amortized Add. Instead of doubling, we're of course free to grow the capacity by squaring it (provided the initial capacity is greater than one). This means that, every now and then, an Add has to build (allocate, initialize and fill) an array of capacity n^2, which is O(n^2) work; but such expensive Adds become so rare that the amortized cost is asymptotically far below that worst case: a sequence of N Adds does O(N^2) total work, dominated by its single most recent resize, so the amortized cost per Add works out to O(N). (If you charge a resize only for the n element copies and treat allocation as free, the worst case drops to O(n) and the amortized cost to O(1), exactly as with doubling.)
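To make the accounting concrete, here is a hedged Python sketch (hypothetical code, not any library's real vector) that charges each resize for building the new capacity-n^2 buffer plus the copy; right after a resize the average work per Add so far is roughly proportional to n, while far from a resize it settles back to a small constant:

    class SquaringVector:
        def __init__(self):
            self.capacity = 2
            self.buffer = [None] * self.capacity
            self.size = 0
            self.work = 0                           # total "cells touched"

        def add(self, value):
            resized = False
            if self.size == self.capacity:
                new_capacity = self.capacity * self.capacity
                new_buffer = [None] * new_capacity  # ~n^2 cells initialized
                new_buffer[:self.size] = self.buffer
                self.work += new_capacity + self.size
                self.buffer, self.capacity = new_buffer, new_capacity
                resized = True
            self.buffer[self.size] = value
            self.size += 1
            self.work += 1
            return resized

    v = SquaringVector()
    for i in range(60_000):                 # stays below capacity 65536
        if v.add(i):
            # right after a resize, the running average is roughly ~n ...
            print(f"after resize at size {v.size}: work/size = {v.work / v.size:.1f}")
    # ... but far away from a resize it settles back to a small constant
    print(f"after {v.size} Adds: work/size = {v.work / v.size:.1f}")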
Combining variations on this idea with the multiplicative effect on complexity of putting code into loops, it's probably possible to find examples where the worst-case time complexity is O(f) and the amortized complexity is O(g), for pretty much any f and g where g is o(f).
I know we should drop the non-dominant terms when calculating the time complexity of an algorithm. I am wondering whether we should also drop them when calculating space complexity. For example, if I have a string of N letters, I'd like to:
construct a list of letters from this string -> Space: O(N);
sort this list -> Worst-case space complexity for Timsort (I use Python): O(N).
In this case, would the entire solution take O(N) + O(N) space or just O(N)?
Thank you.
Welcome to SO!
First of all, I think you misunderstand complexity a little: complexity is defined independently of constant factors. It depends only on the large-scale behavior as the data set size N grows. Thus, O(N) + O(N) is the same complexity as O(N).
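For your concrete case, a hedged sketch (assuming CPython, where list.sort() is Timsort and its temporary merge buffer holds at most about N/2 elements):

    s = "some input text " * 1000
    N = len(s)

    letters = list(s)   # ~N extra slots for the new list          -> O(N)
    letters.sort()      # Timsort: up to ~N/2 extra slots to merge  -> O(N)

    # Peak auxiliary space is at most roughly N + N/2 slots, a constant
    # multiple of N -- which is why O(N) + O(N) is still just O(N).
    print(N, len(letters))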
Thus, your question might have been:
If I construct a list of letters using an algorithm with O(N) space complexity, followed by a sort algorithm with O(N) space complexity, would the entire solution use twice as much space?
But this question cannot be answered, since a complexity does not give you any measure of how much space is actually used.
A well-known example: a brute-force sorting algorithm, Bubble Sort, with time complexity O(N^2), can be faster for small data sets than a very good sorting algorithm, QuickSort, with average time complexity O(N log N).
EDIT:
It is not a contradiction that one can compute a space complexity and that it still does not say how much space is actually used.
A simple example:
Say, for a certain problem, algorithm 1 has linear space complexity O(n) and algorithm 2 has space complexity O(n^2).
One could thus assume (but this is wrong) that algorithm 1 would always use less space than algorithm 2.
First, it is clear that for large enough n algorithm 2 will use more space than algorithm 1, because n^2 grows faster than n.
However, consider the case where n is small enough, say n = 1, and algorithm 1 is implemented on a computer that uses storage in doubles (64 bits), whereas algorithm 2 is implemented on a computer that uses bytes (8 bits). Then, obviously, the O(n^2) algorithm uses less space than the O(n) algorithm.
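Plugging illustrative numbers into that example (purely hypothetical constants: n doubles of 8 bytes each versus n*n single bytes):

    def space_alg1_bytes(n):
        return 8 * n      # O(n) values, 8 bytes per double

    def space_alg2_bytes(n):
        return n * n      # O(n^2) values, 1 byte each

    for n in (1, 4, 8, 16, 100):
        print(n, space_alg1_bytes(n), space_alg2_bytes(n))
    # n = 1: 8 bytes vs 1 byte -- the O(n^2) algorithm uses less space
    # n = 8: 64 vs 64          -- the crossover point (for these constants)
    # n > 8: the O(n) algorithm uses less, and the gap keeps growing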
This is a constant doubt I'm having. For example, I have a 2-d array of size n^2 (n being the number of rows and columns). Suppose I want to print all the elements of the 2-d array. When I calculate the time complexity of the algorithm with respect to n, it's O(n^2). But if I calculate the time with respect to the input size (n^2), it's linear. Are both these calculations correct? If so, why do people only use O(n^2) everywhere regarding 2-d arrays?
That is not how time complexity works. You cannot do "simple math" like that.
A two-dimensional square array of extent x has n = x*x elements. Printing these n elements takes n operations (or n/m if you print m items at a time), which is O(n). The necessary work increases linearly with the number of elements (which is, incidentally, quadratic with respect to the array extent -- but if you arranged the same number of items in a 4-dimensional array, would it be any different? Obviously not. That doesn't magically make it O(n^4)).
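A small illustrative sketch of that point (the array contents are placeholders):

    # Printing every element of an x-by-x array performs exactly n = x*x
    # operations: linear in the number of elements n, quadratic in the
    # extent x. Same work, two ways of naming the input size.

    x = 1000
    grid = [[0] * x for _ in range(x)]

    operations = 0
    for row in grid:           # x iterations
        for item in row:       # x iterations each
            operations += 1    # stand-in for "print(item)"

    n = x * x
    print(operations == n)     # True: Theta(n) in elements, Theta(x^2) in extent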
What you use time complexity for is not stuff like that, anyway. What you want time complexity to tell you is an approximate idea of how a particular algorithm will change its behavior as you grow the number of inputs beyond some limit.
So what you want to know is, for example: if you do XYZ on two million items instead of one million, will it take approximately twice as long, or approximately sixteen times as long?
Time complexity analysis disregards "small details" such as how much time an actual operation takes. That tends to make the whole thing more and more academic and practically less useful on modern architectures, because constant factors (such as memory latency, bus latency, cache misses, faults, access times, etc.) play an ever-increasing role: they stay mostly the same over decades, while the actual cost per step (instruction throughput, ALU power, whatever) goes down steadily with every new computer generation.
In practice, it happens quite often that the dumb, linear, brute force approach is faster than a "better" approach with better time complexity simply because the constant factor dominates everything.
What I have done:
I measured the time spent processing 100, 1000, 10000, 100000, 1000000 items.
Measurements here: https://github.com/DimaBond174/cache_single_thread
Then I assumed that O(n) increases in proportion to n, and calculated the expected times for the remaining complexities relative to O(n).
Having time measurements for processing 100, 1000, 10000, 100000, and 1000000 items, how can we now attribute the algorithm to O(1), O(log n), O(n), O(n log n), or O(n^2)?
Let's define N as the size of one of the possible inputs of data. An algorithm can have different Big O values depending on which input you're referring to, but generally there's only one big input that you care about. Without the algorithm in question, you can only guess. However, there are some guidelines that will help you determine which class it is.
General Rule:
O(1) - the speed of the program barely changes regardless of the size of the data. To get this, a program must not have loops operating on the data in question at all.
O(log N) - the program slows down only slightly when N increases dramatically, following a logarithmic curve. To get this, each pass of a loop must discard a constant fraction of the remaining data (binary search, for example).
O(N) - the program's speed is directly proportional to the size of the data input. You get this if you perform an operation on each unit of the data. You must not have any kind of nested loops (that act on the data).
O(N log N) - the program's speed is significantly reduced by larger input. This occurs when you have an O(log N) operation NESTED in a loop that would otherwise be O(N). For example, a loop that does a binary search for each unit of data.
O(N^2) - the program will slow down to a crawl with larger input and eventually stall with large enough data. This happens when you have NESTED loops. Same as above, but this time the inner loop is O(N) instead of O(log N).
So, try to think of a looping operation as O(N) or O(log N). Then, whenever you have nesting, multiply the terms together. If the loops are NOT nested, they are not multiplied like this; two separate loops, one after the other, would simply be O(2N) = O(N), not O(N^2).
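A small Python sketch of these rules (the loops and N are made up for illustration), counting operations instead of measuring time:

    N = 1_000
    data = list(range(N))          # already sorted

    # Two separate (non-nested) loops: N + N = 2N operations -> still O(N).
    sequential_ops = 0
    for _ in data:
        sequential_ops += 1
    for _ in data:
        sequential_ops += 1

    # One loop nested inside another: N * N operations -> O(N^2).
    nested_ops = 0
    for _ in data:
        for _ in data:
            nested_ops += 1

    def binary_search_comparisons(sorted_list, target):
        """Count the comparisons of one binary search -- an O(log N) step."""
        lo, hi, comparisons = 0, len(sorted_list) - 1, 0
        while lo <= hi:
            mid = (lo + hi) // 2
            comparisons += 1
            if sorted_list[mid] == target:
                return comparisons
            elif sorted_list[mid] < target:
                lo = mid + 1
            else:
                hi = mid - 1
        return comparisons

    # An O(log N) operation nested in an O(N) loop -> O(N log N).
    nlogn_ops = sum(binary_search_comparisons(data, x) for x in data)

    print(sequential_ops)  # 2000      (2N)
    print(nested_ops)      # 1000000   (N^2)
    print(nlogn_ops)       # ~9000     (roughly N * log2(N) comparisons)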
Also remember that you may have loops under the hood, so you should think about them too. For example, if you call something like Arrays.sort(X) in Java, that is an O(N log N) operation. So if you have that inside a loop for some reason, your program is going to be a lot slower than you think.
Hope that answers your question.
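To connect this back to the timings in the question: one rough, hedged way to apply these rules is to divide each measured time by each candidate growth function and see which ratio stays most nearly constant as N grows. A sketch with made-up placeholder timings (substitute the real measurements from the repo):

    import math

    # Placeholder timings in seconds for N = 100 ... 1_000_000 items;
    # these numbers are invented for illustration only.
    timings = {100: 0.0012, 1_000: 0.011, 10_000: 0.13,
               100_000: 1.4, 1_000_000: 15.0}

    candidates = {
        "O(1)":       lambda n: 1.0,
        "O(log n)":   lambda n: math.log(n),
        "O(n)":       lambda n: float(n),
        "O(n log n)": lambda n: n * math.log(n),
        "O(n^2)":     lambda n: float(n) * n,
    }

    for name, f in candidates.items():
        # If the algorithm really is in this class, time/f(n) stays roughly
        # constant as n grows; otherwise it drifts by large factors.
        ratios = [timings[n] / f(n) for n in sorted(timings)]
        spread = max(ratios) / min(ratios)
        print(f"{name:11s} spread of time/f(n): {spread:10.1f}")

    # The class whose spread is closest to 1 is the best guess. Constant
    # factors, caches, and measurement noise make this only a rough check.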