Haskell tail recursion predictability - optimization

One of the biggest issues I have with Haskell is being able to (correctly) predict the performance of Haskell code. I have some more difficult problems too, but I realize I have almost no understanding even of simple cases.
Take something simple like this:
count [] = 0
count (x:xs) = 1 + count xs
As I understand it, this isn't strictly a tail call (it should need to keep 1 on the stack), so looking at this definition -- what can I reason about it? A count function should obviously have O(1) space requirements, but does this one? And can I be guaranteed it will or won't?

If you want to reason more easily about recursive functions, use higher-order functions with known time and space complexity. With foldl or foldr you cannot count on O(1) space: lazy foldl builds a chain of unevaluated thunks, and foldr with a strict operator such as (+) needs a stack frame per element. But if you use foldl' from Data.List as in
count = foldl' (\acc x -> acc + 1) 0
your function will be O(1) in space, as foldl' is tail recursive and forces the accumulator at every step.
HTH Chris
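As a rough analogy only, here is a Python sketch of the two shapes of count discussed above. Python is strict and has no thunks, so this only illustrates the "pending 1 +" part of the question, not what GHC actually does. The names count_rec and count_fold are made up for this illustration.

```python
from functools import reduce

def count_rec(xs, i=0):
    # Same shape as `count (x:xs) = 1 + count xs`: the addition is still
    # pending while the recursive call runs, so every element costs a frame.
    if i == len(xs):
        return 0
    return 1 + count_rec(xs, i + 1)

def count_fold(xs):
    # Same shape as the foldl' version: the accumulator is updated eagerly,
    # so only one running total is kept at any time.
    return reduce(lambda acc, _x: acc + 1, xs, 0)

print(count_rec([10, 20, 30]))      # 3
print(count_fold(range(100_000)))   # 100000
```

Calling count_rec on a list longer than Python's recursion limit raises RecursionError, which is the strict-language counterpart of the O(n) stack use.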

Related

Time Complexity of Algorithms With Addition [duplicate]

I'm taking a course about big O notation on Coursera. I watched a video about the big O of a Fibonacci algorithm (the non-recursive method), which goes like this:
Operation                       Runtime
create an array F[0..n]         O(n)
F[0] <-- 0                      O(1)
F[1] <-- 1                      O(1)
for i from 2 to n:              Loop O(n) times
    F[i] <-- F[i-1] + F[i-2]    O(n)  => I don't understand this line, isn't it O(1)?
return F[n]                     O(1)
Total: O(n)+O(1)+O(1)+O(n)*O(n)+O(1) = O(n^2)
I understand every part except F[i] <-- F[i-1] + F[i-2], which is listed as O(n). Isn't it O(1), since it's just a simple addition? Is it the same as F[i] <-- 1+1?
The explanation they give me is: "But the addition is a bit worse. And normally additions are constant time. But these are large numbers. Remember, the nth Fibonacci number has about n over 5 digits to it, they're very big, and they often won't fit in the machine word."
"Now if you think about what happens if you add two very big numbers together, how long does that take? Well, you sort of add the tens digit and you carry, and you add the hundreds digit and you carry, and add the thousands digit, you carry and so on and so forth. And you sort of have to do work for each digits place.
And so the amount of work that you do should be proportional to the number of digits. And in this case, the number of digits is proportional to n, so this should take O(n) time to run that line of code".
I'm still a bit confused. Does it mean a large number affects time complexity too? For example, a = n+1 is O(1) while a = n^50+n^50 isn't O(1) anymore?
Video link for anyone who needed more information (4:56 to 6:26)
Big-O is just a notation for keeping track of orders of magnitude. But when we apply that in algorithms, we have to remember "orders of magnitude of WHAT"? In this case it is "time spent".
CPUs are set up to execute basic arithmetic on basic arithmetic types in constant time. For most purposes, we can assume we are dealing with those basic types.
However if n is a very large positive integer, we can't assume that. A very large integer will need O(log(n)) bits to represent. Which, whether we store it as bits, bytes, etc, will need an array of O(log(n)) things to store. (We would need fewer bytes than bits, but that is just a constant factor.) And when we do a calculation, we have to think about what we will actually do with that array.
Now suppose that we're trying to calculate n+m. We're going to need to generate a result of size O(log(n+m)), which must take at least that time to allocate. Luckily the grade school method of long addition where you add digits and keep track of carrying, can be adapted for big integer libraries and is O(log(n+m)) to track.
So when you're looking at addition, the log of the size of the answer is what matters. Since log(50^n) = n * log(50), operations with 50^n are at least O(n). (Getting 50^n might take longer...) By contrast, log(n^50) = 50 * log(n), so adding n^50 + n^50 takes time O(log(n)), and so does calculating n+1.
Now in the case of the Fibonacci sequence, F(n) is roughly φ^n where φ = (1 + sqrt(5))/2 so log(F(n)) = O(n).
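The linear growth in digits can be checked with a small sketch. It relies on Python's built-in arbitrary-precision integers; the function name fib and the sample values of n are just choices for this illustration.

```python
def fib(n):
    # Iterative Fibonacci using Python's arbitrary-precision integers.
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b   # this addition works on ever longer numbers
    return a

for n in (100, 1000, 10000):
    digits = len(str(fib(n)))
    print(n, digits, digits / n)   # the ratio settles near 0.209, about n/5
```

Since the number of digits grows in proportion to n, each addition inside the loop costs O(n) rather than O(1), which is where the O(n^2) total comes from.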

Can I represent time-complexity as a summation (complexity of elements of different length)

Let's say I have to iterate over every character in an array of strings, in which every string has a different length, so arr[0].length != arr[1].length and so on, as this for example:
# prints every char of every string in the array
for s in arr:
    for c in s:
        print(c)
How should the time complexity of an algorithm of this nature be represented? As a summation of the lengths of all the elements in the array? Or just as O(N*M), taking N as the number of elements and M as the maximum string length, which is an upper bound on the work?
There is a precise mathematical theory called complexity theory which answers your question and many more. In complexity theory, we have what is called a Turing machine, which is a type of computer. The time complexity of a Turing machine doing a computation is then defined as the function f defined on natural numbers such that f(n) is the worst-case running time of the machine on inputs of length n. In your case it just needs to copy its input somewhere else, which clearly has O(n) time complexity (n here is the combined length of your array). Since NM is at least n, your Turing machine doing the algorithm you described will not run longer than some constant times NM, but it may halt sooner due to irregularities in the lengths of the elements of the array.
If you are interested in learning about complexity theory, I recommend the book Introduction to the Theory of Computation by Michael Sipser, which explains these concepts from scratch.
There are many ways you could do this. Your bound of O(NM) is a conservative upper bound. You could also define a parameter L indicating the total length of all the strings and say that the runtime is Θ(N + L), which is essentially your sum idea made a bit cleaner by assigning a name to the summation. That’s a more precise bound that more clearly indicates where the work is being done.
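A small sketch of how the two bounds compare on concrete data. The sample array and the helper name count_char_visits are made up for illustration; N, M and L follow the definitions used above.

```python
def count_char_visits(arr):
    # Count how many times the inner loop body runs.
    visits = 0
    for s in arr:
        for _c in s:
            visits += 1
    return visits

arr = ["a", "bcd", "", "efghij"]    # made-up sample data
N = len(arr)                        # number of strings
M = max(len(s) for s in arr)        # length of the longest string
L = sum(len(s) for s in arr)        # total length of all strings

print(count_char_visits(arr), L, N * M)   # 10 10 24
```

The inner loop runs exactly L times, while N*M overestimates the work whenever the strings differ in length.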

How can I compare the time-complexity O(n^2) with O(N+log(M))?

My Lua function:
for y=userPosY+radius,userPosY-radius,-1 do
    for x=userPosX-radius,userPosX+radius,1 do
        local oneNeighborFound = redis.call('lrange', userPosZone .. x .. y, '0', '0')
        if next(oneNeighborFound) ~= nil then
            table.insert(neighborsFoundInPosition, userPosZone .. x .. y)
            neighborsFoundInPositionCount = neighborsFoundInPositionCount + 1
        end
    end
end
Which leads to this formula, with n being the radius: (2n+1)^2
If I understand it correctly, that would be a time complexity of O(n^2).
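The (2n+1)^2 count can be confirmed with a short sketch that mirrors the loop bounds of the Lua code, using coordinates relative to the user's position. The function name cells_scanned is hypothetical and no Redis call is made.

```python
def cells_scanned(radius):
    # Same loop bounds as the Lua code, relative to the user's position:
    # y runs from +radius down to -radius, x from -radius up to +radius.
    count = 0
    for y in range(radius, -radius - 1, -1):
        for x in range(-radius, radius + 1):
            count += 1   # one redis.call per cell in the original
    return count

for r in (1, 2, 10):
    print(r, cells_scanned(r), (2 * r + 1) ** 2)
```

Both columns agree, so the number of lookups depends only on the radius, not on how many items are stored.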
How can I compare this to the time complexity of the GEORADIUS (Redis) with O(N+log(M))? https://redis.io/commands/GEORADIUS
Time complexity: O(N+log(M)) where N is the number of elements inside the bounding box of the circular area delimited by center and radius and M is the number of items inside the index.
My time complexity does not have a M. I do not know how many items are in the index (M) because I do not need to know that. My index changes often, almost with every request and can be large.
Which time complexity is better, and when?
Assuming N and M were independent variables, I would treat O(N + log M) the same way you treat O(N^3 - 7N^2 - 12N + 42): the latter becomes O(N^3) simply because that's the term that has most effect on the outcome.
This is especially true as time complexity analysis is not really a case of considering runtime. Runtime has to take into account the lesser terms for specific limitations of N. For example, if your algorithm runtime can be expressed as runtime = N^2 + 9999999999N, and N is always in the range [1, 4], it's the second term that's more important, not the first.
It's better to think of complexity analysis as what happens as N approaches infinity. With the O(N + log M) one, think about what happens when you:
double N?
double M?
The first has a much greater impact so I would simply convert the complexity to O(N).
However, you'll hopefully have noticed the use of the word "independent" in my first paragraph. The only sticking point to my suggestion would be if M was actually some function of N, in which case it may become the more important term.
Any function that reversed the impact of the log M would do this, such as the equality M = 10^(10^(10^N)).
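The doubling thought experiment can be tried out with a toy cost model. This is only a sketch: the function steps and its unit constants are assumptions for illustration, not how GEORADIUS is actually implemented.

```python
import math

def steps(n, m):
    # Hypothetical cost model for O(N + log M), with all constants set to 1.
    return n + math.log2(m)

base = steps(1_000, 1_000_000)
print(steps(2_000, 1_000_000) - base)   # doubling N adds 1000 steps
print(steps(1_000, 2_000_000) - base)   # doubling M adds about 1 step
```

Under this model the size of the index barely matters as long as M is independent of N.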

Median of Medians using blocks of 3 - why is it not linear?

I understand why, in the worst case, the median-of-medians algorithm with blocks of size three gives (with T the running time of the algorithm) a recurrence relation of
T(n) = T(2n / 3) + T(n / 3) + O(n)
The Wikipedia article for the median-of-medians algorithm says that with blocks of size three the runtime is not O(n) because it still needs to check all n elements. I don't quite understand this explanation, and in my homework it says I need to show it by induction.
How would I show that median-of-medians takes time Ω(n log n) in this case?
Since this is a homework problem I'm going to let you figure out a rigorous proof of this result on your own, but it might be helpful to think about this one by looking at the shape of the recursion tree, which will be something like this:
              n                 Total work: n
      2n/3         n/3          Total work: n
  4n/9   2n/9   2n/9   n/9      Total work: n
Essentially, each node's children collectively will do the exact same amount of work as the node itself, so if you sum up the work done across the layers, you should see roughly linear work done per level. It won't be exactly linear work per level because eventually the smaller call starts to bottom out, but for the top layers you'll see this pattern hold.
You can formalize this by induction by guessing that the runtime is something of the form cn log n, possibly with some lower-order terms added in, but (IMHO) it's more important and instructive to see where the runtime comes from than it is to be able to prove it inductively.
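The fractions in T(2n/3) and T(n/3) add up to 1, so each level of the recursion still covers all n elements. The Master theorem does not apply directly, because it requires all subproblems to have the same size (and treating this as a = 1, b = 1 would give the undefined log_(1)(1)). The Akra-Bazzi method, which generalizes it, does apply: solving (2/3)^p + (1/3)^p = 1 gives p = 1, and with f(n) = Theta(n) the result is T(n) = Theta(n^p * (1 + integral from 1 to n of f(u)/u^(p+1) du)) = Theta(n * (1 + ln(n))) = Theta(n*log(n)), the same form as Case 2 of the Master theorem.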
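The recurrence from the question can also be evaluated numerically as a sanity check. This sketch makes a few assumptions that are not in the original: integer division for the subproblem sizes, exactly n units of work per call, and T(n) = n for n <= 1.

```python
from functools import lru_cache
import math

@lru_cache(maxsize=None)
def T(n):
    # T(n) = T(2n/3) + T(n/3) + n, with integer division
    # and the assumed base case T(n) = n for n <= 1.
    if n <= 1:
        return n
    return T(2 * n // 3) + T(n // 3) + n

for n in (10**3, 10**4, 10**5, 10**6):
    print(n, T(n) / n, T(n) / (n * math.log2(n)))
```

If the runtime were linear, T(n)/n would level off. Here it keeps growing, while T(n)/(n log n) should stay within a constant range, which is consistent with the n log n guess. It is evidence, not a proof.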

Practical difference between O(n) and O(1 + n)?

Isn't O(n) an improvement over O(1 + n)?
This is my conception of the difference:
O(n):
for i=0 to n do ; print i ;
O(1 + n):
a = 1;
for i=0 to n do ; print i+a ;
... which would just reduce to O(n) right?
If the target time complexity is O(1 + n), but I have a solution in O(n),
does this mean I'm doing something wrong?
Thanks.
O(1+n) and O(n) are mathematically identical, as you can straightforwardly prove from the formal definition or using the standard rule that O( a(n) + b(n) ) is equal to the bigger of O(a(n)) and O(b(n)).
In practice, of course, if you do n+1 things it'll (usually, dependent on compiler optimizations/etc) take longer than if you only do n things. But big-O notation is the wrong tool to talk about those differences, because it explicitly throws away differences like that.
It's not an improvement because Big-O doesn't describe the exact running time of your algorithm but rather its growth rate. Big-O therefore describes a class of functions, not a single function. O(n^2) doesn't mean that your algorithm will run in 4 operations for an input of size 2; it means that if you were to plot the running time of your application as a function of n, it would be bounded above by c*n^2 for all n starting at some n0. This is nice because we know how the running time scales as the input grows, but we don't really know exactly how fast it will be. Why use the c? Because, as I said, we don't care about exact numbers but about the shape of the function: when we multiply by a constant factor, the shape stays the same.
Isn't O(n) an improvement over O(1 + n)?
No, it is not. Asymptotically these two are identical. In fact, O(n) is identical to O(n+k) where k is any constant value.
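The two inequalities behind that statement can be spot-checked for a range of n. A finite check is not a proof, but both inequalities clearly hold for every n >= 1. The helper name same_class_up_to is made up for this illustration.

```python
# Check the two inequalities behind O(n) = O(1 + n) for n >= 1:
#   n     <= 1 * (n + 1)   so n is O(1 + n)
#   n + 1 <= 2 * n         so 1 + n is O(n), with c = 2 and n0 = 1
def same_class_up_to(limit):
    return all(n <= n + 1 and n + 1 <= 2 * n for n in range(1, limit + 1))

print(same_class_up_to(10_000))   # True
```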