What is the worst case time complexity of median of medians quicksort?

What is the worst-case time complexity of quicksort when the pivot is chosen as the median of medians, which takes O(n) time to find?

According to Wikipedia:
The approximate median-selection algorithm can also be used as a pivot strategy in quicksort, yielding an optimal algorithm, with worst-case complexity O(n log n).
This is because the median-of-medians pivot is guaranteed to lie between roughly the 30th and 70th percentiles of the input, so every partition is reasonably balanced. That rules out the degenerate splits that naive quicksort suffers on, for example, an already sorted array.
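To make that concrete, here is a minimal Python sketch (my own illustration, not code from the question) of quicksort with a median-of-medians pivot. Because select() runs in O(n) worst case and the pivot it returns is the true median, every partition splits the input roughly in half, so the sort is O(n log n) in the worst case:

def select(arr, k):
    # Return the k-th smallest element of arr (0-indexed), worst case O(n).
    if len(arr) <= 5:
        return sorted(arr)[k]
    # Medians of groups of five.
    medians = [sorted(arr[i:i + 5])[len(arr[i:i + 5]) // 2]
               for i in range(0, len(arr), 5)]
    pivot = select(medians, len(medians) // 2)  # median of medians
    lows = [x for x in arr if x < pivot]
    pivots = [x for x in arr if x == pivot]
    if k < len(lows):
        return select(lows, k)
    if k < len(lows) + len(pivots):
        return pivot
    highs = [x for x in arr if x > pivot]
    return select(highs, k - len(lows) - len(pivots))

def quicksort(arr):
    # Quicksort with the exact median as pivot: worst case O(n log n).
    if len(arr) <= 1:
        return arr
    pivot = select(arr, len(arr) // 2)  # O(n) pivot selection
    return (quicksort([x for x in arr if x < pivot])
            + [x for x in arr if x == pivot]
            + quicksort([x for x in arr if x > pivot]))

print(quicksort([5, 3, 8, 1, 9, 2]))  # [1, 2, 3, 5, 8, 9]

In practice the constant factor of the O(n) pivot selection is large, which is why randomized or median-of-three pivots are usually preferred even though they lose this worst-case guarantee.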

Related

kNN-DTW time complexity

I found from various online sources that the time complexity for DTW is quadratic. On the other hand, I also found that standard kNN has linear time complexity. However, when pairing them together, does kNN-DTW have quadratic or cubic time?
In essence, does the time complexity of kNN solely depend on the metric used? I have not found any clear answer for this.
You need to be careful here. Say you have n time series of length l in your 'training' set (let's call it that, even though kNN does not really involve training). Computing the DTW distance between a pair of time series has an asymptotic complexity of O(l * m), where m is your maximum warping window; since m <= l, O(l^2) also holds. (There are more efficient formulations, but I don't think they are actually faster in practice in most cases, see here.) Classifying a time series with kNN requires computing the distance between that series and every series in the training set, i.e. n comparisons, which is linear in n.
So your final complexity is O(l * m * n), or O(l^2 * n). In words: the complexity is quadratic in the time series length and linear in the number of training examples.
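A minimal sketch of how those costs compose (my own illustration; the function names, the squared point-wise distance, and the banded warping window are assumptions, not anything from the question):

import math

def dtw(a, b, window=None):
    # DTW distance between two 1-D sequences.
    # O(l * m) with a warping window of width m, O(l^2) without one.
    la, lb = len(a), len(b)
    w = max(window, abs(la - lb)) if window is not None else max(la, lb)
    INF = float('inf')
    cost = [[INF] * (lb + 1) for _ in range(la + 1)]
    cost[0][0] = 0.0
    for i in range(1, la + 1):
        for j in range(max(1, i - w), min(lb, i + w) + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            cost[i][j] = d + min(cost[i - 1][j],      # advance in a
                                 cost[i][j - 1],      # advance in b
                                 cost[i - 1][j - 1])  # advance in both
    return math.sqrt(cost[la][lb])

def knn_dtw_classify(query, train_series, train_labels, window=None):
    # 1-NN with DTW: n distance computations, each O(l * m),
    # so classifying one query is O(n * l * m) overall.
    best = min(range(len(train_series)),
               key=lambda i: dtw(query, train_series[i], window))
    return train_labels[best]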

How to prove the time complexity of quicksort is O(n log n)

I don't understand the proof given in my textbook that the time complexity of Quicksort is O(n log n). Can anyone explain how to prove it?
Typical arguments that Quicksort's average-case running time is O(n log n) go like this: in the average case, each partition operation divides the input into two roughly equal-sized parts. The partition operations take O(n) time, so each "level" of the recursive Quicksort costs O(n) in total (across all partitions at that level), and the number of levels is the number of times you can halve n, which is O(log n).
You can make the above argument rigorous in various ways, depending on how rigorous you want it to be and on the background and mathematical maturity of your audience. A typical way to formalize it is to express the number of comparisons performed by an average-case Quicksort call as a recurrence relation like
T(n) = O(n) + 2 * T(n/2)
which can be proved to be O(n log n) via the Master Theorem or other means.
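For intuition, you can also unroll the recurrence directly (assuming n is a power of two and writing the O(n) partition cost as c * n):

T(n) = c * n + 2 * T(n/2)
     = c * n + 2 * (c * n/2 + 2 * T(n/4)) = 2 * c * n + 4 * T(n/4)
     = 2 * c * n + 4 * (c * n/4 + 2 * T(n/8)) = 3 * c * n + 8 * T(n/8)
     ...
     = c * n * log2(n) + n * T(1)
     = O(n log n)

Each of the log2(n) levels contributes c * n work, which is exactly the informal "O(n) per level times O(log n) levels" argument above.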

Space Complexity - Dropping the non-dominant terms

I know we should drop the non-dominant terms when calculating the time complexity of an algorithm. I am wondering whether we should also drop them when calculating space complexity. For example, if I have a string of N letters, I'd like to:
construct a list of letters from this string -> Space: O(N);
sort this list -> Worst-case space complexity for Timsort (I use Python): O(N).
In this case, would the entire solution take O(N) + O(N) space or just O(N)?
Thank you.
Welcome to SO!
First of all, I think you misunderstand complexity: it is defined up to constant factors and describes only how resource usage grows with the input size N. Thus, O(N) + O(N) is the same complexity as O(N).
Thus, your question might have been:
If I construct a list of letters using an algorithm with O(N) space complexity, followed by a sort algorithm with O(N) space complexity, would the entire solution use twice as much space?
But this question cannot be answered from the complexities alone, since a complexity does not tell you how much space is actually used.
A well-known example from the time domain: a brute-force sorting algorithm such as BubbleSort, with time complexity O(N^2), can be faster on small data sets than a very good sorting algorithm such as QuickSort, with average time complexity O(N log N).
EDIT:
There is no contradiction in the fact that one can compute a space complexity and yet it does not tell you how much space is actually used.
A simple example:
Say that, for a certain problem, algorithm 1 has linear space complexity O(n) and algorithm 2 has quadratic space complexity O(n^2).
One could thus assume (but this is wrong) that algorithm 1 would always use less space than algorithm 2.
First, it is clear that for large enough n algorithm 2 will use more space than algorithm 1, because n^2 grows faster than n.
However, consider the case where n is small, say n = 1, and algorithm 1 is implemented on a machine that stores each value as a double (64 bits), whereas algorithm 2 is implemented on a machine that stores each value as a byte (8 bits). Then, obviously, the O(n^2) algorithm uses less space than the O(n) algorithm.
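Coming back to the two concrete steps from the question, here is a minimal sketch (the helper name is mine, for illustration) of why they add up to O(N) rather than "O(2N)":

def sort_letters(s):
    letters = list(s)  # auxiliary list of N letters: O(N) extra space
    letters.sort()     # Timsort can use up to O(N) auxiliary space
    return letters

# Each step needs at most linear auxiliary space, so the total is
# O(N) + O(N) = O(N); the constant factor does not change the complexity.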

Can I calculate the median of numbers in less than O(n)?

I want to calculate the median of a running sequence in less than O(n), preferably in Python. Thanks.
If you want the exact median of an unstructured array, there is no way to do it faster than O(n), because you have to look at every number: until you have seen the last element, you cannot know whether it changes the median.
Some algorithms use a heuristic instead: you don't get the exact median, but you get a value that is, with some probability, close to the real one.
For example, quicksort benefits from a pivot close to the median, but computing the exact median at every partition would add a lot of overhead (the whole point of quicksort is its O(n log n) average complexity). In such a case you can, for example, pick 10 random numbers and take their median; the real median will probably be close to that value.
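A minimal sketch of that sampling idea (the function name and the default sample size are my own choices, not from the answer):

import random
import statistics

def approx_median(values, sample_size=10):
    # Estimate the median from a small random sample of the values;
    # this trades accuracy for speed compared with an exact O(n) selection.
    sample = random.sample(values, min(sample_size, len(values)))
    return statistics.median(sample)

This only approximates the median; as noted above, the exact median of an unstructured array still requires looking at all n values.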

Quicksort omega-notation

Best case for Quicksort is n log n, but everyone uses Big-O notation to describe the best case as O(n log n). From my understanding of the notations, Quicksort has Big-Omega(n log n) and O(n^2). Is this correct, or am I misunderstanding Big-Omega notation?
Big-O and Big-Omega are ways of describing functions, not algorithms. Saying Quicksort is O(n^2) is ambiguous because you're not saying what property of the algorithm you're describing.
Algorithms can have best-case time complexities, and worst-case time complexities. These are the time complexities of the algorithm if the best or worst performing inputs are used for each input size.
This is different from Big-O and Big-Omega, which describe upper and lower bounds on a function.
The best-case and worst-case time complexities are themselves functions of the input size, and each of them can have its own upper and lower bounds.
For example, if you knew the best case wasn't any worse than n log n, then you could say the best-case time complexity is O(n log n). If you knew it was exactly n log n, it would be more precise to say Theta(n log n).
You have the details incorrect. In common usage, Big-O is taken to describe the worst case. QuickSort has an average case of O(n log n) and a worst case of O(n^2). Going by that usage, the Big-O of QuickSort would be n^2, but the chances of randomly hitting that performance are practically zero (unless you run a naive, first-element-pivot QuickSort on an already sorted list), so QuickSort is usually quoted as O(n log n) even though that is strictly its average case.
Big-Omega, on the other hand, is a lower bound: "takes at least this long". That means QuickSort is Big-Omega(n log n), and n^2 has nothing to do with QuickSort and Big-Omega.
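For reference, one way to see where the n^2 in the worst case comes from (assuming maximally unbalanced splits, e.g. a naive first-element pivot on already sorted input): the partition sizes shrink by one at each level, so the total partitioning work is

n + (n - 1) + (n - 2) + ... + 1 = n * (n + 1) / 2 = O(n^2)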