I'm sort of confused on finding the time complexity of this recursion

# class ListNode:
#     def __init__(self, val=0, next=None):
#         self.val = val
#         self.next = next
class Solution:
    def mergeTwoLists(self, list1: Optional[ListNode], list2: Optional[ListNode]) -> Optional[ListNode]:
        if list1 is None and list2 is not None:
            return list2
        elif list1 is not None and list2 is None:
            return list1
        elif list1 is not None and list2 is not None:
            if list1.val > list2.val:
                return ListNode(list2.val, self.mergeTwoLists(list1, list2.next))
            elif list2.val >= list1.val:
                return ListNode(list1.val, self.mergeTwoLists(list1.next, list2))
        else:
            return None
I am not sure my understanding of the time complexity of this recursion is correct. I initially thought it was O(n), since I only go through each node once, but then got confused about how the recursion would affect it. I know it's a stupid question, but any tips on understanding time complexity for recursive functions are much appreciated. Also, if it is O(n), is there any benefit to doing recursion over an iterative method?

Time Complexity:
It should be O(n+m), where n is the length of the first list and m is the length of the second. Each call does a constant amount of work and removes exactly one node from one of the lists, so there are at most n+m calls before one list runs out.
Benefit to doing recursion over an iterative method?
Recursion can add clarity and sometimes reduces the time needed to write and debug code.
Note: the recursive version uses extra stack memory (one frame per call, so O(n+m) frames here), while the iterative approach does not.
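For comparison with the recursive version in the question, here is a minimal iterative sketch of the same merge. It is not from the original posts: the function name merge_two_lists_iterative and the helpers from_values and to_values are made up for illustration, and unlike the asker's code it relinks the existing nodes instead of allocating new ones. The loop runs at most n+m times and uses O(1) extra space, so it also works on lists long enough that a recursive version would run into Python's recursion limit.

```python
class ListNode:
    def __init__(self, val=0, next=None):
        self.val = val
        self.next = next


def merge_two_lists_iterative(list1, list2):
    """Merge two sorted linked lists in O(n+m) time and O(1) extra space."""
    dummy = ListNode()  # placeholder head so the loop needs no special case
    tail = dummy
    while list1 is not None and list2 is not None:
        # Each iteration moves exactly one node to the output.
        if list1.val <= list2.val:
            tail.next = list1
            list1 = list1.next
        else:
            tail.next = list2
            list2 = list2.next
        tail = tail.next
    # At most one list still has nodes; attach the remainder as-is.
    tail.next = list1 if list1 is not None else list2
    return dummy.next


def from_values(values):
    """Hypothetical helper: build a linked list from a Python list."""
    head = None
    for v in reversed(values):
        head = ListNode(v, head)
    return head


def to_values(head):
    """Hypothetical helper: collect the values of a linked list."""
    out = []
    while head is not None:
        out.append(head.val)
        head = head.next
    return out
```

For example, to_values(merge_two_lists_iterative(from_values([1, 2, 4]), from_values([1, 3, 4]))) gives [1, 1, 2, 3, 4, 4].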

Is it possible to compute the sign of a permutation in linear time?

I was just wondering if there's a way to compute the sign of a permutation in linear (or at least better than n^2) time.
For example, let's say I have an array of n numbers and I swap two elements within this array, which flips the sign of the permutation. I have a function that can compute this in n^2 time; however, it seems there might be a more efficient algorithm.
I've attached a minimal reproducible example of computing it in quadratic time:
import numpy as np
vals = np.arange(1, 6, 1)
pvals = np.arange(1, 6, 1)
pvals[0], pvals[1] = pvals[1], pvals[0]  # swap
def quadratic(vals):
    # sgn_matrix[i, j] = sign(vals[i] - vals[j])
    sgn_matrix = np.sign(np.expand_dims(vals, -1) - np.expand_dims(vals, -2))
    # keep the signs above the diagonal, use 1 everywhere else, and multiply
    return np.prod(np.tril(np.ones_like(sgn_matrix)) + np.triu(sgn_matrix, 1))
def sub_quadratic(vals):
    # algorithm quicker than quadratic time?
    raise NotImplementedError
sgn = quadratic(vals)
print(sgn)  # prints 1
psgn = quadratic(pvals)
print(psgn)  # prints -1 (because of the one transposition)
I have had a look around SO (here for example) and people keep talking about cyclic permutations, which apparently allow the sign to be computed in linear time, but it's something I'm completely unfamiliar with and I can't find much about it myself.
TL;DR Does anyone know of a method for computing the sign of a permutation in sub-quadratic time?
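Just decompose it into transpositions and check whether you needed an even or odd number of transpositions. Each swap below puts at least one value into its final position, so there are at most n-1 swaps and the whole thing runs in linear time (this assumes perm holds the values 1..n):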
def permutation_sign(perm):
    parity = 1
    perm = perm.copy()
    for i in range(len(perm)):
        while perm[i] != i+1:
            parity *= -1
            j = perm[i] - 1
            # Note: if you try to inline the j computation into the next line,
            # you'll get evaluation order bugs.
            perm[i], perm[j] = perm[j], perm[i]
    return parity
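The "cyclic permutations" idea mentioned in the question can be sketched as follows. This is an alternative to the swap-based answer above, not part of it: the function name permutation_sign_by_cycles is made up, and it assumes the same input convention (perm holds the values 1..n, each exactly once). A cycle of length k can be written as k-1 transpositions, so every even-length cycle flips the sign. Each index is visited once, so the running time is linear, and the input is not modified.

```python
def permutation_sign_by_cycles(perm):
    """Sign of a permutation of 1..n, computed from its cycle lengths."""
    n = len(perm)
    visited = [False] * n
    sign = 1
    for start in range(n):
        if visited[start]:
            continue
        # Walk the cycle containing `start` and measure its length.
        length = 0
        i = start
        while not visited[i]:
            visited[i] = True
            i = perm[i] - 1  # values are 1-based, indices are 0-based
            length += 1
        # A cycle of length k equals k-1 transpositions,
        # so only even-length cycles change the sign.
        if length % 2 == 0:
            sign = -sign
    return sign
```

With the arrays from the question, permutation_sign_by_cycles([1, 2, 3, 4, 5]) is 1 and permutation_sign_by_cycles([2, 1, 3, 4, 5]) is -1.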

Canonical Tensorflow "for loop"

What is the canonical way of running a Tensorflow "for loop"?
Specifically, suppose we have some body function which does NOT depend on the loop iteration, but must be run n times.
One might think that a good method might be to run this inside of a tf.while_loop like this:
def body(x):
    return ...
def while_body(i, x):
    return i+1, body(x)
# the condition receives every loop variable, not just the counter
i, x = tf.while_loop(lambda i, x: tf.less(i, n), while_body, [tf.constant(0), x])
In fact, that is precisely what the highest rated answer in this question suggests:
How can I run a loop with a tensor as its range? (in tensorflow)
However, the tf.while_loop docs specifically say
For correct programs, while_loop should return the same result for any parallel_iterations > 0.
If you put a counter in the body, then it seems that that condition is violated. So it seems that there must be a different way of setting up a "for loop".
Furthermore, even if there is no explicit error, doing so seems like it will create a dependency between iterations meaning that I do not think they will run in parallel.
After some investigation, it seems that the tf.while_loop idiom used above is quite common. Alternatively, one can use tf.scan:
def body( x ):
    return ...
def scan_body( previous_output, iteration ):
    return body( ... )
x = tf.scan( scan_body, tf.range(n), initializer = [x] )
although I have no idea if one is preferable from a performance point of view. Note in the above that we have to wrap the body function to accept the previous output.

efficient way of doing matrix addition in numpy

I have many matrices to add. Let's say that the matrices are [M1, M2..., M_n]. Then, a simple way is
X = np.zeros_like(matrices[0])  # np.zeros() needs a shape; match the first matrix
for M in matrices:
    X += M
In the operation X += M, does Python allocate new memory for X every time += is executed? If that's the case, it seems inefficient. Is there any way of doing an in-place operation without allocating new memory for X?
This works but is not faster on my machine:
numpy.sum(matrices, axis=0)
Unless you get MemoryError, trying to second guess memory usage in numpy is not worth the effort. Leave that to the developers who know the compiled code.
But we can perform some time tests - that's what really matters, doesn't it?
I'll test adding a good size array 100 times.
In [479]: M=np.ones((1000,1000))
Your iterative approach with +=
In [480]: %%timeit
...: X=np.zeros_like(M)
...: for _ in range(100): X+=M
...:
1 loop, best of 3: 627 ms per loop
Or make an array of size (100, 1000, 1000) and apply np.sum across the first axis.
In [481]: timeit np.sum(np.array([M for _ in range(100)]),axis=0)
1 loop, best of 3: 1.54 s per loop
and using the np.add ufunc. With reduce we can apply it sequentially to all values in a list.
In [482]: timeit np.add.reduce([M for _ in range(100)])
1 loop, best of 3: 1.53 s per loop
The np.sum case gives me a MemoryError if I use range(1000). I don't have enough memory to hold a (1000,1000,1000) array. Same for the add.reduce, which builds an array from the list.
What += does under the covers is normally hidden, and of no concern to us - usually. But for a peek under the covers look at ufunc.at: https://docs.scipy.org/doc/numpy/reference/generated/numpy.ufunc.at.html#numpy.ufunc.at
Performs unbuffered in place operation on operand ‘a’ for elements specified by ‘indices’. For addition ufunc, this method is equivalent to a[indices] += b, except that results are accumulated for elements that are indexed more than once.
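So X+=M does not allocate a new X: it is equivalent to np.add(X, M, out=X), which writes the result into X's existing memory. NumPy may use a temporary buffer internally (for example when the dtypes differ and values have to be cast), but final memory usage does not change.
Any buffer creation and copying that does happen is done in fast C code.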
np.add.at was added to deal with the case where that buffered action creates some problems (duplicate indices).
So it avoids that temporary buffer - but at a considerable speed cost. It's probably the added indexing capability that slows it down. (There may be a fairer add.at test; but it certainly doesn't help in this case.)
In [491]: %%timeit
...: X=np.zeros_like(M)
...: for _ in range(100): np.add.at(X,(slice(None),slice(None)),M)
1 loop, best of 3: 19.8 s per loop
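One way to check the in-place question directly, rather than infer it from timings, is to compare the address of X's data buffer before and after the addition. The sketch below is an illustration, not part of the original answer: the helper name sum_in_place is made up, and it assumes all matrices share one shape and dtype. It uses np.add with the out argument, which is the explicit spelling of X += M.

```python
import numpy as np


def sum_in_place(matrices):
    """Add up same-shaped arrays, accumulating into one preallocated array."""
    total = np.zeros_like(matrices[0])
    for m in matrices:
        # Explicit form of `total += m`: the result is written into `total`.
        np.add(total, m, out=total)
    return total


M = np.ones((4, 4))
X = np.zeros_like(M)

# __array_interface__["data"] is a (address, read_only_flag) tuple.
address_before = X.__array_interface__["data"][0]
X += M
address_after = X.__array_interface__["data"][0]

print(address_before == address_after)  # same buffer, so no reallocation of X
```

By contrast, X = X + M would bind the name X to a newly allocated array.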

What makes dynamic_rnn faster to compile?

I have a complicated network with many repeated RNN steps. Compiling takes a long time (30+ minutes, mostly stuck at the gradient step) and I found that this issue might be related, which mentions dynamic_rnn as a much faster way to compile:
Looking over dynamic_rnn, I then reformatted my network to include a while_loop, like so:
# input: tensor with 1000 time steps
def body(i, prev_state):
    inp = tf.slice(input, i, 1)
    new_state = cell(tf.squeeze(inp), prev_state)  # Includes scope reuse
    return [tf.add(i, tf.constant(1)), new_state]
def cond(i, prev_state):
    # the condition receives every loop variable, even if it only uses i
    return some_cond(i)
tf.while_loop(cond, body, [tf.constant(0), initial_state])
But this didn't seem to help. What besides simply putting the cell call in a loop makes dynamic_rnn so much faster to compile?

Haskell tail recursion predictability

One of the biggest issues I have with Haskell is being able to (correctly) predict the performance of Haskell code. While I have some more difficult problems, I realize I have almost no understanding of even the simple ones.
Take something simple like this:
count [] = 0
count (x:xs) = 1 + count xs
As I understand it, this isn't strictly a tail call (it should need to keep 1 on the stack), so looking at this definition -- what can I reason about it? A count function should obviously have O(1) space requirements, but does this one? And can I be guaranteed it will or won't?
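If you want to reason more easily about recursive functions, use higher-order functions with known time and space complexity. With foldl or foldr you cannot count on O(1) space here: the lazy foldl builds up a chain of unevaluated thunks, and foldr with a strict operator such as (+) needs stack proportional to the list length. But if you use foldl' from Data.List as in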
count = foldl' (\acc x -> acc + 1) 0
your function will be O(1) in space, since foldl' is tail recursive and forces the accumulator at each step.
HTH Chris