What is an efficient to find if a given sub sequence exists in an infinite sequence? - sequence

There is an infinite sequence b, such that b[i] = 1 , if the binary representation of i has odd number of 1s, and b[i] = 0 , otherwise.
Given a sub sequence of 0s and 1s, I am required to find the starting index of a match, if one exists or state that a match doesn't exist if it doesn't.
What is an efficient way to do it?

Related

How can I improve Insertionsort by the following argument ? The correct answer is b. Can someone CLEARLY explain every answer?

A person claims that they can improve InsertionSort by the following argument. In the innermost loop of InsertionSort, instead of looping over all entries in the already sorted array in order to insert the j’th observed element, simply perform BinarySearch in order to sandwich the j’th element in its correct position in the list A[1, ... , j−1]. This person claims that their resulting insertion sort is asymptotically as good as mergesort in the worst case scenario. True or False and why? Circle the one correct answer from the below:
a. True: In this version, the while loop will iterate log(n), but in each such iteration elements in the left side of the list have to be shifted to allow room for the key to propagate downwards across the median elements and so this shift will still require log(n) in the worst case scenario. Adding up, Insertion Sort will significantly improve in this case to continue to require n log(n) in the worst case scenario like mergesort.
b. False: In this version, the while loop will iterate log(n), but in each such iteration elements in the left side of the list have to be shifted to allow room for the key to propagate downwards and so this shift will still require n in the worst case scenario. Adding up, Insertion Sort will continue to require n² in the worst case scenario which is orders of magnitude worse than mergesort.
c. False: In this version, the while loop will iterate n, but in each such iteration elements in the left side of the list have to be shifted to allow room for the key to propagate downwards and so this shift will still require log(n) in the worst case scenario. Adding up, Insertion Sort will continue to require n log(n) in the worst case scenario which is orders of magnitude worse than mergesort.
d. True: In this version, the while loop will iterate log(n), but in each such iteration elements in the left side of the list have to be shifted to allow room for the key to propagate downwards and so this shift will still require n in the worst case scenario. Adding up, Insertion Sort will continue to require n log(n) in the worst case scenario which is orders of magnitude worse than mergesort.
b is correct, with some assumptions about compiler optimizations.
Consider a reverse sorted array,
8 7 6 5 4 3 2 1
and that insertion sort is half done so it is
5 6 7 8 4 3 2 1
The next step:
normal insertion sort sequence assuming most recent value read kept in register:
t = a[4] = 4 1 read
compare t and a[3] 1 read
a[4] = a[3] = 8 1 write
compare t and a[2] 1 read
a[3] = a[2] = 7 1 write
compare t and a[1] 1 read
a[2] = a[1] = 6 1 write
compare t and a[0] 1 read
a[1] = a[0] = 5 1 write
a[0] = t = 4 1 write
---------------
5 read 5 write
binary search
t = a[4] 1 read
compare t and a[1] 1 read
compare t and a[0] 1 read
a[4] = a[3] 1 read 1 write
a[3] = a[2] 1 read 1 write
a[2] = a[1] 1 read 1 write
a[1] = a[0] 1 read 1 write
a[0] = t 1 write
----------------
7 read 5 write
If a compiler re-read data with normal insertion sort it would be
9 read 5 write
In which case the binary search would save some time.
The expected answer to this question is b), but the explanation is not precise enough:
locating the position where to insert the j-th element indeed requires log(j) comparisons instead of j comparisons for regular Insertion Sort.
inserting the elements requires j element moves in the worst case for both implementations (reverse sorted array).
Summing these over the whole array produces:
n log(n) comparisons for this modified Insertion Sort idea in all cases vs: n2 comparisons in the worst case (already sorted array) for the classic implementation.
n2 element moves in the worst case in both implementations (reverse sorted array).
note that in the classic implementation the sum of the number of comparisons and element moves is constant.
Merge Sort on the other hand uses approximately n log(n) comparisons and n log(n) element moves in all cases.
Therefore the claim the resulting insertion sort is asymptotically as good as mergesort in the worst case scenario is False, indeed because the modified Insertion Sort method still performs n2 element moves in the worst case, which is asymptotically much worse than n log(n) moves.
Note however that depending on the relative cost of comparisons and element moves, the performance of this modified Insertion Sort approach may be much better than the classic implementation, for example sorting an array of string pointers containing URLs to the same site, the cost of comparing strings with a long initial substring is much greater than moving a single pointer.

How do I calculate the sum efficiently?

Given an integer n such that (1<=n<=10^18)
We need to calculate f(1)+f(2)+f(3)+f(4)+....+f(n).
f(x) is given as :-
Say, x = 1112222333,
then f(x)=1002000300.
Whenever we see a contiguous subsequence of same numbers, we replace it with the first number and zeroes all behind it.
Formally, f(x) = Sum over all (first element of the contiguous subsequence * 10^i ), where i is the index of first element from left of a particular contiguous subsequence.
f(x)=1*10^9 + 2*10^6 + 3*10^2 = 1002000300.
In, x=1112222333,
Element at index '9':-1
and so on...
We follow zero based indexing :-)
For, x=1234.
Element at index-'0':-4,element at index -'1':3,element at index '2':-2,element at index 3:-1
How to calculate f(1)+f(2)+f(3)+....+f(n)?
I want to generate an algorithm which calculates this sum efficiently.
There is nothing to calculate.
Multiplying each position in the array od numbers will yeild thebsame number.
So all you want to do is end up with 0s on a repeated number
IE lets populate some static values in an array in psuedo code
$As[1]='0'
$As[2]='00'
$As[3]='000'
...etc
$As[18]='000000000000000000'```
these are the "results" of 10^index
Given a value n of `1234`
```1&000 + 2&00 +3 & 0 + 4```
Results in `1234`
So, if you are putting this on a chip, then probably your most efficient method is to do a bitwise XOR between each register and the next up the line as a single operation
Then you will have 0s in all the spots you care about, and just retrive the values in the registers with a 1
In code, I think it would be most efficient to do the following
```$n = arbitrary value 11223334
$x=$n*10
$zeros=($x-$n)/10```
Okay yeah we can just do bit shifting to get a value like 100200300400 etc.
To approach this problem, it could help to begin with one digit numbers and see what sum you get.
I mean like this:
Let's say, we define , then we have:
F(1)= 45 # =10*9/2 by Euler's sum formula
F(2)= F(1)*9 + F(1)*100 # F(1)*9 is the part that comes from the last digit
# because for each of the 10 possible digits in the
# first position, we have 9 digits in the last
# because both can't be equal and so one out of ten
# becomse zero. F(1)*100 comes from the leading digit
# which is multiplied by 100 (10 because we add the
# second digit and another factor of 10 because we
# get the digit ten times in that position)
If you now continue with this scheme, for k>=1 in general you get
F(k+1)= F(k)*100+10^(k-1)*45*9
The rest is probably straightforward.
Can you tell me, which Hackerrank task this is? I guess one of the Project Euler tasks right?

Longest common sub sequence recurrence

The recurrence for lcs is:
L[i,j] = max(L[i-1,j], L[i,j-1]) if a[i] != a[j]
Can you tell me why it is i-1 or j-1? Why isn't L[i,j] = L[i-1,j-1] correct?
You are considering the case where a[i] != a[j], which means that the letters you are currently comparing of the two sequences A and B are different. Therefore, the length of the longest common subsequence is one of two things :
the longest common subsequence of the current substring of A minus its first character and B, i.e. L[i-1,j] ;
the longest common subsequence of A and the current substring of B minus its first character, i.e. L[i,j-1].
If L[i-1,j-1] were correct, it would mean that the current characters in both A and B do not count, they don’t get a "chance" to be part of the subsequence.
See for example this explanation (note that it works forward in the sequences instead of backward).

Correct loop invariant?

I am trying to find the loop invariant in the following code:
Find Closest Pair Iter(A) :
# Precondition: A is a non-empty list of 2D points and len(A) > 1.
# Postcondition: Returns a pair of points which are the two closest points in A.
min = infinity
p = -1
q = -1
for i = 0,...,len(A) - 1:`=
for j = i + 1,...,len(A) - 1:
if Distance(A[i],A[j]) < min:
min = Distance(A[i],A[j])
p = i
q = j
return (A[p],A[q])
I think the loop invariant is min = Distance(A[i],A[j]) so closest point in A is A[p] and a[q] .
I'm trying to show program correctness. Here I want to prove the inner loop by letting i be some constant, then once I've proven the inner loop, replace it by it's loop invariant and prove the outer loop. By the way this is homework. Any help will be much appreciated.
I'm not sure I fully understand what you mean by replacing the inner loop by its loop invariant. A loop invariant is a condition that holds before the loop and after every iteration of the loop (including the last one).
That being said, I wouldn't like to spoil your homework, so I'll try my best to help you without giving too much of the answer away. Let me try:
There are three variables in your algorithm that hold very important values (min, p and q). You should ask yourself what is true about these values as the algorithm goes through each pair of points (A[i], A[j])?
In a simpler example: if you were designing an algorithm to sum values in a list, you would create a variable called sum before the loop and assign 0 to it. You would then sum the elements one by one through a loop, and then return the variable sum.
Since it is true that this variable holds the sum of every single element "seen" in the loop, and since after the main loop the algorithm will have "seen" every element in the list, the sum variable necessarily holds the sum of all values in the list. In this case the loop invariant would be: The sum variable holds the sum of every element "seen" so far.
Good luck with your homework!

Number of possible binary strings of length k

One of my friends was asked this question recently:
You have to count how many binary strings are possible of length "K".
Constraint: Every 0 has a 1 in its immediate left.
This question can be reworded:
How many binary sequences of length K are posible if there are no two consecutive 0s, but the first element should be 1 (else the constrains fails). Let us forget about the first element (we can do it bcause it is always fixed).
Then we got a very famous task that sounds like this: "What is the number of binary sequences of length K-1 that have no consecutive 0's." The explanation can be found, for example, here
Then the answer will be F(K+1) where F(K) is the K`th fibonacci number starting from (1 1 2 ...).
∑ From n=0 to ⌊K/2⌋ of (K-n)Cn; n is the number of zeros in the string
The idea is to group every 0 with a 1 and find the number of combinations of the string, for n zeros there will be n ones grouped to them so the string becomes (k-n) elements long. There can be no more than of K/2 zeros as there would not have enough ones to be to the immediate left of each zero.
E.g. 111111[10][10]1[10] for K = 13, n = 3