How does a Basic Paxos proposer know when to increment the roundId / proposal number?

Looking at the screenshot from this video, around 27:20
https://www.youtube.com/watch?v=JEpsBg0AO6o
When server S5 sends out the prepare RPC, it uses roundId (proposal number) 4 and server number 5 (hence 4.5), as well as the value "Y".
But how does it know that 4 is the roundId to use? Earlier, S1 used up roundId 3, but there's no way for S5 to know about that, as there had been no communication between S5 and anybody else at the time S5 chose 4 as its roundId.

In theory, there is no need to know the latest number, as every proposer will keep increasing its round number until it succeeds.
In your example S5 knows nothing, so it will start with the smallest number and then keep going up.
In a practical application, when a proposer proposes a number and the proposal is declined by an acceptor, the decline message contains the largest round number seen so far by that acceptor; this helps the proposer retry with a larger round number (instead of increasing its current number one by one).
-- Edit: Max posted a link with an answer claiming (as of now) that N has to be unique per proposer.
Let me explain with an example why there is no global uniqueness requirement.
Let's say we have a system with two proposers and three acceptors (and a few learners):
Both proposers send PREPARE(1) - the same number - to all acceptors.
Based on Paxos rules, only one of the proposers will get a majority of PROMISE messages - this follows from the rule that an acceptor promises only if the PREPARE carries a number strictly greater than any it has previously seen.
Now we are in a state where one proposer has two (or three) PROMISEs for N=1 and the other proposer has one (or zero) PROMISEs for N=1.
Only the first proposer may issue ACCEPT(1, V), as it got the majority. The other proposer does not have a majority of PROMISEs and has to retry with a larger N.
When the other proposer retries, it will use an N larger than any it has seen before - hence it will try N=2.
From here on it all works the same way - a proposer PREPAREs, and if it gets a majority of PROMISEs for its N, it issues ACCEPT(N, VALUE_ACCORDING_TO_PROTOCOL).
The key insight in Paxos is that there is no way for two ACCEPT(N, V) messages to be sent where the same N carries different V, hence there is no problem with two proposers using the same N.
As for initializing every node with a unique ID - that's fine; whether it improves performance is a big question, and I haven't seen a formal proof of that yet.
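Putting the two ideas above together - per-proposer increments plus jumping ahead on a reject - a proposer's number selection can be sketched like this. The (round, serverId) pairing mirrors the video's "4.5" notation; the class and method names are made up for illustration, not part of any Paxos implementation:

```python
# Illustrative sketch: how a proposer might pick its next round number.
# Numbers are (round, server_id) pairs, so ties between proposers are
# impossible; (4, 5) beats (4, 1) and beats (3, anything).

class Proposer:
    def __init__(self, server_id):
        self.server_id = server_id
        self.round = 0  # highest round this proposer has used or seen

    def next_proposal_number(self):
        # Increment past anything seen so far.
        self.round += 1
        return (self.round, self.server_id)

    def on_reject(self, highest_seen_round):
        # A decline message carries the acceptor's highest promised round,
        # letting the proposer jump ahead instead of counting up one by one.
        self.round = max(self.round, highest_seen_round)

p = Proposer(server_id=5)
assert p.next_proposal_number() == (1, 5)
p.on_reject(highest_seen_round=3)   # an acceptor already promised round 3
assert p.next_proposal_number() == (4, 5)   # i.e. 4.5, as in the video
```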

Related

Determining a program's execution time by its length in bits?

This is a question that popped into my mind while reading about the halting problem, the Collatz conjecture, and Kolmogorov complexity. I tried to search for something similar but was unable to find this particular topic, maybe because it is not of great value or it is just a trivial question.
For the sake of simplicity I will give three examples of programs/functions.
function one(s):
    return s

function two(s):
    while (True):
        print s

function three(s):
    for i from 0 to 10^10:
        print(s)
So my question is: is there a way to formalize the length of a program (like the bits used to describe it) and also the internal memory used by the program, in order to determine the minimum/maximum number of steps needed to decide whether the program will terminate or run forever?
For example, in the first function the program doesn't alter its internal memory and halts after some number of steps.
In the second example, the program runs forever, but it also doesn't alter its internal memory. For example, if we considered all programs with the same length as program two that do not alter their state, couldn't we determine an upper bound on steps which, if surpassed, would let us conclude that the program will never terminate? (If not, why not?)
In the last example, the program alters its state (the variable i). So at each step the upper bound may change.
[In short]
Kolmogorov complexity suggests a way of finding the (descriptive) complexity of an object such as a piece of text. I would like to know whether, given a formal way of describing the memory space used by a program (as measured at runtime), we could compute a maximum number of steps which, once surpassed, would tell us whether this program will terminate or run forever.
Finally, I would appreciate any sources I might find useful to help me figure out what exactly I am looking for.
Thank you. (Sorry for my English - it is not my native language. I hope I was clear.)
If a deterministic Turing machine enters precisely the same configuration twice (which we can detect by keeping a trace of configurations seen so far), then we immediately know the TM will loop forever.
If it is known in advance that a deterministic Turing machine cannot possibly use more than some fixed constant amount of its tape, then the TM must explicitly halt or eventually enter some configuration it has already visited. Suppose the TM can use at most k tape cells, the tape alphabet is T, and the set of states is Q. Then there are at most (|T|+1)^k * |Q| * k distinct configurations (the number of strings over T-union-blank of length k, times the number of states, times the possible head positions), and by the pigeonhole principle a TM that takes that many steps must enter some configuration it has already been in before.
one: because we are given that this function does not use internal memory, it has only finitely many configurations, so within a bounded number of steps we can decide whether it halts or loops forever (here it halts).
two: the same reasoning applies - no internal memory means finitely many configurations; here it revisits a configuration, so we can conclude it loops forever.
three: because we are given that this function only uses a fixed amount of internal memory (like 34 bits) we can tell in fewer than 2^34 iterations of the loop whether the TM will halt or not for any given input s, guaranteed.
Now, knowing how much tape a TM is going to use, or how much memory a program is going to use, is not a problem a TM can solve. But if you have an oracle (like a person who was able to do a proof) that tells you a correct fixed upper bound on memory, then the halting problem is solvable.
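The configuration-revisit argument above can be sketched concretely. Here a "machine" is reduced to a deterministic step function over finitely many configurations; the counter examples below are made up for illustration, not taken from the question:

```python
# Sketch: a deterministic machine with finitely many configurations either
# halts or revisits a configuration, and a revisit proves an infinite loop.

def runs_forever(start, step, halt):
    """step: config -> config (deterministic); halt: config -> bool."""
    seen = set()
    config = start
    while not halt(config):
        if config in seen:          # same configuration twice =>
            return True             # a deterministic machine loops forever
        seen.add(config)
        config = step(config)
    return False

# A 4-bit counter that halts at 15: finitely many configs, and it halts.
assert runs_forever(0, lambda c: (c + 1) % 16, lambda c: c == 15) is False
# The same counter with an unreachable halting condition: it loops.
assert runs_forever(0, lambda c: (c + 1) % 16, lambda c: c == 99) is True
```

This is exactly the bound from the pigeonhole argument: the `seen` set can grow only as large as the number of distinct configurations, so the loop always terminates with an answer.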

Could the STAN number be repeatable and random?

I'm developing a connector for a bank, and we're using the ISO 8583 protocol. Right now I'm setting the STAN (field 11) to a random number from a random generator, but sometimes I get number collisions. The question is: can I safely use this generator, or do I need to make the STAN a sequential number?
Thanks in advance.
The System Trace Audit Number (STAN) in ISO 8583 takes different values and is basically maintained per relationship within the transaction's path. That is, it can stay the same, or the same transaction may have many STANs over its path, but it SHOULD be the same between two endpoints, and which party's STAN to use is usually controlled in settings.
For Example:
Terminal -> Terminal Driver -> Switch 1->Switch 2->Issuer
The STAN is, say, assigned by the terminal driver and then remains constant at minimum for the following relationships... though it may change for each relationship.
Terminal Driver -> Switch 1
Switch 1 -> Switch 2
Switch 2 -> Issuer
Note that internally within each system the STAN may be unique as well, but it needs to keep a unique STAN for each relationship, and it shouldn't change between the request and the response, as it is needed for multi-part transactions (Single PA, Multiple Completions & Multi-PA, Single Completion) as well as for reversals and such in Data Element 90.
Depends on your remote endpoint, but I've seen many requiring sequential numbers, and detecting duplicates.
Usually STAN is the number increased for each request.
Random STAN generation is not a good fit for sequenced network messages.
Duplicate STANs can also come from different sources, i.e. host clients or terminals.
The STAN by itself cannot be the only field used to detect unique transaction requests. It must be combined with other fields such as RRN, Terminal ID, and Merchant ID.
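For the sequential option, a minimal sketch of a STAN source, assuming the usual 6-digit numeric field 11; the locking and wrap-around policy here are illustrative choices, not something mandated by ISO 8583 (in practice the counter would also need to survive restarts):

```python
# Hypothetical sequential STAN generator: a thread-safe counter that
# produces 6-digit values 000001 .. 999999 and then wraps.

import itertools
import threading

class StanGenerator:
    def __init__(self, start=1):
        self._counter = itertools.count(start)
        self._lock = threading.Lock()

    def next_stan(self) -> str:
        with self._lock:                       # one value per caller
            n = next(self._counter)
        return f"{(n - 1) % 999999 + 1:06d}"   # wraps 999999 -> 000001

gen = StanGenerator()
assert gen.next_stan() == "000001"
assert gen.next_stan() == "000002"
```

Combined with RRN, Terminal ID, and the transmission date/time, the wrap every million messages is not a uniqueness problem in practice.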
See also "In ISO message, what's the use of stan and rrn ?"

bitcoin block solving, all nonces used but no hit

I'm trying to understand how bitcoin block solving attempts works.
I see a nonce is a 32-bit number, so around 4 billion values to try.
Also, I saw a famous mining pool with 500 Ph/s of power at hand, and I found one particular block there that was solved in 40 minutes.
So that is (40 x 60) x (500 x 10^15) = 1.2 x 10^21 hashes calculated
on that pool, to solve one block.
That means the nonces were "cycled" about 280 billion times (1.2 x 10^21 / 2^32) during those 40 minutes.
So I'm wondering: what are those 280 billion extra things done after each nonce cycle? ("1 cycle of nonces" is going from 0 to 4294967295.)
I see that we can change the timestamp within a certain range, and the Merkle root hash as well.
Aren't Merkle hashes and timestamps stricter to calculate and use than nonces?
Are those 280 billion things only changes of the timestamp and the Merkle root? Can we regenerate as many unique Merkle roots and timestamp changes as needed?
Can you give me examples? Sorry if my view is a bit biased; I'm just starting with this.
Apparently, I've found that when the nonces have cycled (overflowed), an extraNonce value is incremented, and that requires the Merkle root to be recalculated based on that extraNonce value.
a link here
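A toy sketch of that search loop. The "header", coinbase, and difficulty target here are heavily simplified stand-ins, not real Bitcoin serialization; only the double-SHA256 and the nonce/extraNonce structure match the real protocol:

```python
# Toy model of mining: exhaust the 32-bit nonce space, then bump
# extraNonce (which lives in the coinbase tx) and rebuild the Merkle root.

import hashlib

def dsha256(data: bytes) -> bytes:
    """Bitcoin's double SHA-256."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def merkle_root(extra_nonce: int) -> bytes:
    # Stand-in: changing extraNonce in the coinbase changes the root.
    return dsha256(b"coinbase" + extra_nonce.to_bytes(8, "little"))

def mine(target: int, max_nonce: int = 2**32):
    extra_nonce = 0
    while True:
        root = merkle_root(extra_nonce)              # recomputed per cycle
        for nonce in range(max_nonce):               # one "cycle of nonces"
            header = root + nonce.to_bytes(4, "little")
            if int.from_bytes(dsha256(header), "big") < target:
                return extra_nonce, nonce
        extra_nonce += 1                             # nonce space exhausted

# Very easy target so the sketch finds a hit almost immediately.
extra_nonce, nonce = mine(target=2**252)
header = merkle_root(extra_nonce) + nonce.to_bytes(4, "little")
assert int.from_bytes(dsha256(header), "big") < 2**252
```

Since the Merkle root is itself a hash, each extraNonce value yields a practically unique root, so the combined (extraNonce, timestamp, nonce) search space is effectively unlimited.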

Implementing Round Robin insertions to oracle rac db with the help of sequence

Problem
My system inserts records into an Oracle RAC DB at a rate of 600 tps. During the insertion procedure call, each record is assigned a sequence value, so that records are distributed among 20 different batch ids (an implementation of a round-robin mechanism).
Following are the steps for selecting a batch:
1) A record comes in. It is assigned nextval from a sequence.
2) Do MOD(sequence, 20). This gives values from 0 to 19.
Issue:
3 records come to the DB simultaneously and hit 3 different nodes in the RAC.
They come out with sequence values 2, 102, and 1002.
The MOD happens to be the same for all of them.
All try to get into the same batch.
Round robin fails here.
Please help to resolve the issue.
This is due to the implementation of sequences on RAC. When a node is first asked for the next value of a sequence, it gets a bunch of them (e.g. 100 to 119) and then hands them out until it needs a new lot, when it gets another bunch (120 to 139). While Node 1 is handing out 100, then 101, Node 2 will be handing out 120, 121, etc.
The size of the 'bunch' is controlled, as I remember, by the CACHE size defined on the sequence. If you set a cache size of 0, you get no caching, and the values are handed out sequentially. However, doing that involves the nodes in management overhead while they work out what the next value actually is, and at 600 tps this might not be a good idea: you'd have to try it and see.
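A small simulation of that caching behaviour (the cache size of 100 and the batch-per-node assignment are illustrative, matching the question's 2/102/1002 example): because 100 is a multiple of 20, the k-th value each node hands out always lands in the same MOD-20 batch:

```python
# Simulate RAC sequence caching: each node draws its own batch of
# CACHE consecutive values and hands them out in order.

CACHE = 100

def node_batch(batch_index):
    """The values one node hands out from its cached batch."""
    start = batch_index * CACHE
    return list(range(start, start + CACHE))

n1, n2, n3 = node_batch(0), node_batch(1), node_batch(10)
# Simultaneous inserts on three nodes can get e.g. 2, 102, 1002 ...
assert (n1[2], n2[2], n3[2]) == (2, 102, 1002)
# ... and because CACHE % 20 == 0, they all map to the same batch id:
assert {v % 20 for v in (2, 102, 1002)} == {2}
```

This also suggests a lighter fix than NOCACHE: any cache size that is not a multiple of 20 breaks the alignment, and hashing on a per-record value (e.g. MOD of a key rather than of the sequence) avoids the problem entirely.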

Storage algorithm question - verify sequential data with little memory

I found this on an "interview questions" site and have been pondering it for a couple of days. I will keep churning, but am interested what you guys think
"10 Gbytes of 32-bit numbers on a magnetic tape, all there from 0 to 10G in random order. You have 64 32 bit words of memory available: design an algorithm to check that each number from 0 to 10G occurs once and only once on the tape, with minimum passes of the tape by a read head connected to your algorithm."
32-bit numbers can take 4G = 2^32 different values. There are 2.5 * 2^32 numbers on the tape in total. So after reading 2^32 + 1 of them, some number is guaranteed to repeat, by the pigeonhole principle. If there were <= 2^32 numbers on the tape, both cases would still be possible - all numbers different, or at least one repeated.
It's a trick question, as Michael Anderson and I have figured out. You can't store 10G 32b numbers on a 10G tape. The interviewer (a) is messing with you and (b) is trying to find out how much you think about a problem before you start solving it.
The utterly naive algorithm, which takes as many passes as there are numbers to check, would be to walk through and verify that the lowest number is there. Then do it again checking that the next lowest is there. And so on.
This requires one word of storage to keep track of where you are - you could cut down the number of passes by a factor of 64 by using all 64 words to keep track of where you're up to in several different locations in the search space - checking all of your current ones on each pass. Still O(n) passes, of course.
You could probably cut it down even more by using portions of the words - given that your search space for each segment is smaller, you won't need to keep track of the full 32-bit range.
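The multi-target idea above can be sketched like this, with a small dict standing in for the 64 words of tracking state (sizes here are toy values, not the 10G of the question):

```python
# Sketch: verify every value in [lo, hi) occurs exactly once on the tape,
# checking `words` candidate values per full pass of the tape.

def verify(tape, lo, hi, words=64):
    passes = 0
    for base in range(lo, hi, words):
        targets = range(base, min(base + words, hi))
        counts = {t: 0 for t in targets}   # fits in our 64 words of memory
        passes += 1
        for x in tape:                     # one full pass of the tape
            if x in counts:
                counts[x] += 1
        if any(c != 1 for c in counts.values()):
            return False, passes
    return True, passes

ok, passes = verify([3, 0, 2, 1, 4, 5, 7, 6], 0, 8)
assert ok and passes == 1                  # 8 values fit in a single pass
ok, _ = verify([3, 0, 2, 2, 4, 5, 7, 6], 0, 8)
assert not ok                              # duplicate 2, missing 1
```

The pass count is ceil((hi - lo) / 64), which is the O(n) behaviour described above, just divided by 64.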
Perform an in-place mergesort or quicksort, using tape for storage? Then iterate through the numbers in sequence, tracking to see that each number = previous+1.
Requires cleverly implemented sort, and is fairly slow, but achieves the goal I believe.
Edit: oh bugger, it's never specified you can write.
Here's a second approach: scan through, trying to build up ~30 ranges of contiguous numbers, e.g. 1,2,3,4,5 would be one range, 8,9,10,11,12 would be another, etc. If ranges overlap with existing ones, they are merged. I think you only need a limited number of passes to either get the complete range or prove there are gaps - much fewer than scanning through in blocks of a couple thousand to see if all digits are present.
It'll take me a bit to prove or disprove the limits for this though.
Do 2 reduces on the numbers, a sum and a bitwise XOR.
The sum should be (10G + 1) * 10G / 2
The XOR should be ... something
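A one-pass sketch of that check. Python's arbitrary-precision ints stand in for the multi-word accumulators a real 64-word solution would need; note that a matching sum and XOR is strong evidence of exactly-once occurrence, not a proof, since different multisets can collide on both:

```python
# Two reductions in a single pass over the tape: a sum and a bitwise XOR.

from functools import reduce
from operator import xor

def checksums(tape):
    return sum(tape), reduce(xor, tape, 0)

n = 1000                           # toy stand-in for 10G
expected_sum = (n - 1) * n // 2    # sum of 0 .. n-1
expected_xor = reduce(xor, range(n), 0)

assert checksums(range(n)) == (expected_sum, expected_xor)
# A corrupted tape (5 replaced by a duplicate 7) is caught:
bad = list(range(n)); bad[5] = 7
assert checksums(bad) != (expected_sum, expected_xor)
```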
It looks like there is a catch in the question that no one has talked about so far; the interviewer has only asked the interviewee to write a program that CHECKS
(i) whether each number that makes up the 10G is present once and only once - what should the interviewee do if a number in the given list is present multiple times? Should he assume he should stop executing the program and throw an exception, or should he assume he should correct the mistake by removing the repeated number and replacing it with another (this may actually be a costly exercise, as it involves a complete reshuffle of the number set)? Correcting this is required to perform the second step in the question, i.e. to verify that the data is stored in the best possible way, requiring the least possible passes.
(ii) If the interviewee was asked only to check whether the 10G-sized data set of numbers is stored in such a way that it requires the least passes to access any of those numbers,
what should he do? Should he stop and throw an exception the moment he finds an issue in the way they were stored, or correct the mistake and continue until all the elements are sorted in the order of least possible passes?
If the intention of the interviewer were to ask the interviewee to write an algorithm that finds the best combination of numbers that can be stored in 10GB, given 64 32-bit registers, and also an algorithm to save this chosen set of numbers in the best possible way, requiring the least number of passes to access each, he would have asked that directly, wouldn't he?
I suppose the intention of the interviewer may be only to see how the interviewee approaches the problem rather than to actually extract a working solution; would anyone buy this notion?
Regards,
Samba