Bitcoin Blockchain - Verification process - cryptography

As the title states, basically my question is about the blockchain verification. I know whats a block chain and basically understood how mining is working, except once simple thing.
Let's say we have 2 guys, Bob and Adam.
Blockchain:
|1|-|2|-|3|-{4} - Bob Chain
|1|-|2|-|3|-{4} - Adam Chain
Assume both Bob and Adam found a new block, but it will not be verified until someone finds a next block. So my questions is whats is happening in a situation if Adam will find a block |5| first. Will Bob get his reward for finding a block? Or it means if Adam found one block, he has to find the next one which is extremely difficult without a huge network of computational resources in order to verify his previous block |4| and get reward for block 4 of 12.5 bitcoins, because the nodes will only accept the longest blockchain? I hope I clearly illustrated the picture. I've tried to find the answer in different videos and materials but somehow this aspect was put aside. If my assumption is true, it means there is no way how can a single person to earn anything from mining without a huge network ?

First of all, in Bitcoin when someone creates a block, he broadcasta it to the rest ot the network. As you said, if there are two people that create the block at the same time, they will broadcast it. So, you will get two blocks at the same time. Although you save both blocks, you will try to mine one of them. After some time, one of the two branches will be longer, so you will delete the second one.
The miners of the Blockchain will create some blocks and a branch will be longer after some time.
In Blockchain, a block is considered well when it has 100 blocks (I d0n't know exactly how many) above of it. So, the reward is taken after 100 blocks, not before.

Who out of Adam or Bob gets the reward depends on whose block remains as part of the 'Best chain' eventually. This in turn partly depends on consensus rules and partly on how the things transpired. This is explained as follows
Adam and Bob claimed that they found the block first by broadcasting to peers almost at same time.
Let's presume that a peer named 'ITWala' saw these 2 blocks which would be at same height. Assuming Adam's block reached ITWala's node first. So this would lead to something known as 'forking' in block chain terminology and is quite normal to occur.
**Chain status on ITWala's node leading to forking **
Block1 --> Block 2 --> Block 3 --> Block 4 (Adam's Block)
|
|--- Block 4 (Bob's Block)
One of following can happen:
Case 1 - For sake of simplicity, assume that Bob is the only one making claim for block 5. Now 'ITWala' receives block 5. He tries to make the chain longer by trying to fit at one end of the fork created from Block 4 from Adam. It does not fit because the hash of the previous block would not match.
Result
Fork at the end of Adam's block is discarded. Fork with Bob's block becomes active chain and thus Bob becomes the winner of 4 as well as 5 for award.
Case 2:
Block 5 is created by 'ITWala' or some peer of ITWala who synced up with copy on ITWala's node.
Result: In this case, ITWala would use Adam's block to activate the best chain as it arrived first making him the winner for block 4. Block 5 is awarded to ITWala
There can be more combinations. However point here is that the block that remains in best chain wins the award.

Related

misunderstanding how the threads work

I have a problem, big problem with threads in vb.net
First of all I want to tell that I didn't work with threads before (just on the school), I read lot of pages about it, but none of them could help me for my problem.
My main question here is understand the logic, and after if is possible, solve the problem that I have, both are related, then I going to explain the problem.
The code hasn't comments and/or documentation related, and is a program developed lot of years ago and the guy that did this is not working on the office, nobody knows how it works :S
I have a list called listOfProccess, and when is only 1 works fine.
in the callback function in QueueUserWorkItem fill the information about p, then execute the thread, I suppose
That list contain an array with information type
listOfProccess[].type = 'a/b/c/d/e/f/g/'
also the list include an ID.
Code:
If listOfProccess.Count > 0 Then
Threading.ThreadPool.SetMinThreads(1, 1)
Threading.ThreadPool.SetMaxThreads(4, 4)
For Each p In listOfProccess
Try
Threading.ThreadPool.QueueUserWorkItem(New Threading.WaitCallback(Object p.function))
Catch e As Exception
sendMail("mail#mail", "alerts#mail.ie", "", e.StackTrace)
End Try
Next
Problems:
I have two problems here:
Sometimes execute an item in the list i.e. 'a' in a infinite loop, and expends all resources of the machine, but if i close and restart again, works, I didn't know if is a problem with the threads or not, sincerely I think that is other thing, because this problem starts two or tree weeks ago, and the program still running during a year.
This one i think is related about threads, if I have two (or more) p on the list like this:
p[1].type = 'a/b/c/d/e/f/g/'
p[1].ID = 1
p[2].type = 'ww/xx/ff/yy/aa/rr/'
p[2].ID =2
when the system execute something like this, the way that it follows is 'random' ie. takes for the first one, a, b,c and after this it does ww, and come back to the first. the problem is bigger if I have more items in the list, like 4 or 5; this is not a very big problem, because the program works, it not works 100% fine, but works, is more to try the understand why it works on this way.
Any help is welcome.
The second problem is a race condition problem, as you can't guarantee the order of the execution for the threads, there is a non-zero probability that your threads will replace each other's values. There are a lot of wayss to solve this issue: algorythms (either lock-oriented and lock-free), synchronization techniques and so on, and there is no silver-bullet solution for this.
First problem is unclear for me, as I can't understand what exactly do you mean by a infinite loop - may be this can happen if you link items to deleted (from other thread) ones and there is no way to go out of your task, as the links in a list are broken. This still should be solved with sychronization.
I think your should start with MSDN articles or some book about multithreading in general, and after that you should split your program step by step for a small parts you understand.
Update:
p.function - about the infinite loop you should consider the code of this delegate. If there is a condition to restart work, you should check a recursion limit. For example, if there is an optimistic locking algorythm, your code can find out that the update it tried isn't valid, and restart it. And if the links are broken, it will never end it's task, and will stay forever in an infinite loop.

Prolog game programming board evaluation

I have created a game, (4 in a row), in prolog. My heuristic function requires me to know how many Player's and Opponent's chips are in each possible 4-row combination on the board. The method I am using is as follows (in psuedocodish):
I have 1 list of all possible fours of the board (ComboList) =of the form==> [[A,B,C,D]|Rest].
I have 1 list of all the moves of the 1st player (List1) =of the form==> [[1],[7],[14]]
And 1 for opponent's moves (List2).
Step 1: obtain the first combo from ComboList, 2:
Check all of List1 to see how many are in this combo, 3:
Check all of List2 to see how many are in this combo,
Move onto the next combo from ComboList and start over...
This PROCESS takes waay too much runtime for what is required.
Please can someone suggest something better and more efficient! Much thanks in advance!
The following code uses member/3, which has also to become
known as nth1/3. See here:
http://storage.developerzen.com/fourrow.pro.txt
The predicate is nowadays found in library(lists) and has
possibly native support or a fast implementation:
http://www.swi-prolog.org/pldoc/man?predicate=nth1/3
But I guess asserting some facts and relying on argument
indexing might get you an even better result. See for
example here:
http://www.mxro.de/applications/four-in-a-row
Hope this helps.
Bye

Understanding branches in gcov files

I'm trying to understand the output of the gcov tool. Running it with no options makes sense, but I'm wanting to try and understand the branch coverage options. Unfortunately it's hard to make sense of what the branches do and why they aren't taken. Below is the output for a method (compile using the latest LLVM/Clang build).
function -[TestCoverageAppDelegate loopThroughArray:] called 5 returned 100% blocks executed 88%
5: 30:- (NSInteger)loopThroughArray:(NSArray *)array {
5: 31: NSInteger i = 0;
22: 32: for (NSString *string in array) {
branch 0 taken 0
branch 1 taken 7
-: 33:
22: 34: }
branch 0 taken 4
branch 1 taken 3
branch 2 taken 0
branch 3 taken 3
5: 35: return i;
-: 36:}
I've run 5 test through this, passing in nil, an empty array, an array with 1 object, and array with 2 objects and an array with 4 objects. I can guess that in the first case, branch 1 means "go into the loop" but I haven't a clue what branch 0 is. In the second case branch 0 seems to be loop through again, branch 1 seems to be end the loop and branch 3 is continue/exit the loop, but I have no idea what branch 2 is or why/when it would be executed.
If anyone knows how to decipher the branch info, or knows of any detailed documentation on what it all means, I'd appreciate the help.
Gcov works by instrumenting (while compiling) every basic block of machine commands (you can think about assembler). Basic block means a linear section of code, which have no branches inside it and no lables inside it. So, If and only if you start running a basic block, you will reach end of basic block. Basic blocks are organized in CFG (Control flow graph, think about it as directed graph), which shows relations between basicblocks (edge from V1 to V2 is V1 calls V2; and V2 is called by V1). So, profile-arcs mode of compiler and gcov want to get execution count for every line and do this via counting basic block executions. Some of edges in CFG are instrumented and some are not, because there are algebraic relations between basic blocks in graph.
Your ObjC construction (for..in) is lowered (converted in early compilation) to several basic blocks. So, gcov sees 4 branches, because it sees only lowered BBs. It knows nothing about this lowering, but it knows which line corresponds to every assembler instruction (this is debug info). So, branches are edges of CFG.
If you want to see basic blocks, you should do an assembler dump of compiled program or disassemble a binary or dump CFG from compiler. You can do this both for profile-arcs and non-profile-arcs modes and compare them.
profile-arcs mode will have a lot calls and increments of something like "__llvm_gcov_ctr" or "__llvm_gcda_edge" - it is an actual instrumentation of basic blocks.

State based testing(state charts) & transition sequences

I am really stuck with some state based testing concepts...
I am trying to calculate some checking sequences that will cover all transitions from each state and i have the answers but i dont understand them:
alt text http://www.gam3r.co.uk/1m.jpg
Now the answers i have are :
alt text http://www.gam3r.co.uk/2m.jpg
I dont understand it at all. For example say we want to check transition a/x from s1, wouldnt we do ab only? As we are already in s1, we do a/x to test the transition to s2, then b to check we are in the previous right state(s1)? I cant understand why it is aba or even bb for s1...
Can anyone talk me through it?
Thanks
There are 2 events available in each of 4 states, giving 8 transitions, which the author has decided to test in 8 separate test sequences. Each sequence (except the S1 sequences - apparently the machine's initial state is S1) needs to drive the machine to the target state and then perform either event a or event b.
The sequences he has chosen are sufficient, in that each transition is covered. However, they are not unique, and - as you have observed - not minimal.
A more obvious choice would be:
a b
ab aa
aaa aab
ba bb
I don't understand the author's purpose in adding superfluous transitions at the end of each sequence. The system is a Mealy machine - the behaviour of the machine is uniquely determined by the current state and event. There is no memory of the path leading to the current state; therefore the author's extra transitions give no additional coverage and serve only to confuse.
You are also correct that you could cover the all transitions with a shorter set of paths through the graph. However, I would be disinclined to do that. Clarity is more important than optimization for test code.

What are the most hardcore optimisations you've seen?

I'm not talking about algorithmic stuff (eg use quicksort instead of bubblesort), and I'm not talking about simple things like loop unrolling.
I'm talking about the hardcore stuff. Like Tiny Teensy ELF, The Story of Mel; practically everything in the demoscene, and so on.
I once wrote a brute force RC5 key search that processed two keys at a time, the first key used the integer pipeline, the second key used the SSE pipelines and the two were interleaved at the instruction level. This was then coupled with a supervisor program that ran an instance of the code on each core in the system. In total, the code ran about 25 times faster than a naive C version.
In one (here unnamed) video game engine I worked with, they had rewritten the model-export tool (the thing that turns a Maya mesh into something the game loads) so that instead of just emitting data, it would actually emit the exact stream of microinstructions that would be necessary to render that particular model. It used a genetic algorithm to find the one that would run in the minimum number of cycles. That is to say, the data format for a given model was actually a perfectly-optimized subroutine for rendering just that model. So, drawing a mesh to the screen meant loading it into memory and branching into it.
(This wasn't for a PC, but for a console that had a vector unit separate and parallel to the CPU.)
In the early days of DOS when we used floppy discs for all data transport there were viruses as well. One common way for viruses to infect different computers was to copy a virus bootloader into the bootsector of an inserted floppydisc. When the user inserted the floppydisc into another computer and rebooted without remembering to remove the floppy, the virus was run and infected the harddrive bootsector, thus permanently infecting the host PC. A particulary annoying virus I was infected by was called "Form", to battle this I wrote a custom floppy bootsector that had the following features:
Validate the bootsector of the host harddrive and make sure it was not infected.
Validate the floppy bootsector and
make sure that it was not infected.
Code to remove the virus from the
harddrive if it was infected.
Code to duplicate the antivirus
bootsector to another floppy if a
special key was pressed.
Code to boot the harddrive if all was
well, and no infections was found.
This was done in the program space of a bootsector, about 440 bytes :)
The biggest problem for my mates was the very cryptic messages displayed because I needed all the space for code. It was like "FFVD RM?", which meant "FindForm Virus Detected, Remove?"
I was quite happy with that piece of code. The optimization was program size, not speed. Two quite different optimizations in assembly.
My favorite is the floating point inverse square root via integer operations. This is a cool little hack on how floating point values are stored and can execute faster (even doing a 1/result is faster than the stock-standard square root function) or produce more accurate results than the standard methods.
In c/c++ the code is: (sourced from Wikipedia)
float InvSqrt (float x)
{
float xhalf = 0.5f*x;
int i = *(int*)&x;
i = 0x5f3759df - (i>>1); // Now this is what you call a real magic number
x = *(float*)&i;
x = x*(1.5f - xhalf*x*x);
return x;
}
A Very Biological Optimisation
Quick background: Triplets of DNA nucleotides (A, C, G and T) encode amino acids, which are joined into proteins, which are what make up most of most living things.
Ordinarily, each different protein requires a separate sequence of DNA triplets (its "gene") to encode its amino acids -- so e.g. 3 proteins of lengths 30, 40, and 50 would require 90 + 120 + 150 = 360 nucleotides in total. However, in viruses, space is at a premium -- so some viruses overlap the DNA sequences for different genes, using the fact that there are 6 possible "reading frames" to use for DNA-to-protein translation (namely starting from a position that is divisible by 3; from a position that divides 3 with remainder 1; or from a position that divides 3 with remainder 2; and the same again, but reading the sequence in reverse.)
For comparison: Try writing an x86 assembly language program where the 300-byte function doFoo() begins at offset 0x1000... and another 200-byte function doBar() starts at offset 0x1001! (I propose a name for this competition: Are you smarter than Hepatitis B?)
That's hardcore space optimisation!
UPDATE: Links to further info:
Reading Frames on Wikipedia suggests Hepatitis B and "Barley Yellow Dwarf" virus (a plant virus) both overlap reading frames.
Hepatitis B genome info on Wikipedia. Seems that different reading-frame subunits produce different variations of a surface protein.
Or you could google for "overlapping reading frames"
Seems this can even happen in mammals! Extensively overlapping reading frames in a second mammalian gene is a 2001 scientific paper by Marilyn Kozak that talks about a "second" gene in rat with "extensive overlapping reading frames". (This is quite surprising as mammals have a genome structure that provides ample room for separate genes for separate proteins.) Haven't read beyond the abstract myself.
I wrote a tile-based game engine for the Apple IIgs in 65816 assembly language a few years ago. This was a fairly slow machine and programming "on the metal" is a virtual requirement for coaxing out acceptable performance.
In order to quickly update the graphics screen one has to map the stack to the screen in order to use some special instructions that allow one to update 4 screen pixels in only 5 machine cycles. This is nothing particularly fantastic and is described in detail in IIgs Tech Note #70. The hard-core bit was how I had to organize the code to make it flexible enough to be a general-purpose library while still maintaining maximum speed.
I decomposed the graphics screen into scan lines and created a 246 byte code buffer to insert the specialized 65816 opcodes. The 246 bytes are needed because each scan line of the graphics screen is 80 words wide and 1 additional word is required on each end for smooth scrolling. The Push Effective Address (PEA) instruction takes up 3 bytes, so 3 * (80 + 1 + 1) = 246 bytes.
The graphics screen is rendered by jumping to an address within the 246 byte code buffer that corresponds to the right edge of the screen and patching in a BRanch Always (BRA) instruction into the code at the word immediately following the left-most word. The BRA instruction takes a signed 8-bit offset as its argument, so it just barely has the range to jump out of the code buffer.
Even this isn't too terribly difficult, but the real hard-core optimization comes in here. My graphics engine actually supported two independent background layers and animated tiles by using different 3-byte code sequences depending on the mode:
Background 1 uses a Push Effective Address (PEA) instruction
Background 2 uses a Load Indirect Indexed (LDA ($00),y) instruction followed by a push (PHA)
Animated tiles use a Load Direct Page Indexed (LDA $00,x) instruction followed by a push (PHA)
The critical restriction is that both of the 65816 registers (X and Y) are used to reference data and cannot be modified. Further the direct page register (D) is set based on the origin of the second background and cannot be changed; the data bank register is set to the data bank that holds pixel data for the second background and cannot be changed; the stack pointer (S) is mapped to graphics screen, so there is no possibility of jumping to a subroutine and returning.
Given these restrictions, I had the need to quickly handle cases where a word that is about to be pushed onto the stack is mixed, i.e. half comes from Background 1 and half from Background 2. My solution was to trade memory for speed. Because all of the normal registers were in use, I only had the Program Counter (PC) register to work with. My solution was the following:
Define a code fragment to do the blend in the same 64K program bank as the code buffer
Create a copy of this code for each of the 82 words
There is a 1-1 correspondence, so the return from the code fragment can be a hard-coded address
Done! We have a hard-coded subroutine that does not affect the CPU registers.
Here is the actual code fragments
code_buff: PEA $0000 ; rightmost word (16-bits = 4 pixels)
PEA $0000 ; background 1
PEA $0000 ; background 1
PEA $0000 ; background 1
LDA (72),y ; background 2
PHA
LDA (70),y ; background 2
PHA
JMP word_68 ; mix the data
word_68_rtn: PEA $0000 ; more background 1
...
PEA $0000
BRA *+40 ; patched exit code
...
word_68: LDA (68),y ; load data for background 2
AND #$00FF ; mask
ORA #$AB00 ; blend with data from background 1
PHA
JMP word_68_rtn ; jump back
word_66: LDA (66),y
...
The end result was a near-optimal blitter that has minimal overhead and cranks out more than 15 frames per second at 320x200 on a 2.5 MHz CPU with a 1 MB/s memory bus.
Michael Abrash's "Zen of Assembly Language" had some nifty stuff, though I admit I don't recall specifics off the top of my head.
Actually it seems like everything Abrash wrote had some nifty optimization stuff in it.
The Stalin Scheme compiler is pretty crazy in that aspect.
I once saw a switch statement with a lot of empty cases, a comment at the head of the switch said something along the lines of:
Added case statements that are never hit because the compiler only turns the switch into a jump-table if there are more than N cases
I forget what N was. This was in the source code for Windows that was leaked in 2004.
I've gone to the Intel (or AMD) architecture references to see what instructions there are. movsx - move with sign extension is awesome for moving little signed values into big spaces, for example, in one instruction.
Likewise, if you know you only use 16-bit values, but you can access all of EAX, EBX, ECX, EDX , etc- then you have 8 very fast locations for values - just rotate the registers by 16 bits to access the other values.
The EFF DES cracker, which used custom-built hardware to generate candidate keys (the hardware they made could prove a key isn't the solution, but could not prove a key was the solution) which were then tested with a more conventional code.
The FSG 2.0 packer made by a Polish team, specifically made for packing executables made with assembly. If packing assembly isn't impressive enough (what's supposed to be almost as low as possible) the loader it comes with is 158 bytes and fully functional. If you try packing any assembly made .exe with something like UPX, it will throw a NotCompressableException at you ;)