Is it always true that the CRC of a buffer that has the CRC appended to the end is always 0? - error-handling

Example of the hypothesis...
#include <stdio.h>
#include <stdint.h>

extern uint16_t CRC16(uint8_t* buffer, uint16_t size); // From your favorite library

int main(void) {
    uint16_t crc;
    uint8_t buffer[10] = {1,2,3,4,5,6,7,8};
    crc = CRC16(buffer, 8);
    buffer[8] = crc >> 8;   // This may be endian-dependent code
    buffer[9] = crc & 0xff; // Ibid.
    if (CRC16(buffer, 10) != 0)
        printf("Should this ever happen???\n");
    else
        printf("It never happens!\n");
    return 0;
}

If the CRC is modified after it's calculated, as with CRCs that post-complement the result after generation, then generating a new CRC over the data plus the appended CRC will yield a non-zero but constant value. If the CRC is not post-modified, the result will be zero, regardless of whether the CRC register is initialized to a zero or non-zero value before generation.
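A minimal self-contained variant of the question's snippet, with a concrete (assumed) CRC16 implementation — bit-by-bit, polynomial 0x1021, MSB first, init 0xFFFF, no final XOR — illustrates the zero-residue case:

#include <stdio.h>
#include <stdint.h>

/* Stand-in for the question's CRC16(): bit-by-bit, polynomial 0x1021,
   MSB first, init 0xFFFF, no post-complement. Any non-post-complemented
   CRC behaves the same way. */
static uint16_t crc16(const uint8_t *buf, uint16_t size) {
    uint16_t crc = 0xFFFF;                     /* non-zero init is fine */
    for (uint16_t i = 0; i < size; i++) {
        crc ^= (uint16_t)buf[i] << 8;
        for (int b = 0; b < 8; b++)
            crc = (crc & 0x8000) ? (uint16_t)((crc << 1) ^ 0x1021)
                                 : (uint16_t)(crc << 1);
    }
    return crc;                                /* no post-complement */
}

int main(void) {
    uint8_t buffer[10] = {1,2,3,4,5,6,7,8};
    uint16_t crc = crc16(buffer, 8);
    buffer[8] = crc >> 8;                      /* append MSB first, matching */
    buffer[9] = crc & 0xFF;                    /* the bit order used above   */
    printf("residue = 0x%04X\n", (unsigned)crc16(buffer, 10));  /* prints 0x0000 */
    return 0;
}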

Is it always true that the CRC of a buffer that has the CRC appended to the end is always 0?
Depends on the CRC, and how it is appended. For the 16-bit and 32-bit "CCITT" CRCs used in networking (Ethernet, V.42, ...), no: the final CRC (appended in the specified order) is a constant, but not zero: 47 0F for the 16-bit CRC, 1C DF 44 21 for the 32-bit CRC.
16-bit example:
-------- message -------- CRC-CCITT-16
01 02 03 04 05 06 07 08 D4 6D
01 02 03 04 05 06 07 08 D4 6D 47 0F
DE AD BE EF CB E5
DE AD BE EF CB E5 47 0F
That comes in handy in telecom, where the layer handling reception often only knows that the frame has ended after receiving the CRC, which has already been fed into the hardware checking the CRC.
The underlying reason is that the 16-bit CRC of the 8-byte message m0 m1 … m6 m7 is defined as the remainder of the sequence /m0 /m1 m2 m3 … m6 m7 FF FF (where / denotes bitwise complement) modulo the generating polynomial.
When we compute the CRC of the message followed by the original CRC r0 r1, the new CRC is thus the remainder of the sequence /m0 /m1 m2 m3 … m6 m7 r0 r1 FF FF modulo the generating polynomial. Since /m0 /m1 m2 m3 … m6 m7 FF FF leaves remainder r0 r1 by definition, the message part cancels against the appended r0 r1, and the new CRC is the remainder of the sequence FF FF FF FF modulo the generating polynomial: a constant, but with no reason to be zero.
Try it online in Python! It includes 16-bit and 32-bit versions, both "by hand" and using an external library, including one where the constant is zero.
For CRCs that do not append the right number of bits, or that output the CRC with the wrong endianness (these variants are legion), the result depends on the message. This is a sure sign that something is wrong, and that correspondingly:
- the receiver can no longer feed the message's CRC into the CRC checker and compare the outcome to a constant to check the integrity of the message;
- the desirable property of a CRC that any error burst no longer than the CRC is caught is lost (if we do not get the CRC straight, an error that overlaps the end of the message and the CRC can sometimes remain undetected).
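As a concrete illustration, here is a sketch assuming the reflected CRC-16 used in X.25/V.42 (polynomial 0x1021 reflected to 0x8408, init 0xFFFF, final XOR 0xFFFF): appending the CRC low byte first and running the CRC again always lands on the same constant, the 47 0F shown above.

#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

/* Reflected CRC-16 as used in X.25/V.42 (poly 0x8408, init 0xFFFF,
   final XOR 0xFFFF). Assumed to match the "CCITT" CRC described above. */
static uint16_t crc16_x25(const uint8_t *buf, size_t len) {
    uint16_t crc = 0xFFFF;
    while (len--) {
        crc ^= *buf++;
        for (int b = 0; b < 8; b++)
            crc = (crc & 1) ? (uint16_t)((crc >> 1) ^ 0x8408) : (uint16_t)(crc >> 1);
    }
    return crc ^ 0xFFFF;
}

int main(void) {
    uint8_t frame[10] = {1,2,3,4,5,6,7,8};
    uint16_t crc = crc16_x25(frame, 8);
    frame[8] = crc & 0xFF;          /* append low byte first, per the spec */
    frame[9] = crc >> 8;
    /* CRC over message + CRC: always 0x0F47, i.e. the bytes 47 0F above. */
    printf("constant = 0x%04X\n", (unsigned)crc16_x25(frame, 10));
    return 0;
}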

Related

When is the BHE' signal in the 8086 activated or deactivated?

I'm studying the hardware specification of the 8086 and I'm wondering: what does the BHE' signal do? When is it activated, and when is it deactivated?
The 8086 can address bytes (8 bits) and words (16 bits) in memory.
To access a byte at an even address, the A0 signal will be logically 0 and the BHE signal will be 1.
To access a byte at an odd address, the A0 signal will be logically 1 and the BHE signal will be 0.
To access a word at an even address, the A0 signal will be logically 0 and the BHE signal will also be 0.
instruction        A0   BHE   cycles
mov al, [1234h]     0    1     10
mov al, [1235h]     1    0     10
mov ax, [1234h]     0    0     10
To access a word at an odd address, the processor will need to address the bytes separately. This will incur a penalty of 4 cycles!
The instruction mov ax, [1235h] will take 14 cycles.
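The decoding rule behind the table can be summarized in a small illustrative model (a sketch only, not the actual 8086 bus logic):

#include <stdio.h>
#include <stdint.h>

/* Illustrative model: derive A0 and BHE' for a 1- or 2-byte access,
   and how many bus cycles it needs. */
static void bus_access(uint32_t addr, int bytes) {
    if (bytes == 2 && (addr & 1)) {
        /* Word at an odd address: the 8086 splits it into two byte transfers,
           first the odd byte (A0=1, BHE'=0), then the even byte above it
           (A0=0, BHE'=1) — hence the extra 4 clocks. */
        printf("%05Xh word: A0=1 BHE'=0, then A0=0 BHE'=1 (2 bus cycles)\n",
               (unsigned)addr);
        return;
    }
    int a0  = addr & 1;                       /* low address bit                */
    int bhe = (bytes == 2 || a0) ? 0 : 1;     /* BHE' low when the odd lane is used */
    printf("%05Xh size=%d: A0=%d BHE'=%d (1 bus cycle)\n",
           (unsigned)addr, bytes, a0, bhe);
}

int main(void) {
    bus_access(0x1234, 1);   /* byte, even address */
    bus_access(0x1235, 1);   /* byte, odd address  */
    bus_access(0x1234, 2);   /* word, even address */
    bus_access(0x1235, 2);   /* word, odd address  */
    return 0;
}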

How is this CRC calculated correctly?

I'm looking for help. The chip I'm using via SPI (MAX22190) specifies:
CRC polynomial: x^5 + x^4 + x^2 + x^0
The CRC is calculated using the first 19 data bits padded with the 5-bit initial word 00111.
The 5-bit CRC result is then appended to the original data bits to create the 24-bit SPI data frame.
The CRC result I calculated with multiple tools is: 0x18
However, the chip shows a CRC error on this. It expects: 0x0F
Can anybody tell me where my calculations are going wrong?
My input data (19 data bits) is:
19-bit data:
0x04 0x00 0x00
0000 0100 0000 0000 000
24-bit, padded with init value:
0x38 0x20 0x00
0011 1000 0010 0000 0000 0000
=> Data sent by me: 0x38 0x20 0x18
=> Data expected by chip: 0x38 0x20 0x0F
The CRC algorithm is explained here.
I think your error comes from the 00111 padding, which must be appended on the right side instead of on the left.
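For experimenting with the padding order, here is a generic bit-by-bit CRC-5 sketch for the polynomial x^5 + x^4 + x^2 + 1 (MSB first, register initialized to 0). With the 00111 word on the left it reproduces the 0x18 from the question; which arrangement the chip really expects should be checked against the datasheet's own description of the algorithm.

#include <stdio.h>
#include <stdint.h>

/* Generic bit-by-bit CRC-5, polynomial x^5 + x^4 + x^2 + 1
   (0x15 are the coefficients below x^5), processed MSB first.
   A sketch for experimentation only; the authoritative bit ordering
   is the one in the MAX22190 datasheet. */
static uint8_t crc5(const uint8_t *bits, int nbits) {
    uint8_t crc = 0;
    for (int i = 0; i < nbits; i++) {
        uint8_t feed = (bits[i] & 1) ^ (crc >> 4);
        crc = (uint8_t)(((crc << 1) & 0x1F) ^ (feed ? 0x15 : 0x00));
    }
    return crc;
}

int main(void) {
    /* The 24 bits 0x38 0x20 0x00 from the question: 00111 followed by
       the 19 data bits. This arrangement reproduces the 0x18 the tools gave. */
    uint8_t padded_left[24] = {0,0,1,1,1, 0,0,0,0,0,1,0,0,
                               0,0,0,0,0,0,0,0, 0,0,0};
    printf("crc5 = 0x%02X\n", crc5(padded_left, 24));
    return 0;
}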

What does the TX request option "disable ack" do exactly in the XBee API?

The question, as mentioned in the title, is what the TX options field value 0x01 (disable ack) actually does. I assumed it disables the APS-layer acknowledgement and the additional APS retries. But those occur anyway, even with APS acknowledgement disabled. The retry counter of the TX status frame still counts, sometimes up to 60. I think this is a bit too much for the MAC-layer retries. Or are there also retries at the NWK layer?
Regards Toby
The option 0x01 on a TX Request (API frame) doesn't disable the acknowledgement; it disables the retries (up to 3). The following is an example of a TX Request frame with retries disabled:
7E 00 0F 10 01 00 13 A1 00 40 AA D0 06 FF FE 00 01 04 78
To disable the acknowledgement you need to set the Frame ID of the TX Request to 0x00. Here is an example:
7E 00 0F 10 00 00 13 A1 00 40 AA D0 06 FF FE 00 00 04 7A
I guess the Transmit Retry Count (from the ZigBee Transmit Status frame) is related to CSMA-CA.
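For reference, both example frames above can be checked with the standard XBee API-mode checksum (0xFF minus the low byte of the sum of the frame-data bytes); the only differences between them are the Frame ID and options bytes.

#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

/* Verify an XBee API frame: 0x7E, 16-bit length, frame data, checksum. */
static int frame_ok(const uint8_t *f, size_t total) {
    uint16_t len = (uint16_t)((f[1] << 8) | f[2]);
    uint8_t sum = 0;
    for (uint16_t i = 0; i < len; i++)
        sum += f[3 + i];
    return f[0] == 0x7E && total == (size_t)len + 4
        && (uint8_t)(0xFF - sum) == f[3 + len];
}

int main(void) {
    /* TX Request, Frame ID 0x01, options 0x01 (disable retries). */
    uint8_t no_retries[] = {0x7E,0x00,0x0F,0x10,0x01,0x00,0x13,0xA1,0x00,
                            0x40,0xAA,0xD0,0x06,0xFF,0xFE,0x00,0x01,0x04,0x78};
    /* TX Request, Frame ID 0x00 (no acknowledgement / TX Status), options 0x00. */
    uint8_t no_ack[]     = {0x7E,0x00,0x0F,0x10,0x00,0x00,0x13,0xA1,0x00,
                            0x40,0xAA,0xD0,0x06,0xFF,0xFE,0x00,0x00,0x04,0x7A};
    printf("%d %d\n", frame_ok(no_retries, sizeof no_retries),
                      frame_ok(no_ack, sizeof no_ack));   /* both print 1 */
    return 0;
}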

addressing mode efficiency

Can someone tell me if 'immediate' addressing mode is more efficient than addressing through [eax] or any other way? Let's say that I have a long function with some reads and some writes (say 5 reads, 5 writes) to some int value in memory:
mov some_addr, 12 // immediate addressing
//other code
mov eax, some_addr
//other code
mov some_addr, eax // and so on
versus
mov eax, some_addr
mov [eax], 12 // addressing thru [eax]
//other code
mov ebx, [eax]
//other code
mov [eax], ebx // and so on
Which one is faster?
Probably the register-indirect access is slightly faster, but for sure it is shorter in its encoding, for example (warning: gas/AT&T syntax):
67 89 18 mov %ebx, (%eax)
67 8b 18 mov (%eax), %ebx
vs.
89 1c 25 00 00 00 00 mov %ebx, some_addr
8b 1c 25 00 00 00 00 mov some_addr, %ebx
So it has some implications for instruction fetch, cache use, etc., so it is probably a bit faster, but in a long function with only some reads and writes I don't think it is of much importance...
(The zeros in the hex code are to be filled in by the linker, just to have said this.)
[update date="2012-09-30 ~21h30 CEST":
I have run some tests and I really wonder at what they revealed — so much so that I didn't investigate further :-)
48 8D 04 25 00 00 00 00 leaq s_var,%rax
C7 00 CC BA ED FE movl $0xfeedbacc,(%rax)
performs in most runs better than
C7 04 25 00 00 00 00 CC BA ED FE movl $0xfeedbacc,s_var
I'm really surprised, and now I'm wondering how Maratyszcza would explain this. I already have an idea, but these (example) results
movl to s_var
All 000000000E95F890 244709520
Avg 00000000000000E9 233
Min 00000000000000C8 200
Max 0000000000276B22 2583330
leaq s_var, movl to (reg)
All 000000000BF77C80 200768640
Avg 00000000000000BF 191
Min 00000000000000AA 170
Max 00000000001755C0 1529280
might for sure support his statement that the instruction decoder takes at most 8 bytes per cycle, yet they don't show how many bytes are really decoded.
In the leaq/movl pair, each instruction (including operands) is less than 8 bytes, so it is likely that each instruction is dispatched within one cycle, while the single movl has to be split into two. Still I'm convinced that it is not the decoder slowing things down, since even with the 11-byte movl its work is done after the third byte — then it just has to wait for the pipeline to stream in the address and the immediate, both of which need no decoding.
Since this is 64-bit mode code, I also tested with the 1-byte-shorter RIP-relative addressing — with (almost) the same result.
Note: These measurements might heavily depend on the (micro-)architecture they are run on. The values above were obtained running the test code on an Atom N450 (constant TSC, booting at 1.6 GHz, fixed at 1.0 GHz during the test run), which is unlikely to be representative of the whole x86(-64) platform.
Note: The measurements are taken at runtime, with no accounting for task/context switches or other intervening interrupts!
/update]
Addressing using registers is fastest.
Please see Register addressing mode vs Direct addressing mode
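To see both forms come out of a compiler yourself, a tiny C test case (the names are made up for illustration) can be compiled with -S and the emitted addressing modes compared:

/* Compile with e.g. gcc -O2 -S and compare the addressing modes emitted:
   the first store uses an absolute (direct) address, the second goes
   through a pointer that is already held in a register. */
int some_addr;                        /* hypothetical global, as in the question */

void store_direct(void)     { some_addr = 12; }
void store_indirect(int *p) { *p = 12; }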

x86 binary bloat - 32-bit offsets when 8 bits would do

I'm using clang+LLVM 2.9 to compile various workloads for x86 with the -Os option. Small binary size is important and I must use static linking. All binaries are 32-bit.
I notice that many instructions use addressing modes with 32-bit displacements when only 8 bits are actually used. For example:
89 84 24 d4 00 00 00 mov %eax,0xd4(%esp)
Why didn't the compiler/assembler choose the compact 8-bit displacement?
89 44 24 d4 mov %eax,0xd4(%esp)
In fact, these wasted addressing bytes are over 2% of my entire binary!
I looked at LLVM's link time optimization and tried --emit-llvm, but it didn't mention or help this issue.
Is there some link-time optimization that can use knowledge of the actual displacements to choose the smaller instruction form?
Thanks for any help!
In x86, 8-bit displacements are signed. This allows you to access data on both sides of the base address, so the range of an 8-bit displacement is -128 to 127. Your instruction references data 212 bytes forward (0xD4 in decimal is 212). If that byte were interpreted as an 8-bit displacement, it would mean -44 in decimal, which is not what you wanted.
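A one-liner makes the sign issue visible:

#include <stdio.h>
#include <stdint.h>

int main(void) {
    int8_t as_disp8 = (int8_t)0xD4;   /* what the 1-byte displacement field would mean */
    printf("0xD4 as signed 8-bit: %d\n", as_disp8);   /* -44, i.e. -0x2c(%esp) */
    printf("0xD4 as 32-bit disp : %d\n", 0xD4);       /* 212, the intended offset */
    return 0;
}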