CAN bus arbitration backoff time - embedded

I am aware of the way the CAN bus does its arbitration. In a nutshell, the node whose identifier has the most leading '0's (i.e. the lowest identifier) wins the right to transmit on the bus, and the rest of the contending nodes back off.
But I can't find any details on how long a backed-off node waits before retrying to win the bus back. I consulted a few sources but still can't find the answer. Is there any experimental evidence for this?
Bosch CAN
Introduction to the Controller Area Network

It is free to try again after the winning frame has been transmitted and no dominant bit has been found in the "intermission field" at the end of the CAN frame. You'll probably find a formal definition of this if you search the spec for "intermission field"; see for example section 3.1.5 of the old (now obsolete) Bosch spec you linked.
The important part here is to realize that every CAN controller listens to every single frame, even if it isn't interested in it. This is how you achieve collision avoidance, rather than collision detection.

As mentioned in the Bosch CAN specification, all CAN nodes may start sending pending frames when the Bus Idle condition occurs (no dominant bit found on the bus). During the intermission field of the interframe space no node may start a transmission (Overload frames may be transmitted, but not Data or Remote frames). Nodes must wait for 3 recessive bits during this period; all nodes may start transmitting right after this intermission period.
If multiple nodes start at once after the intermission period, the frame with the lowest identifier wins arbitration. If a Data frame and a Remote frame with the same identifier are started by different nodes, the Data frame wins, because its dominant RTR bit beats the Remote frame's recessive one.
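To make the bit-by-bit mechanism concrete, here is a small Python sketch of the wired-AND arbitration (purely illustrative, not taken from the spec; the identifiers and the 11-bit width are just example values):

    # Illustrative sketch of CAN bitwise arbitration (dominant 0 beats recessive 1).
    # Each node sends its identifier MSB-first; a node that sends recessive but
    # reads dominant on the bus has lost arbitration and backs off.
    def arbitrate(identifiers, width=11):
        """Return the identifier that wins among nodes starting simultaneously."""
        contenders = list(identifiers)
        for bit in range(width - 1, -1, -1):                      # MSB first
            bus = min((i >> bit) & 1 for i in contenders)         # wired-AND of all drivers
            contenders = [i for i in contenders if (i >> bit) & 1 == bus]
        return contenders[0]                                      # unique if IDs are unique

    print(hex(arbitrate([0x123, 0x0F0, 0x1FF])))                  # -> 0xf0, the lowest ID

The losers simply become receivers of the winning frame and, as described above, contend again in the next arbitration after the intermission field.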

I agree with the answers above, but I was looking for a more mathematical analysis of CAN bus timings. I found these excellent lecture notes: Time analysis of CAN messages, Chapter 3.
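For quick reference, a standard result from that style of analysis (not verified against those particular notes, so treat it as an assumption): the worst-case transmission time of a frame with an 11-bit identifier and s_m data bytes, including worst-case bit stuffing, is commonly given as

    C_m = (55 + 10 s_m) \tau_{bit}

where \tau_{bit} is the bit time on the bus. For example, at 500 kbit/s (2 µs per bit) an 8-byte frame occupies the bus for at most about 135 bits, i.e. 270 µs.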

Related

Does "error frames" on CAN bus delay/ impair the communication?

The quote below is from a document by Texas Instruments.
The error frame is a special message that violates the formatting
rules of a CAN message. It is transmitted when a node detects an error
in a message, and causes all other nodes in the network to send an
error frame as well. The original transmitter then automatically
retransmits the message. An elaborate system of error counters in the
CAN controller ensures that a node cannot tie up a bus by repeatedly
transmitting error frames.
Also, this wikipedia page provides more information on error frames.
As mentioned in several answers (link1, link2), CAN bus is half-duplex, that is, the nodes cannot transmit and receive data at the same time.
In general, a modern car contains more than 50 ECUs (nodes) on a CAN network. In case of an error, if the nodes sent error frames one after another, the CAN bus would be occupied for quite a long time.
So, what am I missing here? Do the nodes send their error frames simultaneously, with the hardware resolving the collision? And what happens if a node transmits a different or corrupted error frame?
The other nodes will not send error-frames one after the other in most cases. If there is an error on the bus, it is very likely that all nodes will perceive it. They will all then send their error-frames at (close to) the same time. As they all expect to see "their" error-frame, it does not matter who gets there first.
In the (unusual) case of an error being noticed by only one node (perhaps some transient within that ECU), it will transmit an error-frame. The other ECUs will react to this error-frame (which is "simply" a violation of the stuffing rules) with their own error-frames. But again, they will all see it at the same time, so the case described above applies: they will all transmit their "own" error-frame at about the same time.
As noted by @Lundin in the question comments, error-frames are very unusual, so their impact on bus loading is not a major concern.
I do not understand this part of your question:
What happens if a node transmitted a different or corrupted error frame?
A node "cannot" transmit a different error-frame - it would not be an error-frame. An error-frame being corrupted is very unlikely as it is a string of dominant bits, which are driven hard, and usually by several to many ECUs at a time. If it were to happen, I think (but would have to check the spec) the ECUs would notice this as another error and transmit another error-frame.
A node that repeatedly sends active error frames sees its error counters rise: it first passes the warning limit, then becomes "error passive", and eventually goes "bus off". This prevents a broken node from becoming a "babbling idiot".
See Bosch CAN specification page 63
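As a rough illustration of the fault-confinement thresholds (values from the Bosch spec; the increment/decrement rules are simplified here, since real controllers add 8 to the transmit error counter per transmit error, subtract 1 per successful transmission, and apply several special cases):

    # Simplified sketch of CAN fault confinement; only the state thresholds are shown.
    ERROR_WARNING = 96    # controller may raise a warning flag
    ERROR_PASSIVE = 128   # node may only signal errors with recessive (passive) flags
    BUS_OFF       = 256   # node must stop driving the bus until it is reintegrated

    def node_state(tec, rec):
        """Map transmit/receive error counters (TEC/REC) to a coarse node state."""
        if tec >= BUS_OFF:
            return "bus off"
        if tec >= ERROR_PASSIVE or rec >= ERROR_PASSIVE:
            return "error passive"
        if tec >= ERROR_WARNING or rec >= ERROR_WARNING:
            return "error active (warning)"
        return "error active"

    print(node_state(tec=264, rec=0))   # -> "bus off": the babbling node goes silent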

Constant carrier digital transmission in GNURadio with USRP

I'm trying to implement the UPLINK of a ground station controlling a small satellite. The idea is that the link should always stay active between transmitted telecommands. For this, I need to insert some DUMMY or IDLE sequence bytes such as 0xAA or similar.
I have found that some people already faced a similar issue and posted their questions here:
https://www.ruby-forum.com/t/constant-carrier-digital-transmission/163379
https://lists.gnu.org/archive/html/discuss-gnuradio/2016-08/msg00148.html
So far, the best I could achieve was to modify the EventStream Source block from https://github.com/osh/gr-eventstream in order to preload the vectors with my dummy sequence (i.e. 0xAA) instead of preloading them with zeroes. This is a general overview of the GNURadio graph I'm using:
GNURadio Flowgraph Picture
This solution, however, introduces a huge latency: the sent message does not appear at the output until a large amount of time has passed (on the order of several seconds).
Is there a way of programming the USRP using GNU Radio so that it constantly sends a fixed sequence, interrupted only when an incoming message is passed? I assume that the USRP can read tagged streams in order to schedule transmissions; however, I'm not sure how to fit this into my specific application.
Thanks beforehand!
Joa
I believe this could be done using a TCP or UDP source block.
Your control information could be sent to the socket using TCP/UDP, and GNU Radio would then collect and transmit the packets. Your master control program would then have to handle the IDLE stuffing, but solving the problem external to GNU Radio is easier.
Your master control program would basically do the following:
1. Transmit control data as needed.
2. If no control data is ready before the next packet must be sent, send an IDLE packet (see the sketch below).
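A minimal sketch of such an external control loop (assuming the GNU Radio flowgraph contains a UDP Source block bound to 127.0.0.1:52001; the port, packet length, IDLE pattern and cadence are placeholders to be matched to your real frame parameters):

    # Keeps the uplink busy: real telecommands when available, IDLE stuffing otherwise.
    import queue
    import socket
    import time

    PACKET_LEN = 64                        # bytes per packet (assumption)
    PACKET_PERIOD = 0.05                   # seconds between packets (assumption)
    IDLE_PACKET = bytes([0xAA]) * PACKET_LEN

    telecommands = queue.Queue()           # filled elsewhere by the ground-station software
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    target = ("127.0.0.1", 52001)          # UDP Source block in the flowgraph (assumption)

    while True:
        try:
            payload = telecommands.get_nowait()   # 1. tx control data as needed
        except queue.Empty:
            payload = IDLE_PACKET                 # 2. otherwise stuff an IDLE packet
        sock.sendto(payload, target)
        time.sleep(PACKET_PERIOD)                 # keep a constant packet cadence

This keeps the modulator fed continuously, so the carrier never drops between telecommands.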

NTPD Pseudo Servers

Good evening!
I'm configuring NTP on an embedded Linux system connected to a u-blox GPS receiver. I'm using NTPD and GPSD.
I would like to know the technical difference between:
the PPS signal provided via the GPSD shared memory (SHM) driver (pseudo IP address 127.127.28.1);
the "stand-alone" PPS driver (pseudo IP address 127.127.22.0), which is still connected to the GPS in some way that I would like to understand.
It is critical for me to understand this because I really need a high level of synchronization and I would like to get the right information from the receiver.
Searching all over the web I've found only confusing answers to my question...
Thanks in advance!
FL
The SHM driver is not designed to provide a PPS signal by itself. So maybe your notion here is misguided.
A PPS signal is used for getting a (precise) notion of the frequency of the local clock (the one used for measuring external signals), as it just provides a well-known timing distance between the "pulses" (1 s in this case). In essence, PPS is a frequency source.
GPSD, on the other hand, communicates with some device (which could be built into your hardware). It then provides the time data read from the GPS source to ntp via shared memory. This provisioning of data does not guarantee any particular timing relation (delay): it could, for example, occur earlier or later within the second due to load or scheduling.
From the perspective of ntp, you will have a true date/time label, but you might not know exactly when the related point in time occurred relative to your local clock (usually not precisely enough for common ntp use cases). This is where PPS kicks in.
Depending on how the GPS device is connected to your local machine (parallel, serial, internal bus), you will have some way of getting an interrupt on the pulse of the PPS signal (e.g. with a serial connection you will usually get a transition on the DCD pin).
The internal processing of the related interrupt will read the local clock, and the resulting timing information is then provided for further processing. This is exactly what the PPS clock discipline uses and provides to ntp. What you need to configure here is the offset from the triggering of the pulse to the reading of the local clock (the pulse is usually assumed to occur "on the second").
So, in your configuration, it is likely that the "source" of the PPS signal is the same GPSD is using for providing date/time data (your GPS device).
However, the actual signals used for date/time data and for PPS are different. Date/time will use a data telegram or some register content read from the GPS device, while PPS will be a level change on an input pin provided by this very device.
For details, start with the interfacing information for your GPS receiver, especially any timings stated there. Then look at ntp and figure out which driver(s) will allow exploiting such input data for the best time quality.
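By way of illustration, a typical ntpd configuration combines both drivers, roughly like the fragment below (the unit numbers, device and fudge flags are assumptions and depend on how gpsd is set up on your system):

    # Coarse date/time labels from gpsd via the shared-memory (type 28) driver
    server 127.127.28.0 minpoll 4 maxpoll 4 prefer
    fudge  127.127.28.0 refid GPS

    # Precise pulse edge via the kernel PPS (type 22) driver, e.g. from /dev/pps0
    server 127.127.22.0 minpoll 4 maxpoll 4
    fudge  127.127.22.0 refid PPS flag3 1

The SHM source numbers the seconds; the PPS source then pins those seconds to the pulse edge, which is what gives you the high-precision synchronization.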

Get packet loss from Open Flow switch

I am using the Ryu controller (3.22) to monitor switches (Open vSwitch 2.0.2, supporting OpenFlow 1.3) which are part of a virtual network created using Mininet (2.1.0). It is a tree topology with depth = 2 and fanout = 5. I am using switch_monitor.py.
With the help of the controller, I can get port statistics using the EventOFPPortStatsReply decorator: rx_packets, rx_bytes, rx_errors, tx_packets, tx_bytes, tx_errors, rx_dropped, tx_dropped, etc.
But the values of rx_dropped and tx_dropped always come out as zero, even when the switches are actually dropping packets, as reported by the qdisc (Linux) command.
How can I get packet loss statistics from an OpenFlow switch?
a. How can I get a non-zero value?
b. Is there any alternative way?
qdisc reports what the kernel is dropping, not what the network is dropping. You're getting zeros because the switch isn't dropping frames.
(I don't know if your virtual network system supports simulating frame drops.)
I believe that the dropped counters only count packets that are dropped due to actual drop rules or buffer overflows.
Another way to calculate packet loss is to compare the packet counts of the two switches at the ends of a link. Suppose you have A <--> B and want to calculate the packet loss rate from A to B. Then you take:
plr(A,B) = (tx_packets(A) - rx_packets(B)) / tx_packets(A)
Beware though that sometimes the counters are reset, leading to rx_packets being higher than tx_packets. I am seeing this behavior in my SDN software and tend to invalidate the results if there are such strange combinations.
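As a sketch of how this could look inside a Ryu monitoring app (the dictionaries and the (dpid, port) keys are placeholders; a real app would fill them in its EventOFPPortStatsReply handler):

    # Estimate per-link packet loss from port counters collected by the monitor.
    # tx_counts/rx_counts map (datapath_id, port_no) -> cumulative packet count.
    def packet_loss_rate(tx_counts, rx_counts, a, b):
        """Loss rate on the link from egress port a to ingress port b."""
        tx, rx = tx_counts[a], rx_counts[b]
        if tx == 0:
            return 0.0
        if rx > tx:
            return float("nan")   # counters reset or stats not synchronized: invalid
        return (tx - rx) / tx

    # Example with made-up counter values:
    tx_counts = {(1, 2): 10000}   # switch A (dpid 1), egress port 2
    rx_counts = {(2, 1): 9985}    # switch B (dpid 2), ingress port 1
    print(packet_loss_rate(tx_counts, rx_counts, (1, 2), (2, 1)))   # -> 0.0015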

USB CDC device stalling

I'm writing a simple virtual serial port device to replace an older serial port. By this point I'm able to enumerate the device and send/receive characters.
After a varying number of bulk OUT transmissions from the host to the device, the endpoint appears to give up and stop transferring data. On the PC side I receive a write error, and judging from a USBlyzer trace the music stops on a stall (USBD_STATUS_STALL_PID). However, my code never intentionally issues a STALL condition on that endpoint, and the status flag for having generated one never gets set.
Given the short amount of time (<300 µs) that elapses between issuing the request and the STALL, it would appear to be an invalid response of some sort and not a time-out. On the device side the OUT endpoint is ready to go, with data in the buffer and proper DATA0/1 synchronization, but nothing further ever happens.
Note that the device appears to work fine even for long periods of time until I start sending "large" quantities of data. As near as I can tell the device enumeration/configuration also appears to complete successfully. Oh, and the bulk-in endpoint continues to work just fine after this.
For the record I'm using the standard Windows usbser.sys driver and an XMega128A4U µP. I'm also seeing the same behaviour across multiple Windows Vista and 7 machines.
Any ideas what I'm doing wrong or what further tests I might run to narrow things down?
USBlyzer log,
USB CDC stack,
test project
For the record this eventually turned out to be an oscillator problem. (Apparently the FLL's reference is always 1,024 Hz even when the 1,000 Hz USB frames are chosen. The slight clock error meant that a packet occasionally got rejected if it happened to contain one too many 1-bits in a row.)
I guess the moral of the story is to check the basics before assuming you've got a problem with the higher-level protocol. Also, in retrospect, a hardware USB analyzer would have been a worthwhile investment; the software alternatives mostly seem to spit out a generic error code, or nothing at all, when something goes awry.
Stalling of the OUT endpoint may happen on an overflow of the output buffer on the host side. Are you sure that the device actually fetches the data it receives via the OUT endpoint, and if so, does it fetch the data at least as fast as it is sent to the device?
Note that the device appears to work fine even for long periods of time until I start sending "large" quantities of data.
This seems to be a hint at an overflow of the output buffer.