How is the CPU clock connected to other components?

How is the CPU clock connected to other components, and what do people mean by saying all operations start at a clock tick?

The CPU clock drives the CPU. Internally there is a bus system, basically a bundle of electrical connections. One, and only one, device may put its data on it at a time: a register output, or the ALU result, for example. (They don't decide this themselves; the control unit tells them to, and it makes sure only one entity accesses the bus in write mode.)
Reading the bus immediately would be unsafe: the logic may make the electrical signal fluctuate several times while it propagates through the gates before it stabilizes, and some signals arrive earlier than others. This depends on capacitive and inductive effects and the like, which delay the signals.
Because of this, nothing reads the data off the bus until the clock triggers. The clock pulse indicates that enough time has passed that the signals ought to be safe, and the value on the bus is assumed to be stable.
This is done simply with an AND gate or an edge detector on the clock signal at the devices that want to read from the bus.
Example:
Data-Bus ----/8----- [   ]
Data-In  ----------- [ R ]
Data-Out ----------- [ E ]
Clk      ----------- [ G ]
' Data-Out may be asynchronous like this (though not recommended), or gated on the falling/low clock phase, or the Data-Out signal is clock-synced:
Data-Bus[0] = Data-Out AND Data[0]
Data-Bus[1] = Data-Out AND Data[1]
Data-Bus[2] = Data-Out AND Data[2]
[...]
' Data-In will almost always be clock synced
If (Data-In AND Rising-Clk-Edge)
{
Data[0] = Data-Bus[0]
Data[1] = Data-Bus[1]
[...]
}
This is of course highly dependent on your actual hardware. For example, Read-Enable, Write-Enable and Output-Enable can be active low, etc.
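As a software analogy, here is a minimal C sketch of the discipline described above; it is not modeled on any real device, and the names (reg8, tick) are made up. The control unit asserts exactly one Output-Enable, and readers latch the bus only on a rising clock edge:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* One register attached to the bus: data_in = Read-Enable (R),
   data_out = Output-Enable (E), as in the diagram above. */
typedef struct { uint8_t data; bool data_in; bool data_out; } reg8;

static uint8_t bus;      /* the shared 8-bit data bus */
static bool prev_clk;    /* previous clock level, for edge detection */

static void tick(reg8 *r, int n, bool clk) {
    bool rising_edge = clk && !prev_clk;   /* the edge detector */
    prev_clk = clk;
    for (int i = 0; i < n; i++)            /* exactly one driver, per the CU */
        if (r[i].data_out) bus = r[i].data;
    if (rising_edge)                       /* readers sample only on the edge */
        for (int i = 0; i < n; i++)
            if (r[i].data_in) r[i].data = bus;
}

int main(void) {
    reg8 r[2] = { { 42, false, true },     /* r[0] drives the bus */
                  {  0, true, false } };   /* r[1] latches from it */
    tick(r, 2, false);                     /* clock low: bus settles */
    tick(r, 2, true);                      /* rising edge: r[1] latches */
    printf("%u\n", r[1].data);             /* prints 42 */
    return 0;
}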
There is a great YouTube series of a guy actually building a CPU on a breadboard. While this is of course highly simplified compared with what a modern CPU does, it helps to understand the basics.
The CPU clock itself is usually not directly connected to other hardware; instead, the CPU generates the trigger pulses that tell the other components it is now safe to read/write from/to the bus.
Each instruction may be made up from several microinstructions. For example:
LDA #5, Load 5 into the A-Register
' Fetch
Put IP on Address bus, Enable memory out => Opcode for LDA-Immediate is now on Data-Bus
Write-Enable Instruction Register, Increment IP
' Decode with combinatorial logic
' CU realizes it needs another word from memory (the value) and it needs to go to the A-Register
' Execute
Put IP on Address bus, Enable memory out => #5 now on Data-Bus
Increment IP, Write-Enable A-Register => #5 now in A-Register
' Done
This sequencing is driven by the CPU clock; the system clock has little to do with it.
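Written out as a sketch, the micro-sequence above looks like this; the opcode value is illustrative (0xA9 happens to be the 6502's LDA-immediate encoding, borrowed here only as a stand-in):

#include <stdint.h>
#include <stdio.h>

int main(void) {
    uint8_t memory[] = { 0xA9, 5 };   /* LDA-immediate opcode, then the value */
    uint8_t ip = 0, ir = 0, a = 0, data_bus;

    /* Fetch: IP on address bus, memory out enabled, latch IR, increment IP */
    data_bus = memory[ip]; ir = data_bus; ip++;

    /* Decode: the CU sees LDA-immediate and knows one more word is needed */
    if (ir == 0xA9) {
        /* Execute: IP on address bus, memory out, latch A, increment IP */
        data_bus = memory[ip]; a = data_bus; ip++;
    }
    printf("A = %u\n", a);  /* prints A = 5 */
    return 0;
}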

Related

How do I prevent the Raspberry Pi Pico from taking ALL background actions and interrupts to create a pulse in software?

My goal is to create a single pulse with the RPi Pico by clearing and setting a hardware pin in software. I am attempting to do this in software because I did not see a way to produce a single non-repetitive pulse through one of the timer channels.
The pulse-width resolution can be as coarse as 32 ns, which should be easy to achieve with a 125 MHz clock. Any pulse longer than 1 us on the hardware pin will physically destroy the circuit.
In the simplest form the code should initialize a pin to the high state, pull the pin low, wait, and then set the pin high. There should be a way to predictably adjust the time between the low and high states.
In the code below, I would expect that every NOP between the gpio_clr and gpio_set commands increased the pulse width by either 8ns or 16ns. However, there is no consistency to the relation between pulse width and the number of NOP instructions between the gpio_clr and gpio_set commands. Sometimes it is 16ns, other times it is nearly a microsecond.
I have tried porting the C code to assembly and it did not change the outcome. Neither did save_and_disable_interrupts(). When placed into a while loop, the first pulse is usually over a microsecond and the other pulses are usually a consistent width.
When I view the disassembly of the C code, there are no instructions between the gpio_clr and gpio_set functions.
I have the impression that the Pico is taking a background action between the clearing and setting of the pin. I am hoping someone knows how to make this code execute sequentially, as it was written.
I would accept an answer that demonstrated how to use a timer channel to provide a single, non-repetitive pulse with 32ns of resolution. The alarm functions seem to have approximately 7us of overhead which makes them unusable.
#include "pico/stdlib.h"
#include "hardware/sync.h"
int main() {
const uint pulse_len = 1;
gpio_init(6);
sio_hw->gpio_set = 1u<<6;
gpio_set_dir(6, GPIO_OUT);
sio_hw->gpio_set = 1u<<6;
uint i=0;
//__asm("cpsid if");
uint32_t istatus = save_and_disable_interrupts();
//provide a single pulse
sio_hw->gpio_clr = 1u<<6;
//__asm("nop"); I would expect each NOP to increase pulse width by the same amount
//__asm("nop");
//__asm("nop");
//__asm("nop");
//__asm("nop");
//__asm("nop");
sio_hw->gpio_set = 1u<<6;
restore_interrupts(istatus);
//__asm("cpsie if");
while(true);
}
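One guess (an assumption, not confirmed above) is that the inconsistency comes from the code executing out of external flash through the XIP cache, so the first pass through the critical section stalls on cache fills. A sketch of pinning the pulse routine in RAM with the SDK's __not_in_flash_func() macro, under that assumption; fire_pulse is a made-up name:

#include "pico/stdlib.h"
#include "hardware/sync.h"

// Runs entirely from RAM, so instruction fetch time is fixed (assuming the
// jitter really is XIP flash cache fills -- unverified).
static void __not_in_flash_func(fire_pulse)(void) {
    uint32_t istatus = save_and_disable_interrupts();
    sio_hw->gpio_clr = 1u << 6;
    __asm volatile ("nop");   // each NOP should now widen the pulse consistently
    sio_hw->gpio_set = 1u << 6;
    restore_interrupts(istatus);
}

int main() {
    gpio_init(6);
    sio_hw->gpio_set = 1u << 6;
    gpio_set_dir(6, GPIO_OUT);
    fire_pulse();
    while (true);
}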

How is the value of the Program Counter incremented?

I am creating a primitive virtual machine, kind of inspired by LC-3 VMs but a 32-bit version. I am feeding the machine a set of instructions. After executing the first instruction, how will the PC know the location of the second instruction?
Is there a particular method of storing the instructions in memory in a systematic way, so that the PC knows the address of the next instruction?
Example: all instructions are stored linearly, as in memory[0] = instruction1, memory[1] = instruction2, etc.
Thank you for the help.
It depends on whether your processor architecture is RISC or CISC. In the context you asked about, a CISC processor has instructions whose sizes vary, say from 1 to 14 bytes, as on Intel processors. On RISC, each instruction has a fixed size, say 4 bytes, as on ARM processors. All the instructions of a program are stored, in sequence, in main memory. It is the processor's control unit that decides how much to increment the PC. Instructions are read from main memory in sequence.
So in a CISC architecture, a single 8-byte read from main memory can contain up to eight 1-byte instructions, e.g., repeated 'inc ax' instructions on Intel processors. After sending the first instruction for decode, the control unit will increment the PC by 1. At the other extreme, there could be an instruction like 'add REG, [BASE+INDEX+OFFSET]', which can take 13 bytes to store all the information in the instruction (opcode + REG id + base address + index + some offset). For such an instruction, two memory read operations would be required to fetch it completely. After sending it for decode, the control unit will increment the PC by 13.
For RISC it is simple: increment the PC by the instruction size (2, 4, ...).
The only exception is a branch. In that case the PC is overwritten, usually at the execute stage.
Instructions and data are generally grouped (segmented, in some processor architectures) and stored separately. A code segment will end with some kind of return or exit instruction. If the PC is set to some memory address where data is stored, the control unit will process that data as instructions. After all, both data and instructions are nothing but sequences of bits, and the control unit cannot tell them apart. It is usually the role of the OS, or of the programmer if there is no OS (as on microcontrollers), to prevent such an anomaly.
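For the memory[0] = instruction1, memory[1] = instruction2 layout the question proposes (fixed 32-bit instructions, word-addressed), the fetch loop can be as simple as this sketch; the opcodes and encoding are made up for illustration:

#include <stdint.h>
#include <stdio.h>

enum { OP_NOP = 0, OP_JMP = 1, OP_HALT = 2 };   /* illustrative opcodes */

int main(void) {
    /* top 8 bits = opcode, low 24 bits = operand (here, a jump target) */
    uint32_t memory[] = {
        OP_NOP  << 24,          /* memory[0] = instruction1 */
        OP_JMP  << 24 | 3,      /* memory[1] = jump to memory[3] */
        OP_NOP  << 24,          /* skipped by the jump */
        OP_HALT << 24,
    };
    uint32_t pc = 0;
    for (;;) {
        uint32_t instr = memory[pc];             /* fetch */
        uint32_t op = instr >> 24;               /* decode */
        pc += 1;                                 /* default: next instruction */
        if (op == OP_JMP) pc = instr & 0xFFFFFF; /* a branch overwrites the PC */
        else if (op == OP_HALT) break;
    }
    printf("halted, pc = %u\n", pc);             /* prints halted, pc = 4 */
    return 0;
}

With a CISC-style variable-length encoding, the only change is that the decoder returns the instruction's length and the PC advances by that amount instead of by 1.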

timing issues: simulation (iverilog, gtkwave) works, hardware (yosys) does not

I am learning Verilog, trying to make the "hello world" of the VGA world (a bouncing ball) on an iCE40HX1K board (Olimex iCE40HX1K + VGA I/O board).
I have a strange problem: when I simulate my design using iverilog + GTKWave, it seems to work fine. But the implementation in hardware does not work.
What is strange is that in the hardware implementation the ball doesn't move: its position stays at (0,0), although the Verilog code should never set it to that.
It looks like changing the value of xpos_ball or ypos_ball does not actually change these values (a hardware issue? a yosys issue?). In the iverilog simulation, the location of the ball does change as expected.
I have no idea if this is an error in my own Verilog code (as I am new at this, that is very well possible), an issue in yosys, or a problem in the hardware (a speed issue: is the 100 MHz clock too fast?), or something else.
Any proposals on how to troubleshoot this, or next steps for this kind of issue? Are there other debugging tricks I can use?
(edit: link to the actual verilog-code removed as not relevant anymore)
Kristoff
is the 100 MHz clock too fast?
Yes. That design is good for 39.67 MHz:
$ make vga_bounceball.rpt
icetime -d hx1k -mtr vga_bounceball.rpt vga_bounceball.asc
// Reading input .asc file..
// Reading 1k chipdb file..
// Creating timing netlist..
// Timing estimate: 25.21 ns (39.67 MHz)
Edit re comment:
You can always safely divide a clock by a power of two by using FFs as clock dividers:
input clk_100MHz;
reg clk_50MHz = 0; // initialization needed for simulation
reg clk_25MHz = 0;
always @(posedge clk_100MHz) clk_50MHz <= !clk_50MHz;
always @(posedge clk_50MHz) clk_25MHz <= !clk_25MHz;
(A non-power-of-two prescaler is not always safe without making sure with timing analysis that the prescaler itself can run in the high frequency domain.)

measuring time between two rising edges in beaglebone

I am reading a sensor's output, a square wave (0-5 V), on an oscilloscope. Now I want to measure the frequency of one period with the BeagleBone, so I need to measure the time between two rising edges. However, I don't have any experience working with the BeagleBone. Can you give some advice or sample code for measuring the time between rising edges?
How deterministic do you need this to be? If you can tolerate some inaccuracy, you can probably do it on the main Linux OS; if you want to be fancy pants, this seems like a potential use case for the BBB's PRUs (which I unfortunately haven't used, so take this with substantial amounts of salt). I would expect you could write PRU code that sits in an infinite outer loop; inside that loop, it first loops until it sees the pin show 0, then loops until the pin shows 1 (this is the first rising edge), then counts until either the pin shows 0 again (the falling edge) or, with another loop, until the next rising edge. Either way, you can take the counter value and convert it directly into time: the PRU is stated as having a fixed frequency for each instruction, running at 200 MHz (5 ns/instruction). Assuming your loop is something like
#starting with pin low
inner loop 1:
registerX = loadPin
increment counter
jump if zero registerX to inner loop 1
# pin is now high
inner loop 2:
registerX = loadPin
increment counter
jump if one registerX to inner loop 2
# pin is now low again
That should take 3 instructions per counter increment, so you can get the time as 3 * counter * 5 ns.
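As a quick sanity check of that arithmetic, here is a hypothetical helper using the numbers from above (3 instructions per loop iteration, 5 ns per instruction at 200 MHz):

#include <stdio.h>

/* Convert the PRU loop counter into a frequency. */
static double counter_to_hz(unsigned long counter) {
    double period_s = 3.0 * (double)counter * 5e-9;  /* seconds per period */
    return 1.0 / period_s;
}

int main(void) {
    printf("%.1f Hz\n", counter_to_hz(66667));  /* ~1 kHz input signal */
    return 0;
}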
As suggested by Foon in his answer, the PRUs are a good fit for this task (although depending on your requirements it may be fine to use the ARM processor and standard GPIO). Please note that (as far as I know) both the regular GPIOs and the PRU inputs use 3.3 V logic, and connecting a 5 V signal might fry your board! You will need an additional component or circuit to convert from 5 V to 3.3 V.
I've written a basic example that measures timing between rising edges on the header pin P8.15 for my own purpose of measuring an engine's rpm. If you decide to use it, you should check the timing results against a known reference. It's about right but I haven't checked it carefully at all. It is implemented using PRU assembly and uses the pypruss python module to simplify interfacing.

Bits Are Scrambled

The Problem: I send one value into a UART and nulls emerge on the other UART.
--- Details ---
These are both PIC processors (PIC24 and PIC32)
They are both hard wired onto a printed circuit board.
They are communicating, each via one of the UART modules which reside within them.
They are (ostensibly; according to docs) both configured for 115200 bps, 8-N-1
No handshaking, no CTS enabled, no RTS enabled; I'm just putting bytes on the wire and out they go.
(These are short little 4-byte commands and responses, which fit pretty neatly)
The PIC32 is running at 80 MHz.
The PIC24 has F[cy] = 14745600, i.e., it is running at 14.7456 MHz.
The PIC24 sends four bytes (a specific command sequence)
When I set a breakpoint at the Interrupt Service Routine for the UART, the PIC32 shows nulls. After the first four I see repeated hits on the (PIC32 code) breakpoint, and I continue to see nulls (which makes sense, since the PIC24 is not sending anything).
i.e., the UART appears to be repeatedly generating interrupts when there is no reason
I did not write the code on the PIC32 side, and I am learning daily how it works.
Then I let the code just run, and I inevitably wind up on a line that says
52570 1D01_335C 9D01_335C _general_execption_handler sdbbp 0x0
When I get there,
The cause register holds 0080181C
The EPC register holds 9D00F228
The SP register holds 9F8FFFA0
This happened like clockwork, so I got suspicious of the __ISR that would not stop. MPLAB showed me this...
432:
433: //*********************************************************//
434: void __ISR(_UART1_VECTOR, ipl5) IntUart1Handler(void) //MCU communication port
435: {
9D00F204 415DE800 rdpgpr sp,sp
9D00F208 401A7000 mfc0 k0,EPC
9D00F20C 401B6000 mfc0 k1,Status
9D00F210 27BDFF88 addiu sp,sp,-120
9D00F214 AFBA0074 sw k0,116(sp)
9D00F218 AFBB0070 sw k1,112(sp)
9D00F21C 7C1B7844 ins k1,zero,1,15
9D00F220 377B1400 ori k1,k1,0x1400
9D00F224 409B6000 mtc0 k1,Status
9D00F228 AFBF0064 sw ra,100(sp) ;<<<-------EPC register always points here
9D00F22C AFBE0060 sw s8,96(sp)
9D00F230 AFB9005C sw t9,92(sp)
9D00F234 AFB80058 sw t8,88(sp)
9D00F238 AFAF0054 sw t7,84(sp)
9D00F23C AFAE0050 sw t6,80(sp)
9D00F240 AFAD004C sw t5,76(sp)
9D00F244 AFAC0048 sw t4,72(sp)
9D00F248 AFAB0044 sw t3,68(sp)
9D00F24C AFAA0040 sw t2,64(sp)
9D00F250 AFA9003C sw t1,60(sp)
9D00F254 AFA80038 sw t0,56(sp)
9D00F258 AFA70034 sw a3,52(sp)
9D00F25C AFA60030 sw a2,48(sp)
9D00F260 AFA5002C sw a1,44(sp)
9D00F264 AFA40028 sw a0,40(sp)
9D00F268 AFA30024 sw v1,36(sp)
9D00F26C AFA20020 sw v0,32(sp)
9D00F270 AFA1001C sw at,28(sp)
9D00F274 00001012 mflo v0
9D00F278 AFA2006C sw v0,108(sp)
9D00F27C 00001810 mfhi v1
9D00F280 AFA30068 sw v1,104(sp)
9D00F284 03A0F021 addu s8,sp,zero
I look a little more closely at the numbers, and I see that at that time, if we add 100 (0x64) to FFA0 (the bottom 16 bits of the SP) we get 0x10004, which I am guessing is 4 too much.
The PIC manual (DS61143E, page 50) says that this nomenclature means SW: Store Word, Mem[Rs+offset] = Rt, and other experts have told me that the cause register is telling me that the EXCCODE bits are 7, which is the code for a bus exception on load or store.
Or, I'm totally guessing here (would love to get some experts' knowledge on this): something is not clearing something and I'm encountering infinite recursion in an interrupt handler.
All of this is starting to make sense.
THE QUESTION
Could someone please suggest the most common reasons for an interrupt like this to hit me repeatedly?
Does anyone see any relationship between the bogus nulls coming from the UART and this endlessly generated interrupt? Am I even on the right track?
In your answer, please tell me how to acknowledge the interrupt from the UART. I know how to do that on the PIC24 (I wrote that code entirely, in ASM), but I don't know how it is done in C on the PIC32. Assembly will be fine; I'll inline it. I'm working with code I didn't write here, and I thank you for your answers.
What is the most common reason that the UART (#1, in this case) would repeatedly generate interrupts?
The most common reason an interrupt service routine is called over and over is that the interrupt request is never acknowledged in the routine.
Are you sure you clear the corresponding IRQ bit?
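For reference, a sketch of what acknowledging looks like in C on a PIC32MX-class part; the register and flag names vary by device family and live in the device header, so treat them as assumptions to check against your data sheet. The RX flag is persistent while data remains in the FIFO, so drain it first, then clear the flag:

#include <xc.h>
#include <sys/attribs.h>

void __ISR(_UART1_VECTOR, ipl5) IntUart1Handler(void)
{
    // Drain the receive FIFO; the RX flag re-asserts while data remains.
    while (U1STAbits.URXDA) {
        char c = U1RXREG;
        (void)c;                     // ... handle the received byte here ...
    }
    IFS0CLR = _IFS0_U1RXIF_MASK;     // acknowledge: clear the RX interrupt flag
}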
To ease UART debugging you should first connect the UART to a PC and make sure your target can communicate both ways with the PC. With two targets at the same time, you can't determine on which one the problem is apart from inspecting signals with a scope.
On many devices an interrupt must be explicitly cleared to prevent the ISR from simply re-entering when complete.
In most cases a UART will have status bits that indicate the source of the interrupt; knowing that might tell you something, but not telling us makes it difficult to help you. You can inspect the UART registers directly in the debugger; however, in some devices the act of reading a bit may in fact clear it, and that is true in the debugger too, so be aware of that possibility (check the data sheet/user manual).
Some UARTs require their transmitter to be explicitly switched off to stop transmitting nulls, while others are triggered automatically when data is placed in the TX register and stop after the necessary number of bits have been shifted out. Again, check the data sheet/manual for the part. If the PIC32 code is known to be working, this possible error would be on the PIC24 side, which seems to fit. You can check this simply by putting an oscilloscope on the TX line from the PIC24: if it is transmitting, you will see at least start/stop bit transitions (framing). If there is nothing, the problem is probably at the PIC32 end.
While you have the scope out, you can check that the bit timing is correct and that you are actually transmitting at 115200. It is easy to get the clocking wrong, and that should be your first check. If the baud rate is incorrect, the PIC32 will likely generate framing error interrupts, which if not handled may persist indefinitely.
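Incidentally, that particular PIC24 clock is friendly to 115200. With the standard baud formula from the PIC24 family reference (assuming BRGH = 0, which is a guess about this project's configuration), the divisor works out exactly, so the baud rate itself can have zero error:

/* UxBRG = Fcy / (16 * baud) - 1
         = 14745600 / (16 * 115200) - 1
         = 8 - 1
         = 7   -- exact, no rounding, so 115200 bps with 0% rate error */

If UxBRG is anything other than 7 here, or the BRGH mode doesn't match the formula used, the actual bit rate will be off and framing errors at the PIC32 are likely.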
Another possibility is that after transmission the PIC24 leaves the line in the "break" state, and that the PIC32 UART is generating "line-break" interrupts. That is why it is important to look at the UART status registers to determine the interrupt cause.
As you can see, there are many possibilities. I think I have covered the most likely ones, but more methodical debugging effort and information gathering on your part is required. I hope I have guided you in this too.
Three root causes were in place...
The interrupt priority level was set at value 6 in the initialization code for UART1
The first line of the interrupt service routine was coded with an interrupt priority level of 5
The first three bytes of UART data were disappearing from the data stream (this is still unsolved)
Here's the not-so-obvious way they caused the problem:
First three bytes never appeared
Fourth byte did appear
Interrupt hit (as level 6) and invoked __ISR routine
__ISR was acting as an ipl5 handler
First instruction executed (possibly more, I couldn't debug that accurately)
As soon as the first instruction finished, the "higher" priority 6 interrupt immediately kicked in
This resulted in the same interrupt again
The process repeated itself infinitely.
In short order, a stack overflow resulted
The Fix
Make sure these two lines of code agree with each other...
The IPL line in the init code, the wrong way and then the right way:
//IPC6bits.U1IP=6; //// Wrong !!! Uart 1 IPL should not be 6 !!!
IPC6bits.U1IP=5; //// Uart 1 IPL = 5 Correct way; matches __ISR
Interrupt Service Routine
void __ISR(_UART1_VECTOR, ipl5) IntUart1Handler(void) //// Operating as IPL 5
:
:
:
:
A poor design decision, by the way: if both chips are on the same board, SPI would have been more feasible and a lot faster.