How do I assign data to an internal input port - input

I have an FPGA trying to read/write values to SDRAM on the same chip. What the SDRAM sees as IN, the top level sees as OUT, and vice versa. SDRAM "paths" are instantiated and brought to the top level. These paths have no direction. However, I know that the top level reads from and writes to the SDRAM. I tried a variation of the code shown and it compiled. The code below is an example that passes two values to the SDRAM and reads a third value. I have assigned a direction to the paths. Is my logic correct in that it sends two values and receives a third?
use IEEE.STD_LOGIC_UNSIGNED.ALL; -- see page 36 of Circuit Design with VHDL
port(
-- ---------------------------------------------------------------------
-- Global signals ------------------------------------------------------
CLK : in std_logic;
RESET : in std_logic;
A : out std_logic_vector(15 downto 0);
B : out std_logic_vector(15 downto 0);
C : in std_logic_vector(15 downto 0);
end entity sigma_k_top;
architecture rtl of function_top is
signal cntr : std_logic_vector(31 downto 0);
signal sig_A : std_logic_vector(15 downto 0);
signal sig_B : std_logic_vector(15 downto 0);
signal sig_C : std_logic_vector(15 downto 0);
begin
sdram_inst : entity work.sdram
port map (
CLK => sdram_CLK_in, --CLK shared by all
A => sdram_A_in, -- Write to sdram
B => sdram_B_in,-- Write to sdram
C => sdram_C_out, --Read from sdram
);
transfer: process(CLK)
begin
IF rising_edge(CLK) then
cntr <= cntr + 1;
if cntr = 1000 then --
sig_A <= "1000000000000000";
sig_B <= "1000000000000000";
end if;
if cntr = 1001 then
if C(0) = '1' then
sig_A <= sig_A - 1; -- Writing
sig_B <= sig_B + 1; -- Writing
xfer <= C ; -- Reading
end if;
end if;
if cntr > 2000 then
cntr <= (others => '0');
end if;
END IF;
end process;
-- -------------------------------------------------------------------------
-- Top-level ports ---------------------------------------------------------
TEST_LED(7 downto 0) <= xfer(7 downto 0); -- Making some sdram output visible
A <= sig_A; -- Sending value to sdram
B <= sig_B; -- Sending value to sdram
end architecture rtl;

Which inputs and outputs the RAM has can vary with how you intend to use it. If the RAM really is on the FPGA chip itself, one example would be a simple single-port RAM built from, say, a Xilinx Block RAM library component.
Since the code instantiates the sdram under the FPGA's top level (the RAM is contained within the FPGA chip), it seems that whatever is an input or output of the RAM should have the same direction at the top level. The directions would be reversed if the sdram were outside the FPGA (and thus outside the FPGA's top level).
In general, RAMs tend to be sequential elements that require at the least:
-A clock (typically a 1-bit wide signal)
-An address (tends to be log2(n) bits wide, where n is the size of the RAM array. So if the array has 64 elements, you'd need at least 6 bits to address everything. The same address signal could be used for both reads and writes, or maybe you would have 2 separate address signals.)
-A write enable (in the simplest form this could be a 1-bit signal. The most typical use would be to assert this signal for 1 clock cycle to update the data at the address currently on the address signal)
-data (width would vary and tends to be flexible/configurable on an FPGA. If you want to store 16-bits of data in each RAM entry that should be perfectly valid. This could be a single signal or 2 separate ones for read and write data).
As long as the signal vectors going to/from the RAM have at least these basic functions, it seems like you should be able to use it at least as a simple RAM. Note by the way that in your code the sdram_* signals are neither declared nor connected to anything other than the sdram instance itself.
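The list above can be sketched as a minimal single-port synchronous RAM. This is only an illustration under stated assumptions: the entity name simple_ram and the port names we, addr, din and dout are hypothetical, the size (64 entries of 16 bits) is picked to match the figures mentioned above, and it is a generic on-chip RAM description, not the interface of any particular SDRAM controller or vendor primitive.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical names throughout; 64 entries of 16 bits each.
entity simple_ram is
  port (
    clk  : in  std_logic;                      -- clock
    we   : in  std_logic;                      -- write enable, assert for one cycle
    addr : in  std_logic_vector(5 downto 0);   -- log2(64) = 6 address bits
    din  : in  std_logic_vector(15 downto 0);  -- write data
    dout : out std_logic_vector(15 downto 0)   -- read data
  );
end entity simple_ram;

architecture rtl of simple_ram is
  type ram_t is array (0 to 63) of std_logic_vector(15 downto 0);
  signal mem : ram_t := (others => (others => '0'));
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if we = '1' then
        mem(to_integer(unsigned(addr))) <= din;
      end if;
      -- Registered read: during a write, dout shows the old contents
      dout <= mem(to_integer(unsigned(addr)));
    end if;
  end process;
end architecture rtl;
```

Here the RAM's din is an input and dout is an output, so a top level that contains this instance drives din and reads dout through internal signals.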

Why do I need wait after assert in VHDL?

I've started learning VHDL, and on EdaPlayground there's always a wait; after assert(cond); in the testing file.
Could you please explain why I need a wait; at the end? From my point of view, it should terminate right after execution, but it doesn't (instead of terminating, it ends up in an infinite loop).
Here is the architecture I want to test:
library IEEE;
use IEEE.std_logic_1164.all;
entity minority is
port(
a: in std_logic;
b: in std_logic;
c: in std_logic;
y: out std_logic
);
end minority;
architecture impl of minority is
begin
y <= '1' when (a and b and c) else '0';
end impl;
Here is the code for testing:
library IEEE;
use IEEE.std_logic_1164.all;
entity testbench is
-- empty
end testbench;
architecture tb of testbench is
-- DUT component
component minority is
port(
a: in std_logic;
b: in std_logic;
c: in std_logic;
y: out std_logic
);
end component;
signal a1, b1, c1, y1: std_logic;
begin
-- Connect DUT
DUT: minority port map(a1, b1, c1, y1);
process
begin
a1 <= '0';
b1 <= '1';
c1 <= '0';
y1 <= '0';
wait for 1 ns;
assert(y1='0') report "Y is ok." severity error;
wait; -- <-- without this line, the test starts to execute infinitely :(
end process;
end tb;
Thank you for your answers!
VHDL is a hardware description language, and the VHDL process is a key part of it. Hardware doesn't just stop. Therefore, VHDL processes don't just stop either; they keep on going, just like hardware does.
If you want to use a process for something other than hardware design (i.e. you are not going to synthesise it) and you want it to run only once, then you are going to have to make it stop. You do that by adding a wait statement, because in VHDL a wait on its own means wait forever.
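As a minimal sketch of that behaviour (the entity name wait_demo and the process label run_once are made up for illustration), the process below runs its statements once and then suspends forever at the final wait. If that last wait were removed, execution would wrap around to the first statement and repeat every 1 ns.

```vhdl
-- Hypothetical simulation-only example; not meant for synthesis.
entity wait_demo is
end entity wait_demo;

architecture sim of wait_demo is
begin
  run_once : process
  begin
    report "stimulus starts";
    wait for 1 ns;   -- let simulation time advance
    report "stimulus done";
    wait;            -- suspend forever; without this the process restarts
  end process run_once;
end architecture sim;
```

With no other activity scheduled, a simulator typically stops once every process is suspended like this.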

bubble sort in vhdl

Can anyone help me write VHDL code for a bubble sort, given an array of data as input?
I have declared in_array as an input which contains 15 array elements. I want to bubble sort them in descending order.
in_array is the input array.
sorted_array is the output array.
in_array_sig is a signal of the same type as in_array.
I am having problems with the statements inside the process.
Below is my code:
architecture behav of Bubblesort is
signal in_array_sig : bubble;
signal temp : std_logic_vector(3 downto 0);
signal i : integer := 0;
begin
in_array_sig <= in_array;
proc1 : process(clk, reset)
begin
if reset = '0' then
if (clk'event and clk = '1') then
while (i <= 15) loop
if (in_array_sig(i) < in_array_sig(i + 1)) then
temp <= in_array_sig(i);
in_array_sig(i) <= in_array_sig(i + 1);
in_array_sig(i + 1) <= temp;
end if;
i <= i + 1;
end loop;
end if;
end if;
end process;
sorted_array <= in_array_sig;
end behav;
I am a beginner in VHDL coding. Kindly help me with this.
The lack of a Minimal, Complete, and Verifiable example makes it hard to provide an answer about all the things stopping your code from bubble sorting accurately. These can be described in the order you'd encounter them while troubleshooting.
proc1 : process(clk, reset)
begin
if reset = '0' then
if (clk'event and clk = '1') then
while (i <= 15) loop
if (in_array_sig(i) < in_array_sig(i + 1)) then
temp <= in_array_sig(i);
in_array_sig(i) <= in_array_sig(i + 1);
in_array_sig(i + 1) <= temp;
end if;
i <= i + 1;
end loop;
end if;
end if;
end process;
Before starting, note that the clock is gated with reset. You could instead qualify the assignments with reset, making it an enable.
Problems
The first thing we'd find after producing an MCVe and a testbench is that the process never suspends. This is because the condition in the while loop depends on i, and i is a signal assigned within the process, so its value cannot change while the loop runs. i shouldn't be a signal here (alternatively, you could use a for loop).
This also points out that temp is a signal and suffers from the same problem: you can't use the 'new' value of temp until the process has suspended and resumed. Signals are scheduled for update; a signal assignment without a waveform element containing an after clause has an implicit after clause with zero delay. Signal updates do not occur while any process scheduled to resume has yet to resume and subsequently suspend. This allows the semblance of concurrency for signals whose assignments are found in sequential statements (a concurrent statement has an equivalent process containing equivalent sequential statements). So neither i nor temp can update during execution of a process's sequence of statements, and both want to be variables.
We'd also get bitten using a signal for in_array_sig. As you increment i, the previously indexed in_array_sig(i + 1) becomes the next loop iteration's in_array_sig(i). Without an intervening process suspend and resume, only the original value is available. in_array_sig wants to be a variable as well.
If we were to fix all these, we'd also likely note that i is not initialized (this would be taken care of by a for loop iteration scheme), and we might also find that we get a bounds error in a line using an (i + 1) index for in_array_sig. Without the author of the question providing an MCVe, it's not clear whether the array size is 16 (numbered 0 to 15) or 17. If the former, i = 15 + 1 would be out of the index range for the undisclosed array type of in_array, in_array_sig, and sorted_array.
If we were to ensure the index range is met, noting that we only need one fewer test and swap than the number of elements in the array, we'd find that what the process does isn't a complete bubble sort. We would see the largest binary value of in_array_sig end up as the rightmost element of sorted_array. However, the order of the remaining elements isn't guaranteed.
To perform a complete bubble sort we need another loop nesting the first one. Also, the now 'inner' for loop can have a decreasing number of elements to traverse, because each iteration leaves the largest remaining element rightmost, until the order is assured to be complete.
Fixes
Fixing the above would give us something that looks like this:
architecture foo of bubblesort is
use ieee.numeric_std.all;
begin
BSORT:
process (clk)
variable temp: std_logic_vector (3 downto 0);
variable var_array: bubble;
begin
var_array := in_array;
if rising_edge(clk) then
for j in bubble'LEFT to bubble'RIGHT - 1 loop
for i in bubble'LEFT to bubble'RIGHT - 1 - j loop
if unsigned(var_array(i)) > unsigned(var_array(i + 1)) then
temp := var_array(i);
var_array(i) := var_array(i + 1);
var_array(i + 1) := temp;
end if;
end loop;
end loop;
sorted_array <= var_array;
end if;
end process;
end architecture foo;
Note that the loop iteration schemes are described in terms of the bounds of type bubble: the outer loop runs one fewer time than the length, and the inner loop gets one iteration shorter on each outer pass. Also note that the sorted_array assignment is moved into the process, where var_array (the variable replacing in_array_sig) is visible.
Also of note is the use of the unsigned greater than operator. The ">" for std_logic_vector allows meta-values and 'H' and 'L' values to distort relational comparison while the operator for unsigned is arithmetic.
Results
Throw in package and entity declarations:
library ieee;
use ieee.std_logic_1164.all;
package array_type is
type bubble is array (0 to 15) of std_logic_vector(3 downto 0);
end package;
library ieee;
use ieee.std_logic_1164.all;
use work.array_type.all;
entity bubblesort is
port (
signal clk: in std_logic;
signal reset: in std_logic;
signal in_array: in bubble;
signal sorted_array: out bubble
);
end entity;
along with a testbench:
library ieee;
use ieee.std_logic_1164.all;
use work.array_type.all;
entity bubblesort_tb is
end entity;
architecture fum of bubblesort_tb is
signal clk: std_logic := '0';
signal reset: std_logic := '0';
signal in_array: bubble :=
(x"F", x"E", x"D", x"C", x"B", x"A", x"9", x"8",
x"7", x"6", x"5", x"4", x"3", x"2", x"1", x"0");
signal sorted_array: bubble;
begin
DUT:
entity work.bubblesort(foo)
port map (
clk => clk,
reset => reset,
in_array => in_array,
sorted_array => sorted_array
);
CLOCK:
process
begin
wait for 10 ns;
clk <= not clk;
if now > 30 ns then
wait;
end if;
end process;
end architecture;
and we get:
something that works.
The reset as enable has not been included in process BSORT in architecture foo; it can be added inside the if statement with the clock edge condition.
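A possible way to add that enable is sketched here. It assumes the array_type package and the bubblesort entity shown above are already analyzed into library work, it treats reset = '0' as "enabled" the way the question's code did, and the architecture name foo_enable is made up for illustration.

```vhdl
-- Assumes package array_type and entity bubblesort (shown above)
-- are already compiled into work. foo_enable is a hypothetical name.
architecture foo_enable of bubblesort is
    use ieee.numeric_std.all;
begin
BSORT:
    process (clk)
        variable temp:      std_logic_vector (3 downto 0);
        variable var_array: bubble;
    begin
        if rising_edge(clk) then
            if reset = '0' then          -- reset acts as an active-low enable
                var_array := in_array;
                for j in bubble'LEFT to bubble'RIGHT - 1 loop
                    for i in bubble'LEFT to bubble'RIGHT - 1 - j loop
                        if unsigned(var_array(i)) > unsigned(var_array(i + 1)) then
                            temp             := var_array(i);
                            var_array(i)     := var_array(i + 1);
                            var_array(i + 1) := temp;
                        end if;
                    end loop;
                end loop;
                sorted_array <= var_array;   -- register updates only when enabled
            end if;
        end if;
    end process;
end architecture foo_enable;
```

The testbench above could select it with entity work.bubblesort(foo_enable); because its reset signal is initialized to '0', the sort would be enabled on every rising clock edge.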
And about here we get to Matthew Taylor's point in a comment about describing hardware.
Depending on the synthesis tool the process may or may not be realizable as hardware. If not you'd need intermediary variables holding the array portions used in each successive iteration of the inner loop.
There's also the issue of how much you can do in a clock cycle. Worst case there is a delay depth comprised of fifteen element comparisons and fifteen 2:2 selectors conditionally swapping element pairs.
If you were to pick a clock speed that was incompatible with the synthesized delay you'd need to re-architect the implementation from a software loop emulation to something operating across successive clocks.
That could be as simple as allowing more clock periods by using that enable to determine when the bubble sort result is valid for loading into the sorted_array register. It could also be more complex, allowing different and better-performing sorting methods, or a modification to bubble sort to, say, detect that no more swaps are necessary.

Optimizing design with many identical units that could be shared

I have a design that generates a video signal on demand, without using RAM resources for a framebuffer.
I have a hierarchy that represents the screen layout, with a toplevel block generating the pixel clock and sync signals, and creating a signal that shows the coordinates of the next pixel. Below that are various blocks with the same interface:
type point is record
valid : std_logic;
x : unsigned(11 downto 0);
y : unsigned(11 downto 0);
end record;
type color is record
r : std_logic;
g : std_logic;
b : std_logic;
end record;
component source is
port(
pos : in point;
col : out color);
end component;
The general idea is that each of these blocks either generates a signal directly, or contains sub-blocks.
I'd like to stick with the pixel-on-demand schema, as it allows me to do
architecture syn of zoom2_block is
signal slave_pos : point;
begin
slave_pos.valid <= pos.valid;
slave_pos.x <= "0" & pos.x(10 downto 1);
slave_pos.y <= "0" & pos.y(10 downto 1);
slave : source port map(
pos => slave_pos,
col => col);
end architecture;
Now, the innermost pixel generators for several blocks are fairly similar (e.g. font pixel lookup), and because only one pixel will ever be passed outside, I wonder whether I can somehow share blocks, e.g. like the font in the hierarchy
output
  source : split_screen
    source : zoom
      source : text
        font
    source : text
      font
The text blocks themselves cannot be shared, because these contain the actual character codes given to the font blocks -- but the font block is twice exactly the same -- taking a coordinate and character code and returning the appropriate pixel value, with no state. Since the font data is large, not being able to share these is a problem.
Ideas I've had so far:
Have every block output '-' while pos.valid = '0', in the hope that the compiler will notice that at most one block in the hierarchy has a valid position at any time. I'm not sure the compiler will get this.
Create a special component that arbitrates access to the font block, as a generic with an array(1 to N) of point interface, selecting the first input with pos.valid = '1'. This would still require me to build a hierarchy that is no longer a tree.
Can this be done?

Configuring LED pins as input on MACHxo2 board

I am attempting to configure the pins connected to the on board LEDs as input pins. Documentation states they are free i/o, but when I probe them with a scope it says they are outputting a "high" signal. This is on the MACHXO2 7000he cpld, but I assume the answer would be the same for any of the MACH boards.
Thanks in advance for any help.
Hey guys sorry for taking so long to reply. I would attach a picture of the circuit but I currently have too low a reputation to do so.
The LEDs are connected to a VCC of 3.3V. What I found was that by de-soldering the LEDs from the board, I was free to use the pins they were connected to as free i/o because this would create an open circuit between the pins and the 3.3V.
The pins are supposed to already be free I/O, but the LEDs were active low, which caused my program to see the pins as high all the time, ultimately making these pins permanent outputs.
Anyway, there's my answer and I hope it makes sense and helps one of you out one of these days.
Thanks for the responses.
You could try an LED blinking example, such as this one:
LIBRARY ieee;
USE ieee.std_logic_1164.all;
LIBRARY lattice;
USE lattice.components.all;
ENTITY blinking_led IS
PORT(
led : BUFFER STD_LOGIC);
END blinking_led;
ARCHITECTURE behavior OF blinking_led IS
SIGNAL clk : STD_LOGIC;
--internal oscillator
COMPONENT OSCH
GENERIC(
NOM_FREQ: string := "53.20");
PORT(
STDBY : IN STD_LOGIC;
OSC : OUT STD_LOGIC;
SEDSTDBY : OUT STD_LOGIC);
END COMPONENT;
BEGIN
--internal oscillator
OSCInst0: OSCH
GENERIC MAP (NOM_FREQ => "53.20")
PORT MAP (STDBY => '0', OSC => clk, SEDSTDBY => OPEN);
PROCESS(clk)
VARIABLE count : INTEGER RANGE 0 TO 25_000_000;
BEGIN
IF(clk'EVENT AND clk = '1') THEN
IF(count < 25_000_000) THEN
count := count + 1;
ELSE
count := 0;
led <= NOT led;
END IF;
END IF;
END PROCESS;
END behavior;
For more reading please look at Lattice Diamond and MachXO2 Breakout Board Tutorial

Combinational Logic Timing

I am currently trying to implement a data path which calculates the following in one clock cycle.
Take inputs A and B and add them.
Shift the result of the addition one bit to the right (dividing by 2).
Subtract the shifted result from another input C.
The behavioral architecture of the entity is shown below.
signal sum_out : std_logic_vector (7 downto 0);
signal shift_out : std_logic_vector (7 downto 0);
process (clock, data_in_a, data_in_b, data_in_c)
begin
if clock'event and clock = '1' then
sum_out <= std_logic_vector(unsigned(data_in_a) + unsigned(data_in_b));
shift_out <= '0' & sum_out(7 downto 1);
data_out <= std_logic_vector(unsigned(data_in_c) - unsigned(shift_out));
end if;
end process;
When I simulate the above code, I do get the result I expect. However, I get the result after 3 clock cycles, instead of 1 as I wish. The simulation waveform is shown below.
I am not yet familiar with implementing designs with timing concerns. I was wondering if there are ways to achieve the above calculations in one clock cycle. If there are, how can I implement them?
To do this with signals, simply register only the last element in the chain (data_out). This analyzes; I didn't write a test bench to verify it in simulation.
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
entity signal_single_clock is
port (
signal clock: in std_logic;
signal data_in_a: in std_logic_vector(7 downto 0);
signal data_in_b: in std_logic_vector(7 downto 0);
signal data_in_c: in std_logic_vector(7 downto 0);
signal data_out: out std_logic_vector(7 downto 0)
);
end entity;
architecture behave of signal_single_clock is
signal sum_out : std_logic_vector (7 downto 0);
signal shift_out : std_logic_vector (7 downto 0);
begin
sum_out <= std_logic_vector(unsigned(data_in_a) + unsigned(data_in_b));
shift_out <= '0' & sum_out(7 downto 1);
single_reg:
process (clock)
begin
if clock'event and clock = '1' then
data_out <= std_logic_vector(unsigned(data_in_c) - unsigned(shift_out));
end if;
end process;
end architecture;
When you assign a new value to a signal inside a process, this new value will be available only after the process finishes execution. Therefore, anytime you read the signal's value you will be using the original value from when the process started executing.
On the other hand, assignments to variables take place immediately, and the new value can be used in the subsequent statements if you wish.
So, to solve your problem, simply implement sum_out, shift_out, and data_out using variables instead of signals. Then simply copy the value of data_out to an output port of your entity.
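A minimal sketch of the variable-based approach follows, reusing the port names from the question. The entity name variable_single_clock and the variable names sum_v and shift_v are hypothetical, and as a simplification the final result is assigned straight to the data_out port instead of going through a third variable. As in the question's code, the carry out of the 8-bit addition is discarded.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity name; ports follow the question.
entity variable_single_clock is
  port (
    clock     : in  std_logic;
    data_in_a : in  std_logic_vector(7 downto 0);
    data_in_b : in  std_logic_vector(7 downto 0);
    data_in_c : in  std_logic_vector(7 downto 0);
    data_out  : out std_logic_vector(7 downto 0)
  );
end entity variable_single_clock;

architecture behave of variable_single_clock is
begin
  process (clock)
    -- Variables update immediately, so each statement sees the
    -- value computed by the statement before it.
    variable sum_v   : unsigned(7 downto 0);
    variable shift_v : unsigned(7 downto 0);
  begin
    if rising_edge(clock) then
      sum_v    := unsigned(data_in_a) + unsigned(data_in_b);
      shift_v  := '0' & sum_v(7 downto 1);
      -- Only data_out is a signal, so only one register stage results.
      data_out <= std_logic_vector(unsigned(data_in_c) - shift_v);
    end if;
  end process;
end architecture behave;
```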
Without using variables:
sum <= in_a + in_b;
process (clock)
begin
if rising_edge(clock) then
data_out <= in_c - ('0' & sum(7 downto 1));
end if;
end process;
All declarations except clock are unsigned(7 downto 0); why make it more complicated than that?
The original, pipelined to 3 cycles, will probably work at higher clock rates.
EDIT following comment:
I wanted to demonstrate that VHDL doesn't really have to be that verbose.
However there seem to be a lot of people "teaching" VHDL who are focussing on trivial elements and missing the big picture entirely, so I'll say a little bit about that.
VHDL is a strongly typed language, to prevent mistakes that creep in when types are mistaken for each other and (e.g.) you add two large numbers and get a negative result.
It does NOT follow from that, that you need type conversions all over the place.
Indeed, if you need a lot of type conversions, it's a sign that your design is probably wrong, and it's time to rethink that instead of ploughing ahead down the wrong path.
Code - in ANY language - should be as clean and simple as possible.
Otherwise it's hard to read, and there are probably bugs in it.
The big difference between a C-like language and VHDL is this:
In C, using the correct data types you can write sum = in_a + in_b;
and it will work. Using the wrong data types you can also write sum = in_a + in_b;
and it will compile just fine; what it actually does is another matter! The bugs are hidden : it is up to you to determine the correct types, and if you get it wrong there is very little you can do except keep on testing.
In VHDL, using the right types you can write sum <= in_a + in_b;
and using the wrong types, the compiler forces you to write something like sum <= std_logic_vector(unsigned(in_a) + unsigned(in_b)); which is damn ugly, but will (probably: see note 1) still work correctly.
So to answer the question : how do I decide to use unsigned or std_logic_vector?
I see that I need three inputs and an output. I could just make them std_logic_vector but I stop and ask: what do they represent?
Numbers.
Can they be negative? Not according to my reading of the specification (your question).
So, unsigned numbers... (Note 1)
Do I need non-arithmetic operations on them? Yes there's a shift.(Note 2)
So, numeric_std.unsigned which is related to std_logic_vector instead of natural which is just an integer.
Now you can't avoid type conversions altogether. Coding standards may impose restrictions such as "all top level ports must be std_logic_vector" and you must implement the external entity specification you are asked to; intermediate signals for type conversions are sometimes cleaner than the alternatives, e.g. in_a <= unsigned(data_in_a);
Or if you are getting instructions, characters and the numbers above from the same memory, for example, you might decide the memory contents must be std_logic_vector because it doesn't just contain numbers. But pick the correct place to convert type and you will find maybe 90% of the type conversions disappear. Take that as a design guideline.
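To illustrate converting at one chosen place, here is a sketch that assumes a coding standard requiring std_logic_vector ports. The entity name boundary_convert and the internal signal name result are made up; in_a, in_b, in_c and sum follow the names used above. All type conversions sit at the port boundary, and the arithmetic in between needs none.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity name; ports are std_logic_vector by assumption.
entity boundary_convert is
  port (
    clock     : in  std_logic;
    data_in_a : in  std_logic_vector(7 downto 0);
    data_in_b : in  std_logic_vector(7 downto 0);
    data_in_c : in  std_logic_vector(7 downto 0);
    data_out  : out std_logic_vector(7 downto 0)
  );
end entity boundary_convert;

architecture rtl of boundary_convert is
  signal in_a, in_b, in_c : unsigned(7 downto 0);
  signal sum, result      : unsigned(7 downto 0);
begin
  -- Type conversions live only here, at the port boundary.
  in_a     <= unsigned(data_in_a);
  in_b     <= unsigned(data_in_b);
  in_c     <= unsigned(data_in_c);
  data_out <= std_logic_vector(result);

  -- The arithmetic itself is conversion-free.
  sum <= in_a + in_b;

  process (clock)
  begin
    if rising_edge(clock) then
      result <= in_c - ('0' & sum(7 downto 1));
    end if;
  end process;
end architecture rtl;
```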
(Note 1 : but what happens if C < (A+B)/2 ? Should data_out be signed? Even thinking along these lines has surfaced a likely bug that std_logic_vector left hidden...
The right answer depends on unknowns including the purpose of data_out : if it is really supposed to be unsigned, e.g. a memory address, you may want to flag an error instead of making it signed)
(Note 2 : there isn't a synthesis tool left alive that won't translate
signal a : natural; ... x <= a/2 into a shift right, so natural would also work, unless there were other reasons to choose unsigned. A lot of people seem to still be taught that integers aren't synthesisable, and that's just wrong.)