Combinatorial synthesis: Better technology mapping results - yosys

Using the following script, I am synthesising to a standard cell library for which I have a lib file, my_library.lib:
read_liberty -lib my_library.lib
script yosys_readfiles.ys
proc; opt; memory; opt; fsm -norecode; opt
techmap; opt
dfflibmap -liberty my_library.lib
abc -liberty my_library.lib
hilomap -hicell LIB_TIEHI Y -locell LIB_TIELO Y
clean
write_verilog -noattr -noexpr output.v
stat
While this generally works, I found that some of the logic isn't mapped efficiently. For example, I have the following Verilog model of a 4-way multiplexer:
module mux4(
input i0,
input i1,
input i2,
input i3,
input s0,
input s1,
output z
);
reg zint;
parameter tdelay = `default_gate_delay;
always @(i0 or i1 or i2 or i3 or s0 or s1) begin
case ({s1, s0})
2'b00: zint <= i0;
2'b01: zint <= i1;
2'b10: zint <= i2;
2'b11: zint <= i3;
default: zint <= i3;
endcase
end
assign z = zint;
endmodule
Yosys synthesised this to the following gate-level netlist:
/* Generated by Yosys 0.5+ (git sha1 f13e387, gcc 5.3.1-8ubuntu2 -O2 -fstack-protector-strong -fPIC -Os) */
module mux4(i0, i1, i2, i3, s0, s1, z);
wire _00_;
wire _01_;
wire _02_;
wire _03_;
wire _04_;
wire _05_;
input i0;
input i1;
input i2;
input i3;
input s0;
input s1;
output z;
wire zint;
NAND3 _06_ (
.A(s1),
.B(s0),
.C(i3),
.Y(_04_)
);
INV _07_ (
.A(s1),
.Y(_05_)
);
NAND3 _08_ (
.A(_05_),
.B(s0),
.C(i1),
.Y(_00_)
);
INV _09_ (
.A(s0),
.Y(_01_)
);
NAND3 _10_ (
.A(_05_),
.B(_01_),
.C(i0),
.Y(_02_)
);
NAND3B _11_ (
.AN(s0),
.B(s1),
.C(i2),
.Y(_03_)
);
NAND4 _12_ (
.A(_02_),
.B(_00_),
.C(_03_),
.D(_04_),
.Y(z)
);
assign zint = z;
endmodule
Since the library I am using already has a MXI4 cell, I would have expected something similar to the following instead:
module mux4(i0, i1, i2, i3, s0, s1, z);
input i0;
input i1;
input i2;
input i3;
input s0;
input s1;
output z;
MXI4 _12_ (
.A(i0),
.B(i1),
.C(i2),
.D(i3),
.S0(s0),
.S1(s1),
.Y(z)
);
endmodule
I am wondering how I can direct Yosys to use the MXI4 cell instead of the cascaded NAND instances above, as this would result in a significant reduction in area.
For this specific cell I could use the same technique as described in this answer to map to the MXI4 cell manually, but I am concerned that in other (more complex) areas of my design such a manual mapping may be less obvious or infeasible.
One thing I tried was to add the following option to the abc command in my synthesis script, which I found on Reddit:
-script +strash;scorr;ifraig;retime,{D};strash;dch,-f;map,-M,1,{D}
But it didn't solve the problem either. (I also couldn't find any documentation on some of these ABC commands; any help there would be appreciated as well.)

The following ABC script should be able to map the MUX4, or at least it does when using the ABC version that is bundled with Yosys 0.7:
abc -liberty my_library.lib -script \
+strash;ifraig;scorr;dc2;dretime;strash;&get,-n;&dch,-f;&nf,{D};&put
Starting with git commit 8927e19 this is the new default script for abc -liberty.

Related

My testbench always shows X as the outputs

I'm not able to identify the bug, but all the code seems logically and syntactically right. The value of sum and carry in the testbench are always X.
There are two modules, one for an 8-bit adder and another for a 16-bit adder:
module adder_8(in1 , in2 , cin , sum , carry);
input [7:0] in1 , in2;
input cin;
output reg [7:0] sum;
output reg carry;
always @(in1 or in2) begin
{carry , sum} = in1 + in2 + cin;
end
endmodule
module adder_16(input_a , input_b , c , summation , cout);
input [15:0] input_a , input_b;
input c;
output [15:0] summation;
output cout;
wire t1;
adder_8 inst1 (input_a[7:0] , input_b[7:0] , c , summation[7:0] , t1);
adder_8 inst2 (input_a[15:8] , input_b[15:8] , t1 , summation[15:8] , cout);
endmodule
The testbench file is:
module testbench;
reg [15:0] a,b;
wire [15:0] sum;
wire carry;
parameter zero = 1'b0;
adder_16 ex(a , b , zero , sum , carry);
initial begin
$monitor($time," A = %d , B = %d sum = %d carry = %b", a , b , sum , carry);
#10 a = 16'd 100; b = 16'd 100;
#10 a = 16'd 50; b = 16'd 20;
#20 $finish;
end
endmodule
I would really appreciate some help.
cin is missing from the sensitivity list in always @(in1 or in2). It should be always @(in1 or in2 or cin) to be compliant with the 1995 version of the standard. The 2001 version of the standard improved on this with always @* (or the synonymous always @(*)), which gives an automatic sensitivity list.
If you are targeting SystemVerilog, use always_comb (no @ or signal list). This adds an extra compile-time check to make sure the logic is not also assigned in another always block, which would make the code non-synthesizable.
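As a minimal sketch of that fix, here is the adder_8 module from the question with only the sensitivity list changed to the Verilog-2001 form. Everything else is copied from the question, and it assumes a simulator that accepts Verilog-2001.

```verilog
// adder_8 from the question; only the sensitivity list is changed.
module adder_8(in1, in2, cin, sum, carry);
  input  [7:0] in1, in2;
  input        cin;
  output reg [7:0] sum;
  output reg       carry;

  // @* re-evaluates the block whenever any signal read inside it changes,
  // so a late-arriving cin (the carry from the low byte) is picked up too.
  always @* begin
    {carry, sum} = in1 + in2 + cin;
  end
endmodule
```

With this change the second adder_8 instance in adder_16 is re-evaluated when t1 changes, not only when its data inputs change.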

Is there an option to synthesise some code into Verilog built-in primitives?

I thought that techmap without any arguments would do it, but it didn't.
Probably I misunderstand what 'logical synthesis' means.
Basic example:
AND_GATE.v:
module AND_GATE( input A, input B, output X);
assign X = A & B;
endmodule
yosys> read_verilog AND_GATE.v
yosys> synth
....................
Number of wires: 3
Number of wire bits: 3
Number of public wires: 3
Number of public wire bits: 3
Number of memories: 0
Number of memory bits: 0
Number of processes: 0
Number of cells: 1
$_AND_ 1
yosys> abc -g AND,NAND,OR,NOR,XOR,XNOR
........................
3.1.2. Re-integrating ABC results.
ABC RESULTS: AND cells: 1
ABC RESULTS: internal signals: 0
ABC RESULTS: input signals: 2
ABC RESULTS: output signals: 1
Removing temp directory.
yosys> clean
Removed 0 unused cells and 3 unused wires.
yosys> write_verilog net.v
net.v
module AND_GATE(A, B, X);
(* src = "AND_GATE.v:1" *)
input A;
(* src = "AND_GATE.v:1" *)
input B;
(* src = "AND_GATE.v:1" *)
output X;
assign X = B & A;
endmodule
Using something like synth; abc -g AND,NAND,OR,NOR,XOR,XNOR will map to a basic set of gates equivalent to the Verilog primitives (techmap on its own won't get you any further). However, the Yosys Verilog backend doesn't have an option to use built-in primitives; it always writes these gates as expressions.
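For comparison, this is roughly what the primitive-based form of the example would look like. It is written by hand for illustration, not produced by Yosys, and the instance name g0 is made up.

```verilog
// Hand-written equivalent of net.v using the built-in "and" primitive.
// The first terminal of a gate primitive is its output.
module AND_GATE(input A, input B, output X);
  and g0 (X, A, B);
endmodule
```

Functionally this is the same as the assign X = B & A; that write_verilog emitted above.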

Verilog Instantiating module inside a always block. Using Adder for multiplication

I have code for multiplying two 53-bit numbers (shown below). I am using a shift-add strategy with two other 106-bit registers. This code is working fine. Now I have another highly optimized 53-bit Han-Carlson adder module with the following interface:
module hans_carlson_adder(input [52:0] a, b, input c_in, output [52:0] sum, output c_out);
I want to use this adder for the summation line in the for loop (marked in the code). I am having a problem instantiating the adder inside an always block. Also, I don't want to have 106 instances of this adder (because of the for loop). Can you please help with this code?
module mul1(
output reg [105:0] c,
input [52:0] x,
input [52:0] y,
input clk,
input state
);
reg [105:0] p;
reg [105:0]a;
integer i;
always @(posedge clk) begin
if (state==1) begin
a={53'b0,x[52:0]};
p=106'b0; // needs to be zeroed
for(i=0;i<106;i=i+1) begin
if(y[i]) begin
p=p+a; //THIS LINE NEEDS TO BE REPLACED WITH HANS CARLSONADDER
end
a=a<<1;
end
c<=p;
end else begin
c=0;
end
end
endmodule
First you need to instantiate your adder outside of the always block and connect it to signals:
wire [52:0] b;
reg [5:0] count;
// Indexed part-select: 53 bits of c starting at bit "count"
assign b = c[count +: 53];
wire [52:0] sum;
wire c_out;
// Add in x depending on the bits in y
// b has the running result bits that still apply at this point
hans_carlson_adder u0(x, b, 1'b0, sum, c_out);
Because the single adder is now reused over several clock cycles, you are going to need something to kick off the multiplication (I'll call that input start) and something that indicates that the result is available (I'll call that output reg done). You'll want to add them to your mul1 module definition. You can choose a slightly different protocol depending on your needs. It appears that you have been implementing something like this with the input state. I'm also going to use start to initialize during each calculation so that I don't need a separate reset signal.
reg [52:0] shift;
always @(posedge clk) begin
if (start) begin
done <= 0;
count <= 0;
c <= 106'b0;
shift <= y;
end else if (count < 53) begin
if (shift[0]) begin
c[count +: 53] <= sum;
c[count+7'd53] <= c_out;
end
count <= count + 1;
shift <= shift >> 1;
end else begin
done <= 1;
end
end
If you want to make an optimization, you could end once the shift signal is equal to 0. In this case the done signal would go high as soon as there were no more bits to add into the result, so multiplying by small values of y would take fewer cycles.
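Putting the pieces above together, a complete module might look like the following sketch. The module name mul1_seq is hypothetical, and the adder body here is only a behavioural stand-in with the port list from the question, so swap in the real adder. The early-exit test on shift is the optimization just described.

```verilog
// Behavioural stand-in (assumption): same ports as the adder in the
// question, but a plain "+" inside. Replace with the real module.
module hans_carlson_adder(input [52:0] a, b, input c_in,
                          output [52:0] sum, output c_out);
  assign {c_out, sum} = a + b + c_in;
endmodule

// Hypothetical name; start/done follow the protocol described above.
module mul1_seq(
  output reg [105:0] c,
  output reg         done,
  input      [52:0]  x,
  input      [52:0]  y,
  input              clk,
  input              start
);
  reg  [5:0]  count;   // index of the y bit being processed
  reg  [52:0] shift;   // remaining bits of y, LSB first
  wire [52:0] sum;
  wire        c_out;
  // 53-bit window of the running product that lines up with x
  wire [52:0] b = c[count +: 53];

  // One shared adder instance, outside the always block
  hans_carlson_adder u0(x, b, 1'b0, sum, c_out);

  always @(posedge clk) begin
    if (start) begin
      done  <= 1'b0;
      count <= 6'd0;
      c     <= 106'b0;
      shift <= y;
    end else if (count < 53 && shift != 0) begin
      if (shift[0]) begin
        c[count +: 53]   <= sum;    // window plus x
        c[count + 7'd53] <= c_out;  // carry into the next bit up
      end
      count <= count + 1'b1;
      shift <= shift >> 1;
    end else begin
      done <= 1'b1;                 // all set bits of y consumed
    end
  end
endmodule
```

To use it, pulse start for one clock with x and y stable, keep x stable while the multiplication runs, and read c once done is high. The 7'd53 constant keeps the index arithmetic 7 bits wide, since count + 53 can reach 105 and would not fit in 6 bits.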

Instantiate a module number of times based on a parameter value in Verilog

Assume we have the following arbitrary parameterized module
module module_x #(parameter WIDTH = 1) (in_a, in_b, out);
input [WIDTH-1:0] in_a, in_b;
output out;
// Some module instantiation here
endmodule
How do I instantiate another module a number of times based on the value of WIDTH? For example, if WIDTH is 5, I want to instantiate it 5 times, once for each bit. Is it possible to do this in Verilog?
Generate statements are a common approach to this: Section 27 page 749 of IEEE 1800-2012.
A quick example :
logic [WIDTH-1:0] a;
logic [WIDTH-1:0] b;
genvar i;
generate
for(i=0; i<WIDTH; i++) begin : gen_inst
module_name instance_name(
.a(a[i]),
.b(b[i])
);
end
endgenerate
As @toolic has pointed out, instance arrays are also possible, and simpler.
logic clk;
logic [WIDTH-1:0] a_i;
logic [WIDTH-1:0] b_i;
module_name instance_name[WIDTH-1:0] (
.clk ( clk ), //Single bit is replicated across instance array
.a ( a_i ), //connected wire a_i is wider than port so split across instances
.b ( b_i )
);
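To make the generate approach concrete, here is a small self-contained sketch in plain Verilog-2001. The names bit_and and bitwise_and are made up for illustration, and out is WIDTH bits wide here (unlike the 1-bit out in the question) so that each instance has its own output bit.

```verilog
// Hypothetical 1-bit cell, used only for illustration.
module bit_and(input a, input b, output y);
  assign y = a & b;
endmodule

// Instantiates bit_and once per bit; WIDTH decides how many copies exist.
module bitwise_and #(parameter WIDTH = 1) (
  input  [WIDTH-1:0] in_a, in_b,
  output [WIDTH-1:0] out
);
  genvar i;
  generate
    for (i = 0; i < WIDTH; i = i + 1) begin : g_bit
      // Elaborates to g_bit[0].u, g_bit[1].u, ...
      bit_and u (.a(in_a[i]), .b(in_b[i]), .y(out[i]));
    end
  endgenerate
endmodule
```

Instantiating it as bitwise_and #(.WIDTH(5)) gives five bit_and instances, one per bit. The block label (g_bit) gives each copy a stable hierarchical name.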

4-Bit verilog adder not passing carry bit

I had my 2-bit adder working, except that for some reason it is not passing the carry bit. For instance, if I use A=1 and B=1 the result is S=00, but if either A or B is 1 I get S=1.
I tried printing out the values and it seems the c1 wire in my second module isn't being set, and for some reason Cout is.
So with an input of A=1, B=1 I get S=00 and Cout=1,
when it should be S=10 and Cout=0.
I have only been using Verilog for one day so the syntax is very new to me.
module fulladder(Cin,A,B,S,Cout); // dont forget semi colon
input A,B, Cin; // defaults to 1 bit or [0,0] size
output S, Cout;
wire XOR1,AND1,AND2;
xor(XOR1,A,B);
and(AND1,A,B);
xor(S,Cin,XOR1);
and(AND2,Cin,XOR1);
or(Cout,AND2,AND1);
endmodule
module adder4(Cin,A,B,S,Cout);
input Cin;
input [0:1]A;
input [0:1]B;
output [0:1]S;
output Cout;
wire c1;
fulladder FA1(Cin,A[0:0],B[0:0],S[0:0],c1);
fulladder FA2(c1,A[1:1],B[1:1],S[1:1],Cout);
endmodule
module t_adder;
reg Cin;
reg [1:0]A;
reg [1:0]B;// to declare size, must be on own line, wires can be more than 1 bit
wire [1:0]S;
wire Cout;
adder4 add4bit(Cin,A,B,S,Cout);
initial
begin
A = 1; B = 1; Cin = 0;
#1$display("S=%b Cout = %b",S,Cout);
end
endmodule
You're reversing the bit order in the adder4 module by declaring the ports as [0:1], whereas elsewhere they are [1:0].
Since you reverse the bits, to adder4 it looks like you are adding A=2'b10 and B=2'b10, which gives the output you see (3'b100).
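A minimal sketch of the fix, assuming you keep the rest of the code as posted: declare the adder4 ports as [1:0] so that bit 0 is the least significant bit everywhere. fulladder is copied unchanged from the question so the block stands on its own.

```verilog
// Unchanged from the question.
module fulladder(Cin, A, B, S, Cout);
  input  A, B, Cin;
  output S, Cout;
  wire XOR1, AND1, AND2;
  xor (XOR1, A, B);
  and (AND1, A, B);
  xor (S, Cin, XOR1);
  and (AND2, Cin, XOR1);
  or  (Cout, AND2, AND1);
endmodule

// Ports now use [1:0], matching the testbench declarations.
module adder4(Cin, A, B, S, Cout);
  input        Cin;
  input  [1:0] A;
  input  [1:0] B;
  output [1:0] S;
  output       Cout;
  wire c1;
  fulladder FA1(Cin, A[0], B[0], S[0], c1);   // least significant bit
  fulladder FA2(c1,  A[1], B[1], S[1], Cout); // most significant bit
endmodule
```

With A = 1, B = 1 and Cin = 0, FA1 now sees the two low bits, produces S[0] = 0 and c1 = 1, and FA2 turns that carry into S[1] = 1 with Cout = 0, which is the S=10, Cout=0 result expected in the question.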