compute results and mux or not - optimization

Using pseudo-code here. Are there pros and cons to these styles?
Say you have an ALU that can do add, and, or, and xor. Is it better to have code that computes all the possible answers all the time and then selects the answer based on the opcode (in this case a one-hot):
alu_add = a + b;
alu_and = a & b;
alu_or = a | b;
alu_xor = a ^ b;
...
if(opcode[0]) alu_out = alu_add;
else if(opcode[1]) alu_out = alu_and;
else if(opcode[2]) alu_out = alu_or;
else if(opcode[3]) alu_out = alu_xor;
An alternative would be to code it like this:
if(opcode[0]) alu_out = a + b;
else if(opcode[1]) alu_out = a & b;
else if(opcode[2]) alu_out = a | b;
else if(opcode[3]) alu_out = a ^ b;
I have also seen it as:
alu_add = a + b;
alu_and = a & b;
alu_or = a | b;
alu_xor = a ^ b;
...
alu_out =
( {8{opcode[0]}} & alu_add ) |
( {8{opcode[1]}} & alu_and ) |
( {8{opcode[2]}} & alu_or ) |
( {8{opcode[3]}} & alu_xor );
Are there pros and cons to either method or do they come out about the same in the end?

Think about this in terms of levels of logic and readability. The first two forms are fine in terms of readability, but both embody priority unnecessarily and will result in more levels of logic. The third form is also not so wonderful by either of these metrics. Lastly, there's no apparent reason to use one-hot coding here over binary coding. Here's how I'd code this:
parameter ALU_ADD = 2'b00;
parameter ALU_AND = 2'b01;
parameter ALU_OR = 2'b10;
parameter ALU_XOR = 2'b11;
reg [1:0] opcode; // 2 bits for binary coding vs. 4 for one-hot
// and later, in your always block:
case (opcode) // synopsys parallel_case
  ALU_ADD: alu_out = a + b;
  ALU_AND: alu_out = a & b;
  ALU_OR:  alu_out = a | b;
  ALU_XOR: alu_out = a ^ b;
endcase
Here I've explicitly assigned values to the ALU opcodes, avoiding "magic numbers" and making it easier for others to understand what's happening. I've also used the case statement and applied a directive that tells my synthesis tool that no more than one expression can be matched, so no priority encoder will be inferred. I didn't name the intermediate signals (alu_add etc.) because these are trivial operations, but I frequently do so when I want convenient access to those signals (seeing their values after simulation in a waveform viewer for example).
You can learn more about using case statements effectively in this article from the excellent Sunburst Design site (no affiliation, just a former student).
Finally, about your question, "Is it better to have code that computes the possible answers all the time then select the answer based on the opcode" -- remember that Verilog is a hardware description language. All the implementations on this page are computing everything all the time anyway. Where they differ is in levels of logic and readability. Take a look at this page, which shows that my implementation has 1 level of logic beyond the operations themselves, whereas the if-else implementation has 3 additional levels of logic (a chain of three 2:1 muxes instead of a single 4:1 mux).

The first two will give the same logic, but you'll get a latch on alu_out because your if-else block (your priority encoder) has no final else; this is true for Verilog, anyway. If your timing is tight, you may have issues with the long paths that a priority encoder implies.
On the 3rd version, you'll get a more 'parallel' structure which may be better for timing. It won't latch.
In all three versions, each of the four operations will be clattering away no matter what the opcode is. This has power implications.
It's my opinion that the first version wins for clarity, and you can get at each of the separate operations in your waveform viewer when debugging. There's no point in coding up something that is not as easy to read unless timing, area or power constraints are hurting.


how to write "then" as IP constraint in Julia

Hello fellows, I am learning Julia and integer programming, but I am stuck at one point: how do I model "then" in Julia/JuMP for integer programming?
Stuck here:
# Define the variables of the model
@variable(mo, x[1:N,1:S], Bin)
@variable(mo, a[1:S] >= 0)
# Assignment constraint
@constraint(mo, [i=1:N], sum(x[i,j] for j=1:S) == 1)
@constraint(mo, PLEASE HELP )
In cases like this you usually need to use Big-M constraints
So this will be:
a[j] >= s[i]^2 - M*(1 - x[i,j])
where M is a "big enough" number. This means that if x[i,j] == 0, the inequality is always satisfied (and hence effectively switched off). On the other hand, when x[i,j] == 1, the M term vanishes and the constraint must hold.
In JuMP terms the code will look like this:
const M = 10_000
@constraint(mo, [i=1:N, j=1:S], a[j] >= s[i]^2 - M*(1 - x[i, j]))
However, if s[i] is an external parameter rather than a model variable, you could simply use x[i,j] <= a[j]/s[i]^2, as proposed by @DanGetz. But when s[i] is a @variable, you really want to avoid dividing or multiplying variables by each other, so the big-M approach is more general across use cases.
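As a quick sanity check of that switching behaviour, here is a small plain-Python sketch (the numbers s = 7 and M = 10_000 are made up for illustration):

# Big-M switching: a >= s^2 - M*(1 - x) is vacuous when x == 0
# and binding when x == 1.
def bigm_satisfied(a, s, x, M=10_000):
    return a >= s**2 - M * (1 - x)

print(bigm_satisfied(a=0,  s=7, x=0))  # True:  constraint switched off
print(bigm_satisfied(a=0,  s=7, x=1))  # False: a must be at least 49
print(bigm_satisfied(a=49, s=7, x=1))  # True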

Given no modulus or if even/odd function, how would one check for an odd or even number?

I recently sat a university computing exam in which we had never been taught the modulus function or any other odd/even check, and we had no access to external documentation except our previous lecture notes. Is it possible to check whether a number is odd or even without these, and how?
Bitwise AND (&)
Extract the last bit of the number using the bitwise AND operator. If the last bit is 1, the number is odd; otherwise it's even. This is the simplest and most efficient way of testing it. Examples in some languages:
C / C++ / C#
bool is_even(int value) {
    return (value & 1) == 0;
}
Java
public static boolean is_even(int value) {
    return (value & 1) == 0;
}
Python
def is_even(value):
    return (value & 1) == 0
I assume this is only for integer numbers as the concept of odd/even eludes me for floating point values.
For these integer numbers, the check of the Least Significant Bit (LSB) as proposed by Rotem is the most straightforward method, but there are many other ways to accomplish that.
For example, you could use integer division as a test. This is one of the most basic operations and is implemented on virtually every platform. The result of an integer division is always another integer. For example:
>> x = int64( 13 ) ;
>> x / 2
ans =
7
Here I cast the value 13 to int64 to make sure MATLAB treats the number as an integer instead of as a double.
Note that the result is rounded (in MATLAB, 13/2 = 6.5 rounds to 7; other platforms might truncate instead, but it does not matter for us, as the only behaviour we rely on is the rounding, whichever way it goes). The rounding allows us to define the following behaviour:
If a number is even: Dividing it by 2 will produce an exact result, such that if we multiply this result by 2, we obtain the original number.
If a number is odd: Dividing it by 2 will result in a rounded result, such that multiplying it by 2 will yield a different number than the original input.
Now that the logic is worked out, the code is pretty straightforward:
%% sample input
x = int64(42) ;
y = int64(43) ;

%% define the checking function
% uses only the multiplication and division operators, no high-level function
is_even = @(x) int64(x) == (int64(x)/2)*2 ;
And obviously, this will yield:
>> is_even(x)
ans =
1
>> is_even(y)
ans =
0
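For comparison, the same division-based idea written in Python (a quick sketch; // is Python's integer floor division):

def is_even(n):
    # Floor division discards any remainder; multiplying back
    # recovers n only when n was evenly divisible by 2.
    return (n // 2) * 2 == n

print(is_even(42))  # True
print(is_even(43))  # False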
I found out from a fellow student how to solve this simply with maths instead of functions.
Using (-1)^n :
If n is odd then the outcome is -1
If n is even then the outcome is 1
This is some pretty out-of-the-box thinking, but it solves the problem with plain arithmetic, without any prior knowledge of functions such as mod.
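For completeness, a quick sketch of that trick in Python:

def is_even(n):
    # (-1)**n is 1 for even n and -1 for odd n.
    # (For negative n Python returns a float, but the comparison still works.)
    return (-1) ** n == 1

print(is_even(4))  # True
print(is_even(7))  # False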

minimize/1 is not rearranging the order of solutions

For Colombia's Observatorio Fiscal[1], I am coding a simple tax minimization problem, using CLP(R) (in SWI-Prolog). I want to use minimize/1 to find the least solution first. It is instead listing the bigger solution first. Here is the code:
:- use_module(library(clpr)).

deduction(_,3). % Anyone can take the standard deduction.
deduction(Who,D) :- itemizedDeduction(Who,D). % Or they can itemize.

income(joe,10). % Joe makes $10 a year.
itemizedDeduction(joe,4). % He can deduct more if he itemizes.

taxableIncome(Who,TI) :-
    deduction(Who,D),
    income(Who,I),
    TI is I - D,
    minimize(TI).
Here is what an interactive session looks like:
?- taxableIncome(joe,N).
N = 7 ;
N = 6 ;
false.
If I switch the word "minimize" to "maximize", it behaves identically. If I include no minimize or maximize goal, it doesn't look for a third solution, but otherwise it behaves the same:
?- taxableIncome(joe,N).
N = 7 ;
N = 6.
[1] The Observatorio Fiscal is a new organization that aims to model the Colombian economy, in order to anticipate the effects of changes in the law, similar to what the Congressional Budget Office or the Tax Policy Center do in the United States.
First, let's add the following definition to the program:
:- op(950,fy, *).
*_.
Using (*)/1, we can generalize away individual goals in the program.
For example, let us generalize away the minimize/1 goal by placing * in front:
taxableIncome(Who,TI) :-
    deduction(Who,D),
    income(Who,I),
    TI #= I - D,
    * minimize(TI).
We now get:
?- taxableIncome(X, Y).
X = joe,
Y = 7 ;
X = joe,
Y = 6.
This shows that CLP(R) in fact has nothing to do with this issue! These answers show that everything is already instantiated at the time minimize/1 is called, so there is nothing left to minimize.
To truly benefit from minimize/1, you must express the task in the form of CLP(R) (or better: CLP(Q)) constraints, then apply minimize/1 on a constrained expression.
Note also that in SWI-Prolog, both CLP(R) and CLP(Q) have elementary mistakes, and you cannot trust their results.
Per Mat's response, I rewrote the program expressing the constraints using CLP. The tricky bit was that I had to first collect all (both) possible values for deduction, then convert those values into a CLP domain. I couldn't get that conversion to work in CLP(R), but I could in CLP(FD):
:- use_module(library(clpfd)).

deduction(_,3). % Anyone can take the same standard deduction.
deduction(Who,D) :- % Or they can itemize.
    itemizedDeduction(Who,D).

income(joe,10).
itemizedDeduction(joe,4).

listToDomain([Elt],Elt).
listToDomain([Elt|MoreElts],Elt \/ MoreDom) :-
    MoreElts \= []
    , listToDomain(MoreElts,MoreDom).

taxableIncome(Who,TI) :-
    income(Who,I)
    , findall(D,deduction(Who,D),DList)
    , listToDomain(DList,DDomain)
    % Next are the CLP constraints.
    , DD in DDomain
    , TI #= I-DD
    , labeling([min(TI)],[TI]).
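With this version, the query ?- taxableIncome(joe, TI). should report the minimal value TI = 6 first (income 10 minus the itemized deduction of 4), since labeling/2 with the min(TI) option finds a minimal solution first.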

Hyperpriors for hierarchical models with Stan

I'm looking to fit a model to estimate multiple probabilities for binomial data with Stan. I was using beta priors for each probability, but I've been reading about using hyperpriors to pool information and encourage shrinkage on the estimates.
I've seen this example of defining the hyperprior in PyMC, but I'm not sure how to do something similar with Stan:
@pymc.stochastic(dtype=np.float64)
def beta_priors(value=[1.0, 1.0]):
    a, b = value
    if a <= 0 or b <= 0:
        return -np.inf
    else:
        return np.log(np.power((a + b), -2.5))

a = beta_priors[0]
b = beta_priors[1]
With a and b then being used as parameters for the beta prior.
Can anybody give me any pointers on how something similar would be done with Stan?
To properly normalize that, you need a Pareto distribution. For example, if you want a distribution p(a, b) ∝ (a + b)^(-2.5), you can use
a + b ~ pareto(L, 1.5);
where a + b > L. There's no way to normalize the density with support for all values greater than or equal to zero: it needs a finite L as a lower bound. There's a discussion of using just this prior as the count component of a hierarchical prior for a simplex.
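For reference, this is just the Pareto density as parameterized in Stan,

$$\text{pareto}(y \mid y_{\min}, \alpha) = \frac{\alpha \, y_{\min}^{\alpha}}{y^{\alpha + 1}}, \qquad y \ge y_{\min},$$

so with $\alpha = 1.5$ and $y = a + b$, the density is proportional to $(a + b)^{-2.5}$, exactly the target.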
If a and b are parameters, they can either both be constrained to be positive, or you can leave a unconstrained and declare
real<lower = L - a> b;
to ensure a + b > L. L can be a small constant or something more reasonable given your knowledge of a and b.
You should be careful because this alone will not identify a and b (only their sum is constrained). We use this construction as a hierarchical prior for simplexes as:
parameters {
  real<lower = 1> kappa;
  real<lower = 0, upper = 1> phi;
  vector<lower = 0, upper = 1>[K] theta;
}
model {
  kappa ~ pareto(1, 1.5);                        // power-law prior
  phi ~ beta(a, b);                              // choose your prior for phi
  theta ~ beta(kappa * phi, kappa * (1 - phi));  // vectorized
}
There's an extended example in my Stan case study of repeated binary trials, which is reachable from the case studies page on the Stan web site (the case study directory is currently linked under the documentation link from the users tab).
Following suggestions in the comments, I'm not sure that I will follow this approach, but for reference I thought I'd at least post the answer to my question of how this can be accomplished in Stan.
After some asking around on the Stan Discourse forum and further investigation, I found that the solution is to define a custom density and use the target += syntax. So the Stan equivalent of the PyMC example would be:
parameters {
  real<lower=0> a;
  real<lower=0> b;
  real<lower=0,upper=1> p;
  ...
}
model {
  target += log((a + b)^(-2.5));  // equivalently: target += -2.5 * log(a + b);
  p ~ beta(a, b);
  ...
}

Algorithm - find the minimal subtraction between sum of two arrays

I am job hunting now and doing many algorithm exercises. Here is my problem:
Given two arrays a and b of the same length, make |sum(a) - sum(b)| minimal by swapping elements between a and b.
Here is my thought:
Assume we swap a[i] and b[j]. Set Delt = sum(a) - sum(b) and x = a[i] - b[j].
Then after the swap, Delt2 = sum(a) - a[i] + b[j] - (sum(b) - b[j] + a[i]) = Delt - 2*x.
The improvement |Delt| - |Delt2| has the same sign as Delt^2 - Delt2^2 = Delt^2 - (Delt - 2*x)^2 = 4*x*(Delt - x), so a swap helps exactly when x*(Delt - x) > 0.
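A quick numeric check of that identity (plain Python, made-up values):

# Verify Delt2 == Delt - 2*x for one swap.
a, b = [8, 1, 5], [3, 9, 2]
i, j = 1, 1                    # swap a[1] and b[1]
Delt = sum(a) - sum(b)         # 14 - 14 = 0
x = a[i] - b[j]                # 1 - 9 = -8
a[i], b[j] = b[j], a[i]
Delt2 = sum(a) - sum(b)        # 22 - 6 = 16
assert Delt2 == Delt - 2 * x   # 0 - 2*(-8) = 16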
Based on the thought above I got the following code:
n = len(a)
Delt = sum(a) - sum(b)
done = False
while not done:
    done = True
    for i in range(n):
        for j in range(n):
            x = a[i] - b[j]
            change = x * (Delt - x)
            if change > 0:
                a[i], b[j] = b[j], a[i]  # swap
                Delt = Delt - 2 * x
                done = False
However, does anybody have a much better solution? If you do, please tell me and I would be very grateful!
This problem is basically the optimization version of the Partition Problem, with the extra constraint that the two parts have equal size. I'll prove that adding this constraint doesn't make the problem easier.
NP-Hardness proof:
Assume there were an algorithm A that solves this problem in polynomial time. Then we could solve the Partition Problem in polynomial time:
Partition(S):
    for i in range(|S|):
        S += {0}
        result <- A(S1, S2)  // split S arbitrarily into two equal-size halves S1, S2
        if result is a perfect partition:  // easy to verify, since Partition is in NP
            return true
    return false  // no partition
Correctness:
If there is a partition, denote it (S1,S2) and assume S2 has more elements. On iteration |S2|-|S1| (i.e. after adding |S2|-|S1| zeros), the input to A contains enough zeros that we can return two equal-length arrays, S2 and S1+{0,0,...,0}, with equal sums. This is a partition of S, and the algorithm will yield true.
Conversely, if the algorithm yields true at iteration k, we had two arrays S2 and S1 with the same number of elements and equal sums; by removing the k added zeros from the arrays, we get a partition of the original S, so S had a partition.
Polynomial:
Assume A takes P(n) time. The algorithm we produced takes n*P(n) time, which is also polynomial.
Conclusion:
If this problem were solvable in polynomial time, so would be the Partition Problem, and thus P=NP. Based on this, the problem is NP-Hard.
Because this problem is NP-Hard, an exact solution will probably need an exponential-time algorithm. One such algorithm is simple backtracking (I leave implementing a backtracking solution as an exercise to the reader).
EDIT: As mentioned by @jpalecek, by simply creating the reduction S -> S+(0,0,...,0) (k zeros), one can prove NP-Hardness directly by reduction. Polynomiality is trivial, and correctness is very similar to the partition correctness proof above: if there is a partition, adding "balancing" zeros is possible; the other direction is simply trimming those zeros.
Just a comment: through all this swapping you can arrange the contents of both arrays however you like, so it is unimportant which array the values are in at the start.
I can't do it in my head, but I'm pretty sure there is a constructive heuristic: sort all the values first and then deal them out according to some rule, along the lines of "if sum(a) > sum(b), insert into b, else into a". A rough sketch of this idea follows.
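Here is that dealing idea as a small Python sketch (a heuristic only: given the NP-hardness argument above, it cannot be guaranteed optimal; the equal-length requirement is kept by capping each side at n elements):

# Greedy "dealing" heuristic: pool both arrays, sort largest first,
# and deal each value to the side with the smaller running sum.
def greedy_split(a, b):
    n = len(a)
    vals = sorted(a + b, reverse=True)
    na, nb = [], []
    for v in vals:
        if (sum(na) <= sum(nb) and len(na) < n) or len(nb) >= n:
            na.append(v)
        else:
            nb.append(v)
    return na, nb

a, b = [8, 1, 5], [3, 9, 2]
na, nb = greedy_split(a, b)
print(na, nb, abs(sum(na) - sum(nb)))  # here the split happens to be perfect: difference 0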