How do I set output flags for ALU in "Nand to Tetris" course?

Although I tagged this homework, it is actually for a course which I am doing on my own for free. Anyway, the course is called "From Nand to Tetris" and I'm hoping someone here has seen or taken the course so I can get some help. I am at the stage where I am building the ALU with the supplied hdl language. My problem is that I can't get my chip to compile properly. I am getting errors when I try to set the output flags for the ALU. I believe the problem is that I can't subscript any intermediate variable, since when I just try setting the flags to true or false based on some random variable (say an input flag), I do not get the errors. I know the problem is not with the chips I am trying to use since I am using all builtin chips.
Here is my ALU chip so far:
/**
* The ALU. Computes a pre-defined set of functions out = f(x,y)
* where x and y are two 16-bit inputs. The function f is selected
* by a set of 6 control bits denoted zx, nx, zy, ny, f, no.
* The ALU operation can be described using the following pseudocode:
* if zx=1 set x = 0 // 16-bit zero constant
* if nx=1 set x = !x // Bit-wise negation
* if zy=1 set y = 0 // 16-bit zero constant
* if ny=1 set y = !y // Bit-wise negation
* if f=1 set out = x + y // Integer 2's complement addition
* else set out = x & y // Bit-wise And
* if no=1 set out = !out // Bit-wise negation
*
* In addition to computing out, the ALU computes two 1-bit outputs:
* if out=0 set zr = 1 else zr = 0 // 16-bit equality comparison
* if out<0 set ng = 1 else ng = 0 // 2's complement comparison
*/
CHIP ALU {
IN // 16-bit inputs:
x[16], y[16],
// Control bits:
zx, // Zero the x input
nx, // Negate the x input
zy, // Zero the y input
ny, // Negate the y input
f, // Function code: 1 for add, 0 for and
no; // Negate the out output
OUT // 16-bit output
out[16],
// ALU output flags
zr, // 1 if out=0, 0 otherwise
ng; // 1 if out<0, 0 otherwise
PARTS:
// Zero the x input
Mux16( a=x, b=false, sel=zx, out=x2 );
// Zero the y input
Mux16( a=y, b=false, sel=zy, out=y2 );
// Negate the x input
Not16( in=x, out=notx );
Mux16( a=x, b=notx, sel=nx, out=x3 );
// Negate the y input
Not16( in=y, out=noty );
Mux16( a=y, b=noty, sel=ny, out=y3 );
// Perform f
Add16( a=x3, b=y3, out=addout );
And16( a=x3, b=y3, out=andout );
Mux16( a=andout, b=addout, sel=f, out=preout );
// Negate the output
Not16( in=preout, out=notpreout );
Mux16( a=preout, b=notpreout, sel=no, out=out );
// zr flag
Or8way( in=out[0..7], out=zr1 ); // PROBLEM SHOWS UP HERE
Or8way( in=out[8..15], out=zr2 );
Or( a=zr1, b=zr2, out=zr );
// ng flag
Not( in=out[15], out=ng );
}
So the problem shows up when I am trying to send a subscripted version of 'out' to the Or8Way chip. I've tried using a different variable than 'out', but with the same problem. Then I read that you are not able to subscript intermediate variables. I thought maybe if I sent the intermediate variable to some other chip, and that chip subscripted it, it would solve the problem, but it has the same error. Unfortunately I just can't think of a way to set the zr and ng flags without subscripting some intermediate variable, so I'm really stuck!
Just so you know, if I replace the problematic lines with the following, it will compile (but not give the right results since I'm just using some random input):
// zr flag
Not( in=zx, out=zr );
// ng flag
Not( in=zx, out=ng );
Anyone have any ideas?
Edit: Here is the appendix of the book for the course which specifies how the hdl works. Specifically look at section 5 which talks about buses and says: "An internal pin (like v above) may not be subscripted".
Edit: Here is the exact error I get: "Line 68, Can't connect gate's output pin to part". The error message is somewhat confusing, though, since that does not seem to be the actual problem. If I just replace "Or8way( in=out[0..7], out=zr1 );" with "Or8way( in=false, out=zr1 );" it does not generate this error, which is what led me to look in the appendix and find that the out variable, since it was derived as an intermediate, could not be subscripted.
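When debugging the flags, it can help to have a software model of the header comment to compare against. The following is a minimal Python sketch of that pseudocode, assuming 16-bit unsigned values for x, y and out; the function name alu and the MASK constant are illustrative and not part of the course materials.

```python
MASK = 0xFFFF  # keep every value within 16 bits


def alu(x, y, zx, nx, zy, ny, f, no):
    """Model of the ALU pseudocode; returns (out, zr, ng)."""
    x &= MASK
    y &= MASK
    if zx:
        x = 0
    if nx:
        x = ~x & MASK          # bit-wise negation, truncated to 16 bits
    if zy:
        y = 0
    if ny:
        y = ~y & MASK
    out = (x + y) & MASK if f else (x & y)
    if no:
        out = ~out & MASK
    zr = 1 if out == 0 else 0  # zero flag
    ng = (out >> 15) & 1       # sign bit of the 2's complement result
    return out, zr, ng


# Control bits 1,0,1,0,1,0 compute the constant 0, so zr should be set.
print(alu(5, 7, 1, 0, 1, 0, 1, 0))
# Control bits 0,1,0,0,1,1 compute x - y; 5 - 7 is negative, so ng should be set.
print(alu(5, 7, 0, 1, 0, 0, 1, 1))
```

Note that ng is simply bit 15 of out and zr is the negation of "any bit of out is set", which is what the HDL has to express.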

For anyone else interested, the solution the emulator supports is to use multiple outputs, something like:
Mux16( a=preout, b=notpreout, sel=no, out=out, out=preout2, out[15]=ng );

This is how I did the ALU:
CHIP ALU {
IN // 16-bit inputs:
x[16], y[16],
// Control bits:
zx, // Zero the x input
nx, // Negate the x input
zy, // Zero the y input
ny, // Negate the y input
f, // Function code: 1 for add, 0 for and
no; // Negate the out output
OUT // 16-bit output
out[16],
// ALU output flags
zr, // 1 if out=0, 0 otherwise
ng; // 1 if out<0, 0 otherwise
PARTS:
Mux16(a=x, b=false, sel=zx, out=M16x);
Not16(in=M16x, out=Nx);
Mux16(a=M16x, b=Nx, sel=nx, out=M16M16x);
Mux16(a=y, b=false, sel=zy, out=M16y);
Not16(in=M16y, out=Ny);
Mux16(a=M16y, b=Ny, sel=ny, out=M16M16y);
And16(a=M16M16x, b=M16M16y, out=And16);
Add16(a=M16M16x, b=M16M16y, out=Add16);
Mux16(a=And16, b=Add16, sel=f, out=F16);
Not16(in=F16, out=NF16);
Mux16(a=F16, b=NF16, sel=no, out=out, out[15]=ng, out[0..7]=zout1, out[8..15]=zout2);
Or8Way(in=zout1, out=zr1);
Or8Way(in=zout2, out=zr2);
Or(a=zr1, b=zr2, out=zr3);
Not(in=zr3, out=zr);
}

The solution as Pax suggested was to use an intermediate variable as input to another chip, such as Or16Way. Here is the code after I fixed the problem and debugged:
CHIP ALU {
IN // 16-bit inputs:
x[16], y[16],
// Control bits:
zx, // Zero the x input
nx, // Negate the x input
zy, // Zero the y input
ny, // Negate the y input
f, // Function code: 1 for add, 0 for and
no; // Negate the out output
OUT // 16-bit output
out[16],
// ALU output flags
zr, // 1 if out=0, 0 otherwise
ng; // 1 if out<0, 0 otherwise
PARTS:
// Zero the x input
Mux16( a=x, b=false, sel=zx, out=x2 );
// Zero the y input
Mux16( a=y, b=false, sel=zy, out=y2 );
// Negate the x input
Not16( in=x2, out=notx );
Mux16( a=x2, b=notx, sel=nx, out=x3 );
// Negate the y input
Not16( in=y2, out=noty );
Mux16( a=y2, b=noty, sel=ny, out=y3 );
// Perform f
Add16( a=x3, b=y3, out=addout );
And16( a=x3, b=y3, out=andout );
Mux16( a=andout, b=addout, sel=f, out=preout );
// Negate the output
Not16( in=preout, out=notpreout );
Mux16( a=preout, b=notpreout, sel=no, out=preout2 );
// zr flag
Or16Way( in=preout2, out=notzr );
Not( in=notzr, out=zr );
// ng flag
And16( a=preout2, b=true, out[15]=ng );
// Get final output
And16( a=preout2, b=preout2, out=out );
}

Have you tried:
// zr flag
Or8way(
in[0]=out[ 0], in[1]=out[ 1], in[2]=out[ 2], in[3]=out[ 3],
in[4]=out[ 4], in[5]=out[ 5], in[6]=out[ 6], in[7]=out[ 7],
out=zr1);
Or8way(
in[0]=out[ 8], in[1]=out[ 9], in[2]=out[10], in[3]=out[11],
in[4]=out[12], in[5]=out[13], in[6]=out[14], in[7]=out[15],
out=zr2);
Or( a=zr1, b=zr2, out=zr );
I don't know if this will work but it seems to make sense from looking at this document here.
I'd also think twice about using out as a variable name since it's confusing trying to figure out the difference between that and the keyword out (as in "out=...").
Following your edit, if you cannot subscript intermediate values, then it appears you will have to implement a separate "chip" such as IsZero16, which takes a 16-bit value as input (your intermediate out) and returns one bit indicating its zero-ness that you can load into zr. Or you could make an IsZero8 chip, but you would then have to call it in two stages, as you're currently doing with Or8Way.
This seems like a valid solution since you can subscript the input values to a chip.
And, just looking at the error, this may be a different problem to the one you suggest. The phrase "Can't connect gate's output pin to part" would mean to me that you're unable to connect signals from the output parameter back into the chips processing area. That makes sense from an electrical point of view.
You may find you have to store the output in a temporary variable and use that to set both zr and out (since once the signals have been "sent" to the chip's output pins, they may no longer be available).
Can we try:
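CHIP SetFlags16 {
    IN inpval[16];
    OUT zflag, nflag;
    PARTS:
    // nonzero is 1 if any bit of inpval is set
    Or8Way(in=inpval[0..7], out=zr0);
    Or8Way(in=inpval[8..15], out=zr1);
    Or(a=zr0, b=zr1, out=nonzero);
    // zflag must be 1 when the value is zero, so invert
    Not(in=nonzero, out=zflag);
    // nflag is the sign bit itself (Or with false passes it through)
    Or(a=inpval[15], b=false, out=nflag);
}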
and then, in your ALU chip, use this at the end:
// Negate the output
Not16( in=preout, out=notpreout );
Mux16( a=preout, b=notpreout, sel=no, out=tempout );
// flags
SetFlags16(inpval=tempout,zflag=zr,nflag=ng);
// Transfer tempout to out (may be a better way).
Or16(a=tempout,b=tempout,out=out);

Here's one also with a new chip but it feels cleaner
/**
* Negator16 - negates the input 16-bit value if the selection flag is lit
*/
CHIP Negator16 {
IN sel,in[16];
OUT out[16];
PARTS:
Not16(in=in, out=negateIn);
Mux16(a=in, b=negateIn, sel=sel, out=out);
}
CHIP ALU {
// IN and OUT go here...
PARTS:
//Zero x and y if needed
Mux16(a=x, b[0..15]=false, sel=zx, out=x1);
Mux16(a=y, b[0..15]=false, sel=zy, out=y1);
//Create x1 and y1 negations if needed
Negator16(in=x1, sel=nx, out=x2);
Negator16(in=y1, sel=ny, out=y2);
//Create x&y and x+y
And16(a=x2, b=y2, out=andXY);
Add16(a=x2, b=y2, out=addXY);
//Choose between And/Add according to selection
Mux16(a=andXY, b=addXY, sel=f, out=res);
// negate if needed and also set negative flag
Negator16(in=res, sel=no, out=res1, out=out, out[15]=ng);
// set zero flag (or all bits and negate)
Or16Way(in=res1, out=nzr);
Not(in=nzr, out=zr);
}

Related

Eigenvalues do not match between Eigen, Numpy, LAPACKE (Ubuntu), Intel MKL

I have the following matrix M of doubles:
55.774375 61.0225 62.805625
-122.045 -125.61125 -122.045
62.805625 61.0225 55.774375
(the matrix is part of an algorithm to estimate the parameters of an ellipse from a 2D point cloud, see https://autotrace.sourceforge.net/WSCG98.pdf for reference).
And now comes the interesting part. Determining the eigenvalues and (right) eigenvectors of the matrix with different packages leads to different results for the (right) eigenvectors. Eigenvalues are for every package the same:
[-5.09420041e-13 -7.03125000e+00 -7.03125000e+00]
For numpy with python:
M = np.array([[55.774375, 61.0225, 62.805625],
[-122.045, -125.61125, -122.045, ],
[62.805625, 61.0225, 55.774375]])
eval, evec = np.linalg.eig(M)
I get:
[[ 0.41608575 0.37443021 -0.80942954]
[-0.80854518 -0.82367703 0.34119147]
[ 0.41608575 0.42586167 0.47792489]]
With Eigen C++, the code looks as follows
Eigen::Matrix3d M;
M << 55.774375, 61.0225, 62.805625,
-122.045, -125.61125, -122.045,
62.805625, 61.0225, 55.774375;
Eigen::EigenSolver<Eigen::MatrixXd> solver;
solver.compute(M);
I get for the eigenvectors
0.416086 0.376456 -0.462421
-0.808545 -0.823758 0.820878
0.416086 0.423914 -0.335151
With LAPACKE (apt install liblapack-dev lapacke lapacke-dev)
double Marr[]{55.774375, 61.0225, 62.805625,
-122.045, -125.61125, -122.045,
62.805625, 61.0225, 55.774375};
char jobvl = 'N';
char jobvr = 'V';
int n=3;
int lda = n;
int ldvl = n;
int ldvr = n;
int lwork = -1;
double wr[n], wi[n], vl[ldvl*n], vr[ldvr*n];
// Capture the return code; otherwise info is read uninitialized below.
int info = LAPACKE_dgeev( LAPACK_ROW_MAJOR, 'V', 'V', n, Marr, lda, wr, wi,
vl, ldvl, vr, ldvr );
if( info > 0 ) {
printf( "The algorithm failed to compute eigenvalues.\n" );
exit( 1 );
}
I get for the eigenvectors
0.416086 0.376456 -0.788993
-0.808545 -0.823758 0.565975
0.416086 0.423914 0.239087
Similar are the results for Intel MKL.
I checked the determinant of M and it is close to zero (-4.0031989907207254e-05).
What I would like to understand is
Why do the eigenvectors for same eigenvalues differ so much between the libraries? Is this because of the different numerical methods used to approx them?
I understand that an eigenvalue d has many associated eigenvectors, i.e. if v is an eigenvector for d then q * v (q being a scalar) is also an eigenvector for d. Since the second and the third eigenvalues are the same, I would assume that there is some scalar that transforms one into the other, but this doesn't seem to be the case.
My algorithm fails in C++ (in python it is working) due to the different eigenvectors. Is there a way out?
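One way to see whether the differing vectors are all legitimate is to test each of them directly against the definition M v = d v. The sketch below does that in plain Python for the third column reported by each library, assuming the printed values are accurate to the digits shown; the helper name residual is illustrative. For this particular matrix, M + 7.03125 I has three proportional rows, so the eigenvalue -7.03125 has a two-dimensional eigenspace, and any vector in that plane (not just scalar multiples of one vector) should pass the check.

```python
M = [[55.774375, 61.0225, 62.805625],
     [-122.045, -125.61125, -122.045],
     [62.805625, 61.0225, 55.774375]]
LAMBDA = -7.03125  # the repeated eigenvalue from the question


def residual(v, lam=LAMBDA):
    """Largest absolute component of M v - lam v (0 for an exact eigenvector)."""
    return max(
        abs(sum(M[i][j] * v[j] for j in range(3)) - lam * v[i])
        for i in range(3)
    )


# Third eigenvector column as printed by each library in the question.
third_columns = {
    "numpy": (-0.80942954, 0.34119147, 0.47792489),
    "Eigen": (-0.462421, 0.820878, -0.335151),
    "LAPACKE": (-0.788993, 0.565975, 0.239087),
}

residuals = {name: residual(v) for name, v in third_columns.items()}
for name, r in residuals.items():
    print(f"{name}: max |Mv - lambda v| = {r:.2e}")
```

If all residuals are small, the libraries agree on the eigenspace and differ only in which basis of it they return, so code that consumes the eigenvectors should not depend on a particular basis vector for the repeated eigenvalue.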

Collision Angle Detection

I have some questions regarding collision angles. I am trying to code physics for a game and I do not want to use any third-party library; I want to code everything myself. I know how to detect collisions between two spheres, but I can't figure out how to find the angle of collision/repulsion between the two spherical objects. I've tried reversing the direction of the objects, but no luck. It would be very nice if you could link me to an interesting .pdf file teaching physics programming.
There are many ways to deal with collisions.
Impulse
To model an impulse, you can act directly on the velocity of each object. Using the law of reflection, you "reflect" each velocity about the normal of the impact:
v1 = v1 - 2 * (v1 . n2) * n2
v2 = v2 - 2 * (v2 . n1) * n1
where v1 and v2 are the velocities of spheres s1 and s2, and n1 and n2 are the unit normals at the collision point.
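The reflection formula above can be sketched as follows. This is a minimal illustration, assuming the contact normal for two spheres is the unit vector along the line joining their centres; the function names reflect and contact_normal are illustrative.

```python
import math


def contact_normal(c1, c2):
    """Unit vector pointing from centre c1 towards centre c2."""
    diff = [b - a for a, b in zip(c1, c2)]
    length = math.sqrt(sum(d * d for d in diff))
    return tuple(d / length for d in diff)


def reflect(v, n):
    """Reflect velocity v about the unit normal n: v - 2 (v . n) n."""
    d = sum(vi * ni for vi, ni in zip(v, n))
    return tuple(vi - 2 * d * ni for vi, ni in zip(v, n))


# A ball moving down and to the right bounces off a surface whose normal points up.
print(reflect((1.0, -1.0, 0.0), (0.0, 1.0, 0.0)))
```

Only the component of the velocity along the normal is flipped; the tangential component is unchanged.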
Penalty
Here, two objects are interpenetrating, and we model their tendency to separate by creating a spring force proportional to the penetration depth.
I haven't covered every approach, but these are the two simplest I know.
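The penalty approach can be sketched for two spheres like this. It is a rough illustration only: the stiffness parameter and the function name penalty_force are assumptions, and a real simulation would usually add damping as well.

```python
import math


def penalty_force(c1, r1, c2, r2, stiffness):
    """Spring-like force on sphere 1 pushing it away from sphere 2.

    Returns a zero vector when the spheres do not overlap.
    """
    diff = [a - b for a, b in zip(c1, c2)]  # points from sphere 2 to sphere 1
    dist = math.sqrt(sum(d * d for d in diff))
    penetration = r1 + r2 - dist
    if penetration <= 0 or dist == 0:
        return tuple(0.0 for _ in diff)
    # Force magnitude is proportional to the penetration depth.
    return tuple(stiffness * penetration * d / dist for d in diff)


# Two unit spheres whose centres are 1.5 apart overlap by 0.5.
print(penalty_force((0.0, 0.0, 0.0), 1.0, (1.5, 0.0, 0.0), 1.0, 10.0))
```

Sphere 2 would receive the opposite force.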
The angle between two objects in 2D or 3D coordinate space can be found from the dot product:
A · B = |A||B| cos θ
Both A and B are vectors and θ is the angle between them.
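Solving the dot-product identity for the angle gives θ = acos(A · B / (|A||B|)). A minimal sketch, with the illustrative name angle_between; the clamp guards against rounding pushing the cosine slightly outside [-1, 1].

```python
import math


def angle_between(a, b):
    """Angle in radians between two non-zero vectors of equal dimension."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    cos_theta = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.acos(cos_theta)


# Perpendicular axes are 90 degrees apart.
print(math.degrees(angle_between((1.0, 0.0, 0.0), (0.0, 1.0, 0.0))))
```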
The class below can be used for basic vector calculations in games:
#include <cmath>

// Named Vector3D because a C++ identifier cannot start with a digit.
class Vector3D
{
private:
    float x, y, z;
public:
    // purpose: Our constructor
    // input: ex- our vector's i component
    //        why- our vector's j component
    //        zee- our vector's k component
    // output: no explicit output
    Vector3D(float ex = 0, float why = 0, float zee = 0)
    {
        x = ex; y = why; z = zee;
    }
    // purpose: Our destructor
    // input: none
    // output: none
    ~Vector3D() { }
    // purpose: calculate the magnitude of our invoking vector
    // input: no explicit input
    // output: the magnitude of our invoking object
    float getMagnitude() const
    {
        return sqrtf(x * x + y * y + z * z);
    }
    // purpose: multiply our vector by a scalar value
    // input: num - the scalar value being multiplied
    // output: our newly created vector
    Vector3D operator*(float num) const
    {
        return Vector3D(x * num, y * num, z * num);
    }
    // purpose: multiply our vector by a scalar value (scalar on the left)
    // input: num - the scalar value being multiplied
    //        vec - the vector we are multiplying to
    // output: our newly created vector
    friend Vector3D operator*(float num, const Vector3D &vec)
    {
        return Vector3D(vec.x * num, vec.y * num, vec.z * num);
    }
    // purpose: Adding two vectors
    // input: vec - the vector being added to our invoking object
    // output: our newly created sum of the two vectors
    Vector3D operator+(const Vector3D &vec) const
    {
        return Vector3D(x + vec.x, y + vec.y, z + vec.z);
    }
    // purpose: Subtracting two vectors
    // input: vec - the vector being subtracted from our invoking object
    // output: our newly created difference of the two vectors
    Vector3D operator-(const Vector3D &vec) const
    {
        return Vector3D(x - vec.x, y - vec.y, z - vec.z);
    }
    // purpose: Normalize our invoking vector *this changes our vector*
    // input: no explicit input
    // output: none (a zero-length vector is left unchanged)
    void normalize3Dvector(void)
    {
        float mag = sqrtf(x * x + y * y + z * z);
        if (mag > 0.0f)
        {
            x /= mag; y /= mag; z /= mag;
        }
    }
    // purpose: Dot Product two vectors
    // input: vec - the vector being dotted with our invoking object
    // output: the dot product of the two vectors
    float dot3Dvector(const Vector3D &vec) const
    {
        return x * vec.x + y * vec.y + z * vec.z;
    }
    // purpose: Cross product two vectors
    // input: vec- the vector being crossed with our invoking object
    // output: our newly created resultant vector
    Vector3D cross3Dvector(const Vector3D &vec) const
    {
        return Vector3D( y * vec.z - z * vec.y,
                         z * vec.x - x * vec.z,
                         x * vec.y - y * vec.x);
    }
};
I probably shouldn't be answering my own question, but I think I found what I needed, and it may help other people too. I was browsing Wikipedia's physics section and came across this.
This link solves my question
The angle in a cartesian system can be found this way:
arctan((Ya-Yb)/(Xa-Xb))
This works because the two points form a right triangle whose legs you know (the differences in height and width). Dividing one leg by the other gives the tangent, so the arctangent gives the angle that has this tangent.
I hope I was helpful.
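In code, the arctangent formula above is usually written with atan2, which takes the two differences separately so that a vertical line (Xa equal to Xb) does not cause a division by zero and the quadrant is preserved. A minimal sketch; the function name direction_angle is illustrative.

```python
import math


def direction_angle(xa, ya, xb, yb):
    """Angle in radians of the vector from point B to point A."""
    return math.atan2(ya - yb, xa - xb)


# 45 degrees for equal legs; a vertical line is handled without dividing by zero.
print(math.degrees(direction_angle(1.0, 1.0, 0.0, 0.0)))
print(math.degrees(direction_angle(0.0, 1.0, 0.0, 0.0)))
```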

C/Obj-C noise generators always return 0 after the first run?

Unfortunately the simplex/Perlin noise generator I've always used is very bloated and Java-based, and would be a pain to port to C/Obj-C. I'm looking for better classes to use in an iOS version of a game, but I have an odd problem.
I have code that loops through each "tile" of a 2d background - it should calculate a noise value for each tile. In my java implementations it works fine.
However, each time I run the code, it appears to print a proper value the first time the breakpoint is hit, but from then on only ever returns zero:
for (double x = 0; x < 2; x++){
for (double y = 0; y < 2; y++){
double tileNoise = PerlinNoise2D(x,y,2,2,1);
}
}
I've tried two different implementations, the current being this c perlin library.
The breakpoint shows a value like 1.88858049852505e-308 the first time, but when I continue execution all subsequent breaks show "0".
What am I missing?
Perlin noise is defined to be zero at integer locations. Try rotating, scaling or translating your space and see what happens. For example:
double u = 0.1;
double v = 0.1;
for (double x = 0; x < 2; x++){
for (double y = 0; y < 2; y++){
double tileNoise = PerlinNoise2D(x+u,y+v,2,2,1);
}
}
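To see why integer coordinates give zero, here is a small self-contained 2D gradient-noise sketch. It is not the C library from the question, just an illustrative implementation of the same idea with a seeded random gradient table (the names GRADS, PERM and noise2d are made up for this example). At a lattice point the offset to the nearest corner is the zero vector and the interpolation weights select only that corner, so the result is exactly 0.

```python
import math
import random

_rng = random.Random(0)
# One random unit gradient per table entry, plus a permutation used for hashing.
GRADS = []
for _ in range(256):
    angle = _rng.uniform(0.0, 2.0 * math.pi)
    GRADS.append((math.cos(angle), math.sin(angle)))
PERM = list(range(256))
_rng.shuffle(PERM)


def fade(t):
    """Smooth interpolation weight 6t^5 - 15t^4 + 10t^3."""
    return t * t * t * (t * (t * 6 - 15) + 10)


def noise2d(x, y):
    x0, y0 = math.floor(x), math.floor(y)

    def corner(ix, iy):
        # Dot product of the corner's gradient with the offset from that corner.
        gx, gy = GRADS[PERM[(PERM[ix % 256] + iy) % 256]]
        return gx * (x - ix) + gy * (y - iy)

    u, v = fade(x - x0), fade(y - y0)
    top = corner(x0, y0) + u * (corner(x0 + 1, y0) - corner(x0, y0))
    bottom = corner(x0, y0 + 1) + u * (corner(x0 + 1, y0 + 1) - corner(x0, y0 + 1))
    return top + v * (bottom - top)


for x in range(2):
    for y in range(2):
        print(x, y, noise2d(x, y), noise2d(x + 0.1, y + 0.1))
```

Sampling at tile centres, or scaling tile indices by a non-integer frequency, avoids landing on the lattice.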

Checking if lines intersect and if so return the coordinates

I've written some code below to check if two line segments intersect and if they do to tell me where. As input I have the (x,y) coordinates of both ends of each line. It appeared to be working correctly but now in the scenario where line A (532.87,787.79)(486.34,769.85) and line B (490.89,764.018)(478.98,783.129) it says they intersect at (770.136, 487.08) when the lines don't intersect at all.
Has anyone any idea what is incorrect in the below code?
double dy[2], dx[2], m[2], b[2];
double xint, yint, xi, yi;
WsqT_Location_Message *location_msg_ptr = OPC_NIL;
FIN (intersect (<args>));
dy[0] = y2 - y1;
dx[0] = x2 - x1;
dy[1] = y4 - y3;
dx[1] = x4 - x3;
m[0] = dy[0] / dx[0];
m[1] = dy[1] / dx[1];
b[0] = y1 - m[0] * x1;
b[1] = y3 - m[1] * x3;
if (m[0] != m[1])
{
//slopes not equal, compute intercept
xint = (b[0] - b[1]) / (m[1] - m[0]);
yint = m[1] * xint + b[1];
//is intercept in both line segments?
if ((xint <= max(x1, x2)) && (xint >= min(x1, x2)) &&
(yint <= max(y1, y2)) && (yint >= min(y1, y2)) &&
(xint <= max(x3, x4)) && (xint >= min(x3, x4)) &&
(yint <= max(y3, y4)) && (yint >= min(y3, y4)))
{
if (xi && yi)
{
xi = xint;
yi = yint;
location_msg_ptr = (WsqT_Location_Message*)op_prg_mem_alloc(sizeof(WsqT_Location_Message));
location_msg_ptr->current_latitude = xi;
location_msg_ptr->current_longitude = yi;
}
FRET(location_msg_ptr);
}
}
FRET(location_msg_ptr);
}
There is a great and simple theory about lines and their intersections that is based on adding an extra dimension to your points and lines. In this theory a line can be created from two points with one line of code, and the point of intersection of two lines can be calculated with one line of code. Moreover, points at infinity and lines at infinity can be represented with real numbers.
You probably heard about homogeneous representation when a point [x, y] is represented as [x, y, 1] and the line ax+by+c=0 is represented as [a, b, c]?
The transition to Cartesian coordinates for a general homogeneous representation of a point [x, y, w] is [x/w, y/w]. This little trick makes all the difference, including the representation of lines at infinity (e.g. [1, 0, 0]) and making the line representation look similar to the point one. This introduces a great symmetry into the formulas for numerous line/point manipulations and is an absolute must in programming. For example, it is very easy to find line intersections through the vector (cross) product:
p = l1 x l2
A line can be created from two points in a similar way:
l = p1 x p2
In OpenCV code it is just:
line = p1.cross(p2);
p = line1.cross(line2);
Note that there are no marginal cases (such as division by zero or parallel lines) to be concerned with here. My point is, I suggest rewriting your code to take advantage of this elegant theory about lines and points.
Finally, if you don't use OpenCV, you can use a 3D point class and create your own cross product function similar to this one:
template<typename _Tp> inline Point3_<_Tp> Point3_<_Tp>::cross(const Point3_<_Tp>& pt) const
{
return Point3_<_Tp>(y*pt.z - z*pt.y, z*pt.x - x*pt.z, x*pt.y - y*pt.x);
}
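The cross-product approach can be sketched for line segments as follows. This is a minimal illustration rather than a drop-in replacement for the code in the question: the function names and the tolerance eps are assumptions, and a third component w near zero is treated as "parallel or coincident" before converting back to Cartesian coordinates.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def segment_intersection(p1, p2, p3, p4, eps=1e-9):
    """Intersection point of segments p1-p2 and p3-p4, or None."""
    line1 = cross((*p1, 1.0), (*p2, 1.0))   # l = p1 x p2
    line2 = cross((*p3, 1.0), (*p4, 1.0))
    x, y, w = cross(line1, line2)           # p = l1 x l2
    if abs(w) < eps:
        return None                         # parallel or coincident lines
    px, py = x / w, y / w

    def within(a, b):
        return (min(a[0], b[0]) - eps <= px <= max(a[0], b[0]) + eps and
                min(a[1], b[1]) - eps <= py <= max(a[1], b[1]) + eps)

    return (px, py) if within(p1, p2) and within(p3, p4) else None


# The two segments from the question, read as (x, y) pairs.
print(segment_intersection((532.87, 787.79), (486.34, 769.85),
                           (490.89, 764.018), (478.98, 783.129)))
```

With the coordinates from the question read as (x, y), this sketch reports a crossing near (487.08, 770.13), which suggests the segments do intersect and that the reported result has x and y swapped.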

Discrete Wavelet Transform on images and watermark embedding in LL band coefficients, data is lost when IDWT-DWT is performed again?

I'm writing an image watermarking system to hide a watermark in an image's low frequency band by transforming the image's luminance channel with a Discrete Wavelet Transform, then modifying coefficients in the LL band of the DWT output. I then do an Inverse DWT and rebuild my image.
The problem I'm having is when I modify coefficients in the DWT output, then inverse-DWT, and then DWT again, the modified coefficients are radically different.
For example, one of the output coefficients in the LL band of the 2-scale DWT was -0.10704, I modified this coefficient to be 16.89, then performed the IDWT on my data. I then took the output of the IDWT and performed a DWT on it again, and my coefficient which was modified to be 16.89 became 0.022.
I'm fairly certain that the DWT and IDWT code is correct because I've tested it against other libraries and the output from each transform matches when the filter coefficients and other parameters are the same. (Within what can be expected due to rounding error)
The main problem I have is that I perhaps don't understand the DWT all that well, I thought DWT and IDWT were supposed to be reasonably lossless (Aside from rounding error and such), yet this doesn't seem to be the case here.
I'm hoping someone more familiar with the transform can point me at a possible issue, is it possible that because the coefficients in my other subbands (LH, HL, HH) for that position are insignificant I'm losing data? If so, how can I determine which coefficients this may happen to?
My embedding function is below, coefficients are chosen in the LL band, "strong" is determined to be true if the absolute value of the LH, HH, or HL band for the selected location is larger than the mean value of the corresponding subband.
//If this evaluates to true, then the texture is considered strong.
if ((Math.Abs(LH[i][w]) >= LHmean) || (Math.Abs(HL[i][w]) >= HLmean) || (Math.Abs(HH[i][w]) >= HHmean))
static double MarkCoeff(int index, double coeff,bool strong)
{
int q1 = 16;
int q2 = 8;
int quantizestep = 0;
byte watermarkbit = binaryWM[index];
if(strong)
quantizestep = q1;
else
quantizestep = q2;
coeff /= (double)quantizestep;
double coeffdiff = 0;
if(coeff > 0.0)
coeffdiff = coeff - (int)coeff;
else
coeffdiff = coeff + (int)coeff;
if (1 == ((int)coeff % 2))
{
//odd
if (watermarkbit == 0)
{
if (Math.Abs(coeffdiff) > 0.5)
coeff += 1.0;
else
coeff -= 1.0;
}
}
else
{
//even
if (watermarkbit == 1)
{
if (Math.Abs(coeffdiff) > 0.5)
coeff += 1.0;
else
coeff -= 1.0;
}
}
coeff *= (double)quantizestep;
return coeff;
}
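The parity rule described above (quantize the coefficient, then make the quantized value odd or even to match the watermark bit) can be tested in isolation from the DWT. The sketch below is an illustrative variant, not the function above: it uses floor-based bins so that negative coefficients behave like positive ones, and it places the marked coefficient at the centre of its bin. The names embed_bit and extract_bit are hypothetical.

```python
import math


def embed_bit(coeff, bit, step):
    """Move coeff to the centre of a quantization bin whose index parity equals bit."""
    scaled = coeff / step
    q = math.floor(scaled)
    if q % 2 != bit:
        # Step to the nearer neighbouring bin, which has the other parity.
        q += 1 if (scaled - q) >= 0.5 else -1
    return (q + 0.5) * step


def extract_bit(coeff, step):
    """Recover the embedded bit from the parity of the bin index."""
    return math.floor(coeff / step) % 2


for c in (-0.10704, 16.89, 100.3, -37.2):
    for bit in (0, 1):
        marked = embed_bit(c, bit, 16)
        print(c, bit, marked, extract_bit(marked, 16))
```

Because the marked value sits in the middle of its bin, the bit survives any perturbation smaller than half a step, which gives a concrete tolerance to compare against the change observed after the IDWT/DWT round trip.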