Question on arithmetic right shift operator - operators

I am having a little trouble wrapping my head around this particular operation-
0x44 >> 3
Where >> is an arithmetic right shift operator.
Now, the textbook which I'm referring to gives the answer as 1110 1000
However, I did it as follows-
0x44 => 0100 0100
Now, since the first bit is a zero, I calculated the result of arithmetic right shift as 0000 1000 (hex value 0x08)
But, the book gives the answer as 1110 1000 (hex value 0xE9)
What am I doing wrong here?
(The book is CS:APP, practise problem 2.16 for those interested).

As far as I can tell, this is practice problem 2.16 from the global edition of Computer Science: A Programmer's Perspective (3rd edition), which according to the original authors is full of errors.
Direct quote from the errata page:
Note on the Global Edition: Unfortunately, the publisher arranged for the generation of a different set of practice and homework
problems in the global edition. The person doing this didn't do a very
good job, and so these problems and their solutions have many errors.
We have not created an errata for this edition.
The advice around the web seems to be to pick up the North American edition if you're interested in doing the practice and homework problems.
Your answer is indeed the correct one:
0x44 >> 3 == 0x08

Related

PLC Best Naming Conventions for RSLogix 5000

What good naming conventions do you use for PLC?
I've seen hundreds of projects from different programmers, dozens of companies standards, RA, Beckhoff posted in some documents their naming... dozens of different ideas.
For years, naming tags was one of the most difficult task for me. You can't imagine discussion when I ask a student to create a bit. It's like being the hardest thing on Earth :) (usually, after creating a_bit and another_bit, inspiration is gone).
I asked for RSLogix 5000 because I found it most flexible, having tags, alias, scope tags, descriptions(stored in CPU for latest versions).
Have some tips to share that you find suitable for your use?
Naming tags should have a refrence to the real world. A recent example I did was this:
PTK3KOS1
Pressure Transmitter Kettle 3 Kettle Overhead Solvent #1
This is the tag used in the CMMS system (Maintenance system), and the P&ID
I use UDT's in RSL5K, so that becomes the following in RSLogix:
PTK3KOS1.VAL (Current value)
PTK3KOS1.MIN (I use this especially when I use flex I/O for scaling)
PTK3KOS1.MAX (And I also use it to pass min/max values to some HMI's like WW)
PTK3KOS1.LFF (Signal fault)
PTK3KOS1.LLA (Low alarm bit)
PTK3KOS1.LLL (Low Low bit)
PTK3KOS1.LHA (Hi Alarm bit)
PTK3KOS1.LHH (Hi Hi Bit)
PTK3KOS1.SLA (Setpoint low alarm)
PTK3KOS1.SLL
PTK3KOS1.SHA
PTK3KOS1.SHH
The most common system is the ISA system, see
http://www.engineeringtoolbox.com/isa-intrumentation-codes-d_415.html for an example.
There is also the KKS system, which I personally believe was designed by masochists, and will only use it when forced to do so.
http://www.vgb.org/en/db_kks_eng.html
I like to use some thing like this:
aabccdd_eeee_human-readable-name_wirenumber
aa
DO=Digital Output
DI=Digital Input
AO=Analog Output
AI=Analog Input
gl=Global variable
co=constant
pt=produced Tag
ct=consumed Tag
b
Rack Number
cc
Slot
dd
address 0-64
eeeee
panel/drawing tag
DO10606_MA949_WshLoaderAdvance_9491

Control a robotic arm

I have a Cyber Robot CYBER 310 and a Sciento CS-113 robotic arm with no documentation. Both use a parallel port.
How could I program those?
For the Cyber one, I found this:
Nothing at all on the Sciento one.
Any pointers or examples in Python/Java/C/whatever appreciated.
[update] This page contains some information, but I'm still lost: http://www.anf.nildram.co.uk/beebcontrol/arms/cyber/software.html
I am not entirely sure I understand what the question is.
Are you unfamiliar with with programming the parallel port?
My memory on it is hazy, but iirc it's pretty simple. It's a "dumb" interface so you simply need to write to it.
If you are running under linux then there are some great resources on it:
Linux Device Drivers: Chapter 9: An Overview of the Parallel port - Talks a bit about parallel port programming and goes on to talk about writing device drivers for it. A bit overkill I think for your application, but the entire book is fascinating, and enlightening.
Linux I/O port programming - essentially you can write to /dev/port, or include asm/io.h and use inb() and outb() (I haven't done this in a while, but im sure if you run into a specific problem there will be a multitude of answers out there once you have it narrowed down to something specific)
If you are on windows or mac, then id still suggest reading the above so you know what you are trying to do, they are straightforward in my opinion, then search for the windows/mac equivalent.
Now for what I assume the crux of the question is, what do you write to the ports?
For the Cyber 310 you have the pin layouts, although there seems to be multiple different pin layouts if you browse the site you have listed, and if we follow anf.nildram.co.uk here we can find some PIC assembly that will show us how to rotate the base.
I have never touched PIC assembly before today, but with some help from the internet and the comments, I think we can translate what this is trying to do (snipped out the relevant portion, as most of it is timing and looping )
; 6: Symbol prf = PORTA.0
; The address of 'prf' is 0x5,0
; 7: Symbol strobe = PORTA.1
; The address of 'strobe' is 0x5,1
; 8: Symbol base = PORTB.0
; The address of 'base' is 0x6,0
; 9: Symbol shoulder = PORTB.1
; The address of 'shoulder' is 0x6,1
...
; 16: main:
L0001:
; 17: base = 1
BSF 0x06,0 // set bit 0 at 0x06 to 1 essentially set base bit to 1
; 18: strobe = 1
BSF 0x05,1 // set strobe bit to 1
; 19: strobe = 0
BCF 0x05,1 // set strobe bit to 0
; 20: While a <> 730 // now we loop 729 more times
So it appears, from my naive perspective, that to rotate the arm you need to set the motor bits (grabbed from your pinout) then set and clear strobe.
Let me know if I am completely off base, this is a fascinating project.
Chris is right about the parallel port being a dumb interface. The parallel port has an address that you can output an 8bit binary number to that match the Digital Output's positions.
I found this to be a really good example of programming the Parallel port using C#.
http://www.codeproject.com/Articles/4981/I-O-Ports-Uncensored-1-Controlling-LEDs-Light-Emit
To match your project to his example. C0 is strobe. Then your Digital Outputs from left to right match his D0-D6.
Seems like a really fun project. Have fun.

Datefield changes to 2069 or 1970

I have a datefield in flex with editable="true".
When I type: 20-1-69 and I go to another field, it changes into 20-01-2069. but I don't want it to change into 2069. Is it possible to turn off this automatically change thing?
Thanks,
JSMB
In the past, when bytes where expensive, smart programmers used two digits to store the date. So a 60 was interpreted as an 1960 and all was wel.
Until the dreaded year 2000 was near, people started to panic, because computers would certainly stop to work and we would all die, or at least been cut from the Internet which is possibly worse.
So smart programmers were called to solve the problem.
Real smart programmers extended the number of digits to 4. But programmers that were self proclaimed smart, used the sliding window. 70 to 99 where interpreted as 1970-1999 and 0 to 69 was ineterpreted as 2000 to 2069. This of course solved the Y2K problem. But as you have found out, we have a 2070 problem.
The fun part is that, using this scheme, old guys like me are born in the future. Which makes us feel young again. So they probably did this on purpose.
You need to decide where you want the two-digit date to cut off between 1900 and 2000, and then perform some magic on the input.
Example: Cutoff is the year 1940
(pseudocode)
If twoDigitYear > 40
fourDigitYear = "19" + twoDigitYear
Else
fourDigitYear = "20" + twoDigitYear
This will give you years from 1941 to 2040.
Use a DateValidator to enforce a 4 digit year? If you're going to be entering in old dates, this is pretty much a requirement no matter what SDK you have. Even if you work out some way to make this work, some user will come along and get confused.

Quick divisibility check in ZX81 BASIC

Since many of the Project Euler problems require you to do a divisibility check for quite a number of times, I've been trying to figure out the fastest way to perform this task in ZX81 BASIC.
So far I've compared (N/D) to INT(N/D) to check, whether N is dividable by D or not.
I have been thinking about doing the test in Z80 machine code, I haven't yet figured out how to use the variables in the BASIC in the machine code.
How can it be achieved?
You can do this very fast in machine code by subtracting repeatedly. Basically you have a procedure like:
set accumulator to N
subtract D
if carry flag is set then it is not divisible
if zero flag is set then it is divisible
otherwise repeat subtraction until one of the above occurs
The 8 bit version would be something like:
DIVISIBLE_TEST:
LD B,10
LD A,100
DIVISIBLE_TEST_LOOP:
SUB B
JR C, $END_DIVISIBLE_TEST
JR Z, $END_DIVISIBLE_TEST
JR $DIVISIBLE_TEST_LOOP
END_DIVISIBLE_TEST:
LD B,A
LD C,0
RET
Now, you can call from basic using USR. What USR returns is whatever's in the BC register pair, so you would probably want to do something like:
REM poke the memory addresses with the operands to load the registers
POKE X+1, D
POKE X+3, N
LET r = USR X
IF r = 0 THEN GOTO isdivisible
IF r <> 0 THEN GOTO isnotdivisible
This is an introduction I wrote to Z80 which should help you figure this out. This will explain the flags if you're not familiar with them.
There's a load more links to good Z80 stuff from the main site although it is Spectrum rather than ZX81 focused.
A 16 bit version would be quite similar but using register pair operations. If you need to go beyond 16 bits it would get a bit more convoluted.
How you load this is up to you - but the traditional method is using DATA statements and POKEs. You may prefer to have an assembler figure out the machine code for you though!
Your existing solution may be good enough. Only replace it with something faster if you find it to be a bottleneck in profiling.
(Said with a straight face, of course.)
And anyway, on the ZX81 you can just switch to FAST mode.
Don't know if RANDOMIZE USR is available in ZX81 but I think it can be used to call routines in assembly. To pass arguments you might need to use POKE to set some fixed memory locations before executing RANDOMIZE USR.
I remember to find a list of routines implemented in the ROM to support the ZX Basic. I'm sure there are a few to perform floating operation.
An alternative to floating point is to use fixed point math. It's a lot faster in these kind of situations where there is no math coprocessor.
You also might find more information in Sinclair User issues. They published some articles related to programming in the ZX Spectrum
You should place the values in some pre-known memory locations, first. Then use the same locations from within Z80 assembler. There is no parameter passing between the two.
This is based on what I (still) remember of ZX Spectrum 48. Good luck, but you might consider upgrading your hw. ;/
The problem with Z80 machine code is that it has no floating point ops (and no integer divide or multiply, for that matter). Implementing your own FP library in Z80 assembler is not trivial. Of course, you can use the built-in BASIC routines, but then you may as well just stick with BASIC.

What are the most hardcore optimisations you've seen?

I'm not talking about algorithmic stuff (eg use quicksort instead of bubblesort), and I'm not talking about simple things like loop unrolling.
I'm talking about the hardcore stuff. Like Tiny Teensy ELF, The Story of Mel; practically everything in the demoscene, and so on.
I once wrote a brute force RC5 key search that processed two keys at a time, the first key used the integer pipeline, the second key used the SSE pipelines and the two were interleaved at the instruction level. This was then coupled with a supervisor program that ran an instance of the code on each core in the system. In total, the code ran about 25 times faster than a naive C version.
In one (here unnamed) video game engine I worked with, they had rewritten the model-export tool (the thing that turns a Maya mesh into something the game loads) so that instead of just emitting data, it would actually emit the exact stream of microinstructions that would be necessary to render that particular model. It used a genetic algorithm to find the one that would run in the minimum number of cycles. That is to say, the data format for a given model was actually a perfectly-optimized subroutine for rendering just that model. So, drawing a mesh to the screen meant loading it into memory and branching into it.
(This wasn't for a PC, but for a console that had a vector unit separate and parallel to the CPU.)
In the early days of DOS when we used floppy discs for all data transport there were viruses as well. One common way for viruses to infect different computers was to copy a virus bootloader into the bootsector of an inserted floppydisc. When the user inserted the floppydisc into another computer and rebooted without remembering to remove the floppy, the virus was run and infected the harddrive bootsector, thus permanently infecting the host PC. A particulary annoying virus I was infected by was called "Form", to battle this I wrote a custom floppy bootsector that had the following features:
Validate the bootsector of the host harddrive and make sure it was not infected.
Validate the floppy bootsector and
make sure that it was not infected.
Code to remove the virus from the
harddrive if it was infected.
Code to duplicate the antivirus
bootsector to another floppy if a
special key was pressed.
Code to boot the harddrive if all was
well, and no infections was found.
This was done in the program space of a bootsector, about 440 bytes :)
The biggest problem for my mates was the very cryptic messages displayed because I needed all the space for code. It was like "FFVD RM?", which meant "FindForm Virus Detected, Remove?"
I was quite happy with that piece of code. The optimization was program size, not speed. Two quite different optimizations in assembly.
My favorite is the floating point inverse square root via integer operations. This is a cool little hack on how floating point values are stored and can execute faster (even doing a 1/result is faster than the stock-standard square root function) or produce more accurate results than the standard methods.
In c/c++ the code is: (sourced from Wikipedia)
float InvSqrt (float x)
{
float xhalf = 0.5f*x;
int i = *(int*)&x;
i = 0x5f3759df - (i>>1); // Now this is what you call a real magic number
x = *(float*)&i;
x = x*(1.5f - xhalf*x*x);
return x;
}
A Very Biological Optimisation
Quick background: Triplets of DNA nucleotides (A, C, G and T) encode amino acids, which are joined into proteins, which are what make up most of most living things.
Ordinarily, each different protein requires a separate sequence of DNA triplets (its "gene") to encode its amino acids -- so e.g. 3 proteins of lengths 30, 40, and 50 would require 90 + 120 + 150 = 360 nucleotides in total. However, in viruses, space is at a premium -- so some viruses overlap the DNA sequences for different genes, using the fact that there are 6 possible "reading frames" to use for DNA-to-protein translation (namely starting from a position that is divisible by 3; from a position that divides 3 with remainder 1; or from a position that divides 3 with remainder 2; and the same again, but reading the sequence in reverse.)
For comparison: Try writing an x86 assembly language program where the 300-byte function doFoo() begins at offset 0x1000... and another 200-byte function doBar() starts at offset 0x1001! (I propose a name for this competition: Are you smarter than Hepatitis B?)
That's hardcore space optimisation!
UPDATE: Links to further info:
Reading Frames on Wikipedia suggests Hepatitis B and "Barley Yellow Dwarf" virus (a plant virus) both overlap reading frames.
Hepatitis B genome info on Wikipedia. Seems that different reading-frame subunits produce different variations of a surface protein.
Or you could google for "overlapping reading frames"
Seems this can even happen in mammals! Extensively overlapping reading frames in a second mammalian gene is a 2001 scientific paper by Marilyn Kozak that talks about a "second" gene in rat with "extensive overlapping reading frames". (This is quite surprising as mammals have a genome structure that provides ample room for separate genes for separate proteins.) Haven't read beyond the abstract myself.
I wrote a tile-based game engine for the Apple IIgs in 65816 assembly language a few years ago. This was a fairly slow machine and programming "on the metal" is a virtual requirement for coaxing out acceptable performance.
In order to quickly update the graphics screen one has to map the stack to the screen in order to use some special instructions that allow one to update 4 screen pixels in only 5 machine cycles. This is nothing particularly fantastic and is described in detail in IIgs Tech Note #70. The hard-core bit was how I had to organize the code to make it flexible enough to be a general-purpose library while still maintaining maximum speed.
I decomposed the graphics screen into scan lines and created a 246 byte code buffer to insert the specialized 65816 opcodes. The 246 bytes are needed because each scan line of the graphics screen is 80 words wide and 1 additional word is required on each end for smooth scrolling. The Push Effective Address (PEA) instruction takes up 3 bytes, so 3 * (80 + 1 + 1) = 246 bytes.
The graphics screen is rendered by jumping to an address within the 246 byte code buffer that corresponds to the right edge of the screen and patching in a BRanch Always (BRA) instruction into the code at the word immediately following the left-most word. The BRA instruction takes a signed 8-bit offset as its argument, so it just barely has the range to jump out of the code buffer.
Even this isn't too terribly difficult, but the real hard-core optimization comes in here. My graphics engine actually supported two independent background layers and animated tiles by using different 3-byte code sequences depending on the mode:
Background 1 uses a Push Effective Address (PEA) instruction
Background 2 uses a Load Indirect Indexed (LDA ($00),y) instruction followed by a push (PHA)
Animated tiles use a Load Direct Page Indexed (LDA $00,x) instruction followed by a push (PHA)
The critical restriction is that both of the 65816 registers (X and Y) are used to reference data and cannot be modified. Further the direct page register (D) is set based on the origin of the second background and cannot be changed; the data bank register is set to the data bank that holds pixel data for the second background and cannot be changed; the stack pointer (S) is mapped to graphics screen, so there is no possibility of jumping to a subroutine and returning.
Given these restrictions, I had the need to quickly handle cases where a word that is about to be pushed onto the stack is mixed, i.e. half comes from Background 1 and half from Background 2. My solution was to trade memory for speed. Because all of the normal registers were in use, I only had the Program Counter (PC) register to work with. My solution was the following:
Define a code fragment to do the blend in the same 64K program bank as the code buffer
Create a copy of this code for each of the 82 words
There is a 1-1 correspondence, so the return from the code fragment can be a hard-coded address
Done! We have a hard-coded subroutine that does not affect the CPU registers.
Here is the actual code fragments
code_buff: PEA $0000 ; rightmost word (16-bits = 4 pixels)
PEA $0000 ; background 1
PEA $0000 ; background 1
PEA $0000 ; background 1
LDA (72),y ; background 2
PHA
LDA (70),y ; background 2
PHA
JMP word_68 ; mix the data
word_68_rtn: PEA $0000 ; more background 1
...
PEA $0000
BRA *+40 ; patched exit code
...
word_68: LDA (68),y ; load data for background 2
AND #$00FF ; mask
ORA #$AB00 ; blend with data from background 1
PHA
JMP word_68_rtn ; jump back
word_66: LDA (66),y
...
The end result was a near-optimal blitter that has minimal overhead and cranks out more than 15 frames per second at 320x200 on a 2.5 MHz CPU with a 1 MB/s memory bus.
Michael Abrash's "Zen of Assembly Language" had some nifty stuff, though I admit I don't recall specifics off the top of my head.
Actually it seems like everything Abrash wrote had some nifty optimization stuff in it.
The Stalin Scheme compiler is pretty crazy in that aspect.
I once saw a switch statement with a lot of empty cases, a comment at the head of the switch said something along the lines of:
Added case statements that are never hit because the compiler only turns the switch into a jump-table if there are more than N cases
I forget what N was. This was in the source code for Windows that was leaked in 2004.
I've gone to the Intel (or AMD) architecture references to see what instructions there are. movsx - move with sign extension is awesome for moving little signed values into big spaces, for example, in one instruction.
Likewise, if you know you only use 16-bit values, but you can access all of EAX, EBX, ECX, EDX , etc- then you have 8 very fast locations for values - just rotate the registers by 16 bits to access the other values.
The EFF DES cracker, which used custom-built hardware to generate candidate keys (the hardware they made could prove a key isn't the solution, but could not prove a key was the solution) which were then tested with a more conventional code.
The FSG 2.0 packer made by a Polish team, specifically made for packing executables made with assembly. If packing assembly isn't impressive enough (what's supposed to be almost as low as possible) the loader it comes with is 158 bytes and fully functional. If you try packing any assembly made .exe with something like UPX, it will throw a NotCompressableException at you ;)