Correlating traditional Windows joystick axes with HID - usb

I'm a bit confused on the description of joystick axes and I'm hoping that someone has a link or document which could help clear my confusion.
I'm not a Windows guy, so trying to port some traditional Windows gameport code has me a bit confused.
We all know about the common first three axes:
X
Y
Z
My understanding was that in the gameport-style interface the three other axes are:
R
U
V
However, looking in my IOHIDUsageTables (OS X), I see:
kHIDUsage_GD_X = 0x30, /* Dynamic Value */
kHIDUsage_GD_Y = 0x31, /* Dynamic Value */
kHIDUsage_GD_Z = 0x32, /* Dynamic Value */
kHIDUsage_GD_Rx = 0x33, /* Dynamic Value */
kHIDUsage_GD_Ry = 0x34, /* Dynamic Value */
kHIDUsage_GD_Rz = 0x35, /* Dynamic Value */
kHIDUsage_GD_Vx = 0x40, /* Dynamic Value */
kHIDUsage_GD_Vy = 0x41, /* Dynamic Value */
kHIDUsage_GD_Vz = 0x42, /* Dynamic Value */
kHIDUsage_GD_Vbrx = 0x43, /* Dynamic Value */
kHIDUsage_GD_Vbry = 0x44, /* Dynamic Value */
kHIDUsage_GD_Vbrz = 0x45, /* Dynamic Value */
kHIDUsage_GD_Vno = 0x46, /* Dynamic Value */
This has me a bit confused because of the three R axes (though that does not appear to be uncommon) and the lack of a U axis.
Two questions:
1) Can anyone confirm which HID axis the traditional U axis corresponds to? I saw one document describe it as "the axis for rudder pedals", leading me to believe it would be Ry.
2) Can anyone describe in more detail the typical usages of the V and Vbr axes? I understand the descriptions are "vector" and "relative vector," respectively, but I'm having difficulty visualizing what that means in terms of a physical device.
All enlightenment and documentation pointers welcome.

There are 2 different conventions here with confusingly similar naming:
Position, (R)otation and/or (V)elocity for each of the x, y, z axes
(R), (U), (V) axes
It might be the case that the R, U, V axes map directly onto 3 of the HID slots, whichever ones they may be. Or it might be the case that the drivers do something else, depending on which exact piece of hardware it is.
Personally I wouldn't spend too much time worrying about what each axis 'means' or whether they can be mapped directly. Each joystick has different physical controllers which will be mapped by the drivers in an arbitrary way. So beyond X and Y it's difficult to anticipate what axes will be used for each function. And even if you can guess the original intention, it's likely that a user may wish to override the defaults. So it's probably best to implement your axis mapping via a settings file that can be configured on a per-device and per-user basis.
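As a rough illustration of the per-device mapping idea, here is a minimal sketch in C. All type and function names (LogicalAxis, AxisBinding, DeviceProfile, LookupAxis) and the vendor/product IDs are made up for the example; only the usage values 0x30 to 0x35 come from the table in the question. A real implementation would presumably fill the bindings from a settings file instead of a static array.

```c
#include <stddef.h>
#include <stdint.h>

/* Logical axes the application cares about (hypothetical names). */
typedef enum {
    AXIS_NONE = -1,
    AXIS_STICK_X,
    AXIS_STICK_Y,
    AXIS_THROTTLE,
    AXIS_RUDDER
} LogicalAxis;

/* One remapping entry: HID Generic Desktop usage -> logical axis. */
typedef struct {
    uint32_t    hidUsage;   /* e.g. 0x30 for X, 0x35 for Rz */
    LogicalAxis axis;
} AxisBinding;

/* Per-device profile; assumed to be loaded from user settings. */
typedef struct {
    uint16_t           vendorID;
    uint16_t           productID;
    const AxisBinding *bindings;
    size_t             bindingCount;
} DeviceProfile;

/* Returns the logical axis bound to hidUsage, or AXIS_NONE if unbound. */
static LogicalAxis LookupAxis(const DeviceProfile *profile, uint32_t hidUsage)
{
    for (size_t i = 0; i < profile->bindingCount; i++) {
        if (profile->bindings[i].hidUsage == hidUsage)
            return profile->bindings[i].axis;
    }
    return AXIS_NONE;
}

/* Example default profile; the IDs and the Rz -> rudder choice are guesses. */
static const AxisBinding kExampleBindings[] = {
    { 0x30, AXIS_STICK_X  },
    { 0x31, AXIS_STICK_Y  },
    { 0x32, AXIS_THROTTLE },
    { 0x35, AXIS_RUDDER   },
};

static const DeviceProfile kExampleProfile = {
    0x1234, 0x5678,
    kExampleBindings,
    sizeof kExampleBindings / sizeof kExampleBindings[0]
};
```

With this layout, overriding a default is just a matter of swapping one binding entry for a given device, so the code never has to know what U or Ry "really" means.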

objective-c crop vImage PixelBuffer

How can I access vImage's existing crop capability, which is documented only for Swift, from Objective-C?
https://developer.apple.com/documentation/accelerate/vimage/pixelbuffer/3951652-cropped?changes=_7_1&language=objc
For reference, I also asked on the Apple developer forum:
https://developer.apple.com/forums/thread/720851
In C, the computation is fairly straightforward, because the vImage_Buffer is just a pointer, height, width and rowBytes. It didn't exist for the first 20 years because it was assumed you could just do it trivially yourself. (Apple assumes familiarity with pointers in C based languages.) To be clear, you aren't actually cropping the image, just moving the pointer from the top left of the image to the top left of the sub rectangle and making the width and height smaller. The pixels stay where they are.
#include <Accelerate/Accelerate.h>
#include <CoreGraphics/CoreGraphics.h>
#define AdvancePtr( _ptr, _bytes) (__typeof__(_ptr))((uintptr_t)(_ptr) + (size_t)(_bytes))
static inline vImage_Buffer MyCrop( vImage_Buffer buf, CGRect where, size_t pixelBytes )
{
return (vImage_Buffer)
{
// irresponsibly assume where fits inside buf without checking
.data = AdvancePtr( buf.data, where.origin.y * buf.rowBytes + where.origin.x * pixelBytes ),
.height = (unsigned long) where.size.height, // irresponsibly assume where.size.height is an integer and not oversized
.width = (unsigned long) where.size.width, // irresponsibly assume where.size.width is an integer and not oversized
.rowBytes = buf.rowBytes
};
}
In Swift, there is less monkeying with raw pointers, so such methods may be deemed necessary.
Note that in certain cases with video content, wherein the "pixels" are actually glommed together in chunks, the calculation may be slightly different, and possibly the "pixel" may not be directly addressable at all. For example, if we had 422 content with YCbYCr 10-bit chunks (5 bytes/chunk), and you want to point to the second Y in the chunk, this wouldn't be possible because it would not be located at a byte addressable address. It would be spanned across a pair of bytes.
When it is calculable, the x part of the pointer movement would look like this:
(x_offset * bits_per_pixel) / 8 /*bits per byte*/
and we'd want to make sure that division was exact, without remainder. Most pixel formats have channels that are some integer multiple of a byte and don't suffer from this complication.
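A minimal sketch of that exactness check, in plain C without Accelerate. The helper name XByteOffset is hypothetical, and it assumes the format can be described by a single average bits-per-pixel figure (20 for the 10-bit 422 example above, since a 5-byte chunk covers two pixels).

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Computes the byte offset of column x within a row for a format with
 * bitsPerPixel bits per pixel. Returns false (and leaves *outBytes alone)
 * when the pixel does not start on a byte boundary.
 * Assumes x * bitsPerPixel does not overflow size_t.
 */
static bool XByteOffset(size_t x, size_t bitsPerPixel, size_t *outBytes)
{
    size_t bits = x * bitsPerPixel;

    if (bits % 8 != 0)      /* division would leave a remainder */
        return false;

    *outBytes = bits / 8;   /* 8 bits per byte */
    return true;
}
```

For a 32-bit ARGB format every x is addressable; for the 20 bits-per-pixel case only even x values are, which matches the "second Y in the chunk" problem described above.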

nand2tetris HDL: Getting error "Sub bus of an internal node may not be used"

I am trying to make a 10-bit adder/subtractor. Right now, the logic works as intended. However, I am trying to set all bits to 0 iff there is overflow. To do this, I need to pass the output (tempOut) through a 10-bit Mux, but in doing so, am getting an error.
Here is the chip:
/**
* Adds or Subtracts two 10-bit values.
* Both inputs a and b are in SIGNED 2s complement format
* when sub == 0, the chip performs add i.e. out=a+b
* when sub == 1, the chip performs subtract i.e. out=a-b
* carry reflects the overflow calculated for 10-bit add/subtract in 2s complement
*/
CHIP AddSub10 {
IN a[10], b[10], sub;
OUT out[10],carry;
PARTS:
// If sub == 1, subtraction, else addition
// First RCA4
Not4(in=b[0..3], out=notB03);
Mux4(a=b[0..3], b=notB03, sel=sub, out=MuxOneOut);
RCA4(a=a[0..3], b=MuxOneOut, cin=sub, sum=tempOut[0..3], cout=cout03);
// Second RCA4
Not4(in=b[4..7], out=notB47);
Mux4(a=b[4..7], b=notB47, sel=sub, out=MuxTwoOut);
RCA4(a=a[4..7], b=MuxTwoOut, cin=cout03, sum=tempOut[4..7], cout=cout47);
// Third RCA4
Not4(in[0..1]=b[8..9], out=notB89);
Mux4(a[0..1]=b[8..9], b=notB89, sel=sub, out=MuxThreeOut);
RCA4(a[0..1]=a[8..9], b=MuxThreeOut, cin=cout47, sum[0..1]=tempOut[8..9], sum[0]=tempA, sum[1]=tempB, sum[2]=carry);
// FIXME, intended to solve overflow/underflow
Xor(a=tempA, b=tempB, out=overflow);
Mux10(a=tempOut, b=false, sel=overflow, out=out);
}
Instead of x[a..b]=tempOut[c..d] you need to use the form x[a..b]=tempVariableAtoB (creating a new internal bus) and combine these buses in your Mux10:
Mux10(a[0..3]=temp0to3, a[4..7]=temp4to7, ... );
Without knowing what line the compiler is complaining about, it is difficult to diagnose the problem. However, my best guess is that you can't use an arbitrary internal bus like tempOut because the compiler doesn't know how big it is when it first runs into it.
The compiler knows the size of the IN and OUT elements, and it knows the size of the inputs and outputs of a component. But it can't tell how big tempOut would be without parsing everything, and that's probably outside the scope of the compiler design.
I would suggest you refactor so that each RCA4 has a discrete output bus (ie: sum1, sum2, sum3). You can then use them and their individual bits as needed in the Xor and Mux10.

how do I divide large number into two smaller integers and then reassemble the large number?

I have tried the code below but do not get the correct value in the end:
I have a number that may be larger than 32bit and hence I want to store it into two 32 bit array indices.
I broke them up like:
int[0] = lgval%(2^32);
int[1] = lgval/(2^32);
and reassembling the 64bit value I tried like:
CPU: PowerPC e500v2
lgval= ((uint64)int[0]) | (((uint64)int[1])>>32);
Mind the shift to the right, since we're on big endian. For some reason I do not get the correct value at the end. Why not? What am I doing wrong here?
The ^ operator is xor, not power.
The way you want to do this is probably:
uint32_t split[2];
uint64_t lgval;
/* ... */
split[0] = lgval & 0xffffffff;
split[1] = lgval >> 32;
/* code to operate on your 32-bit array elements goes here */
lgval = ((uint64_t)split[1] << 32) | (uint64_t)(split[0]);
As Raymond Chen has mentioned, endianness is about storage. In this case, you only need to consider endianness if you want to access the bytes in your split-32-bit-int as a single 64-bit value. This probably isn't a good idea anyway.
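To make the round trip easy to check, here is the same idea wrapped in two small helpers. The names SplitU64 and JoinU64 are made up for this sketch; the arithmetic is the same as above and does not depend on the CPU's endianness.

```c
#include <stdint.h>

/* Stores the low 32 bits in split[0] and the high 32 bits in split[1]. */
static void SplitU64(uint64_t value, uint32_t split[2])
{
    split[0] = (uint32_t)(value & 0xffffffffu);
    split[1] = (uint32_t)(value >> 32);
}

/* Rebuilds the 64-bit value; the high half is shifted LEFT, not right. */
static uint64_t JoinU64(const uint32_t split[2])
{
    return ((uint64_t)split[1] << 32) | (uint64_t)split[0];
}
```

Note also that in C the expression 2^32 evaluates to 34 (bitwise xor of 2 and 32), so the original lgval%(2^32) was really lgval%34.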

howmany() Macro Objective C

While using Xcode, I accidentally auto completed to the macro howmany(x,y) and traced it to types.h. The entire line reads as follows:
#define howmany(x, y) __DARWIN_howmany(x, y) /* # y's == x bits? */
This didn't really make much sense, so I followed the path a little more and found __DARWIN_howmany(x, y) in _fd_def.h. The entire line reads as follows:
#define __DARWIN_howmany(x, y) ((((x) % (y)) == 0) ? ((x) / (y)) : (((x) / (y)) + 1)) /* # y's == x bits? */
I have no idea what __DARWIN_howmany(x, y) does. Does the comment at the end of the line shed any light on the intended function of this macro? Could someone please explain what this macro does, how it is used, and its relevance in _fd_def.h?
This is a fairly commonly used macro to help programmers quickly answer the question, if I have some number of things, and my containers can each hold y of them, how many containers do I need to hold x things?
So if your containers can hold five things each, and you have 18 of them:
n = howmany(18, 5);
will tell you that you will need four containers. Or, if my buffers are allocated in words, but I need to put n characters into them, and words are 8 characters long, then:
n = howmany(n, 8);
returns the number of words needed. This sort of computation is ubiquitous in buffer allocation code.
This is frequently computed as:
#define howmany(x, y) (((x)+(y)-1)/(y))
Also related is roundup(x, y), which rounds x up to the next multiple of y:
#define roundup(x, y) (howmany(x, y)*(y))
Based on what you've posted, the macro seems to be intended to answer a question like, "How many chars does it take to hold 18 bits?" That question could be answered with this line of code
int count = howmany( 18, CHAR_BIT );
which will set count to 3.
The macro works by first checking if y divides evenly into x. If so, it returns x/y, otherwise it divides x by y and rounds up.
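For comparison, here are both formulations side by side as a small sketch. The MY_ prefixed names are made up so they do not clash with the system's own howmany and roundup. Both forms assume x is non-negative and y is positive, and like any function-style macro they evaluate their arguments more than once.

```c
/* Darwin-style form: test the remainder, then round up if needed. */
#define MY_HOWMANY_MOD(x, y) \
    ((((x) % (y)) == 0) ? ((x) / (y)) : (((x) / (y)) + 1))

/* Additive form: bias x by y - 1 so integer division rounds up.
 * Can overflow if x is close to the maximum of its type. */
#define MY_HOWMANY_ADD(x, y)  (((x) + (y) - 1) / (y))

/* Rounds x up to the next multiple of y. */
#define MY_ROUNDUP(x, y)      (MY_HOWMANY_ADD(x, y) * (y))
```

The two howmany forms agree for the examples above: 18 things in containers of 5 gives 4 either way, and MY_ROUNDUP(18, 8) gives 24. The remainder-based form avoids the x + y - 1 overflow at the cost of a second division.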

ROL / ROR on variable using inline assembly only in Objective-C [duplicate]

A few days ago, I asked the question below. Because I was in need of a quick answer, I added:
The code does not need to use inline assembly. However, I haven't found a way to do this using Objective-C / C++ / C instructions.
Today, I would like to learn something. So I ask the question again, looking for an answer using inline assembly.
I would like to perform ROR and ROL operations on variables in an Objective-C program. However, I can't manage it – I am not an assembly expert.
Here is what I have done so far:
uint8_t v1 = ....;
uint8_t v2 = ....; // v2 is either 1, 2, 3, 4 or 5
asm("ROR v1, v2");
the error I get is:
Unknown use of instruction mnemonic with unknown size suffix
How can I fix this?
A rotate is just two shifts - some bits go left, the others right - once you see this rotating is easy without assembly. The pattern is recognised by some compilers and compiled using the rotate instructions. See wikipedia for the code.
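Since the question uses uint8_t operands, here is a minimal sketch of the two-shift pattern for 8-bit values. The names Rotl8 and Ror8 are made up for the example. The shift counts are masked so that a count of 0 or 8 simply returns the value unchanged, and the result is cast back to uint8_t because the operands are promoted to int during the shifts.

```c
#include <stdint.h>

/* Rotate an 8-bit value left by shift bits (count taken modulo 8). */
static inline uint8_t Rotl8(uint8_t value, unsigned shift)
{
    shift &= 7;
    return (uint8_t)((value << shift) | (value >> ((8 - shift) & 7)));
}

/* Rotate an 8-bit value right by shift bits (count taken modulo 8). */
static inline uint8_t Ror8(uint8_t value, unsigned shift)
{
    shift &= 7;
    return (uint8_t)((value >> shift) | (value << ((8 - shift) & 7)));
}
```

For example, Rotl8(0x81, 1) gives 0x03 and Ror8(0x81, 1) gives 0xC0. Whether the compiler turns this into a single rotate instruction depends on the compiler and target.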
Update: Xcode 4.6.2 (others not tested) on x86-64 compiles the double shift + or to a rotate for 32 & 64 bit operands; for 8 & 16 bit operands the double shift + or is kept. Why? Maybe the compiler understands something about the performance of these instructions, maybe they just didn't optimise - but in general, if you can avoid assembler do so; the compiler invariably knows best! Also, declaring the functions static inline, or using macros defined in the same way as the standard macro MAX (a macro has the advantage of adapting to the type of its operands), can be used to inline the operations.
Addendum after OP comment
Here is the x86-64 assembler as an example; for full details of how to use the asm construct start here.
First the non-assembler version:
static inline uint32_t rotl32_i64(uint32_t value, unsigned shift)
{
    // assume shift is in range 0..31 or subtraction would be wrong
    // however we know the compiler will spot the pattern and replace
    // the expression with a single roll and there will be no subtraction
    // so if the compiler changes this may break without:
    // shift &= 0x1f;
    return (value << shift) | (value >> (32 - shift));
}
void test_rotl32(uint32_t value, unsigned shift)
{
    uint32_t shifted = rotl32_i64(value, shift);
    NSLog(@"%8x <<< %u -> %8x", value & 0xFFFFFFFF, shift, shifted & 0xFFFFFFFF);
}
If you look at the assembler output for profiling (so the optimiser kicks in) in Xcode (Product > Generate Output > Assembly File, then select Profiling in the pop-up menu at the bottom of the window) you will see that rotl32_i64 is inlined into test_rotl32 and compiles down to a rotate (roll) instruction.
Now producing the assembler directly yourself is a bit more involved than for the ARM code FrankH showed. This is because to take a variable shift value a specific register, cl, must be used, so we need to give the compiler enough information to do that. Here goes:
static inline uint32_t rotl32_i64_asm(uint32_t value, unsigned shift)
{
    // i64 - shift must be in register cl so create a register local assigned to cl
    // no need to mask as i64 will do that
    register uint8_t cl asm ( "cl" ) = shift;
    uint32_t shifted;
    // emit the rotate left long
    // %n values are replaced by args:
    // 0: "=r" (shifted) - any register (r), result(=), store in var (shifted)
    // 1: "0" (value) - *same* register as %0 (0), load from var (value)
    // 2: "r" (cl) - any register (r), load from var (cl - which is the cl register so this one is used)
    __asm__ ("roll %2,%0" : "=r" (shifted) : "0" (value), "r" (cl));
    return shifted;
}
Change test_rotl32 to call rotl32_i64_asm and check the assembly output again - it should be the same, i.e. the compiler did as well as we did.
Further note that if the commented out masking line in rotl32_i64 is included it essentially becomes rotl32 - the compiler will do the right thing for any architecture all for the cost of a single and instruction in the i64 version.
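One caveat worth spelling out: with only shift &= 0x1f added, a shift of 0 still evaluates value >> 32, and shifting a 32-bit value by 32 is undefined in C. A hedged sketch of a variant that also masks the right-shift count follows; the name Rotl32Portable is made up for this example, and whether a given compiler still reduces it to a single rotate instruction is something to check in the assembly output rather than assume.

```c
#include <stdint.h>

/* Rotate left by shift bits; well-defined for any shift, including 0 and 32. */
static inline uint32_t Rotl32Portable(uint32_t value, unsigned shift)
{
    shift &= 31;                                   /* count modulo 32 */
    return (value << shift) | (value >> ((32u - shift) & 31u));
}
```

When shift is 0, both halves of the expression are just value, so the result is value itself.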
So asm is there if you need it, using it can be somewhat involved, and the compiler will invariably do as well or better by itself...
HTH
The 32bit rotate in ARM would be:
__asm__("MOV %0, %1, ROR %2\n" : "=r"(out) : "r"(in), "M"(N));
where N is required to be a compile-time constant.
But the output of the barrel shifter, whether used on a register or an immediate operand, is always a full-register-width; you can shift a constant 8-bit quantity to any position within a 32bit word, or - as here - shift/rotate the value in a 32bit register any which way.
But you cannot rotate 16bit or 8bit values within a register using a single ARM instruction. None such exists.
That's why the compiler, on ARM targets, when you use the "normal" (portable [Objective-]C/C++) code (in << xx) | (in >> (w - xx)), will emit one assembler instruction for a 32bit rotate, but at least two (a normal shift followed by a shifted or) for 8/16bit ones.