Decoding HID descriptor to match RAW HID Data - usb
I'm trying to decode a raw HID data stream from a multitouch screen, which I read from /dev/hidraw2 when the screen is connected to a Linux computer.
I already have the HID report descriptor of the multitouch screen and a 64-byte stream coming from it.
1 finger touch
So far I can tell that 02 is the report ID, the next byte is 04 or 07 depending on whether the screen is pressed, xx is a byte I don't understand, xx xx are the X coordinates, and yy yy are the Y coordinates. But there must be a way to read the HID descriptor and use it to translate the data stream I'm getting.
Usage Page (Digitizer), ; Digitizer (0Dh)
Usage (Touchscreen), ; Touch screen (04h, application collection)
Collection (Application),
Report ID (2),
Usage (Finger), ; Finger (22h, logical collection)
Collection (Logical),
Usage Page (Digitizer), ; Digitizer (0Dh)
Usage (Tip Switch), ; Tip switch (42h, momentary control)
Logical Minimum (0),
Logical Maximum (1),
Report Size (1),
Report Count (1),
Input (Variable),
Usage (In Range), ; In range (32h, momentary control)
Input (Variable),
Usage (47h),
Input (Variable),
Report Count (5),
Input (Constant, Variable),
Report Size (8),
Usage (51h),
Report Count (1),
Input (Variable),
Usage Page (Desktop), ; Generic desktop controls (01h)
Logical Minimum (0),
Logical Maximum (32767),
Report Size (16),
Usage (X), ; X (30h, dynamic value)
Input (Variable),
Usage (Y), ; Y (31h, dynamic value)
Input (Variable),
End Collection,
Collection (Logical),
Usage Page (Digitizer), ; Digitizer (0Dh)
Usage (Tip Switch), ; Tip switch (42h, momentary control)
Logical Minimum (0),
Logical Maximum (1),
Report Size (1),
Report Count (1),
Input (Variable),
Usage (In Range), ; In range (32h, momentary control)
Input (Variable),
Usage (47h),
Input (Variable),
Report Count (5),
Input (Constant, Variable),
Report Size (8),
Usage (51h),
Report Count (1),
Input (Variable),
Usage Page (Desktop), ; Generic desktop controls (01h)
Logical Minimum (0),
Logical Maximum (32767),
Report Size (16),
Usage (X), ; X (30h, dynamic value)
Input (Variable),
Usage (Y), ; Y (31h, dynamic value)
Input (Variable),
End Collection,
Collection (Logical),
Usage Page (Digitizer), ; Digitizer (0Dh)
Usage (Tip Switch), ; Tip switch (42h, momentary control)
Logical Minimum (0),
Logical Maximum (1),
Report Size (1),
Report Count (1),
Input (Variable),
Usage (In Range), ; In range (32h, momentary control)
Input (Variable),
Usage (47h),
Input (Variable),
Report Count (5),
Input (Constant, Variable),
Report Size (8),
Usage (51h),
Report Count (1),
Input (Variable),
Usage Page (Desktop), ; Generic desktop controls (01h)
Logical Minimum (0),
Logical Maximum (32767),
Report Size (16),
Usage (X), ; X (30h, dynamic value)
Input (Variable),
Usage (Y), ; Y (31h, dynamic value)
Input (Variable),
End Collection,
Collection (Logical),
Usage Page (Digitizer), ; Digitizer (0Dh)
Usage (Tip Switch), ; Tip switch (42h, momentary control)
Logical Minimum (0),
Logical Maximum (1),
Report Size (1),
Report Count (1),
Input (Variable),
Usage (In Range), ; In range (32h, momentary control)
Input (Variable),
Usage (47h),
Input (Variable),
Report Count (5),
Input (Constant, Variable),
Report Size (8),
Usage (51h),
Report Count (1),
Input (Variable),
Usage Page (Desktop), ; Generic desktop controls (01h)
Logical Minimum (0),
Logical Maximum (32767),
Report Size (16),
Usage (X), ; X (30h, dynamic value)
Input (Variable),
Usage (Y), ; Y (31h, dynamic value)
Input (Variable),
End Collection,
Collection (Logical),
Usage Page (Digitizer), ; Digitizer (0Dh)
Usage (Tip Switch), ; Tip switch (42h, momentary control)
Logical Minimum (0),
Logical Maximum (1),
Report Size (1),
Report Count (1),
Input (Variable),
Usage (In Range), ; In range (32h, momentary control)
Input (Variable),
Usage (47h),
Input (Variable),
Report Count (5),
Input (Constant, Variable),
Report Size (8),
Usage (51h),
Report Count (1),
Input (Variable),
Usage Page (Desktop), ; Generic desktop controls (01h)
Logical Minimum (0),
Logical Maximum (32767),
Report Size (16),
Usage (X), ; X (30h, dynamic value)
Input (Variable),
Usage (Y), ; Y (31h, dynamic value)
Input (Variable),
End Collection,
Collection (Logical),
Usage Page (Digitizer), ; Digitizer (0Dh)
Usage (Tip Switch), ; Tip switch (42h, momentary control)
Logical Minimum (0),
Logical Maximum (1),
Report Size (1),
Report Count (1),
Input (Variable),
Usage (In Range), ; In range (32h, momentary control)
Input (Variable),
Usage (47h),
Input (Variable),
Report Count (5),
Input (Constant, Variable),
Report Size (8),
Usage (51h),
Report Count (1),
Input (Variable),
Usage Page (Desktop), ; Generic desktop controls (01h)
Logical Minimum (0),
Logical Maximum (32767),
Report Size (16),
Usage (X), ; X (30h, dynamic value)
Input (Variable),
Usage (Y), ; Y (31h, dynamic value)
Input (Variable),
End Collection,
Collection (Logical),
Usage Page (Digitizer), ; Digitizer (0Dh)
Usage (Tip Switch), ; Tip switch (42h, momentary control)
Logical Minimum (0),
Logical Maximum (1),
Report Size (1),
Report Count (1),
Input (Variable),
Usage (In Range), ; In range (32h, momentary control)
Input (Variable),
Usage (47h),
Input (Variable),
Report Count (5),
Input (Constant, Variable),
Report Size (8),
Usage (51h),
Report Count (1),
Input (Variable),
Usage Page (Desktop), ; Generic desktop controls (01h)
Logical Minimum (0),
Logical Maximum (32767),
Report Size (16),
Usage (X), ; X (30h, dynamic value)
Input (Variable),
Usage (Y), ; Y (31h, dynamic value)
Input (Variable),
End Collection,
Collection (Logical),
Usage Page (Digitizer), ; Digitizer (0Dh)
Usage (Tip Switch), ; Tip switch (42h, momentary control)
Logical Minimum (0),
Logical Maximum (1),
Report Size (1),
Report Count (1),
Input (Variable),
Usage (In Range), ; In range (32h, momentary control)
Input (Variable),
Usage (47h),
Input (Variable),
Report Count (5),
Input (Constant, Variable),
Report Size (8),
Usage (51h),
Report Count (1),
Input (Variable),
Usage Page (Desktop), ; Generic desktop controls (01h)
Logical Minimum (0),
Logical Maximum (32767),
Report Size (16),
Usage (X), ; X (30h, dynamic value)
Input (Variable),
Usage (Y), ; Y (31h, dynamic value)
Input (Variable),
End Collection,
Collection (Logical),
Usage Page (Digitizer), ; Digitizer (0Dh)
Usage (Tip Switch), ; Tip switch (42h, momentary control)
Logical Minimum (0),
Logical Maximum (1),
Report Size (1),
Report Count (1),
Input (Variable),
Usage (In Range), ; In range (32h, momentary control)
Input (Variable),
Usage (47h),
Input (Variable),
Report Count (5),
Input (Constant, Variable),
Report Size (8),
Usage (51h),
Report Count (1),
Input (Variable),
Usage Page (Desktop), ; Generic desktop controls (01h)
Logical Minimum (0),
Logical Maximum (32767),
Report Size (16),
Usage (X), ; X (30h, dynamic value)
Input (Variable),
Usage (Y), ; Y (31h, dynamic value)
Input (Variable),
End Collection,
Collection (Logical),
Usage Page (Digitizer), ; Digitizer (0Dh)
Usage (Tip Switch), ; Tip switch (42h, momentary control)
Logical Minimum (0),
Logical Maximum (1),
Report Size (1),
Report Count (1),
Input (Variable),
Usage (In Range), ; In range (32h, momentary control)
Input (Variable),
Usage (47h),
Input (Variable),
Report Count (5),
Input (Constant, Variable),
Report Size (8),
Usage (51h),
Report Count (1),
Input (Variable),
Usage Page (Desktop), ; Generic desktop controls (01h)
Logical Minimum (0),
Logical Maximum (32767),
Report Size (16),
Usage (X), ; X (30h, dynamic value)
Input (Variable),
Usage (Y), ; Y (31h, dynamic value)
Input (Variable),
End Collection,
Usage Page (Digitizer), ; Digitizer (0Dh)
Usage (54h),
Report Count (1),
Report Size (8),
Input (Variable),
Usage (55h),
Logical Maximum (10),
Feature (Variable),
End Collection,
Usage (0Eh),
Collection (Application),
Report ID (4),
Usage (23h),
Collection (Logical),
Usage (52h),
Logical Minimum (0),
Logical Maximum (10),
Report Size (8),
Report Count (1),
Feature (Variable),
End Collection,
End Collection,
Usage Page (FF00h), ; FF00h, vendor-defined
Usage (01h),
Collection (Application),
Report ID (250),
Usage (01h),
Usage Minimum (01h),
Usage Maximum (3Fh),
Logical Minimum (0),
Logical Maximum (-1),
Report Size (8),
Report Count (63),
Input (Variable),
Report ID (18),
Usage (02h),
Usage Minimum (01h),
Usage Maximum (3Fh),
Output (Variable),
Report ID (16),
Usage (03h),
Usage Minimum (01h),
Usage Maximum (3Fh),
Logical Minimum (0),
Logical Maximum (-1),
Report Size (8),
Report Count (7),
Feature (Variable),
End Collection
The input report 02 decodes as the following C-language structure (see below). Basically it comprises a 1-byte report id, 10 lots of finger touch data (flags, contact id, x, y), and a 1-byte contact count (i.e. number of touches detected). The total input report length should be 1+10x(1+1+2+2)+1 = 62 bytes:
//--------------------------------------------------------------------------------
// Digitizer Device Page inputReport 02 (Device --> Host)
//--------------------------------------------------------------------------------
typedef struct
{
uint8_t reportId; // Report ID = 0x02 (2)
// Collection: CA:TouchScreen CL:Finger
uint8_t DIG_TouchScreenFingerTipSwitch : 1; // Usage 0x000D0042: Tip Switch, Value = 0 to 1
uint8_t DIG_TouchScreenFingerInRange : 1; // Usage 0x000D0032: In Range, Value = 0 to 1
uint8_t DIG_TouchScreenFingerConfidence : 1; // Usage 0x000D0047: Confidence, Value = 0 to 1
uint8_t : 1; // Pad
uint8_t : 1; // Pad
uint8_t : 1; // Pad
uint8_t : 1; // Pad
uint8_t : 1; // Pad
uint8_t DIG_TouchScreenFingerContactIdentifier; // Usage 0x000D0051: Contact Identifier, Value = 0 to 1
uint16_t GD_TouchScreenFingerX; // Usage 0x00010030: X, Value = 0 to 32767
uint16_t GD_TouchScreenFingerY; // Usage 0x00010031: Y, Value = 0 to 32767
uint8_t DIG_TouchScreenFingerTipSwitch_1 : 1; // Usage 0x000D0042: Tip Switch, Value = 0 to 1
uint8_t DIG_TouchScreenFingerInRange_1 : 1; // Usage 0x000D0032: In Range, Value = 0 to 1
uint8_t DIG_TouchScreenFingerConfidence_1 : 1; // Usage 0x000D0047: Confidence, Value = 0 to 1
uint8_t : 1; // Pad
uint8_t : 1; // Pad
uint8_t : 1; // Pad
uint8_t : 1; // Pad
uint8_t : 1; // Pad
uint8_t DIG_TouchScreenFingerContactIdentifier_1; // Usage 0x000D0051: Contact Identifier, Value = 0 to 1
uint16_t GD_TouchScreenFingerX_1; // Usage 0x00010030: X, Value = 0 to 32767
uint16_t GD_TouchScreenFingerY_1; // Usage 0x00010031: Y, Value = 0 to 32767
uint8_t DIG_TouchScreenFingerTipSwitch_2 : 1; // Usage 0x000D0042: Tip Switch, Value = 0 to 1
uint8_t DIG_TouchScreenFingerInRange_2 : 1; // Usage 0x000D0032: In Range, Value = 0 to 1
uint8_t DIG_TouchScreenFingerConfidence_2 : 1; // Usage 0x000D0047: Confidence, Value = 0 to 1
uint8_t : 1; // Pad
uint8_t : 1; // Pad
uint8_t : 1; // Pad
uint8_t : 1; // Pad
uint8_t : 1; // Pad
uint8_t DIG_TouchScreenFingerContactIdentifier_2; // Usage 0x000D0051: Contact Identifier, Value = 0 to 1
uint16_t GD_TouchScreenFingerX_2; // Usage 0x00010030: X, Value = 0 to 32767
uint16_t GD_TouchScreenFingerY_2; // Usage 0x00010031: Y, Value = 0 to 32767
uint8_t DIG_TouchScreenFingerTipSwitch_3 : 1; // Usage 0x000D0042: Tip Switch, Value = 0 to 1
uint8_t DIG_TouchScreenFingerInRange_3 : 1; // Usage 0x000D0032: In Range, Value = 0 to 1
uint8_t DIG_TouchScreenFingerConfidence_3 : 1; // Usage 0x000D0047: Confidence, Value = 0 to 1
uint8_t : 1; // Pad
uint8_t : 1; // Pad
uint8_t : 1; // Pad
uint8_t : 1; // Pad
uint8_t : 1; // Pad
uint8_t DIG_TouchScreenFingerContactIdentifier_3; // Usage 0x000D0051: Contact Identifier, Value = 0 to 1
uint16_t GD_TouchScreenFingerX_3; // Usage 0x00010030: X, Value = 0 to 32767
uint16_t GD_TouchScreenFingerY_3; // Usage 0x00010031: Y, Value = 0 to 32767
uint8_t DIG_TouchScreenFingerTipSwitch_4 : 1; // Usage 0x000D0042: Tip Switch, Value = 0 to 1
uint8_t DIG_TouchScreenFingerInRange_4 : 1; // Usage 0x000D0032: In Range, Value = 0 to 1
uint8_t DIG_TouchScreenFingerConfidence_4 : 1; // Usage 0x000D0047: Confidence, Value = 0 to 1
uint8_t : 1; // Pad
uint8_t : 1; // Pad
uint8_t : 1; // Pad
uint8_t : 1; // Pad
uint8_t : 1; // Pad
uint8_t DIG_TouchScreenFingerContactIdentifier_4; // Usage 0x000D0051: Contact Identifier, Value = 0 to 1
uint16_t GD_TouchScreenFingerX_4; // Usage 0x00010030: X, Value = 0 to 32767
uint16_t GD_TouchScreenFingerY_4; // Usage 0x00010031: Y, Value = 0 to 32767
uint8_t DIG_TouchScreenFingerTipSwitch_5 : 1; // Usage 0x000D0042: Tip Switch, Value = 0 to 1
uint8_t DIG_TouchScreenFingerInRange_5 : 1; // Usage 0x000D0032: In Range, Value = 0 to 1
uint8_t DIG_TouchScreenFingerConfidence_5 : 1; // Usage 0x000D0047: Confidence, Value = 0 to 1
uint8_t : 1; // Pad
uint8_t : 1; // Pad
uint8_t : 1; // Pad
uint8_t : 1; // Pad
uint8_t : 1; // Pad
uint8_t DIG_TouchScreenFingerContactIdentifier_5; // Usage 0x000D0051: Contact Identifier, Value = 0 to 1
uint16_t GD_TouchScreenFingerX_5; // Usage 0x00010030: X, Value = 0 to 32767
uint16_t GD_TouchScreenFingerY_5; // Usage 0x00010031: Y, Value = 0 to 32767
uint8_t DIG_TouchScreenFingerTipSwitch_6 : 1; // Usage 0x000D0042: Tip Switch, Value = 0 to 1
uint8_t DIG_TouchScreenFingerInRange_6 : 1; // Usage 0x000D0032: In Range, Value = 0 to 1
uint8_t DIG_TouchScreenFingerConfidence_6 : 1; // Usage 0x000D0047: Confidence, Value = 0 to 1
uint8_t : 1; // Pad
uint8_t : 1; // Pad
uint8_t : 1; // Pad
uint8_t : 1; // Pad
uint8_t : 1; // Pad
uint8_t DIG_TouchScreenFingerContactIdentifier_6; // Usage 0x000D0051: Contact Identifier, Value = 0 to 1
uint16_t GD_TouchScreenFingerX_6; // Usage 0x00010030: X, Value = 0 to 32767
uint16_t GD_TouchScreenFingerY_6; // Usage 0x00010031: Y, Value = 0 to 32767
uint8_t DIG_TouchScreenFingerTipSwitch_7 : 1; // Usage 0x000D0042: Tip Switch, Value = 0 to 1
uint8_t DIG_TouchScreenFingerInRange_7 : 1; // Usage 0x000D0032: In Range, Value = 0 to 1
uint8_t DIG_TouchScreenFingerConfidence_7 : 1; // Usage 0x000D0047: Confidence, Value = 0 to 1
uint8_t : 1; // Pad
uint8_t : 1; // Pad
uint8_t : 1; // Pad
uint8_t : 1; // Pad
uint8_t : 1; // Pad
uint8_t DIG_TouchScreenFingerContactIdentifier_7; // Usage 0x000D0051: Contact Identifier, Value = 0 to 1
uint16_t GD_TouchScreenFingerX_7; // Usage 0x00010030: X, Value = 0 to 32767
uint16_t GD_TouchScreenFingerY_7; // Usage 0x00010031: Y, Value = 0 to 32767
uint8_t DIG_TouchScreenFingerTipSwitch_8 : 1; // Usage 0x000D0042: Tip Switch, Value = 0 to 1
uint8_t DIG_TouchScreenFingerInRange_8 : 1; // Usage 0x000D0032: In Range, Value = 0 to 1
uint8_t DIG_TouchScreenFingerConfidence_8 : 1; // Usage 0x000D0047: Confidence, Value = 0 to 1
uint8_t : 1; // Pad
uint8_t : 1; // Pad
uint8_t : 1; // Pad
uint8_t : 1; // Pad
uint8_t : 1; // Pad
uint8_t DIG_TouchScreenFingerContactIdentifier_8; // Usage 0x000D0051: Contact Identifier, Value = 0 to 1
uint16_t GD_TouchScreenFingerX_8; // Usage 0x00010030: X, Value = 0 to 32767
uint16_t GD_TouchScreenFingerY_8; // Usage 0x00010031: Y, Value = 0 to 32767
uint8_t DIG_TouchScreenFingerTipSwitch_9 : 1; // Usage 0x000D0042: Tip Switch, Value = 0 to 1
uint8_t DIG_TouchScreenFingerInRange_9 : 1; // Usage 0x000D0032: In Range, Value = 0 to 1
uint8_t DIG_TouchScreenFingerConfidence_9 : 1; // Usage 0x000D0047: Confidence, Value = 0 to 1
uint8_t : 1; // Pad
uint8_t : 1; // Pad
uint8_t : 1; // Pad
uint8_t : 1; // Pad
uint8_t : 1; // Pad
uint8_t DIG_TouchScreenFingerContactIdentifier_9; // Usage 0x000D0051: Contact Identifier, Value = 0 to 1
uint16_t GD_TouchScreenFingerX_9; // Usage 0x00010030: X, Value = 0 to 32767
uint16_t GD_TouchScreenFingerY_9; // Usage 0x00010031: Y, Value = 0 to 32767
// Collection: CA:TouchScreen
uint8_t DIG_TouchScreenContactCount; // Usage 0x000D0054: Contact Count, Value = 0 to 10
} inputReport02_t;
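As a rough sketch of how the layout above maps onto the raw bytes, the function below decodes one report by byte offset instead of overlaying the struct. It assumes the 62-byte layout described above, little-endian 16-bit fields and bit fields allocated from the least significant bit (both as specified by HID). The names decode_touch_report and touch_contact_t are made up for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define REPORT_ID_TOUCH 0x02
#define MAX_CONTACTS    10
#define CONTACT_SIZE    6   /* flags (1) + contact id (1) + X (2) + Y (2) */
#define REPORT_SIZE     (1 + MAX_CONTACTS * CONTACT_SIZE + 1)   /* 62 bytes */

/* Hypothetical unpacked view of one finger slot. */
typedef struct {
    uint8_t  tip_switch;   /* bit 0 of the flags byte */
    uint8_t  in_range;     /* bit 1 of the flags byte */
    uint8_t  confidence;   /* bit 2 of the flags byte */
    uint8_t  contact_id;
    uint16_t x;
    uint16_t y;
} touch_contact_t;

/* Decodes input report 02 from buf into out[0..9].
 * Returns the contact count byte, or -1 if the buffer is too short
 * or does not start with report ID 2. */
int decode_touch_report(const uint8_t *buf, size_t len,
                        touch_contact_t out[MAX_CONTACTS])
{
    if (len < REPORT_SIZE || buf[0] != REPORT_ID_TOUCH)
        return -1;

    for (int i = 0; i < MAX_CONTACTS; i++) {
        const uint8_t *p = buf + 1 + i * CONTACT_SIZE;
        out[i].tip_switch = p[0] & 0x01;
        out[i].in_range   = (p[0] >> 1) & 0x01;
        out[i].confidence = (p[0] >> 2) & 0x01;
        out[i].contact_id = p[1];
        out[i].x = (uint16_t)(p[2] | (p[3] << 8));   /* little-endian */
        out[i].y = (uint16_t)(p[4] | (p[5] << 8));
    }
    return buf[REPORT_SIZE - 1];   /* contact count is the last byte */
}
```

With this layout a flags byte of 07 means tip switch, in range and confidence are all set, and 04 means only confidence is set, which matches the 04/07 values seen in the stream. Reading by offset also sidesteps alignment: if the struct is overlaid on the buffer directly, a compiler may insert padding before the uint16_t fields unless the struct is declared packed.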
Related
Largest set of different byte values unique when clearing bits
I am creating a data format, which will be stored in a DS2431 1-wire EEPROM. One page will be using EPROM emulation mode (where data once written can only be modified by clearing bits). In this page I want to store a byte with an ID, which cannot be changed to another valid value (due to only allowing clearing bits). I am considering using the set of values that have a popcount of 4 (there are 70 different values). Clearing any bits means popcount is no longer 4, so this satisfies the desired property. But could a set of byte values be found with more than 70 different values, that satisfy the property?
No. For an 8-bit value, using four set bits is optimal. If you have your 70 values with four set bits and decide to add a value with five set bits as valid, you have to give up the five four-set-bit values that can be created from it by clearing a bit. Similarly, if you want a valid value with three set bits, you also have to give up five values with four set bits (the ones that can be cleared down to it). If you could increase the number of bits, then you could increase the ratio of possible values to bits used.
Since there are only 256 possible values and 9 possible population counts (0 through 8), it is a trivial task to test all possible population counts:

#include <stdio.h>
#include <stdint.h>

int popcount( uint8_t byte )
{
    int count = 0 ;
    for( uint8_t b = 0x01; b != 0; b <<= 1 )
    {
        count = count + (((byte & b) != 0) ? 1 : 0) ;
    }
    return count ;
}

int main()
{
    int valuecount[9] = {0} ;   // one slot per population count, 0 through 8
    for( int i = 0; i < 256; i++ )
    {
        valuecount[popcount(i)]++ ;
    }
    printf( "popcount\tvalues\n") ;
    for( int p = 0; p < 9; p++ )
    {
        printf( " %d\t\t %d\n", p, valuecount[p] ) ;
    }
    return 0;
}

Result:

popcount    values
 0           1
 1           8
 2           28
 3           56
 4           70
 5           56
 6           28
 7           8
 8           1

The optimum population count for any word length n is always n / 2. For 16 bits, the number of values with 8 1-bits is 12870.
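The property the question asks for can also be checked directly by brute force. The sketch below (function names are made up for illustration) counts the bytes with exactly four set bits and verifies that none of them can be turned into a different valid value by only clearing bits, i.e. that no valid value's set bits are a subset of another's.

```c
#include <stdint.h>

/* Counts set bits by repeatedly clearing the lowest one. */
static int popcount8(uint8_t v)
{
    int count = 0;
    while (v) {
        v &= (uint8_t)(v - 1);
        count++;
    }
    return count;
}

/* A byte is a valid ID if exactly four of its bits are set. */
static int is_valid_id(uint8_t v)
{
    return popcount8(v) == 4;
}

/* Number of valid IDs among all 256 byte values. */
int count_valid_ids(void)
{
    int count = 0;
    for (int v = 0; v < 256; v++)
        count += is_valid_id((uint8_t)v);
    return count;
}

/* Returns 1 if no valid ID can be changed into a different valid ID
 * by clearing bits only, 0 otherwise. */
int ids_are_clear_safe(void)
{
    for (int a = 0; a < 256; a++) {
        for (int b = 0; b < 256; b++) {
            if (a == b || !is_valid_id((uint8_t)a) || !is_valid_id((uint8_t)b))
                continue;
            if ((a & b) == b)   /* every set bit of b is also set in a */
                return 0;
        }
    }
    return 1;
}
```

Changing the test in is_valid_id is an easy way to try other candidate sets against the same check.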
Find nth int with 10 set bits
Find the nth int with 10 set bits.

n is an int in the range 0 <= n <= 30 045 014.
The 0th int = 1023, the 1st = 1535, and so on.

snob() (same number of bits) returns the lowest integer bigger than n with the same number of set bits as n:

int snob(int n) {
    int a = n & -n, b = a + n;
    return b | (n ^ b) / a >> 2;
}

Calling snob n times will work:

int nth(int n) {
    int o = 1023;
    for (int i = 0; i < n; i++) o = snob(o);
    return o;
}

Example: https://ideone.com/ikGNo7

Is there some way to find it faster? I found one pattern but I'm not sure if it's useful. Using factorials you can find the "indexes" where all 10 set bits are consecutive:

1023 << x = the ((x+10)! / (x! * 10!) - 1)th integer

1023<<1 is the 10th
1023<<2 is the 65th
1023<<3 is the 285th
...

Btw I'm not a student and this is not homework.

EDIT: Found an alternative to snob(): https://graphics.stanford.edu/~seander/bithacks.html#NextBitPermutation

int lnbp(int v) {
    int t = (v | (v - 1)) + 1;
    return t | ((((t & -t) / (v & -v)) >> 1) - 1);
}
I have built an implementation that should satisfy your needs.

/** A lookup table to see how many combinations preceded this one */
private static int[][] LOOKUP_TABLE_COMBINATION_POS;
/** The number of possible combinations with i bits */
private static int[] NBR_COMBINATIONS;

static {
    LOOKUP_TABLE_COMBINATION_POS = new int[Integer.SIZE][Integer.SIZE];
    for (int bit = 0; bit < Integer.SIZE; bit++) {
        // Ignore less significant bits, compute how many combinations have to be
        // visited to set this bit, i.e.
        // (bit = 4, pos = 5), before came 0b1XXX and 0b1XXXX, that's C(3, 3) + C(4, 3)
        int nbrBefore = 0;
        // The nth-bit can be only encountered after pos n
        for (int pos = bit; pos < Integer.SIZE; pos++) {
            LOOKUP_TABLE_COMBINATION_POS[bit][pos] = nbrBefore;
            nbrBefore += nChooseK(pos, bit);
        }
    }
    NBR_COMBINATIONS = new int[Integer.SIZE + 1];
    for (int bits = 0; bits < NBR_COMBINATIONS.length; bits++) {
        NBR_COMBINATIONS[bits] = nChooseK(Integer.SIZE, bits);
        // Important for modulo check. Otherwise we must use unsigned arithmetic
        assert NBR_COMBINATIONS[bits] > 0;
    }
}

private static int nChooseK(int n, int k) {
    assert k >= 0 && k <= n;
    if (k > n / 2) {
        k = n - k;
    }
    long nCk = 1; // (N choose 0)
    for (int i = 0; i < k; i++) {
        // (N choose K+1) = (N choose K) * (n-k) / (k+1);
        nCk *= (n - i);
        nCk /= (i + 1);
    }
    return (int) nCk;
}

public static int nextCombination(int w, int n) {
    // TODO: maybe for small n just advance naively
    // Get the position of the current pattern w
    int nbrBits = 0;
    int position = 0;
    while (w != 0) {
        final int currentBit = Integer.lowestOneBit(w); // w & -w;
        final int bitPos = Integer.numberOfTrailingZeros(currentBit);
        position += LOOKUP_TABLE_COMBINATION_POS[nbrBits][bitPos];
        // toggle off bit
        w ^= currentBit;
        nbrBits++;
    }
    position += n;
    // Wrapping, optional
    position %= NBR_COMBINATIONS[nbrBits];
    // And reverse lookup
    int v = 0;
    int m = Integer.SIZE - 1;
    while (nbrBits-- > 0) {
        final int[] bitPositions = LOOKUP_TABLE_COMBINATION_POS[nbrBits];
        // Search for largest bitPos such that position >= bitPositions[bitPos]
        while (Integer.compareUnsigned(position, bitPositions[m]) < 0)
            m--;
        position -= bitPositions[m];
        v ^= (0b1 << m--);
    }
    return v;
}

Now for some explanation. LOOKUP_TABLE_COMBINATION_POS[bit][pos] is the core of the algorithm that makes it as fast as it is. The table is designed so that a bit pattern with k bits at positions p_0 < p_1 < ... < p_{k - 1} has a position of \sum_{i = 0}^{k - 1}{ LOOKUP_TABLE_COMBINATION_POS[i][p_i] }.

The intuition is that we try to move back the bits one by one until we reach the pattern where all bits are at the lowest possible positions. Moving the i-th bit from position k + 1 to k moves back by C(k-1, i-1) positions, provided that all lower bits are at the right-most position (no moving bits into or through each other), since we skip over all possible combinations with the i-1 bits in k-1 slots.

We can thus "decode" a bit pattern to a position, keeping track of the bits encountered. We then advance by n positions (rolling over in case we enumerated all possible positions for k bits) and encode this position again.

To encode a pattern, we reverse the process. For this, we move bits from their starting position forward, as long as the position is smaller than what we're aiming for. We could, instead of a linear search through LOOKUP_TABLE_COMBINATION_POS, employ a binary search for our target index m, but it's hardly needed, since the size of an int is not big. Nevertheless, we reuse our invariant that a smaller bit must also come at a less significant position, so that our algorithm is effectively O(n) where n = Integer.SIZE.

I remain with the following assertions to show the resulting algorithm:

nextCombination(0b1111111111, 1) == 0b10111111111;
nextCombination(0b1111111111, 10) == 0b11111111110;
nextCombination(0x00FF , 4) == 0x01EF;
nextCombination(0x7FFFFFFF , 4) == 0xF7FFFFFF;
nextCombination(0x03FF , 10) == 0x07FE;
// Correct wrapping
nextCombination(0b1 , 32) == 0b1;
nextCombination(0x7FFFFFFF , 32) == 0x7FFFFFFF;
nextCombination(0xFFFFFFEF , 5) == 0x7FFFFFFF;
Let us consider the numbers with k=10 bits set. The trick is to determine the rank of the most significant one, for a given n.

There is a single number of length k: C(k, k) = 1. There are k + 1 = C(k+1, k) numbers of length at most k + 1. ... There are C(m, k) numbers of length at most m.

For k=10, the limits on n are the partial sums of 1 + 10 + 55 + 220 + 715 + 2002 + 5005 + 11440 + ... (one term per additional bit of length).

For a given n, you easily find the corresponding m. Then the problem is reduced to finding the (n - C(m, k))-th number with k - 1 bits set. And so on recursively.

With precomputed tables, this can be very fast. 30045015 takes 30 lookups, so that I guess that the worst case is 29 x 30 / 2 = 435 lookups. (This is based on linear lookups, to favor small values. By means of dichotomic search, you reduce this to less than 29 x lg(30) = 145 lookups at worst.)

Update: My previous estimates were pessimistic. Indeed, as we are looking for k bits, there are only 10 determinations of m. In the linear case, at worst 245 lookups; in the dichotomic case, less than 50. (I don't exclude off-by-one errors in the estimates, but clearly this method is very efficient and requires no snob.)
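The recursive idea above can be sketched in C as follows. This is only an illustration under stated assumptions: the function names are made up, binomials are computed on the fly instead of being read from precomputed tables, and n is assumed to be smaller than C(32, k). For each remaining bit count i, from k down to 1, it picks the highest position c with C(c, i) <= n, sets that bit, and subtracts C(c, i) from n.

```c
#include <stdint.h>

/* C(n, k) by the multiplicative formula; every intermediate value is
 * itself a binomial coefficient, so the division is always exact. */
static uint64_t binom(unsigned n, unsigned k)
{
    if (k > n)
        return 0;
    if (k > n - k)
        k = n - k;
    uint64_t r = 1;
    for (unsigned i = 1; i <= k; i++)
        r = r * (n - k + i) / i;
    return r;
}

/* Returns the n-th (0-based) 32-bit value with exactly k set bits,
 * in increasing numeric order. Assumes n < C(32, k). */
uint32_t nth_with_k_bits(uint64_t n, unsigned k)
{
    uint32_t result = 0;
    unsigned c = 31;                    /* highest candidate bit position */
    for (unsigned i = k; i >= 1; i--) {
        while (binom(c, i) > n)         /* largest c with C(c, i) <= n */
            c--;
        result |= (uint32_t)1 << c;
        n -= binom(c, i);
        c--;                            /* the next bit must sit lower */
    }
    return result;
}
```

For k = 10 this reproduces the values from the question: index 0 gives 1023, index 1 gives 1535, and indexes 10 and 65 give 1023<<1 and 1023<<2. Replacing binom with a lookup table would bring it closer to the lookup counts estimated above.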
SystemVerilog Instantiated Modules Share Inputs When They Shouldn't (Easy Solution)?
I am having a small issue here when I instantiate my modules. I am using a generate loop to create 100 instances of 2 counters (16 & 32 bit counters). Each counter should have its own independent controls (UPDN & EN), but they share a clock and a reset.

Module Descriptions:

SAT_COUNTER.sv // SIMPLE COUNTER MODULE
TWO_SC.sv // INSTANTIATES TWO SAT_COUNTER MODULES (16 BIT & 32 BIT COUNTERS)
GEN_SC.sv // INSTANTIATES 100 MODULES OF TWO_SC MODULES
tb_GEN_SC.sv // TESTBENCH

I am sure that my problem is in the GEN_SC module where I instantiate all 100. I appreciate any help! Thank you in advance!

module SAT_COUNTER(
    COUNT,   // SCALABLE COUNT OUTPUT
    CLK,     // CLOCK
    al_RST,  // ACTIVE LOW RESET
    UPDN,    // COUNTER WILL COUNT: UP = 1; DN = 0;
    EN);     // ENABLE

    parameter WIDTH = 8;
    input CLK, al_RST, UPDN, EN;
    output reg [WIDTH-1:0] COUNT;
    ...
endmodule

//**********************
module TWO_SC(
    COUNT1,  // N-BIT COUNTER OUTPUT
    COUNT2,  // M-BIT COUNTER OUTPUT
    CLK,     // CLOCK
    al_RST,  // ACTIVE-LOW RESET
    UPDN,    // DIR. CONTROL
    EN);     // ENABLE

    parameter WIDTH1 = 16;
    parameter WIDTH2 = 32;
    input CLK, al_RST;
    input [1:0] UPDN, EN;
    output [WIDTH1-1:0] COUNT1;
    output [WIDTH2-1:0] COUNT2;

    SAT_COUNTER #(WIDTH1) GSC1(.COUNT(COUNT1), .CLK(CLK), .al_RST(al_RST), .UPDN(UPDN[0]), .EN(EN[0]));
    SAT_COUNTER #(WIDTH2) GSC2(.COUNT(COUNT2), .CLK(CLK), .al_RST(al_RST), .UPDN(UPDN[1]), .EN(EN[1]));
endmodule

//**********************
module GEN_SC(
    COUNT1,  // COUNT1
    COUNT2,  // COUNT2
    CLK,     // CLOCK
    al_RST,  // ACTIVE-LOW RESET
    UPDN,    // DIR. CONTROL
    EN);     // ENABLE

    parameter MOD_COUNT = 100;
    parameter WIDTH1 = 16;
    parameter WIDTH2 = 32;
    input CLK, al_RST;
    input [1:0] UPDN [MOD_COUNT-1:0];
    input [1:0] EN [MOD_COUNT-1:0];
    output [WIDTH1-1:0] COUNT1;
    output [WIDTH2-1:0] COUNT2;

    genvar j;
    generate
        for(j = 0; j < MOD_COUNT; j++) begin: SC
            TWO_SC #(.WIDTH1(WIDTH1), .WIDTH2(WIDTH2)) TWOCOUNTERS(.COUNT1(COUNT1), .COUNT2(COUNT2), .CLK(CLK), .al_RST(al_RST), .UPDN(UPDN[j]), .EN(EN[j]));
        end
    endgenerate
endmodule

//**********************
module tb_GEN_SC();
    parameter MOD_COUNT = 100;
    parameter WIDTH1 = 16;
    parameter WIDTH2 = 32;
    reg CLK, al_RST;
    reg [1:0] UPDN [MOD_COUNT-1:0];
    reg [1:0] EN [MOD_COUNT-1:0];
    wire [WIDTH1-1:0] COUNT1;
    wire [WIDTH2-1:0] COUNT2;

    GEN_SC #(.WIDTH1(WIDTH1), .WIDTH2(WIDTH2)) UUT(COUNT1, COUNT2, CLK, al_RST, UPDN, EN);

    initial begin
        CLK = 1'b1;
        forever #5 CLK = ~CLK;
    end

    initial $monitorb("%d COUNT = %b (%d) | UPDN = %b | EN = %b | COUNT = %b (%d) | UPDN = %b | EN = %b | al_RST = %b | CLK = %b",
        $time,
        UUT.SC[87].TWOCOUNTERS.COUNT1, UUT.SC[87].TWOCOUNTERS.COUNT1, UUT.SC[87].TWOCOUNTERS.UPDN[0], UUT.SC[87].TWOCOUNTERS.EN[0],
        UUT.SC[99].TWOCOUNTERS.COUNT1, UUT.SC[99].TWOCOUNTERS.COUNT1, UUT.SC[99].TWOCOUNTERS.UPDN[0], UUT.SC[99].TWOCOUNTERS.EN[0],
        al_RST, CLK);

    initial begin
        $vcdpluson;
        UUT.SC[87].TWOCOUNTERS.GSC1.UPDN = 1; UUT.SC[99].TWOCOUNTERS.GSC1.UPDN = 1; EN = 0; al_RST = 1;
        #10 UUT.SC[99].TWOCOUNTERS.GSC1.UPDN = 0; al_RST = 0; // RESET COUNTER
        #10 EN = 1; al_RST = 1; // ENABLE COUNTER AND COUNT UP (HITS MAX)
        #200 UUT.SC[87].TWOCOUNTERS.GSC1.UPDN = 0; UUT.SC[99].TWOCOUNTERS.GSC1.UPDN = 1; EN = 1; // BEGIN TO COUNT DOWN
        #10 EN = 0;
        #60 EN = 3; // #230 UPDN = 1; UPDN = 0;
        #3017 al_RST = 0;
        #100 al_RST = 1;
        #20 $finish;
    end

/////////// ERRORS I GET /////////////////

Error-[IBLHS-NT] Illegal behavioral left hand side
tb_GEN_SC.sv, 34
Net type cannot be used on the left side of this assignment.
The offending expression is : tb_GEN_SC.UUT.SC[87].TWOCOUNTERS.GSC1.UPDN
Source info: tb_GEN_SC.UUT.SC[87].TWOCOUNTERS.GSC1.UPDN = 1;

Error-[IBLHS-NT] Illegal behavioral left hand side
tb_GEN_SC.sv, 34
Net type cannot be used on the left side of this assignment.
The offending expression is : tb_GEN_SC.UUT.SC[99].TWOCOUNTERS.GSC1.UPDN
Source info: tb_GEN_SC.UUT.SC[99].TWOCOUNTERS.GSC1.UPDN = 1;

Error-[IUDA] Incompatible dimensions
tb_GEN_SC.sv, 34
Incompatible unpacked dimensions in assignment
Arrays with incompatible unpacked dimensions cannot be used in assignments, initializations and instantiations.

Error-[ICTA] Incompatible complex type
tb_GEN_SC.sv, 34
Incompatible complex type assignment
Type of source expression is incompatible with type of target expression. Mismatching types cannot be used in assignments, initializations and instantiations.
The type of the target is 'reg[1:0]$[99:0]', while the type of the source is 'int'.
Source Expression: 0
You have just one UPDN and one EN port in the port list, so how do you want to apply different UPDN and EN values to the instances? One idea is to define an array of size MOD_COUNT so that each element has its own control input. Then, in the genvar loop, you can use the index, like this:

input [1:0] UPDN [MOD_COUNT-1:0];
input [1:0] EN [MOD_COUNT-1:0];
...
generate
    for(j = 0; j < MOD_COUNT; j++) begin: SC
        TWO_SC #(.WIDTH1(WIDTH1), .WIDTH2(WIDTH2)) TWOCOUNTERS(.COUNT1(COUNT1), .COUNT2(COUNT2), .CLK(CLK), .al_RST(al_RST), .UPDN(UPDN[j]), .EN(EN[j]));
    end
endgenerate
accelerate framework cepstrum peak find
I'm trying to find the peak values of a cepstrum analysis with the Accelerate framework. I always get peak values at the end or at the beginning of the frames. I'm analysing in real time, getting audio from the microphone. What is wrong with my code? My code is below:

OSStatus microphoneInputCallback (void *inRefCon,
                                  AudioUnitRenderActionFlags *ioActionFlags,
                                  const AudioTimeStamp *inTimeStamp,
                                  UInt32 inBusNumber,
                                  UInt32 inNumberFrames,
                                  AudioBufferList *ioData){

    // get reference of test app we need for test app attributes
    TestApp *this = (TestApp *)inRefCon;
    COMPLEX_SPLIT complexArray = this->fftA;
    void *dataBuffer = this->dataBuffer;
    float *outputBuffer = this->outputBuffer;
    FFTSetup fftSetup = this->fftSetup;

    uint32_t log2n = this->fftLog2n;
    uint32_t n = this->fftN; // 4096
    uint32_t nOver2 = this->fftNOver2;
    uint32_t stride = 1;
    int bufferCapacity = this->fftBufferCapacity; // 4096
    SInt16 index = this->fftIndex;

    OSStatus renderErr;

    // observation objects
    float *observerBufferRef = this->observerBuffer;
    int observationCountRef = this->observationCount;

    renderErr = AudioUnitRender(rioUnit, ioActionFlags,
                                inTimeStamp, bus1, inNumberFrames, this->bufferList);
    if (renderErr < 0) {
        return renderErr;
    }

    // Fill the buffer with our sampled data. If we fill our buffer, run the
    // fft.
    int read = bufferCapacity - index;
    if (read > inNumberFrames) {
        memcpy((SInt16 *)dataBuffer + index, this->bufferList->mBuffers[0].mData, inNumberFrames*sizeof(SInt16));
        this->fftIndex += inNumberFrames;
    } else {
        // If we enter this conditional, our buffer will be filled and we should PERFORM FFT.
        memcpy((SInt16 *)dataBuffer + index, this->bufferList->mBuffers[0].mData, read*sizeof(SInt16));

        // Reset the index.
        this->fftIndex = 0;

        /*************** FFT ***************/

        //multiply by window
        vDSP_vmul((SInt16 *)dataBuffer, 1, this->window, 1, this->outputBuffer, 1, n);

        // We want to deal with only floating point values here.
        vDSP_vflt16((SInt16 *) dataBuffer, stride, (float *) outputBuffer, stride, bufferCapacity );

        /**
         Look at the real signal as an interleaved complex vector by casting it.
         Then call the transformation function vDSP_ctoz to get a split complex
         vector, which for a real signal, divides into an even-odd configuration.
         */
        vDSP_ctoz((COMPLEX*)outputBuffer, 2, &complexArray, 1, nOver2);

        // Carry out a Forward FFT transform.
        vDSP_fft_zrip(fftSetup, &complexArray, stride, log2n, FFT_FORWARD);

        vDSP_ztoc(&complexArray, 1, (COMPLEX *)outputBuffer, 2, nOver2);

        complexArray.imagp[0] = 0.0f;
        vDSP_zvmags(&complexArray, 1, complexArray.realp, 1, nOver2);
        bzero(complexArray.imagp, (nOver2) * sizeof(float));

        // scale
        float scale = 1.0f / (2.0f*(float)n);
        vDSP_vsmul(complexArray.realp, 1, &scale, complexArray.realp, 1, nOver2);

        // step 2 get log for cepstrum
        float *logmag = malloc(sizeof(float)*nOver2);
        for (int i=0; i < nOver2; i++)
            logmag[i] = logf(sqrtf(complexArray.realp[i]));

        // configure float array into acceptable input array format (interleaved)
        vDSP_ctoz((COMPLEX*)logmag, 2, &complexArray, 1, nOver2);

        // create cepstrum
        vDSP_fft_zrip(fftSetup, &complexArray, stride, log2n-1, FFT_INVERSE);

        //convert interleaved to real
        float *displayData = malloc(sizeof(float)*n);
        vDSP_ztoc(&complexArray, 1, (COMPLEX*)displayData, 2, nOver2);

        float dominantFrequency = 0;
        int currentBin = 0;
        float dominantFrequencyAmp = 0;

        // find peak of cepstrum
        for (int i=0; i < nOver2; i++){
            //get current frequency magnitude
            if (displayData[i] > dominantFrequencyAmp) {
                // DLog("Bufferer filled %f", displayData[i]);
                dominantFrequencyAmp = displayData[i];
                currentBin = i;
            }
        }

        DLog("currentBin : %i amplitude: %f", currentBin, dominantFrequencyAmp);
    }

    return noErr;
}
I haven't worked with the Accelerate framework, but your code appears to take the proper steps to calculate the Cepstrum.
The Cepstrum of real acoustic signals tends to have a very large DC component, that is, a large peak at and near zero quefrency [sic]. Just ignore the near-DC portion of the Cepstrum and look for peaks above 20 Hz frequency (above a quefrency of Cepstrum_Width/20Hz).
If the input signal contains a series of very closely spaced overtones, the Cepstrum will also have a large peak at the high-quefrency end. For example, the plot below shows the Cepstrum of a Dirichlet kernel with N=128 and Width=4096, the spectrum of which is a series of very closely spaced overtones.
You may want to use a static synthetic signal to test and debug your code. A good choice for a test signal is any sinusoid with a fundamental F and several overtones at exact integer multiples of F.
Your Cepstra should look something like the following examples.
First, a synthetic signal. The plot below shows the Cepstrum of a synthetic steady-state E2 note, synthesized using a typical near-DC component, a fundamental at 82.4 Hz, and 8 harmonics at integer multiples of 82.4 Hz. The synthetic sinusoid was programmed to generate 4096 samples. Observe the prominent non-DC peak at 12.36. The Cepstrum width is 1024 (the output of the second FFT), so the peak corresponds to 1024/12.36 = 82.8 Hz, which is very close to the true fundamental frequency of 82.4 Hz.
Now, a real acoustic signal. The plot below shows the Cepstrum of a real acoustic guitar's E2 note. The signal was not windowed prior to the first FFT. Observe the prominent non-DC peak at 542.9. The Cepstrum width is 32768 (the output of the second FFT), so the peak corresponds to 32768/542.9 = 60.4 Hz, which is fairly far from the true fundamental frequency of 82.4 Hz.
The plot below shows the Cepstrum of the same acoustic guitar's E2 note, but this time the signal was Hann windowed prior to the first FFT. Observe the prominent non-DC peak at 268.46. The Cepstrum width is 32768 (the output of the second FFT), so the peak corresponds to 32768/268.46 = 122.1 Hz, which is even farther from the true fundamental frequency of 82.4 Hz.
The acoustic guitar's E2 note used for this analysis was sampled at 44.1 kHz with a high-quality microphone under studio conditions. It contains essentially zero background noise, no other instruments or voices, and no post-processing.
References: Real audio signal data, synthetic signal generation, plots, FFT, and Cepstral analysis were done here: Musical instrument cepstrum
CUDA Kernel Optimization regarding register
I'm quite new to CUDA and GPU programming. I'm trying to write a kernel for a physics application. The parallelization is done over a quadrature of directions, each direction resulting in a sweep of a 2D Cartesian domain.
Here is the kernel. It actually works well and gives good results. However, the very high number of registers per block leads to spilling to local memory, which severely slows down the code.
__global__ void KERNEL (int imax, int jmax, int mmax, int lg, int lgmax,
                        double *x, double *y, double *qd, double *kappa,
                        double *S, double *G, double *qw, double *SkG,
                        double *Ska,double *a, double *Ljm, int *data)
{
    int m = 1+blockIdx.x*blockDim.x + threadIdx.x ;
    int tid = threadIdx.x ;
    //Var needed for thread execution ...
    extern __shared__ double shared[] ;
    //Read some data from Global mem
    mu = qd[ (m-1)];
    eta = qd[ MSIZE+(m-1)];
    wm = qd[3*MSIZE+(m-1)];
    amu = fabs(mu);
    aeta= fabs(eta);
    ista = data[ (m-1)] ;
    iend = data[1*MSIZE+(m-1)] ;
    istp = data[2*MSIZE+(m-1)] ;
    jsta = data[3*MSIZE+(m-1)] ;
    jend = data[4*MSIZE+(m-1)] ;
    jstp = data[5*MSIZE+(m-1)] ;
    j1 = (1-jstp) ;
    j2 = (1+jstp)/2 ;
    i1 = (1-istp) ;
    i2 = (1+istp)/2 ;
    isw = ista-istp ;
    jsw = jsta-jstp ;
    dy = dx = 1.0e-2 ;
    for(i=1 ; i<=imax; i++)
        Ljm[MSIZE*(i-1)+m] = S[jsw*(imax+2)+i] ;
    //Beginning of the vertical Sweep, can be from left to right,
    // or opposite depending on the thread
    for(j=jsta ; j1*jend + j2*j<=j2*jend + j1*j ; j=j+jstp) {
        Lw = S[j*(imax+2)+isw] ;
        //Beginning of the horizontal Sweep, can be from left to right,
        // or opposite depending on the thread
        for(i=ista ; i1*iend + i2*i<=i2*iend + i1*i ; i=i+istp) {
            ax = dy ;
            Lx = ax*amu/ex ;
            ay = dx ;
            Ly = ay*aeta/ey ;
            dv = ax*ay ;
            L0 = dv*kappaij ;
            Sp = S[j*(imax+2)+i]*dv ;
            Ls = Ljm[MSIZE*(i-1)+m] ;
            Lp = (Lx*Lw+Ly*Ls+Sp)/(Lx+Ly+L0) ;
            Lw = Lw+(Lp-Lw)/ex ;
            Ls = Ls+(Lp-Ls)/ey ;
            Ljm[MSIZE*(i-1)+m] = Ls ;
            shared[tid] = wm*Lp ;
            __syncthreads();
            for (s=16; s>0; s>>=1) {
                if (tid < s) {
                    shared[tid] += shared[tid + s] ;
                }
            }
            if(tid==0)
                atomicAdd(&SkG[imax*(j-1)+(i-1)],shared[tid]*kappaij);
        } // End of horizontal sweep
    } // End of vertical sweep
}
How can I optimize the execution of this code? I run it with 8 blocks of 32 threads. According to the Visual Profiler, the occupancy of this kernel is really low and is limited by registers. I have no idea how to improve it. Thanks!
First of all, you are using blocks of 32 threads, and because of that the kernel's occupancy is too low. Your GPU is running only 256 threads in parallel, but it can run up to 1536 threads per multiprocessor (compute capability 2.x). How many registers are you using? You can also try declaring your variables in their local scope, which helps registers be reused more effectively.
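To see why the block size and the register count both matter, here is a rough host-side occupancy estimate written in plain C. It is a simplified sketch, not the profiler's calculation: the names `SmLimits`, `CC_2X`, `resident_threads` and `occupancy` are made up for illustration, the limits are the compute capability 2.x figures (1536 resident threads, 8 resident blocks and 32768 registers per multiprocessor), and register allocation granularity and shared memory are ignored.

```c
/* Per-multiprocessor limits, here filled in for compute capability 2.x. */
typedef struct {
    int max_threads;  /* resident threads per multiprocessor */
    int max_blocks;   /* resident blocks per multiprocessor  */
    int registers;    /* 32-bit registers per multiprocessor */
} SmLimits;

static const SmLimits CC_2X = { 1536, 8, 32768 };

/* Threads that can be resident on one multiprocessor at once.
   Simplified: ignores allocation granularity and shared memory. */
int resident_threads(SmLimits hw, int threads_per_block, int regs_per_thread)
{
    if (threads_per_block <= 0 || regs_per_thread <= 0)
        return 0;

    int blocks = hw.max_blocks;
    int by_threads = hw.max_threads / threads_per_block;
    int by_regs = hw.registers / (threads_per_block * regs_per_thread);

    if (by_threads < blocks) blocks = by_threads;
    if (by_regs < blocks) blocks = by_regs;
    return blocks * threads_per_block;
}

/* Fraction of the multiprocessor's thread slots that are in use. */
double occupancy(SmLimits hw, int threads_per_block, int regs_per_thread)
{
    return (double)resident_threads(hw, threads_per_block, regs_per_thread)
           / hw.max_threads;
}
```

The register count of the kernel in the question is not known, so 63 registers per thread (the compute capability 2.x maximum) is assumed here for the sake of the arithmetic. With 32-thread blocks the 8-block limit caps residency at 256 threads per multiprocessor, about 17% occupancy, however few registers are used. With 256-thread blocks and 63 registers per thread the register file allows only 2 blocks (512 threads), whereas at 21 registers per thread all 1536 thread slots can be filled.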