AVR: volatile variable resetting to zero - variables

I have an interrupt service routine that contains the variable count and a variable state that changes when count reaches a certain value.
What I want my code to do is change and maintain state for a certain period of time determined by the value of count in the if statements in the ISR.
e.g. I want the variable state to equal 1 for 10 counts;
I want state to equal 0 for 5 counts.
Somewhat similar to changing the duty cycle of a PWM.
The problem I am having is that the variable state resets to zero around the end of the ISR or the end of the if statement, I'm not sure.
After searching for answers, I have found that it might be related to the gcc compiler's optimisation but I cannot find a fix for this problem besides declaring volatile variables, which I have already done.
Any help is appreciated thanks.
My code:
#include <avr/io.h>
#include <avr/interrupt.h>
volatile int count = 0;
volatile int state = 0;
int main(void)
{
cli();
DDRB |= (1 << PB2);
TCCR0B |= (1 << CS02)|(0 << CS01)|(1 << CS00);
TIMSK |= (1 << TOIE0);
sei();
while(1) {
...
}
}
ISR(TIM0_OVF_vect)
{
cli();
// user code here
count = count + 1;
if ((count > 5) && (state < 1) && !(PORTB & (1 << PB2))) {
state = 1;
count = 0;
}
else if ((count > 10) && (state > 0) && (PORTB & (1 << PB2))) {
state = 0;
count = 0;
}
sei();
}

First of all your cli() has no effect in your ISR. When an interrupt happens the I flag automatically cleared and restored at the end of ISR - so sei() is needless as well. However there is rational case when sei() inserted somewhere in the middle of an ISR service to let other ISR's to interrupt it. Though it is not beginner level to build such sophisticated ISR system.
Second: if you set up a conditional structure the elementary conditions should be chained by double operators. The evaluation of (count > 5) & (state < 1) cannot be predicted, correctly written it is (count > 5) && (state < 1). Obviously for 'or'-ing || is used.

The conditions of the if statements look suspicious if you are mimicking a PWM.
Why is the pin of port B being tested? The bodies of the if statements will only be entered if the pin (button) is pressed with the correct frequency.
Do you mean to set the pin on port B instead? Then you would want
count = count + 1;
if ((count > 5) && (state < 1)) {
PORTB &= ~(1 << PB2); // clears PB2
state = 1;
count = 0;
}
else if ((count > 10) && (state > 0)) {
PORTB |= 1 << PB2; // sets PB2
state = 0;
count = 0;
}

Related

Questions about this serial communication code? [Cortex-M4]

I'm looking at the following code from STMicroelectronics on implementing USART communication with interrupts
#include <stm32f10x_lib.h> // STM32F10x Library Definitions
#include <stdio.h>
#include "STM32_Init.h" // STM32 Initialization
/*----------------------------------------------------------------------------
Notes:
The length of the receive and transmit buffers must be a power of 2.
Each buffer has a next_in and a next_out index.
If next_in = next_out, the buffer is empty.
(next_in - next_out) % buffer_size = the number of characters in the buffer.
*----------------------------------------------------------------------------*/
#define TBUF_SIZE 256 /*** Must be a power of 2 (2,4,8,16,32,64,128,256,512,...) ***/
#define RBUF_SIZE 256 /*** Must be a power of 2 (2,4,8,16,32,64,128,256,512,...) ***/
/*----------------------------------------------------------------------------
*----------------------------------------------------------------------------*/
#if TBUF_SIZE < 2
#error TBUF_SIZE is too small. It must be larger than 1.
#elif ((TBUF_SIZE & (TBUF_SIZE-1)) != 0)
#error TBUF_SIZE must be a power of 2.
#endif
#if RBUF_SIZE < 2
#error RBUF_SIZE is too small. It must be larger than 1.
#elif ((RBUF_SIZE & (RBUF_SIZE-1)) != 0)
#error RBUF_SIZE must be a power of 2.
#endif
/*----------------------------------------------------------------------------
*----------------------------------------------------------------------------*/
struct buf_st {
unsigned int in; // Next In Index
unsigned int out; // Next Out Index
char buf [RBUF_SIZE]; // Buffer
};
static struct buf_st rbuf = { 0, 0, };
#define SIO_RBUFLEN ((unsigned short)(rbuf.in - rbuf.out))
static struct buf_st tbuf = { 0, 0, };
#define SIO_TBUFLEN ((unsigned short)(tbuf.in - tbuf.out))
static unsigned int tx_restart = 1; // NZ if TX restart is required
/*----------------------------------------------------------------------------
USART1_IRQHandler
Handles USART1 global interrupt request.
*----------------------------------------------------------------------------*/
void USART1_IRQHandler (void) {
volatile unsigned int IIR;
struct buf_st *p;
IIR = USART1->SR;
if (IIR & USART_FLAG_RXNE) { // read interrupt
USART1->SR &= ~USART_FLAG_RXNE; // clear interrupt
p = &rbuf;
if (((p->in - p->out) & ~(RBUF_SIZE-1)) == 0) {
p->buf [p->in & (RBUF_SIZE-1)] = (USART1->DR & 0x1FF);
p->in++;
}
}
if (IIR & USART_FLAG_TXE) {
USART1->SR &= ~USART_FLAG_TXE; // clear interrupt
p = &tbuf;
if (p->in != p->out) {
USART1->DR = (p->buf [p->out & (TBUF_SIZE-1)] & 0x1FF);
p->out++;
tx_restart = 0;
}
else {
tx_restart = 1;
USART1->CR1 &= ~USART_FLAG_TXE; // disable TX interrupt if nothing to send
}
}
}
/*------------------------------------------------------------------------------
buffer_Init
initialize the buffers
*------------------------------------------------------------------------------*/
void buffer_Init (void) {
tbuf.in = 0; // Clear com buffer indexes
tbuf.out = 0;
tx_restart = 1;
rbuf.in = 0;
rbuf.out = 0;
}
/*------------------------------------------------------------------------------
SenChar
transmit a character
*------------------------------------------------------------------------------*/
int SendChar (int c) {
struct buf_st *p = &tbuf;
// If the buffer is full, return an error value
if (SIO_TBUFLEN >= TBUF_SIZE)
return (-1);
p->buf [p->in & (TBUF_SIZE - 1)] = c; // Add data to the transmit buffer.
p->in++;
if (tx_restart) { // If transmit interrupt is disabled, enable it
tx_restart = 0;
USART1->CR1 |= USART_FLAG_TXE; // enable TX interrupt
}
return (0);
}
/*------------------------------------------------------------------------------
GetKey
receive a character
*------------------------------------------------------------------------------*/
int GetKey (void) {
struct buf_st *p = &rbuf;
if (SIO_RBUFLEN == 0)
return (-1);
return (p->buf [(p->out++) & (RBUF_SIZE - 1)]);
}
/*----------------------------------------------------------------------------
MAIN function
*----------------------------------------------------------------------------*/
int main (void) {
buffer_Init(); // init RX / TX buffers
stm32_Init (); // STM32 setup
printf ("Interrupt driven Serial I/O Example\r\n\r\n");
while (1) { // Loop forever
unsigned char c;
printf ("Press a key. ");
c = getchar ();
printf ("\r\n");
printf ("You pressed '%c'.\r\n\r\n", c);
} // end while
} // end main
My questions are the following:
In the handler function, when does the statement ((p->in - p->out) & ~(RBUF_SIZE-1)) ever evaluate to a value other than zero? If RBUF_SIZE is a power of 2 as indicated, then ~(RBUF_SIZE-1) should always be zero. Is it checking if p->in > p->out? Even if this isn't true, the conditional should evaluate to zero anyway, right?
In the line following, the statement p->buf [p->in & (RBUF_SIZE-1)] = (USART1->DR & 0x1FF); is made. Why does the code AND p->in with RBUF_SIZE-1?
What kind of buffer are we using in this code? FIFO?
Not so. For example, assuming 32-bit arithmetic, if RBUF_SIZE == 0x00000100 then RBUF_SIZE-1 == 0x000000FF and ~(RBUF_SIZE-1) == 0xFFFFFF00 (it's a bitwise NOT, not a logical NOT). The check you refer to is therefore effectively the same as (p->in - p->out) < RBUF_SIZE, and it's not clear why it is superior. ARM GCC 7.2.1 produces identical length code for the two (-O1).
p->in & (RBUF_SIZE-1) is the same as p->in % RBUF_SIZE when p->in is unsigned. Again, not sure why the former would be used when the latter is clearer; sure, it effectively forces the compiler to compute the modulo using an AND operation, but given that RBUF_SIZE is known at compile time to be a power of two my guess is that most compilers could figure this out (again, ARM GCC 7.2.1 certainly can, I've just tried it - it produces the same instructions either way).
Looks like it. FIFO implemented as a circular buffer.

Debug data/neon performance hazards in arm neon code

Originally the problem appeared when I tried to optimize an algorithm for neon arm and some minor part of it was taking 80% of according to profiler. I tried to test to see what can be done to improve it and for that I created array of function pointers to different versions of my optimized function and then I run them in the loop to see in profiler which one performs better:
typedef unsigned(*CalcMaxFunc)(const uint16_t a[8][4], const uint16_t b[4][4]);
CalcMaxFunc CalcMaxFuncs[] =
{
CalcMaxFunc_NEON_0,
CalcMaxFunc_NEON_1,
CalcMaxFunc_NEON_2,
CalcMaxFunc_NEON_3,
CalcMaxFunc_C_0
};
int N = sizeof(CalcMaxFunc) / sizeof(CalcMaxFunc[0]);
for (int i = 0; i < 10 * N; ++i)
{
auto f = CalcMaxFunc[i % N];
unsigned retI = f(a, b);
// just random code to ensure that cpu waits for the results
// and compiler doesn't optimize it away
if (retI > 1000000)
break;
ret |= retI;
}
I got surprising results: performance of a function was totally depend on its position within CalcMaxFuncs array. For example, when I swapped CalcMaxFunc_NEON_3 to be first it would be 3-4 times slower and according to profiler it would stall at the last bit of the function where it tried to move data from neon to arm register.
So, what does it make stall sometimes and not in other times? BY the way, I profile on iPhone6 in xcode if that matters.
When I intentionally introduced neon pipeline stalls by mixing-in some floating point division between calling these functions in the loop I eliminated unreliable behavior, now all of them perform the same regardless of the order in which they were called. So, why in the first place did I have that problem and what can I do to eliminate it in actual code?
Update:
I tried to create a simple test function and then optimize it in stages and see how I could possibly avoid neon->arm stalls.
Here's the test runner function:
void NeonStallTest()
{
int findMinErr(uint8_t* var1, uint8_t* var2, int size);
srand(0);
uint8_t var1[1280];
uint8_t var2[1280];
for (int i = 0; i < sizeof(var1); ++i)
{
var1[i] = rand();
var2[i] = rand();
}
#if 0 // early exit?
for (int i = 0; i < 16; ++i)
var1[i] = var2[i];
#endif
int ret = 0;
for (int i=0; i<10000000; ++i)
ret += findMinErr(var1, var2, sizeof(var1));
exit(ret);
}
And findMinErr is this:
int findMinErr(uint8_t* var1, uint8_t* var2, int size)
{
int ret = 0;
int ret_err = INT_MAX;
for (int i = 0; i < size / 16; ++i, var1 += 16, var2 += 16)
{
int err = 0;
for (int j = 0; j < 16; ++j)
{
int x = var1[j] - var2[j];
err += x * x;
}
if (ret_err > err)
{
ret_err = err;
ret = i;
}
}
return ret;
}
Basically it it does sum of squared difference between each uint8_t[16] block and returns index of the block pair that has lowest squared difference. So, then I rewrote it in neon intrisics (no particular attempt was made to make it fast, as it's not the point):
int findMinErr_NEON(uint8_t* var1, uint8_t* var2, int size)
{
int ret = 0;
int ret_err = INT_MAX;
for (int i = 0; i < size / 16; ++i, var1 += 16, var2 += 16)
{
int err;
uint8x8_t var1_0 = vld1_u8(var1 + 0);
uint8x8_t var1_1 = vld1_u8(var1 + 8);
uint8x8_t var2_0 = vld1_u8(var2 + 0);
uint8x8_t var2_1 = vld1_u8(var2 + 8);
int16x8_t s0 = vreinterpretq_s16_u16(vsubl_u8(var1_0, var2_0));
int16x8_t s1 = vreinterpretq_s16_u16(vsubl_u8(var1_1, var2_1));
uint16x8_t u0 = vreinterpretq_u16_s16(vmulq_s16(s0, s0));
uint16x8_t u1 = vreinterpretq_u16_s16(vmulq_s16(s1, s1));
#ifdef __aarch64__1
err = vaddlvq_u16(u0) + vaddlvq_u16(u1);
#else
uint32x4_t err0 = vpaddlq_u16(u0);
uint32x4_t err1 = vpaddlq_u16(u1);
err0 = vaddq_u32(err0, err1);
uint32x2_t err00 = vpadd_u32(vget_low_u32(err0), vget_high_u32(err0));
err00 = vpadd_u32(err00, err00);
err = vget_lane_u32(err00, 0);
#endif
if (ret_err > err)
{
ret_err = err;
ret = i;
#if 0 // enable early exit?
if (ret_err == 0)
break;
#endif
}
}
return ret;
}
Now, if (ret_err > err) is clearly data hazard. Then I manually "unrolled" loop by two and modified code to use err0 and err1 and check them after performing next round of compute. According to profiler I got some improvements. In simple neon loop I got roughly 30% of entire function spent in the two lines vget_lane_u32 followed by if (ret_err > err). After I unrolled by two these operations started to take 25% (e.g. I got roughly 10% overall speedup). Also, check armv7 version, there is only 8 instructions between when err0 is set (vmov.32 r6, d16[0]) and when it's accessed (cmp r12, r6). T
Note, in the code early exit is ifdefed out. Enabling it would make function even slower. If I unrolled it by four and changed to use four errN variables and deffer check by two rounds then I still saw vget_lane_u32 in profiler taking too much time. When I checked generated asm, appears that compiler destroys all the optimizations attempts because it reuses some of the errN registers which effectively makes CPU access results of vget_lane_u32 much earlier than I want (and I aim to delay access by 10-20 instructions). Only when I unrolled by 4 and marked all four errN as volatile vget_lane_u32 totally disappeared from the radar in profiler, however, the if (ret_err > errN) check obviously got slow as hell as now these probably ended up as regular stack variables overall these 4 checks in 4x manual loop unroll started to take 40%. Looks like with proper manual asm it's possible to make it work properly: have early loop exit, while avoiding neon->arm stalls and have some arm logic in the loop, however, extra maintenance required to deal with arm asm makes it 10x more complex to maintain that kind of code in a large project (that doesn't have any armasm).
Update:
Here's sample stall when moving data from neon to arm register. To implement early exist I need to move from neon to arm once per loop. This move alone takes more than 50% of entire function according to sampling profiler that comes with xcode. I tried to add lots of noops before and/or after the mov, but nothing seems to affect results in profiler. I tried to use vorr d0,d0,d0 for noops: no difference. What's the reason for the stall, or the profiler simply shows wrong results?

pic Interrupt on change doesn't trigger first time

I'm trying to comunicate a pic18f24k50 with an arduino 101. I'm using two lines to establish a synchronized comunication. On every change from low to high of the first line, I read the value from the second line.
I have no problems with the arduino code, my problem is that in the pic, the Interrupt on change triggers on the second change from low to high instead of triggering on the first change. This only happens the first time I send data, after that, it works perfectly (it triggers on the first change and I receive the byte properly). Sorry for my english, I'll try to explain myself better with this image:
Channel 1 is the clock signal, channel2 is the data (Im sending one byte for the moment, with bit values 10101010). Channel 4 is an output I'm changing every time I process a bit. (as you can see, it begins on the second rise of the clock signal instead of the first one). This is captured on the first sent byte, the next ones works ok.
I post the relevant code in the pic here:
This is where I initialize things:
TRISCbits.TRISC6 = 0;
TRISCbits.TRISC1 = 1;
TRISCbits.TRISC2 = 1;
IOCC1 = 1;
ANSELCbits.ANSC2=0;
IOCC2 = 0;
INTCONbits.IOCIE = 1;
INTCONbits.IOCIF = 0;
And this is on the interrupt code:
void interrupt SYS_InterruptHigh(void)
{
if (INTCONbits.IOCIE==1 && INTCONbits.IOCIF==1)
{
readByte();
}
}
void readByte(void)
{
while(contaBits<8)
{
INTCONbits.IOCIE = 0;
INTCONbits.IOCIF = 0;
while (PORTCbits.RC1 != HIGH)
{
}
if (PORTCbits.RC1 == HIGH)
{
LATCbits.LATC6 = !LATCbits.LATC6;
//LATCbits.LATC6 = ~LATCbits.LATC6;
switch (contaBits)
{
case 0:
if (PORTCbits.RC2 == HIGH)
varByte.b0 = 1;
else
varByte.b0 = 0;
break;
case 1:
if (PORTCbits.RC2 == HIGH)
varByte.b1 = 1;
else
varByte.b1 = 0;
break;
case 2:
if (PORTCbits.RC2 == HIGH)
varByte.b2 = 1;
else
varByte.b2 = 0;
break;
case 3:
if (PORTCbits.RC2 == HIGH)
varByte.b3 = 1;
else
varByte.b3 = 0;
break;
case 4:
if (PORTCbits.RC2 == HIGH)
varByte.b4 = 1;
else
varByte.b4 = 0;
break;
case 5:
if (PORTCbits.RC2 == HIGH)
varByte.b5 = 1;
else
varByte.b5 = 0;
break;
case 6:
if (PORTCbits.RC2 == HIGH)
varByte.b6 = 1;
else
varByte.b6 = 0;
break;
case 7:
if (PORTCbits.RC2 == HIGH)
varByte.b7 = 1;
else
varByte.b7 = 0;
break;
}
contaBits++;
}
}//while(contaBits<8)
INTCONbits.IOCIE = 1;
contaBits=0;
}
LATCbits.LATC6 = !LATCbits.LATC6; <-- this is the line corresponding to channel 4.
RC1 is channel 1
and RC2 is channel 2
My question is what am I doing wrong, why on the first sent of bytes the interrupt doesn't triggers on the first change of the line 1?
Thank you.
What are you trying to achieve?
The communication protocol as you're describing is commonly refered to as SPI (Serial Peripheral Interface). You should use the hardware implementation available by PIC/Microchip, if possible, for best performance.
Keep your code documented/formatted/logic
I noticed your code being a little, weird.
Weird code gives weird errors.
First:
You're using interrupts, you have this blocking code in your interrupt. Which isn't nice at all.
Second:
Your "contabits" and "varByte" variable come out of nowhere, probably they're global. Which might not be the best practice.
Though, if you're using interrupts, and when it might make sense to transfer the variables to your main program by using a global. The global should be volatile.
Third:
That case switch is just 8x the same code, but a little different.
Also, add some comments to your code.
To be honest
I didn't spot your actual error, has been a long time since I worked with PIC.
But, you should try hardware SPI.
Below is my attempt at software SPI as you described, it might be a little more logical, and takes out any annoyance of the interrupts.
I would recommend replacing the while(CLOCK_PIN_MACRO != 1){}; with for(unsigned int timeout = 64000; timeout > 0; timeout--){};.
So that you can be sure your program won't hang while waiting for an SPI signal that won't come. (Make the timeout something what fits your needs)
#define CLOCK_PIN_MACRO PORTCbits.RC1
#define DATA_PIN_MACRO PORTCbits.RC2
/*
Test, without interrupts.
I would strongly advise to use Hardware SPI or non-blocking code.
*/
unsigned char readByte(void){
unsigned char returnByte = 0;
for(int i = 0; i < 8; i++){ //Do this for bit 0 to 7, to get a complete byte.
while(CLOCK_PIN_MACRO != 1){}; //Wait until the Clock pin goes high.
if(DATA_PIN_MACRO == 1) //Check the data pin.
returnByte = returnByte & (1 << i); //Shift the bit in our byte. (https://www.cs.umd.edu/class/sum2003/cmsc311/Notes/BitOp/bitshift.html)
while(CLOCK_PIN_MACRO == 1){}; //Wait untill the Clock pin goes low again.
}
return returnByte;
}
void setup(){
TRISCbits.TRISC6 = 0;
TRISCbits.TRISC1 = 1;
TRISCbits.TRISC2 = 1;
//IOCC1 = 1;
ANSELCbits.ANSC2=0;
//IOCC2 = 0;
INTCONbits.IOCIE = 1;
INTCONbits.IOCIF = 0;
}
void run(){
unsigned char theByte = readByte();
}
main(){
setup();
while(1){
run();
}
}
You can find the following from the datasheet.
The pins are compared with the old value latched on the last read of
PORTB. The "mismatch" outputs of RB7:RB4 are ORed together to generate
the RB Port Change Interrupt with Flag bit,RBIF (INTCON<0>).
This means that when you read PORTB RB7:RB4 pins, the value of each pin will be stored in an internal latch. Interrupt will be generated only when there is any change in the pin input from the previously latched value.
Then, what will be the default initial value in the latch? i.e. Before we read anything from PORTB.
What ever it is. in your case the default latch value is different from the logic state of your input signal. During the next transition of the signal the latch value and signal level becomes same. So no intterrupt is generated first time.
What you can do is setting the latch value to the initial state of your signal. This can be done by reading PORTB (should be done before enabling the intterrupt).
The below code is enough to simply read the register.
unsigned char ch;
ch = PORTB;
Hope this will help.

Read RC PWM signal using ATMega2560 in Atmel AVR studio

I am trying to read several PWM signals from an RC receiver into an ATMega 2560. I am having trouble understanding how the ICRn pin functions as it appears to be used for all three compare registers.
The RC PWM signal has a period of 20ms with a HIGH pulse of 2ms being a valid upper value and 1ms being a valid lower value. So the value will sweep from 1000us to 2000us. The period should begin at the rising edge of the pulse.
I have prescaled the 16MHz clock by 8 to have a 2MHz timer an thus should be able to measure the signal to 0.5us accuracy which is sufficient for my requirements.
Please note that I am having not problems with PWM output and this question is specifically about PWM input.
My code thus far is attached below. I know that I will have to use ICR3 and an ISR to measure the PWM values but I am unsure as to the best procedure for doing this. I also do not know how to check if the value measured is from PE3, PE4, or PE5. Is this code right and how do I get the value that I am looking for?
Any help would be greatly appreciated.
// Set pins as inputs
DDRE |= ( 0 << PE3 ) | ( 0 << PE4 ) | ( 0 << PE5 );
// Configure Timers for CTC mode
TCCR3A |= ( 1 << WGM31 ) | ( 1 << WGM30 ); // Set on compare match
TCCR3B |= ( 1 << WGM33 ) | ( 1 << WGM32 ) | ( 1 << CS31); // Set on compare match, prescale_clk/8
TCCR3B |= ( 1 << ICES5 ) // Use rising edge as trigger
// 16 bit register - set TOP value
OCR3A = 40000 - 1;
OCR3B = 40000 - 1;
OCR3C = 40000 - 1;
TIMSK3 |= ( 1 << ICIE3 );
I had forgotten to post my solution a few months ago so here it is...
I used a PPM receiver in the end so this code can easily edited to read a simple PWM.
In my header file I made a structure for a 6 channel receiver that I was using for my project. This can be changed as required for receivers with more or less channels.
#ifndef _PPM_H_
#define _PPM_H_
// Libraries included
#include <stdint.h>
#include <avr/interrupt.h>
struct orangeRX_ppm {
uint16_t ch[6];
};
volatile unsigned char ch_index;
struct orangeRX_ppm ppm;
/* Functions */
void ppm_input_init(void); // Initialise the PPM Input to CTC mode
ISR( TIMER5_CAPT_vect ); // Use ISR to handle CTC interrupt and decode PPM
#endif /* _PPM_H_ */
I then had the following in my .c file.
// Libraries included
#include <avr/io.h>
#include <stdint.h>
#include "ppm.h"
/* PPM INPUT
* ---
* ICP5 Pin48 on Arduino Mega
*/
void ppm_input_init(void)
{
DDRL |= ( 0 << PL1 ); // set ICP5 as an input
TCCR5A = 0x00; // none
TCCR5B = ( 1 << ICES5 ) | ( 1 << CS51); // use rising edge as trigger, prescale_clk/8
TIMSK5 = ( 1 << ICIE5 ); // allow input capture interrupts
// Clear timer 5
TCNT5H = 0x00;
TCNT5L = 0x00;
}
// Interrupt service routine for reading PPM values from the radio receiver.
ISR( TIMER5_CAPT_vect )
{
// Count duration of the high pulse
uint16_t high_cnt;
high_cnt = (unsigned int)ICR5L;
high_cnt += (unsigned int)ICR5H * 256;
/* If the duration is greater than 5000 counts then this is the end of the PPM signal
* and the next signal being addressed will be Ch0
*/
if ( high_cnt < 5000 )
{
// Added for security of the array
if ( ch_index > 5 )
{
ch_index = 5;
}
ppm.ch[ch_index] = high_cnt; // Write channel value to array
ch_index++; // increment channel index
}
else
{
ch_index = 0; // reset channel index
}
// Reset counter
TCNT5H = 0;
TCNT5L = 0;
TIFR5 = ( 1 << ICF5 ); // clear input capture flag
}
This code will use an trigger an ISR every time ICP5 goes from low to high. In this ISR the 16bit ICR5 register "ICR5H<<8|ICR5L" holds the number of pre-scaled clock pulses that have elapsed since the last change from low to high. This count is typically less than 2000 us. I have said that if the count is greater than 2500us (5000 counts) then the input is invalid and the next input should be ppm.ch[0].
I have attached an image of PPM as seen on my oscilloscope.
This method of reading PPM is quite efficient as we do not need to keep polling pins to check their logic level.
Don't forget to enable interrupts using the sei() command. Otherwise the ISR will never run.
Let's say you want to do the following (I'm not saying this will allow you to accurately measure the PWM signals but it might serve as example on how to set the registers)
Three timers running, which reset every 20 ms. This can be done by setting them in CTC mode for OCRnA: wgm3..0 = 0b0100.
//timer 1
TCCR4A = 0;
TCCR1B = (1<<CS11) | (1<<WGM12);
OCR1A = 40000 - 1;
//timer 3 (there's no ICP2)
TCCR3A = 0;
TCCR3B = (1<<CS31) | (1<<WGM32);
OCR3A = 40000 - 1;
//timer 4
TCCR4A = 0;
TCCR4B = (1<<CS41) | (1<<WGM42);
OCR4A = 40000 - 1;
Now connect each of the three pwm signals to their own ICPn pin (where n = timer). Check the datasheet for the locations of the different ICPn pins (i'm pretty sure it's not PE3, 4, 5)
Assuming the pwm signals start high at t=0 and go low after their high-time for the remainder of the period. You want to measure the high-time so we trigger an interrupt for each when a falling edge occurs on the ICPn pin.
bit ICESn in the TCCRnB register set to 0 will select the falling edge (this is already done in the previous code block).
To trigger the interrupts, set the corresponding interrupt enable bits:
TIMSK1 |= (1<<ICIE1);
TIMSK3 |= (1<<ICIE3);
TIMSK4 |= (1<<ICIE4);
sei();
Now each time an interrupt is triggered for ICn you can grab the ICRn register to see the time (in clockperiods/8) at which the falling edge occurred.

Determine Position of Most Signifiacntly Set Bit in a Byte

I have a byte I am using to store bit flags. I need to compute the position of the most significant set bit in the byte.
Example Byte: 00101101 => 6 is the position of the most significant set bit
Compact Hex Mapping:
[0x00] => 0x00
[0x01] => 0x01
[0x02,0x03] => 0x02
[0x04,0x07] => 0x03
[0x08,0x0F] => 0x04
[0x10,0x1F] => 0x05
[0x20,0x3F] => 0x06
[0x40,0x7F] => 0x07
[0x80,0xFF] => 0x08
TestCase in C:
#include <stdio.h>
unsigned char check(unsigned char b) {
unsigned char c = 0x08;
unsigned char m = 0x80;
do {
if(m&b) { return c; }
else { c -= 0x01; }
} while(m>>=1);
return 0; //never reached
}
int main() {
unsigned char input[256] = {
0x00,0x01,0x02,0x03,0x04,0x05,0x06,0x07,0x08,0x09,0x0a,0x0b,0x0c,0x0d,0x0e,0x0f,
0x10,0x11,0x12,0x13,0x14,0x15,0x16,0x17,0x18,0x19,0x1a,0x1b,0x1c,0x1d,0x1e,0x1f,
0x20,0x21,0x22,0x23,0x24,0x25,0x26,0x27,0x28,0x29,0x2a,0x2b,0x2c,0x2d,0x2e,0x2f,
0x30,0x31,0x32,0x33,0x34,0x35,0x36,0x37,0x38,0x39,0x3a,0x3b,0x3c,0x3d,0x3e,0x3f,
0x40,0x41,0x42,0x43,0x44,0x45,0x46,0x47,0x48,0x49,0x4a,0x4b,0x4c,0x4d,0x4e,0x4f,
0x50,0x51,0x52,0x53,0x54,0x55,0x56,0x57,0x58,0x59,0x5a,0x5b,0x5c,0x5d,0x5e,0x5f,
0x60,0x61,0x62,0x63,0x64,0x65,0x66,0x67,0x68,0x69,0x6a,0x6b,0x6c,0x6d,0x6e,0x6f,
0x70,0x71,0x72,0x73,0x74,0x75,0x76,0x77,0x78,0x79,0x7a,0x7b,0x7c,0x7d,0x7e,0x7f,
0x80,0x81,0x82,0x83,0x84,0x85,0x86,0x87,0x88,0x89,0x8a,0x8b,0x8c,0x8d,0x8e,0x8f,
0x90,0x91,0x92,0x93,0x94,0x95,0x96,0x97,0x98,0x99,0x9a,0x9b,0x9c,0x9d,0x9e,0x9f,
0xa0,0xa1,0xa2,0xa3,0xa4,0xa5,0xa6,0xa7,0xa8,0xa9,0xaa,0xab,0xac,0xad,0xae,0xaf,
0xb0,0xb1,0xb2,0xb3,0xb4,0xb5,0xb6,0xb7,0xb8,0xb9,0xba,0xbb,0xbc,0xbd,0xbe,0xbf,
0xc0,0xc1,0xc2,0xc3,0xc4,0xc5,0xc6,0xc7,0xc8,0xc9,0xca,0xcb,0xcc,0xcd,0xce,0xcf,
0xd0,0xd1,0xd2,0xd3,0xd4,0xd5,0xd6,0xd7,0xd8,0xd9,0xda,0xdb,0xdc,0xdd,0xde,0xdf,
0xe0,0xe1,0xe2,0xe3,0xe4,0xe5,0xe6,0xe7,0xe8,0xe9,0xea,0xeb,0xec,0xed,0xee,0xef,
0xf0,0xf1,0xf2,0xf3,0xf4,0xf5,0xf6,0xf7,0xf8,0xf9,0xfa,0xfb,0xfc,0xfd,0xfe,0xff };
unsigned char truth[256] = {
0x00,0x01,0x02,0x02,0x03,0x03,0x03,0x03,0x04,0x04,0x04,0x04,0x04,0x04,0x04,0x04,
0x05,0x05,0x05,0x05,0x05,0x05,0x05,0x05,0x05,0x05,0x05,0x05,0x05,0x05,0x05,0x05,
0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,
0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,
0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,
0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,
0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,
0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,
0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,
0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,
0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,
0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,
0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,
0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,
0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,
0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08};
int i,r;
int f = 0;
for(i=0; i<256; ++i) {
r=check(input[i]);
if(r !=(truth[i])) {
printf("failed %d : 0x%x : %d\n",i,0x000000FF & ((int)input[i]),r);
f += 1;
}
}
if(!f) { printf("passed all\n"); }
else { printf("failed %d\n",f); }
return 0;
}
I would like to simplify my check() function to not involve looping (or branching preferably). Is there a bit twiddling hack or hashed lookup table solution to compute the position of the most significant set bit in a byte?
Your question is about an efficient way to compute log2 of a value. And because you seem to want a solution that is not limited to the C language I have been slightly lazy and tweaked some C# code I have.
You want to compute log2(x) + 1 and for x = 0 (where log2 is undefined) you define the result as 0 (e.g. you create a special case where log2(0) = -1).
static readonly Byte[] multiplyDeBruijnBitPosition = new Byte[] {
7, 2, 3, 4,
6, 1, 5, 0
};
public static Byte Log2Plus1(Byte value) {
if (value == 0)
return 0;
var roundedValue = value;
roundedValue |= (Byte) (roundedValue >> 1);
roundedValue |= (Byte) (roundedValue >> 2);
roundedValue |= (Byte) (roundedValue >> 4);
var log2 = multiplyDeBruijnBitPosition[((Byte) (roundedValue*0xE3)) >> 5];
return (Byte) (log2 + 1);
}
This bit twiddling hack is taken from Find the log base 2 of an N-bit integer in O(lg(N)) operations with multiply and lookup where you can see the equivalent C source code for 32 bit values. This code has been adapted to work on 8 bit values.
However, you may be able to use an operation that gives you the result using a very efficient built-in function (on many CPU's a single instruction like the Bit Scan Reverse is used). An answer to the question Bit twiddling: which bit is set? has some information about this. A quote from the answer provides one possible reason why there is low level support for solving this problem:
Things like this are the core of many O(1) algorithms such as kernel schedulers which need to find the first non-empty queue signified by an array of bits.
That was a fun little challenge. I don't know if this one is completely portable since I only have VC++ to test with, and I certainly can't say for sure if it's more efficient than other approaches. This version was coded with a loop but it can be unrolled without too much effort.
static unsigned char check(unsigned char b)
{
unsigned char r = 8;
unsigned char sub = 1;
unsigned char s = 7;
for (char i = 0; i < 8; i++)
{
sub = sub & ((( b & (1 << s)) >> s--) - 1);
r -= sub;
}
return r;
}
I'm sure everyone else has long since moved on to other topics but there was something in the back of my mind suggesting that there had to be a more efficient branch-less solution to this than just unrolling the loop in my other posted solution. A quick trip to my copy of Warren put me on the right track: Binary search.
Here's my solution based on that idea:
Pseudo-code:
// see if there's a bit set in the upper half
if ((b >> 4) != 0)
{
offset = 4;
b >>= 4;
}
else
offset = 0;
// see if there's a bit set in the upper half of what's left
if ((b & 0x0C) != 0)
{
offset += 2;
b >>= 2;
}
// see if there's a bit set in the upper half of what's left
if > ((b & 0x02) != 0)
{
offset++;
b >>= 1;
}
return b + offset;
Branch-less C++ implementation:
static unsigned char check(unsigned char b)
{
unsigned char adj = 4 & ((((unsigned char) - (b >> 4) >> 7) ^ 1) - 1);
unsigned char offset = adj;
b >>= adj;
adj = 2 & (((((unsigned char) - (b & 0x0C)) >> 7) ^ 1) - 1);
offset += adj;
b >>= adj;
adj = 1 & (((((unsigned char) - (b & 0x02)) >> 7) ^ 1) - 1);
return (b >> adj) + offset + adj;
}
Yes, I know that this is all academic :)
It is not possible in plain C. The best I would suggest is the following implementation of check. Despite quite "ugly" I think it runs faster than the ckeck version in the question.
int check(unsigned char b)
{
if(b&128) return 8;
if(b&64) return 7;
if(b&32) return 6;
if(b&16) return 5;
if(b&8) return 4;
if(b&4) return 3;
if(b&2) return 2;
if(b&1) return 1;
return 0;
}
Edit: I found a link to the actual code: http://www.hackersdelight.org/hdcodetxt/nlz.c.txt
The algorithm below is named nlz8 in that file. You can choose your favorite hack.
/*
From last comment of: http://stackoverflow.com/a/671826/315052
> Hacker's Delight explains how to correct for the error in 32-bit floats
> in 5-3 Counting Leading 0's. Here's their code, which uses an anonymous
> union to overlap asFloat and asInt: k = k & ~(k >> 1); asFloat =
> (float)k + 0.5f; n = 158 - (asInt >> 23); (and yes, this relies on
> implementation-defined behavior) - Derrick Coetzee Jan 3 '12 at 8:35
*/
unsigned char check (unsigned char b) {
union {
float asFloat;
int asInt;
} u;
unsigned k = b & ~(b >> 1);
u.asFloat = (float)k + 0.5f;
return 32 - (158 - (u.asInt >> 23));
}
Edit -- not exactly sure what the asker means by language independent, but below is the equivalent code in python.
import ctypes
class Anon(ctypes.Union):
_fields_ = [
("asFloat", ctypes.c_float),
("asInt", ctypes.c_int)
]
def check(b):
k = int(b) & ~(int(b) >> 1)
a = Anon(asFloat=(float(k) + float(0.5)))
return 32 - (158 - (a.asInt >> 23))