How do I get the current interrupt state (enabled, disabled or current level) on an MC9S12ZVM processor?

I'm working on a project using an MC9S12ZVM family processor and need to get, save and restore the current interrupt-enable state. I need this so that main-line code can safely access variables that an interrupt handler may modify and that are larger than a word, which means accesses to them are not atomic.
Pseudocode (the variable is 32 bits wide, and -= isn't atomic anyway):
state_save = current_interrupt_state();
DisableInterrupt();
variable -= x;
RestoreInterrupts(state_save);
Edit: I found something that works, but has the issue of modifying the stack.
asm(PSH CCW);
asm(SEI);
Variable++;
asm(PUL CCW);
This is ok as long as I don't need to do anything other than a simple variable++, but I don't like exiting a block with the stack modified.

It seems you are referring to the global interrupt mask. If so, then this is one way to disable it and then restore it to previous state:
static const uint8_t CCR_I_MASK = 0x10;
static uint8_t ccr;
void disable_interrupts (void)
{
__asm PSHA;
__asm TPA; // transfer CCR to A
__asm STA ccr; // store CCR in RAM variable
__asm PULA;
__asm SEI;
}
void restore_interrupts (void)
{
if((ccr & CCR_I_MASK) == 0)
{
__asm CLI; // i was not set, clear it
}
else
{
; // i was set, do nothing
}
}
__asm is specific to the Codewarrior compiler, with or without "strict ANSI" option set.

Ok, I've found an answer to my problem, with thanks to those who commented.
static volatile uint16_t v = 0u;
void testfunction(void);
void testfunction(void)
{
static uint16_t L_CCR;
asm( PSH D2 );
asm( TFR CCW, D2);
asm( ST D2, L_CCR );
asm( PUL D2 );
asm( SEI );
v++;
asm( PSH D2 );
asm( LD D2, L_CCR );
asm( TFR D2, CCW);
asm( PUL D2 );
}
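The save/disable/restore pattern discussed in this thread can be sketched in portable C as below. This is only an illustration under stated assumptions: `sim_ccw`, `irq_save_and_disable`, `irq_restore` and `counter_subtract` are hypothetical names, and the register is simulated with a plain variable so the logic can be exercised on a host. On the target, the two helpers would be implemented with the inline assembly shown above. The I-bit mask 0x10 is the value used in the earlier answer.

```c
#include <stdint.h>

/* I (interrupt mask) bit, same value as CCR_I_MASK in the answer above. */
#define CCR_I_MASK 0x10u

/* ASSUMPTION: sim_ccw stands in for the CPU condition code register so the
 * sketch runs on a host. It is not a real register access. */
static volatile uint16_t sim_ccw = 0u;

/* Hypothetical helper: return the current register value, then set the I bit. */
static uint16_t irq_save_and_disable(void)
{
    uint16_t saved = sim_ccw;
    sim_ccw |= CCR_I_MASK;
    return saved;
}

/* Hypothetical helper: put only the I bit back to its saved state. */
static void irq_restore(uint16_t saved)
{
    sim_ccw = (uint16_t)((sim_ccw & ~CCR_I_MASK) | (saved & CCR_I_MASK));
}

/* 32-bit variable shared with an (imagined) interrupt handler. */
static volatile uint32_t shared_counter = 100u;

/* Non-atomic 32-bit update wrapped in the save/disable/restore pattern. */
static void counter_subtract(uint32_t x)
{
    uint16_t state = irq_save_and_disable();
    shared_counter -= x;
    irq_restore(state);
}
```

Because the saved state is returned to the caller rather than kept in a single static variable, the pattern nests: an inner critical section restores "disabled" and only the outermost one re-enables interrupts.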


Microblaze How to use AXI Stream?

I have a microblaze instantiated with 16 stream interfaces with a custom IP attached to two. What is the correct header file or function to communicate over these interfaces in Vitis (Not HLS)?
Based on the full example that you can find here, I am going to provide a general idea:
Include the mb_interface.h in your C source
Use the putfsl and getfsl macros to write and read from the stream.
These macros are wrappers around special assembly instructions that the MicroBlaze executes to transfer data over the AXI Stream interface. The id argument is the stream id. Here you can find all the possible functions and here you can explore the ISA.
#define putfsl(val, id) asm volatile ("put\t%0,rfsl" stringify(id) :: "d" (val))
The fundamental issue is that
#include "mb_interface.h"
/*
* Write 4 32-bit words.
*/
static inline void write_axis(volatile unsigned int *a)
{
register int a0, a1, a2, a3;
a3 = a[3]; a1 = a[1]; a2 = a[2]; a0 = a[0];
putfsl(a0, 0); putfsl(a1, 0); putfsl(a2, 0); putfsl(a3, 0);
}
int main()
{
volatile unsigned int outbuffer[BUFFER_SIZE] = { 0x0, 0x1, 0x2, 0x3 };
/* Perform transfers */
write_axis(outbuffer);
return 0;
}

Can we have dirty data on l1 cache in gpu?

I've read about some of the common write policies in GPU microarchitectures. For most GPUs the write policy is the one shown in the picture below (taken from the gpgpu-sim manual). Based on that picture, I have a question: can we have dirty data in the L1 cache?
The L1 on some GPU architectures is a write-back cache for global accesses. Note that this topic varies by GPU architecture, e.g. for whether global activity is cached in L1.
Speaking generally, then, yes you can have dirty data. By this I mean that the data in the L1 cache is modified (compared to what is otherwise in global space or the L2 cache) and it has not yet been "flushed" or updated into the L2 cache. (You can also have "stale" data - data in the L1 that has not been modified, but is not consistent with the L2.)
We can create a simple proof point for this (dirty data).
The following code, when executed on a cc7.0 device (and probably some other architectures as well), will not give the expected answer of 1024.
This is due to the fact that the L1, which is a separate entity per SM, is not immediately flushed to the L2. It therefore has "dirty data" by the above definition.
(The code is broken for this reason. Don't use this code. It's just a proof point.)
#include <iostream>
#include <cuda_runtime.h>
constexpr int num_blocks = 1024;
constexpr int num_threads = 32;
struct Lock {
int *locked;
Lock() {
int init = 0;
cudaMalloc(&locked, sizeof(int));
cudaMemcpy(locked, &init, sizeof(int), cudaMemcpyHostToDevice);
}
~Lock() {
if (locked) cudaFree(locked);
locked = NULL;
}
__device__ __forceinline__ void acquire_lock() {
while (atomicCAS(locked, 0, 1) != 0);
}
__device__ __forceinline__ void unlock() {
atomicExch(locked, 0);
}
};
__global__ void counter(Lock lock, int *total) {
if (threadIdx.x == 1) {
lock.acquire_lock();
*total = *total + 1;
// __threadfence(); uncomment this line to fix
lock.unlock();
}
}
int main() {
int *total_dev;
cudaMalloc(&total_dev, sizeof(int));
int total_host = 0;
cudaMemcpy(total_dev, &total_host, sizeof(int), cudaMemcpyHostToDevice);
{
Lock lock;
counter<<<num_blocks, num_threads>>>(lock, total_dev);
cudaDeviceSynchronize();
cudaMemcpy(&total_host, total_dev, sizeof(int), cudaMemcpyDeviceToHost);
std::cout << total_host << std::endl;
}
cudaFree(total_dev);
}
In case there is any further doubt about whether this is a proper proof (e.g. to dispel arguments about things being "optimized into a register" etc.) we can study the resultant sass code. The end of the above kernel has code that looks like this:
/*0130*/ LDG.E.SYS R0, [R4] ; /* 0x0000000004007381 */
// load *total /* 0x000ea400001ee900 */
/*0140*/ IADD3 R7, R0, 0x1, RZ ; /* 0x0000000100077810 */
// add 1 /* 0x004fd00007ffe0ff */
/*0150*/ STG.E.SYS [R4], R7 ; /* 0x0000000704007386 */
// store *total /* 0x000fe8000010e900 */
/*0160*/ ATOMG.E.EXCH.STRONG.GPU PT, RZ, [R2], RZ ; /* 0x000000ff02ff73a8 */
//lock.unlock /* 0x000fe200041f41ff */
/*0170*/ EXIT ;
Since the result register has definitely been stored to the global space, we can infer that if another thread (in another SM) reads an unexpected value in global space for *total it must be due to the fact that the store from another SM has not reached the L2, i.e. has not reached device-wide consistency/coherency. Therefore the data in some other SM is "dirty". We can (presumably) rule out the "stale" case here (the data in the other L1 was written, but I have "old" data in my L1) because the global load indicated above does not happen until the lock is acquired in the SM.
Note that the above code "fails" on cc7.0 devices (and probably some other device architectures). It does not necessarily fail on the GPU you are using. But it is still "broken".

ADC interrupt doesn't work with TIMER3 interrupt that generates PWM

I've been trying to write some code for the STM32F411RE using IAR Workbench in order to learn more about Cortex. I tried to implement TIMER3 in PWM mode (center-aligned), with TIMER2 firing periodically (every half second or every second, it doesn't matter much) to blink an LED, and the ADC performing continuous regular conversion on one channel. I've tried to implement it all using interrupts. The TIMER3 interrupt is intended to be generated on overflow and underflow, and within the ISR I would change the PWM width using the value from the ADC (set with a potentiometer).
The problem I ran into is that when TIMER3 is activated, the program never hits a breakpoint in (does not enter) the ADC ISR, nor on any line inside the while(1) loop. When I comment out TIMER3, the program goes through the ADC ISR normally.
#include "stdio.h"
void Uart6Configuration(void);
void send_data (uint8_t c);
void init_PWM(void);
void init_ADC(void) ;
void init_Interupts(void);
unsigned long vrednost_ADC=0;
float temp=0;
unsigned long counter=0;
int main()
{
RCC->APB1ENR|=(1<<0); //TIMER 2
RCC->AHB1ENR|=(1<<0); //GPIOA
RCC->AHB1ENR|=(1<<2); //GPIOC
GPIOA->MODER|=(1<<10);
RCC->APB2ENR|=(1<<5); // USART6[PC6,PC7]
/* Define TIMER 3 */
RCC->APB1ENR|=(1<<1); //TIMER 3
GPIOB->MODER|=(1<<9);
GPIOB->AFR[0]|=(1<<17);
TIM2->PSC=89;
TIM2->ARR=0xFFFF;
TIM2->DIER|= (1<<0);
TIM2->EGR|= (1<<0);
Uart6Configuration();
init_PWM();
init_ADC();
init_Interupts();
TIM2->CR1|=(1<<0);
TIM3->CR1|=(1<<0);
while(!(TIM2->SR & (1<<0)));
ADC1->CR2|=(1<<30); // START ADC
/* MAIN PROGRAM LOOP */
while(1)
{
counter++;
if(counter>100000)
{
printf("AD konverzija=%f \n\r",temp); //Terminal I/O
counter=0;
}
}
/* ************************/
return 0;
}
void TIM2_IRQHandler(void )
{
if(TIM2->SR & TIM_SR_UIF)
{
TIM2->SR &= ~TIM_SR_UIF;
GPIOA->ODR^=(1<<5);
}
TIM2->SR =0;
}
void Uart6Configuration (void)
{
GPIOC->MODER |= (2<<12); // --> Alternate Function for Pin PC6
GPIOC->MODER |= (2<<14); // --> Alternate Function for Pin PC7
GPIOC->OSPEEDR|=(3<<12)|(3<<14);
GPIOC->AFR[0] |= (8<<24); // AF8, bits 24..27, PC6
GPIOC->AFR[0] |= (8<<28); // AF8, bits 28..31, PC7
USART6->CR1=0;
USART6->CR1|=(1<<13);
USART6->CR1 &= ~(1<<12);
USART6->BRR=(3<<0)|(104<<4);
USART6->CR1|=(1<<2);
USART6->CR1|=(1<<3);
}
void send_data (uint8_t c)
{
while(!(USART6->SR & (1<<6)));
USART6->DR=c;
}
uint8_t UART6_GetChar (void)
{
/*********** STEPS FOLLOWED *************
1. Wait for the RXNE bit to set. It indicates that the data has been received and can be read.
2. Read the data from USART_DR Register. This also clears the RXNE bit
****************************************/
uint8_t temp;
while (!(USART2->SR & (1<<5))); // wait for RXNE bit to set
temp = USART2->DR; // Read the data. This clears the RXNE also
return temp;
}
void init_PWM(void)
{
/*PB_4*/
TIM3->PSC=15;
TIM3->ARR=750;
TIM3->CR1|= (1<<5)|(1<<6) | (1<<2); // PWM center-aligned mode
TIM3->CCER|=(1<<0); //Capture/Compare 1 output enable.
TIM3->CCR1=500; //DUTY CYCLE
TIM3->CCMR1|=(1<<5)|(1<<6); // PWM mode, bits 5 and 6
TIM3->DIER|=(1<<0);
}
void init_ADC(void)
{
RCC->APB2ENR|=(1<<8); // Clock for ADC
GPIOA->MODER|=(1<<2)|(1<<3); // Analog mode PA.1
ADC1->SQR3|=(1<<0); // Choose channel ADC1/1
ADC1->CR1|=(1<<5); // EOCIE: interrupt is generated when the ADC finishes a conversion
ADC1->CR2|=(1<<1)|(1<<0); // Continuous mode, ADC ON
}
void ADC_IRQHandler(void)
{
vrednost_ADC=ADC1->DR;
temp=(float)((vrednost_ADC/4095.0)*3.3) ;
}
void TIM3_IRQHandler(void )
{
if((TIM3->CNT & 10)<=0) // detect underflow
{
TIM3->CCR1=(vrednost_ADC/4095)*1000;
TIM3->EGR|=(1<<0);
}
}
void init_Interupts(void)
{
NVIC_SetPriority (ADC_IRQn, (13));
NVIC_SetPriority (TIM2_IRQn, 14);
NVIC_SetPriority (TIM3_IRQn, 15);
NVIC_EnableIRQ(TIM2_IRQn);
NVIC_EnableIRQ(TIM3_IRQn);
NVIC_EnableIRQ(ADC_IRQn );
}

STM32F769NI USB CDC host problem sending simple data to the device

I am building a HID for a data acquisition system. There are many sensors that store test data; when needed, I go to them, connect via USB and retrieve it. The USB host sends 3 bytes and the USB device, if the bytes are correct, sends back its stored data. Sounds simple.
Previously this was implemented on a PC, but now I am trying to implement it on an STM32F769 Discovery board and have run into serious problems.
I am using ARM Keil 5.27, with code generated by STM32CubeMX 5.3.0. I first tried to make a plain, simple program, to be integrated later with the full touchscreen interface. I put this code in main:
if (HAL_GPIO_ReadPin(BUTTON_GPIO_Port, BUTTON_Pin))
while (HAL_GPIO_ReadPin(BUTTON_GPIO_Port, BUTTON_Pin))
{
Transmission_function();
}
And the function itself:
#define DLE 0x10
#define STX 0x2
uint8_t tx_buf[]={DLE, STX, 120}, RX_FLAG;
uint32_t size_tx=sizeof(tx_buf);
void Transmission_function (void)
{
if (Appli_state == APPLICATION_READY)
{
i=0;
USBH_CDC_Transmit(&hUsbHostHS, tx_buf, size_tx);
HAL_Delay(50);
RX_FLAG=0;
}
}
It should send the message after I press the blue button on the Discovery board. All I get is a Hard Fault. While debugging, I stepped through manually to see after which action the error occurs, and it was in this function in stm32f7xx_ll_usb.c:
HAL_StatusTypeDef USB_WritePacket(USB_OTG_GlobalTypeDef *USBx, uint8_t *src,
uint8_t ch_ep_num, uint16_t len, uint8_t dma)
{
uint32_t USBx_BASE = (uint32_t)USBx;
uint32_t *pSrc = (uint32_t *)src;
uint32_t count32b, i;
if (dma == 0U)
{
count32b = ((uint32_t)len + 3U) / 4U;
for (i = 0U; i < count32b; i++)
{
USBx_DFIFO((uint32_t)ch_ep_num) = *((__packed uint32_t *)pSrc);
pSrc++;
}
}
return HAL_OK;
}
But scrolling back in the disassembly, I noticed that just before the Hard Fault the program was in this function in stm32f7xx_hal_hcd.c, in case GRXSTS_PKTSTS_IN:
static void HCD_RXQLVL_IRQHandler(HCD_HandleTypeDef *hhcd)
{
USB_OTG_GlobalTypeDef *USBx = hhcd->Instance;
uint32_t USBx_BASE = (uint32_t)USBx;
uint32_t pktsts;
uint32_t pktcnt;
uint32_t temp;
uint32_t tmpreg;
uint32_t ch_num;
temp = hhcd->Instance->GRXSTSP;
ch_num = temp & USB_OTG_GRXSTSP_EPNUM;
pktsts = (temp & USB_OTG_GRXSTSP_PKTSTS) >> 17;
pktcnt = (temp & USB_OTG_GRXSTSP_BCNT) >> 4;
switch (pktsts)
{
case GRXSTS_PKTSTS_IN:
/* Read the data into the host buffer. */
if ((pktcnt > 0U) && (hhcd->hc[ch_num].xfer_buff != (void *)0))
{
(void)USB_ReadPacket(hhcd->Instance, hhcd->hc[ch_num].xfer_buff, (uint16_t)pktcnt);
/*manage multiple Xfer */
hhcd->hc[ch_num].xfer_buff += pktcnt;
hhcd->hc[ch_num].xfer_count += pktcnt;
if ((USBx_HC(ch_num)->HCTSIZ & USB_OTG_HCTSIZ_PKTCNT) > 0U)
{
/* re-activate the channel when more packets are expected */
tmpreg = USBx_HC(ch_num)->HCCHAR;
tmpreg &= ~USB_OTG_HCCHAR_CHDIS;
tmpreg |= USB_OTG_HCCHAR_CHENA;
USBx_HC(ch_num)->HCCHAR = tmpreg;
hhcd->hc[ch_num].toggle_in ^= 1U;
}
}
break;
case GRXSTS_PKTSTS_DATA_TOGGLE_ERR:
break;
case GRXSTS_PKTSTS_IN_XFER_COMP:
case GRXSTS_PKTSTS_CH_HALTED:
default:
break;
}
}
The last few lines of the disassembly show this:
0x080018B4 E8BD81F0 POP {r4-r8,pc}
0x080018B8 0000 DCW 0x0000
0x080018BA 1FF8 DCW 0x1FF8
Why does it fail? How can I fix it? I do not have much experience with the USB protocol.
I will post my workaround, though I am not sure why it worked. The solution was to use the EXTI0 interrupt instead of just polling whether PA0 is high, as I showed here:
if (HAL_GPIO_ReadPin(BUTTON_GPIO_Port, BUTTON_Pin))
while (HAL_GPIO_ReadPin(BUTTON_GPIO_Port, BUTTON_Pin))
Transmission_function();
I changed it to this:
void EXTI0_IRQHandler(void)
{
/* USER CODE BEGIN EXTI0_IRQn 0 */
if(Appli_state == APPLICATION_READY){
USBH_CDC_Transmit(&hUsbHostHS, Buffer, 3);
}
/* USER CODE END EXTI0_IRQn 0 */
HAL_GPIO_EXTI_IRQHandler(GPIO_PIN_0);
/* USER CODE BEGIN EXTI0_IRQn 1 */
/* USER CODE END EXTI0_IRQn 1 */
}

u-boot spi initialisation in omap3

I was looking into the SPI driver in U-Boot. Here is a small snippet from omap_spi.c:
void spi_init(void)
{
gpMCSPIRegs = (MCSPI_REGS *)MCSPI_SPI1_IO_BASE;
unsigned long u, n;
/* initialize the multipad and interface clock */
spi_init_spi1();
/* soft reset */
CSP_BITFINS(gpMCSPIRegs->SYSCONFIG, SPI_SYSCONFIG_SOFTRESET, 1);
for (n = 0; n < 100; n++) {
u = CSP_BITFEXT(gpMCSPIRegs->SYSSTATUS,
SPI_SYSSTATUS_RESETDONE);
if (u)
break;
}
...more code
}
And here, in omap_spi.h:
#define CSP_BITFINS(var, bit, val) \
(CSP_BITFCLR(var, bit)); (var |= CSP_BITFVAL(bit, val))
My confusion is that when they do the soft reset, they call this CSP_BITFINS macro. Inside the macro, all they do is manipulate bits and fill structures. Where do they access the hardware registers to configure them?
If you look further, you'll find that the pointer they are using, gpMCSPIRegs, is volatile and points at the memory-mapped hardware registers. The bits they are setting and clearing are in the hardware registers themselves.
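To make this concrete, here is a minimal sketch of the same idea. It is not the U-Boot code: `demo_regs_t`, `fake_peripheral`, `DEMO_SOFTRESET_BIT`, `demo_bit_insert` and `demo_soft_reset` are hypothetical names, and a static struct stands in for the peripheral so the sketch can run on a host. In the driver, the pointer is instead set from a fixed address, as in `gpMCSPIRegs = (MCSPI_REGS *)MCSPI_SPI1_IO_BASE;`.

```c
#include <stdint.h>

/* Hypothetical register block; the real MCSPI_REGS layout lives in omap_spi.h. */
typedef struct {
    volatile uint32_t SYSCONFIG;
    volatile uint32_t SYSSTATUS;
} demo_regs_t;

/* ASSUMPTION: ordinary memory stands in for the peripheral. On hardware the
 * pointer would hold the peripheral's base address instead. */
static demo_regs_t fake_peripheral;
static demo_regs_t *const regs = &fake_peripheral;

#define DEMO_SOFTRESET_BIT 1u /* hypothetical bit position */

/* Read-modify-write of a single bit. Because the target is volatile, the load
 * and the store below are real bus accesses, not cached copies. */
static void demo_bit_insert(volatile uint32_t *reg, unsigned bit, uint32_t val)
{
    uint32_t tmp = *reg;          /* load from the register */
    tmp &= ~(1u << bit);          /* clear the field */
    tmp |= (val & 1u) << bit;     /* insert the new value */
    *reg = tmp;                   /* store back to the register */
}

/* "Filling a structure member" here is the hardware access. */
static void demo_soft_reset(void)
{
    demo_bit_insert(&regs->SYSCONFIG, DEMO_SOFTRESET_BIT, 1u);
}
```

When the struct is overlaid on the peripheral's address range, writing a member is a store to the device, which is why the macro needs no separate "write to hardware" step.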