I was looking into spi driver in u boot , here is a small snippet from
omap_spi.c
void spi_init(void)
{
gpMCSPIRegs = (MCSPI_REGS *)MCSPI_SPI1_IO_BASE;
unsigned long u, n;
/* initialize the multipad and interface clock */
spi_init_spi1();
/* soft reset */
CSP_BITFINS(gpMCSPIRegs->SYSCONFIG, SPI_SYSCONFIG_SOFTRESET, 1);
for (n = 0; n < 100; n++) {
u = CSP_BITFEXT(gpMCSPIRegs->SYSSTATUS,
SPI_SYSSTATUS_RESETDONE);
if (u)
break;
}
...more code
}
here in
omap_spi.h
#define CSP_BITFINS(var, bit, val) \
(CSP_BITFCLR(var, bit)); (var |= CSP_BITFVAL(bit, val))
my confusion here is that when they do soft reset , they call this CSP_BITFINS macro. inside this macro all they do is just manipulate bits and fill structures. where do they access that hardware registers to configure ?
If you look further, you'll find that the pointer they are using, gpMCSPIRregs, is volatile and pointing at the memory-mapper hardware registers. The bits they are setting/clearing are in the hardware registers.
Related
I've read some of the common write policies in the microarchitecture of GPUs. For most of the GPU the written policy is the same as the below picture (the picture is from the gpgpu-sim manual). based on the below picture I have a question. can we have dirty data on the l1 cache?
The L1 on some GPU architectures is a write-back cache for global accesses. Note that this topic varies by GPU architecture, e.g. for whether global activity is cached in L1.
Speaking generally, then, yes you can have dirty data. By this I mean that the data in the L1 cache is modified (compared to what is otherwise in global space or the L2 cache) and it has not yet been "flushed" or updated into the L2 cache. (You can also have "stale" data - data in the L1 that has not been modified, but is not consistent with the L2.)
We can create a simple proof point for this (dirty data).
The following code, when executed on a cc7.0 device (and probably some other archtectures as well) will not give the expected answer of 1024.
This is due to the fact that the L1, which is a separate entity per SM, is not immediately flushed to the L2. It therefore has "dirty data" by the above definition.
(The code is broken for this reason. Don't use this code. It's just a proof point.)
#include <iostream>
#include <cuda_runtime.h>
constexpr int num_blocks = 1024;
constexpr int num_threads = 32;
struct Lock {
int *locked;
Lock() {
int init = 0;
cudaMalloc(&locked, sizeof(int));
cudaMemcpy(locked, &init, sizeof(int), cudaMemcpyHostToDevice);
}
~Lock() {
if (locked) cudaFree(locked);
locked = NULL;
}
__device__ __forceinline__ void acquire_lock() {
while (atomicCAS(locked, 0, 1) != 0);
}
__device__ __forceinline__ void unlock() {
atomicExch(locked, 0);
}
};
__global__ void counter(Lock lock, int *total) {
if (threadIdx.x == 1) {
lock.acquire_lock();
*total = *total + 1;
// __threadfence(); uncomment this line to fix
lock.unlock();
}
}
int main() {
int *total_dev;
cudaMalloc(&total_dev, sizeof(int));
int total_host = 0;
cudaMemcpy(total_dev, &total_host, sizeof(int), cudaMemcpyHostToDevice);
{
Lock lock;
counter<<<num_blocks, num_threads>>>(lock, total_dev);
cudaDeviceSynchronize();
cudaMemcpy(&total_host, total_dev, sizeof(int), cudaMemcpyDeviceToHost);
std::cout << total_host << std::endl;
}
cudaFree(total_dev);
}
In case there is any further doubt about whether this is a proper proof (e.g. to dispel arguments about things being "optimized into a register" etc.) we can study the resultant sass code. The end of the above kernel has code that looks like this:
/*0130*/ LDG.E.SYS R0, [R4] ; /* 0x0000000004007381 */
// load *total /* 0x000ea400001ee900 */
/*0140*/ IADD3 R7, R0, 0x1, RZ ; /* 0x0000000100077810 */
// add 1 /* 0x004fd00007ffe0ff */
/*0150*/ STG.E.SYS [R4], R7 ; /* 0x0000000704007386 */
// store *total /* 0x000fe8000010e900 */
/*0160*/ ATOMG.E.EXCH.STRONG.GPU PT, RZ, [R2], RZ ; /* 0x000000ff02ff73a8 */
//lock.unlock /* 0x000fe200041f41ff */
/*0170*/ EXIT ;
Since the result register has definitely been stored to the global space, we can infer that if another thread (in another SM) reads an unexpected value in global space for *total it must be due to the fact that the store from another SM has not reached the L2, i.e. has not reached device-wide consistency/coherency. Therefore the data in some other SM is "dirty". We can (presumably) rule out the "stale" case here (the data in the other L1 was written, but I have "old" data in my L1) because the global load indicated above does not happen until the lock is acquired in the SM.
Note that the above code "fails" on cc7.0 devices (and probably some other device architectures). It does not necessarily fail on the GPU you are using. But it is still "broken".
I'm working on Keil software and using LM3S316 microcontroller. Usually we address registers in microcontrollers in form of:
#define GPIO_PORTC_DATA_R (*((volatile uint32_t *)0x400063FC))
My question is how can I access to single pin of register for example, if I have this method:
char process_key(int a)
{ PC_0 = a ;}
How can I get PC_0 and how to define it?
Thank you
Given say:
#define PIN0 (1u<<0)
#define PIN1 (1u<<1)
#define PIN2 (1u<<2)
// etc...
Then:
char process_key(int a)
{
if( a != 0 )
{
// Set bit
GPIO_PORTC_DATA_R |= PIN0 ;
}
else
{
// Clear bit
GPIO_PORTC_DATA_R &= ~PIN0 ;
}
}
A generalisation of this idiomatic technique is presented at How do you set, clear, and toggle a single bit?
However the read-modify-write implied by |= / &= can be problematic if the register might be accessed in different thread/interrupt contexts, as well as adding a possibly undesirable overhead. Cortex-M3/4 parts have a feature known as bit-banding that allows individual bits to be addressed directly and atomically. Given:
volatile uint32_t* getBitBandAddress( volatile const void* address, int bit )
{
__IO uint32_t* bit_address = 0;
uint32_t addr = reinterpret_cast<uint32_t>(address);
// This bit maniplation makes the function valid for RAM
// and Peripheral bitband regions
uint32_t word_band_base = addr & 0xf0000000u;
uint32_t bit_band_base = word_band_base | 0x02000000u;
uint32_t offset = addr - word_band_base;
// Calculate bit band address
bit_address = reinterpret_cast<__IO uint32_t*>(bit_band_base + (offset * 32u) + (static_cast<uint32_t>(bit) * 4u));
return bit_address ;
}
Then you can have:
char process_key(int a)
{
static volatile uint32_t* PC0_BB_ADDR = getBitBandAddress( &GPIO_PORTC_DATA_R, 0 ) ;
*PC0_BB_ADDR = a ;
}
You could of course determine and hard-code the bit-band address; for example:
#define PC0 (*((volatile uint32_t *)0x420C7F88u))
Then:
char process_key(int a)
{
PC0 = a ;
}
Details of the bit-band address calculation can be found ARM Cortex-M Technical Reference Manual, and there is an on-line calculator here.
I am making HID for some data acquisition system. There are a lot of sensors who store test data and when I need I get to them and connect via USB and take it. USB host sent 3 bytes and USB device, if bytes are correct, sends its stored data. Sounds simple.
Previously it was implemented on PC, but now I try to implement it on STM32F769 Discovery and have some serious problems.
I am using ARM Keil 5.27, code generated with STM32CubeMX 5.3.0. I tried just to make a plain simple program, later to integrate with the entire touchscreen interface. I tried to implement this code in main:
if (HAL_GPIO_ReadPin(BUTTON_GPIO_Port, BUTTON_Pin))
while (HAL_GPIO_ReadPin(BUTTON_GPIO_Port, BUTTON_Pin))
{
Transmission_function();
}
And the function itself:
#define DLE 0x10
#define STX 0x2
uint8_t tx_buf[]={DLE, STX, 120}, RX_FLAG;
uint32_t size_tx=sizeof(tx_buf);
void Transmission_function (void)
{
if (Appli_state == APPLICATION_READY)
{
i=0;
USBH_CDC_Transmit(&hUsbHostHS, tx_buf, size_tx);
HAL_Delay(50);
RX_FLAG=0;
}
}
It should send the message after I press the blue button on the Discovery board. All that I get is Hard Fault. While trying to debug, I tried manually to check after which action I get this error and it was functioning in stm32f7xx_ll_usb.c:
HAL_StatusTypeDef USB_WritePacket(USB_OTG_GlobalTypeDef *USBx, uint8_t *src,
uint8_t ch_ep_num, uint16_t len, uint8_t dma)
{
uint32_t USBx_BASE = (uint32_t)USBx;
uint32_t *pSrc = (uint32_t *)src;
uint32_t count32b, i;
if (dma == 0U)
{
count32b = ((uint32_t)len + 3U) / 4U;
for (i = 0U; i < count32b; i++)
{
USBx_DFIFO((uint32_t)ch_ep_num) = *((__packed uint32_t *)pSrc);
pSrc++;
}
}
return HAL_OK;
}
But trying to scroll back in disassembly I notice, that just before Hard Fault program was in this function inside stm32f7xx_hal_hcd.c, in case GRXSTS_PKTSTS_IN:
static void HCD_RXQLVL_IRQHandler(HCD_HandleTypeDef *hhcd)
{
USB_OTG_GlobalTypeDef *USBx = hhcd->Instance;
uint32_t USBx_BASE = (uint32_t)USBx;
uint32_t pktsts;
uint32_t pktcnt;
uint32_t temp;
uint32_t tmpreg;
uint32_t ch_num;
temp = hhcd->Instance->GRXSTSP;
ch_num = temp & USB_OTG_GRXSTSP_EPNUM;
pktsts = (temp & USB_OTG_GRXSTSP_PKTSTS) >> 17;
pktcnt = (temp & USB_OTG_GRXSTSP_BCNT) >> 4;
switch (pktsts)
{
case GRXSTS_PKTSTS_IN:
/* Read the data into the host buffer. */
if ((pktcnt > 0U) && (hhcd->hc[ch_num].xfer_buff != (void *)0))
{
(void)USB_ReadPacket(hhcd->Instance, hhcd->hc[ch_num].xfer_buff, (uint16_t)pktcnt);
/*manage multiple Xfer */
hhcd->hc[ch_num].xfer_buff += pktcnt;
hhcd->hc[ch_num].xfer_count += pktcnt;
if ((USBx_HC(ch_num)->HCTSIZ & USB_OTG_HCTSIZ_PKTCNT) > 0U)
{
/* re-activate the channel when more packets are expected */
tmpreg = USBx_HC(ch_num)->HCCHAR;
tmpreg &= ~USB_OTG_HCCHAR_CHDIS;
tmpreg |= USB_OTG_HCCHAR_CHENA;
USBx_HC(ch_num)->HCCHAR = tmpreg;
hhcd->hc[ch_num].toggle_in ^= 1U;
}
}
break;
case GRXSTS_PKTSTS_DATA_TOGGLE_ERR:
break;
case GRXSTS_PKTSTS_IN_XFER_COMP:
case GRXSTS_PKTSTS_CH_HALTED:
default:
break;
}
}
Last few lines from Dissasembly shows this:
0x080018B4 E8BD81F0 POP {r4-r8,pc}
0x080018B8 0000 DCW 0x0000
0x080018BA 1FF8 DCW 0x1FF8
Why it fails? How could I fix it? I do not have much experience with USB protocol.
I will post my walkaround this, but I am not sure why it worked. Solution was to use EXTI0 interrupt instead of just detection if PA0 is high, as I showed I used here:
if (HAL_GPIO_ReadPin(BUTTON_GPIO_Port, BUTTON_Pin))
while (HAL_GPIO_ReadPin(BUTTON_GPIO_Port, BUTTON_Pin))
Transmission_function();
I changed it to this:
void EXTI0_IRQHandler(void)
{
/* USER CODE BEGIN EXTI0_IRQn 0 */
if(Appli_state == APPLICATION_READY){
USBH_CDC_Transmit(&hUsbHostHS, Buffer, 3);
}
/* USER CODE END EXTI0_IRQn 0 */
HAL_GPIO_EXTI_IRQHandler(GPIO_PIN_0);
/* USER CODE BEGIN EXTI0_IRQn 1 */
/* USER CODE END EXTI0_IRQn 1 */
}
Can someone help me here to understand about arduino library variable name like what is tempTC, tempCJC, faultOpen, faultShortGND, faultShortVCC,
I am so confused. please help here ! thanks
/*
read_MAX31855.ino
TODO:
Clean up code and comment!!
Also make use of all library functions and make more robust.
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.
http://creativecommons.org/licenses/by-sa/3.0/
*/
#include <MAX31855.h>
// Adruino 1.0 pre-defines these variables
#if ARDUINO < 100
int SCK = 13;
int MISO = 12;
int SS = 10;
#endif
int LED = 9;
// Setup the variables we are going to use.
double tempTC, tempCJC;
bool faultOpen, faultShortGND, faultShortVCC, x;
bool temp_unit = 1; // 0 = Celsius, 1 = Fahrenheit
// Init the MAX31855 library for the chip.
MAX31855 temp(SCK, SS, MISO);
void setup() {
Serial.begin(9600);
pinMode(LED, OUTPUT);
}
void loop() {
x = temp.readMAX31855(&tempTC, &tempCJC, &faultOpen, &faultShortGND, &faultShortVCC, temp_unit);
Serial.print(tempTC);
Serial.print("\t");
Serial.print(tempCJC);
Serial.print("\t");
Serial.print(faultOpen);
Serial.print(faultShortGND);
Serial.println(faultShortVCC);
digitalWrite(LED, HIGH);
delay(500);
digitalWrite(LED, LOW);
delay(500);
}
tempTC: The thermocouple value read from the IC.
tempCJC: The Cold Junction-Compensated temperature value.
faultOpen faultShortGND faultShortVCC: Flags that indicate wether any of these faults occurred while reading the temperature.
All these are done directly in the MAX31855 IC, read via SPI with the library you posted.
I need to write a program to implement real time clock for ARM architecture. example: LPC213x
It should display Hour Minute and Seconds. I have no idea about ARM so having trouble getting started.
My code below is not working
// ...
int main (void) {
int hour=0;
int min=0;
int sec;
init_serial(); /* Init UART */
Initialize();
CCR=0x11;
PCONP=0x1815BE;
ILR=0x1; // Clearing Interrupt
//printf("\nTime is %02d:%02x:%02d",hour,min,sec);
while (1) { /* Loop forever */
}
}
void Initialize()
{
VPBDIV=0x0;
//CCR=0x2;
//ILR=0x3;
HOUR=0x0;
SEC=0x0;
MIN=0x0;
ILR = 0x03;
CCR = (1<<4) | (1<<0);
VICVectAddr13 = (unsigned)read_rtc;
VICVectCntl13 |= 0x20 | VIC_RTC;
VICIntEnable |= (1 << VIC_RTC);
}
/* Interrupt Service Routine*/
__irq void read_rtc()
{
int hour=0;
int min=0;
int sec;
ILR=0x1; // Clearing Interrupt
hour=(CTIME0 & MASKHR)>>16;
min= (CTIME0 & MASKMIN)>>8;
sec=CTIME0 & MASKSEC;
printf("\nTime is %02d:%02x:%02d",hour,min,sec);
//VICVectAddr=0xff;
VICVectAddr = 0;
}
According to this board description for the LPC213x, it is delivered with an example program called "Real-Time Clock - Demonstrates how the real-time clock can be used". This also implies that the board features real-time clock hardware, which is going to make it a lot easier.
I suggest you read up on that program, to figure out how to talk to the RTC hardware. The next step would be to solve the display requirements. The two obvious choices are either 7-segment LED displays, or an LCD.
Both are well-known technologies about which loads have been written, follow the Wikipedia links to find out more.
This is all for the LPC2468. We have a setTime function too, but I don't want to do ALL the work for you. ;) We have custom register files for ease of access, but if you look at the LPC manual, it's obvious where they correlate. You just have to shift values into the right place, and do bitwise operations. For example:
#define RTC_HOUR (*(volatile RTC_HOUR_t *)(RTC_BASE_ADDR + (uint32_t)0x28))
Time struture:
typedef struct {
uint8_t seconds; /* Second value - [0,59] */
uint8_t minutes; /* Minute value - [0,59] */
uint8_t hour; /* Hour value - [0,23] */
uint8_t mDay; /* Day of the month value - [1,31] */
uint8_t month; /* Month value - [1,12] */
uint16_t year; /* Year value - [0,4095] */
uint8_t wDay; /* Day of week value - [0,6] */
uint16_t yDay; /* Day of year value - [1,365] */
} rtcTime_t;
RTC functions:
void rtc_ClockStart(void) {
/* Enable CLOCK into RTC */
scb_ClockStart(M_RTC);
RTC_CCR.B.CLKSRC = 1;
RTC_CCR.B.CLKEN = 1;
return;
}
void rtc_ClockStop(void) {
RTC_CCR.B.CLKEN = 0;
/* Disable CLOCK into RTC */
scb_ClockStop(M_RTC);
return;
}
void rtc_GetTime(rtcTime_t *p_localTime) {
/* Set RTC timer value */
p_localTime->seconds = RTC_SEC.R;
p_localTime->minutes = RTC_MIN.R;
p_localTime->hour = RTC_HOUR.R;
p_localTime->mDay = RTC_DOM.R;
p_localTime->wDay = RTC_DOW.R;
p_localTime->yDay = RTC_DOY.R;
p_localTime->month = RTC_MONTH.R;
p_localTime->year = RTC_YEAR.R;
}
System control block functions:
void scb_ClockStart(module_t module) {
PCONP.R |= (uint32_t)1 << module;
}
void scb_ClockStop(module_t module) {
PCONP.R &= ~((uint32_t)1 << module);
}
If you need information about ARM then this ARM System Developer's Guide: Designing and Optimizing System Software may help you.
We used to do some thing like this for ARM.
#include "LPC21xx.h"
void rtc()
{
*IODIR1 = 0x00FF0000;
// Set LED ports to output
*IOSET1 = 0x00020000;
*PREINT = 0x000001C8;
// Set RTC prescaler for 12.000Mhz Xtal
*PREFRAC = 0x000061C0;
*CCR = 0x01;
*SEC = 0;
*MIN = 0;
*HOUR= 0;
}
A real-time clock (RTC) is a computer clock (most often in the form of an integrated circuit) that keeps track of the current time. Although the term often refers to the devices in personal computers, servers and embedded systems, RTCs are present in almost any electronic device which needs to keep accurate time.
You May refer this two link, i am sure it will give you further understanding :-
1). ARM Cortex Programming using CMSIS:- http://www.firmcodes.com/cmsis/
2). RTC Programming with ARM7:- http://www.firmcodes.com/microcontrollers/arm/real-time-clock-of-arm7-lpc2148/