I would like to have a dynamic Memory map, like example to have flash spliced in 5 sections and according to a define in some file .h to set a proper memory map. But have some problems to do it :)
So this region would be dynamic allocated by defines in some .h
MEMORY
{
if SOME_DEFINE == PART0
rom (rx) : ORIGIN = 0x00400000, LENGTH = 0x00040000 /* flash, 256K */
ram (rwx) : ORIGIN = 0x20000000, LENGTH = 0x00006000 /* sram, 24K */
else
rom (rx) : ORIGIN = 0x00400000, LENGTH = 0x00040000 /* flash, 256K */
ram (rwx) : ORIGIN = 0x20000000, LENGTH = 0x00006000 /* sram, 24K */
endif
}
I've addressed a similar need before using variables:
Define a master linker script, looking something like this:
$ head common_layout.ld
/* You can do something like this for optional sections */
CCFG_ORIGIN = DEFINED(CCFG_ORIGIN) ? CCFG_ORIGIN : 0;
CCFG_LENGTH = DEFINED(CCFG_LENGTH) ? CCFG_LENGTH : 0;
MEMORY
{
rom (rx) : ORIGIN = ROM_ORIGIN, LENGTH = ROM_LENGTH
ccfg (rx) : ORIGIN = CCFG_ORIGIN, LENGTH = CCFG_LENGTH
ram (rwx) : ORIGIN = RAM_ORIGIN, LENGTH = RAM_LENGTH
}
Then, for each chip you're dealing with, you can create a file with specifics for that chip (or have your build system create a temp file on the fly if it's really that dynamic):
$ cat chip_layout.ld
/* Memory Spaces Definitions */
ROM_ORIGIN = 0x00010000; /* Bootloader is at 0x0000 */
ROM_LENGTH = 0x00020000;
RAM_ORIGIN = 0x20000000;
RAM_LENGTH = 0x00020000;
Then point your build tool to something that stitches them together, i.e. gcc -Tlayout.ld ...
$ cat layout.ld
INCLUDE ./chip_layout.ld
INCLUDE ../kernel_layout.ld
Related
I have been looking at some linker scripts for embedded ARM processors. In one of them, there's something like this (minimal example):
MEMORY {
REGION : ORIGIN = 0x1000, LENGTH = 0x1000
}
SECTIONS {
.text : {
/* ... */
. = 0x20;
/* ... */
} > MEMORY
}
This linker script states that the section .text should go in the memory region REGION, which starts at 0x1000. However, within the section contents, the location is explicitly set to 0x20.
Is this location assignment relative to the start of the region that the section is in? Or absolute? In general, how do regions and location assignments work together?
I did a test. I created an assembly file with the following contents:
.text
.word 0x1234
Then I wrote a basic linker script as detailed in the question:
MEMORY {
REGION : ORIGIN = 0x100, LENGTH = 0x100
}
SECTIONS {
.text : {
. = 0x20;
*(.text);
} > REGION
}
I compiled the assembly file to an object file with GCC, then linked the object file into an "executable" with ld. Running objdump -s on the result, I found that 0x1234 was at address 0x120. This means that the location assignment is relative to the start of the memory region.
Since two days I am trying to make printf\sprintf working in my project...
MCU: STM32F722RETx
I tried to use newLib, heap3, heap4, etc, etc. nothing works. HardFault_Handler is run evry time.
Now I am trying to use simple implementation from this link and still the same problem. I suppose my device has some problem with double numbers, becouse program run HardFault_Handler from this line if (value != value) in _ftoa function.( what is strange because this stm32 support FPU)
Do you guys have any idea? (Now I am using heap_4.c)
My compiller options:
target_compile_options(${PROJ_NAME} PUBLIC
$<$<COMPILE_LANGUAGE:CXX>:
-std=c++14
>
-mcpu=cortex-m7
-mthumb
-mfpu=fpv5-d16
-mfloat-abi=hard
-Wall
-ffunction-sections
-fdata-sections
-O1 -g
-DLV_CONF_INCLUDE_SIMPLE
)
Linker options:
target_link_options(${PROJ_NAME} PUBLIC
${LINKER_OPTION} ${LINKER_SCRIPT}
-mcpu=cortex-m7
-mthumb
-mfloat-abi=hard
-mfpu=fpv5-sp-d16
-specs=nosys.specs
-specs=nano.specs
# -Wl,--wrap,malloc
# -Wl,--wrap,_malloc_r
-u_printf_float
-u_sprintf_float
)
Linker script:
/* Highest address of the user mode stack */
_estack = 0x20040000; /* end of RAM */
/* Generate a link error if heap and stack don't fit into RAM */
_Min_Heap_Size = 0x200; /* required amount of heap */
_Min_Stack_Size = 0x400; /* required amount of stack */
/* Specify the memory areas */
MEMORY
{
RAM (xrw) : ORIGIN = 0x20000000, LENGTH = 256K
FLASH (rx) : ORIGIN = 0x08000000, LENGTH = 512K
}
UPDATE:
I don't think so it is stack problem, I have set configCHECK_FOR_STACK_OVERFLOW to 2, but hook function is never called. I found strange think: This soulution works:
float d = 23.5f;
char buffer[20];
sprintf(buffer, "temp %f", 23.5f);
but this solution not:
float d = 23.5f;
char buffer[20];
sprintf(buffer, "temp %f",d);
No idea why passing variable by copy, generate a HardFault_Handler...
You can implement a hard fault handler that at least will provide you with the SP location to where the issue is occurring. This should provide more insight.
https://www.freertos.org/Debugging-Hard-Faults-On-Cortex-M-Microcontrollers.html
It should let you know if your issue is due to a floating point error within the MCU or if it is due to a branching error possibly caused by some linking problem
I also had error with printf when using FreeRTOS for my SiFive HiFive Rev B.
To solve it, I rewrite _fstat and _write functions to change output function of printf
/*
* Retarget functions for printf()
*/
#include <errno.h>
#include <sys/stat.h>
int _fstat (int file, struct stat * st) {
errno = -ENOSYS;
return -1;
}
int _write (int file, char * ptr, int len) {
extern int uart_putc(int c);
int i;
/* Turn character to capital letter and output to UART port */
for (i = 0; i < len; i++) uart_putc((int)*ptr++);
return 0;
}
And create another uart_putc function for UART0 of SiFive HiFive Rev B hardware:
void uart_putc(int c)
{
#define uart0_txdata (*(volatile uint32_t*)(0x10013000)) // uart0 txdata register
#define UART_TXFULL (1 << 31) // uart0 txdata flag
while ((uart0_txdata & UART_TXFULL) != 0) { }
uart0_txdata = c;
}
The newlib C-runtime library (used in many embedded tool chains) internally uses it's own malloc-family routines. newlib maintains some internal buffers and requires some support for thread-safety:
http://www.nadler.com/embedded/newlibAndFreeRTOS.html
hard fault can caused by unaligned Memory Access:
https://www.keil.com/support/docs/3777.htm
I have several OVERRUN errors on UART peripheral because I keep receiving UART data while my code is stall because I'm executing a write operation on flash.
I'm using interrupts for UART and has it is explained on Application Note AN3969 :
EEPROM emulation firmware runs from the internal Flash, thus access to
the Flash will be stalled during operations requiring Flash erase or
programming (EEPROM initialization, variable update or page erase). As
a consequence, the application code is not executed and the interrupt
can not be served.
This behavior may be acceptable for many applications, however for
applications with realtime constraints, you need to run the critical
processes from the internal RAM.
In this case:
Relocate the vector table in the internal RAM.
Execute all critical processes and interrupt service routines from the internal RAM. The compiler provides a keyword to declare functions
as a RAM function; the function is copied from the Flash to the RAM at
system startup just like any initialized variable. It is important to
note that for a RAM function, all used variable(s) and called
function(s) should be within the RAM.
So I've search on the internet and found AN4808 which provides examples on how to keep the interrupts running while flash operations.
I went ahead and modified my code :
Linker script : Added vector table to SRAM and define a .ramfunc section
/* stm32f417.dld */
ENTRY(Reset_Handler)
MEMORY
{
ccmram(xrw) : ORIGIN = 0x10000000, LENGTH = 64k
sram : ORIGIN = 0x20000000, LENGTH = 112k
eeprom_default : ORIGIN = 0x08004008, LENGTH = 16376
eeprom_s1 : ORIGIN = 0x08008000, LENGTH = 16k
eeprom_s2 : ORIGIN = 0x0800C000, LENGTH = 16k
flash_unused : ORIGIN = 0x08010000, LENGTH = 64k
flash : ORIGIN = 0x08020000, LENGTH = 896k
}
_end_stack = 0x2001BFF0;
SECTIONS
{
. = ORIGIN(eeprom_default);
.eeprom_data :
{
*(.eeprom_data)
} >eeprom_default
. = ORIGIN(flash);
.vectors :
{
_load_vector = LOADADDR(.vectors);
_start_vector = .;
*(.vectors)
_end_vector = .;
} >sram AT >flash
.text :
{
*(.text)
*(.rodata)
*(.rodata*)
_end_text = .;
} >flash
.data :
{
_load_data = LOADADDR(.data);
. = ALIGN(4);
_start_data = .;
*(.data)
} >sram AT >flash
.ramfunc :
{
. = ALIGN(4);
*(.ramfunc)
*(.ramfunc.*)
. = ALIGN(4);
_end_data = .;
} >sram AT >flash
.ccmram :
{
_load_ccmram = LOADADDR(.ccmram);
. = ALIGN(4);
_start_ccmram = .;
*(.ccmram)
*(.ccmram*)
. = ALIGN(4);
_end_ccmram = .;
} > ccmram AT >flash
.bss :
{
_start_bss = .;
*(.bss)
_end_bss = .;
} >sram
. = ALIGN(4);
_start_stack = .;
}
_end = .;
PROVIDE(end = .);
Reset Handler : Added vector table copy SRAM and define a .ramfunc section
void Reset_Handler(void)
{
unsigned int *src, *dst;
/* Copy vector table from flash to RAM */
src = &_load_vector;
dst = &_start_vector;
while (dst < &_end_vector)
*dst++ = *src++;
/* Copy data section from flash to RAM */
src = &_load_data;
dst = &_start_data;
while (dst < &_end_data)
*dst++ = *src++;
/* Copy data section from flash to CCRAM */
src = &_load_ccmram;
dst = &_start_ccmram;
while (dst < &_end_ccmram)
*dst++ = *src++;
/* Clear the bss section */
dst = &_start_bss;
while (dst < &_end_bss)
*dst++ = 0;
SystemInit();
SystemCoreClockUpdate();
RCC->AHB1ENR = 0xFFFFFFFF;
RCC->AHB2ENR = 0xFFFFFFFF;
RCC->AHB3ENR = 0xFFFFFFFF;
RCC->APB1ENR = 0xFFFFFFFF;
RCC->APB2ENR = 0xFFFFFFFF;
RCC->AHB1ENR |= RCC_AHB1ENR_GPIOAEN;
RCC->AHB1ENR |= RCC_AHB1ENR_GPIOBEN;
RCC->AHB1ENR |= RCC_AHB1ENR_GPIOCEN;
RCC->AHB1ENR |= RCC_AHB1ENR_GPIODEN;
RCC->AHB1ENR |= RCC_AHB1ENR_GPIOEEN;
RCC->AHB1ENR |= RCC_AHB1ENR_GPIOFEN;
RCC->AHB1ENR |= RCC_AHB1ENR_GPIOGEN;
RCC->AHB1ENR |= RCC_AHB1ENR_GPIOHEN;
RCC->AHB1ENR |= RCC_AHB1ENR_GPIOIEN;
RCC->AHB1ENR |= RCC_AHB1ENR_CCMDATARAMEN;
main();
while(1);
}
system_stm32f4xxx.c : Uncommented VECT_TAB_SRAM define
/*!< Uncomment the following line if you need to relocate your vector Table in
Internal SRAM. */
#define VECT_TAB_SRAM
#define VECT_TAB_OFFSET 0x00 /*!< Vector Table base offset field.
This value must be a multiple of 0x200. */
Added a definition of RAMFUNC to set section attributes :
#define RAMFUNC __attribute__ ((section (".ramfunc")))
Addded RAMFUNC before UART related function and prototypes so it gets run from RAM.
RAMFUNC void USART1_IRQHandler(void)
{
uint32_t sr = USART1->SR;
USART1->SR & USART_SR_ORE ? GPIO_SET(LED_ERROR_PORT, LED_ERROR_PIN_bp):GPIO_CLR(LED_ERROR_PORT, LED_ERROR_PIN_bp);
if(sr & USART_SR_TXE)
{
if(uart_1_send_write_pos != uart_1_send_read_pos)
{
USART1->DR = uart_1_send_buffer[uart_1_send_read_pos];
uart_1_send_read_pos = (uart_1_send_read_pos + 1) % USART_1_SEND_BUF_SIZE;
}
else
{
USART1->CR1 &= ~USART_CR1_TXEIE;
}
}
if(sr & (USART_SR_RXNE | USART_SR_ORE))
{
USART1->SR &= ~(USART_SR_RXNE | USART_SR_ORE);
uint8_t byte = USART1->DR;
uart_1_recv_buffer[uart_1_recv_write_pos] = byte;
uart_1_recv_write_pos = (uart_1_recv_write_pos + 1) % USART_1_RECV_BUF_SIZE;
}
}
My target runs properly with vector table and UART function in RAM but I still I get an overrun on USART. I'm also not disabling interrupts when performing the flash write operation.
I also tried to run code from CCM RAM instead of SRAM but has I saw on this post code can't be executed on CCM RAM on STMF32F4XX...
Any idea ? Thanks.
Any attempt to read from flash while a write operation is ongoing causes the bus to stall.
In order to not be blocked by flash writes, I think not only the the interrupt code, but the interrupted function has to run from RAM too, otherwise the core cannot proceed to a state when interrupts are possible.
Try relocating the flash handling code to RAM.
If it's possible, I'd advise switching to an MCU with two independent banks of flash memory, like the pin- and software-compatible 427/429/437/439 series. You can dedicate one bank to program code and the other to EEPROM-like data storage, then writing the second bank won't disturb code running from the first bank.
As suggested, it might be necessary to execute code from RAM; or, rather, make sure that no flash read operations are performed while the write is in progress.
To test, you might want to compile the entire executable for RAM, rather than flash (i.e., place everything into RAM and not use the flash at all).
You could then use gdb to load the binary and start execution... test your uart and make sure it is working as expected. At least this way you can be sure the flash is unused.
Some micros have READ WHILE WRITE sections that do not have a problem performing multiple operations simultaneously.
I have an image of dimensions 4096*4096 (so 67108864 bytes, since there are 4 channels) that I want to copy from a staging buffer to a device local image. The buffer already has the data stored and I have set up the image barriers properly, so now I want to perform the copy operation... Except it doesn't work. The validation layers give me this error message when I call vkCmdCopyBufferToImage() -
IMAGE(ERROR): object: 0x0 type: 6 location: 3903 msgCode: 417333590: vkCmdCopyBufferToImage(): pRegion[0] exceeds buffer size of 67108864 bytes. The spec valid usage text states 'The buffer region specified by each element of pRegions mustbe a region that is contained within srcBuffer' (https://www.khronos.org/registry/vulkan/specs/1.0/html/vkspec.html#VUID-vkCmdCopyBufferToImage-pRegions-00171).
I can't find anything wrong with the values that I gave it though. The VkBufferImageCopy struct I passed to it looks like this-
VkBufferImageCopy bufImgCopy;
bufImgCopy.bufferOffset = 0;
bufImgCopy.bufferImageHeight = 0;
bufImgCopy.bufferRowLength = 0;
bufImgCopy.imageExtent = modelTexture.imgExtents; // 4096 * 4096 * 1
bufImgCopy.imageOffset = {0, 0, 0};
bufImgCopy.imageSubresource.aspectMask = modelTexture.subResource.aspectMask; // Colour attachment
bufImgCopy.imageSubresource.baseArrayLayer = modelTexture.subResource.baseArrayLayer; // 0
bufImgCopy.imageSubresource.layerCount = VK_REMAINING_ARRAY_LAYERS;
bufImgCopy.imageSubresource.mipLevel = 0;
I can't figure out why the api thinks the struct is specifying a size greater than the buffer size. The format of the image is VK_FORMAT_B8G8R8A8_UNORM.
EDIT
Here's the code that sets up the staging buffer-
stageBuf.usage = VK_BUFFER_USAGE_TRANSFER_SRC_BIT;
stageBuf.shareMode = VK_SHARING_MODE_EXCLUSIVE;
stageBuf.bufSize = static_cast<VkDeviceSize>(verts.size() * sizeof(vert) + indices.size() * sizeof(u32)) > modelImage.size ? static_cast<VkDeviceSize>(verts.size() * sizeof(vert) + indices.size() * sizeof(u32)) : modelImage.size;
// filled from the previous struct.
VkBufferCreateInfo info;
info.sType = VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO;
info.pNext = nullptr;
info.flags = 0;
info.queueFamilyIndexCount = bufInfo.qFCount;
info.pQueueFamilyIndices = bufInfo.qFIndices;
info.usage = bufInfo.usage;
info.sharingMode = bufInfo.shareMode;
info.size = bufInfo.bufSize;
if (vkCreateBuffer(device, &info, nullptr, &(bufInfo.buf)) != VK_SUCCESS)
{ //...
VkMemoryRequirements memReqs;
vkGetBufferMemoryRequirements(device, buf, &memReqs);
for (u32 type = 0; type < memProps.memoryTypeCount; ++type)
if ((memReqs.memoryTypeBits & (1 << type)) &&
((memProps.memoryTypes[type].propertyFlags & memFlags) == memFlags)) // The usual things to set buffers up.
{
VkMemoryAllocateInfo info;
info.sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO;
info.pNext = nullptr;
info.allocationSize = memReqs.size;
info.memoryTypeIndex = type;
if (vkAllocateMemory(device, &info, nullptr, &mem.memory) == VK_SUCCESS)
{ //....
// All this works perfectly except for the texture copy.
if (vkBindBufferMemory(device, buf, mem.memory, mem.offset) != VK_SUCCESS)
{ //...
I'm using this staging buffer for both the vertex and index buffers (which I have taken as a single buffer with offsets) as well as the image which I'm trying to copy to. The memory allocated is according to the size of the largest data structure.
As noted in the comments. Using VK_REMAINING_ARRAY_LAYERS is invalid for the layerCount of VkImageSubresourceRange, so you have to explicitly set the layerCount to the actual number of layers to copy.
I am trying to understand fcntl system call , which has this structure as the second parameter:
struct flock fl;
int fd;
fl.l_type = F_WRLCK; /* F_RDLCK, F_WRLCK, F_UNLCK */
fl.l_whence = SEEK_SET; /* SEEK_SET, SEEK_CUR, SEEK_END */
fl.l_start = 0; /* Offset from l_whence */
fl.l_len = 0; /* length, 0 = to EOF */
fl.l_pid = getpid(); /* our PID
I want to know what exactly is fl.l_start
so for example if I make it fl.l_start = 5, does that mean that protection is guaranteed only beyond 5 bytes from the beginning?
(i.e before 5 bytes, no lock is there so, all the processes can have access to it in a non-cooperative manner.)