My microcontroller (MM32Spin05PT) guarantees 10,000 erase cycles.
Suppose a flash page is already erased, and I erase it again. Will the number of remaining guaranteed erase cycles go down?
In other words, if I erase a page that is already erased, does that still reduce the life of the flash memory?
If yes, I should probably write code that first checks whether the flash page contains any data, and only erase it if it does.
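For reference, a minimal sketch of such a blank check, assuming a hypothetical page size and vendor erase routine (the names here are placeholders, not the actual MM32 SDK API):

```c
#include <stdbool.h>
#include <stdint.h>

#define FLASH_PAGE_SIZE  1024u  /* assumed page size - check the datasheet */

extern void flash_erase_page(uint32_t page_addr);  /* placeholder vendor routine */

/* Return true if every word in the page still reads as erased (all ones). */
static bool flash_page_is_blank(uint32_t page_addr)
{
    const uint32_t *p = (const uint32_t *)page_addr;
    for (uint32_t i = 0; i < FLASH_PAGE_SIZE / sizeof(uint32_t); i++) {
        if (p[i] != 0xFFFFFFFFu) {
            return false;
        }
    }
    return true;
}

void erase_page_if_needed(uint32_t page_addr)
{
    if (!flash_page_is_blank(page_addr)) {
        flash_erase_page(page_addr);  /* only spend an erase cycle when needed */
    }
}
```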
I'm designing software that manages a configuration file at the application layer in embedded Linux.
Generally, it maintains two copies of the configuration file, one in RAM and one in flash memory. As soon as an end user updates a setting through the UI, the software saves it to the file in RAM and then copies it over to the file in flash memory.
This scheme gives the best robustness, in that the stored state reflects reality within a second. However, it compromises the longevity of the flash memory by writing to it on every change.
To address the longevity issue, I've thought about having a dedicated program do this housekeeping, adding it to crontab so it runs every 30 minutes or so.
(Note: flash memory wears out only during erase cycles; the program only does the housekeeping if the two files differ.)
But if the file in RAM is still waiting for the housekeeping program when the system shuts down unexpectedly, those changes are lost.
So I'm wondering: is there a way to get both longevity and no lost data at the same time? Or am I missing something?
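For illustration, a rough sketch of the housekeeping step described above - compare the RAM copy against the flash copy and rewrite the flash copy only if they differ. The paths are made up, and a real implementation would also want fsync() before the rename to survive power loss:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical paths: working copy on tmpfs (RAM), persistent copy on flash. */
#define RAM_COPY   "/tmp/config.cfg"
#define FLASH_COPY "/data/config.cfg"

/* Read a whole file into a malloc'd buffer; returns its length or -1 on error. */
static long read_file(const char *path, char **buf)
{
    FILE *f = fopen(path, "rb");
    if (!f) return -1;
    fseek(f, 0, SEEK_END);
    long len = ftell(f);
    rewind(f);
    *buf = malloc((size_t)len + 1);
    if (fread(*buf, 1, (size_t)len, f) != (size_t)len) len = -1;
    fclose(f);
    return len;
}

int main(void)
{
    char *ram = NULL, *flash = NULL;
    long ram_len = read_file(RAM_COPY, &ram);
    long flash_len = read_file(FLASH_COPY, &flash);

    /* Only touch the flash filesystem if the contents actually differ. */
    if (ram_len >= 0 &&
        (flash_len != ram_len || memcmp(ram, flash, (size_t)ram_len) != 0)) {
        FILE *out = fopen(FLASH_COPY ".tmp", "wb");
        if (out) {
            fwrite(ram, 1, (size_t)ram_len, out);
            fclose(out);
            rename(FLASH_COPY ".tmp", FLASH_COPY);  /* atomic replace */
        }
    }
    free(ram);
    free(flash);
    return 0;
}
```

Run from crontab, this touches the flash copy only when something has actually changed.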
There are many different reasons why flash can get corrupted: data retention limits over time, erase/write failures primarily caused by erase/write cycle wear, clock inaccuracies, read disturb in the case of NAND flash, and even less likely error sources such as cosmic rays or EMI. There are also, as in your case, algorithm-level problems such as a flash erase/write getting interrupted by power loss or by a reset caused by EMI.
Similarly, there are many ways to deal with these various problems.
CRC16 or CRC32, depending on flash size, is the classic way to deal with pretty much all possible flash errors, particularly data retention, since it most often manifests itself as single-bit errors, which CRC is great at detecting. It should ideally be designed so that the checksum is placed at the end of each erase-size segment. Or, in case the erase size is very small (emulated EEPROM/data flash etc.), maybe a single CRC32 at the end of all data. Modern MCUs often have a CRC hardware peripheral which might be helpful.
Optionally you can let the CRC algorithm repair single bit errors, though this practice is often banned in high integrity systems.
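As a rough sketch of the per-segment layout idea - the segment address and size are made up, and the bitwise CRC-32 loop could be replaced by a table-driven version or a hardware CRC peripheral:

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Bitwise CRC-32 (reflected, polynomial 0xEDB88320): slow but dependency-free. */
static uint32_t crc32(const uint8_t *data, uint32_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    while (len--) {
        crc ^= *data++;
        for (int i = 0; i < 8; i++) {
            crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : crc >> 1;
        }
    }
    return ~crc;
}

/* Verify one erase-size segment whose last 4 bytes hold the stored checksum. */
static bool segment_is_valid(uint32_t seg_addr, uint32_t seg_size)
{
    const uint8_t *seg = (const uint8_t *)seg_addr;
    uint32_t stored;
    memcpy(&stored, seg + seg_size - sizeof stored, sizeof stored);
    return crc32(seg, seg_size - sizeof stored) == stored;
}
```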
ECC is used on NAND flash or in high integrity systems. Traditionally done through software (which is quite cumbersome), but lately also available through built-in hardware support, particularly on the "safety/chassis" kind of automotive microcontrollers. If you wish to use ECC then I highly recommend picking a part with such built-in support, then it can be used to replace manual CRC (which is somewhat painful to deal with real-time wise).
These parts with hardware ECC may also support a feature with an area where you can write down variables to have the hardware handle writing them to flash in the background, kind of similar to DMA.
Using the flash segment as a FIFO. When storing reasonably small amounts of data in memory with large erase sizes, you can save flash erase/write cycles by only erasing the whole segment once, after which it reads back as "all ones" (0xFFFF...). When writing, you look for the next available chunk of memory which is still "all ones" and write there, even though the same data was previously written just before it. And when reading, you fetch the last written chunk before the "all ones". Only when the whole erase-size segment is used up do you perform an erase and start over from the beginning - the data needs to be kept in RAM while this happens.
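A minimal sketch of that append-only scheme, assuming a fixed record size and that erased flash reads as all ones (the segment address/size and the vendor erase/write routines are placeholders):

```c
#include <stdbool.h>
#include <stdint.h>

#define SEG_ADDR     0x0800F800u  /* hypothetical data segment address */
#define SEG_SIZE     2048u        /* hypothetical erase size */
#define RECORD_SIZE  16u          /* fixed record size, multiple of write granularity */

extern void flash_erase_segment(uint32_t addr);                           /* placeholder */
extern void flash_write(uint32_t addr, const uint8_t *data, uint32_t n);  /* placeholder */

/* True if a record slot still reads as erased (all ones). */
static bool slot_is_free(uint32_t addr)
{
    const uint8_t *slot = (const uint8_t *)addr;
    for (uint32_t i = 0; i < RECORD_SIZE; i++) {
        if (slot[i] != 0xFFu) return false;
    }
    return true;
}

/* Return the address of the most recently written record, or 0 if none yet. */
uint32_t find_last_record(void)
{
    uint32_t last = 0;
    for (uint32_t off = 0; off + RECORD_SIZE <= SEG_SIZE; off += RECORD_SIZE) {
        if (slot_is_free(SEG_ADDR + off)) break;
        last = SEG_ADDR + off;
    }
    return last;
}

/* Append a record; only erase when the whole segment has been used up. */
void append_record(const uint8_t record[RECORD_SIZE])
{
    uint32_t off = 0;
    while (off + RECORD_SIZE <= SEG_SIZE && !slot_is_free(SEG_ADDR + off)) {
        off += RECORD_SIZE;
    }
    if (off + RECORD_SIZE > SEG_SIZE) {   /* segment full: erase and start over */
        flash_erase_segment(SEG_ADDR);
        off = 0;
    }
    flash_write(SEG_ADDR + off, record, RECORD_SIZE);
}
```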
I strongly recommend picking a part with decent data flash though, meaning small erase sizes - so that you don't need to resort to hacks like this.
Mirror segments, where all data is stored in duplicate in two separate segments, are mandatory practice for high integrity systems, though they can also be used to prevent corruption during power loss/unexpected resets and of course flash corruption in general. The idea is to always have at least one segment of intact data at all times, and optionally repair a corrupt one by overwriting it with the correct one at start-up. This also means that one segment must be verified to be correct and complete before the next one is written.
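A simplified sketch of the mirrored update order, assuming each segment image already carries a CRC at its end that a segment_is_valid() check (like the one sketched above) can verify; the segment addresses and flash routines are again placeholders:

```c
#include <stdbool.h>
#include <stdint.h>

#define SEG_A_ADDR  0x0800F000u  /* hypothetical mirror segment A */
#define SEG_B_ADDR  0x0800F800u  /* hypothetical mirror segment B */
#define SEG_SIZE    2048u

extern void flash_erase_segment(uint32_t addr);                           /* placeholder */
extern void flash_write(uint32_t addr, const uint8_t *data, uint32_t n);  /* placeholder */
extern bool segment_is_valid(uint32_t addr, uint32_t size);               /* CRC check */

/* 'image' is a full SEG_SIZE-byte segment image with its CRC already placed
   at the end. Update both mirrors so at least one intact copy exists at all times. */
bool store_mirrored(const uint8_t *image)
{
    /* 1. Rewrite segment A, then verify it before touching segment B. */
    flash_erase_segment(SEG_A_ADDR);
    flash_write(SEG_A_ADDR, image, SEG_SIZE);
    if (!segment_is_valid(SEG_A_ADDR, SEG_SIZE)) {
        return false;  /* segment B still holds the previous good copy */
    }

    /* 2. Only now rewrite segment B; a power loss here still leaves A intact. */
    flash_erase_segment(SEG_B_ADDR);
    flash_write(SEG_B_ADDR, image, SEG_SIZE);
    return segment_is_valid(SEG_B_ADDR, SEG_SIZE);
}
```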
Keep the product cool. This is a hardware solution obviously, but data retention in particular is heavily affected by ambient temperature. The manufacturer normally guarantees some 15-20 years or so up to 85°C, but that might mean 100 years if you keep it at <25°C. As in, whenever possible, avoid mounting your MCU PCB near exhausts, oil coolers, hydraulics, heating elements etc etc.
Mirror segments in combination with CRC and/or ECC are likely the solution you are looking for here. Again, I strongly recommend picking an MCU with dedicated data flash, meaning small erase segments and often far more erase/write cycles, ideally >100k.
I am writing to a micro sd card (SDHC) for an embedded application. The application needs to be able to write very quickly to the card in real time.
I have seen that erasing the memory blocks beforehand makes the write a lot faster. Unfortunately I am struggling to get the erase command (and ACMD23) to work as the driver provided for the development board I am using is not complete.
Is there any way to erase the card by maybe writing an 'erased' value to the memory blocks beforehand? For example, if after erasing a block it becomes 0x12345678 can I just write this value instead to make it erased in order to get around using the erase command? Or is there some other way that the card marks a block as erased?
Thanks
I have tried writing 0xffffffff as the erased value but it has not helped.
I think you're misunderstanding how flash memory works.
Flash memory has blocks which are way bigger than what typical filesystems expect. Additionally, they have a limited number of erase cycles. Hence, the flash controller provides an abstraction that maps virtual sectors to physical blocks.
A sector that is "erased" is not actively erased at all. It's just unmapped, and an empty block (if available) is mapped in its place. In the background, the flash controller shuffles sectors around and erases physical blocks as they become wholly unused.
As you can see, the quality of the flash controller matters here. It's not even a driver issue, typically. The driver just sends the commands; the flash controller executes them. If you need better performance, get a better SD card.
I'm required to do a code verification using CRC. In this case, all I do is pass every byte found in flash memory through an algorithm to calculate the CRC and compare the result to a predefined CRC value.
However, I'm hung up on the idea that the flash memory might change at some point, causing the CRC verification to fail.
Assuming that the code isn't touched again whatsoever, is it possible that flash memory will change during execution? If so, what can cause it to change? And how do I avoid said change?
"Flash memory" only means that it retains its content in the absence of power; flash memory is definitely erasable/reprogrammable. The separate term read-only memory (ROM) means it cannot be altered after the initial write.
But memory doesn't change unless a CPU instruction touches it, it degrades, or it is affected by external factors. Flash memory contents might last ten years unperturbed. Usually it is the number of erase/write cycles that degrades flash memory before age does. High static electrical charges could corrupt flash, but magnetic fields should have little effect.
If you have any influence over the hardware specs, ROM should be considered if that is the primary intent; it has several advantages over flash for this purpose.
Lastly, you note that you will pass "every byte" of memory through your CRC algorithm. If the correct checksum is to be stored within the same memory medium, you have a recursive problem of trying to precompute a checksum that contains its own checksum. In most cases, the valid checksum should be located in some segment of the memory that is not itself subjected to the scan.
In any event, if the memory were to change spontaneously, that is exactly what your code verification is intended to catch; you would want and expect the CRC check to fail, so there is no problem - it is doing its job.
It is mostly not a matter of spontaneously changing, but rather one of not being written correctly in the first instance, or possibly protecting against malicious or accidental tampering. If part of the flash is used for variable non-volatile storage, you would obviously not include that area in your CRC. If you are partitioning the same flash into code space and NV storage, potentially errors in the NV storage code could inadvertently modify the code space, so your CRC protects against that and also external tampering (via JTAG for example).
Flash memory is subject to erase/write cycle endurance, and after a number of cycles some bits may stick "high". Endurance varies between parts from around 10000 to 100000, and is seldom a problem for code storage. Flash memory also has a nominal data retention time; this is normally quoted at 10 years; but that is worst-case extreme condition rating - again your CRC guards against these effects.
It is well known that flash memory has limited write endurance; it is less well known that reads may also have an upper limit, as mentioned in this Flash endurance test's Conclusion (3rd point).
On a microcontroller the code is typically stored in flash and is executed by fetching code words directly from the flash cells (at least this is most commonly so on 8-bit micros; some 32-bit micros might have a small buffer).
Depending on the particular code, it might happen that a location is accessed extremely frequently, for example if the main execution path contains a busy loop such as a wait for an interrupt (say from a timer, synchronizing execution to a fixed interval). This could generate 100K or even more (read) accesses per second on average to a single flash cell (depending on the clock and the particular code).
Could such code actually destroy the cells of the flash underneath it? (Is there any need to be concerned about this particular problem when designing code for microcontrollers, such as part of a system which is meant to operate for years? Of course the flash could be periodically verified by CRC, but that doesn't prevent the system from failing if it happens; it only makes it more likely that the failure happens in a controlled manner.)
Only erasing/writing will affect the memory cells, not reading. You don't need to consider the number of reads when designing the program.
Programmed flash memory does age though, meaning that the value of the cells might not be reliable after a certain number of years. This is known as data retention and depends mainly on temperature. MCU manufacturers typically specify a worst case in years, assuming that the part is kept at the maximum specified ambient temperature.
This is something to consider for products that are expected to live long (> 10 years), particularly in environments where high temperatures can be expected. CRC and/or ECC is a good counter-measure against data retention, although if you do find that a cell has been corrupted, it typically just means that the application should shut down to a non-recoverable safe state.
I know of two techniques to approach this issue:
1) One technique is to set aside a const 32-bit integer variable in the system code. Then calculate a CRC32 checksum of the compiled binary image and insert the checksum into the reserved variable using an ELF editor.
A module in the system software will then calculate a CRC32 over the flash area occupied by the application and compare to the "stored" value.
If you are using GCC, the linker can define a symbol to tell you where the segment stops. This method can detect errors but cannot correct them.
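A rough sketch of that approach with GCC - the section name and the linker-provided symbols are made up and would come from your own linker script, and an external tool is assumed to patch the reserved word after the build:

```c
#include <stdint.h>

/* Hypothetical symbols from the linker script bounding the checked region
   (the region must NOT include the checksum itself). */
extern const uint8_t __app_start[];
extern const uint8_t __app_end[];

/* Reserved slot patched with the image CRC32 by an external ELF editor.
   'volatile' forces a real read so the compiler cannot fold in the initializer. */
volatile const uint32_t app_crc32 __attribute__((section(".app_crc"), used)) = 0xFFFFFFFFu;

/* Dependency-free bitwise CRC-32 (reflected, polynomial 0xEDB88320). */
static uint32_t crc32(const uint8_t *data, uint32_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    while (len--) {
        crc ^= *data++;
        for (int i = 0; i < 8; i++) {
            crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : crc >> 1;
        }
    }
    return ~crc;
}

/* Returns nonzero if the flash image matches the stored checksum. */
int app_image_is_valid(void)
{
    uint32_t len = (uint32_t)(__app_end - __app_start);
    return crc32(__app_start, len) == app_crc32;
}
```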
2) Another technique is to use a microcontroller that supports Flash ECC. TI sells Cortex-R4 MCUs which support Flash ECC (Hercules series).
I doubt that this is a practical concern. The article you cited vaguely asserts that this can happen but with no supporting evidence or quantification of the effect. There is a vague, unsupported and unquantified reference in the introduction:
[...] flash degrades over time from erasing/writing (or even just reading, although that decay is slower) [...]
Then again in the conclusion:
We did not check flash decay for reads, but reading also causes long term decay. It would be interesting to see if we can read a spot enough times to cause failure.
The author may be referring to read-disturbance in NAND flash, but microcontrollers do not use NAND flash for code storage/execution since it is not random-access. Read disturb is not a permanent effect, erasing and re-writing the affected block restores endurance. NAND controllers maintain read counts for blocks and automatically copy and erase blocks as necessary. They also employ ECC to detect and correct errors, and identify "write-worn" areas.
There is the potential for long-term "bit-rot" but I doubt that it is caused specifically by reading rather just ageing.
Most RTOS systems spend the majority of their processing time in a do-nothing idle loop, and run happily 24/7 365 days a year. Some processors support a wait-for-interrupt instruction that halts the CPU in the idle loop, but by no means all, and it is not uncommon not to use such an instruction. Processors with flash accelerators or caches may also prevent continuous rapid fetch from a single location, but again that is by no means all microcontrollers.
I was contemplating the fact that imposing frame-per-second restrictions is not ideal with regard to latency and performance, since a monitor still has a chance to show something sooner (assuming no vertical sync, which is mainly for stability).
However, it occurred to me that this might have ramifications for audio subsystems (and also input devices), assuming the frames-per-second cap also governs the global loop of the engine itself (the case in almost all 3D accelerated applications).
Do audio cards/audio devices on an operating system have a concept of "Hertz" that might be related? Do we assume the faster the global loop of the application the better the latency for the audio subsystems?
Audio is not typically affected at all. The audio has a separate buffer which you fill up in advance, maybe a hundred milliseconds or more, and the sound card plays back from that regardless of what your game loop is doing.
It is possible, if you fill that buffer in your game loop, that taking too long to get back to the game loop will result in the sound buffer being empty and a looping sound being heard. To avoid this, developers will either use a big buffer or fill the buffer from a background thread.
As you might guess, audio is already running at some sort of latency, proportional to the amount of data in the buffer that the game attempts to keep in there at all times. This is usually not so noticeable since sound takes a non-negligible time to travel in real life anyway. Pro audio applications have to keep this buffer small for low latency and responsiveness, but they don't have graphical frames to worry about...
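A very rough sketch of the background-thread approach mentioned above, in C with pthreads; the mixer and sound-card submit calls are made-up placeholders for whatever audio API is actually in use:

```c
#include <pthread.h>
#include <stdint.h>

#define CHUNK_FRAMES 256u  /* small chunks keep the added audio latency bounded */

/* Placeholders for the game's mixer and the audio API's (blocking) submit call. */
extern void mix_next_chunk(int16_t *samples, unsigned frames);
extern void submit_to_sound_card(const int16_t *samples, unsigned frames);

/* Background thread: keeps the sound card's buffer topped up no matter how
   long the game loop takes to produce its next video frame. */
static void *audio_thread(void *arg)
{
    static int16_t chunk[CHUNK_FRAMES * 2];  /* interleaved stereo samples */
    (void)arg;
    for (;;) {
        mix_next_chunk(chunk, CHUNK_FRAMES);
        submit_to_sound_card(chunk, CHUNK_FRAMES);  /* blocks until the device
                                                       has room, pacing this loop */
    }
    return NULL;
}

void start_audio(void)
{
    pthread_t t;
    pthread_create(&t, NULL, audio_thread, NULL);
}
```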
As far as input, yes, it is often affected. If a game does not decouple the rendering rate from the input handling rate, then there will be some additional latency on the way in. There will always be some additional perceived latency on the way out too, if you consider input latency as the length of time between performing an action and seeing it take effect. But that perceived latency may well be larger than the simulation latency, since the affected entity may have been altered at an earlier timestep than the one displayed in the next frame.
Typically the game processing and the display aren't coupled, so reducing the FPS of the display won't affect the central game processing, which would include things like input (and presumably audio).
This article explains it pretty well, including different options for having FPS linked to game speed or not.