Determining failing sectors on portable flash memory - hardware

I'm trying to write a program that will detect signs of failure for portable flash memory devices (thumb drives, etc).
I have seen tools in the past that are able to detect failing sectors and other kinds of trouble on conventional mechanical hard drives, but I fear that flash memory does not have the same kind of predictable low-level access to the hardware due to the internal workings of the storage. Things like wear-leveling and other block-remapping techniques (to skip over 'dead' sectors?) lead me to believe that determining if a flash drive is failing will be difficult at best, if not impossible (short of having constant read failures and device unmounts).
Flash drives at their end-of-life should be easy to detect (constant CRC discrepancies during reads and all-out failure). But what about drives that might be failing early? Are there any tell-tale signs like slower throughput speeds that might indicate a flash drive is going to fail much sooner than normal?
Along the lines of detecting potentially bad blocks, I had considered attempting random reads/writes to a file close to or exactly the size of the entire volume, but even then is it possible that the drive might report sizes under its maximum capacity to account for 'dead' blocks?
In short, is there any way to circumvent or at least detect (algorithmically or otherwise) the use of block-remapping or other life extension techniques for flash memory?
Let me end this question by expressing my uncertainty as to whether or not this belongs on serverfault.com . This is definitely a hardware-related question, but I also desire a software solution - preferably one that I can program myself.
If this question is misplaced, I will be happy to migrate it to serverfault - but I do need a programming solution. Please let me know if you need clarification :)
Thanks!

It's interesting if badblocks can help in this case

AFAIK, Wear leveling happens at the firmware level. The hardware does not know about the bad block, till such time the firmware detects one.
And there is no known way to find this bad sectors before hand. BTW, I guess, it is not bad sectors, but bad blocks. Once a sector is bad, the whole block is marked as bad ...

Related

is it recommended to use SPI flash to run code instead internal flash due to memory limitation of internal flash?

We used the LPC546xx family microcontroller in our project, currently, at the initial stage, we are finalizing the software and hardware requirements. The basic firmware size (which contains RTOS, 3rd party stack, library, etc...) currently is 480 KB. Now once full application developed than the size will exceed the internal flash size (512KB) and plus we needed storage which can hold firmware update image separately.
So we planned to use SPI flash (S25LP064A-JBLE, http://www.issi.com/WW/pdf/IS25LP032-064-128.pdf, serial flash memory) of 4MB\8MB to boot and run firmware.
is it recommended to run code from SPI flash? how can I map external flash memory directly to CPU memory space? Can anyone give an example that contains this memory mapping(linker script etc..) or demo application in which LPC546xx uses SPI FLASH?
Generally speaking it's not recommended, or differently put: the closer to the CPU the better. Both S25LP064A and LPC546xx however support XIP, so it is viable.
This is not a trivial issue as many aspects are affecting. I.e. issue is best avoided and should really have been ironed out in the planning stage. Embedded Systems are more about compromising than anything and making the right/better choices takes skill end experience.
Same question with replies on the NXP forum: link
512K of NVRAM is huge. There are almost certainly room for optimisations even if 3'rd party libraries are used.
On a related note this discussion concerning XIP should give valuable insight: link.
I would strongly encourage use of file-systems if not done already, for which external storage is much better suited. The further from the computational unit, the more relevant. That's not XIP and the penalty is copy-to-RAM either way you do it. I.e. performance will be slower. But in my experience, the need for speed has often-times not been thoroughly considered and at least partially greatly overestimated.
Regarding your mentioning of RTOS and FW-upgrade:
Unless it's a poor RTOS there's file-system awareness built in. Especially for FW upgrading (Note: you'll need room for 3 images, factory reset included), unless already supported by the SoC-vendor by some other means (OTA), it will make life much easier and less risky. If there's no FS-awareness, it can be added.
FW upgrade requires a lot of extra storage. More if simpler. Simpler is however also safer which especially for FW upgrades matters hugely. In the simplest case (binary flat image), you'll need at least twice the amount of memory you're already consuming.
All-in-all: I think the direction you're going is viable and depending on the actual situation perhaps your only choice.

Embedded System and Serial Flash wear issues

I am using a serial NOR flash (SPI Based) for my embedded application and also I have to implement a file system over it. That makes my NOR flash more prone to frequent erase and write cycles where having a wear level Algorithm comes into picture. I want to ask few questions regarding the same:
First, is it possible to implement a Wear Level Algorithm for Nor flash, if yes then why most of the time I find the solutions for NAND Flash and not NOR Flash?
Second, are serial SPI based low cost NAND flashes available, if yes then kindly share the part number for the same.
Third, how difficult is it to implement our own Wear Level algorithm?
Fourth, I have also read/heard that industrial grade NOR Flashes have higher erase/write cycles (in millions!!), is this understanding correct? If yes then kindly let me know the details of such SPI NOR Flash, which may also lead to avoiding implementation of wear level algorithm, if not completely then since I'm planning to implement my own wear level algorithm, it might give me a little room and ease in certain areas to implement the wear level algorithm.
The constraint to all these point is cost, I would want to have low cost solution to these issues.
Thanks in Advance
Regards
Aditya Mittal
(mittal.aditya12#gmail.com)
Implementing a wear-levelling algorithm is is not trivial, but not impossible either:
Your wear-levelling driver needs to know when disk blocks are no longer used by the filing system (this is known as TRIM support on modern SSDs). In practice, this means you need to modify your block driver API and filing systems above it, or have the wear-levelling driver aware of the filing system's free-space map. This second option is easy for FAT, but probably patented.
You need to reserve at least an erase-unit + a few allocation units to allow erase unit recycling. Reserving more blocks will increase performance
You'll want a background thread to perform asynchronous erase-unit recycling
You'll need to test, test an test again. When I last built one of these, we built a simulation of both flash and ran the real filing system on top it, and tortured the system for weeks.
There are lots and lots of patents covering aspects of wear-leveling. By the same token, there are two at least two wear-levelling layers in the Linux Kernel.
Given all of this, licensing a third-party library is probably cost-effective,
Atmel/Adesto etc. make those little serial flash chips by the billion. They also have loads of online docs. I suspect that the serial flash beetles don't implement wear-levelling because of cost - the devices they are typically used in are very cheap and tend to have a limited lifetime anyway. Bulk, 4-line NAND flash that is expected to see heavier and lengthy use, (eg. SD cards), have complex, (relatively), built-in controllers that can implement wear-levelling in a transparent manner.
I no longer use one-pin interface serial flash, partly due to the wear issue. An SD-card is cheap enough for me to use and, even if one does break, an on-site technician, (or even the customer), can easily swap it out.
Implementing a wear-levelling algo. is too expensive, both in terms of development time, (especially testing if the device has to support a file system that must not corrupt on power fail etc), and CPU/RAM for me to bother.
If your product is so cost-sensitive that you have to use serial NOR flash, I suggest that you ignore the issue.

Permanent DOS Attacks - Anyone Knowledgeable?

So, I'm looking into Permanent DOS attacks for a class, and I'm having a hard time coming up with concrete examples. There's a lot of information about Phlashing (flashing firmware to either brick the device, or put malicious firmware in its place, for those of you who don't know the term) but I'd like to have a broader set of examples.
That being said, there has to be a way to write code that will do something like wear out disk arms, right? Something that will have the disk seek to the end of the disk, then back to the front, on and on. Anyone have an example of how that would be accomplished? Is there some way to specify where to track to on a disk in C (similar to traversing to a certain point in a file, but for the entire HDD!)? If not, I guess there's always trying to force a file's location on the disk... which seems like less fun trying to accomplish. Again, can you do something like that programmatically?
If anyone has any insight into these types of attacks, or any good resources for me to check into, I'd appreciate it. Maybe you read a story about it on Slashdot a few years back? Let me know! The more info I can gather, the less likely I'll be forced to kill time during my talk by bricking my router in the class :) I'm not made of money OR routers!
Seems like these would primarily be limited to physical attacks and social engineering ("To enable your computer's hidden turbo function, remove the cover and pry this part). But:
Adjust screen refresh rates to insane values to blow older CRTs
Monkey with ACPI fan, charge, or battery controls if possible to cause overheating or battery failure.
Overwrite every rewritable storage device of every kind attached to any bus. Discover and overwrite any IDE, USB, etc... device you know the flash updater details for.
Of course nothing is permanent. You can replace the hard drive, BIOS chips, CPU, motherboard, memory, etc...
Although it is mostly fictional, the halt and catch fire operation would be a very convenient and permanent DOS attack.
Steve Gibson (google his name) has a paper he wrote a few years back about protocol-level vulnerabilities in TCP/IP. Some of it is still pertinent today.
Socially engineer the power company or ISP to turn off service at the location in question.
Many devices in the computer today have their own firmwares, including but not limited to CPU, DVD, HDD, VGA, motherboard (BIOS) etc. Most of these devices also have a way of updating their respective firmwares. Which can also be used to brick them pretty efficiently. Although this does require an individual approach to every device, often using privileged instructions and undocumented interfaces.
It's possible for a virus to do this. I seem to recall an actual virus doing this back in the day, but can't find anything to back that up.
I was able to find an article where the author has a conversation with the VP from Western Digital wherein he states a program could potentially access a hard drive's firmware causing such a DOS attack:
There are back doors if you will that allow us to get into places that the operating system can't go through the IDE connector
There used to be a few viruses that could cause old CRT monitors to break. They could cause invalid sync signals out the VGA point that would be too high in frequency for the video sweep. I also remember a few that would use bad sector flagging to draw images on the old versions of Scandisk (we are talking early 90’s or older.) I don't remember and of the names or have any references, but they used to be quite annoying.
Fortunately better circuits, memory protection, API abstraction have made such attacked very difficult to impossible.

Embedded app and wearing out flash disks

I have an embedded app that needs to do a lot of writing to a flash disk (or other). We cannot use a hard disk due to the environment. This is an industrial system subject to vibration and explosive fuel vapour.
The trouble is, flash has a lifecycle of around 100000 write cycles. Ample for your digital camera. Wears out after a year in our scenario.
Any alternatives that people have found work for them?
I was thinking of using FRAM but it's been done before here and it's slow and small.
As Nils says, commercial compact flash cards, and drive replacements (NAND) have wear levelling.
If you are using cheap onboard (NOR) flash you might have to do this yourself.
The best way is some sort of ring buffer where you are only appending data and then overwriting a full drive. Remember flash can only erase a full block (page) but can then append individual bytes to existing data in that page.
Also can you buffer a page in RAM and then write once or do you have to have individual bytes committed at all times?
Most app sheets for embedded processors will have examples of this.
You really need to provide much more information:
how much capacity do you need?
what costs are acceptable?
what physical form factor do you need?
what lifetime do you want?
If your storage needs aren't particularly huge and you can deal with the cost, There are battery-backed SRAM parts (up to at least 2 Megabytes per part) that are as fast as RAM (that's what they are) and have no limit on number of writes. But they cost a lot more than flash.
You could also get a drive with a SATA interface that's populated with DRAM.
This post referes to using embedded linux. Not sure if this is what you want.
I have a not to differnt system, but for medical use. We use a NOR flash for all parts that have low update frequency and NAND flash for the rest. I would recoment using UBI/UBIFS for the top layer om the MTD disk. UBI/UBIFS takes care of all the underlying problems for you. If you then design your system to have a lot larger physical flash than you need. Example: You need 100MB and then design your HW with 1GB flash. Then the data can be shuffeld around by UBI without any interaction from systems above.
UBIFS documentation
UBI documentation
As Michael Burr pointed out, we need more info. (Please answer his questions.)
I have an additional question: What kind of interface is this? PATA? SATA? USB?
As others have pointed out, any decent Flash Drive will provide some kind of wear leveling. Look for this in the datasheet for the device. Many vendors will boast about their wear-leveling technique.
You mention 100000 cycles. This seems pretty low to me. Most "industrial grade" flash drives can do a lot more than that (millions). Make sure you aren't using a bargain-basement device. A good flash drive will usually include an equation or calculator tool you can use to figure out the expected lifespan of the device.
(I can say from personal experience that some brands of flash drives hold up a lot better than others, particularly the "industrial" ones. Our drives go through some pretty brutal usage scenarios.)
The other thing that can help a lot is capacity. The higher capacity of flash drive, the more room the wear-leveling algorithm has to work with, which means a longer lifespan.
The other thing you can look at doing is software techniques to minimize the wearing of the flash components. Do you have a pagefile/swapfile? Maybe you don't need it. If you are creating/deleting lots of temporary files, move this to a RAM disk. Remember, it is erasure/reprogramming cycles that usually wears out a flash cell, so reducing those operations will usually help.
Use SD cards that have a built-in wear leveling controller. That way the write cycles get distributed over all the flash blocks and you get a very long life out of your flash.
I was thinking of using FRAM but it's
been done before here and it's slow
and small.
Compare with nvSRAM; that may provide the performance you need.
I have used a Compact Flash card in a embedded system with great success. It has a onboard controller that does all the thinking for you. Not all Compact Flash controllers are equal so get one that is a recent design and was intended to be used as a hard drive replacement as they have better wear levelling algorithms.

What does programming for PS3's Cell Processor entail?

How is programming for the Cell Processor on the PS3 different than programming for any other processor found on a normal desktop?
What kind of programming paradigms, techniques, and practices are used to fully utilize the Cell Processors potential?
All the articles I hear concerning PS3 development discuss, "Learning how to program on the Cell Processor." What does this really mean beyond some hand waving?
In addition to everything George mentions, the SPUs are really better thought of as streaming vector processors. They work best when you have an algorithm that works on long sequences of numerical data, which can be fed through the SPU's limited memory via DMA, rather than having the SPU load a chunk of memory, try to operate on it, find that it needs to follow a pointer to somewhere outside its memory, load that, keep going, find another one, and so on.
So, programming for them isn't a simple model of concurrency and threads; it's more like high performance numerical or scientific computation. It is also non-uniform memory access taken to an extreme.
Furthermore, every processor is in-order with deep pipelines, so the programmer has to be much more aware of data hazards and instruction bubbles and all the numerous micro-optimizations that we are told the compiler "should" take care of for us (but it really doesn't). Things like mispredicted branches, load-hit-stores, cache misses, etc. hurt a lot more than they would on an out-of-order processor that could juggle the order of operations around to hide such latencies.
For concrete examples, check out Mike Acton's CellPerformance blog. Mike is my favorite old-school assembly-happy perf curmudgeon in the business, and he's really earned his chops on this issue.
The Cell part of the PS3 consists of 6 SPU processors. They each have 256 KB of non-shared memory and are connected via a high-speed ring that allows for DMA between each other and the PowerPC host processor. They are not pipelined or cached. This makes it rather different than an multi-core x86 with shared memory, pipelining and caching. Also, the SPU processors do not use the same instruction set as the PowerPC so you've got some asymmetry there.
In short, your typical shared-memory, multithreaded program won't just drop onto the Cell without some work (with the caveat that computer science works hard at making different machines appear to be the same so some implementors try hard to automate the process).
At a high level the program will need to be broken up into tasks that fit within the Cell's hard memory limit. Those can run in parallel and each sub-task can be sequenced to an available Cell processor. At a low level, the compiler (or assembly programmer) will need to work harder to generate code that runs quickly on a processor -- no run-time trickery to make things go faster is available. The theory being that those programmer/compiler friendly features cost silicon and speed that can be better spent giving you more and faster SPUs. Of course, you're not getting any more SPU's on the PS3 but in the general case you'll get more SPUs per number of transistor available on chip.
Completely agree with George Philips and Crashworks. Only thing I'd add is that SPU programming is fundamentally about job management. To get the best out of the SPUs you need to keep them ticking over and feeding back results. There's no point in having one SPU chewing through some complex post-processing if your having to sit and wait for the results for a frame and the rest of your SPUs are sat idle. So how you distribute your jobs requires a lot of thought and this has a big impact on how you chunk up your data.
"All the articles I hear concerning PS3 development discuss, 'Learning how to program on the Cell Processor.' What does this really mean beyond some hand waving?"
Well, stuff you have to deal with on SPUs...
Atomic operations (lock-free try-discard style).
Strong distinction between memory areas. You have to know which pointer is pointing to which memory area or you'll screw everything up.
No enforced hardware distinction between data and code. This is actually a fun thing, you can setup dynamic code loading and essentially stream subroutines in and out. Self-modifying code is possible but not necessarily practical on SPU.
Lack of hardware debugging aids.
Limited memory size.
Fast memory access.
Instruction set balanced toward SIMD operations.
Floating point "gotchas".
You ideally want to keep the SPUs doing useful work all of the time, but it's really challenging. Not only are they not well suited for handling some types of problems, but often moving a system to be efficient on SPU can involve a complete redesign. Debugging problems that would be easy to catch on the PPU can sometimes take days on SPU.
I think when people use the phrase "learning how to program the cell" they are mostly hand waving. You can learn the basics in a week, the challenge comes in trying to apply that knowledge to real code... which often already exists and isn't in a form well-suited for use on SPU.