STM32F746 - SD Card CRC failing in 4-bit mode, but working fine in 1-bit mode

Recently, I bought a Nucleo-144 development board for the STM32F746. For the project I'm working on, I need to get consistent >3 MB/s write speeds to the SD card. Using STM32CubeIDE, I've been able to get SD 1-bit mode working with FatFS in both polling and DMA modes at full speed. However, switching to SD 4-bit mode, I start getting lots of IO errors relating to bad data CRCs while reading.
Details
In SD 4-bit polling mode, I can't even get a single block read to process correctly. Calling f_mount returns an IO error, and debugging it further reveals that the first call to HAL_SD_ReadBlocks, reading sector 0, fails with the error code SDMMC_ERROR_DATA_CRC_FAIL.
Inspecting the 512-byte data buffer read from the card reveals that the data is at least partially intact, containing some strings you'd expect to see in the first sector.
Importantly, this buffer is corrupted in the exact same manner between each run of the software. If it was some kind of electrical interference problem, I'd expect to see different bytes being corrupted, but I don't. The buffer is identical between runs. Switching back to 1-bit mode and inspecting the data buffer, it's clearly in a lot better shape. The 4-bit buffer clearly has a lot of corrupted bits and bits that are missing entirely, offsetting everything. 4-bit mode is reading mostly junk, but consistently the same junk.
What I've Tried
- Polling and DMA mode.
  - Both fail in a similar manner, although it's harder to debug DMA.
- Increasing the SDMMCCLK clock divider all the way up to 255, the highest divider (and lowest clock speed) it'll go.
  - On my older, cheaper Lexar SD card, reads and writes in this mode work flawlessly (albeit very slowly).
  - On my newer, more expensive Samsung SD card, reads and writes still fail with an SDMMC_ERROR_DATA_CRC_FAIL error. The data buffer appears much more intact, but it's clearly still garbage data.
- Transfers with GPIO pull-ups applied to all SD pins (except clock) as well as without pull-ups.
  - No change, at least as far as I could tell.
- Using multiple different SD cards.
  - Specifically, a Lexar "300x" 32 GB card and a Samsung "EVO Plus" 128 GB card.
  - As mentioned previously, decreasing the clock speed allowed one of my two cards to work.
  - However, my higher quality card still fails on the first read even at the minimum speed.
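For reference, the arithmetic behind the divider can be sketched as follows. This assumes the STM32F7 relation SDMMC_CK = SDMMCCLK / (CLKDIV + 2) with the bypass bit clear and a 48 MHz SDMMCCLK; both are assumptions to check against the reference manual and the actual clock tree. The function names are hypothetical.

```c
#include <stdint.h>

/* Assumed relation: SDMMC_CK = SDMMCCLK / (CLKDIV + 2) with the bypass bit
 * clear. Verify against the reference manual for your part. */
static uint32_t sdmmc_ck_hz(uint32_t sdmmcclk_hz, uint8_t clkdiv)
{
    return sdmmcclk_hz / ((uint32_t)clkdiv + 2u);
}

/* Raw bus throughput in bytes per second: one bit per data line per clock.
 * Ignores command overhead, CRC bits and card busy time. */
static uint32_t sd_raw_bytes_per_s(uint32_t ck_hz, uint32_t bus_width_bits)
{
    return (uint32_t)(((uint64_t)ck_hz * bus_width_bits) / 8u);
}
```

Under those assumptions, a divider of 0 gives 24 MHz and a divider of 255 gives roughly 187 kHz. At 24 MHz a 1-bit bus carries at most 3 MB/s before any overhead, against 12 MB/s for a 4-bit bus.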
Wiring
Not sure how relevant this is, but figured I'd include it for sake of completion. This is how I have my SD card connected while prototyping. All of the cables are the same length, but perhaps they're interfering with each other even over such a short distance? I'm also using an Adafruit SD card breakout adapter for testing.
SD Card   GPIO Pin
CLK       PC12
D0        PC8
CMD       PD2
D3        PC11
D1        PC9
D2        PC10
Summary
It appears that with some cards, even at lower clock speeds, IO errors are incredibly common in SD 4-bit mode only. At higher clock speeds, all cards I'm able to test with start having IO errors in 4-bit mode. In SD 1-bit mode, however, even at the maximum clock speed I'm able to read and write fine.
I'd like to take advantage of the 4-bit mode for faster speeds. What am I doing wrong? Is it something electrical, like for example needing stronger pull-up resistors or shorter wires? Thanks, I really appreciate it!

I had similar issues on an H743ZI Nucleo. My code worked fine on two other H743 boards with onboard SD card readers, but failed on the Nucleo with an Adafruit breakout. I believe it was just due to signal integrity.
I see you tried dropping the clock divider down, but have you tried a slower SDMMC clock? This was what made the difference for me. Was failing at 48MHz, but fine at 24MHz and lower with 0 divider.

Related

How to log a particular address from an STM32 NUCLEO-F334R8 with an inbuilt ST-LINK in real time using SWD & openOCD without halting the processor?

I am trying to learn how to debug an MCU non-intrusively using SWD & openOCD.
while (1)
{
    my_count++;
    HAL_GPIO_TogglePin(LD2_GPIO_Port, LD2_Pin);
    HAL_Delay(750);
}
The code running on my MCU has a free-running counter, "my_count". I want to sample/trace the data stored at the address holding "my_count" in real time.
I was doing it this way:
while (1) { // generic algorithm, no specific language
    mdw 0x200000ac; // OpenOCD command to read a 32-bit word from an address
}
0x200000ac is the address of the variable my_count from the .map file.
But, this method is very slow and experiences data drops at high frequencies.
Is there any other way to trace the data at high frequencies without experiencing data drops?
I did some napkin math, and I have an idea that may work.
As per Reference Manual, page 948, the max baud rate for UART of STM32F334 is 9Mbit/s.
If we want to send memory at the specific address, it will be 32 bits. 1 bit takes 1/9Mbps or 1.111*10^(-7)s, multiply that by 32 bits, that makes it 3.555 microseconds. Obviously, as I said, it's purely napkin math. There are start and stop bits involved. But we have a lot of wiggle room. You can easily fit 64 bits into transmission too.
Now, I've checked with the internet, it seems the ST-Link based on STM32F103 can have max baud rate of 4.5Mbps. A bummer, but we simply need to double our timings. 3.55*2 = 7.1us for 32-bit and 14.2us for 64-bit transmission. Even given there is some start and stop bit overhead, we still seem to fit into our 25us time budget.
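The start and stop bit overhead can be folded into the estimate with a small helper. This is a sketch that assumes 8N1 framing (10 bit times per byte), which the text above does not specify; the function name is hypothetical.

```c
#include <stdint.h>

/* Time on the wire, in nanoseconds, for payload_bytes sent as 8N1 frames
 * (1 start bit + 8 data bits + 1 stop bit = 10 bit times per byte).
 * The result is truncated to whole nanoseconds. */
static uint64_t uart_8n1_time_ns(uint32_t payload_bytes, uint32_t baud)
{
    const uint64_t bits = (uint64_t)payload_bytes * 10u;
    return (bits * 1000000000ull) / baud;
}
```

At 4.5 Mbit/s this gives about 8.9 us for a 32-bit value and about 17.8 us for a 64-bit value, so both still fit inside a 25 us budget.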
So the suggestion is the following:
You have a timer set to a 25 us period that fires an interrupt, which starts a DMA UART transmission. That way your MCU has very little overhead, since DMA will handle the transmission autonomously while your MCU can do whatever it wants in the meantime. Entering and exiting the timer ISR will in fact be the greatest part of the overhead, since in the ISR you will literally flip a pair of bits to tell DMA to send the data over UART at 4.5 Mbit/s.

GNU Radio and bladeRF on Raspberry Pi (simple FSK system)

I am having a problem porting a GNU Radio setup from a PC (Windows 10, USB3) to a Raspberry Pi 2 (USB2). USB bandwidth and CPU should not be a problem, I think (only around 30% utilization while running). Essentially, it looks like the RPi is 'pausing' during transmission, while the PC is not. The receiver is running on the PC in both cases. I am including a picture of what I see after the FSK demod when running the transmitter on the PC vs. the Pi (circled 'pause' area), as well as a picture of my (admittedly sloppy) schematic. Any help/tips are greatly appreciated.
[Images: GNU Radio schematic; received signals]
Edit: It appears it may actually be processing limitations. Switching from 9400 baud to 2400 baud makes the issue go away. If anyone has experience with GNURadio...am I doing anything overly inefficient or should I just drop comm rate?
The first thing I would do would be to lower your sample rates.
You don't need 1.5Ms/s if you are going to keep only the lowest 32k in your low pass filter.
Then you could do the same for your second stage after the quadrature demod if it's not enough (by the way, the sample rate of your second low pass filter does not seem to match the actual sample rate of the stage which is still 1.5Ms/s if I'm not mistaken).
Anyway, Gnuradio uses a lot of processing power so try not to use a sampling rate way above what you actually need ;)
In your case, you could cut the incoming sample rate down to 64 kS/s (say 80 kS/s for safety). Roughly 18 times fewer samples to process might do the trick :)
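The reduction factor is simple integer arithmetic. Here is a sketch with a hypothetical helper name; it is not part of GNU Radio.

```c
#include <stdint.h>

/* Largest integer decimation factor that keeps the output rate at or above
 * min_out_rate_hz. Returns 1 when no decimation is possible. */
static uint32_t max_decimation(uint32_t in_rate_hz, uint32_t min_out_rate_hz)
{
    if (min_out_rate_hz == 0u || in_rate_hz < min_out_rate_hz) {
        return 1u;
    }
    return in_rate_hz / min_out_rate_hz;
}
```

With a 1.5 MS/s input and an 80 kS/s floor, this yields a factor of 18, which leaves about 83 kS/s after decimation.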

NodeMCU SPI Module too fast

I really want to make use of the SPI module on my NodeMCU. SPI keeps my code clean and frees up some of my GPIO pins. I feel it is sending data too fast for my 74HC595 to keep up with. It was working for a bit, then stopped.
It seemed like there was a lot of noise on the line, so I hooked up the logic analyzer and saw that when I was sending data, bits were flying across the line at almost 6 ns (which is awesome). I am driving a 595 and ultimately a stepper, which need data at a much slower rate. I have tried using the clock parameter in the setup call, but it never seems to slow the SPI clock.
Is there any way to set the clock speed to something that would be more 595+stepper friendly?
Just looking at the docs in the most recent dev branch of NodeMCU (get it from the NodeMCU Build website) you can setup SPI with a divider to lower the data rate of the SPI transmissions (higher div, lower bit rate):
spi.setup(id, mode, cpol, cpha, databits, clock_div[, duplex_mode])
Parameters include:
clock_div - SPI clock divider, f(SPI) = f(CPU) / clock_div
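To pick a divider for a given target rate, the quoted relation f(SPI) = f(CPU) / clock_div can be inverted with a ceiling division. This is a sketch in C rather than Lua, with a hypothetical helper name; the 80 MHz base clock in the usage note is an assumption, not something stated above.

```c
#include <stdint.h>

/* Smallest divider such that base_hz / div <= max_spi_hz (ceiling division),
 * following the quoted relation f(SPI) = f(CPU) / clock_div.
 * Returns 0 for an invalid request (max_spi_hz == 0). */
static uint32_t spi_clock_div(uint32_t base_hz, uint32_t max_spi_hz)
{
    if (max_spi_hz == 0u) {
        return 0u;
    }
    uint64_t div = ((uint64_t)base_hz + max_spi_hz - 1u) / max_spi_hz;
    return div == 0u ? 1u : (uint32_t)div;
}
```

Assuming an 80 MHz base clock, a 1 MHz target gives a divider of 80, and a 3 MHz ceiling gives 27 (about 2.96 MHz).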

Logging 16-bit data to an SD card at the rate of 44 kHz

I am using the STM32F4 microcontroller with a microSD card. I am capturing analogue data via DMA.
I am using a double buffer, taking 1280 (10*128 - 10 FFTs) samples at a time.
When one buffer is full I am setting a flag and I then look at 128 samples at a time and run an FFT calculation on it. All of this is running well.
The data is being sampled at the rate I want and FFT calculation is as I would expect. If I just let the program run for one second, I see that it runs the FFT approximately 343 times (44000/128).
But the problem is I would like to save 64 values from this FFT to the SD card.
I am using the HCC fat file system library.
On each loop of the FFT calculation I copy the 64 values into an array.
After every 10 calculations I write the contents of this array to file and start again.
The array stores 640 float_32 values (10*64).
This works perfectly for a one-second test run. I get 22,000 values stored to the SD card.
But as I increase the time, I start losing samples as it takes the SD card longer to write. I need the SD card to sustain over 87 kB/s (4 bytes * 64 * 343 = 87,808 bytes/s) consistently. I have tried increasing the DMA buffer sample size and then the number of times it writes, but didn't find it helped.
I am using an 8 GB microSD card, class 4. I formatted the SD card as FAT32 with the default allocation unit size of 2048.
How should I organize the buffering of data to allow for this? I thought using fewer writes might help. Would a queue help? How would I implement this and would anyone have an example?
I saw that clifford had a similar problem and he was using a queue, How can I use an SD card for logging 16-bit data at 48 ksamples/s?.
In my case I got it to work by trying a large number of different cards - they vary a great deal. If I had enough RAM available for a longer buffer that would have worked too.
If you are not using an RTOS, the queue buffering option may not be available to you, or at least would be non-trivial to implement.
Using an RTOS queue, I suggest that you create a queue of messages, each of length 64*sizeof(float_32). The number of messages in the queue is determined by the amount of card latency you need to deal with; a length of 343, for example, will sustain a card stall of 1 second and will require about 88 kB of RAM (343 * 256 bytes = 87,808 bytes). The application will then have a high priority thread performing the FFT and placing data in the queue, while a low priority thread takes data from the queue and writes to the file.
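Without an RTOS, a fixed-size ring buffer can play the same role. The following is a minimal single-producer, single-consumer sketch; every name in it is hypothetical, the depth of 8 frames is only a placeholder, and on a real target the index updates may need memory barriers or interrupt masking.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define FRAME_VALUES 64u  /* FFT values kept per frame (from the question) */
#define QUEUE_FRAMES 8u   /* hypothetical depth; size it to the worst card stall */

typedef struct {
    float frames[QUEUE_FRAMES][FRAME_VALUES];
    volatile size_t head; /* next slot to write (producer only) */
    volatile size_t tail; /* next slot to read (consumer only) */
} frame_queue_t;

static void fq_init(frame_queue_t *q)
{
    q->head = 0;
    q->tail = 0;
}

/* Producer side: returns false (frame dropped) when the queue is full.
 * One slot is always left empty to tell "full" from "empty". */
static bool fq_push(frame_queue_t *q, const float *frame)
{
    size_t next = (q->head + 1u) % QUEUE_FRAMES;
    if (next == q->tail) {
        return false;
    }
    memcpy(q->frames[q->head], frame, sizeof q->frames[0]);
    q->head = next;
    return true;
}

/* Consumer side: returns false when the queue is empty. */
static bool fq_pop(frame_queue_t *q, float *out)
{
    if (q->tail == q->head) {
        return false;
    }
    memcpy(out, q->frames[q->tail], sizeof q->frames[0]);
    q->tail = (q->tail + 1u) % QUEUE_FRAMES;
    return true;
}
```

The FFT loop would call fq_push() after each calculation, and the main loop would drain the queue with fq_pop() and hand the frames to the file system whenever the card is ready.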
You might improve performance further by accumulating multiple message blocks in your DMA buffer before initiating a write, and there may be some benefit in carefully selecting an optimum DMA buffer length.
Flash is very, very sensitive to overwrites. Writing 3 kB and then a further 3 kB may count as an overwrite of the first 4 kB. In your case, there's no good reason why you'd want such small writes anyway. I'd advise 16 kB writes (64 frames/write * 64 samples/frame * 4 bytes/sample). You'd need 5 or 6 writes per second, which should be well within spec for any old SD card.
Now it's quite likely that you'd get another 1280 samples while writing; you'll have to deal with that on another thread. That should be no problem, as the write should block without using CPU (it's a low-level flash delay).
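Batching frames into one large write could look roughly like this. It is a sketch under the assumption that a 16 KiB block holds exactly 64 frames of 64 four-byte values; the sink callback is a hypothetical stand-in for whatever write call the file system library provides, and count_sink exists only so the sketch can be exercised on a host.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FRAME_BYTES (64u * sizeof(float)) /* 256 bytes per FFT frame */
#define BLOCK_BYTES (16u * 1024u)         /* one card write */

/* Hypothetical sink: on target this would wrap the file system's write call. */
typedef void (*block_sink_t)(const uint8_t *data, size_t len, void *ctx);

typedef struct {
    uint8_t      buf[BLOCK_BYTES];
    size_t       used;
    block_sink_t sink;
    void        *ctx;
} batcher_t;

static void batcher_init(batcher_t *b, block_sink_t sink, void *ctx)
{
    b->used = 0;
    b->sink = sink;
    b->ctx = ctx;
}

/* Append one frame; hand the buffer to the sink only when a block is full. */
static void batcher_add_frame(batcher_t *b, const float *frame)
{
    memcpy(&b->buf[b->used], frame, FRAME_BYTES);
    b->used += FRAME_BYTES;
    if (b->used == BLOCK_BYTES) {
        b->sink(b->buf, b->used, b->ctx);
        b->used = 0;
    }
}

/* Host-side stand-in sink that only counts how many blocks were written. */
static void count_sink(const uint8_t *data, size_t len, void *ctx)
{
    (void)data;
    (void)len;
    ++*(unsigned *)ctx;
}
```

At roughly 343 frames per second this triggers the sink between 5 and 6 times per second.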
The most probable cause of the problem might be the way you are interfacing the card through the library.
SD cards over the SPI protocol (which I assume being used here) can be read or written in 512 byte sector units, some SD commands making it possible to stream (to perform sequential sector access faster). An important element of the SD card SPI protocol are various delays, where you have to poll the card whether you could start an operation (such as writing data to a sector).
You should read the library's API to discover how its writing process might work. You will need to perform some regular action which in the end would poll the card to know whether the writing process could continue. Some cards might require a set number of accesses before becoming ready for an operation, some others might use timeouts for state transitions. It might not work well to have the function called relatively rarely (such as once in 2-3 milliseconds) anticipating the card getting ready meanwhile. You have to keep on nagging it whether it completed already.
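The "keep nagging" loop can be sketched as a bounded poll. This assumes the usual SPI-mode behaviour in which a busy card holds its data output low, so reads return 0x00 until it is ready and 0xFF afterwards. The transfer hook and the fake card are hypothetical and stand in for the real SPI driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical hook: clocks one byte out on SPI and returns the byte read. */
typedef uint8_t (*spi_xfer_t)(uint8_t tx, void *ctx);

/* Poll until the card releases its data output (a read returns 0xFF) or the
 * poll budget runs out. Returns true when the card became ready. */
static bool sd_wait_ready(spi_xfer_t xfer, void *ctx, uint32_t max_polls)
{
    while (max_polls-- > 0u) {
        if (xfer(0xFFu, ctx) == 0xFFu) {
            return true;
        }
    }
    return false;
}

/* Host-side stand-in card: reports busy for *ctx polls, then ready. */
static uint8_t fake_card_xfer(uint8_t tx, void *ctx)
{
    uint32_t *busy_left = (uint32_t *)ctx;
    (void)tx;
    if (*busy_left > 0u) {
        (*busy_left)--;
        return 0x00u;
    }
    return 0xFFu;
}
```

On hardware the poll budget would normally be derived from a timeout rather than a fixed count.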
This is just from my own experience with SD interfacing.

SD card initialization using SPI

I saw a lot of information about MMC/SD cards and I tried to make a library to read them (modifying the Procyon AVRlib).
But I have some problems. I also tried the original code unchanged. My problem is with the initialization of an SD card. I have two cards here, a 256 MB one and a 1 GB one.
I send the init commands in this order: CMD0, CMD55, ACMD41, and CMD1.
But the 256 MB SD card only returns a 0x01 response for each command. I send the CMD1 a lot of times and the 256 MB SD card always returns only 0x01, never 0x00.
The 1 GB SD is more crazy... CMD0 returns 0x01. Nice, but the CMD55 command responds with 0x05. At other times it responds with 0xC1, and sometimes it responds with 0xF0 followed by 0x5F in the next iteration...
Around the Internet there is information and examples, but it is a bit confusing. Here in my project, I must use a 1 GB card and I'm trying with a microSD card in an SD adapter (I think that this is not the problem).
How do I fix this problem?
PS: My problem is like the problem in Stack Overflow question Initializing SD card in SPI issues, but the solution didn't solve my problem. The 1 GB SD card only returns 0x01 ever... :cry:
Why do you need CMD1? And did you read the note below it, that says "CMD1 is a valid command for the thin (1.4 mm) standard size SD memory card only if used after re-initializing a card (not after power on reset)."?
About the 1 GB card, ideas that come to mind:
After every command (send command, get reply), do you send 8 dummy bytes before making CS high?
The values returned seem weird (0x05 doesn't have busy bit set, so WTF?), maybe there's a hardware issue?
Does the card work otherwise?
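When looking at the odd responses, it can help to decode the R1 byte bit by bit. The sketch below uses the R1 layout from the SD Physical Layer Simplified Specification (bit 7 always 0, bit 0 = in idle state, bit 2 = illegal command); the macro and function names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* R1 response bits in SPI mode. */
#define R1_IDLE            0x01u
#define R1_ERASE_RESET     0x02u
#define R1_ILLEGAL_COMMAND 0x04u
#define R1_CRC_ERROR       0x08u
#define R1_ERASE_SEQ_ERROR 0x10u
#define R1_ADDRESS_ERROR   0x20u
#define R1_PARAMETER_ERROR 0x40u

/* A valid R1 byte always has bit 7 clear; anything else suggests the host is
 * not byte-aligned with the card or is sampling noise. */
static bool r1_is_valid(uint8_t r1)
{
    return (r1 & 0x80u) == 0u;
}

/* True when the given R1 flag bit is set in the response. */
static bool r1_has(uint8_t r1, uint8_t flag)
{
    return (r1 & flag) != 0u;
}
```

Read this way, 0x05 would be "in idle state" plus "illegal command", while 0xC1 and 0xF0 are not valid R1 bytes at all, which would point towards framing or wiring rather than the command sequence.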
Maybe this helps a bit:
SD Specifications Part 1: Physical Layer Simplified Specification
A simple explanation of MMC/SD usage over SPI is provided here. I have used the associated FAT file-system library too and it works well.
However, the solution may not work for some makes of cards. For such cards, you may have to edit the procedure/library. That may be why your 1 GB card acts differently -- it may be a different make of card. The SPI mode of certain cards may not be that popular for commercial equipment, and thus may be more deviated in specification by some card manufacturers.
If you bit-bang the commands and clocks, you may have more control and confidence that those procedures are correct. That is useful because you need some solid ground to build on to progress bit by bit. I found that sending the initial 80 clocks at under 400 kHz was critical on one card, while another card tolerated more than 2 MHz.
Try to progress one command at a time that is reliable for both cards.
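One way to get solid ground for a single command is to build the 6-byte frame by hand and compare it with what goes out on the wire. This is a sketch with hypothetical function names; it uses the CRC-7 polynomial x^7 + x^3 + 1 that SD/MMC command frames use.

```c
#include <stddef.h>
#include <stdint.h>

/* CRC-7 with polynomial x^7 + x^3 + 1, computed MSB first over the data. */
static uint8_t sd_crc7(const uint8_t *data, size_t len)
{
    uint8_t crc = 0;
    for (size_t i = 0; i < len; i++) {
        uint8_t d = data[i];
        for (int bit = 0; bit < 8; bit++) {
            crc = (uint8_t)(crc << 1);
            /* Bit 7 of crc now holds the bit shifted out of the 7-bit register. */
            if (((d ^ crc) & 0x80u) != 0u) {
                crc ^= 0x09u;
            }
            d = (uint8_t)(d << 1);
        }
    }
    return crc & 0x7Fu;
}

/* Build a command frame: start bits 01 + command index, 32-bit argument
 * (big-endian), then CRC7 followed by the stop bit. */
static void sd_build_cmd(uint8_t index, uint32_t arg, uint8_t frame[6])
{
    frame[0] = (uint8_t)(0x40u | (index & 0x3Fu));
    frame[1] = (uint8_t)(arg >> 24);
    frame[2] = (uint8_t)(arg >> 16);
    frame[3] = (uint8_t)(arg >> 8);
    frame[4] = (uint8_t)arg;
    frame[5] = (uint8_t)((sd_crc7(frame, 5) << 1) | 0x01u);
}
```

For CMD0 with a zero argument this produces 40 00 00 00 00 95, which can be checked against a logic analyzer capture before moving on to the next command.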