Circular Buffer in Flash - embedded

I need to store items of varying length in a circular queue in a flash chip. Each item will have its own encapsulation so I can figure out how big it is and where the next item begins. Once enough items have been written, the buffer will wrap around to the beginning.
What is a good way to store a circular queue in a flash chip?
I may need to store tens of thousands of items, so starting at the beginning and reading through to the end of the buffer is not ideal; searching all the way to the end would take too long.
Also, because it is circular, I need to be able to distinguish the first item from the last.
The last problem is that this is stored in flash, so erasing each block is both time consuming and can only be done a set number of times for each block.

First, block management:
Put a small header at the start of each block. The main thing you need in order to track the "oldest" and "newest" blocks is a block sequence number, which simply increments modulo k. k must be greater than your total number of blocks. Ideally, make k less than your erased value (e.g. 0xFFFF) so you can easily tell which blocks are erased.
At start-up, your code reads the header of each block in turn, and locates the first and last blocks in the sequence n(i+1) = (n(i) + 1) modulo k. Take care not to get confused by erased blocks (block number e.g. 0xFFFF) or data that is somehow corrupted (e.g. an incomplete erase).
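As a concrete illustration of that start-up scan, here is a minimal sketch. NUM_BLOCKS, K, and flash_read_u16() are hypothetical names standing in for your own layout and HAL; the trick is that the used blocks hold consecutive sequence numbers modulo K, so the newest block is the one whose successor number is absent and the oldest is the one whose predecessor is absent.

#include <stdint.h>
#include <stdbool.h>

#define NUM_BLOCKS 64u      /* assumption: number of allocated blocks */
#define K          0xFFF0u  /* sequence modulus, kept below the erased value */
#define ERASED     0xFFFFu  /* header value read back from an erased block */

extern uint16_t flash_read_u16(uint32_t block, uint32_t offset); /* assumed HAL */

static bool seq_present(const uint16_t *seqs, uint16_t want)
{
    for (uint32_t b = 0; b < NUM_BLOCKS; b++)
        if (seqs[b] == want)
            return true;
    return false;
}

/* O(n^2), but n is the (small) block count. */
bool find_sequence_ends(uint32_t *oldest, uint32_t *newest)
{
    uint16_t seqs[NUM_BLOCKS];
    bool any = false;

    for (uint32_t b = 0; b < NUM_BLOCKS; b++) {
        uint16_t s = flash_read_u16(b, 0);      /* sequence number in header */
        seqs[b] = (s < K) ? s : ERASED;         /* >= K: erased or corrupt */
        if (seqs[b] != ERASED)
            any = true;
    }
    if (!any)
        return false;

    for (uint32_t b = 0; b < NUM_BLOCKS; b++) {
        if (seqs[b] == ERASED)
            continue;
        if (!seq_present(seqs, (uint16_t)((seqs[b] + 1u) % K)))
            *newest = b;                        /* no successor: newest block */
        if (!seq_present(seqs, (uint16_t)((seqs[b] + K - 1u) % K)))
            *oldest = b;                        /* no predecessor: oldest block */
    }
    return true;
}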
Within each block
Each block initially starts empty (every byte is 0xFF). Each record is simply written one after the other. If you have fixed-size records, then you can access them with a simple index. If you have variable-size records, then to read them you have to scan from the start of the block, linked-list style.
If you want variable-size records but want to avoid the linear scan, then you could give each record a well-defined header. E.g. use 0 as a record delimiter, and COBS-encode (or COBS/R-encode) each record. Or use a byte of your choice as a delimiter, and 'escape' that byte wherever it occurs inside a record (similar to the PPP protocol).
At start-up, once you know your latest block, you can do a linear scan for the latest record. Or if you have fixed-size records or record delimiters, you could do a binary search.
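For the linear-scan case, a sketch of locating the write position within the latest block, assuming each record is prefixed by a 16-bit length and reusing the hypothetical flash_read_u16() helper and sizes from above:

#include <stdint.h>

#define BLOCK_SIZE 4096u    /* assumption: erase-block size in bytes */
#define HDR_SIZE   4u       /* assumption: block header (sequence number etc.) */

extern uint16_t flash_read_u16(uint32_t block, uint32_t offset); /* assumed HAL */

/* Scan a block's records (each prefixed by a 16-bit length) and return the
 * offset where the next record should be written. 0xFFFF means erased space. */
uint32_t find_write_offset(uint32_t block)
{
    uint32_t off = HDR_SIZE;
    for (;;) {
        if (off + 2u > BLOCK_SIZE)
            return BLOCK_SIZE;              /* block is completely full */
        uint16_t len = flash_read_u16(block, off);
        if (len == 0xFFFFu)
            return off;                     /* nothing written here yet */
        if (off + 2u + len > BLOCK_SIZE)
            return off;                     /* implausible length: stop */
        off += 2u + len;                    /* hop to the next record */
    }
}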
Erase scheduling
For some Flash memory chips, erasing a block can take significant time--e.g. 5 seconds. Consider scheduling an erase as a background task a bit "ahead of time". E.g. when the current block is x% full, then start erasing the next block.
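A sketch of that erase-ahead trigger, under the same hypothetical names; flash_erase_block_async() is an assumed non-blocking HAL call, not a real API:

#include <stdbool.h>
#include <stdint.h>

#define NUM_BLOCKS  64u
#define BLOCK_SIZE  4096u
#define ERASE_AHEAD_THRESHOLD  (BLOCK_SIZE * 3u / 4u)   /* e.g. 75% full */

extern void flash_erase_block_async(uint32_t block);    /* assumed HAL call */

/* Call after every write; clear *erase_queued when the write pointer
 * actually moves into the next block. */
void maybe_schedule_erase(uint32_t active_block, uint32_t write_offset,
                          bool *erase_queued)
{
    if (!*erase_queued && write_offset >= ERASE_AHEAD_THRESHOLD) {
        flash_erase_block_async((active_block + 1u) % NUM_BLOCKS);
        *erase_queued = true;
    }
}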
Record numbering
You may want to number records. The way I've done it in the past is to put, in the header of each block, the record number of the first record in that block. Then the software counts onward from there to derive the number of each record within the block.
Checksum or CRC
If you want to detect corrupted data (e.g. incomplete writes or erases due to unexpected power failure), then you can add a checksum or CRC to each record, and perhaps to the block header. Note the block header CRC would only cover the header itself, not the records, since it could not be re-written when each new record is written.
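For example, a standard CRC-16-CCITT that could be appended to each record; the record layout around it is up to you, this is just the bit-by-bit reference implementation:

#include <stddef.h>
#include <stdint.h>

/* Bit-by-bit CRC-16-CCITT (polynomial 0x1021, initial value 0xFFFF). */
uint16_t crc16_ccitt(const uint8_t *data, size_t len)
{
    uint16_t crc = 0xFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)((uint16_t)data[i] << 8);
        for (int b = 0; b < 8; b++)
            crc = (crc & 0x8000u) ? (uint16_t)((crc << 1) ^ 0x1021u)
                                  : (uint16_t)(crc << 1);
    }
    return crc;
}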

Keep a separate block that contains a pointer to the start of the first record and the end of the last record. You can also keep more information like the total number of records, etc.
Until you initially run out of space, adding records is as simple as writing them to the end of the buffer and updating the tail pointer.
As you need to reclaim space, delete enough records so that you can fit your current record. Update the head pointer as you delete records.
You'll need to keep track of how much extra space has been freed. If you keep a pointer to the end of the last record, then the next time you need to add a record, you can compare that with the pointer to the first record to determine whether you need to delete any more records.
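A sketch of what one entry in that separate pointer block might look like. Names are illustrative; note that to limit erases of the pointer block, a common idiom is to append a fresh entry on each update and treat the last entry that is not all-0xFF as current:

#include <stdint.h>

/* One pointer-block entry. Appending a new entry per update (instead of
 * rewriting in place) keeps erases of this block rare; at start-up, the
 * last entry that is not all-0xFF is the current one. */
typedef struct {
    uint32_t head;   /* address of the start of the first record */
    uint32_t tail;   /* address of the end of the last record */
    uint32_t count;  /* optional extras, e.g. total number of records */
    uint32_t crc;    /* guards against a torn write of this entry */
} buf_meta_t;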
Also, if this is NAND, you or the flash controller will need to do deblocking and wear-leveling, but that should all be at a lower layer than allocating space for the circular buffer.

I think I get it now. It seems like your largest issue will be: having filled the available space for recording, what happens next? The new data should overwrite the oldest data, which I believe is what you mean by a circular buffer. But since the data is not fixed-length, you may overwrite more than one record.
I'm assuming that the amount of variability in length is high enough that padding everything out to a fixed length isn't an option.
Your write routine needs to keep track of the address that represents the start of the next record to write. If you know the size of the record ahead of time, you can tell whether you are going to run past the end of the logical buffer and need to start over at '0'. I wouldn't split a record up, with some at the end and some at the beginning.
A separate register can track the beginning; this is the oldest data that hasn't been overwritten yet. If you went to read out the data this is where you would start.
The data writer would then check, given the write-start address and the length of the data it's about to commit, whether it needs to bump the read register: examine the oldest record, read its length, and advance to the next record, repeating until there is enough room to write the new data. There will probably be a gap of junk data between the end of the written data and the start of the oldest data. But this way, your overhead is just writing an address or two, not rearranging blocks.
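A sketch of that reclaim step, ignoring wrap-around for brevity; record_len_at() is a hypothetical helper that reads a record's stored total length:

#include <stdint.h>

extern uint32_t record_len_at(uint32_t addr);   /* hypothetical accessor */

/* Advance the read register past whole records until the region
 * [write_addr, write_addr + need) no longer overlaps the oldest record. */
void reclaim(uint32_t *read_addr, uint32_t write_addr, uint32_t need)
{
    while (write_addr <= *read_addr && write_addr + need > *read_addr)
        *read_addr += record_len_at(*read_addr);  /* drop the oldest record */
}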
At least, that's probably what I would do. HTH

The "circular" in a flash can be done on basis of block size, which means that you must declare how much blocks of the flash you allocate for this buffer.
The actual size of the buffer will be at each particular time between n-1 (n is the number of blocks) and n.
Each block should start with an header that contains sequential number or timestamp that could be used to determine which block is older than the other.
Each Item encapsulated with an header and a footer. the default header contains whatever you want but according to this header you must know the size of the item. The default footer is 0xFFFFFFFF. This value indicates a null termination.
In your RAM you must save a pointer to the oldest block and the latest block and pointer to the oldest item and latest item. On power up you go over all blocks find the relevant blocks and load this members.
When you want to store a new item, you check if the latest block contain enough space for this item. If it does you save the item at the end of the previous item and the change the previous footer to point to this item. If it does not contain enough space you need to erase the oldest block. Before you erase this block change the oldest block members (RAM) to point on the next block and the oldest item to point on the first item in this block.
Then you can save the new item in this block and change the footer of the latest item to point this item.
I know that the explanation may sounds complicated but the process is very simple and if you write it correct you can make it even power fail safe (always keep in you mind the order of the writes).
Pay attention that the circularity of the buffer is not saved in the flash but the flash only contains a blocks with items that you can decide according to the blocks headers and items headers what is the order of these items
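A sketch of that item layout. Field names are illustrative; overwriting the 0xFFFFFFFF footer with the next item's address relies on NOR-style flash, where programming can only clear erased 1-bits to 0, so no erase is needed for that update:

#include <stdint.h>

#define NULL_FOOTER 0xFFFFFFFFu   /* footer of the newest item: still erased */

typedef struct {
    uint16_t size;    /* payload size, so the footer can be located */
    uint16_t flags;   /* anything else you want to record per item */
    /* uint8_t payload[size] follows, then: */
    /* uint32_t footer: address of the next item, or NULL_FOOTER */
} item_hdr_t;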

I see three options:
Option 1: pad everything out to the same size. This is simple: store a pointer to the head and tail of the buffer so you know where to write and where to start reading from, and use the size of each object to get the offset to the next. This means you traverse the buffer as you would a linked list, i.e. it's slow if you need item 5000.
Option 2: store only pointers to the real data in the circular buffer. That way, when you loop around, you don't have to deal with size mismatches. If you stored the real, unpadded data in the circular buffer, you could run into situations where you overwrite multiple items with one new data object, which I assume is not OK. (A sketch of this option follows the list.)
Store the actual data elsewhere in flash. Most flash has some sort of wear-leveling built in; if so, you don't need to worry about overwriting the same location multiple times, as the IC will figure out where to actually store it on the chip. Just write to the next available free space.
This means you need to pick a maximum size for the circular buffer. How you do this depends on the data's variability. If the size of the data doesn't change much, say by only a few bytes, then you should just pad it out and use option 1. If the size changes wildly and unpredictably, choose the largest size it could be and figure out how many objects of that size would fit in your flash; use that as the maximum number of entries in the buffer. This means you waste a fair amount of space.
Option 3: if the objects can really be any size, you're at the point where you should just use a file system. Name the files in order and loop back when you're full, keeping in mind that if your new entry is large, you may have to delete multiple old entries to fit it in. This is really just an extension of option 2, as option 2 is in many ways a simple file system.
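The sketch of option 2 promised above: a fixed-size ring of addresses pointing into a separately managed data area. MAX_ENTRIES and flash_alloc() are assumptions; flash_alloc() stands in for "write to the next free spot and let the chip's wear-leveling place it", and in practice the ring itself would also have to live in flash or be rebuilt at start-up.

#include <stdint.h>

#define MAX_ENTRIES 1024u   /* assumption: worst-case object count */

/* Assumed helper: writes the object at the next free spot and returns
 * its flash address. */
extern uint32_t flash_alloc(const void *data, uint32_t len);

typedef struct {
    uint32_t addr[MAX_ENTRIES];  /* flash addresses of the stored objects */
    uint32_t head, tail;         /* ring indices: oldest and next-free */
} ptr_ring_t;

void ring_push(ptr_ring_t *r, const void *data, uint32_t len)
{
    r->addr[r->tail] = flash_alloc(data, len);
    r->tail = (r->tail + 1u) % MAX_ENTRIES;
    if (r->tail == r->head)                      /* full: drop the oldest */
        r->head = (r->head + 1u) % MAX_ENTRIES;
}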

Related

2D Chunked Grid Fast Access

I am working on a program that stores some data in cells (small structs) and processes each one individually. The processing step accesses the 4 neighbors of the cell (2D). I also need them partitioned in chunks because the cells might be distributed randomly through a very large surface, and having a large grid with mostly empty cells would be a waste. I also use the chunks for some other optimizations (skipping processing of chunks based on some conditions).
I currently have a hashmap of "chunk positions" to chunks (which are the actual fixed size grids). The position is calculated based on the chunk size (like Minecraft). The issue is that, when processing the cells in every chunk, I lose a lot of time doing a lookup to get the chunk of the neighbor. Most of the time, the neighbor is in the same chunk we are processing, so I did a check to prevent looking up a chunk if the neighbor is in the same chunk we are processing.
Is there a better solution to this?
This lacks some details, but hopefully you can employ a solution such as this:
Process the interior of a chunk (i.e. excluding the edges) separately. During this phase, the neighbours are guaranteed to be in the same chunk, so you can do this with zero chunk lookups. The difference between this and checking whether a chunk lookup is necessary is that there is not even a check: the check is implicit in the loop bounds.
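A sketch of that zero-lookup interior pass; CHUNK, the cell layout, and process() are assumptions standing in for your own types:

#define CHUNK 16   /* assumption: chunk is CHUNK x CHUNK cells */

typedef struct { int value; } Cell;
typedef struct { Cell cells[CHUNK][CHUNK]; } Chunk;

extern void process(Cell *c, Cell *n, Cell *s, Cell *w, Cell *e); /* your step */

/* Interior pass: the loop bounds guarantee all four neighbours are in this
 * chunk, so there are no chunk lookups and no per-cell checks. */
void process_interior(Chunk *ch)
{
    for (int y = 1; y < CHUNK - 1; y++)
        for (int x = 1; x < CHUNK - 1; x++)
            process(&ch->cells[y][x],
                    &ch->cells[y - 1][x], &ch->cells[y + 1][x],
                    &ch->cells[y][x - 1], &ch->cells[y][x + 1]);
    /* Edge cells (x or y equal to 0 or CHUNK-1) go in a separate pass that
     * looks up each neighbouring chunk once and reuses it along the edge. */
}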
For edges, you can do a few chunk lookups and reuse the result across the edge.
This approach gets worse with smaller chunk sizes, or if you need access to neighbours further than 1 step away. It breaks down entirely in the case of random access to cells. If you need to maintain a strict ordering for the processing of cells, this approach can still be used with minor modifications by rearranging it (there wouldn't be a strict "process the interior" phase, but you would still have a nice inner loop with zero chunk lookups).
Such techniques are common in general in cases where the boundary has different behaviour than the interior.

Text editor with multiple undo/redo in C

I'm starting a school project: we have to code an efficient text editor in C. To control it I can use:
(row1,row2)c
(row1,row2)d
(row1,row2)p
(number)u
(number)r
These commands change the text between row1 and row2 (the c), delete the text between row1 and row2 (the d; the text is replaced with a single dot), print rows between row1 and row2 on stdout (the p), and undo (number) times or redo (number) times (these last two commands don't affect print, just c and d).
To start I was thinking what data structure I can use.
I thought, for the rows, of using a singly linked list holding the row number and a second list for the text itself.
This is because the code has to be efficient in both time and space.
But I can't find a good way to implement undo/redo in my case. I was thinking of creating two stacks, one for undo and one for redo. Every command I issue is pushed onto the undo stack, and if I undo something, I pop the top action from the undo stack and push it onto the redo stack.
But I don't know how to represent these commands: I was thinking of saving a complementary command, so I can run it to return to a previous state. Then, when I undo, I push the complementary command onto the redo stack, and I clear the redo stack on every new command to free space.
I hope it's understandable, I just want your opinion about this possible structure
NB: theoretically I can code only in C11 with stdlib and stdio, but I can copy and modify other libraries' functions if needed.
---UPDATE---
I was wondering whether it would be better to use a red-black tree for the row structure, because it would take O(log n) to find the X-th row and edit it, instead of O(n).
The only problem is that when I have to change many rows in a single command (e.g. 1,521c) it takes longer to find every row.
Maybe a hybrid could be a good choice: use the RBT structure to find the starting row's address, then use the list structure to find the others. So every node of the tree would hold two pointers for the RBT and one for the list.
Your design ideas are spot on.
The part needed is how to represent the undo and redo entries.
A redo entry could be a struct that indicates what span of text to replace and the text to replace it with. A "span" here gives the offsets into the text, and since it could be an empty span (just a position), that suggests using a half-open interval [start .. end) or a start and length. That can express any single text-change operation. The replacement text could in principle be a 0-length string; even if the current commands never produce one, anticipate that future assignments may add features that do.
An undo entry can be the same struct, describing as you noted the complementary text replacement operation.
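A sketch of such an entry; the names are illustrative, not from the assignment:

#include <stddef.h>

/* One text-change operation: replace the half-open span [start .. end)
 * with text (which may be empty). An undo entry is the same struct,
 * holding the complementary replacement. */
typedef struct {
    size_t start;     /* offset of the span to replace */
    size_t end;       /* one past the end of the span */
    char  *text;      /* replacement text */
    size_t text_len;  /* its length; 0 expresses a pure deletion */
} edit_t;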
The other design decision is how to represent the document text. The simplest thing is a sequential buffer of characters, in which case every insertion requires moving all following text downwards after ensuring the memory buffer is large enough.
An alternative is a list of lines of text, each line being a separate memory node. That way, inserting, deleting, and replacing lines doesn't have to move the bulk of the text around in memory, just some of the line node pointers. Furthermore, for line replacement commands the redo/undo entries can just list which range of line pointers to replace with other line pointers.
Suppose you create just one stack (an array?) and call it 'history'. As commands are issued, add their 'antidote' (or a pointer to it) to the stack, and adjust a pointer/counter to the last command. As your user steps back ('undo'), replace each command with its 'antidote' (the code that put it there in the first place could be reused), so it's there for a subsequent 'redo', and reposition the counter as needed. You'll have to allocate storage for deleted text and link it to the stack position (a 2-dimensional (pointer?) array, or perhaps a struct?). If your stack gets full, delete the oldest entry (it's now 'out of range') and move everything accordingly... Or... just allocate more memory... ;-)
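A sketch of that single-stack arrangement; the sizes and names are illustrative:

#include <stddef.h>

#define HISTORY_MAX 256   /* assumption: fixed history depth */

typedef struct {
    size_t start, end;    /* span the command affected */
    char  *text;          /* displaced text: the 'antidote' payload */
    size_t text_len;
} hist_entry_t;

typedef struct {
    hist_entry_t entries[HISTORY_MAX];
    int count;            /* commands recorded so far */
    int pos;              /* next slot to undo; pos == count when fully redone */
} history_t;

/* Undo: apply entries[--h->pos] to the text and write the antidote of what
 * it just did back into the same slot, ready for a later redo. Redo:
 * re-apply entries[h->pos++] and swap its antidote back in the same way. */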
Just an idea...
Remember, if it works properly, it isn't wrong. Perhaps just not the most efficient or effective way of doing it...
Don't forget to clear the stack on 'save', and most importantly, release any allocated memory on 'terminate'.
Mike.

Elm: Search Number in Bytes

I'm trying to find some exif data in an image.
So first I need to find the number 0x45786966 ('Exif' as unsignedInt32) and store the offset.
The next two bytes should be zeros and after that the endianness as unsignedInt16 (either 0x4d4d or 0x4949) which should be stored too.
I can get the image as Bytes with the elm/file module.
But how do I search the 'Exif' start and parse the endianness in those Bytes?
I looked at the loop-example from elm/bytes but do not fully understand it.
First it reads the length of a list (unsignedInt32) and then it reads byte by byte?
How would this work if I want to read unsignedInt32s instead of bytes?
How do I set an offset to indicate where functions like unsignedInt32 should read next?
The example is talking about structured data with a known size field at the start. In your case, what you want to do is a search, so it is a rather different problem.
The problem is that elm/bytes isn't really designed to handle searching. If you can guarantee that the part you are looking for will be byte-aligned, it may well be possible, but given just what you have said, there isn't an easy way, as you can't iterate bit by bit.
You would have to read in values without alignment and then manually search for the part of the number you want within that. Given the difficulty and inefficiency of that approach, I would recommend using ports instead for that use case.
If you can guarantee that what you are searching for will be byte-aligned (or better yet, aligned to the length of your number), you can decode a byte at a time until you find what you are looking for. There is no way to read from a given offset; if you want to read up to a certain point, you'd need to read and throw away values.
To do this, you would want to set up a loop where your state contains how much of the value you are looking for you have found. Each step, you check if you have the whole thing (success), you have the next part (continue), or you have something different (reset the state to search from the start again). If you reach the end without finding it, you have failed.
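Language aside, that search is a small state machine; here is the same logic sketched in C for brevity. In Elm, `matched` would live in the loop's accumulator and each step would decode one unsignedInt8.

#include <stddef.h>
#include <stdint.h>

static const uint8_t MARKER[4] = { 0x45, 0x78, 0x69, 0x66 };  /* "Exif" */

/* Return the offset of the marker in buf, or -1 if it is absent. */
long find_marker(const uint8_t *buf, size_t len)
{
    size_t matched = 0;                 /* loop state: bytes matched so far */
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == MARKER[matched]) {
            if (++matched == 4u)
                return (long)(i - 3u);  /* whole marker found */
        } else {
            /* Mismatch: reset, but let this byte restart a match. */
            matched = (buf[i] == MARKER[0]) ? 1 : 0;
        }
    }
    return -1;
}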

Redis, how does SCAN cursor "state management" work?

Redis has a SCAN command that may be used to iterate keys matching a pattern etc.
Redis SCAN doc
You start by giving a cursor value of 0; each call returns a new cursor value which you pass into the next SCAN call. A returned value of 0 indicates the iteration is finished. Supposedly no server or client state is needed (except for the cursor value).
I'm wondering how Redis implements the scanning algorithm-wise?
You may find the answer in the redis dict.c source file. I will quote part of it below.
Iterating works the following way:
1) Initially you call the function using a cursor (v) value of 0.
2) The function performs one step of the iteration, and returns the new cursor value you must use in the next call.
3) When the returned cursor is 0, the iteration is complete.
The function guarantees that all elements present in the dictionary get returned between the start and end of the iteration. However, it is possible some elements get returned multiple times. For every element returned, the callback argument 'fn' is called with 'privdata' as first argument and the dictionary entry 'de' as second argument.
How it works
The iteration algorithm was designed by Pieter Noordhuis. The main idea is to increment a cursor starting from the higher order bits. That is, instead of incrementing the cursor normally, the bits of the cursor are reversed, then the cursor is incremented, and finally the bits are reversed again.
This strategy is needed because the hash table may be resized between iteration calls. dict.c hash tables are always a power of two in size, and they use chaining, so the position of an element in a given table is given by computing the bitwise AND between Hash(key) and SIZE-1 (where SIZE-1 is the mask, equivalent to taking the remainder of dividing the hash of the key by SIZE).
For example if the current hash table size is 16, the mask is (in binary) 1111. The position of a key in the hash table will always be the last four bits of the hash output, and so forth.
What happens if the table changes in size?
If the hash table grows, elements can go anywhere in one multiple of the old bucket: for example let's say we already iterated with a 4 bit cursor 1100 (the mask is 1111 because hash table size = 16).
If the hash table is then resized to 64 elements, the new mask will be 111111. The new buckets you obtain by substituting either 0 or 1 for the ?? in ??1100 can be targeted only by keys we already visited when scanning bucket 1100 in the smaller hash table.
By iterating the higher bits first, because of the inverted counter, the cursor does not need to restart if the table size gets bigger. It will continue iterating using cursors without '1100' at the end, and also without any other combination of the final 4 bits already explored.
Similarly, when the table size shrinks over time, for example going from 16 to 8, if a combination of the lower three bits (the mask for size 8 is 111) was already completely explored, it will not be visited again, because we are sure we tried, for example, both 0111 and 1111 (all the variations of the higher bit), so we don't need to test it again.
Wait... You have TWO tables during rehashing!
Yes, this is true, but we always iterate the smaller table first, then we test all the expansions of the current cursor into the larger table. For example if the current cursor is 101 and we also have a larger table of size 16, we also test (0)101 and (1)101 inside the larger table. This reduces the problem back to having only one table, where the larger one, if it exists, is just an expansion of the smaller one.
Limitations
This iterator is completely stateless, and this is a huge advantage: no additional memory is used.
The disadvantages resulting from this design are:
It is possible we return elements more than once. However this is usually easy to deal with in the application level.
The iterator must return multiple elements per call, as it needs to always return all the keys chained in a given bucket, and all the expansions, so we are sure we don't miss keys moving during rehashing.
The reverse cursor is somewhat hard to understand at first, but this comment is supposed to help.
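To make the reversed-bit increment concrete, here is a small C paraphrase of the cursor update performed by dictScan() in dict.c. The bit-reversal helper uses the classic swap trick; Redis writes it differently, so treat this as an illustration rather than the actual source:

#include <stdint.h>

/* Reverse the bits of a 64-bit word (swap adjacent bits, then pairs, ...). */
static uint64_t rev64(uint64_t v)
{
    v = ((v >> 1)  & 0x5555555555555555ULL) | ((v & 0x5555555555555555ULL) << 1);
    v = ((v >> 2)  & 0x3333333333333333ULL) | ((v & 0x3333333333333333ULL) << 2);
    v = ((v >> 4)  & 0x0F0F0F0F0F0F0F0FULL) | ((v & 0x0F0F0F0F0F0F0F0FULL) << 4);
    v = ((v >> 8)  & 0x00FF00FF00FF00FFULL) | ((v & 0x00FF00FF00FF00FFULL) << 8);
    v = ((v >> 16) & 0x0000FFFF0000FFFFULL) | ((v & 0x0000FFFF0000FFFFULL) << 16);
    return (v >> 32) | (v << 32);
}

/* One cursor step: with mask = table size - 1, set the bits above the mask,
 * then increment with the bit order reversed, so the high-order bits of the
 * masked cursor advance first. */
uint64_t next_cursor(uint64_t v, uint64_t mask)
{
    v |= ~mask;        /* force unused high bits to 1 so they absorb the carry */
    v = rev64(v);
    v++;
    return rev64(v);
}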

Bitcoin: Edit CTxOut class in Bitcoin's source

Can we edit the CTxOut class and add a new field to it, in order to attach additional information to the transaction object and thus to the blockchain?
I got the following answer from here
Note that you should only do this, putting extra data into the block chain, if it is really necessary. The block chain has to be stored by every full node, so try not to take up all our hard drive space with unnecessary stuff whenever possible.
With that said, if you do want to add extra data to your transaction, then add an additional output to the transaction, for which the scriptPubKey has the following form:
OP_RETURN {80 bytes of whatever data you want}
80 bytes was chosen because it is big enough for a 64-byte hash plus 16 extra bytes of data, but not big enough to store anything maliciously big (like a movie collection). This transaction output is automatically unspendable, and so will not be kept in the UTXO set under any pruning. The other UTXOs from your transaction will still be safe.