Micrium uC/FS file system Mount delay - embedded

I am working with the uC/FS FAT16 filesystem on NOR flash over SPI. The volume mount takes 3 minutes. Even after the initial (first-time) mount, each power-on mount takes 3 minutes. How can this time be reduced?

Related

Efficient way to take hot snapshots from redis in production?

We have a Redis cluster which holds more than 2 million keys, and these keys are updated at an interval of 1 minute. Now we have a requirement to take a snapshot of the Redis DB at a particular interval, e.g. every 10 minutes. This snapshot should not pause Redis command execution.
Is there an async way of taking a snapshot from Redis?
It would be really helpful to get any suggestions on open-source tools or frameworks.
The Redis BGSAVE command is asynchronous and takes a snapshot.
It calls the fork() function of the OS. According to the Redis manual,
Fork() can be time consuming if the dataset is big, and may result in Redis to stop serving clients for some millisecond or even for one second if the dataset is very big and the CPU performance not great
Two million updates in one minute is 30K+ QPS.
So you really have to try it out: run a benchmark that simulates your workload, then issue BGSAVE, monitor the I/O and CPU usage of your system, and see if there's a spike in your Redis call latency.
Then issue LASTSAVE, which will tell you when your last successful snapshot happened, so you can adjust your backup schedule.
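As a minimal sketch of that workflow with the redis-py client (host, port and timing are placeholders): trigger BGSAVE, then poll LASTSAVE until its timestamp changes to confirm the snapshot finished.

```python
import time
import redis

# Placeholder connection details; point this at one of your cluster nodes.
r = redis.Redis(host="localhost", port=6379, db=0)

before = r.lastsave()          # timestamp of the last successful snapshot
r.bgsave()                     # fork a child and write the RDB in the background
while r.lastsave() == before:  # poll until the background save has completed
    time.sleep(1)
print("snapshot finished at", r.lastsave())
```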

scrapinghub starting job too slow

I am new to scraping and I am running different jobs on Scrapinghub via their API. The problem is that starting and initializing the spider takes too much time, around 30 seconds. When I run it locally, the spider takes up to 5 seconds to finish, but on Scrapinghub it takes 2:30 minutes. I understand that closing a spider after all requests are finished takes a little more time, but that is not the problem. My problem is that from the moment I call the API to start the job (I see it appear in the running jobs instantly, but it takes too long to make the first request) to the moment the first request is made, I have to wait far too long. Any idea how I can make it as fast as running locally? Thanks!
I already tried setting AUTOTHROTTLE_ENABLED = False, as I saw in another question on Stack Overflow.
According to the Scrapy Cloud docs:
Scrapy Cloud jobs run in containers. These containers can be of different sizes defined by Scrapy Cloud units.
A Scrapy Cloud unit provides 1 GB of RAM, 2.5 GB of disk space, 1x CPU, and 1 concurrent crawl slot.
Resources available to the job are proportional to the number of units allocated.
It means that allocating more Scrapy Cloud units can solve your problem.
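If that is the bottleneck, a sketch of requesting more units when scheduling the job with the python-scrapinghub client could look like the following. The API key, project id and spider name are placeholders, and the units keyword is an assumption to verify against the current python-scrapinghub docs and your plan.

```python
from scrapinghub import ScrapinghubClient

# Placeholder credentials and identifiers.
client = ScrapinghubClient("YOUR_API_KEY")
project = client.get_project(123456)

# Assumption: 'units' requests a larger container for this job.
job = project.jobs.run(spider="my_spider", units=2)
print(job.key)
```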

Restoring Large State in Apache Flink Streaming Job

We have a cluster running Hadoop and YARN on AWS EMR with one core node and one master node, each with 4 vCores, 32 GB memory, and 32 GB disk. We only have one long-running YARN application, and within that there are only one or two long-running Flink applications, each with a parallelism of 1. Checkpointing runs at a 10-minute interval with a minimum of 5 minutes between checkpoints. We use event time with a window of 10 minutes and a watermark duration of 15 seconds. The state is stored in S3 through the FsStateBackend with async snapshots enabled. Exactly-once checkpointing is enabled as well.
We have UUIDs set up for all operators but don't have HA set up for YARN or explicit max parallelism for the operators.
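For reference, here is a rough sketch of the checkpointing setup described above, written with the PyFlink DataStream API. The S3 path is a placeholder and the actual job is presumably written against the Java/Scala API, so treat this as an approximation of the configuration, not the original code.

```python
from pyflink.datastream import StreamExecutionEnvironment, CheckpointingMode
from pyflink.datastream.state_backend import FsStateBackend

env = StreamExecutionEnvironment.get_execution_environment()
env.set_parallelism(1)

# 10-minute checkpoint interval with exactly-once semantics
env.enable_checkpointing(10 * 60 * 1000, CheckpointingMode.EXACTLY_ONCE)
# at least 5 minutes of pause between two checkpoints
env.get_checkpoint_config().set_min_pause_between_checkpoints(5 * 60 * 1000)

# checkpoints written to S3 via the FsStateBackend
# ("s3://my-bucket/flink-checkpoints" is a placeholder path)
env.set_state_backend(FsStateBackend("s3://my-bucket/flink-checkpoints"))
```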
Currently, when restoring from a checkpoint (3 GB), processing stalls at the windowing until an org.apache.flink.util.FlinkException: The assigned slot <container_id> was removed error is thrown during the next checkpoint. I have seen that all operators except the one with the largest state (a ProcessFunction directly after the windowing) finish checkpointing.
I know it is strongly suggested to use RocksDB for production, but is it mandatory for state that most likely won't exceed 50 GB?
Where would be the best place to start addressing this problem? Parallelism?

Avoid Redis processes making RDB snapshots at the same time

Imagine a Redis Cluster setup, for example, or just a usual sharded setup, where we have N > 1 Redis processes per physical node. All our processes have the same redis.conf with the same SAVE options enabled and the same SAVE period. So, if all our main Redis processes start at the same time, all of them will start a SAVE at the same time or around it.
When we have 9 Redis processes and all of them start RDB snapshotting at the same time, it:
Affects performance, because we get 9 forked processes that start consuming CPU and doing I/O at the same time.
Requires too much reserved additional memory that can't be used as actual storage, because with a write-heavy application Redis may use up to 2x its normal memory during snapshotting. So if we want Redis processes holding 100 GB on this node, we should reserve an additional 100 GB to be safe against all processes forking at the same time.
Is there any best practice to modify this setup and make Redis processes start saving one by one or at least with some randomization?
My only idea is to disable the save schedule in redis.conf and write a cron script that triggers the saves one by one with a time lag (see the sketch below). But this solution looks like a hack, and there should be some better practice here.
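As a sketch of that idea (a workaround, not an established best practice): with automatic saves disabled in redis.conf, a small script run from cron could trigger BGSAVE on each local instance in turn and wait for it to finish before moving to the next. The ports and sleep interval below are made up.

```python
import time
import redis

# Placeholder ports of the N Redis processes running on this node.
PORTS = [6379, 6380, 6381]

for port in PORTS:
    r = redis.Redis(port=port)
    before = r.lastsave()
    r.bgsave()
    # Wait until this instance's snapshot completes before starting the next,
    # so only one fork is consuming CPU, I/O and extra memory at a time.
    while r.lastsave() == before:
        time.sleep(5)
```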

Redis / Limit of .rdb file

I use Redis, and it saves its .rdb file (on every transaction).
I notice that the .rdb file in production grows by 15 MB every day (it now stands at 75 MB).
Is there any limit to the .rdb file? Does this affect the performance of the Redis DB?
The .rdb file on disk has no direct impact on the running Redis instance.
The only limit seems to be the filesystem.
We have a 10 GB compressed .rdb which is about 28 GB in memory, and we have also had much bigger ones.
But you may encounter pauses if you save datasets as large as ours to disk (even if you use http://redis.io/commands/bgsave ).
When the forked Redis process writes the latest diff, Redis is unresponsive until it is completely written to disk. This time span depends on different factors, such as writes happening during the BGSAVE, the overall number of keys, the size of hashes, and so on.
And, be sure to correctly set up the "save" configuration depending on your needs.
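For example, the snapshot schedule can be inspected and adjusted at runtime with CONFIG GET/SET; here is a sketch using redis-py, where the thresholds are only an illustration and connection details are placeholders.

```python
import redis

r = redis.Redis(host="localhost", port=6379)

# Current snapshot schedule, e.g. {'save': '900 1 300 10 60 10000'}
print(r.config_get("save"))

# Snapshot if at least 10000 keys changed in 60 s, or 10 keys in 300 s.
r.config_set("save", "300 10 60 10000")
```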