Is there any way to separate the APC opcode (file) cache and the user cache?

My user cache has a high churn (expire/put) rate, which creates very heavy fragmentation and forces too many cache flushes.
Besides, sometimes I need to flush only one of them.
Is there any way to separate them, or to create multiple separate cache spaces, with APC?

Related

For Ignite Persistence, how to control maximum data size on disk?

How can I limit the maximum size on disk when using Ignite Persistence? For example, my data set in a database is 5 TB. I want to cache a maximum of 50 GB of data in memory, with no more than 500 GB on disk. Any reasonable eviction policy like LRU for the on-disk data will work for me. The maxSize parameter controls the in-memory size, and I will set it to 50 GB. What should I do to limit my disk storage to 500 GB then? I'm looking for something like maxPersistentSize and not finding it there.
Thank you
There is no direct parameter to limit the total disk usage occupied by the data itself. As you mentioned in the question, you can control in-memory region allocation, but when a data region is full, data pages are flushed to and loaded from disk on demand; this process is called page replacement.
On the other hand, page eviction works only for a non-persistent cluster, preventing it from running out of memory. Personally, I can't see how or why such eviction would be implemented for data stored on disk. I'm almost sure that other "normal" DBs like Postgres or MySQL do not have this option either.
I suppose you might check the following options:
You can limit the WAL and WAL archive maximum sizes. Though these items are rather utility ones, they still might occupy a lot of space [1]
Check whether it's possible to use expiry policies on your data items; in that case, data will be cleared from disk as well [2] (both options are sketched in the code after this list)
Use monitoring metrics and configure alerting to be aware if you are close to the disk limits.
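
Neither of the first two options is a single "maximum disk size" switch, but roughly, on a recent Ignite 2.x, they could be wired up as in the sketch below. The cache name, expiry duration, and WAL limits are illustrative assumptions, not values mandated by Ignite:

import java.util.concurrent.TimeUnit;

import javax.cache.expiry.CreatedExpiryPolicy;
import javax.cache.expiry.Duration;

import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.cluster.ClusterState;
import org.apache.ignite.configuration.CacheConfiguration;
import org.apache.ignite.configuration.DataRegionConfiguration;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class IgniteDiskLimitsSketch {
    public static void main(String[] args) {
        // Persistent 50 GB in-memory region (the maxSize from the question).
        DataRegionConfiguration region = new DataRegionConfiguration()
            .setName("persistent-region")
            .setMaxSize(50L * 1024 * 1024 * 1024)
            .setPersistenceEnabled(true);

        DataStorageConfiguration storageCfg = new DataStorageConfiguration();
        storageCfg.setDefaultDataRegionConfiguration(region);

        // Option 1: cap the WAL segment size and the WAL archive size.
        storageCfg.setWalSegmentSize(64 * 1024 * 1024);            // 64 MB per segment
        storageCfg.setMaxWalArchiveSize(10L * 1024 * 1024 * 1024); // 10 GB archive cap

        // Option 2: expire entries so they are eventually removed from disk as well.
        CacheConfiguration<Long, String> cacheCfg = new CacheConfiguration<>("myCache");
        cacheCfg.setExpiryPolicyFactory(CreatedExpiryPolicy.factoryOf(new Duration(TimeUnit.DAYS, 7)));
        cacheCfg.setEagerTtl(true); // proactively remove expired entries

        IgniteConfiguration cfg = new IgniteConfiguration()
            .setDataStorageConfiguration(storageCfg)
            .setCacheConfiguration(cacheCfg);

        try (Ignite ignite = Ignition.start(cfg)) {
            // With persistence enabled the cluster must be activated before use.
            ignite.cluster().state(ClusterState.ACTIVE);
        }
    }
}

Note that neither setting caps the size of the data files themselves; they only bound the WAL footprint and let old entries age out of the store.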

Choose to store table exclusively on disk in Apache Ignite

I understand that the native persistence mode of Apache Ignite keeps as much data as possible in memory, with the remaining data on disk.
Is it possible to manually choose which table I want to store in memory and which I want to store EXCLUSIVELY on disk? If I want to save costs, should I just give Ignite a lot of disk space and just a small amount of memory? What if I know some tables should return results as fast as possible while other tables have lower priorities in terms of speed (even if they are accessed more often)? Is there any feature to prioritize data storage into memory at table level or any other level?
You can define two different data regions - one with a small amount of memory and persistence enabled, and a second without persistence but with a bigger maximum memory size: https://apacheignite.readme.io/docs/memory-configuration
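
A rough Java sketch of that suggestion, assuming a recent Ignite 2.x API (the region names, sizes, and cache names below are illustrative assumptions):

import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.cluster.ClusterState;
import org.apache.ignite.configuration.CacheConfiguration;
import org.apache.ignite.configuration.DataRegionConfiguration;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class TwoDataRegionsSketch {
    public static void main(String[] args) {
        // Small persistent region: pages that do not fit in RAM are read back from disk on demand.
        DataRegionConfiguration smallPersistent = new DataRegionConfiguration()
            .setName("small-persistent")
            .setMaxSize(512L * 1024 * 1024)        // 512 MB of RAM
            .setPersistenceEnabled(true);

        // Bigger purely in-memory region for tables that must answer as fast as possible.
        DataRegionConfiguration bigInMemory = new DataRegionConfiguration()
            .setName("big-in-memory")
            .setMaxSize(8L * 1024 * 1024 * 1024)   // 8 GB of RAM
            .setPersistenceEnabled(false);

        DataStorageConfiguration storageCfg = new DataStorageConfiguration()
            .setDataRegionConfigurations(smallPersistent, bigInMemory);

        // Each cache (i.e. each table) is pinned to a region by name.
        CacheConfiguration<Long, String> coldTable = new CacheConfiguration<Long, String>("coldTable")
            .setDataRegionName("small-persistent");
        CacheConfiguration<Long, String> hotTable = new CacheConfiguration<Long, String>("hotTable")
            .setDataRegionName("big-in-memory");

        IgniteConfiguration cfg = new IgniteConfiguration()
            .setDataStorageConfiguration(storageCfg)
            .setCacheConfiguration(coldTable, hotTable);

        try (Ignite ignite = Ignition.start(cfg)) {
            ignite.cluster().state(ClusterState.ACTIVE); // required because persistence is enabled
        }
    }
}

Low-priority tables then sit in the small persistent region, while hot tables stay purely in memory.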
You can't have a cache (which contains the rows for a table) stored exclusively on disk.
When you add a row to a table, it gets stored in Durable Memory, which is always located in RAM. Later it may be flushed to disk via the checkpointing process, which uses the checkpoint page buffer, also located in RAM. So you can have a separate region with low memory usage (see another answer), but you can't have data exclusively on disk.
When you access data, it will always be pulled from disk into Durable Memory as well.

Maintaining logs in redis cache using java

Requirement - Our application processes files containing records, and we have to maintain a log for the records in every file. The log file can easily be 100 MB in size at times.
Solution - Since database operations would be very heavy, we want to go for an in-memory cache: write the logs for a particular file into a Redis key (the key might be the unique file name itself). Later, when the user wants to see the log file, the application should be able to read the contents from the cache using the unique file name as the key and write them into a file which the user can see/download.
Question - Is it a good idea to keep appending the logs for a particular file to the same key and later, when we have to write to the file, read from the key and write the contents to the file? Basically, the value of the Redis key would always be a string, and it might grow to 100 MB or more. Will there be any problems because of this?
You can achieve this easily with Redis, but don't forget that Redis is an in-memory store (make sure you don't run out of RAM). Ask yourself why you want an in-memory store over normal disk operations when dealing with files. If reads are frequent and access time is crucial, go ahead with Redis (see the sketch below).
Regarding size - 100 MB is not a problem; a Redis string can hold up to 512 MB, and a List, Set, or Hash can hold more than 4 billion entries.
I prefer MongoDB (which is a disk-based document store) over Redis for this kind of operation.
Consider looking at this link to know when Redis is awesome.
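
If you do go with Redis, here is a minimal Java sketch of the append/read-back flow using the Jedis client (the client choice, key prefix, and file names are assumptions; any client that exposes APPEND and GET works the same way):

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

import redis.clients.jedis.Jedis;

public class RedisFileLogSketch {

    // Hypothetical key namespace so log keys do not collide with other data.
    private static final String KEY_PREFIX = "filelog:";

    // Append one log line to the key for the given file.
    static void appendLog(Jedis jedis, String fileName, String line) {
        jedis.append(KEY_PREFIX + fileName, line + "\n");
    }

    // Read the accumulated log back and write it to a file the user can download.
    static Path dumpLog(Jedis jedis, String fileName, Path target) throws IOException {
        String contents = jedis.get(KEY_PREFIX + fileName);
        return Files.write(target, (contents == null ? "" : contents).getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            appendLog(jedis, "records-001.dat", "row 17: missing mandatory field");
            appendLog(jedis, "records-001.dat", "row 42: parsed OK");
            dumpLog(jedis, "records-001.dat", Paths.get("records-001.log"));
        }
    }
}

Note that reading the value back with GET pulls the whole string (potentially ~100 MB) into the JVM at once, so do that in the background job that produces the downloadable file, not in a latency-sensitive request.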

How to share the APC user cache between CLI and Web Server instances?

I am using PHP's APC to store a large amount of information (with apc_fetch(), etc.). This information occasionally needs to be analyzed and dumped elsewhere.
The story goes: I'm getting several hundred hits/sec. These hits increment various counters (with apc_inc() and friends). Every hour, I would like to iterate over all the values I've accumulated, do some other processing with them, and then save them to disk.
I could do this as a random or time-based switch in each request, but it's a potentially long operation (may require 20-30 sec, if not several minutes) and I do not want to hang a request for that long.
I thought a simple PHP cronjob would do the task. However, I can't even get it to read back the cache information.
<?php
print_r(apc_cache_info());
?>
Yields a seemingly different APC memory segment, with:
[num_entries] => 1
(The single entry seems to be the opcode cache entry for the script itself.)
While my webserver, powered by nginx/php5-fpm, yields:
[num_entries] => 3175
So, they are obviously not sharing the same chunk of memory. How can I access the same chunk of memory from the CLI script (preferred)? Or, if that is simply not possible, what would be the absolute safest way to execute a long-running sequence on, say, a random HTTP request every hour?
For the latter, would using register_shutdown_function() and immediately calling set_time_limit(0) and ignore_user_abort(true) do the trick to ensure execution completes and doesn't "hang" anyone's browser?
And yes, I am aware of redis, memcache, etc. that would not have this problem, but I am stuck with APC for now, as neither could demonstrate the same speed as APC.
This is really a design issue and a matter of selecting preferred costs vs. payoffs.
You are thrilled by the speed of APC since you do not spend time persisting the data. You also want to persist the data, but then the performance hit is too big. You have to balance these out somehow.
If persistence is important, take the hit and persist (to a file, DB, etc.) on every request. If speed is all you care about, change nothing - this whole question becomes moot. There are cache systems with persistent storage that can optimize your disk writes by aggregating what gets written to disk and when, but there will generally always be a trade-off between the two, with varying tipping points. You just have to choose which suits your objectives.
There will probably never exist an enduring, wholesome technological solution to the wolf being sated and the lamb being whole.
If you really must do it your way, you could have a cron job that cURLs a special request to your application, which would trigger persisting your cache to disk. That way you control the request, its timeout, etc., and don't have to worry about whatever users might do to kill their requests.
The potential risks in this case, however, are data integrity (you will be writing the cache to disk while it is being updated by other requests) and the performance hit paid by requests that are served while your server is busy persisting the cache.
Essentially, we introduced a bundle of hay to the wolf/lamb dilemma ;)

RavenDB disk storage

I have a requirement to keep the RavenDB database running when the disk used for the main database and index storage is full. I know I can provide a separate drive for index storage with the config option Raven/IndexStoragePath.
But I need to design for the corner case of that disk being full. What is the usual pattern used in this situation? One way is to halt all access while shutting the service down, updating the config file programmatically, and then starting the service - but it is a bit risky.
I am aware of sharding, and this question is not related to that; assume that sharding is enabled, I have multiple shards, and I want to increase storage for each shard by adding a new drive to each. Is there an elegant solution to this?
user544550,
In a disk full scenario, RavenDB will continue to operate, but will refuse to accept further writes.
Indexing will fail as well and eventually mark the indexes as permanently failing.
What is your actual scenario?
Note that in RavenDB, indexes tend to be significantly smaller than the actual data size, so the major cause for disk space utilization is actually the main database, not the indexes.