how to use total-max-memory property - apache

While creating a region in Geode you can specify --total-max-memory, which should limit the amount of memory used by the region's entries.
ref: https://geode.apache.org/docs/guide/tools_modules/gfsh/command-pages/create.html#topic_54B0985FEC5241CA9D26B0CE0A5EA863
I created a region of type PARTITION_OVERFLOW with total-max-memory set, and I can see that this attribute is present in the partition attributes for the region on the server. However, when the amount of data crossed the total-max-memory limit, the region did not start overflowing old entries to disk. Only after some time (when memory usage was almost 10x greater than total-max-memory) did the heap LRU (which is based on total JVM heap) kick in and start evicting entries.
Is there any additional setting that has to be configured to trigger eviction when the total-max-memory limit is reached for a region?

The total-max-memory option is not used in Geode. The reference JIRA is https://issues.apache.org/jira/browse/GEODE-2719.
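Since the attribute is ignored, one workaround (my own suggestion, not something from the JIRA) is to configure memory-based LRU eviction with overflow to disk explicitly. Below is a minimal sketch using the Java API; the region name and the 512 MB threshold are placeholders, and the default disk store is assumed.

import org.apache.geode.cache.Cache;
import org.apache.geode.cache.CacheFactory;
import org.apache.geode.cache.EvictionAction;
import org.apache.geode.cache.EvictionAttributes;
import org.apache.geode.cache.Region;
import org.apache.geode.cache.RegionShortcut;

public class MemoryEvictionRegion {
    public static void main(String[] args) {
        Cache cache = new CacheFactory().create();

        // Overflow least-recently-used entries to the default disk store once
        // this member holds roughly 512 MB of entries in the region.
        Region<String, Object> region = cache
                .<String, Object>createRegionFactory(RegionShortcut.PARTITION)
                .setEvictionAttributes(EvictionAttributes.createLRUMemoryAttributes(
                        512, null, EvictionAction.OVERFLOW_TO_DISK))
                .create("exampleRegion");

        region.put("key", "value");
        cache.close();
    }
}

The same eviction attributes can also be declared in cache.xml; the point is that the threshold has to go into the eviction attributes rather than into total-max-memory.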

Related

Limit Infinispan File Store Size

I would like to cache very large amounts of data in an Infinispan 13 cache that uses passivation to the disk. I've accomplished this with the following configuration:
<persistence passivation="true">
<file-store purge="true"/>
</persistence>
<memory storage="OFF_HEAP" max-size="1GB" when-full="REMOVE"/>
However, now I would like to set the maximum size for the file-store to, for example, 50 GB and have the cache delete overflowing entries completely.
Is there a way to do this? I could not find any option to limit the size of a file-store in the documentation.
Thank you!
There is no way to specifically limit the total size of the files stored. Depending upon your use case, setting the compaction-ratio lower should help free some space: https://docs.jboss.org/infinispan/13.0/configdocs/infinispan-config-13.0.html under file-store.
You can use expiration though to remove entries after a given period of time. https://infinispan.org/docs/stable/titles/configuring/configuring.html#expiration_configuring-memory-usage This will remove those entries from the cache, which in turn would hit that compaction-ratio sooner to clean up old files.
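For reference, here is roughly how the same setup plus the suggested expiration could be expressed programmatically with the ConfigurationBuilder API. This is a sketch under a few assumptions: Infinispan 13's soft-index file store backing the file-store element, and a one-day lifespan and cache name picked purely as placeholders.

import java.util.concurrent.TimeUnit;
import org.infinispan.configuration.cache.Configuration;
import org.infinispan.configuration.cache.ConfigurationBuilder;
import org.infinispan.configuration.cache.StorageType;
import org.infinispan.eviction.EvictionStrategy;
import org.infinispan.manager.DefaultCacheManager;

public class BoundedCacheExample {
    public static void main(String[] args) {
        ConfigurationBuilder builder = new ConfigurationBuilder();

        // Mirrors the XML above: 1 GB of off-heap storage, removing entries when full.
        builder.memory()
               .storage(StorageType.OFF_HEAP)
               .maxSize("1GB")
               .whenFull(EvictionStrategy.REMOVE);

        // Passivating file store, purged on startup.
        builder.persistence()
               .passivation(true)
               .addSoftIndexFileStore()
               .purgeOnStartup(true);

        // Expiration is the lever that actually shrinks the store over time:
        // entries older than the lifespan are removed outright.
        builder.expiration().lifespan(1, TimeUnit.DAYS);

        Configuration config = builder.build();
        try (DefaultCacheManager manager = new DefaultCacheManager()) {
            manager.defineConfiguration("bounded-cache", config);
            manager.getCache("bounded-cache").put("key", "value");
        }
    }
}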

how to reduce the number of containers in the query

I have a query that uses too many containers and too much memory (97% of the memory is used).
Is there a way to set the number of containers used in the query and limit the max memory?
The query is running on Tez.
Thanks in advance
Controlling the number of Mappers:
The number of mappers depends on various factors such as how the data is distributed among nodes, the input format, the execution engine, and configuration parameters. See also How initial task parallelism works.
MR uses CombineInputFormat, while Tez uses grouped splits.
Tez:
set tez.grouping.min-size=16777216; -- 16 MB min split
set tez.grouping.max-size=1073741824; -- 1 GB max split
Increase these figures to reduce the number of mappers running.
Also, mappers run on the data nodes where the data is located, which is why manually controlling the number of mappers is not an easy task; it is not always possible to combine input.
Controlling the number of Reducers:
The number of reducers is determined according to
mapreduce.job.reduces
The default number of reduce tasks per job. Typically set to a prime close to the number of available hosts. Ignored when mapred.job.tracker is "local". Hadoop sets this to 1 by default, whereas Hive uses -1 as its default value. By setting this property to -1, Hive will automatically figure out the number of reducers.
hive.exec.reducers.bytes.per.reducer - The default is 1 GB prior to Hive 0.14.0 and 256 MB in Hive 0.14.0 and later.
Also hive.exec.reducers.max - Maximum number of reducers that will be used. If mapreduce.job.reduces is negative, Hive will use this as the maximum number of reducers when automatically determining the number of reducers.
Simply set hive.exec.reducers.max=<number> to limit the number of reducers running.
If you want to increase reducer parallelism, increase hive.exec.reducers.max and decrease hive.exec.reducers.bytes.per.reducer.
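To make the interplay concrete, here is a rough sketch of the arithmetic Hive applies when mapreduce.job.reduces is -1: the input size divided by hive.exec.reducers.bytes.per.reducer, capped by hive.exec.reducers.max. The real planner has more corner cases; the figures in main() are only illustrative.

public class ReducerEstimate {
    static int estimateReducers(long totalInputBytes,
                                long bytesPerReducer,   // hive.exec.reducers.bytes.per.reducer
                                int maxReducers) {      // hive.exec.reducers.max
        int byDataSize = (int) Math.ceil((double) totalInputBytes / bytesPerReducer);
        return Math.max(1, Math.min(byDataSize, maxReducers));
    }

    public static void main(String[] args) {
        long totalInputBytes = 500L * 1024 * 1024 * 1024; // 500 GB of input
        System.out.println(estimateReducers(totalInputBytes, 1L << 30, 1009));
        // -> 500 reducers: the size-based estimate wins over the cap
        System.out.println(estimateReducers(totalInputBytes, 1L << 28, 1009));
        // -> 1009 reducers: hive.exec.reducers.max caps the parallelism
    }
}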
Memory settings
set tez.am.resource.memory.mb=8192;
set tez.am.java.opts=-Xmx6144m;
set tez.reduce.memory.mb=6144;
set hive.tez.container.size=9216;
set hive.tez.java.opts=-Xmx6144m;
The default settings mean that the actual Tez task will use the mapper's memory setting:
hive.tez.container.size = mapreduce.map.memory.mb
hive.tez.java.opts = mapreduce.map.java.opts
Read this for more details: Demystify Apache Tez Memory Tuning - Step by Step
I would suggest optimizing the query first. Use map-joins if possible, use vectorized execution, add distribute by partition key if you are writing a partitioned table to reduce memory consumption on the reducers, and of course write good SQL.

Is there any upper limit on the number of members a sorted set in redis can store?

Is there any upper limit on the number of members a sorted set in redis can store?
For example, according to this link, only 2^32 - 1 different members can be stored in a Redis set or list. No such upper limit is mentioned for a Redis sorted set. So should I assume that the upper limit depends on the memory that is available, or is there a fixed number?
The same limit - 2^32-1 - applies to Redis' Sets and Sorted Sets as well.
An excerpt from the Data types page at redis.io:
The max number of members in a set is 2^32 - 1 (4294967295, more than 4 billion members per set).
While not mentioned on that page, both Sets and Sorted Sets use the same underlying data structure (which, in turn, is a hash). Hence, they share the same limit.
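In practice you will run out of RAM long before you reach that cap. A back-of-the-envelope sketch (the ~100 bytes per member below is an assumed placeholder for member string, score, and skiplist/hash overhead):

public class SortedSetLimit {
    public static void main(String[] args) {
        long maxMembers = (1L << 32) - 1;          // 4,294,967,295
        long assumedBytesPerMember = 100;          // rough per-member overhead (assumption)
        double ramGib = maxMembers * (double) assumedBytesPerMember
                / (1024.0 * 1024 * 1024);
        System.out.printf("A full sorted set would need roughly %.0f GiB of RAM%n", ramGib);
        // -> roughly 400 GiB, so available memory is the practical limit.
    }
}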

How to calculate redis memory used percentage on ElastiCache

I want to monitor my Redis cache cluster on ElastiCache. From AWS/ElastiCache I am able to get metrics like FreeableMemory and BytesUsedForCache. If I am not wrong, BytesUsedForCache is the memory used by the cluster (assuming there is only one node in the cluster). I want to calculate the percentage of memory used. Can anyone help me to get the percentage of memory used in Redis?
We had the same issue since we wanted to monitor the percentage of ElastiCache Redis memory that is consumed by our data.
As you wrote correctly, you need to look at BytesUsedForCache - that is the amount of memory (in bytes) consumed by the data you've stored in Redis.
The other two important numbers are
The available RAM of the AWS instance type you use for your ElastiCache node, see https://aws.amazon.com/elasticache/pricing/
Your value for parameter reserved-memory-percent (check your ElastiCache parameter group). That's the percentage of RAM that is reserved for "nondata purposes", i.e. for the OS and whatever AWS needs to run there to manage your ElastiCache node. By default this is 25 %. See https://docs.aws.amazon.com/AmazonElastiCache/latest/red-ug/redis-memory-management.html#redis-memory-management-parameters
So the total available memory for your data in ElastiCache is
(100 - reserved-memory-percent) / 100 * instance-RAM-size
(In our case, we use instance type cache.r5.2xlarge with 52.82 GB RAM, and we have the default setting of reserved-memory-percent = 25%.
Checking with the info command in Redis, I see that maxmemory_human = 39.61 GB, which is equal to 75% of 52.82 GB.)
So the ratio of used memory to available memory is
BytesUsedForCache / ((100 - reserved-memory-percent) / 100 * instance-RAM-size)
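As a sanity check, here is the same ratio in code; the instance RAM and reserved-memory-percent values are placeholders, so substitute the figures for your own node type and parameter group.

public class ElastiCacheMemoryPercent {
    static double usedPercent(double bytesUsedForCache,
                              double instanceRamGib,
                              double reservedMemoryPercent) {
        double instanceRamBytes = instanceRamGib * 1024 * 1024 * 1024;
        double availableForData = (1.0 - reservedMemoryPercent / 100.0) * instanceRamBytes;
        return bytesUsedForCache / availableForData * 100.0;
    }

    public static void main(String[] args) {
        // cache.r5.2xlarge (~52.82 GiB) with the default 25% reserved memory.
        double bytesUsed = 20L * 1024 * 1024 * 1024; // 20 GiB of cached data
        System.out.printf("%.1f%% of the data memory is used%n",
                usedPercent(bytesUsed, 52.82, 25.0));
    }
}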
By comparing the freeableMemory and bytesUsedForCache metrics, you will have the available memory for ElastiCache non-cluster mode (not sure if it applies to cluster mode too).
Here is the NRQL we're using to monitor the cache:
SELECT Max(`provider.bytesUsedForCache.Sum`) / (Max(`provider.bytesUsedForCache.Sum`) + Min(`provider.freeableMemory.Sum`)) * 100 FROM DatastoreSample WHERE provider = 'ElastiCacheRedisNode'
This is based on the following:
FreeableMemory: The amount of free memory available on the host. This is derived from the RAM, buffers and cache that the OS reports as freeable. (AWS CacheMetrics HostLevel)
BytesUsedForCache: The total number of bytes allocated by Redis for all purposes, including the dataset, buffers, etc. This is derived from the used_memory statistic at Redis INFO. (AWS CacheMetrics Redis)
So BytesUsedForCache (the amount of memory used by Redis) + FreeableMemory (the amount of memory Redis can still use) = total memory that Redis can use.
With the release of the 18 additional CloudWatch metrics, you can now use DatabaseMemoryUsagePercentage and see the percentage of memory utilization in Redis.
View more about the metric in the memory section here
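If you want to pull that metric programmatically rather than from the console, something along these lines should work with the AWS SDK for Java v2; the cache cluster id, time window, and period are placeholders.

import java.time.Duration;
import java.time.Instant;
import software.amazon.awssdk.services.cloudwatch.CloudWatchClient;
import software.amazon.awssdk.services.cloudwatch.model.Datapoint;
import software.amazon.awssdk.services.cloudwatch.model.Dimension;
import software.amazon.awssdk.services.cloudwatch.model.GetMetricStatisticsRequest;
import software.amazon.awssdk.services.cloudwatch.model.GetMetricStatisticsResponse;
import software.amazon.awssdk.services.cloudwatch.model.Statistic;

public class RedisMemoryUsageMetric {
    public static void main(String[] args) {
        // "my-redis-node-001" is a placeholder cache cluster id.
        try (CloudWatchClient cloudWatch = CloudWatchClient.create()) {
            GetMetricStatisticsRequest request = GetMetricStatisticsRequest.builder()
                    .namespace("AWS/ElastiCache")
                    .metricName("DatabaseMemoryUsagePercentage")
                    .dimensions(Dimension.builder()
                            .name("CacheClusterId")
                            .value("my-redis-node-001")
                            .build())
                    .startTime(Instant.now().minus(Duration.ofHours(1)))
                    .endTime(Instant.now())
                    .period(300) // 5-minute datapoints
                    .statistics(Statistic.AVERAGE)
                    .build();

            GetMetricStatisticsResponse response = cloudWatch.getMetricStatistics(request);
            for (Datapoint dp : response.datapoints()) {
                System.out.printf("%s -> %.1f%%%n", dp.timestamp(), dp.average());
            }
        }
    }
}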
You would have to calculate this based on the size of the node you have selected. See these 2 posts for more information.
Pricing doc gives you the size of your setup.
https://aws.amazon.com/elasticache/pricing/
https://forums.aws.amazon.com/thread.jspa?threadID=141154

What is the maximum value size you can store in redis?

Does anyone know the maximum value size you can store in Redis? I want to use Redis as a message queue with Celery to store some small documents that need to be processed by a worker on another server, and I want to make sure the documents aren't going to be too big.
I found one page with a reference to 1 GB, but when I followed the link on that page to where they got that answer, the link wasn't valid anymore. Here is the link:
http://news.ycombinator.com/item?id=1182005
All string values are limited to 512 MiB. This is the size limit you probably care most about.
EDIT: Because keys in Redis are strings, the maximum key size is 512 MiB. The maximum number of keys is 2^32 - 1 = 4,294,967,295.
Values, on the other hand, can vary in size depending on their type. For aggregate data types (i.e. hash, list, set, and sorted set), the maximum value size is 512 MiB for each element, although the data structure itself can have up to 2^32 - 1 elements.
https://redis.io/topics/data-types
https://redis.io/topics/faq#what-is-the-maximum-number-of-keys-a-single-redis-instance-can-hold-and-what-is-the-max-number-of-elements-in-a-hash-list-set-sorted-set
http://groups.google.com/group/redis-db/browse_thread/thread/1c7e33fbc98734b3?fwc=2
The article about Redis Memory Usage can help you roughly determine how much memory your database would take.
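If you want to guard against oversized documents before they ever reach the queue, a simple client-side size check is enough. Here is a sketch using Jedis (any client works; host, port, and key are placeholders):

import java.nio.charset.StandardCharsets;
import redis.clients.jedis.Jedis;

public class RedisValueSizeCheck {
    // Redis string values are capped at 512 MiB.
    private static final long MAX_VALUE_BYTES = 512L * 1024 * 1024;

    public static void main(String[] args) {
        String document = "... some serialized document ...";
        byte[] payload = document.getBytes(StandardCharsets.UTF_8);

        if (payload.length > MAX_VALUE_BYTES) {
            throw new IllegalArgumentException(
                "Value is " + payload.length + " bytes, above the 512 MiB string limit");
        }

        // Jedis is used here only as an example client; host and port are placeholders.
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            jedis.set("doc:1".getBytes(StandardCharsets.UTF_8), payload);
        }
    }
}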
It's on the order of the amount of RAM you have, at least, so unless you plan on putting multi-gigabyte objects in there I wouldn't worry. I've had sets that were hundreds of megabytes in size without a problem, but I don't know the exact limits.
A String value can be at most 512 MB in size. But according to this link, the size can be increased.