ElastiCache cloudwatch metrics for Redis: currItems for a single database - redis

I have set up a metric in the AWS console for the ElastiCache Redis cluster. I'm alerting when currItems is greater than a certain threshold for a given period (say 1000 for 1 minute).
The issue is that I have two databases in Redis, named 0 and 1. I would like to get currItems for database 0 only, not database 1, since database 1 holds values for a longer period of time and makes the whole metric look much bigger than it should (I only care about the current items of database 0).
Is there a way to create a metric that reports only the currItems of database 0?

You will have to create an application for this or use existing redis tools.
https://stackoverflow.com/questions/8614737/what-are-some-recommended-tools-to-monitor-a-redis-database
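One way to sketch such an application with redis-py: poll INFO keyspace yourself and extract the per-database key count, which you could then publish as a custom CloudWatch metric. The helper name and cluster endpoint below are illustrative assumptions, not anything ElastiCache provides:

```python
def db_item_count(keyspace_info, db="db0"):
    """Extract the key count for one logical database from the parsed
    output of INFO keyspace (redis-py returns it as a dict of dicts,
    e.g. {'db0': {'keys': 1234, 'expires': 10, ...}, 'db1': {...}})."""
    return keyspace_info.get(db, {}).get("keys", 0)

# Against the real cluster (assumption: redis-py installed, network access):
#   import redis
#   r = redis.Redis(host="my-cluster.xxxxxx.cache.amazonaws.com", port=6379)
#   count = db_item_count(r.info("keyspace"))
# ...then publish `count` as a custom CloudWatch metric on a schedule.
```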


Lock Redis key when two processes are accessing the same key

Application 1 sets a value in Redis.
We have two instances of application 2 running, and we would like only one instance to read this value from Redis (note that application 2 takes around 30 seconds to 1 minute to process the data).
Can instance 1 of application 2 acquire a lock on the Redis key created by application 1, so that instance 2 of application 2 will not read it and repeat the same operation?
No, there's no concept of a record lock in Redis. If you need to achieve some sort of locking, you have to use another set of data structures to mimic that behavior. For example:
List: you can use a list and then POP the item from the list, or...
Redis Stream: use a Redis Stream with a consumer group, so that each consumer in your group only sees a portion of the data that needs to be processed; it guarantees that once an item is delivered to one consumer, it is not delivered to another.
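A minimal sketch of the list-based option with redis-py, assuming application 1 pushes the value onto a list instead of setting a plain key (the `claim_task` helper and the `tasks` queue name are illustrative):

```python
def claim_task(client, queue="tasks"):
    """Atomically claim the next value. LPOP is atomic on the server, so
    if both instances of application 2 call this, only one of them will
    receive a given item; the other gets None."""
    return client.lpop(queue)

# Against a real server (assumption: redis-py installed):
#   import redis
#   r = redis.Redis()
#   r.rpush("tasks", "value-from-app-1")   # application 1 publishes
#   task = claim_task(r)                   # exactly one instance gets it
```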

Storing time intervals efficiently in redis

I am trying to track server uptimes using redis.
So the approach I have chosen is as follows:
server xyz will keep sending my service a ping indicating that it was alive and working during the last 30 seconds.
My service will store a list of all time intervals during which the server was active. This is done by storing a list of {startTime, endTime} entries in Redis, keyed by the name of the server (xyz).
Depending on a user query, I will use this list to generate server uptime metrics, e.g. % downtime between times (T1, T2).
Example:
assume that the time is T currently.
at T+30, server sends a ping.
xyz:["{start:T end:T+30}"]
at T+60, server sends another ping
xyz:["{start:T end:T+30}", "{start:T+30 end:T+60}"]
and so on for all pings.
This works fine, but over a large time period this list accumulates a lot of elements. To avoid that, on each ping I currently pop the last element of the list and check whether it can be merged with the latest time interval. If it can, I coalesce them and push a single time interval back onto the list; if not, two time intervals are pushed.
So with this my list becomes like this after step 2 : xyz:["{start:T end:T+60}"]
Some problems I see with this approach:
the merging is being done in my service, not in Redis.
if my service is distributed, the list ordering might get corrupted due to multiple readers and writers.
Is there a more efficient/elegant way to handle this, like handling the merging of time intervals in Redis itself?
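For reference, the coalescing step described above can be sketched in pure Python. To push it into Redis itself, the same logic could be wrapped in a Lua script executed with EVAL, which would make the pop-merge-push sequence atomic and sidestep the multi-writer race (the `record_ping` name and dict-based interval shape are illustrative assumptions):

```python
def record_ping(intervals, start, end):
    """Append {start, end} to the interval list, coalescing with the
    previous interval when the two are contiguous or overlapping."""
    if intervals and intervals[-1]["end"] >= start:
        # Extend the last interval instead of adding a new one.
        intervals[-1]["end"] = max(intervals[-1]["end"], end)
    else:
        intervals.append({"start": start, "end": end})
    return intervals
```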

How to interpret "evicted_keys" from Redis Info

We are using ElastiCache for Redis, and are confused by its Evictions metric.
I'm curious what the unit is on the evicted_keys metric from Redis INFO. The ElastiCache docs say it is a count: https://docs.aws.amazon.com/AmazonElastiCache/latest/red-ug/CacheMetrics.Redis.html but for our application we have observed that the Evictions metric (which is derived from evicted_keys) fluctuates up and down, indicating it's not a count. I would expect a count never to decrease, since we cannot "un-evict" a key. I'm wondering if evicted_keys is actually a rate (e.g. evictions/sec), which would explain why it can fluctuate.
Thank you in advance for any responses!
From INFO command:
evicted_keys: Number of evicted keys due to maxmemory limit
To learn more about evictions see Using Redis as an LRU cache - Eviction policies
This counter is zero when the server starts, and it is only reset if you issue the CONFIG RESETSTAT command. However, on ElastiCache, this command is not available.
ElastiCache, however, derives its metric from this value by calculating the difference between consecutive data points:
Redis evicted_keys:    0   5   12   18   22  ...
CloudWatch Evictions:  0   5    7    6    4  ...
This is the usual pattern for CloudWatch metrics. It allows you to use SUM if you want the cumulative value, but also to detect rate changes or spikes easily.
Suppose, for example, you want an alarm when evictions exceed 10,000 over a one-minute period. If ElastiCache stored the cumulative value from Redis directly as the metric, this would be hard to accomplish.
Also, by emitting the metric as evicted keys per period, you are protected from the distortion of a server reset or a value overflow. While the Redis INFO value would go back to zero, in ElastiCache you still get the value for the period and can still compute a running sum over any window.
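The derivation can be sketched as follows (assuming `per_period` mirrors the differencing ElastiCache applies between consecutive samples; the function name is illustrative):

```python
def per_period(cumulative):
    """Turn a cumulative counter series into per-period deltas.
    The first sample is kept as-is; each later datapoint is the
    difference from its predecessor."""
    if not cumulative:
        return []
    return [cumulative[0]] + [b - a for a, b in zip(cumulative, cumulative[1:])]
```

Note that summing the per-period series recovers the cumulative total, which is why SUM over a window still works.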

Publishing table count stats with cloudwatch put-metric-data

I've been tasked with monitoring a data integration task, and I'm trying to figure out the best way to do this using cloudwatch metrics.
The data integration task populates records in 3 database tables. What I'd like to do is publish custom metrics each day, with the number of rows that have been inserted for each table. If the row count for one or more tables is 0, then it means something has gone wrong with the integration scripts, so we need to send alerts.
My question is: how do I most logically structure the calls to put-metric-data?
I'm thinking of the data being structured something like this...
Namespace: Integrations/IntegrationProject1
Metric Name: RowCount
Metric Dimensions: "Table1", "Table2", "Table3"
Metric Values: 10, 100, 50
Does this make sense, or should it logically be structured in some other way? There is no inherent relationship between the tables, other than that they're all associated with a particular project. What I mean is, I don't want to be inferring some kind of meaningful progression from 10 -> 100 -> 50.
Is this something that can be done with a single call to cloudwatch put-metric-data, or would it need to be 3 separate calls?
Separate calls I think would look something like this...
aws cloudwatch put-metric-data --metric-name RowCount --namespace "Integrations/IntegrationProject1" --unit Count --value 10 --dimensions Table=Table1
aws cloudwatch put-metric-data --metric-name RowCount --namespace "Integrations/IntegrationProject1" --unit Count --value 100 --dimensions Table=Table2
aws cloudwatch put-metric-data --metric-name RowCount --namespace "Integrations/IntegrationProject1" --unit Count --value 50 --dimensions Table=Table3
This seems like it should work, but is there some more efficient way I can do this, and combine it into a single call?
Also is there a way I can qualify that the data has a resolution of only 24 hours?
Your structure looks fine to me. Consider having a dimension for your stage: beta|gamma|prod.
This seems like it should work, but is there some more efficient way I can do this, and combine it into a single call?
Not using the AWS CLI, but if you used any SDK e.g. Python Boto3, you can publish up to 20 metrics in a single PutMetricData call.
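A sketch of the batched call with Boto3 (assuming boto3 is installed and AWS credentials are configured; `build_metric_data` is an illustrative helper, not part of the AWS API):

```python
def build_metric_data(row_counts):
    """Build the MetricData payload for a single PutMetricData call,
    one entry per table, matching the dimension layout in the question."""
    return [
        {
            "MetricName": "RowCount",
            "Dimensions": [{"Name": "Table", "Value": table}],
            "Unit": "Count",
            "Value": float(count),
        }
        for table, count in row_counts.items()
    ]

# One API call replaces the three CLI invocations (CloudWatch accepts up
# to 20 metrics per request). Requires boto3 and credentials:
#
#   import boto3
#   boto3.client("cloudwatch").put_metric_data(
#       Namespace="Integrations/IntegrationProject1",
#       MetricData=build_metric_data({"Table1": 10, "Table2": 100, "Table3": 50}),
#   )
```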
Also is there a way I can qualify that the data has a resolution of only 24 hours?
No. CloudWatch will aggregate the data it receives on your behalf. If you want to see a daily datapoint, you can change the period to 1 day when graphing the metric on the CloudWatch Console.

How to create inhouse funnel analytics?

I want to create in-house funnel analysis infrastructure.
All the user activity feed information would be written to a database / DW of choice and then, when I dynamically define a funnel I want to be able to select the count of sessions for each stage in the funnel.
I can't find an example of creating such a thing anywhere. Some people say I should use Hadoop and MapReduce for this but I couldn't find any examples online.
Your MapReduce is pretty simple:
The mapper reads a session row from the log file and outputs (stage-id, 1).
Set the number of reducers equal to the number of stages.
The reducer sums the values for each stage, as in the WordCount example (which is the "Hello World" of Hadoop: https://hadoop.apache.org/docs/r1.2.1/mapred_tutorial.html#Example%3A+WordCount+v1.0).
You will have to set up a Hadoop cluster (or use Elastic Map Reduce on Amazon).
To define the funnel dynamically you can use Hadoop's DistributedCache feature. To see results you will have to wait for the MapReduce job to finish (at minimum dozens of seconds, or minutes in the case of Amazon's Elastic MapReduce; the time depends on the amount of data and the size of your cluster).
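The mapper/reducer pair above can be sketched in Python, Hadoop-streaming style (the tab-separated `session_id<TAB>stage_id` input format is an assumption about your log layout):

```python
from collections import defaultdict

def mapper(lines):
    """Emit (stage_id, 1) for each session row in the log."""
    for line in lines:
        session_id, stage_id = line.rstrip("\n").split("\t")
        yield stage_id, 1

def reducer(pairs):
    """Sum the 1s per stage, as in WordCount."""
    totals = defaultdict(int)
    for stage_id, count in pairs:
        totals[stage_id] += count
    return dict(totals)
```

Note this counts session rows per stage; deduplicating sessions within a stage would need the session_id carried through to the reducer as well.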
Another solution that may give you results faster is to use a database: select stage, count(distinct session_id) from mylogs group by stage;
If you have too much data to execute that query quickly (it does a full table scan; HDD transfer rate is about 50-150 MB/sec, so the math is simple), you can use a distributed analytic database that runs over HDFS (Hadoop's distributed file system).
In this case your options are (I list here open-source projects only):
Apache Hive (based on MapReduce of Hadoop, but if you convert your data to Hive's ORC format - you will get results much faster).
Cloudera's Impala - not based on MapReduce, can return your results in seconds. For fastest results convert your data to Parquet format.
Shark/Spark - in-memory distributed database.