Clear cache in Apache Ignite every n seconds

How do we empty the cache every n seconds (so that we can run queries on the data that has come in during the n-second window - batch window querying)? I could only find FIFO- and LRU-based eviction policies in the Ignite code, where eviction is triggered by a cache entry being added or modified.
I understand we can have a sliding window using CreatedExpiryPolicy
cfg.setExpiryPolicyFactory(FactoryBuilder.factoryOf(new CreatedExpiryPolicy(new Duration(SECONDS, 5))));
But I don't think this will help me maintain batch windows. Neither will FifoEvictionPolicy nor LruEvictionPolicy.
I need an eviction policy based on a fixed time window (every 5 seconds, for example).
Will I have to write my own implementation?

Well, it's possible to set an ExpiryPolicy for each added entry with
IgniteCache.withExpiryPolicy
and recalculate the remaining time on every put, but the overhead would be too high: each entry would carry its own ExpiryPolicy. I would instead recommend scheduling a job that clears the cache on a fixed (cron-style) schedule:
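For example, here is a minimal sketch (the cache name batchCache and the 5-second window are placeholders, and Ignition.start() stands in for however you obtain your Ignite instance). Ignite's own cron-based IgniteScheduler (ignite.scheduler().scheduleLocal(...), in the ignite-schedule module) works the same way but only offers minute-level granularity, so a plain JDK scheduler is used here for the 5-second window:

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;

public class BatchWindowClearer {
    public static void main(String[] args) {
        Ignite ignite = Ignition.start();
        IgniteCache<Long, String> cache = ignite.getOrCreateCache("batchCache");

        // Run this on a single node: IgniteCache.clear() removes entries
        // cluster-wide, so one scheduler is enough for the whole cluster.
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            // Run the batch-window queries on the closing window here,
            // then reset the window for the next batch.
            cache.clear();
        }, 5, 5, TimeUnit.SECONDS);
    }
}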

Related

Hangfire job creation throughput performance with Redis RDB

In the official documentation there is a chart which indicates that job creation throughput with Redis RDB can be around 6,000 jobs per second. I have tried different Hangfire, Redis, and hardware configurations, but I always max out at around 200 jobs per second. I even created a simple example that reproduces it (Hangfire configuration, job creation).
Am I doing something wrong? What job creation throughput performance are you getting?
I am using latest versions: Hangfire 1.7.24, Hangfire.Pro 2.3.0, Hangfire.Pro.Redis 2.8.10 and Redis 6.2.1.
The point is that in the referenced sample application, background jobs are created sequentially, one after another. In this case background jobs aren't created fast enough to reach higher throughput, because every creation pays the I/O delay of a round-trip to the storage. And since there's also a call to Hangfire.Console, which requires even more I/O, creation runs slower still.
Try creating background jobs in a Parallel.For loop to parallelize creation and amortize the latency. And create all the background jobs before starting the server, as shown below, to keep the created/sec and performed/sec metrics clearly separated; otherwise everything will be mixed up.
var sw = Stopwatch.StartNew();

// Enqueue jobs in parallel so the storage round-trip latency is amortized
// across many concurrent creations. Empty() is a no-op job method from the sample.
Parallel.For(0, 100000, new ParallelOptions { MaxDegreeOfParallelism = Environment.ProcessorCount }, i =>
{
    BackgroundJob.Enqueue(() => Empty());
});

Console.WriteLine(sw.Elapsed);

// Start the server only after all jobs are created, so that created/sec
// and performed/sec are measured separately.
using (new BackgroundJobServer())
{
    Console.ReadLine();
}
On my development machine it took 7.7 sec to create 100,000 background jobs (~13,000 jobs/sec), and the Dashboard UI reported a perform rate of ~3,500 jobs/sec. That's a bit lower than the chart displayed, but only because Hangfire ships with more extension filters now than six years ago when that chart was created; if we clear them with GlobalJobFilters.Filters.Clear(), we get about 4,000 jobs/sec.
To avoid the confusion, I've removed the absolute numbers from those charts today. Absolute numbers differ between environments, e.g. on-premise (can be faster) and cloud (will be slower). That chart was created to show the relative difference between SQL Server and Redis in different modes, which is approximately the same in any environment, not to show precise numbers, which depend on a lot of factors, especially when a network is involved.

ADF Dataflows; Do I have any control or influence over cluster startup time. (NOT "TTL")

Yes, I know about TTL; Yes, I'm configuring that; No, that's not what I'm asking about here.
Spinning up an initial cluster for a Dataflow takes around 5 minutes.
Acquiring compute from an existing "warm" cluster (i.e. one which has been kept alive using TTL) for a new dataflow still appears to take 1-2 minutes.
Those are pretty large numbers, especially if you have a multi-step ETL process and have broken up your pipeline to separate concerns (or if you're executing the dataflows in a loop, to process data per-source-day).
Controlling the TTL gives me some control over which of those two possibilities I'm triggering, but even 2 minutes can be quite a substantial overhead. (I have a pipeline where fully half the execution time is waiting for those 1-2 minute 'Acquire Compute' startups.)
Do I have any control at all over how long startup takes in each case? Is there anything I can do to speed up the startup, or anything I should avoid so as not to make things even worse?
There's a new feature in town that fixes exactly this problem.
Release blog:
https://techcommunity.microsoft.com/t5/azure-data-factory/how-to-startup-your-data-flows-execution-in-less-than-5-seconds/ba-p/2267365
ADF has added a new option in the Azure Integration Runtime for data flow TTL: Quick re-use. ... By selecting the re-use option with a TTL setting, you can direct ADF to maintain the Spark cluster for that period of time after your last data flow executes in a pipeline. This will provide much faster sequential executions using that same Azure IR in your data flow activities.

Redis - decrease TTL by single command

Let's say I have a Redis record with TTL = 1 hour.
Then some event occurs and I want to reset the TTL of this item to min(current TTL, 5 min), i.e. decrease the TTL to 5 minutes if it is not already lower.
The underlying use case: the cache can be invalidated too frequently, and the "old" cache entry is almost as good as a fresh one, as long as it is no older than 5 minutes from the first change.
I know I can fetch the TTL with one command and update it with a second, but I would prefer to set it with a single command, for various reasons. Is there a way?
Edit: There will be many keys I need to decrease with a single command, so I would like to avoid round-trips between Redis and the client library for each record.
There is no single command to do that, but you can wrap the logic in a server-side Lua script and invoke that with a single command. Refer to the EVAL command for more information.
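For example, here is a minimal sketch using the Jedis client (the key names, host, and the 300-second cap are placeholders). The script leaves keys without an expiry (TTL returns -1) and missing keys (-2) untouched, and handles any number of keys in one round-trip:

import java.util.Arrays;
import java.util.Collections;
import redis.clients.jedis.Jedis;

public class CapTtl {
    // For each key: if its TTL exceeds ARGV[1] seconds, cap it with EXPIRE.
    // Returns the number of keys that were capped.
    private static final String CAP_TTL_LUA =
        "local capped = 0\n" +
        "local cap = tonumber(ARGV[1])\n" +
        "for _, key in ipairs(KEYS) do\n" +
        "  if redis.call('TTL', key) > cap then\n" +
        "    redis.call('EXPIRE', key, cap)\n" +
        "    capped = capped + 1\n" +
        "  end\n" +
        "end\n" +
        "return capped";

    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            // Single round-trip for all keys; cap their TTLs at 300 s (5 min).
            Object capped = jedis.eval(CAP_TTL_LUA,
                Arrays.asList("key1", "key2", "key3"),
                Collections.singletonList("300"));
            System.out.println("Keys capped: " + capped);
        }
    }
}

If you call this often, loading the script once with SCRIPT LOAD and invoking it by hash via EVALSHA avoids resending the script body each time.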

Idle Queue utilization in Capacity Scheduler - EMR

I configured the Capacity Scheduler and schedule jobs into specific queues. However, there are times when jobs in some queues complete faster, while other queues have jobs waiting on the previous ones to complete. This creates a scenario where half of my capacity is idle and the other half is busy with jobs waiting to get resources.
Is there any config I can tweak to maximize my utilization? I want to route waiting jobs to other queues where resources are available. Attached is a screenshot.
This seems to be an issue with the Capacity Scheduler. I switched to the Fair Scheduler and definitely see huge improvements in cluster utilization, ~75%, way better than the ~40% I was getting with the Capacity Scheduler.
The reason is that when multiple users submit jobs to the same queue, the queue as a whole can consume up to its maximum resources, but a single user can't consume more than the queue's configured capacity, even when the maximum capacity is greater than that.
So if you specify yarn.scheduler.capacity.root.QUEUE-1.capacity: 20 in capacity-scheduler.xml, one user can't take more than 20% of the resources of the QUEUE-1 queue, even if your cluster has free resources.
By default this user-limit-factor is set to 1. If you set it to 2, a single user's jobs can use 40% of the resources, provided the queue's maximum capacity is greater than or equal to 40:
yarn.scheduler.capacity.root.QUEUE-1.user-limit-factor: 2
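For reference, a sketch of how this might look in capacity-scheduler.xml (the queue name QUEUE-1 and the 20/40/2 values are just the example numbers from above):

<property>
  <name>yarn.scheduler.capacity.root.QUEUE-1.capacity</name>
  <value>20</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.QUEUE-1.maximum-capacity</name>
  <value>40</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.QUEUE-1.user-limit-factor</name>
  <!-- A single user may now use capacity * factor = 20% * 2 = 40% of the cluster. -->
  <value>2</value>
</property>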
Please follow this blog

Hitting redis server with redis hash using JMeter (using redis-dataset plugin)

I have a Redis server running and I wanted to use JMeter to get benchmarks and to find out in how much time it hits 20K transactions per second. I have a hash set up. How should I go about querying it? I have put one of the keys as the Redis key and one of the fields of the hash as the variable name.
If I use the Constant Throughput Timer, what should I enter in the name field?
Thanks in advance.
If you're planning to use the Constant Throughput Timer and your target is a load of 20k requests per second, you need to configure it as follows:
Target Throughput: 1200000 (20k per second * 60 seconds in minute)
Calculate Throughput based on: all active threads
See How to use JMeter's Throughput Constant Timer article for more details.
A few more recommendations:
The Constant Throughput Timer can only pause threads, so make sure you have enough virtual users at the Thread Group level.
The Constant Throughput Timer is only accurate at the "minute" level, so make sure your test lasts long enough for the timer to be applied correctly. Also consider a reasonable ramp-up period.
Some people find the Throughput Shaping Timer easier to use.
20k+ concurrent threads is normally more than you can achieve with a single machine, so it is likely you'll need to consider Distributed Testing, where multiple JMeter instances act as a cluster.