Create alarm template with CloudWatch

I'm still discovering CloudWatch alarms. I have a metric M1, which has a dimension "Namespace".
I have, let's say, 10 different namespaces.
I wish to create a different alarm per namespace, but the only way I can find is to create 10 different alarms, one for each namespace (sketched below).
Is it possible to create an alarm template with CloudWatch? Ideally, I would create one alarm grouped by namespace, meaning that if the metric goes above the configured threshold in 3 different namespaces, I get 3 alerts instead of just 1.
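For reference, a minimal sketch of the "one alarm per namespace" workaround described above, using the AWS SDK for Java v2. The metric's CloudWatch namespace ("MyApp"), the dimension values, and the threshold are placeholders, not taken from the question:

    import java.util.List;
    import software.amazon.awssdk.services.cloudwatch.CloudWatchClient;
    import software.amazon.awssdk.services.cloudwatch.model.ComparisonOperator;
    import software.amazon.awssdk.services.cloudwatch.model.Dimension;
    import software.amazon.awssdk.services.cloudwatch.model.PutMetricAlarmRequest;
    import software.amazon.awssdk.services.cloudwatch.model.Statistic;

    public class AlarmPerNamespace {
        public static void main(String[] args) {
            CloudWatchClient cw = CloudWatchClient.create();
            // Placeholder values for the 10 "Namespace" dimension values.
            List<String> namespaces = List.of("ns-1", "ns-2", "ns-3");
            for (String ns : namespaces) {
                // One alarm per namespace value, all sharing the same settings.
                cw.putMetricAlarm(PutMetricAlarmRequest.builder()
                    .alarmName("M1-high-" + ns)
                    .namespace("MyApp")   // placeholder: the CloudWatch namespace of M1
                    .metricName("M1")
                    .dimensions(Dimension.builder().name("Namespace").value(ns).build())
                    .statistic(Statistic.AVERAGE)
                    .period(300)
                    .evaluationPeriods(1)
                    .threshold(100.0)     // placeholder threshold
                    .comparisonOperator(ComparisonOperator.GREATER_THAN_THRESHOLD)
                    .build());
            }
        }
    }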

Related

AWS MemoryDB minimum number of nodes

I'm trying to use AWS MemoryDB for an application that has high availability requirements but only a small amount of data to store.
I want to minimize costs, and was going to go with a cluster that has 1 shard, and 2 nodes in separate availability zones.
However, I'm seeing this warning within the web console:
"Warning: To architect for high availability, we recommend that you retain at least 3 nodes per shard (1 primary and 2 replicas)."
I can't find any explanation for why 3 nodes would be necessary instead of 2. Does anyone know the reason for that recommendation? And does it hold with a small dataset within 1 shard?

Are BigQuery reservations automatic?

Very basic question. If I purchase flex slots on BigQuery for a specific project ID, without (1) creating a reservation manually, and (2) assigning those slots, are my queries related to this project automatically going to be billed against flex slots?
I assume so: the (unclear) documentation suggests that a 'default' reservation is created when you purchase slots. Therefore, I imagine BigQuery recognizes that the user's intention, unless otherwise specified, is to use the purchased capacity.
It would be a double whammy, though, if I were charged on-demand pricing while my slots sat idle. Also, now that I have reserved 100 slots, my queries feel slower, but I can't see a way to confirm that the jobs actually used the reservation.
Reservations
After you purchase slots, you can assign them to different buckets, called reservations. Reservations let you allocate the slots in ways that make sense for your particular organization.
A reservation named default is automatically created when you purchase slots.
There is nothing special about the default reservation — it's created as a convenience. You can decide whether you need additional reservations or just use the default reservation.
For example, you might create a reservation named prod for production workloads, and a separate reservation named test for testing. That way, your test jobs won't compete for resources that your production workloads need. Or, you might create reservations for different departments in your organization.
Assignments
To use the slots that you purchase, you will assign projects, folders, or organizations to reservations. Each level in the resource hierarchy inherits the assignment from the level above it, unless you override. In other words, a project inherits the assignment of its parent folder, and a folder inherits the assignment of its organization.
When a job is started from a project that is assigned to a reservation, the job uses that reservation's slots.
If a project is not assigned to a reservation (either directly or by inheriting from its parent folder or organization), the jobs in that project use on-demand pricing.
None assignments represent an absence of an assignment. Projects assigned to None use on-demand pricing. The common use case for None assignments is to assign an organization to the reservation and to opt-out some projects or folders from that reservation by assigning them to None. For more information, see Assign a project to None.
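To make that flow concrete, here is a hedged sketch using the google-cloud-bigqueryreservation Java client (assuming that library is on the classpath); the admin project, location, reservation name, slot count, and assignee project are all placeholders:

    import com.google.cloud.bigquery.reservation.v1.Assignment;
    import com.google.cloud.bigquery.reservation.v1.LocationName;
    import com.google.cloud.bigquery.reservation.v1.Reservation;
    import com.google.cloud.bigquery.reservation.v1.ReservationName;
    import com.google.cloud.bigquery.reservation.v1.ReservationServiceClient;

    public class ReserveAndAssign {
        public static void main(String[] args) throws Exception {
            try (ReservationServiceClient client = ReservationServiceClient.create()) {
                LocationName parent = LocationName.of("my-admin-project", "US");

                // Create a reservation named "prod" with 100 slots.
                client.createReservation(
                    parent,
                    Reservation.newBuilder().setSlotCapacity(100).build(),
                    "prod");

                // Route a project's query jobs to that reservation.
                client.createAssignment(
                    ReservationName.of("my-admin-project", "US", "prod"),
                    Assignment.newBuilder()
                        .setAssignee("projects/my-workload-project")
                        .setJobType(Assignment.JobType.QUERY)
                        .build());
            }
        }
    }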
Creating assignments
When you create an assignment, you specify the job type for that assignment:
QUERY: Use this reservation for query jobs, including SQL, DDL, DML, and BigQuery ML queries.
PIPELINE: Use this reservation for load, export, and other pipeline jobs.
By default, load and export jobs are free and use a shared pool of slots. BigQuery does not make guarantees about the available capacity of this shared pool. If you are loading large amounts of data, your job may wait as slots become available. In that case, you might want to purchase dedicated slots and assign pipeline jobs to them. We recommend creating an additional dedicated reservation with idle slot sharing disabled.
When load jobs are assigned to a reservation, they lose access to the free pool. Monitor performance to make sure the jobs have enough capacity. Otherwise, performance could actually be worse than using the free pool.
ML_EXTERNAL: Use this reservation for BigQuery ML queries that use services that are external to BigQuery.
Certain BigQuery ML queries use services that are external to BigQuery. To use reserved slots with these external services, create an assignment with job type ML_EXTERNAL.
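Continuing the same hedged sketch, a dedicated pipeline reservation with idle slot sharing disabled, as the recommendation above describes, might look roughly like this (reusing the client and parent from the previous sketch; capacity and names remain placeholders):

    // Reuses the client and parent from the previous sketch.
    static void createPipelineReservation(ReservationServiceClient client, LocationName parent) {
        // Dedicated reservation for load/export jobs; ignoreIdleSlots keeps it
        // from borrowing idle capacity from other reservations.
        client.createReservation(
            parent,
            Reservation.newBuilder()
                .setSlotCapacity(100)   // placeholder capacity
                .setIgnoreIdleSlots(true)
                .build(),
            "pipeline");

        // Send this project's load/export (pipeline) jobs to the dedicated reservation.
        client.createAssignment(
            ReservationName.of("my-admin-project", "US", "pipeline"),
            Assignment.newBuilder()
                .setAssignee("projects/my-workload-project")
                .setJobType(Assignment.JobType.PIPELINE)
                .build());
    }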
Screenshots
A full-screen guide on how to work with Reservations and Assignments is available here.

How do I run a data-dependent function on a partitioned region in a member group?

My team uses Geode as a makeshift analytics engine. We store a collection of massive raw data objects (200MB+ each) in Geode, but these objects are never directly returned to the client. Instead, we rely heavily on custom function execution to process these data sets inside Geode, and only return the analysis result set.
We have a new requirement to implement two tiers of data analytics precision. The high-precision analytics will require larger raw data sets and more CPU time. It is imperative that these high-precision analyses do not inhibit the low-precision analytics performance in any way. As such, I'm looking for a solution that keeps these data sets isolated to different servers.
I built a POC that keeps each data set in its own region (both are PARTITIONED). These regions are configured to belong to separate Member Groups, then each server is configured to join one of the two groups. I'm able to stand up this cluster locally without issue, and gfsh indicates that everything looks correct: describe member shows each member hosting the expected regions.
My client code configures a ClientCache that points at the cluster's single locator. My function execution command generally looks like the following:
    // Run a data-dependent function on the partitioned region, routing
    // execution to the members that host the keys in keySet.
    FunctionService
        .onRegion(highPrecisionRegion)
        .setArguments(inputObject)
        .filter(keySet)
        .execute(function);
When I only run the high-precision server, I'm able to execute the function against the high-precision region. When I only run the low-precision server, I'm able to execute the function against the low-precision region. However, when I run both servers and execute the functions one after the other, I invariably get an exception stating that one of the regions cannot be found. See the following Gist for a sample of my code and the exception.
https://gist.github.com/dLoewy/c9f695d67f77ec18a7e60a25c4e62b01
TLDR key points:
Using member groups, Region A is on Server 1 and Region B is on Server 2.
These regions must be PARTITIONED in Production.
I need to run a data-dependent function on one of these regions; the client code chooses which.
As-is, my client code always fails to find one of the regions.
Can someone please help me get on track? Is there an entirely different cluster architecture I should be considering? Happy to provide more detail upon request.
Thanks so much for your time!
David
FYI, the following docs pages mention function execution on Member Groups, but give very little detail. The first link describes running data-independent functions on member groups, but doesn't say how, and doesn't say anything about running data-dependent functions on member groups.
https://gemfire.docs.pivotal.io/99/geode/developing/function_exec/how_function_execution_works.html
https://gemfire.docs.pivotal.io/99/geode/developing/function_exec/function_execution.html
Have you tried creating two different pools on the client, each one targeting a specific server group, and executing the function as usual with onRegion? I believe that should do the trick; a sketch of that setup follows below. For further details, please have a look at Organizing Servers Into Logical Member Groups.
Hope this helps. Cheers.
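For illustration, a minimal sketch of the two-pool setup suggested above, using the Geode Java client API; the locator address, group names, pool names, region name, and function ID are placeholders:

    import org.apache.geode.cache.Region;
    import org.apache.geode.cache.client.ClientCache;
    import org.apache.geode.cache.client.ClientCacheFactory;
    import org.apache.geode.cache.client.ClientRegionShortcut;
    import org.apache.geode.cache.client.PoolManager;
    import org.apache.geode.cache.execute.FunctionService;

    public class TwoPoolClient {
        public static void main(String[] args) {
            ClientCache cache = new ClientCacheFactory().create();

            // One pool per server group, both pointing at the same locator.
            PoolManager.createFactory()
                .addLocator("localhost", 10334)          // placeholder locator
                .setServerGroup("high-precision-group")  // placeholder group name
                .create("highPrecisionPool");
            PoolManager.createFactory()
                .addLocator("localhost", 10334)
                .setServerGroup("low-precision-group")
                .create("lowPrecisionPool");

            // Bind the client region to the pool that targets its server group.
            Region<Object, Object> highPrecisionRegion = cache
                .createClientRegionFactory(ClientRegionShortcut.PROXY)
                .setPoolName("highPrecisionPool")
                .create("highPrecisionRegion");

            // onRegion now routes through highPrecisionPool, i.e. only to the
            // servers that joined high-precision-group.
            FunctionService.onRegion(highPrecisionRegion)
                .execute("myAnalyticsFunction");         // placeholder function ID
        }
    }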
As the region data is not replicated across servers, it looks like you need to target the onMembers or onServers methods as well as onRegion.

Can I "pin" a Geode/Gemfire region to a specific subset of servers?

We make heavy use of Geode function execution to operate on data that lives inside Geode. I want to configure my cluster so functions we execute against a certain subset of data are always serviced by a specific set of servers.
In my mind, my ideal configuration looks like this: (partitioned) Region A is only ever serviced by servers 1 and 2, while (partitioned) Region B is only ever serviced by servers 3, 4, and 5.
The functions that we execute against the two regions have very different CPU/network requirements; we want to isolate the performance impacts of one region from the other, and ideally be able to tune the hardware for each server accordingly.
Assuming, operationally, that you're using gfsh to manage your cluster, you could use groups to logically segregate it by assigning each server to a relevant group. Creating a region then simply requires you to also indicate which group the region should be created in. Functions should already be constrained to execute against a given region with FunctionService.onRegion() calls.
Note: If you're perusing the FunctionService API, don't be tempted to use the onMember(group) methods, as those unfortunately only work for peer-to-peer (server-to-server) calls. I'm assuming here that you're doing typical client-server calls. Of course, if you are doing p2p function calls, then those methods would be totally relevant; a sketch of that usage follows below.
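For completeness, a hedged sketch of the peer-to-peer variant mentioned in that note. This assumes the code runs inside a server member (not a ClientCache), with a placeholder group name and a function registered under a placeholder ID:

    import org.apache.geode.cache.execute.Execution;
    import org.apache.geode.cache.execute.FunctionService;
    import org.apache.geode.cache.execute.ResultCollector;

    public class PeerSideExecution {
        public static Object runOnGroup(Object input) {
            // Valid only in peer-to-peer (server-side) code, not from a client.
            Execution execution = FunctionService.onMembers("groupA"); // placeholder group
            ResultCollector<?, ?> collector = execution
                .setArguments(input)
                .execute("analyticsFunction"); // placeholder registered function ID
            return collector.getResult();
        }
    }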
You can split your servers into different groups and then create the regions into these specific groups, allowing you to correctly "route" the function executions. You can get more details about this feature in Organizing Peers into Logical Member Groups.
Hope this helps. Cheers.

How to automate CloudWatch Dashboards?

I have several dashboards in CloudWatch that represent a view of my infrastructure: the number of instances from an autoscaling group that are currently running, the CPU/disk usage per instance, etc. However, when I update an autoscaling group, I have to manually update the dashboards (the autoscaling-group ID) to include its EC2 instances in the display. I'm looking for some kind of metric/dimension that can filter autoscaling groups by tag. Is that possible? If yes, how? If not, how can I approach it differently?
Thanks.
You can build a Lambda function to do this job.
Configure the Lambda function to trigger every few minutes, check the autoscaling group for the addition/removal of instances, and update the CloudWatch dashboard accordingly:
1. Create a Lambda function that checks the instances present in the ASG and updates the CW dashboard based on those instances. A sketch of the core logic is shown below.
2. Create a CloudWatch rule to trigger the Lambda function every 5/10 minutes.
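As a hedged sketch of that function's core logic, using the AWS SDK for Java v2. The ASG name, dashboard name, and region are placeholders, and a real Lambda would wrap this in a handler class:

    import java.util.List;
    import java.util.stream.Collectors;
    import software.amazon.awssdk.services.autoscaling.AutoScalingClient;
    import software.amazon.awssdk.services.autoscaling.model.DescribeAutoScalingGroupsRequest;
    import software.amazon.awssdk.services.autoscaling.model.Instance;
    import software.amazon.awssdk.services.cloudwatch.CloudWatchClient;
    import software.amazon.awssdk.services.cloudwatch.model.PutDashboardRequest;

    public class DashboardUpdater {
        public static void main(String[] args) {
            AutoScalingClient asg = AutoScalingClient.create();
            CloudWatchClient cw = CloudWatchClient.create();

            // 1. List the instances currently in the autoscaling group.
            List<String> instanceIds = asg.describeAutoScalingGroups(
                    DescribeAutoScalingGroupsRequest.builder()
                        .autoScalingGroupNames("my-asg") // placeholder ASG name
                        .build())
                .autoScalingGroups().get(0)
                .instances().stream()
                .map(Instance::instanceId)
                .collect(Collectors.toList());

            // 2. Rebuild the dashboard body with one CPU metric line per instance.
            String metrics = instanceIds.stream()
                .map(id -> "[\"AWS/EC2\",\"CPUUtilization\",\"InstanceId\",\"" + id + "\"]")
                .collect(Collectors.joining(","));
            String body = "{\"widgets\":[{\"type\":\"metric\",\"x\":0,\"y\":0,"
                + "\"width\":12,\"height\":6,\"properties\":{\"metrics\":[" + metrics + "],"
                + "\"period\":300,\"region\":\"us-east-1\",\"title\":\"ASG CPU\"}}]}";

            // 3. PutDashboard replaces the whole dashboard body in one call.
            cw.putDashboard(PutDashboardRequest.builder()
                .dashboardName("my-infra-dashboard") // placeholder dashboard name
                .dashboardBody(body)
                .build());
        }
    }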