CloudWatch stats count if greater than zero - amazon-cloudwatch

In CloudWatch Logs Insights, we have a query which totals some transactions based on the logs. We'd like to add one more count: the number of transactions where a given field has a value above zero or is not null.
fields @timestamp, @message
| filter @message like /ingest success/
| fields concat(data.transaction.source.BusinessName, '-', toupper(data.transaction.orderType)) as clientOrderMode
| stats count(), sum(data.transaction.order.paymentAmount),sum(data.transaction.order.serviceCharge),sum(data.transaction.order.gratuity),
count(if(data.transaction.order.gratuity>0)),sum(data.transaction.guest.emailMarketingOptIn) by clientOrderMode
| sort data.transaction.source.OBBusinessName asc
The above clearly doesn't work, but hopefully you can see what I'm trying to achieve - the number of orders where gratuity is greater than zero.
Any advice, gratefully received.
Thanks
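One approach that should work (a sketch, not from the original thread, borrowing the "presence field" trick from the empty-bin answer further down): assuming a comparison such as gratuity > 0 evaluates to 1 or 0 the same way strcontains does, you can compute a flag in a fields command and sum it in the stats line. The hasGratuity and ordersWithGratuity names are placeholders.
fields @timestamp, @message
| filter @message like /ingest success/
| fields concat(data.transaction.source.BusinessName, '-', toupper(data.transaction.orderType)) as clientOrderMode,
    data.transaction.order.gratuity > 0 as hasGratuity
# summing the 0/1 flag gives the number of orders where gratuity is greater than zero
| stats count(), sum(data.transaction.order.paymentAmount), sum(data.transaction.order.serviceCharge), sum(data.transaction.order.gratuity),
    sum(hasGratuity) as ordersWithGratuity, sum(data.transaction.guest.emailMarketingOptIn) by clientOrderMode
| sort clientOrderMode asc
After the stats, only the aggregates and the grouping field are left, hence the sort by clientOrderMode at the end.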

Related

Parse/Ignore specific string in CloudWatch Logs Insights

I have the following AWS Cloudwatch query:
fields @timestamp, @message
| filter @message like /(?i)(error|except)/
| filter !ispresent(level) and !ispresent(eventType)
| stats count(*) as ErrorCount by @message
| sort ErrorCount desc
Results end up looking something like this with the message and a count:
The first 4 results are actually the same error. However, since they have different (node:*) values at the beginning of the message, it ends up grouping them as different errors.
Is there a way for the query to parse/ignore the (node:*) part so that the first 4 results in the image would be considered just one result with a total count of 2,997?
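One way to do this (a sketch, not from the original thread; the regex and the messageBody/groupKey names are placeholders) is to parse the remainder of the message into its own field and group by that instead of @message:
fields @timestamp, @message
| filter @message like /(?i)(error|except)/
| filter !ispresent(level) and !ispresent(eventType)
# capture everything after the "(node:<pid>)" prefix into messageBody
| parse @message /\(node:\d+\)\s+(?<messageBody>.*)/
# messages without the prefix don't get messageBody, so fall back to the raw message
| fields coalesce(messageBody, @message) as groupKey
| stats count(*) as ErrorCount by groupKey
| sort ErrorCount desc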

Display empty bin as a zero value in AWS Log Insights graph

With this count query by bin:
filter @message like / error /
| stats count() as exceptionCount by bin(30m)
I get a discontinuous graph, which is hard to grasp:
Is it possible for AWS CloudWatch Logs Insights to consider the empty bin as a zero count, to get a continuous graph?
Found your question looking for my own answer to this.
The best that I came up with is to calculate a 'presence' field and then use sum to get 0's in the time bins.
I used strcontains, which returns a 1 when matched or 0 when not. https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/CWL_QuerySyntax.html#CWL_QuerySyntax-operations-functions
Mine looks like this:
fields @timestamp, @message
| fields strcontains(@message, 'Exit status 1') as is_exit_message
| stats sum(is_exit_message) as is_exit_message_count by bin(15m) as time_of_crash
| sort time_of_crash desc
So, yours would be:
fields strcontains(@message, 'error') as is_error
| stats sum(is_error) as exceptionCount by bin(30m)
Use strcontains + sum, or parse + count. The point is to not use filter: you should query all of the logs so every time bin still contains rows.
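For the parse + count variant, a sketch against the same example (errorMarker is a placeholder name): keep every log line, pull the error text into a field with parse, and use count(field), which only counts the rows where that field exists, so bins that have traffic but no errors come back as 0.
fields @timestamp, @message
# no filter: every log line stays in the result set, so the 30-minute bins are still populated
| parse @message /(?<errorMarker> error )/
| stats count(errorMarker) as exceptionCount by bin(30m)
As with the strcontains version, a bin with no log events at all still won't appear.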

How to aggregate logs by field and then by bin in AWS CloudWatch Insights?

I'm trying to write a query that will first aggregate a count by field and then by bin(1h). For example, I would like to get a result like:
# Date Field Count
1 2019-01-01T10:00:00.000Z A 123
2 2019-01-01T11:00:00.000Z A 456
3 2019-01-01T10:00:00.000Z B 567
4 2019-01-01T11:00:00.000Z B 789
I'm not sure if it's possible, though. The query should be something like:
fields Field
| stats count() by Field by bin(1h)
Any ideas how to achieve this?
Is this what you need?
fields Field | stats count() by Field, bin(1h)
If you want to create a line chart, you can do it by separately counting each value that your field could take.
fields
Field = 'A' as is_A,
Field = 'B' as is_B
| stats sum(is_A) as A, sum(is_B) as B by bin(1hour)
This solution requires your query to include a string literal of each value ('A' and 'B' in OP's example). It works as long as you know what those possible values are.
This might be what Hugo Mallet was looking for, except the avg() function won't work here, so he'd have to calculate the average by dividing by a total.
I'm not able to group by a certain field and create visualizations.
fields Field
| stats count() by Field, bin(1h)
I keep getting this message:
No visualization available. Try this to get started:
stats count() by bin(30s)

Sorting problem regarding Last Modified Date in Splunk query

I have a problem regarding sorting in SPLUNK.
I want to make automated reports that show, on a calendar, the number of tickets per day.
A ticket has these time stamps:
ACTUAL_END_DATE="2018-10-29 01:00:00.0",
ACTUAL_START_DATE="2018-10-29 00:00:00.0",
CLOSED_DATE="2019-06-16 12:56:00.0",
COMPLETED_DATE="2019-06-06 10:47:46.0",
EARLIEST_START_DATE="2018-10-23 11:20:42.0",
LAST_MODIFIED_DATE="2019-06-16 12:56:07.0",
RFA_DATE="2018-10-23 11:20:42.0",
RFC_DATE="2018-10-22 15:19:00.0",
SFA_DATE="2019-06-06 10:47:02.0",
SFR_DATE="2019-06-06 10:46:52.0",
SCHEDULED_DATE="2019-06-06 10:47:06.0",
SCHEDULED_END_DATE="2018-10-29 01:00:00.0",
SCHEDULED_START_DATE="2018-10-29 00:00:00.0",
SUBMIT_DATE="2018-10-22 15:18:53.0",
I filter the search by two time tokens: the earliest is "@mon" and the latest is "now".
Unfortunately, it sorts by LAST_MODIFIED_DATE and I get 62 tickets in one day, all of which have ACTUAL_START_DATE in different months, since you can change a ticket after it has closed to add details.
This is my query:
stats latest(STATUS_REASON) as STATUS_REASON latest(CHANGE_REQUEST_STATUS) as CHANGE_REQUEST_STATUS latest(_time) as _time latest(CHANGE_TIMING) as CHANGE_TIMING by INFRASTRUCTURE_CHANGE_ID
| where CHANGE_REQUEST_STATUS !="Cancelled"
| timechart count span=1D
How can I sort them and get rid of the count from LAST_MODIFIED_DATE and have them shown by ACTUAL_START_DATE?
The timechart command is ordering by _time, not by LAST_MODIFIED_DATE (although the two fields may have the same values). To use a different field, assign that field's value to _time.
stats latest(STATUS_REASON) as STATUS_REASON latest(CHANGE_REQUEST_STATUS) as CHANGE_REQUEST_STATUS latest(_time) as _time latest(CHANGE_TIMING) as CHANGE_TIMING by INFRASTRUCTURE_CHANGE_ID
| where CHANGE_REQUEST_STATUS !="Cancelled"
| eval _time = strptime(ACTUAL_START_DATE, "%Y-%m-%d %H:%M:%S.%N")
| timechart count span=1D

Splunk index usage search adding column titled NULL to results

I'm running a fairly simple search to identify index usage on my Splunk install by source, as we're running through the Enterprise 30-day trial with the intention of using Splunk Free after it expires:
index=_internal source=*license_usage.log | eval MB=b/1024/1024 | timechart span=1d sum(MB) by s where count in top50
The results for all of my data sources are returned as expected but there's an additional column titled "NULL" at the end of the results:
(screenshot: Splunk index search NULL column)
All of my data has an input source and when I click on the column and choose to view the data, it brings back no results.
Can anyone help me understand what this NULL column is please? If it's correct it suggests I'm using over the 500MB/day limit for Splunk Free, which I need to address before the trial period ends.
The NULL column appears because some events do not have an 's' field. You only want to sum the events that have an 's' field, so modify your query to:
index=_internal source=*license_usage.log type=Usage
| eval MB=b/1024/1024
| timechart span=1d sum(MB) by s where count in top50