Extract numeric value from string in CloudWatch to use in metrics (e.g. "64MB") - amazon-cloudwatch

Is it possible to create a metric that extracts a numeric value from a string in Cloudwatch logs so I can graph / alarm it?
For example, the log may be:
20190827 1234 class: File size: 64MB
I realize I can capture the space delimited fields by using a filter pattern like: [date, time, class, word1, word2, file_size]
And file_size will be "64MB", but how do I convert that to a numeric 64 to be graphed?
Bonus question, is there any way of matching "File size:" as one field instead of creating a field for each space delimited word?

Use abs() to cast the parsed value to a number, or any other numeric function.
Using Glob Expressions
fields @message
| parse @message "File size: *MB" as size
| filter abs(size)<64
| limit 20
Using Regular Expressions
fields @message
| parse @message /File size:\s+(?<size>\d+)MB/
| filter abs(size)<64
| limit 20
To learn how glob and regular expressions can be used, see CloudWatch Logs Insights Query Syntax.
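To actually graph the extracted size rather than just filter on it, the parsed field can be aggregated with stats and plotted from the query's visualization view. This is a minimal sketch assuming the same parse as above; the average and the five-minute bin are arbitrary choices for illustration.
fields @message
| parse @message "File size: *MB" as size
| stats avg(abs(size)) as avg_file_size_mb by bin(5m)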

Related

CloudWatch Log Insights using "in" to match any message that has any item in array

I have an array with a list of unique literal strings (ids), and I want to use the "in" keyword to test for set membership. I've used the following query; the ephemeral field "id" extracts the id from the message.
fields @timestamp, @message, @logStream
| filter @message like /mutation CreateOrder/
| parse @message 'Parameters: *}], "id"=>"*"}}, "graphql"*' as rest_of_message, id
| parse @message '"variables"=>{"createOrderInput"=>*}, "graphql"' as variables
| filter id in ["182841661","182126710"]
| sort #timestamp desc
| limit 10000
| display id, variables
It was my assumption that it would match any message whose ephemeral field "id" matches any of the literal ids in the array. However, it is only matching messages that contain the first literal id in the array.
I've searched for both ids using the "like" keyword, and they both come up in the selected period.
Is it possible to do what I want to do? Is there a better way of doing it?
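One thing worth checking (a hypothetical debugging sketch, not a confirmed fix) is whether the glob parse leaves stray characters such as quotes or braces in the ephemeral field, since in compares exact string values. Re-parsing the id with a regular expression so that only digits are captured can help rule that out; the pattern below is an assumption based on the sample message above.
fields @timestamp, @message
| filter @message like /mutation CreateOrder/
| parse @message /"id"=>"(?<id>\d+)"/
| filter id in ["182841661","182126710"]
| display @timestamp, id
| limit 20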

How to only extract match strings from a multi-value field and display in new column in SPLUNK Query

I am trying to extract matched strings from a multi-value field and display them in another column. I have tried various options: splitting the field by a delimiter, then mvexpand, then using where/search to pull the data. I was trying to find out if there is an easier way to do this without all that hassle in a Splunk query.
Example: let's say I have the multi-value field column1 below, with data separated by a comma delimiter:
column1 = abc1,test1,test2,abctest1,mail,send,mail2,sendtest2,new,code,results
I was splitting this column with |eval column2=split(column1,",") and using regex/where/search to search for data matching *test* in this column and return results. I was able to pull the results, but column1 still shows all the values abc1,test1,test2,abctest1,mail,send,mail2,sendtest2,new,code,results. What I want is either to trim column1 so it shows only the words matching test, or to show those entries in a new column2 that contains only test1,test2,abctest1,sendtest2, as those are the only values matching *test*.
I would appreciate your help, thanks.
Found the answer after posting this question; it's just using the existing mvfilter function to pull the matching results.
column2=mvfilter(match(column1,"test"))
| eval column2=split(column1,",") | search column2="*test*"
doesn't work, as the split creates a multi-value field, which is a single event containing a single field with many values. The search for *test* will still find that event, even though it also contains abc1, etc., because at least one value matches *test*.
What you can use is the mvfilter function to narrow down the multi-value field to the values you are after.
| eval column2=split(column1,",") | eval column2=mvfilter(match(column2,".*test.*"))
Alternatively to this approach, you can use a regular expression to extract what you need.
| rex field=column1 max_match=0 "(?<column2>[^,]*test[^,]*)"
Regardless, at the end, you would need to use mvjoin to join your multiple values into a single string
| eval column2=mvjoin(column2, ",")
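As a self-contained sanity check of the split/mvfilter/mvjoin chain, the run-anywhere search below fabricates the sample value with makeresults (used purely for illustration; the field names match the example above).
| makeresults
| eval column1="abc1,test1,test2,abctest1,mail,send,mail2,sendtest2,new,code,results"
| eval column2=split(column1,",")
| eval column2=mvfilter(match(column2,"test"))
| eval column2=mvjoin(column2,",")
| table column1 column2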

Splunk - Extract multiple values not equaling a value from a string

In Splunk I'm trying to extract multiple parameters and values that do not equal a specific word from a string. For example:
For anything in this field that does not equal "NEGATIVE", extract the parameter and value:
Field:
field={New A=POSITIVE, New B=NEGATIVE, New C=POSITIVE, New D=BAD}
Result:
New A=POSITIVE
New C=POSITIVE
New D=BAD
Try this search. It uses a regular expression to extract parameters and values where the value is not "NEGATIVE".
index=foo
| rex field=field max_match=0 "(?<result>New \w=(?!NEGATIVE)\w+)"
| mvexpand result
To extract the actual field/value pairs, add this to the end of Rich's solution
| rename _raw as _orig_raw, result AS _raw
| extract pairdelim="," kvdelim="=" clean_keys=true
| rename _orig_raw as _raw
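For reference, a run-anywhere version of the same extraction can be put together with makeresults to fabricate a test event (the hard-coded sample value is only there for illustration):
| makeresults
| eval field="New A=POSITIVE, New B=NEGATIVE, New C=POSITIVE, New D=BAD"
| rex field=field max_match=0 "(?<result>New \w=(?!NEGATIVE)\w+)"
| mvexpand result
| table result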

Format a number to NOT have commas (1,000,000 -> 1000000) in Google BigQuery

In BigQuery: how do we format a number that will be part of the result set so that it does not have commas, e.g. 1,000,000 to 1000000?
I am assuming that your data type is string here.
You can use the REGEXP_REPLACE function to remove certain symbols from strings.
SELECT REGEXP_REPLACE("1,000,000", r',', '') AS Output
Returns:
+-----+---------+
| Row | Output  |
+-----+---------+
|   1 | 1000000 |
+-----+---------+
If your data contains strings both with and without commas, this function will return the ones without commas unchanged, so you don't need to worry about filtering the input.
Documentation for REGEXP_REPLACE can be found in the BigQuery string functions reference.
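If a numeric result is needed rather than a string, the cleaned value can also be cast; this is a sketch assuming the cleaned string contains only digits:
SELECT CAST(REGEXP_REPLACE("1,000,000", r',', '') AS INT64) AS output_value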

Is there a way to set the maximum length of the string in the column's name in BigQuery?

I can't find any documentation for it; should I believe that it is a maximum of 128 characters?
Let's check by generating a long column name:
#standardSQL
SELECT
  STRING_AGG(
    CODE_POINTS_TO_STRING([MOD(c, 26) + TO_CODE_POINTS('a')[OFFSET(0)]]),
    '')
FROM UNNEST(GENERATE_ARRAY(0, 127)) AS c;
+----------------------------------------------------------------------------------------------------------------------------------+
| f0_ |
+----------------------------------------------------------------------------------------------------------------------------------+
| abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwx |
+----------------------------------------------------------------------------------------------------------------------------------+
Now we can use it in a query:
bq query --use_legacy_sql=false "SELECT 1 AS abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwx;"
Waiting on <job ID> ... (1s) Current status: DONE
+----------------------------------------------------------------------------------------------------------------------------------+
| abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwx |
+----------------------------------------------------------------------------------------------------------------------------------+
| 1 |
+----------------------------------------------------------------------------------------------------------------------------------+
Okay, so 128 characters is all right. What if we use one more?
bq query --use_legacy_sql=false "SELECT 1 AS abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxy;"
Waiting on <job ID> ... (0s) Current status: DONE
BigQuery error in query operation: Error processing job 'bigquerytestdefault:bqjob_r5056943d6408b629_0000015cc29ae7ae_1': Invalid field name "abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxy". Fields
must contain only letters, numbers, and underscores, start with a letter or underscore, and be at most 128 characters long.
I get an error about the length of the name. This is documented as part of the tables reference, saying:
[Required] The field name. The name must contain only letters (a-z,
A-Z), numbers (0-9), or underscores (_), and must start with a letter
or underscore. The maximum length is 128 characters.