What is the cloudwatch insights logs equivalent of SQL's "WHERE IN ('foo', 'bar')" - amazon-cloudwatch

This is not valid:
fields #timestamp, log
| sort #timestamp desc
| filter kubernetes.pod_name like /my-pod/
| filter log.someId IN (156446, 156447, 156448, 156449, 156450, 156451, 156453, 156454, 156455, 156456, 156457, 156458, 156459, 156460, 156461, 156462, 154832, 154379, 154380, 154381, 154382, 153597, 145666, 145647, 145627, 139961, 139967, 142303, 142597, 130045, 129441, 131003, 123103, 122227, 122294 )
the last part is not correct. What is the correct way to accomplish this?

Figured it out. Just use [] instead of (). So:
fields #timestamp, log
| sort #timestamp desc
| filter kubernetes.pod_name like /my-pod/
| filter log.someId IN [156446, 156447, 156448, 156449, 156450, 156451, 156453, 156454, 156455, 156456, 156457, 156458, 156459, 156460, 156461, 156462, 154832, 154379, 154380, 154381, 154382, 153597, 145666, 145647, 145627, 139961, 139967, 142303, 142597, 130045, 129441, 131003, 123103, 122227, 122294]

Related

CloudWatch Logs Insights display a filed from the Json in the log message

This is my log entry from AWS API Gateway:
(8d036972-0445) Method request body before transformations: {"TransactionAmount":225.00,"OrderID":"1545623982","PayInfo":{"Method":"ec","TransactionAmount":225.00},"CFeeProcess":0}
I want to write a CloudWatch Logs Insights query which can display AWS request id, present in the first parenthesis and the order id present in the json.
I'm able to get the AWS request id by parsing the message. How can I get the OrderID json field?
Any help is greatly appreciated.
| parse #message "(*) Method request body before transformations: *" as awsReqId,JsonBody
#| filter OrderID = "1545623982" This did not work
| display awsReqId,OrderID
| limit 20
You can do it with two parse steps, like this:
fields #message
| parse #message "(*) Method request body before transformations: *" as awsReqId, JsonBody
| parse JsonBody "\"OrderID\":\"*\"" as OrderId
| filter OrderID = "1545623982"
| display awsReqId,OrderID
| limit 20
Edit:
Actually, they way you're doing it should also work. I think it doesn't work because you have 2 space characters between brackets and the word Method here (*) Method. Try removing 1 space.

How to reference an eval variable in query

I am trying to access a variable (in this example; sampleFromDate and sampleToDate) from a sub-query. I have defined the variables with syntax eval variableName = value and would like to access with syntax filterName=$variableName$. See the example below where I am trying to access values using earliest=$sampleFromDate$ latest=$sampleToDate$
index=*
earliest=-8d latest=-1d
| eval sampleToDate=now()
| eval sampleFromDate=relative_time(now(), "-1d")
| appendcols [
search (index=*)
earliest=$sampleFromDate$ latest=$sampleToDate$
]
This produces the error:
Invalid value "$sampleFromDate$" for time term 'earliest'
The value of sampleFromDate is in the format seconds since epoch time, e.g.
1612251236.000000
I know I can do earliest=-d latest=now() - but I don't want to do this because I want to reference the variables in several locations and output them at the end.
Why are you trying to eval those time values?
Just do:
index=* earliest=-8d latest=-1d
| <rest of search>
| appendcols [
search (index=*) earliest=-1d
| <rest of appended search>
]
There's no need to explicitly set latest unless you want something other than now()

AWS Log Insights query with string contains

how do I query with contains string in AWS Log insights
fields #timestamp, #message
filter #message = "user not found"
| sort #timestamp desc
| limit 20
fields #timestamp, #message
filter #message strcontains("User not found")
| sort #timestamp desc
| limit 20
This should work fine
fields #timestamp, #message
| filter #message like /user not found/
| sort #timestamp desc
| limit 20
I think you need to select them as fields and then filter on their value. e.g:
fields #timestamp, #message, strcontains(#message, "user not found") AS unf
| filter unf=1
| sort #timestamp desc
| limit 20
Or use regex
fields #timestamp, #message
| filter #message like /User\snot\sfound/
| ...
(haven't tested them)
I recently ran into the same scenario. strcontains takes the input string as the first argument and the search value as the second. so in your case the following should work fine.
fields #timestamp, #message
| filter strcontains(#message, "User not found")
| sort #timestamp desc
| limit 20
I was looking for contains and in filters. Allowed filtering options are:
'in', 'and', 'or', 'not', 'like', '=~', '~=', '|', '|>', '^', '*', '/', '%', '+', '-', '<', '>', '<=', '>=', '=', '!='
So the solution using like seems also the optimal version in terms of operator.
fields #timestamp, #message
| filter #message like /user not found/
| sort #timestamp desc
| limit 20
Nevertheless there's another possibility to parse the message itself and do an equal comparison for use cases where one needs to be more exact. For formatted log rows like:
2020-12-24T19:08:18.180+01:00 [main] INFO com.foo.bar.FooBar - My log message!
You can parse substrings from the message and assign them to a field which can then be filtered using equal operator ("="). In the example below you can see no "INFO" String in the message can interfere with filtering severity:
fields #timestamp, #message
| parse #message "[*] * *" as #level, #severity, #info
| filter #logStream like "my/stream/within/loggroup"
| filter #severity="INFO"
| sort #timestamp desc
| limit 20

How to pass variables as expressions in Bigquery timestamp_diff

Currently, I am trying to check the timestamp difference in hours with expressions passed as a variables through the command line. But I am unable to get the desired output when passing through variables.
a=2019-11-1812:49:43
b=2020-04-04 20:32:33
timediff=$(bq query --nouse_legacy_sql \ 'SELECT TIMESTAMP_DIFF(TIMESTAMP "'$a'", TIMESTAMP "$b", HOUR);')
Looks like the variables I am passing are not recognized. Can someone help me understand the correct way of doing it?
In addition to Hemant's answer to further contribute with the community I will provide an alternative method.
As stated in the documentation, it is possible to use parameterized queries in BigQuery using the Command-Line interface (CLI). You need to use the flag --parameter within your bq query command in order to specify the varibles/parameters you will use.
This flag must be in the format name:type:value. Although, if type is omitted it will used as STRING. As an example:
timediff= $(bq query --use_legacy_sql=false
--parameter='ts_value:TIMESTAMP:2016-12-07 08:00:00'
--parameter='ts_value1:TIMESTAMP:2016-12-07 09:00:00'
'SELECT
TIMESTAMP_DIFF(#ts_value,#ts_value1, HOUR)')
echo $timediff
And the output is:
+-----+
| f0_ |
+-----+
| -1 |
+-----+
You could use --format=csv to format the output as a line:
f0_ -1
In addition, I would like to add that you can use aliases to simplify your query. For instance:
alias bq_set="bq query --use_legacy_sql=false --format=pretty"
timediff=$(bq_set
--parameter='ts_value:TIMESTAMP:2016-12-07 08:00:00'
--parameter='ts_value1:TIMESTAMP:2016-12-07 09:00:00'
'SELECT
TIMESTAMP_DIFF(#ts_value,#ts_value1, HOUR)')
echo $timediff
The output:
+-----+
| f0_ |
+-----+
| -1 |
+-----+
As you can see it was just an alternative to simply your query.
Try using single quotes around the variables, but double-quotes around the entire query. For example:
a='2019-11-18 12:49:43'
b='2020-04-04 20:32:33'
timediff=$(bq query --format=csv --nouse_legacy_sql "SELECT TIMESTAMP_DIFF(TIMESTAMP '$a', TIMESTAMP '$b', HOUR);" | awk
'NR>1')
echo $timediff
-3319

BigQuery select alias using regex_extract_all in standard mode

I'm unable to reference a SELECT alias in BigQuery (standard mode).
Trying to do this query:
SELECT
REGEXP_EXTRACT_ALL(text,
r"(<div \w+>)") AS matches
FROM
regex.test
WHERE
matches IS NOT NULL
Here are steps to reproduce.
bq mk regex
bq mk -t regex.test id:integer,text:string
echo '{"id":1, "text":"<div a>"}' | bq insert regex.test
echo '{"id":2, "text":"<div b>"}' | bq insert regex.test
echo '{"id":3, "text":"<div>"}' | bq insert regex.test
bq query --use_legacy_sql=false "select REGEXP_EXTRACT_ALL(text, r\"(<div \w+>)\") AS matches FROM regex.test WHERE id IS NOT NULL"
+--------------+
| matches |
+--------------+
| [u'<div b>'] |
| [] |
| [u'<div a>'] |
+--------------+
When I try to reference the matches alias, I see an error:
bq query --use_legacy_sql=false "select REGEXP_EXTRACT_ALL(text, r\"(<div \w+>)\") AS matches FROM regex.test WHERE matches IS NOT NULL"
Error in query string: Error processing job 'myname': Unrecognized name:
matches
I am unable to reference the alias matches, and am unable to filter those results WHERE matches IS NOT NULL.
Does anyone know what I'm doing incorrectly here?
Thanks!
Even in BQ, you can't use a column alias in the where clause. Just use a subquery:
SELECT t.*
FROM (SELECT REGEXP_EXTRACT_ALL(text, r"(<div \w+>)") AS matches
FROM regex.test
) t
WHERE ARRAY_LENGTH(matches) > 0
Check out SELECT list aliases visibility
The reason why comparing with NULL does't work for REGEXP_EXTRACT_ALL is because
it returns array so checking with length is the way. Comparing with NULL still will work for REGEXP_EXTRACT
In addition, ideally you should be able use REGEX_MATCH to filter out records w/o matches, but looks like there is an issue with this function in standard mode