ElastAlert flatline not finding results - lucene

I'm creating a flatline alert using the ElastAlert framework.
When I use the exact same query syntax in the Kibana UI it returns results, but ElastAlert isn't returning any results.
Here's my elastalert-rule-file.yaml:
name: Test Flatline
type: flatline
run_every:
  seconds: 15
realert:
  minutes: 0
es_host: localhost
es_port: 9200
threshold: 1
timeframe:
  minutes: 5
index: my-index-*
filter:
- query:
    query_string:
      query: "_type:metric" # this returns results in both kibana and elastalert
      #query: "_type:metric AND _exists_:My\ Field\ With\ Spaces.value" # this returns results in kibana but not in elastalert
timestamp_type: unix_ms
alert:
- command
command: ["my-bash-script.sh"]
So I tried playing around with the query, and if I just specify _type:metric then the search results in Kibana seem to match those in ElastAlert.
However, when I use the query with the _exists_ Lucene syntax (the second, commented-out query), ElastAlert doesn't return anything while Kibana seems to be fine with the syntax.
Any ideas?

I got it...just forgot to post an answer.
Apparently, for the field with spaces, you need to escape the backslashes themselves, so the line in question would look like this:
query: "_type:metric AND _exists_:My\\ Field\\ With\\ Spaces.value"
Furthermore, in the special case where you are using an Ansible (YAML) configuration, you need to add a backslash to escape each backslash.
So the entry in the Ansible YAML file would look something like this:
query: "My\\\\ Field\\\\ With\\\\ Spaces.value"

You can avoid escaping by using double quotes for the field data:
query: '_type:metric AND _exists_:"My Field With Spaces.value"'
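For context, that quoted form would sit in the rule's filter like this (a minimal sketch reusing the filter block from the question above):

filter:
- query:
    query_string:
      query: '_type:metric AND _exists_:"My Field With Spaces.value"'

Because the outer quotes are single quotes, YAML does no backslash processing at all, and the inner double quotes let Lucene treat the whole field name as one token.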

Related

Delimit BigQuery REGEXP_EXTRACT strings in Google Cloud Build YAML script

I have a complex query that creates a view within the BigQuery console.
I have simplified it to the following to illustrate the issue:
SELECT
  REGEXP_EXTRACT(FIELD1, r"[\d]*") AS F1,
  REGEXP_REPLACE(FIELD2, r"\'", "") AS F2
FROM `project.mydataset.mytable`
Now I am trying to automate the creation of the view with Cloud Build.
I cannot work out how to delimit the strings inside the regexes so that they work with both YAML and SQL.
- name: 'gcr.io/cloud-builders/gcloud'
  entrypoint: 'bq'
  args: [
    'mk',
    '--use_legacy_sql=false',
    '--project_id=${_PROJECT_ID}',
    '--expiration=0',
    '--view=
    REGEXP_EXTRACT(FIELD1, r"[\d]*") as F1 ,
    REGEXP_REPLACE(FIELD2, r"\'", "") AS F2,
    REGEXP_EXTRACT(FIELD3, r"\[(\d{3,12}).*\]") AS F3
    FROM `project.mydataset.mytable`"
    '${_TARGET_DATASET}.${_TARGET_VIEW}'
  ]
I get the following error:
Failed to trigger build: failed unmarshalling build config
cloudbuild/build-views.yaml: json: cannot unmarshal number into Go
value of type string
I have tried using Cloud Build substitution parameters, and as many combinations of SQL and YAML escape sequences as I can think of, but have not found a working solution.
Generally, you want to use block scalars in such cases, as they do not process any special characters inside them and are terminated via indentation.
I have no idea how the command is supposed to look, but here's something that's at least valid YAML:
- name: 'gcr.io/cloud-builders/gcloud'
  entrypoint: 'bq'
  args:
  - 'mk'
  - '--use_legacy_sql=false'
  - '--project_id=${_PROJECT_ID}'
  - '--expiration=0'
  - >- # folded block scalar; newlines are folded into spaces
    --view=
    REGEXP_EXTRACT(FIELD1, r"[\d]*") as F1,
    REGEXP_REPLACE(FIELD2, r"\'", "") AS F2,
    REGEXP_EXTRACT(FIELD3, r"\[(\d{3,12}).*\]") AS F3
    FROM `project.mydataset.mytable`"
  - '${_TARGET_DATASET}.${_TARGET_VIEW}'
  - dummy value to show that the scalar ends here
A folded block scalar is started with >; the following minus tells YAML not to append the final newline to its value.
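As a minimal standalone illustration of the folding behavior (a hypothetical key, unrelated to the Cloud Build config):

view_arg: >-
  --view=
  SELECT 1
# the value of view_arg is the single-line string "--view= SELECT 1":
# the newline inside the scalar is folded into a space, and no trailing
# newline is appended because of the minus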

How can we check the replaced values of Hive variables for errors

We have a query in which we have defined more than 50 variables.
We call this HQL via a shell script, and most of the time I run into syntax issues where I have not defined the Hive variables properly in the query.
Example
set hiveconf:var0=value0;
set hiveconf:var1=value1;
set hiveconf:var2=value2;
select * from ${hiveconf:var0} where col1=${hiveconf:var1} and col2=${hiveconf:var2};
I want to check the result of the above query after the Hive variables have been replaced.
So is there a way to check whether the variables are parsed in the right way, or whether there are any syntax errors?
Please let me know about any alternatives as well.
Better to use the hivevar namespace for this.
You can print any variable using the ! echo command:
set hivevar:var0=value0;
hive> ! echo Variable hivevar:var0 is ${hivevar:var0};
Result:
Variable hivevar:var0 is value0
Also use explain extended <query> - it will print a detailed query plan with predicates, and it will fail if there is a syntax error.
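For example (a sketch; my_table is a hypothetical table, and it must exist for the semantic check to pass):

set hivevar:var0=value0;
explain extended select * from my_table where col1='${hivevar:var0}';

The plan output shows the predicate with the substituted value, so a bad substitution fails here instead of in the middle of a long job.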
Update:
Also, you can use SELECT for doing the same, and Hive can execute simple queries without starting MR jobs if hive.fetch.task.conversion is set to more or minimal. If you are using Qubole, also add limit 1 to the query:
set hive.fetch.task.conversion=more;
select 'Variable hivevar:var0 is', '${hivevar:var0}' limit 1;
Why might you need to do this using SELECT? For example, for easily checking a parameter using a cast or some UDF. If you need to check that a parameter is of type DATE, use:
set hive.fetch.task.conversion=more;
select 'Variable hivevar:var0 is', date '${hivevar:var0}' limit 1;
In this case, if ${hivevar:var0} is not a date, a type cast exception will be thrown and script execution will terminate.
Along with the hivevar namespace, we can use one more property: hive.root.logger=INFO,console.
This will display the query after the variable values have been substituted, from which we can find the issue.
cat test.hql
set hivevar:var1=${hivevar:var11};
set hivevar:var2=2345;
select ${hivevar:var11};
select ${hivevar:var2};
Hive command: hive --hiveconf hive.root.logger=INFO,console --hivevar var11=1234 -f test.hql
Output on console:
select 1234
2018-10-17T08:23:31,632 INFO [main] ql.Driver: Completed executing command(queryId=-4dd6-493f-88be-03810f847fe7); Time taken: 0.003 seconds
OK
2018-10-17T08:23:31,632 INFO [main] ql.Driver: OK
2018-10-17T08:23:31,670 INFO [main] io.NullRowsInputFormat$NullRowsRecordReader: Using null rows input format
1234

How to skip errors when using json_extract_path_text in Redshift?

I have the following query:
SELECT 'curl -s http://www.mde.operator.com/MRE/api?profile=CANCEL_AUTH&mode=assync-oneway&Auth='
       || json_extract_path_text(external_reference_id, 'transactionIdAuth')
       || '&NUM=' || phone
FROM dbo.cancelled
WHERE id LIKE '%Auth%';
It should return more than 60 thousand results, but some of the JSON is broken and I cannot manage to delete the broken rows.
Is there any way to skip the rows that produce errors?
Note: they aren't null rows.
I've already tried:
json_extract_path_text(regexp_replace(event_properties,'\\\\.',''),'someValue')
You can make them null rows though, by setting the null_if_invalid argument of the json_extract_path_text function to true.
Source: https://docs.aws.amazon.com/redshift/latest/dg/JSON_EXTRACT_PATH_TEXT.html
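A minimal sketch of how that could be applied to the query above (same table and column names as the question; null_if_invalid is the optional last argument):

SELECT json_extract_path_text(external_reference_id, 'transactionIdAuth', true) AS auth
FROM dbo.cancelled
WHERE id LIKE '%Auth%'
  AND json_extract_path_text(external_reference_id, 'transactionIdAuth', true) IS NOT NULL;

Rows with unparseable JSON then come back as NULL instead of raising an error, and the IS NOT NULL predicate drops them.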

BigQuery bq command with asterisk (*) doesn't work in Compute Engine

I have a directory with a file named file1.txt, and I run the command:
bq query "SELECT * FROM [publicdata:samples.shakespeare] LIMIT 5"
In my local machine it works fine but in Compute Engine I receive this error:
Waiting on bqjob_r2aaecf624e10b8c5_0000014d0537316e_1 ... (0s) Current status: DONE
BigQuery error in query operation: Error processing job 'my-project-id:bqjob_r2aaecf624e10b8c5_0000014d0537316e_1': Field 'file1.txt' not found.
If the directory is empty it works fine. I'm guessing the asterisk is being expanded to the file name(s) in the query, but I don't know why.
Apparently the bq command, which is located at /usr/bin/bq, is the following script:
#!/bin/sh
exec /usr/lib/google-cloud-sdk/bin/bq ${@}
and the unquoted ${@} is what expands the asterisk.
As a current workaround I'm calling /usr/lib/google-cloud-sdk/bin/bq directly.
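For reference, quoting the expansion stops the shell from glob-expanding the arguments, so a corrected wrapper would look like this (a sketch, not the script Google actually ships):

#!/bin/sh
# "$@" forwards each argument verbatim; unquoted ${@} re-splits the
# arguments and performs pathname (glob) expansion on them
exec /usr/lib/google-cloud-sdk/bin/bq "$@"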

Using timestamp literals in a WHERE clause with bq tool

I had a look at the BigQuery command line tool documentation and I saw that you are able to use timestamp literals in a WHERE clause. The documentation shows the following example:
$ bq query "SELECT name, birthday FROM dataset.table WHERE birthday <= '1959-01-01 01:02:05'"
Waiting on job_6262ac3ea9f34a2e9382840ee11538ef ... (0s) Current status: DONE
+------+---------------------+
| name |      birthday       |
+------+---------------------+
| kim  | 1958-06-24 12:18:35 |
+------+---------------------+
As dataset.table is not a public dataset, I built an example using the wikipedia dataset.
SELECT title, timestamp, SEC_TO_TIMESTAMP(timestamp) AS human_timestamp
FROM publicdata:samples.wikipedia
HAVING human_timestamp>'2008-01-01 01:02:03' LIMIT 5
The example works in the BigQuery browser tool but it does not in the bq tool. Why? I tried using escape characters and several combinations of single and double quotes without success. Is it a Windows issue?
EDIT: This is BigQuery CLI 2.0.18
I know that "It works on my machine" isn't a satisfying answer, but I've tried this on my Mac and on a Windows machine, and it appears to work fine on both. Here is the output from my Windows machine for the same query you've specified:
C:\Users\Jordan Tigani>bq query "SELECT title, timestamp, SEC_TO_TIMESTAMP(timestamp) AS human_timestamp FROM publicdata:samples.wikipedia HAVING human_timestamp>'2008-01-01 01:02:03' LIMIT 5"
Waiting on bqjob_r607b7a74_00000144b71ddb9b_1 ... (0s) Current status: DONE
Can you make sure that the quotes you're using aren't pasted smart quotes and there aren't any stray unicode characters that might confuse the parsing?
One other hint is to use the --apilog=- option, which tells BigQuery to print out all interaction with the server to stdout. You can then see exactly what is getting sent to the BigQuery backend, and verify that the quotes are as expected.
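For example, the global flag goes before the command (a minimal illustration; per the note above, the - value sends the log to stdout):

bq --apilog=- query "SELECT 1"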
I found out that the problem is due to the greater-than operator > in the Windows command line; it does not have anything to do with the google-cloud-sdk, sorry.
It seems that you have to use the escape character to pass the sign on the command line: ^>
I found it at Google Groups (by Todd and Margo Chester), and the official reference at the Microsoft site.
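As a generic illustration of that escaping (a hypothetical echo example, not the bq query itself):

C:\> echo 1 ^> 2
1 > 2

Without the caret, cmd would interpret > as output redirection and write the output to a file named 2 instead of printing the line.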