Include multiple SQL queries in one BigQueryInsertJobOperator Airflow


The syntax looks like this:
configuration={
    "query": {
        "query": "{% include ['query_2.sql'] %}",
        "useLegacySql": False,
    }
}
Is there any way to include several files in one query, along these lines?
"query": "{% include ['query_2.sql', 'query_3.sql', 'query_4.sql'] %}"

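Passing a list to include makes Jinja render only the first template in the list that exists, so it cannot concatenate files. Instead, loop over the files and append a semicolon after each include:
configuration={
    "query": {
        "query": "{% for f in ['query_1.sql','query_2.sql'] %}"
                 "{% include f %}"
                 ";"
                 "{% endfor %}",
        "useLegacySql": False,
    }
}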

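If you would rather not rely on the Jinja loop, a similar effect can be had by joining the files in Python before handing the text to the operator. This is only a sketch: the helper name build_script is hypothetical, and it assumes each file holds a single statement (optionally ending in a semicolon) and that the paths resolve from wherever the DAG file is parsed.

```python
from pathlib import Path


def build_script(paths):
    """Join several SQL files into one multi-statement script.

    Hypothetical helper: each file is assumed to hold one statement.
    """
    statements = []
    for path in paths:
        sql = Path(path).read_text(encoding="utf-8").strip()
        # Drop a trailing semicolon so we never emit ";;" between statements.
        statements.append(sql.rstrip(";").rstrip())
    return ";\n".join(statements) + ";"
```

The result would then be used as the "query" value, for example "query": build_script(["query_1.sql", "query_2.sql"]). Unlike the Jinja version, the files are read when the DAG is parsed rather than when the task runs.
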
You can create a stored procedure containing all the queries you need and then call it like this:
t1 = BigQueryInsertJobOperator(
    task_id="exec_sp",
    project_id=PROJECT_ID,
    configuration={
        "query": {
            "query": f"CALL `{PROJECT_ID}.{DATASET}.stored_procedure_name` ();",
            "useLegacySql": False,
            "timeoutMs": 3600000,
        }
    }
)

Related

Shopware 6 custom endpoint for admin api doesn't return only the requested fields

I have a custom endpoint for the admin api which returns some products.
{% block response %}
{% set criteria = {
"includes": {
"product": ["id", "name"],
"cover": ['id']
},
"associations": {
"cover": {},
},
}%}
{% set products = services.repository.search('product', criteria) %}
{% set response = services.response.json({ 'products': products }) %}
{% do hook.setResponse(response) %}
{% endblock %}
When I call this endpoint, I want to receive only the product IDs and names, as described in the criteria, but I get the full object.
An example response is below:
{
"apiAlias": "api_feed_data_response",
"products": {
"entity": "product",
"total": 50,
"aggregations": [],
"page": 1,
"limit": 100,
"elements": [
{
"parentId": null,
"childCount": 0,
"taxId": "a4440e5086e14cf793627c9d670694aa",
"manufacturerId": "cc1c20c365d34cfb88bfab3c3e81d350",
"unitId": null,
"active": true,
"displayGroup": "75eef7154e9f22229e54e37b3ca0e54b",
"manufacturerNumber": null,
"ean": null,
"sales": 0,
...
}
How can I get only the requested fields (id, name)?

Starting with Shopware 6.5, this is possible with:
{% block response %}
{% set criteria = {
"fields": {
"product": ["id", "name"],
"cover": ['id']
},
"associations": {
"cover": {},
},
}%}
...
{% endblock %}
So you should use fields instead of includes.
The difference between the two is that includes are only filtered out at the CRUD API level: until the response is encoded in the API layer, the entities still contain all fields. Because you have a custom API endpoint, that filtering does not apply in this case.
The fields, by contrast, are filtered at the DB level, so only the listed fields are fetched from the database. That means the response from your custom API endpoint will also contain only those fields.

nested select query in elasticsearch

I have to convert the following query to Elasticsearch:
select * from index where observable not in (select observable from index where tags = 'whitelist')
I read that I should use a filter inside a not filter, but I don't understand how to do it.
Can anyone help me?
Thanks
EDIT:
I have to get all documents except those that have the 'whitelist' tag, but I also need to check that no blacklist element is contained in the whitelist.

If each observable appears in only one document, your SQL query can be simplified to this:
select * from index where tags not in ('whitelist')
The corresponding ES query would then be:
curl -XPOST localhost:9200/index/_search -d '{
"query": {
"filtered": {
"filter": {
"bool": {
"must_not": {
"terms": {
"tags": [
"whitelist"
]
}
}
}
}
}
}
}'
Alternatively, use the not filter instead of bool/must_not:
curl -XPOST localhost:9200/index/_search -d '{
"query": {
"filtered": {
"filter": {
"not": {
"terms": {
"tags": [
"whitelist"
]
}
}
}
}
}
}'
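
Both requests above rely on the filtered query and the not filter, which were deprecated in Elasticsearch 2.0 and removed in 5.0. On newer versions the same exclusion can be written as a plain bool query. The sketch below builds that body in Python; the function name is made up, and it assumes tags is indexed as an exact (non-analyzed) value.

```python
import json


def exclude_tag_query(tag):
    """Search body matching every document whose tags field lacks `tag`."""
    return {
        "query": {
            "bool": {
                # must_not clauses run in filter context and do not affect scoring.
                "must_not": [{"term": {"tags": tag}}]
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(exclude_tag_query("whitelist"), indent=2))
```

The printed JSON can be sent as the body of a POST to localhost:9200/index/_search, in the same way as the curl calls above.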

Define a dynamic not_analyzed field for a nested document

I have a document like the one below. The "tags" field is a nested document, and I want every child field of tags to be mapped with index = not_analyzed. The problem is that the fields in tags are dynamic: any tag is possible.
How can I define a dynamic mapping for this?
{
'level': 'info',
'tags': {
'content': u'Nov 6 11:07:10 ja10 Keepalived_healthcheckers: Adding service [172.16.08.105:80] to VS [172.16.1.21:80]',
'id': 1755360087,
'kid': '2012121316',
'mailto': 'yanping3,chunying,pengjie',
'route': 15,
'service': 'LVS',
'subject': 'LVS_RS',
'upgrade': 'no upgrade configuration for this alert'
},
'timestamp': 1383707282.500464
}

I think you can use dynamic templates for this. For example, the following shell script creates a dynamic_mapping_test index with a dynamic template: when a field matching tags.* is indexed, its mapping is set to type: string and index: not_analyzed.
echo "Delete dynamic_mapping_test"
curl -s -X DELETE http://localhost:9200/dynamic_mapping_test?pretty ; echo ""
echo "Create dynamic_mapping_test with nested tags and dynamic_template"
curl -s -X POST http://localhost:9200/dynamic_mapping_test?pretty -d '{
"mappings": {
"document": {
"dynamic_templates": [
{
"string_template": {
"path_match": "tags.*",
"mapping": {
"type": "string",
"index": "not_analyzed"
}
}
}
],
"properties": {
"tags": {
"type": "nested"
}
}
}
}
}' ; echo ""
echo "Display mapping"
curl -s "http://localhost:9200/dynamic_mapping_test/_mapping?pretty" ; echo ""
echo "Index document with new property tags.content"
curl -s -X POST "http://localhost:9200/dynamic_mapping_test/document?pretty" -d '{
"tags": {
"content": "this CONTENT should not be analyzed"
}
}' ; echo ""
echo "Refresh index"
curl -s -X POST "http://localhost:9200/dynamic_mapping_test/_refresh"
echo "Display mapping again"
curl -s "http://localhost:9200/dynamic_mapping_test/_mapping?pretty" ; echo ""
echo "Index document with new property tags.title"
curl -s -X POST "http://localhost:9200/dynamic_mapping_test/document?pretty" -d '{
"tags": {
"title": "this TITLE should not be analyzed"
}
}' ; echo ""
echo "Refresh index"
curl -s -X POST "http://localhost:9200/dynamic_mapping_test/_refresh"; echo ""
echo "Display mapping again"
curl -s "http://localhost:9200/dynamic_mapping_test/_mapping?pretty" ; echo ""
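
The script above targets an old Elasticsearch release. From 5.0 on, the string type with index: not_analyzed was replaced by the keyword type, and from 7.0 on mappings no longer carry a type name such as document. A rough sketch of the equivalent index body, built in Python; the function and template names are invented for illustration:

```python
import json


def nested_keyword_mapping(parent="tags"):
    """Index body mapping every field under `parent` as an exact-value keyword."""
    return {
        "mappings": {
            "dynamic_templates": [
                {
                    f"{parent}_as_keyword": {
                        # path_match is tested against the full dotted field path.
                        "path_match": f"{parent}.*",
                        "mapping": {"type": "keyword"},
                    }
                }
            ],
            "properties": {parent: {"type": "nested"}},
        }
    }


if __name__ == "__main__":
    print(json.dumps(nested_keyword_mapping(), indent=2))
```

The printed JSON would be the body of the index-creation request, replacing the mappings block in the script.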

I suggest mapping all strings as "not_analyzed" and all numbers as long, also "not_analyzed", because strings that are analyzed by default use more memory and disk space.
With the mapping below I reduced the index size. String fields are searched by their full value, and long fields support range searches.
{
"mappings": {
"_default_": {
"_source": {
"enabled": true
},
"_all": {
"enabled": false
},
"_type": {
"index": "no",
"store": false
},
"dynamic_templates": [
{
"el": {
"match": "*",
"match_mapping_type": "long",
"mapping": {
"type": "long",
"index": "not_analyzed"
}
}
},
{
"es": {
"match": "*",
"match_mapping_type": "string",
"mapping": {
"type": "string",
"index": "not_analyzed"
}
}
}
]
}
}
}

I don't think there is any way to specify a mapping while indexing the data. As an alternative, you can change your tags document to use the following mapping:
{ tags: {
    properties: {
        tag_type: {type: 'string', index: 'not_analyzed'},
        tag_value: {type: 'string', index: 'not_analyzed'}
    }
  }
}
Here, tag_type can contain any of the field names (content, id, kid, mailto, etc.), and tag_value contains the actual value of the field named in tag_type.
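To index documents against a mapping like that, the original tags object has to be reshaped into type/value pairs first. A minimal sketch in Python, assuming the reshaping happens on the client before indexing; the function name to_tag_pairs is hypothetical:

```python
def to_tag_pairs(tags):
    """Turn {"service": "LVS", ...} into [{"tag_type": ..., "tag_value": ...}, ...]."""
    return [
        # Values are stringified because tag_value is mapped as a string.
        {"tag_type": key, "tag_value": str(value)}
        for key, value in tags.items()
    ]


if __name__ == "__main__":
    print(to_tag_pairs({"service": "LVS", "route": 15}))
```

Each original tag becomes one entry in the list, so every value ends up in the same two fixed fields regardless of the tag name.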

prefix fuzzy query (not using query_string)

I want to do a prefix fuzzy search on a single term.
Basically, I want to get the same result as if this search request had been sent:
{
"from": 0,
"size": 100,
"query": {
"query_string": {
"query": "dala~*"
}
},
"filter": {}
}
but without query_string syntax parsing. The search above should match the term Dallas.

In Elasticsearch, if you set fuzzy_prefix_length (the number of leading characters that must match exactly), you should be able to specify just the fuzzy tilde and get prefix matching:
{
"from": 0,
"size": 100,
"query": {
"query_string": {
"query": "dala~",
"fuzzy_prefix_length": 3
}
},
"filter": {}
}
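Since the question asks to avoid query_string, the same idea can probably be expressed with the fuzzy term query, which takes a prefix_length option and does no syntax parsing. The sketch below builds such a body in Python. The field name city is a placeholder (the question does not name the field), the fuzziness of 2 is an assumption chosen because dala is two edits away from dallas, and the fuzziness parameter name applies to reasonably recent versions. Note that prefix_length only pins the leading characters: the fuzzy query matches whole terms within the edit distance, not arbitrarily longer completions.

```python
def fuzzy_term_query(field, value, prefix_length=3, fuzziness=2, size=100):
    """Search body for a fuzzy match on one term with a fixed exact prefix."""
    return {
        "from": 0,
        "size": size,
        "query": {
            "fuzzy": {
                field: {
                    "value": value,
                    # Maximum edit distance allowed between value and indexed term.
                    "fuzziness": fuzziness,
                    # Leading characters that must match exactly.
                    "prefix_length": prefix_length,
                }
            }
        },
    }
```

Because the fuzzy query is not analyzed, the value should be given in the form the terms are indexed in (for example, lowercased if the field uses a lowercasing analyzer).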

Elasticsearch missing filter on query

This is the first time I have used the 'missing' filter, and I am not sure whether I am doing something wrong, as I am not getting what I expect.
Can someone please tell me if the missing condition is correctly integrated in this query? It should create 5 facets, counting for each one only the occurrences for which the decimallatitude field is not set in the index or its value is null.
curl -XGET http://my_url:9200/idx_occurrence/Occurrene/_search?pretty=true -d '{
"filter": {
"missing": {
"field": "decimallatitude",
"existence": true,
"null_value": true
}
},
"query": {
"query_string": {
"fields": ["dataset"],
"query": "3",
"default_operator": "AND"
}
},
"facets": {
"test": {
"terms": {
"field": ["kingdom_interpreted"],
"size": 5
}
}
}
}
'

As you can see on the Search API - Filter page, a top-level filter is applied to your query results but not to the facets. To make it apply to the facets as well, use the Filtered Query instead:
curl -XGET http://my_url:9200/idx_occurrence/Occurrene/_search?pretty=true -d '{
"query": {
"filtered": {
"filter": {
"missing": {
"field": "decimallatitude",
"existence": true,
"null_value": true
}
},
"query": {
"query_string": {
"fields": ["dataset"],
"query": "3",
"default_operator": "AND"
}
}
}
},
"facets": {
"test": {
"terms": {
"field": ["kingdom_interpreted"],
"size": 5
}
}
}
}
'
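
Facets and the missing filter were both removed in later Elasticsearch releases (facets in 2.0, missing in 5.0), so on a current cluster the same idea would look roughly like the body below: a bool query whose must_not exists clause stands in for missing, plus a terms aggregation in place of the facet. This is a sketch built in Python; the function name is invented, and the field names are the ones from the question.

```python
def missing_field_counts(missing_field, agg_field, dataset, size=5):
    """Count agg_field values over documents where missing_field has no value."""
    return {
        "query": {
            "bool": {
                "must": [
                    {
                        "query_string": {
                            "fields": ["dataset"],
                            "query": dataset,
                            "default_operator": "AND",
                        }
                    }
                ],
                # "must_not exists" selects documents with no value for the field.
                "must_not": [{"exists": {"field": missing_field}}],
            }
        },
        # Aggregations are computed over the documents matched by the query.
        "aggs": {"test": {"terms": {"field": agg_field, "size": size}}},
    }
```

Calling missing_field_counts("decimallatitude", "kingdom_interpreted", "3") yields the body that corresponds to the request above.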
