Update Query: Snowflake table variant column - sql

I am still learning Snowflake, any help would be really appreciated
I have a table(tbl1) that has a variant column(column_json) which looks like below:
{
"catalog": [
{
"img_href": "https://schumacher-webassets.s3.amazonaws.com/Web%20Catalog-600/179361.jpg",
"name": "ADITI HAND BLOCKED PRINT",
"price": 16
},
{
"img_href": "https://schumacher-webassets.s3.amazonaws.com/Web%20Catalog-600/179330.jpg",
"name": "TORBAY HAND BLOCKED PRINT",
"price": 17
},
{
"img_href": "https://schumacher-webassets.s3.amazonaws.com/Web%20Catalog-600/179362.jpg",
"name": "ADITI HAND BLOCKED PRINT",
"price": 18
}
],
"datetime": 161878993658
"catalog_id": 1
}
I am trying to add a new key-value pair to objects in catalog array. Hence, I am using an update query to update.
Here's my update query:
UPDATE tbl1
SET column_json:catalog[0] = object_insert(column_json:catalog[0], 'item_href', 'https://fschumacher.com/178791')
WHERE column_json:catalog_id = '1'
However I am facing below error
SQL compilation error: syntax error line 2 at position 20 unexpected ':'.

UPDATE only supports column operations so your approach won't work.
Rebuilding the catalog, as below, will work (but it does make me pause and wonder if there's a better way.)
UPDATE tbl1
SET column_json = new_catalog
FROM (select object_construct('catalog_id', catalog_id, 'datetime', any_value(datetime), 'catalog', array_agg(new_col)) new_catalog from (select column_json:datetime datetime, column_json:catalog_id catalog_id, iff(c.index = 0 and column_json:catalog_id = '1', object_insert(column_json:catalog[0], 'item_href', 'https://fschumacher.com/178791', true), c.value) new_col from tbl1, lateral flatten(column_json:catalog) c) group by catalog_id)
WHERE column_json:catalog_id = '1';
--This results in the following:
--select column_json:catalog[0].item_href from tbl1;
--"https://fschumacher.com/178791"

Related

SingleStore (MemSQL) How to flatten array using JSON_TO_ARRAY

I have a table source:
data
{ "results": { "rows": [ { "title": "A", "count": 61 }, { "title": "B", "count": 9 } ] }}
{ "results": { "rows": [ { "title": "C", "count": 43 } ] }}
And I want a table dest:
title
count
A
61
B
9
C
43
I found there is JSON_TO_ARRAY function that might be helpful, but got stuck how to apply it.
How to correctly flatten the json array from the table?
I have the following that works on your example but it might help you with the syntax.
In this query I created a table called json_tab with a column called jsondata.
With t as (
select table_col AS title FROM json_tab join TABLE(JSON_TO_ARRAY(jsondata::results::rows)))
SELECT t.title::$title title,t.title::$count count FROM t
I took example from the code snippet to work with Nested Arrays in a JSON Column
https://github.com/singlestore-labs/singlestoredb-samples/blob/main/JSON/Analyzing_nested_arrays.sql
Three options I came up with, which are essentially the same:
INSERT INTO dest
WITH t AS(
SELECT table_col AS arrRows FROM source JOIN TABLE(JSON_TO_ARRAY(data::results::rows))
)
SELECT arrRows::$title as title, arrRows::%count as count FROM t;
INSERT INTO dest
SELECT arrRows::$title as title, arrRows::%count as count FROM
(SELECT table_col AS arrRows FROM source JOIN TABLE(JSON_TO_ARRAY(data::results::rows)));
INSERT INTO dest
SELECT t.table_col::$title as title, t.table_col::%count as count
FROM source JOIN TABLE(json_to_array(data::results::rows)) t;

How can I modify all values that match a condition inside a json array?

I have a table which has a JSON column called people like this:
Id
people
1
[{ "id": 6 }, { "id": 5 }, { "id": 3 }]
2
[{ "id": 2 }, { "id": 3 }, { "id": 1 }]
...and I need to update the people column and put a 0 in the path $[*].id where id = 3, so after executing the query, the table should end like this:
Id
people
1
[{ "id": 6 }, { "id": 5 }, { "id": 0 }]
2
[{ "id": 2 }, { "id": 0 }, { "id": 1 }]
There may be more than one match per row.
Honestly, I didnĀ“t tried any query since I cannot figure out how can I loop inside a field, but my idea was something like this:
UPDATE mytable
SET people = JSON_SET(people, '$[*].id', 0)
WHERE /* ...something should go here */
This is my version
SELECT VERSION()
+-----------------+
| version() |
+-----------------+
| 10.4.22-MariaDB |
+-----------------+
If the id values in people are unique, you can use a combination of JSON_SEARCH and JSON_REPLACE to change the values:
UPDATE mytable
SET people = JSON_REPLACE(people, JSON_UNQUOTE(JSON_SEARCH(people, 'one', 3)), 0)
WHERE JSON_SEARCH(people, 'one', 3) IS NOT NULL
Note that the WHERE clause is necessary to prevent the query replacing values with NULL when the value is not found due to JSON_SEARCH returning NULL (which then causes JSON_REPLACE to return NULL as well).
If the id values are not unique, you will have to rely on string replacement, preferably using REGEXP_REPLACE to deal with possible differences in spacing in the values (and also avoiding replacing 3 in (for example) 23 or 34:
UPDATE mytable
SET people = REGEXP_REPLACE(people, '("id"\\s*:\\s*)2\\b', '\\14')
Demo on dbfiddle
As stated in the official documentation, MySQL stores JSON-format strings in a string column, for this reason you can either use the JSON_SET function or any string function.
For your specific task, applying the REPLACE string function may suit your case:
UPDATE
mytable
SET
people = REPLACE(people, CONCAT('"id": ', 3, ' '), CONCAT('"id": ',0, ' '))
WHERE
....;

Check if row contains a nested item in DynamoDB?

I am trying to use PartiQL with DynamoDB to perform SQL queries to check if a device is inactive and contains an error. Here's is the query I am using:
SELECT *
FROM "table"
WHERE "device"."active" = 0 AND "device"."error" IS NOT NULL
However I've noticed that even if a device doesn't have the error item, the query still returns a row. How can I query a device that only contains the error item?
With error item
{
"id": "value",
"name": "value,
"device": {
"active": 0,
"error": {
"reason": "value"
}
}
}
Without error item
{
"id": "value",
"name": "value,
"device": {
"active": 0
}
}
You're looking for IS NOT MISSING :) That's the partiql version of the filter expression operator function attribute_exists.
Given a table with a primary key PK, sort key SK, and the following data:
PK
SK
myMap
foo
1
{}
foo
2
{"test": {}}
-- Returns both foo 1 and foo 2
SELECT *
FROM "my-table"
WHERE "PK" = 'foo' AND "myMap"."test" IS NOT NULL
-- Returns just foo 2
SELECT *
FROM "my-table"
WHERE "PK" = 'foo' AND "myMap"."test" IS NOT MISSING
Also made sure my example specifies the PK in the WHERE clause - otherwise, your query will be a full scan. Maybe that's what you want, though. Just something to be aware of.

How do I INSERT columns with nested name syntax (ie. "item.description")?

I'm trying to merge two databases with the same schema on Google BigQuery.
I'm following the merge samples here: https://cloud.google.com/bigquery/docs/reference/standard-sql/dml-syntax#merge_statement
However, my tables have nested columns, ie "service.id" or "service.description"
My code so far is:
MERGE combined_table
USING table1
ON table1.id = combined_table.id
WHEN NOT MATCHED THEN
INSERT(id, service.id, service.description)
VALUES(id, service.id, service.description)
However, I get the error message: Syntax error: Expected ")" or "," but got ".", and a red squiggly underline under .id on the INSERT(...) line.
Here is a view of part of my table's schema:
[
{
"name": "id",
"type": "STRING"
},
{
"name": "service",
"type": "RECORD",
"fields": [
{
"name": "id",
"type": "STRING"
},
{
"name": "description",
"type": "STRING"
}
]
},
{
"name": "cost",
"type": "FLOAT"
}
...
]
How do I properly structure this INSERT(...) statement so that I can include the nested columns?
Syntax error: Expected ")" or "," but got "."
Looks like you are on the right direction, Note in the documentation how you need to insert value to a REPEATED column,
You need to define the structure to guide BigQuery what to expect, For example:
STRUCT<created DATE, comment STRING>
This is the full example from the documentation
MERGE dataset.DetailedInventory T
USING dataset.Inventory S
ON T.product = S.product
WHEN NOT MATCHED AND quantity < 20 THEN
INSERT(product, quantity, supply_constrained, comments)
-- insert values like this
VALUES(product, quantity, true, ARRAY<STRUCT<created DATE, comment STRING>>[(DATE('2016-01-01'), 'comment1')])
WHEN NOT MATCHED THEN
INSERT(product, quantity, supply_constrained)
VALUES(product, quantity, false)
I've found the answer.
It turns out when referencing the top level of a STRUCT, BigQuery references all of the nested columns as well. So if I wanted to INSERT service and all of it's sub-columns (service.id and service.description), I only have to include service in the INSERT(...) statement.
The following code worked:
...
WHEN NOT MATCHED THEN
INSERT(id, service)
VALUES(id, service)
This would merge all sub columns, including service.id and service.description.

How to Query Cosmos DB graph by use of SQL CONTAINS

I have a Cosmo DB graph where I would like to access the 'name' field in an expression using the string matching CONTAINS in Cosmos DB. CONTAINS works at 1 level as in matching CONATINS
SELECT s.label, s.name FROM s WHERE CONTAINS(LOWER(s.name._value), "cara") AND s.label = "site"
I also tried with a UDF function
SELECT s.label, s.name FROM s WHERE(s.label = 'site' AND udf.strContains(s.name._value, '/cara/i'))
I don't get any hits or syntax errors from Cosmos DB even that should be at least one record in this example. Does anyone have a hint? Thanks in advance
[
{
"label": "site",
"name": [
{
"_value": "0315817 Caracol",
"id": "2e2f000d-2e0a-435a-b472-75d257236558"
}
]
},
{
"label": "site",
"name": [
{
"_value": "0315861 New Times",
"id": "48497172-1734-43d0-9866-51faf9f603ed"
}
]
}
]
I noticed that the name property is an array not an object.So, you need to use join in sql.
SELECT s.label, s.name , name._value FROM s
join name in s.name
where CONTAINS(LOWER(name._value), "cara") AND s.label = "site"
Output:
Hope it helps you.