How to convert JSON string column row into a queryable table - sql

I have exported a whole collection from Firestore to BigQuery so that I can run queries on it.
Now that the data is in BigQuery, I can query the whole set like this:
SELECT *
FROM `myapp-1a602.firestore_orders.orders_raw_changelog`
LIMIT 1000
This statement returns several columns, but the one I'm interested in is the data column. It holds each document as a JSON string, and I need to query the values inside it.
This is the data from one row:
{
"cart": [{
"qty": 1,
"description": "Sprite 1 L",
"productName": "Sprite 1 Liter",
"price": 1.99,
"productId": 9
}],
"storeName": "My awesome shop",
"status": 5,
"timestamp": {
"_seconds": 1590713204,
"_nanoseconds": 916000000
}
}
This data is inside the data column, so if I do this
SELECT data
FROM `myapp-1a602.firestore_orders.orders_raw_changelog`
LIMIT 1000
I get the JSON for each document, but I don't know how to query those values. Say I want all orders with status 5 and shopName My awesome shop: do I need to convert this JSON into a table first, or do I need to run the query against the JSON itself?
How can I query this JSON output?
Thanks

Do I need to do something with this JSON to convert it into a table? Do I need to perform the query on the JSON itself?
Below is for BigQuery Standard SQL
#standardSQL
SELECT * EXCEPT(data, cart_item),
JSON_EXTRACT(data, '$.status') AS status,
JSON_EXTRACT(data, '$.storeName') AS storeName,
JSON_EXTRACT(cart_item, '$.qty') AS qty,
JSON_EXTRACT(cart_item, '$.description') AS description,
JSON_EXTRACT(cart_item, '$.productName') AS productName,
JSON_EXTRACT(cart_item, '$.price') AS price,
JSON_EXTRACT(cart_item, '$.productId') AS productId
FROM `project.dataset.table`,
UNNEST(JSON_EXTRACT_ARRAY(data, '$.cart')) cart_item
Applied to the sample data from your question, as in the example below:
#standardSQL
WITH `project.dataset.table` AS (
SELECT 1 order_id, '''
{
"cart": [{
"qty": 1,
"description": "Sprite 1 L",
"productName": "Sprite 1 Liter",
"price": 1.99,
"productId": 9
},{
"qty": 2,
"description": "Fanta 1 L",
"productName": "Fanta 1 Liter",
"price": 1.99,
"productId": 10
}],
"storeName": "My awesome shop",
"status": 5,
"timestamp": {
"_seconds": 1590713204,
"_nanoseconds": 916000000
}
}
''' data
)
SELECT * EXCEPT(data, cart_item),
JSON_EXTRACT(data, '$.status') AS status,
JSON_EXTRACT(data, '$.storeName') AS storeName,
JSON_EXTRACT(cart_item, '$.qty') AS qty,
JSON_EXTRACT(cart_item, '$.description') AS description,
JSON_EXTRACT(cart_item, '$.productName') AS productName,
JSON_EXTRACT(cart_item, '$.price') AS price,
JSON_EXTRACT(cart_item, '$.productId') AS productId
FROM `project.dataset.table`,
UNNEST(JSON_EXTRACT_ARRAY(data, '$.cart')) cart_item
the result is:
Row  order_id  status  storeName          qty  description   productName       price  productId
1    1         5       "My awesome shop"  1    "Sprite 1 L"  "Sprite 1 Liter"  1.99   9
2    1         5       "My awesome shop"  2    "Fanta 1 L"   "Fanta 1 Liter"   1.99   10
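For readers who want to see the flattening logic outside BigQuery, here is a minimal Python sketch of what the JSON_EXTRACT plus UNNEST combination does: one output row per cart item, with the order-level fields repeated. The function name flatten_order and the variable names are hypothetical, and the sample string is trimmed from the data above.

```python
import json

# Sample document based on the answer's test data (timestamp omitted for brevity).
raw = '''
{
  "cart": [
    {"qty": 1, "description": "Sprite 1 L", "productName": "Sprite 1 Liter", "price": 1.99, "productId": 9},
    {"qty": 2, "description": "Fanta 1 L", "productName": "Fanta 1 Liter", "price": 1.99, "productId": 10}
  ],
  "storeName": "My awesome shop",
  "status": 5
}
'''

def flatten_order(order_id, data):
    """Return one dict per cart item, repeating the order-level fields."""
    doc = json.loads(data)
    rows = []
    # An order with an empty or missing cart yields no rows,
    # which is also what the comma join with UNNEST does.
    for item in doc.get("cart", []):
        rows.append({
            "order_id": order_id,
            "status": doc["status"],
            "storeName": doc["storeName"],
            **item,
        })
    return rows

rows = flatten_order(1, raw)
for row in rows:
    print(row)
```

This only illustrates the shape of the result; in BigQuery the extracted values come back as strings unless they are cast.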

You can work with the JSON functions like this:
CREATE TABLE products (id INTEGER, attribs_json JSON);
INSERT INTO products VALUES (1,'{
"cart": [{
"qty": 1,
"description": "Sprite 1 L",
"productName": "Sprite 1 Liter",
"price": 1.99,
"productId": 9
}],
"storeName": "My awesome shop",
"status": 5,
"timestamp": {
"_seconds": 1590713204,
"_nanoseconds": 916000000
}
}');
select * from products where attribs_json->"$.status"
= 5 AND attribs_json->"$.storeName"
= 'My awesome shop';
id | attribs_json
-: | :---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
1 | {"cart": [{"qty": 1, "price": 1.99, "productId": 9, "description": "Sprite 1 L", "productName": "Sprite 1 Liter"}], "status": 5, "storeName": "My awesome shop", "timestamp": {"_seconds": 1590713204, "_nanoseconds": 916000000}}
select attribs_json->"$.storeName",attribs_json->"$.status",attribs_json->"$.cart[0].qty" from products where attribs_json->"$.status"
= 5 AND attribs_json->"$.storeName"
= 'My awesome shop';
attribs_json->"$.storeName" | attribs_json->"$.status" | attribs_json->"$.cart[0].qty"
:-------------------------- | :----------------------- | :----------------------------
"My awesome shop" | 5 | 1
There is also JSON_EXTRACT, available in MySQL 5.7 and above.
In the end the value is only text, so you could also use REGEXP or RLIKE.
To turn the JSON back into rows, you can use JSON_TABLE (MySQL 8.0.4 and later).

What you must do is to extract the values from the json data as:
SELECT .......
WHERE data->'$.storeName'= "My awesome shop" and data->'$.status' = 5
Extracting the 'cart' or the 'timestamp' key gives you a JSON array or object that needs further extraction to get at the data.
I hope this helps.
You probably want to have a look at the MySql documentation (https://dev.mysql.com/doc/refman/8.0/en/json.html) or https://www.mysqltutorial.org/mysql-json/.

You can use UNNEST in the FROM clause to access the cart's fields, and the JSON_EXTRACT functions in the WHERE clause to filter the rows you want. Take care whether you are accessing the JSON root or the cart array: json_data and cart_items in the example below. (By the way, in your example shopName doesn't exist, but storeName does.)
WITH
`myapp-1a602.firestore_orders.orders_raw_changelog` AS (
SELECT
'{"cart": [{"qty": 1,"description": "Sprite 1 L","productName": "Sprite 1 Liter","price": 1.99,"productId": 9}, {"qty": 11,"description": "Sprite 11 L","productName": "Sprite 11 Liter","price": 11.99,"productId": 19}],"storeName": "My awesome shop","status": 5,"timestamp": {"_seconds": 1590713204,"_nanoseconds": 916000000}}' json_data )
SELECT
JSON_EXTRACT(json_data, '$.status') AS status,
JSON_EXTRACT(json_data, '$.storeName') AS storeName,
JSON_EXTRACT(cart_items, '$.productName') AS product,
JSON_EXTRACT_SCALAR(cart_items, '$.qty') AS qty
FROM
`myapp-1a602.firestore_orders.orders_raw_changelog`,
UNNEST(JSON_EXTRACT_ARRAY(json_data, '$.cart')) AS cart_items
WHERE
JSON_EXTRACT(json_data,'$.storeName') like "\"My awesome shop\"" AND
CAST(JSON_EXTRACT_SCALAR(json_data,'$.status') AS NUMERIC) = 5
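The WHERE clause above compares against a quoted string because JSON_EXTRACT returns the value as JSON text, quotes included, while JSON_EXTRACT_SCALAR returns the bare scalar. A small Python analogy, offered only as an illustration (json.dumps and str stand in for the two BigQuery functions; they are not the same implementation):

```python
import json

doc = json.loads('{"storeName": "My awesome shop", "status": 5}')

# Analogous to JSON_EXTRACT: the value re-serialized as JSON text,
# so a string keeps its surrounding double quotes.
as_json_text = json.dumps(doc["storeName"])

# Analogous to JSON_EXTRACT_SCALAR: the bare value as a plain string.
as_scalar = str(doc["storeName"])

print(as_json_text)
print(as_scalar)
```

Filtering on the scalar form avoids having to escape the quotes in the comparison.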

Related

How to do an UNPIVOT on this json data?

I commonly have json data that is stored in BigQuery that is a key-value mapping such as the following:
id product sales_data
1 socks {"US": {"Price": 2.99, "Currency": "USD"},
"CA": {"Price": 3.04, "Currency": "CAD"}}
What I want to do is two-fold:
First, push the 'keys' into a consistent values struct
Unnest the now-consistent data
For example:
# push_keys_to_value(field, path, renamed)
# push_keys_to_value(sales_data, '$', 'Country')
id product sales_data
1 socks [{"Price": 2.99, "Currency": "USD", "Country": "US"}, {"Price": 3.04, "Currency": "CAD", "Country": "CA"}]
Now unnested:
id product sales_data
1 socks {"Price": 2.99, "Currency": "USD", "Country": "US"}
1 socks {"Price": 3.04, "Currency": "CAD", "Country": "CA"}
This is a pretty common pattern I have -- taking string (json) data and 'un-pivoting' it. How could I do this in BigQuery, and is this a common pattern?
Consider the approach below:
select id, product, country,
json_extract_scalar(_, '$.Price') Price,
json_extract_scalar(_, '$.Currency') Currency
from (
select *, regexp_extract(sales_data, r'"' || country || '": ({.*?})') _
from your_table,
unnest(`bqutil.fn.json_extract_keys`(sales_data)) country
)
if applied to sample data in your question - output is
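As a rough illustration of the same unpivot outside BigQuery, the sketch below moves each top-level key into a Country field and emits one row per key. The function name unpivot and its key_name parameter are hypothetical. Note that Python keeps Price as a number, whereas json_extract_scalar returns strings.

```python
import json

def unpivot(row_id, product, sales_data, key_name="Country"):
    """Turn {"US": {...}, "CA": {...}} into one flat row per top-level key."""
    doc = json.loads(sales_data)
    return [
        {"id": row_id, "product": product, key_name: key, **value}
        for key, value in doc.items()
    ]

sales = '{"US": {"Price": 2.99, "Currency": "USD"}, "CA": {"Price": 3.04, "Currency": "CAD"}}'
unpivoted = unpivot(1, "socks", sales)
for row in unpivoted:
    print(row)
```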

Nested json data not captured by JSON_TABLE in oracle sql

I'm using Oracle 12c(12.2) to read json data in a table.
SELECT jt.name,
jt.employee_id,
jt.company
FROM JSON_TABLE ( BFILENAME ('DB_DIR', 'vv.json')
My JSON has nested data. For one record, the nested key holds a plain value:
"past_work": "N.A"
Many of the records below it have actual values, like
"past_work": [{ "company": "XXXXX", "title": "XXXX"}]
but because the first record has no array value with opening and closing brackets [], Oracle is not capturing the nested values of the records below it.
Any idea how to capture the records below it?
Example: the actual data looks like this
SELECT
jt.company,
jt.title
FROM
JSON_TABLE(
'{
"employee_data": [
{ "employee_id": "111",
"past_work": "N/A"
},
{ "employee_id": "222",
"past_work": [
{"company": "XXXXX", "title": "XXXX"},
{"company": "YYYYY", "title": "YYYY"}
]
},
{ "employee_id": "333",
"past_work": [
{"company": "XXXXX", "title": "XXXX"},
{"company": "YYYYY", "title": "YYYY"}
]
}
]
}',
'$.past_work[*]'
COLUMNS (
company VARCHAR2(100) PATH '$.company',
title VARCHAR2(100) PATH '$.title'
)
)
AS jt
Now when I execute the above statement, I'm getting null for the company values for employee_id 333 and below.
Thanks
If past_work is supposed to be an array of past (company, title) pairs, then the proper way to encode "no history" is not to use a string value like "N/A", but instead you should use an empty array, as I show in the code below. If you do it your way, you can still extract the data, but it will be exceptionally messy. If you use JSON, use it correctly.
Also, you said you want to extract company and title. Just those? That makes no sense. Rather, you probably want to extract the employee id for each employee, along with the work history. In the work history, I add a column "for ordinality" (to show which company was first, which was second, etc.) If you don't need it, just leave it out.
To access nested columns, you must use the nested clause in the columns specification.
select employee_id, ord, company, title
from json_table(
'{
"employee_data": [
{ "employee_id": "111",
"past_work": [ ]
},
{ "employee_id": "222",
"past_work": [
{"company": "XXXXX", "title": "XXXX"},
{"company": "YYYYY", "title": "YYYY"}
]
},
{ "employee_id": "333",
"past_work": [
{"company": "XXXXX", "title": "XXXX"},
{"company": "YYYYY", "title": "YYYY"}
]
}
]
}', '$.employee_data[*]'
columns (
employee_id varchar2(10) path '$.employee_id',
nested path '$.past_work[*]'
columns (
ord for ordinality,
company varchar2(10) path '$.company',
title varchar2(10) path '$.title'
)
)
) jt
order by employee_id, ord;
Output:
EMPLOYEE_ID ORD COMPANY TITLE
----------- --- ------- -----
111
222 1 XXXXX XXXX
222 2 YYYYY YYYY
333 1 XXXXX XXXX
333 2 YYYYY YYYY
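If the feed cannot be fixed and past_work still arrives as the string "N/A" for some employees, one option is to normalize it before loading. The sketch below is only an illustration in Python, not Oracle behavior: the function name work_history is hypothetical, and treating any non-list value as an empty history is an assumption.

```python
import json

employees = json.loads('''
{
  "employee_data": [
    {"employee_id": "111", "past_work": "N/A"},
    {"employee_id": "222", "past_work": [
      {"company": "XXXXX", "title": "XXXX"},
      {"company": "YYYYY", "title": "YYYY"}
    ]},
    {"employee_id": "333", "past_work": [
      {"company": "XXXXX", "title": "XXXX"},
      {"company": "YYYYY", "title": "YYYY"}
    ]}
  ]
}
''')

def work_history(doc):
    """Return (employee_id, ord, company, title) rows, keeping employees with no history."""
    rows = []
    for emp in doc["employee_data"]:
        past = emp.get("past_work")
        # Assumption: anything that is not a non-empty list means "no history".
        if not isinstance(past, list) or not past:
            rows.append((emp["employee_id"], None, None, None))
            continue
        for ord_, job in enumerate(past, start=1):
            rows.append((emp["employee_id"], ord_, job.get("company"), job.get("title")))
    return rows

history = work_history(employees)
for row in history:
    print(row)
```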
First, the json snippet is malformed, it MUST be surrounded by {} in order to be parsable as a json object...
{"past_work": [{ "company": "XXXXX", "title": "XXXX"}]}
Then, you can tell the json parser that you want to pull the rows from the past_work element...
JSON_TABLE(<yourJsonString>, '$.past_work[*]')
The [*] tells the parser that past_work is an array, and to process that array into rows of JSON objects, rather than just return the whole array as a single JSON value.
That gives something like...
SELECT
jt.company,
jt.title
FROM
JSON_TABLE(
'{
"past_work": [
{"company": "XXXXX", "title": "XXXX"},
{"company": "YYYYY", "title": "YYYY"}
]
}',
'$.past_work[*]'
COLUMNS (
company VARCHAR2(100) PATH '$.company',
title VARCHAR2(100) PATH '$.title'
)
)
AS jt
For more details, I recommend reading the docs:
https://docs.oracle.com/database/121/SQLRF/functions092.htm#SQLRF56973
EDIT: Updated example, almost a copy and paste from the docs
Please Read The Docs!
SELECT
jt.*
FROM
JSON_TABLE(
'{
"XX_data":[
{
"employee_id": "E1",
"full_name": "E1 Admin",
"past_work": "N/A"
},
{
"employee_id": "E2",
"full_name": "E2 Admin",
"past_work": [
{"company": "E2 PW1 C", "title": "E2 PW1 T"},
{"company": "E2 PW2 C", "title": "E2 PW2 T"}
]
}
]
}',
'$.XX_data[*]'
COLUMNS (
employee_id VARCHAR2(100) PATH '$.employee_id',
full_name VARCHAR2(100) PATH '$.full_name',
past_work VARCHAR2(100) PATH '$.past_work',
NESTED PATH '$.past_work[*]'
COLUMNS (
past_work_company VARCHAR2(100) PATH '$.company',
past_work_title VARCHAR2(100) PATH '$.title'
)
)
)
AS jt
Thanks all for the Comments. Have asked product team to provide data in correct format.

Put variable inside json_extract_path_text in postgresql query

I have following select:
select json_extract_path_text(rules, 'amount', '5', 'percentage')
from promotion_rules
Sample from JSON looks like this:
{
"amount": {
"1": {
"percentage": 1
},
"2": {
"percentage": 3
},
"3": {
"percentage_below_eq": 5,
"percentage_above": 10,
"price": 20
},
"4": {
"percentage_below_eq": 10,
"percentage_above": 15,
"price": 20
}
}
}
I want to use values from other queries/tables/CTEs inside the json_extract function above instead of '5' (or achieve the same effect). How can it be done?
Here's part of the code and a fiddle with the full data; I can't put it all here because Stack Overflow tells me that my post is mostly code.
with percentages as (select pr.*, json_object_keys(rules->'amount')::INT as amount
from
promotion_rules pr
where id = 1
)
select
o.id as order_id,
json_extract_path_text(rules, 'amount', o.products_no, 'percentage') as percentage --it doesn't work this way, either with brackets
from orders o
join percentages p on p.amount = o.products_no
https://www.db-fiddle.com/f/oSQ3eW2G3kHgr3xvpHLw9Q/0
json_extract_path expects a list of text parameters.
If you want to use a column that's not text you need to cast it:
json_extract_path_text(rules, 'amount', o.products_no::text, 'percentage')
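The cast is needed because JSON object keys are always strings, so an integer column has to be converted before it can be used as a path element. The same thing can be seen in a small Python sketch (the function name percentage_for is hypothetical, and the rules sample is shortened from the question):

```python
import json

rules = json.loads('{"amount": {"1": {"percentage": 1}, "2": {"percentage": 3}}}')

def percentage_for(rules, products_no):
    """Look up the percentage for an integer product count."""
    # Keys under "amount" are the strings "1", "2", ...; an int key would never match.
    entry = rules["amount"].get(str(products_no), {})
    return entry.get("percentage")

print(percentage_for(rules, 2))
print(2 in rules["amount"])
```

A count with no matching key simply yields None here, much as json_extract_path_text returns NULL for a missing path.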

Flatten the jsonb nested array in postgresql

I have data in a table like this:
id(integer) | label(text) | value(jsonb) |
---------------|-----------------|------------------|
12345 | Education | [[{"label": "Type", "value": "Under Graduate"},{"label": "Location", "value": "New Delhi"}],[{"label": "Type", "value": "Post Graduate"}]]|
And the required output is :
id | label | value |
------|---------------------|----------------|
12345 | Education_Type_1 | Under Graduate |
12345 | Education_Location_1| New Delhi |
12345 | Education_Type_2 | Post Graduate |
Can someone please help me solve this issue that I am facing?
You can use jsonb_array_elements(your_jsonb_column). Tested on Postgres 9.6. Use json_array_elements(your_json_column) instead if the column's type is json rather than jsonb.
Table:
create table test (id int,label text, value jsonb);
Insert Statement:
insert into test values(12345,'Education','[[{"label": "Type", "value": "Under Graduate"}],[{"label": "Type", "value": "Post Graduate"}]]');
insert into test values(123456,'Education2','[[{"label": "Type2", "value": "Under Graduate2"}],[{"label": "Type2", "value": "Post Graduate2"}]]');
SQL Query:
select id, label,jsonb_array_elements(value)->0->>'value'
from test
Here ->0 takes the first element of each inner array.
->> returns the value as text, without the JSON quotes.
Output:
id label value
12345 Education Under Graduate
12345 Education Post Graduate
123456 Education2 Under Graduate2
123456 Education2 Post Graduate2
I found the solution. Thanks @Fahad Anjum. I wrote the solution on top of your solution.
SELECT
'Education_' || (jsonb_array_elements(elem)->>'label')::text || '_' || pos::text AS label, jsonb_array_elements(elem)->>'value'
FROM jsonb_array_elements(
'{"test": [
[{"label":"Type", "value": "Under Graduate"},{"label":"Location", "value": "New Delhi"},{"label":"CGPA", "value": "9.07"}],
[{"label":"Type", "value": "Post Graduate"},{"label":"Location", "value": "Bangalore"}],
[{"label":"Type", "value": "Some education 1"}]]}'::jsonb->'test'
) WITH ordinality arr(elem, pos);
Since the value column is like a multi-dimensional array of irregular dimensions, we can use a recursive query to find the solution.
The query below produces the required output.
I have populated your sample data in a CTE.
with recursive cte(id,label,value,dims) as (
select
12345,
'Education'::text,
'[
[
{"label": "Type", "value": "Under Graduate"},
{"label": "Location", "value":"New Delhi"}
],
[
{"label": "Type", "value": "Post Graduate"}
]
]'::jsonb,
jsonb_array_length('[[{"label": "Type", "value": "Under Graduate"},{"label": "Location", "value": "New Delhi"}],[{"label": "Type", "value": "Post Graduate"}]]'::jsonb)
), res(id,label,val,dims) as (
select cte.id,cte.label,l.v,cte.dims-1
from cte,lateral(
select jsonb_array_elements(cte.value) as v
) l
union all
select
res.id,res.label,l.v,res.dims-1
from res,lateral(
select jsonb_array_elements(res.val) as v
) l
where
res.dims>0
)
select
res.id,
res.val->>'value' as value,
res.label ||
'_'||
(res.val->>'label')::text ||
'_' ||
row_number() over (partition by id,label,(res.val->>'label')::text) as label
from res
where dims=0
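To make the target shape concrete, here is a minimal Python sketch of the flattening, assuming the value is always a two-level array. It numbers labels by the position of the inner array, like the WITH ORDINALITY answer above, rather than by row_number as the recursive query does. The function name flatten is hypothetical.

```python
import json

def flatten(label, value):
    """Return (label, value) pairs named <label>_<inner label>_<position of inner array>."""
    groups = json.loads(value)
    rows = []
    for pos, group in enumerate(groups, start=1):  # position of each inner array
        for entry in group:
            rows.append((f"{label}_{entry['label']}_{pos}", entry["value"]))
    return rows

value = '''[[{"label": "Type", "value": "Under Graduate"},
             {"label": "Location", "value": "New Delhi"}],
            [{"label": "Type", "value": "Post Graduate"}]]'''
flat = flatten("Education", value)
for row in flat:
    print(row)
```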

Linq to XML query to SQL

UPDATE:
I've turned my xml into a query table in coldfusion, so this may help to solve this.
So my data is:
[id] | [code] | [desc] | [supplier] | [name] | [price]
------------------------------------------------------
1 | ABCDEF | "Tst0" | "XYZ" | "Test" | 123.00
2 | ABCDXY | "Tst1" | "XYZ" | "Test" | 130.00
3 | DCBAZY | "Tst2" | "XYZ" | "Tst2" | 150.00
Now what I need is what the linq to xml query outputs below. Output should be something like (i'll write it in JSON so it's easier for me to type) this:
[{
"code": "ABCD",
"name": "Test",
"products": [
{
"id": 1,
"code": "ABCDEF",
"desc": "Tst0",
"price": 123.00
},
{
"id": 2,
"code": "ABCDXY",
"desc": "Tst1",
"price": 130.00
}
]
},
{
"code": "DCBA",
"name": "Tst2",
"products": [
{
"id": 3,
"code": "DCBAZY",
"desc": "Tst2",
"price": 150.00
}
]
}]
As you can see, the rows are grouped by the first 4 characters of 'code' and by the 'supplier' code.
Thanks
How would I convert the following LINQ to XML query to SQL?
from q in query
group q by new { Code = q.code.Substring(0, 4), Supplier = q.supplier } into g
select new
{
code = g.Key.Code,
fullcode = g.FirstOrDefault().code,
supplier = g.Key.Supplier,
name = g.FirstOrDefault().name,
products = g.Select(x => new Product { id = x.id, c = x.code, desc = string.IsNullOrEmpty(x.desc) ? "Description" : x.desc, price = x.price })
}
The best I could come up with:
SELECT c, supplier, n
FROM products
GROUP BY C, supplier, n
I'm not sure how to get the subquery in there or how to get the substring of code.
PS: this is for ColdFusion, so I guess its version of SQL might be different from MS SQL.
The easiest way is to attach a profiler to your database and see what query is generated by the LINQ to SQL engine.
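Independently of the SQL dialect, the grouping the LINQ query performs can be sketched in Python to pin down the expected result: group by the first 4 characters of code plus supplier, take fullcode and name from the first row of each group, and collect the products. The function name group_products is hypothetical, and the rows are copied from the question's table.

```python
rows = [
    {"id": 1, "code": "ABCDEF", "desc": "Tst0", "supplier": "XYZ", "name": "Test", "price": 123.00},
    {"id": 2, "code": "ABCDXY", "desc": "Tst1", "supplier": "XYZ", "name": "Test", "price": 130.00},
    {"id": 3, "code": "DCBAZY", "desc": "Tst2", "supplier": "XYZ", "name": "Tst2", "price": 150.00},
]

def group_products(rows):
    """Group rows by (first 4 chars of code, supplier), keeping first-row values per group."""
    groups = {}
    for r in rows:
        key = (r["code"][:4], r["supplier"])
        group = groups.setdefault(key, {
            "code": key[0],
            "fullcode": r["code"],   # first row of the group, like FirstOrDefault()
            "supplier": key[1],
            "name": r["name"],
            "products": [],
        })
        group["products"].append({
            "id": r["id"],
            "code": r["code"],
            "desc": r["desc"] or "Description",  # default for an empty description
            "price": r["price"],
        })
    return list(groups.values())

grouped = group_products(rows)
for g in grouped:
    print(g["code"], g["supplier"], [p["id"] for p in g["products"]])
```

In SQL the group key would typically come from a substring function on code, with the product rows fetched per group; the exact function name depends on the database behind ColdFusion.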