create a table with a column type RECORD - google-bigquery

I'm using BigQuery and I want to create a job which populates a table with RECORD-type columns.
The data will be populated by a query, so how can I write a query which returns RECORD-type columns?
Thanks!

Somehow the option proposed by Pentium10 never worked for me in the GBQ UI or the API Explorer; I might be missing something.
In the meantime, the workaround I found is shown in the example below:
SELECT location.state, location.city FROM JS(
  ( // input table
    SELECT NEST(CONCAT(state, ',', city)) AS locations
    FROM (
      SELECT state, city FROM
        (SELECT 'florida' AS state, 'miami' AS city),
        (SELECT 'california' AS state, 'la' AS city),
        (SELECT 'romania' AS state, 'transylvania' AS city)
    )
  ),
  locations, // input columns
  "[ // output schema
    {'name': 'location', 'type': 'RECORD',
     'mode': 'REPEATED',
     'fields': [
       {'name': 'state', 'type': 'STRING'},
       {'name': 'city', 'type': 'STRING'}
     ]
    }
  ]",
  "function(row, emit) { // function
    for (var i = 0; i < row.locations.length; i++) {
      var c = [];
      var x = row.locations[i].split(',');
      var t = {state: x[0], city: x[1]};
      c.push(t);
      emit({location: c});
    }
  }"
)
Please note:
you should set a destination table with "Allow Large Results" checked and "Flatten Results" unchecked.
The result from the output table is (in JSON mode):
[
  {
    "location": [
      {
        "state": "california",
        "city": "la"
      }
    ]
  },
  {
    "location": [
      {
        "state": "florida",
        "city": "miami"
      }
    ]
  },
  {
    "location": [
      {
        "state": "romania",
        "city": "transylvania"
      }
    ]
  }
]
Added to address an issue #AdiCohen had with the real example he showed in his recent comments:
Q: My query has other columns besides the record column, but when I run the query they are returned as null. How can I create a table with both types of columns?
SELECT amount, currency, location.state, location.city FROM JS(
  ( // input table
    SELECT NEST(CONCAT(state, ',', city)) AS locations,
      SUM(amount) AS amount, MAX(currency) AS currency
    FROM (
      SELECT state, city, amount, currency, ROW_NUMBER() OVER() AS grp FROM
        (SELECT 'florida' AS state, 'miami' AS city, 'coins' AS currency, 40 AS amount),
        (SELECT 'california' AS state, 'la' AS city, 'coins' AS currency, 40 AS amount),
        (SELECT 'romania' AS state, 'transylvania' AS city, 'coins' AS currency, 40 AS amount)
    ) GROUP BY grp
  ),
  amount, currency, locations, // input columns
  "[ // output schema
    {'name': 'location', 'type': 'RECORD', 'mode': 'REPEATED',
     'fields': [
       {'name': 'state', 'type': 'STRING'},
       {'name': 'city', 'type': 'STRING'}
     ] },
    { 'name': 'amount', 'type': 'INTEGER'},
    { 'name': 'currency', 'type': 'STRING'}
  ]",
  "function(row, emit) { // function
    for (var i = 0; i < row.locations.length; i++) {
      var c = [];
      var x = row.locations[i].split(',');
      var t = {state: x[0], city: x[1]};
      c.push(t);
      emit({amount: row.amount, currency: row.currency, location: c});
    }
  }"
)
Output here is:
[
  {
    "amount": "40",
    "currency": "coins",
    "location_state": "romania",
    "location_city": "transylvania"
  },
  {
    "amount": "40",
    "currency": "coins",
    "location_state": "florida",
    "location_city": "miami"
  },
  {
    "amount": "40",
    "currency": "coins",
    "location_state": "california",
    "location_city": "la"
  }
]

You need to use dot notation to produce the output as a RECORD. Example query:
select
  'florida' as country.state,
  'SFO' as country.city;
In this example, country is the record and state and city are the fields in the record.
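For completeness, if the standard SQL dialect is available to you, a RECORD column can also be produced directly with STRUCT (and an array of STRUCTs for a REPEATED record). A minimal sketch, not part of the original answer:
-- Standard SQL sketch: writing this to a destination table yields
-- `location` as a RECORD column and `locations` as a REPEATED RECORD.
SELECT
  STRUCT('florida' AS state, 'miami' AS city) AS location,
  [STRUCT('florida' AS state, 'miami' AS city),
   STRUCT('california' AS state, 'la' AS city)] AS locations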

Related

How can we replicate Hive nested structs in Snowflake?

I have nested struct data in a DB and I need to migrate it to Snowflake. How can I replicate the nested struct in Snowflake? Snowflake doesn't have a struct data type, only VARIANT.
So Google finds:
https://gist.github.com/irajhedayati/c595e349d68b7a5074da81f1b8c6eec5
which has some code like:
INSERT INTO calls_nested
SELECT
  '5' AS call_id,
  'Jack' AS name,
  45 AS age,
  named_struct('first_name', 'Joe', 'last_name', 'Doe') AS account,
  named_struct('home', '514-111-2222', 'work', '514-333-4444') AS phone_directory,
  array(
    named_struct('street', '1 Guy', 'city', 'Montreal'),
    named_struct('street', '1 McGill', 'city', 'Montreal')
  ) AS addresses;
named_struct -> object_construct
array -> array_construct
and you just keep stacking them if that's what you want.
SELECT
  '5' AS call_id,
  'Jack' AS name,
  45 AS age,
  object_construct('first_name', 'Joe', 'last_name', 'Doe') AS account,
  object_construct('home', '514-111-2222', 'work', '514-333-4444') AS phone_directory,
  array_construct(
    object_construct('street', '1 Guy', 'city', 'Montreal'),
    object_construct('street', '1 McGill', 'city', 'Montreal')
  ) AS addresses,
  object_construct('call_id', call_id, 'name', name, 'age', age, 'account', account, 'phone_directory', phone_directory, 'addresses', addresses) AS object_of_objects
;
CALL_ID: 5
NAME: Jack
AGE: 45
ACCOUNT: { "first_name": "Joe", "last_name": "Doe" }
PHONE_DIRECTORY: { "home": "514-111-2222", "work": "514-333-4444" }
ADDRESSES: [ { "city": "Montreal", "street": "1 Guy" }, { "city": "Montreal", "street": "1 McGill" } ]
OBJECT_OF_OBJECTS: { "account": { "first_name": "Joe", "last_name": "Doe" }, "addresses": [ { "city": "Montreal", "street": "1 Guy" }, { "city": "Montreal", "street": "1 McGill" } ], "age": 45, "call_id": "5", "name": "Jack", "phone_directory": { "home": "514-111-2222", "work": "514-333-4444" } }

Grouping a parent record with its child associations

Given a table Location, where each row has many associated records in a table Event, for example:
Location has columns: id, city, state
Event has columns: id, name, date, location_id
I want to perform a query that results in a structure like this (json array shown here):
[
  {
    location: {
      id: 1,
      city: 'San Francisco',
      state: 'California'
    },
    events: [
      {
        id: 1,
        name: 'Fest 1',
        date: 'March 1, 2022'
      },
      {
        id: 2,
        name: 'Fest 2',
        date: 'March 2, 2022'
      }
    ]
  },
  {
    location: {
      id: 2,
      city: 'Seattle',
      state: 'Washington'
    },
    events: [
      {
        id: 3,
        name: 'Fest 3',
        date: 'March 3, 2022'
      },
      {
        id: 4,
        name: 'Fest 4',
        date: 'March 4, 2022'
      }
    ]
  }
]
I'm struggling to achieve this. I've tried various subqueries and GROUP BY approaches, but I'm not getting what I need.
Use json_agg() to create the array and to_jsonb() to convert the rows to json:
select
  to_jsonb(location) as location,
  json_agg(to_jsonb(event)) as events
from location
left join event on event.location_id = location.id
group by 1
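If you want to try that locally, here is a minimal sketch of the assumed schema (table and column names taken from the question); running the query above against it returns one row per location with its aggregated events:
-- Assumed minimal schema and sample data, matching the names used in the answer.
create table location (id int primary key, city text, state text);
create table event (
  id int primary key,
  name text,
  date date,
  location_id int references location(id)
);

insert into location values (1, 'San Francisco', 'California'), (2, 'Seattle', 'Washington');
insert into event values
  (1, 'Fest 1', '2022-03-01', 1),
  (2, 'Fest 2', '2022-03-02', 1),
  (3, 'Fest 3', '2022-03-03', 2),
  (4, 'Fest 4', '2022-03-04', 2);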

How to parse out a JSON variable in Snowflake using SQL

So I have a dataset and it has a column which has data like this:
ID      values
0001    [
          {
            prices: {
              ia: '20K+',
              ln: '3K-10K'
            },
            formats: [
              'n',
              'ia'
            ],
            id: 'c8f4f498-1cfeaf1-455a-a5ac-310191wefw959583',
            image_url: 'file.jpg',
            slug: 'test1'
          },
          {
            prices: {
              ia: '20K+',
              ln: '3K-10K'
            },
            formats: [
              'n',
              'ia'
            ],
            id: 'c8f4f4wfwe98-1ca1-455a-a5ac-3101919wfewf59583',
            image_url: 'file.jpg',
            slug: 'test3'
          }
        ]
0002    [
          {
            prices: {
              ia: '20K+',
              ln: '3K-10K'
            },
            formats: [
              'n',
              'ia'
            ],
            id: 'c8f4feeee498-1ca1-455a-a5ac-3101919fwewf59583',
            image_url: 'file.jpg',
            slug: 'test2'
          }
        ]
All I care about in this variable is the slug, but different IDs have a different number of slugs. How could I get a new df from this data that shows every slug as well as the count?
So the result I want:
ID      slugs                 slug_count
0001    ['test1', 'test3']    2
0002    ['test2']             1
Flatten and group:
with data as (
  select $1 id, parse_json($2) j
  from values (1, '[{"slug":"a"}]'), (2, '[{"slug":"a1"},{"slug":"a2"}]')
)
select id, array_agg(x.value:slug) slugs, count(*) slug_count
from data, table(flatten(j)) x
group by id
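Adapted to the question's data, a sketch might look like the following; it assumes a table (hypothetically my_table) with an ID column and the JSON stored as text in a column named "values" (a reserved word, hence the quoting), and that the text is valid JSON. If the column is already a VARIANT, drop parse_json.
-- Sketch under the assumptions above; ::string strips the quotes from each slug.
select id,
       array_agg(x.value:slug::string) as slugs,
       count(*) as slug_count
from my_table, table(flatten(parse_json("values"))) x
group by id;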

Return SQL Server database query with nested Json

I am trying to get this kind of response when I consume my endpoint:
[
  {
    "McqID": "7EED5396-9151-4E3D-BCBF-FDB72CDD22B7",
    "Questions": [
      {
        "QuestionId": "C8440686-531D-4099-89E9-014CAF9ED054",
        "Question": "human text",
        "Difficulty": 3,
        "Answers": [
          {
            "AnswerId": "7530DCF4-B2D9-48B0-9978-0E4690EA0C34",
            "Answer": "human text2",
            "IsTrue": false
          },
          {
            "AnswerId": "5D16F17F-E205-42A5-873A-1A367924C182",
            "Answer": "human text3",
            "IsTrue": false
          },
          {
            "AnswerId": "64E78326-77C3-4628-B9E3-2E8614D63632",
            "Answer": "human text4",
            "IsTrue": false
          },
          {
            "AnswerId": "199241A9-0EF6-4F96-894A-9256B129CB1F",
            "Answer": "human text5",
            "IsTrue": true
          },
          {
            "AnswerId": "EDCCAC18-5209-4457-95F2-C91666F8A916",
            "Answer": "human text6",
            "IsTrue": false
          }
        ]
      }
    ]
  }
]
Here's my query (example):
SELECT
    Questions.QcmID AS QcmID,
    (SELECT
         Questions.id AS QuestionId,
         Questions.Intitule AS Question,
         Questions.Difficulte AS Difficulty,
         (SELECT
              Reponses.id AS AnswerId,
              Reponses.Libelle AS Answer,
              Reponses.IsTrue AS IsTrue
          FROM
              Reponses
          WHERE
              Reponses.QuestionID = Questions.id
          FOR JSON PATH) AS Answers
     FROM
         Questions
     WHERE
         Questions.QcmID = '7EED5396-9151-4E3D-BCBF-FDB72CDD22B7'
     FOR JSON PATH) AS Questions
FROM
    Questions
WHERE
    Questions.QcmID = '7EED5396-9151-4E3D-BCBF-FDB72CDD22B7'
FOR JSON PATH
I want a nested JSON representing my data, but it ends up being formatted like this (smaller example):
[
  {
    "JSON_F52E2B61-18A1-11d1-B105-00805F49916B": "[{\"QcmID\":\"7EED5396-9151-4E3D-BCBF-FDB72CDD22B7\"}]"
  }
]
I've tried everything: FOR JSON PATH, FOR JSON AUTO, JSON_QUERY, etc.
Nothing works. FOR JSON PATH doesn't seem to work with multiple nested collections.
How do I get this result?
You need to use JOINs as you normally would.
Using FOR JSON AUTO will pick up the JOIN alias; if you want more control, use FOR JSON PATH.
I'm going to give you a generic example that will be easy to map to your scenario:
Option 1 - FOR JSON AUTO:
The JOIN alias will be used as the nested collection property name.
SELECT
    ent.Id AS 'Id',
    ent.Name AS 'Name',
    ent.Age AS 'Age',
    Emails.Id AS 'Id',
    Emails.Email AS 'Email'
FROM Entities ent
LEFT JOIN EntitiesEmails Emails ON Emails.EntityId = ent.Id
FOR JSON AUTO
Option 2 - FOR JSON PATH:
You handle everything yourself; note that the inner select must return a string, here also produced with FOR JSON PATH.
SELECT
    ent.Id AS 'Id',
    ent.Name AS 'Name',
    ent.Age AS 'Age',
    Emails = (
        SELECT
            Emails.Id AS 'Id',
            Emails.Email AS 'Email'
        FROM EntitiesEmails Emails
        WHERE Emails.EntityId = ent.Id
        FOR JSON PATH
    )
FROM Entities ent
FOR JSON PATH
Both generate the same result:
[{
    "Id": 1,
    "Name": "Alex",
    "Age": 35,
    "Emails": [{
        "Id": 1,
        "Email": "abc@domain.com"
    }, {
        "Id": 2,
        "Email": "def@domain.com"
    }, {
        "Id": 3,
        "Email": "ghi@domain.net"
    }]
}, {
    "Id": 2,
    "Name": "Another Ale",
    "Age": 40,
    "Emails": [{
        "Id": 4,
        "Email": "user@skdfh.com"
    }, {
        "Id": 5,
        "Email": "asldkj@als09q834.net"
    }]
}, {
    "Id": 3,
    "Name": "John Doe",
    "Age": 33,
    "Emails": [{
        "Id": 6,
        "Email": "ooaoasdjj@ksjsk0913.org"
    }]
}, {
    "Id": 4,
    "Name": "Mario",
    "Age": 54,
    "Emails": [{}]
}]
Cheers!
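Mapped onto the schema visible in the question (Questions and Reponses, filtered by QcmID), Option 2 might look roughly like the sketch below. It assumes there is no separate MCQ header table, so the outer query groups Questions down to one row per QcmID:
SELECT
    q.QcmID AS 'McqID',
    Questions = (
        SELECT
            qi.id         AS 'QuestionId',
            qi.Intitule   AS 'Question',
            qi.Difficulte AS 'Difficulty',
            Answers = (
                SELECT
                    r.id      AS 'AnswerId',
                    r.Libelle AS 'Answer',
                    r.IsTrue  AS 'IsTrue'
                FROM Reponses r
                WHERE r.QuestionID = qi.id
                FOR JSON PATH
            )
        FROM Questions qi
        WHERE qi.QcmID = q.QcmID
        FOR JSON PATH
    )
FROM Questions q
WHERE q.QcmID = '7EED5396-9151-4E3D-BCBF-FDB72CDD22B7'
GROUP BY q.QcmID -- collapse to one row per QCM
FOR JSON PATH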

Postgres SQL query, that will group fields in associative array

Example: I have table "tickets"
id, int
client_id, int
client_name, text
Instead of the usual select ("SELECT id, client_id, client_name FROM tickets"), I need something that will give this as the result:
{
  "id": 2,
  "client": {
    "id": 31,
    "name": "Mark"
  }
}
If you'd like to use SQL for this, there is the json_build_object function:
SELECT
  json_build_object(
    'id', id,
    'client', json_build_object(
      'id', client_id,
      'name', client_name))
FROM
  tickets;
Example:
#!/usr/bin/env python
import psycopg2
import json

conn = psycopg2.connect('')
cur = conn.cursor()
cur.execute("""
    with tickets(id, client_id, client_name) as (values(1,2,'x'),(3,4,'y'))
    SELECT
      json_build_object(
        'id', id,
        'client', json_build_object(
          'id', client_id,
          'name', client_name))
    FROM
      tickets;
""")
for row in cur.fetchall():
    print row, json.dumps(row[0])
Output:
({u'client': {u'id': 2, u'name': u'x'}, u'id': 1},) {"client": {"id": 2, "name": "x"}, "id": 1}
({u'client': {u'id': 4, u'name': u'y'}, u'id': 3},) {"client": {"id": 4, "name": "y"}, "id": 3}
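If you prefer jsonb over json (for example, for indexing or the richer operator set), the same shape can be built with jsonb_build_object; a minimal variant of the query above:
SELECT
  jsonb_build_object(
    'id', id,
    'client', jsonb_build_object(
      'id', client_id,
      'name', client_name)) AS ticket
FROM
  tickets;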