I have data in this format:
[{"id":"b","type":"user"},{"id":"c","type":"system"}]
I'd like to generate a JSON message with only "id" selected, for example:
[{"id":"b"},{"id":"c"}]
So far I've only been able to split the elements and remove "type", then concatenate them back with []:
select json_array_elements_text(column1)::jsonb #- '{type}'
from (
  select '[{"id":"b","type":"user"},{"id":"c","type":"system"}]'::json as column1
) t
Is there a better way to do it (I'm sure there is)? Please help, thanks.
Edit:
There might be other properties in addition to "id" and "type" added in the future; the code should reference "id" only.
[{"id":"b","type":"user"},{"id":"c","type":"system"}, {"id":"d","type":"system", "flag":"Y"}]
I can suggest the following approach:
select array_agg(row_to_json(t.*)) from (
select id
from jsonb_to_recordset('[{"id":"b","type":"user"},{"id":"c","type":"system"}]'::jsonb) as x(id varchar, type varchar)
) t;
You can play with the SQL here.
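One note on the output: array_agg produces a PostgreSQL array of json values rather than a single JSON document. If the goal is literally [{"id":"b"},{"id":"c"}], json_agg is the closer fit:
select json_agg(row_to_json(t.*))
from (
  select id
  from jsonb_to_recordset('[{"id":"b","type":"user"},{"id":"c","type":"system"}]'::jsonb) as x(id varchar, type varchar)
) t;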
I found an answer that doesn't reference the other properties in the JSON objects.
select json_agg(json_build_object('id', id)) as id
from (
select json_array_elements('[{"id":"b","type":"user"},{"id":"c","type":"system"}, {"id":"d","type":"system", "flag":"Y"}]'::json)::json->'id' as id
) t
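For completeness, here is a jsonb spelling of the same idea, a sketch that uses jsonb_array_elements and jsonb_agg (the jsonb counterparts of the functions above) and likewise references only "id":
select jsonb_agg(jsonb_build_object('id', elem->'id')) as ids
from jsonb_array_elements(
  '[{"id":"b","type":"user"},{"id":"c","type":"system"},{"id":"d","type":"system","flag":"Y"}]'::jsonb
) as elem;
-- returns [{"id": "b"}, {"id": "c"}, {"id": "d"}]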
I want to create a table function that takes two arguments, fieldName and parameter, so that I can later use the function with other fieldName and parameter pairs. I tried multiple ways, and it seems the fieldName (a column name) is always parsed as a string in the WHERE clause. I'm wondering how to do this correctly.
CREATE OR REPLACE TABLE FUNCTION dataset.functionName(fieldName ANY TYPE, parameter ANY TYPE)
as (
  SELECT *
  FROM `dataset.table`
  WHERE format("%t", fieldName) = parameter
)
Later call the function as
SELECT *
from dataset.functionName( 'passed_qa', 'yes')
(passed_qa is a column name; assume it only has 'yes' and 'no' values)
I tried using EXECUTE IMMEDIATE and it works, but I want to know if there's a way to do this with a plain table function.
Thanks for any help!
Good news - IT IS POSSIBLE!!! (Side note: in my experience there have been very few cases where something could not be achieved in BigQuery, either directly or through a workaround.)
See example below
create or replace table function dataset.functionName(fieldName any type, parameter any type)
as (
select * from `bigquery-public-data.utility_us.us_states_area` t
where exists ( select true
from unnest(`bqutil.fn.json_extract_keys`(to_json_string(t))) key with offset
join unnest(`bqutil.fn.json_extract_values`(to_json_string(t))) value with offset
using(offset)
where key = fieldName and value = parameter
)
)
Now that the table function is created, run the query below and see the result:
select *
from dataset.functionName('state_abbreviation', 'GU')
You will get the record for Guam.
Then try below
select *
from dataset.functionName('division_code', '0')
and you will see the matching records in the output.
For details see:
https://cloud.google.com/bigquery/docs/reference/standard-sql/table-functions
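If you are curious what the community UDFs in that query return, here is a minimal illustration (a sketch, assuming the public bqutil dataset is reachable from your region; both functions take a JSON string and return an array of strings):
select
  `bqutil.fn.json_extract_keys`(to_json_string(t)) as keys,   -- the row's column names
  `bqutil.fn.json_extract_values`(to_json_string(t)) as vals  -- the corresponding values, as strings
from `bigquery-public-data.utility_us.us_states_area` t
limit 1;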
A workaround is to use a CASE expression to select the desired column. If arbitrary columns need to be supported, use Mikhail Berlyant's solution.
create or replace table function Test.HL(fieldName string, parameter any type)
as (
  select *
  from (select "1" as tmp, 2 as passed_qa) # generate dummy table
  where case
          when fieldName = "passed_qa" then format("%t", passed_qa)
          when fieldName = "tmp" then format("%t", tmp)
          else error(concat('column ', fieldName, ' not found'))
        end = parameter
)
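Calling it works the same way as above; for example, against the dummy table (note that format("%t", ...) renders the column value as text, so the parameter is compared as a string):
select *
from Test.HL('passed_qa', '2')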
I have a table that stores JSON data, and I'm using the JSON_EXISTS function in a query. Below is sample data from the column for one of the rows.
{"fields":["query.metrics.metric1.field1",
"query.metrics.metric1.field2",
"query.metrics.metric1.field3",
"query.metrics.metric2.field1",
"query.metrics.metric2.field2"]}
I want all the rows that have a particular field, so I'm trying the query below.
SELECT COUNT(*)
FROM my_table
WHERE JSON_EXISTS(fields, '$.fields[*]."query.metrics.metric1.field1"');
It does not give me any results back, and I'm not sure what I'm missing here. Please help.
Thanks
You can use the @ operator, which refers to the current occurrence within the fields array, such as
SELECT *
FROM my_table
WHERE JSON_EXISTS(fields, '$.fields?(@ == "query.metrics.metric1.field1")')
Demo
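Since your original query was a count, the same filter drops straight into it:
SELECT COUNT(*)
FROM my_table
WHERE JSON_EXISTS(fields, '$.fields?(@ == "query.metrics.metric1.field1")');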
Edit: The above works for 12cR2+. Since it doesn't work for your version (12cR1), try JSON_TABLE() instead, such as
SELECT fields
FROM my_table,
JSON_TABLE(fields, '$.fields[*]' COLUMNS ( js VARCHAR2(90) PATH '$' ))
WHERE js = 'query.metrics.metric1.field1'
Demo
I have no idea how to "pattern match" on the array element, but just parsing the whole thing and filtering does the job.
with t(x, json) as (
select 1, q'|{"fields":["a", "b"]}|' from dual union all
select 2, q'|{"fields":["query.metrics.metric1.field1","query.metrics.metric1.field2","query.metrics.metric1.field3","query.metrics.metric2.field1","query.metrics.metric2.field2"]}|' from dual
)
select t.*
from t
where exists (
select null
from json_table(
t.json,
'$.fields[*]'
columns (
array_element varchar2(100) path '$'
)
)
where array_element = 'query.metrics.metric1.field1'
);
In your code, you are accessing the field "query.metrics.metric1.field1" of an object in the fields array, and there is no such object (the elements are strings)...
I'm new to Postgres functions.
I'm trying to return part of a JSON response similar to:
"ids":[
"9f076580-b5f5-4e73-af08-54d5fc4b87c0",
"bd34cfad-53c7-4443-bf48-280e34d76881"
]
These ids are stored in the unit table. I query them as part of a subquery and then transform them into JSON with the following query:
SELECT coalesce(json_agg(row_to_json(wgc)), '[]'::json)
FROM (
SELECT
(
SELECT COALESCE(json_agg(row_to_json(ids)), '[]'::json)
FROM (SELECT json_agg(l.ids) as "id"
FROM unit l
) as ids
) as "ids",
......
FROM companies c
) AS wgc;
The problem is that this query gives me an extra object which I want to omit:
"ids":[
{
"id":[
"9f076580-b5f5-4e73-af08-54d5fc4b87c0",
"bd34cfad-53c7-4443-bf48-280e34d76881"
]
}
]
How can I omit this "id" object?
It's a bit hard to tell what your table looks like, but something like this should work:
select jsonb_build_object('ids', coalesce(jsonb_agg(id), '[]'::jsonb))
from unit
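A quick way to sanity-check it, with inline sample data standing in for the unit table:
with unit(id) as (
  values ('9f076580-b5f5-4e73-af08-54d5fc4b87c0'),
         ('bd34cfad-53c7-4443-bf48-280e34d76881')
)
select jsonb_build_object('ids', coalesce(jsonb_agg(id), '[]'::jsonb))
from unit;
-- {"ids": ["9f076580-b5f5-4e73-af08-54d5fc4b87c0", "bd34cfad-53c7-4443-bf48-280e34d76881"]}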
I think you are overcomplicating things. You only need a single nesting level to get the IDs as an array. There is no need to use row_to_json on the array of IDs. The outer row_to_json() will properly take care of that.
SELECT coalesce(json_agg(row_to_json(wgc)), '[]'::json)
FROM (
SELECT (SELECT json_agg(l.ids) FROM unit l) as ids
....
FROM companies c
) AS wgc;
The fact that the select ... from unit is not a correlated sub-query is a bit suspicious though. This means you will get the same array for each row in the companies table. I would have expected something like (select .. from unit u where u.??? = c.???) as ids
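If the ids do belong to each company, a correlated version might look like this (a sketch only; the join column unit.company_id is an assumption, since the schema isn't shown):
SELECT coalesce(json_agg(row_to_json(wgc)), '[]'::json)
FROM (
  SELECT (SELECT json_agg(u.id)
          FROM unit u
          WHERE u.company_id = c.id  -- hypothetical join column
         ) AS ids
  FROM companies c
) AS wgc;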
I don't fully understand your question. This code:
SELECT (
SELECT COALESCE(json_agg(row_to_json(ids)), '[]'::json)
FROM (SELECT json_agg(l.ids) as "id"
FROM unit l
) as ids
) as "ids"
Returns:
[{"id":["9f076580-b5f5-4e73-af08-54d5fc4b87c0", "bd34cfad-53c7-4443-bf48-280e34d76881as"]}]
which seems to be what you want.
Here is a db<>fiddle.
Something else in your query is returning a JSON object that has ids as a field; that is the part you need to change to construct the object you want.
I am trying to create a json object from my query using json_build_object as follows:
Select json_agg(json_build_object('first_name',first_name,
'last_name',last_name,
'email',email,
'date_joined',date_joined,
'verification_date',verification_date,
'zip_code',zip_code,
'city',city,
'country',country))
from users
WHERE last_name= 'xyz'
The JSON object builds fine with the number of columns shown above. However, when I add all the column names, the query hangs indefinitely and no error message is displayed. If I reduce the number of columns, it returns a proper JSON object again. Does anyone have any idea about this? Thanks
I also tried the query without json_agg, but the result is the same.
I am not sure why your query hangs - it could be that there is a limit on the number of arguments (PostgreSQL functions accept at most 100 arguments by default, which caps json_build_object at 50 key/value pairs) - but since you are building each JSON object in a trivial way (attribute names are the same as column names), try using row_to_json like this:
select json_agg(row_to_json(u.*)) from users u WHERE last_name = 'xyz';
Having tens or hundreds of args is not nice anyway.
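If you need most columns rather than all of them, another option is to build the whole row and then drop the unwanted keys, a sketch using jsonb's key-deletion operator (the excluded column names here are made up):
select jsonb_agg(to_jsonb(u) - 'internal_note' - 'password_hash')  -- hypothetical columns to exclude
from users u
where last_name = 'xyz';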
You could build several object fragments and then merge them all together:
with fragments as (
select jsonb_build_object (
'key1', key1,
'key2', key2,
'key3', key3,
-- additional keys not included for brevity
'key50', key50
) as fragment1,
jsonb_build_object (
'key51', key51,
'key52', key52,
'key53', key53,
-- additional keys not included for brevity
'key100', key100
) as fragment2
from some_table
)
select fragment1 || fragment2
from fragments;
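And if you still need the aggregated array from your original query, the final select of the statement above can aggregate the merged objects instead:
-- replaces the last select in the statement above
select jsonb_agg(fragment1 || fragment2)
from fragments;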
I am looking for a way to rename a nested column in my Avro schema. I tried the options Google has in their docs (https://cloud.google.com/bigquery/docs/manually-changing-schemas), but any time I try to alias or cast a nested structure, it doesn't work.
For example:
SELECT
* EXCEPT(user.name.first, user.name.last),
user.name.first AS user.name.firstName,
user.name.last AS user.name.lastName
FROM
mydataset.mytable
However, BigQuery doesn't allow aliasing with paths like this. Another option, which I am trying to avoid, is pulling in all my previous Avro files and converting them with Dataflow. I am hoping for a more elegant solution than that. Thanks.
You need to rebuild the structure at each level. Here's an example over some sample data:
SELECT
* REPLACE(
(SELECT AS STRUCT user.* REPLACE (
(SELECT AS STRUCT user.name.* EXCEPT (first, last),
user.name.first AS firstName,
user.name.last AS lastName
) AS name)
) AS user)
FROM (
SELECT
STRUCT(
STRUCT('elliott' AS first, '???' AS middle, 'brossard' AS last) AS name,
'Software Engineer' AS occupation
) AS user
)
The idea is to replace the user struct with a new one where name has the desired struct type, using the nested replacement/struct-construction syntax.
You have to re-build those structs. You can do something like this:
select
struct(
struct(
user.name.first as firstName,
user.name.last as lastName
) as name,
user.height as height
) as user,
address,
age
from mydataset.mytable
Once you verify the results, you can either create a new table from them or overwrite the existing table (which is essentially the workaround for renaming columns, so proceed with caution).
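For example, the overwrite step could look like this (a sketch; proceed carefully, since it rewrites the table in place and table options such as partitioning are not carried over automatically):
create or replace table mydataset.mytable as
select
  struct(
    struct(
      user.name.first as firstName,
      user.name.last as lastName
    ) as name,
    user.height as height
  ) as user,
  address,
  age
from mydataset.mytable;
Hope it helps.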