Convert struct values to row in big query - sql

I want to convert the values of a struct into independent rows.
My table looks like this:
|id | details
| 1 | {d_0:{id:'1_0'},d_1:{id:'1_1'}}
| 2 | {d_0:{id:'2_0'},d_1:{id:'2_1'}}
Expected Result (will be flattening the inner struct here)
| id |
|'1_0'|
|'1_1'|
|'2_0'|
|'2_1'|
Since I don't know how many fields there will be in details, is there any way to convert all the individual fields of the struct into independent rows?
The schema for all values in the details.d_0, details.d_1,... will be the same.
Any help or pointer to resources is appreciated.

You may use this query, which iterates over an array, to achieve your desired output:
Creating table:
CREATE TABLE `<proj_id>.<dataset>.<table>` as
WITH data AS (
SELECT "1" AS id, STRUCT(STRUCT( '1_0' as id) as d_0, STRUCT( '1_1' as id) as d_1) as details
union all SELECT "2" AS id, STRUCT(STRUCT( '2_0' as id) as d_0, STRUCT( '2_1' as id) as d_1) as details
),
tier_1 as (
select id,details.* from data
)
select * from tier_1
Actual Query:
DECLARE i INT64 DEFAULT 0;
DECLARE query_ary ARRAY<STRING> DEFAULT
ARRAY(
select concat(column_name,'.id') from `<dataset>.INFORMATION_SCHEMA.COLUMNS`
WHERE
table_name = '<your-table>' AND regexp_contains(column_name, r'd\_\d')
);
CREATE TEMP TABLE result(id STRING);
LOOP
SET i = i + 1;
IF i > ARRAY_LENGTH(query_ary) THEN
LEAVE;
END IF;
EXECUTE IMMEDIATE '''
INSERT result
SELECT ''' || query_ary[ORDINAL(i)] || ''' FROM `<proj_id>.<dataset>.<table>`
''';
END LOOP;
SELECT * FROM result;
Output:

Consider below approach
select id from your_table,
unnest(split(translate(format('%t', details), '()', ''), ', ')) id
if applied to sample data in your question as
with your_table as (
select "1" id, struct(struct('1_0' as id) as d_0, struct('1_1' as id) as d_1) details union all
select "2", struct(struct('2_0'), struct('2_1'))
)
output is
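To see how the string manipulation arrives at one value per row, here is a small Python sketch of the same three steps (render, strip parentheses, split). It assumes that format('%t', details) renders the sample struct as text like ((1_0), (1_1)); that rendering and the helper name flatten_struct_text are assumptions made for illustration, not taken from the answer.

```python
def flatten_struct_text(rendered):
    """Mimic translate(text, '()', '') followed by split(text, ', ')."""
    # Drop every parenthesis, as TRANSLATE with an empty target string does.
    stripped = rendered.translate(str.maketrans("", "", "()"))
    # Split on the ", " separator that sits between the rendered fields.
    return stripped.split(", ")


# Assumed text rendering of the details struct for each sample row.
rendered_rows = {"1": "((1_0), (1_1))", "2": "((2_0), (2_1))"}

flattened = [
    value
    for rendered in rendered_rows.values()
    for value in flatten_struct_text(rendered)
]
print(flattened)  # ['1_0', '1_1', '2_0', '2_1']
```

Because the approach works on text, a value that itself contains ", " or a parenthesis would be split or altered, so it fits best when the ids are simple tokens like the ones in the question.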

Geography function over a column

I am trying to use the st_makeline() function to create a line between each point and the next one, with all points stored in a single column.
Do I need to create another column that already holds the two points?
with t1 as(
SELECT *, ST_GEOGPOINT(cast(long as float64) , cast(lat as float64)) geometry FROM `my_table.faissal.trajets_flix`
where id = 1
order by index_loc
)
select index_loc, geometry
from t1
Here are the results
Thanks for your help
You seem to want to write this code:
https://cloud.google.com/bigquery/docs/reference/standard-sql/geography_functions#st_makeline
WITH t1 as (
SELECT *, ST_GEOGPOINT(cast(long as float64), cast(lat as float64)) geometry
FROM `my_table.faissal.trajets_flix`
-- WHERE id = 1
)
SELECT id, ST_MAKELINE(ARRAY_AGG(geometry ORDER BY index_loc)) traj
FROM t1
GROUP BY id;
with output:
When visualized on the map.
Consider also below simple and cheap option
select st_geogfromtext(format('linestring(%s)',
string_agg(long || ' ' || lat order by index_loc))
) as path
from `my_table.faissal.trajets_flix`
where id = 1
if applied to sample data in your question - output is
which is visualized as
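As a rough illustration of the text that the string_agg/format combination hands to st_geogfromtext, the Python sketch below builds the same kind of WKT string from ordered points. The coordinates and the helper name linestring_wkt are made up for the example; only the column roles (index_loc, long, lat) come from the question.

```python
def linestring_wkt(points):
    """Build a WKT LINESTRING from (index_loc, long, lat) tuples."""
    # Order by index_loc, mirroring "order by index_loc" inside string_agg.
    ordered = sorted(points, key=lambda p: p[0])
    # WKT expects "x y" pairs, i.e. longitude first, separated by commas.
    coords = ",".join(f"{lon} {lat}" for _, lon, lat in ordered)
    return f"linestring({coords})"


# Hypothetical sample rows: (index_loc, long, lat), deliberately unordered.
sample = [(2, "2.35", "48.85"), (1, "4.83", "45.76"), (3, "3.06", "50.63")]

wkt = linestring_wkt(sample)
print(wkt)  # linestring(4.83 45.76,2.35 48.85,3.06 50.63)
```

The point of this variant is that only one string is built per trajectory, with no intermediate geography value per point.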

Equivalent function in HANA DB for json_object

I would like to return the query results into json format in HANA DB.
There is a json_object function in oracle to achieve this requirement, but I am not seeing any function in HANA.
Does anyone know if this kind of function exists in HANA?
For example:
Table Author contains non-json data as follows:
---------------------------------------------
| firstName | lastName |
---------------------------------------------
| Paulo | Coelho |
| George | Orwell |
---------------------------------------------
I want to write a select statement that returns the result as JSON.
In Oracle it can be returned using query:
SELECT json_object(
KEY 'firstName' VALUE author.first_name,
KEY 'lastName' VALUE author.last_name
)
FROM author
Output looks like this:
---------------------------------------------
| json_array |
---------------------------------------------
| {"firstName":"Paulo","lastName":"Coelho"} |
| {"firstName":"George","lastName":"Orwell"} |
----------------------------------------------
Does anyone know a query or function in HANA to achieve the same result?
You can use the already mentioned function in SAP HANA too:
JSON_QUERY (
<JSON_API_common_syntax>
[ <JSON_output_clause> ]
[ <JSON_query_wrapper_behavior> ]
[ <JSON_query_empty_behavior> ON EMPTY ]
[ <JSON_query_error_behavior> ON ERROR ]
)
For 2.0 SP04 and above there's a FOR JSON addition to the SELECT statement. As the documentation says, it is only permitted in subqueries, so you need to either select individual columns in a subselect (if you need a result set of JSON objects) or generate a JSON array as a single scalar result. Column names are inherited from the subquery aliases.
Case 1:
with a as (
select 'AAA' as field1, 'Value 1' as val from dummy union all
select 'BBB' as field1, 'Value 2' as val from dummy
)
select
/*Use correlated subquery with single row*/
json_value((select a.field1, a.val from dummy for json), '$[0]') as res
from a
Or, with more typing but less dependence on the structure:
with a as (
select 'AAA' as field1, 'Value 1' as val from dummy union all
select 'BBB' as field1, 'Value 2' as val from dummy
)
, json_source as (
/*Intermediate query to use as correlation source in JSON_TABLE*/
select (select * from a for json) as tmp_json
from dummy
)
select json_parsed.*
from json_source,
json_table(
json_source.tmp_json
/*Access individual items*/
, '$[*]'
columns (
res nvarchar(1000) format json path '$'
)
) as json_parsed
Both return:
RES
{"FIELD1":"AAA","VAL":"Value 1"}
{"FIELD1":"BBB","VAL":"Value 2"}
Or as a scalar query returning JSON array (Case 2):
with a as (
select 'AAA' as field1, 'Value 1' as val from dummy union all
select 'BBB' as field1, 'Value 2' as val from dummy
)
select *
from (select * from a for json)
JSONRESULT
[{"FIELD1":"AAA","VAL":"Value 1"},{"FIELD1":"BBB","VAL":"Value 2"}]

how to convert jsonarray to multi column from hive

Example:
There is a JSON array column (type: string) in a Hive table, like:
"[{"field":"name", "value":"alice"}, {"field":"age", "value":"14"}......]"
How can I convert it into:
name age
alice 14
using Hive SQL?
I've tried lateral view explode, but it's not working.
Thanks a lot!
This is a working example of how it can be parsed in Hive. Customize it and debug it on real data; see the comments in the code:
with your_table as (
select stack(1,
1,
'[{"field":"name", "value":"alice"}, {"field":"age", "value":"14"}, {"field":"something_else", "value":"somevalue"}]'
) as (id,str) --one row table with id and string with json. Use your table instead of this example
)
select id,
max(case when field_map['field'] = 'name' then field_map['value'] end) as name,
max(case when field_map['field'] = 'age' then field_map['value'] end) as age --do the same for all fields
from
(
select t.id,
t.str as original_string,
str_to_map(regexp_replace(regexp_replace(trim(a.field),', +',','),'\\{|\\}|"','')) field_map --remove extra characters and convert to map
from your_table t
lateral view outer explode(split(regexp_replace(regexp_replace(str,'\\[|\\]',''),'\\},','}|'),'\\|')) a as field --remove [], replace "}," with "}|" and explode
) s
group by id --aggregate in single row
;
Result:
OK
id name age
1 alice 14
One more approach using get_json_object:
with your_table as (
select stack(1,
1,
'[{"field":"name", "value":"alice"}, {"field":"age", "value":"14"}, {"field":"something_else", "value":"somevalue"}]'
) as (id,str) --one row table with id and string with json. Use your table instead of this example
)
select id,
max(case when field = 'name' then value end) as name,
max(case when field = 'age' then value end) as age --do the same for all fields
from
(
select t.id,
get_json_object(trim(a.field),'$.field') field,
get_json_object(trim(a.field),'$.value') value
from your_table t
lateral view outer explode(split(regexp_replace(regexp_replace(str,'\\[|\\]',''),'\\},','}|'),'\\|')) a as field --remove [], replace "}," with "}|" and explode
) s
group by id --aggregate in single row
;
Result:
OK
id name age
1 alice 14
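When debugging either query on real data, it can help to have a reference result computed outside Hive. The Python sketch below does the same field/value pivot with the standard json module; the helper name pivot_fields is hypothetical, and the sample string is the one used in the examples above.

```python
import json


def pivot_fields(json_array, wanted):
    """Turn a JSON array of {"field", "value"} objects into one row keyed by field."""
    pairs = {item["field"]: item["value"] for item in json.loads(json_array)}
    # Fields that are absent become None, like a CASE expression yielding NULL.
    return {name: pairs.get(name) for name in wanted}


raw = (
    '[{"field":"name", "value":"alice"}, {"field":"age", "value":"14"}, '
    '{"field":"something_else", "value":"somevalue"}]'
)

row = pivot_fields(raw, ["name", "age"])
print(row)  # {'name': 'alice', 'age': '14'}
```

Unlike the regexp-based splitting, a real JSON parser is not confused by "}," or "|" appearing inside a value, which is worth keeping in mind when the two results disagree.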

How to transpose row to column in Postgres 9.5?

I have this table in input, it contains always only three rows.
| data |
--------
| X |
| Y |
| Z |
And I want this output:
| data1| data2 | data3 |
-------+-------+-------+
| X | Y | Z |
I have tried to use the crosstab function, but as far as I understand it needs more information, like a category column and a row_name column. I don't have them.
Is it possible to transpose this table?
You don't need the crosstab function to do this; just use a simple pivot query:
SELECT max( case rn when 1 then data end ) as data1,
max( case rn when 2 then data end ) as data2,
max( case rn when 3 then data end ) as data3
FROM (
SELECT *,
row_number() over ( ORDER BY data ) rn
FROM table1
) x
Demo: http://sqlfiddle.com/#!15/bead8/4
There is one pitfall in this query you need to think about.
The database table is by definition an unordered set of tuples, and hardly any database guarantees an ordering of the rows unless an ORDER BY clause is specified in the SELECT statement that queries the table.
Because of this, the query uses an ORDER BY data clause to order the rows in such a way that X is put into the data1 column, Y into data2 and Z into data3, in this order (because X < Y < Z).
You need to change this clause if you need some other order (or maybe some other column of this table to determine the order).
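The effect of the ORDER BY inside row_number() can be checked with a small self-contained run. The sketch below uses Python's built-in sqlite3 module as a stand-in for Postgres (an assumption made purely so the example is runnable; it needs SQLite 3.25 or newer for window functions) and inserts the rows out of order on purpose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (data TEXT)")
# Insert in a scrambled order to show that insertion order does not matter.
conn.executemany("INSERT INTO table1 (data) VALUES (?)", [("Z",), ("X",), ("Y",)])

query = """
SELECT max(CASE rn WHEN 1 THEN data END) AS data1,
       max(CASE rn WHEN 2 THEN data END) AS data2,
       max(CASE rn WHEN 3 THEN data END) AS data3
FROM (
    SELECT data,
           row_number() OVER (ORDER BY data) AS rn  -- defines the column order
    FROM table1
) x
"""
result = conn.execute(query).fetchone()
conn.close()
print(result)  # ('X', 'Y', 'Z')
```

Changing the ORDER BY to `data DESC` would flip the columns, which is the knob to adjust if a different ordering rule is needed.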
For fixed columns/rows count:
select
data[1] as data1,
data[2] as data2,
data[3] as data3
from
(select array_agg(data) as data from t) as t;
For variable columns/rows count (just one of many possibilities):
create function prepare_statement(in p_name text, in p_body text) returns void as $$
declare
s text;
begin
s := 'prepare ' || p_name || ' as ' || p_body;
execute s;
return;
end; $$ language plpgsql;
and then:
select prepare_statement('foo', (
select
'select ' ||
string_agg('data['||i||'] as data'||i, ', ') ||
' from (select array_agg(data) as data from t) as t'
from generate_series(1, (select count(*) from t)) n(i))
);
execute foo;
-- deallocate foo; -- to deallocate previously prepared statement
Read more about
arrays
array_agg function
prepare/execute/deallocate statements

Transform two rows in one

We have an auditing system which logs all the changes that occur in all the system tables. Basically, this is what the AuditLog table looks like:
Currently I am creating a couple of SQL views to query different kinds of information. Everything is OK except for one point. If you take a look at the image again, you will see I have a SubscriptionMetadata table, which is a key-value pair table with 2 fields (MetaName and MetaValue). What the image shows is that the subscription has a new watermark whose value is 'Licensed copy: Fullname, Company, V...'.
What I need is to transform, in my view, these two rows into just one with the following form:
41 - Insert - SubscriptionMetadata - 2012-10-19 - 53DA4XXXXXX - Watermark - Licensed copy: Fullname, Company, V...
I really cannot imagine how to do it, or even how to search for it.
There is another problem (I think): these rows always come in that order, MetaName first and then MetaValue. That's the only way to know they are related.
Could you help me, please?
While I cannot see your full table structure you can transform the data the following way. Both of these solutions will place the data in separate columns:
;with data(id, [action], [type], [date], [col], metatype, value) as
(
select 41, 'Insert', 'SubscriptionMetaData', '2012-10-19', '53DA4XXX','Metaname', 'Watermark'
union all
select 41, 'Insert', 'SubscriptionMetaData', '2012-10-19', '53DA4XXX','MetaValue', 'Licensed copy: Fullname, Company'
)
select id, action, type, date, col,
MAX(case when metatype = 'Metaname' then value end) Name,
MAX(case when metatype = 'MetaValue' then value end) Value
from data
group by id, action, type, date, col
See SQL Fiddle with Demo
Or you can use a PIVOT on the data to get the same result:
;with data(id, [action], [type], [date], [col], metatype, value) as
(
select 41, 'Insert', 'SubscriptionMetaData', '2012-10-19', '53DA4XXX','Metaname', 'Watermark'
union all
select 41, 'Insert', 'SubscriptionMetaData', '2012-10-19', '53DA4XXX','MetaValue', 'Licensed copy: Fullname, Company'
)
select *
from
(
select id, [action], [type], [date], [col], metatype, value
from data
) src
pivot
(
max(value)
for metatype in (Metaname, MetaValue)
) piv
See SQL Fiddle with Demo
Both produce the same result:
| ID | ACTION | TYPE | DATE | COL | NAME | VALUE |
-------------------------------------------------------------------------------------------------------------
| 41 | Insert | SubscriptionMetaData | 2012-10-19 | 53DA4XXX | Watermark | Licensed copy: Fullname, Company |
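The conditional-aggregation variant is plain SQL, so it can be tried without a SQL Server instance. Below is a minimal sketch using Python's built-in sqlite3 module as a stand-in engine; the table name audit and the column names log_action, log_type, log_date, meta_name and meta_value are hypothetical, chosen to avoid keyword clashes, and the two rows are the sample rows from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE audit (id INTEGER, log_action TEXT, log_type TEXT, "
    "log_date TEXT, col TEXT, metatype TEXT, value TEXT)"
)
rows = [
    (41, "Insert", "SubscriptionMetaData", "2012-10-19", "53DA4XXX",
     "Metaname", "Watermark"),
    (41, "Insert", "SubscriptionMetaData", "2012-10-19", "53DA4XXX",
     "MetaValue", "Licensed copy: Fullname, Company"),
]
conn.executemany("INSERT INTO audit VALUES (?, ?, ?, ?, ?, ?, ?)", rows)

# One output row per group; each CASE picks the value for one metatype.
merged = conn.execute("""
    SELECT id, log_action, log_type, log_date, col,
           max(CASE WHEN metatype = 'Metaname'  THEN value END) AS meta_name,
           max(CASE WHEN metatype = 'MetaValue' THEN value END) AS meta_value
    FROM audit
    GROUP BY id, log_action, log_type, log_date, col
""").fetchall()
conn.close()
print(merged)
```

Note that this relies on the grouping columns being identical for the two rows that belong together; if several name/value pairs share the same id and date, an extra pairing key would be needed.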
You can do this via a stored procedure or scalar-valued function using the coalesce function as follows:
DECLARE @Results NVARCHAR(MAX);
DECLARE @Token NVARCHAR(5) = '-'; -- separator token
SELECT @Results = coalesce(@Results + @Token,'') + t.MetaValue from (select * from TableName where MetaName = 'SubscriptionMetadata') t;
RETURN @Results; -- variable containing the concatenated values
Here is a working example. Please replace column names as required. Col3 = your string concatenating column.
SELECT t1.col1,t1.col2,
NameValue =REPLACE( (SELECT col3 AS [data()]
FROM mytable t2
WHERE t2.col1 = t1.col1
ORDER BY t2.col1
FOR XML PATH('')
), ' ', ' : ')
FROM mytable t1
GROUP BY col1,col2 ;
--DATA
1 | X | Name
1 | X | Value
--RESULTS
1 | X | Name : Value --Name Value pair here
EDIT: If you don't need concatenation (as per your comment)
SELECT t1.col1, t1.col2, t1.col3 NameColumn, t2.col3 ValueColumn
FROM (SELECT * FROM myTable WHERE col3 = 'Watermark') t1 JOIN
(SELECT * FROM myTable WHERE NOT (col3 = 'Watermark')) t2
ON t1.col1 = t2.col1