BigQuery - Transpose arrays into columns - SQL

We have a table in BigQuery like the one below.
Input table:
Name | Interests
-----+-----------
Bob  | ["a"]
Sue  | ["a","b"]
Joe  | ["b","c"]
We want to convert the above table to the format below to make it BI/visualisation friendly.
Target/Required table:
+------+---+---+---+
| Name | a | b | c |
+------+---+---+---+
| Bob  | 1 | 0 | 0 |
| Sue  | 1 | 1 | 0 |
| Joe  | 0 | 1 | 1 |
+------+---+---+---+
Note: the Interests column is an array type. Is this sort of transformation possible in BigQuery? If so, is there a reference query?
Thanks in advance!

Below is for BigQuery Standard SQL; it uses BigQuery's scripting features.
#standardSQL
CREATE TEMP TABLE ttt AS (
  SELECT name, interest
  FROM `project.dataset.table`,
  UNNEST(interests) interest
);
EXECUTE IMMEDIATE (
  SELECT """
    SELECT name, """ ||
    STRING_AGG("""MAX(IF(interest = '""" || interest || """', 1, 0)) AS """ || interest, ', ' ORDER BY interest) || """
    FROM ttt
    GROUP BY name"""
  FROM (
    SELECT DISTINCT interest
    FROM ttt
  )
);
Applied to the sample data from your question, the output is:
Name  a  b  c
Bob   1  0  0
Sue   1  1  0
Joe   0  1  1
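The pattern above - collect the distinct values, build the pivot query as a string, then execute it - is not BigQuery-specific. Below is a minimal sketch of the same two-step idea using Python's sqlite3 module; the table name and sample data mirror the question, and this is an illustration of the technique, not the answer's exact code:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ttt (name TEXT, interest TEXT);
    INSERT INTO ttt VALUES
        ('Bob', 'a'), ('Sue', 'a'), ('Sue', 'b'), ('Joe', 'b'), ('Joe', 'c');
""")

# Step 1: collect the distinct values that will become columns.
interests = [r[0] for r in conn.execute(
    "SELECT DISTINCT interest FROM ttt ORDER BY interest")]

# Step 2: build the pivot query as a string and execute it.
cols = ", ".join(
    f"MAX(CASE WHEN interest = '{i}' THEN 1 ELSE 0 END) AS {i}"
    for i in interests)
rows = conn.execute(
    f"SELECT name, {cols} FROM ttt GROUP BY name ORDER BY name").fetchall()
# rows: [('Bob', 1, 0, 0), ('Joe', 0, 1, 1), ('Sue', 1, 1, 0)]
```

BigQuery's EXECUTE IMMEDIATE plays the role of the second `conn.execute` here: it runs SQL that was itself produced by a query.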

Show the firebase event_params key-value data into the single row using Big Query

I am trying to query Firebase events stored in Google BigQuery. I have executed the following query:
SELECT * FROM `myTable` LIMIT 6
which has the following result:
+-----+----------+--------+------------------+---------------------------------+
| Row | date | name | event_params.key | event_params.value.string_value |
+-----+----------+--------+------------------+---------------------------------+
| 1 | 20200922 | Event1 | errorName | BLE_Not_connected |
| | | | appDetails | 2.2.2 |
| | | | errorDetails | iOS-Error |
+-----+----------+--------+------------------+---------------------------------+
So here row 1 has multiple entries of event_params.key, with their values in the event_params.value.string_value column. Now I want to write a BigQuery query that flattens the event_params.key values into columns, producing the result below:
+-----+----------+--------+-------------------+------------+--------------+
| Row | date     | name   | errorName         | appDetails | errorDetails |
+-----+----------+--------+-------------------+------------+--------------+
| 1   | 20200922 | Event1 | BLE_Not_connected | 2.2.2      | iOS-Error    |
+-----+----------+--------+-------------------+------------+--------------+
Could anyone help me? Thanks in advance.
Below is for BigQuery Standard SQL
EXECUTE IMMEDIATE (
  SELECT """
    SELECT date, name, """ ||
    STRING_AGG("""MAX(IF(key = '""" || key || """', value.string_value, NULL)) AS """ || key, ', ') || """
    FROM `project.dataset.table` t, t.event_params
    GROUP BY date, name"""
  FROM (
    SELECT DISTINCT key
    FROM `project.dataset.table` t, t.event_params
  )
);
Applied to the sample data from your question, the output is:
Row date name errorName appDetails errorDetails
1 20200922 Event1 BLE_Not_connected 2.2.2 iOS-Error
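This is the same dynamic-pivot pattern as the previous answer, except the aggregate keeps the matching value instead of a 0/1 flag. A small sketch of that value-pivot using Python's sqlite3, with the Firebase nesting flattened into plain columns for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE event_params (date TEXT, name TEXT, key TEXT, value TEXT);
    INSERT INTO event_params VALUES
        ('20200922', 'Event1', 'errorName',    'BLE_Not_connected'),
        ('20200922', 'Event1', 'appDetails',   '2.2.2'),
        ('20200922', 'Event1', 'errorDetails', 'iOS-Error');
""")

# Step 1: discover the keys that will become columns.
keys = [r[0] for r in conn.execute(
    "SELECT DISTINCT key FROM event_params ORDER BY key")]

# Step 2: build one MAX(CASE ...) per key and execute the pivot.
cols = ", ".join(
    f"MAX(CASE WHEN key = '{k}' THEN value END) AS {k}" for k in keys)
row = conn.execute(
    f"SELECT date, name, {cols} FROM event_params GROUP BY date, name"
).fetchone()
# Columns follow the sorted keys: appDetails, errorDetails, errorName
```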

Comma separated string to JSON object

I need to update/migrate a table IdsTable in my SQL Server database which has the following format:
+----+------------------+---------+
| id | ids | idType |
+----+------------------+---------+
| 1 | id11, id12, id13 | idType1 |
| 2 | id20 | idType2 |
+----+------------------+---------+
The ids column is a comma-separated list of ids. I need to combine the ids and idType columns into a single JSON string for each row and update the ids column with that object.
The JSON object has the following format:
{
"idType": string,
"ids": string[]
}
Final table after transforming/migrating data should be:
+----+-----------------------------------------------------+---------+
| id | ids | idType |
+----+-----------------------------------------------------+---------+
| 1 | {"idType": "idType1","ids": ["id11","id12","id13"]} | idType1 |
| 2 | {"idType": "idType2","ids": ["id20"]} | idType2 |
+----+-----------------------------------------------------+---------+
The best I've figured out so far is to get the results into a format where I could GROUP BY id to try and get the correct JSON format:
SELECT X.id, Y.value, X.idType
FROM
IdsTable AS X
CROSS APPLY STRING_SPLIT(X.ids, ',') AS Y
Which gives me the results:
+----+------+---------+
| id | ids | idType |
+----+------+---------+
| 1 | id11 | idType1 |
| 1 | id12 | idType1 |
| 1 | id13 | idType1 |
| 2 | id20 | idType2 |
+----+------+---------+
But I'm not familiar enough with SQL Server JSON to move forward.
If it's a one-off op I think I'd just do it the dirty way:
UPDATE IdsTable SET ids =
    CONCAT('{"idType": "', idType, '","ids": ["', REPLACE(ids, ', ', '","'), '"]}');
You might need to do some prep first, like if your ids column can look like:
id1,id2,id3
id4, id5, id6
id7 ,id8 , id9
etc, a series of replacements like:
UPDATE IdsTable SET ids = REPLACE(ids, ' ,', ',') WHERE ids LIKE '% ,%';
UPDATE IdsTable SET ids = REPLACE(ids, ', ', ',') WHERE ids LIKE '%, %';
Keep running those until they no longer update any records.
PS: if you've removed all the spaces around the commas, you'll need to tweak the REPLACE in the original query - I used ', ' as the needle.
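If you would rather build the JSON outside SQL (e.g. in a one-off migration script), the whole transformation is a split, a strip, and a serialise. A sketch in Python, where row_to_json is a helper name of my choosing; note that stripping each part also absorbs the irregular spacing discussed above:

```python
import json

def row_to_json(ids: str, id_type: str) -> str:
    # Split on commas and strip stray spaces (handles "id1, id2", "id1 ,id2", ...)
    id_list = [part.strip() for part in ids.split(",") if part.strip()]
    return json.dumps({"idType": id_type, "ids": id_list})

print(row_to_json("id11, id12, id13", "idType1"))
# {"idType": "idType1", "ids": ["id11", "id12", "id13"]}
```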
I found a blog post that helped me construct my answer. Note the WHERE clause correlating the subquery to the outer row, so each row gets only its own ids:
-- Create temporary table
SELECT
    [TAB].[id], [TAB].[ids],
    (
        SELECT [STRING_SPLIT_RESULTS].value AS [id], [REQ].[idType] AS [idType]
        FROM [IdsTable] AS [REQ]
        CROSS APPLY STRING_SPLIT([REQ].[ids], ',') AS [STRING_SPLIT_RESULTS]
        WHERE [REQ].[id] = [TAB].[id]
        FOR JSON PATH
    ) AS [newIds]
INTO [#TEMP_RESULTS]
FROM [IdsTable] AS [TAB]
-- Update rows
UPDATE [IdsTable]
SET [ids] = [#TEMP_RESULTS].[newIds]
FROM [#TEMP_RESULTS]
WHERE [IdsTable].[id] = [#TEMP_RESULTS].[id]
-- Drop temporary table
DROP TABLE [#TEMP_RESULTS]
This produces the new JSON in the newIds column (the original ids column is shown unmodified below for comparison):
+----+----------------+---------+------------------------------------------------------------------------------------------------------+
| id | ids | idType | newIds |
+----+----------------+---------+------------------------------------------------------------------------------------------------------+
| 1 | id11,id12,id13 | idType1 | [{"id":"id11","idType":"idType1"},{"id":"id12","idType":"idType1"},{"id":"id13","idType":"idType1"}] |
| 2 | id20 | idType2 | [{"id":"id20","idType":"idType2"}] |
+----+----------------+---------+------------------------------------------------------------------------------------------------------+
This is more verbose than I wanted, but given the table size and the number of ids stored in the ids column (which determines the size of the JSON object), it is fine for me.

SQL - left pad with zero after the '-' symbol

I am trying to left-pad with a single zero after the '-'. I checked the other answers here, but they didn't help.
Here is the table :
+---------+
| Job |
+---------+
| 3254-1 |
| 3254-25 |
| 3254-6 |
+---------+
I need to left-pad with a single zero after the '-' when the trailing value is between 1 and 9.
I want the results to be :
+---------+
| Job |
+---------+
| 3254-01 |
| 3254-25 |
| 3254-06 |
+---------+
You can use CHARINDEX(), SUBSTRING() and REPLACE() as:
CREATE TABLE Jobs(
Job VARCHAR(45)
);
INSERT INTO Jobs VALUES
('3254-1'),
('3254-25'),
('3254-6');
SELECT CASE
           WHEN CHARINDEX('-', Job, 1) + 1 < LEN(Job) THEN Job
           ELSE REPLACE(Job, '-', '-0')
       END AS Job
FROM Jobs;
Results:
+----+---------+
| | Job |
+----+---------+
| 1 | 3254-01 |
| 2 | 3254-25 |
| 3 | 3254-06 |
+----+---------+
If you want an update, I think this is the simplest method:
update t
set job = replace(job, '-', '-0')
where job like '%-_';
This problem is simplified greatly because you are only adding a single padding character.
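Outside SQL, the same single-digit padding rule is one regular-expression substitution. A sketch in Python, where pad_after_dash is a hypothetical helper name:

```python
import re

def pad_after_dash(job: str) -> str:
    # Insert a zero when exactly one digit follows the final '-'
    return re.sub(r"-(\d)$", r"-0\1", job)

jobs = ["3254-1", "3254-25", "3254-6"]
print([pad_after_dash(j) for j in jobs])
# ['3254-01', '3254-25', '3254-06']
```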
In SQL Server 2012+, the FORMAT() function can also be used:
select concat(nr1, '-', format( cast ( q2.nr2 as int ), '00')) as result
from
(
select substring(q1.str,1,charindex('-',q1.str,1)-1) as nr1,
substring(q1.str,charindex('-',q1.str,1)+1,len(q1.str)) as nr2
from
(
select '3254-1' as str union all
select '3254-25' as str union all
select '3254-6' as str
) q1
) q2;
result
------
3254-01
3254-25
3254-06

Split column values in postgres

I have one table with two columns id and data. The values in the table are as follows:
id|data|
1 |A,B |
2 |B,C |
3 |C,D |
4 |D,A |
5 |E,C |
I need the counts of A, B, C, D, E present in the table, as follows. Please note: the columns are dynamic, i.e. they depend on the values in the data column:
A|B|C|D|E|
2|2|3|2|1|
And I have written the following query:
SELECT id,s.data
FROM my_table t,
unnest(string_to_array(t.data, ',')) s(data);
The output is as follows:
id|data|
1 | A  |
1 | B  |
2 | B  |
2 | C  |
3 | C  |
3 | D  |
4 | D  |
4 | A  |
5 | E  |
5 | C  |
The following query provides the exact output you want:
SELECT *
FROM crosstab('SELECT ''Total''::text, s.data, count(*)::int
               FROM my_table t, unnest(string_to_array(t.data, '','')) s(data)
               GROUP BY s.data
               ORDER BY s.data')
AS ct(Total text, A int, B int, C int, D int, E int);
Note: the tablefunc extension must be installed first (CREATE EXTENSION IF NOT EXISTS tablefunc;).
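What the crosstab query computes is a split-then-count aggregation. For reference, the same counts can be sketched in a few lines of Python over the question's sample data:

```python
from collections import Counter

# id -> comma-separated data, inlined from the question
rows = {1: "A,B", 2: "B,C", 3: "C,D", 4: "D,A", 5: "E,C"}

# Split each value and count every element across all rows
counts = Counter(v for data in rows.values() for v in data.split(","))
print(dict(sorted(counts.items())))
# {'A': 2, 'B': 2, 'C': 3, 'D': 2, 'E': 1}
```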

How to write a query that puts all values from a column into a string (transform column to row)?

Here is the data column:
+----+-------------+
| id | data |
+----+-------------+
| 1 | max |
| 2 | linda |
| 3 | sam |
| 4 | henry |
+----+-------------+
How can I write a query that has the result:
"max, linda, sam, henry"
Much like a column-to-row transform. The above is just a simple demo; there may be 10000+ records in the data field.
For Oracle 11g+, use LISTAGG:
SELECT LISTAGG("data", ', ') WITHIN GROUP (ORDER BY "id") "data"
FROM TableName
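LISTAGG is ordered string aggregation; for reference, the equivalent over the sample rows outside the database is a sorted join (sketch):

```python
rows = [(1, "max"), (2, "linda"), (3, "sam"), (4, "henry")]

# Order by id (first tuple element), then aggregate the data column
result = ", ".join(data for _id, data in sorted(rows))
print(result)
# max, linda, sam, henry
```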