Dynamic Header in BigQuery - google-bigquery

I'm creating a dashboard where I want to show the items closed per week per category. This is a sample of my table.
Category | Opened Date | Closed Date
Sales | 06/01/2021 | 06/02/2021
Product | 06/02/2021 | 06/07/2021
Feedback | 06/07/2021 | 06/14/2021
Sales | 05/18/2021 | 05/23/2021
Product | 06/01/2021 | 06/01/2021
Feedback | 06/01/2021 | 06/05/2021
Sales | 05/21/2021 | 05/24/2021
Product | 05/21/2021 | 05/26/2021
Product | 06/01/2021 | 06/02/2021
Feedback | 05/31/2021 | 06/13/2021
Sales | 06/02/2021 | 06/06/2021
Product | 06/04/2021 | 06/07/2021
This is the result I want to achieve.
Closed category per week
Category | 05/23/2021 | 05/30/2021 | 06/06/2021 | 06/13/2021
Sales | 2 | 1 | 0 | 1
Product | 1 | 2 | 2 | 0
Feedback | 0 | 0 | 1 | 2
I tried to use SET to create the column name dynamically, but BigQuery raises an error when I use the # sign. When I use a column alias instead, the output shows the alias text literally as the column name.
SET #column_ week1 = DATE_SUBB(CURRENT_DATE, INTERVAL 7 DAYS)
Any other options to achieve this?

You can do all of this in one shot:
execute immediate (
select '''select * from (
select replace('_' || (last_day(parse_date('%m/%d/%Y', closed), week) - 6), '-', '_') week,
category
from `project.dataset.table`
)
pivot(count(*) for week in ("''' || string_agg(week, '", "') || '''"))
'''
from (select distinct replace('_' || (last_day(parse_date('%m/%d/%Y', closed), week) - 6), '-', '_') week from `project.dataset.table` order by week)
)
See also https://stackoverflow.com/a/67479622/5221944, which may help in understanding the code above.

Try EXECUTE IMMEDIATE together with the PIVOT operator:
DECLARE weeks_list STRING;
create temp table input_table as
select 'Sales' as category, '06/02/2021' as closed_date union all
select 'Product', '06/07/2021' union all
select 'Feedback', '06/14/2021' union all
select 'Sales', '05/23/2021' union all
select 'Product', '06/01/2021' union all
select 'Feedback', '06/05/2021' union all
select 'Sales', '05/24/2021' union all
select 'Product', '05/26/2021' union all
select 'Product', '06/02/2021' union all
select 'Feedback', '06/13/2021' union all
select 'Sales', '06/06/2021' union all
select 'Product', '06/07/2021';
create temp table normalized_table as select category, FORMAT_DATE('_%m_%d_%Y', DATE_TRUNC(PARSE_DATE('%m/%d/%Y', closed_date), WEEK)) as closed_week from input_table;
SET weeks_list = (select string_agg(distinct '"' || closed_week || '"') from normalized_table);
execute immediate 'SELECT * FROM (SELECT category, closed_week FROM normalized_table) PIVOT(count(*) FOR closed_week IN (' || weeks_list || '))';
P.S.
You can skip creating the temporary normalized_table and the weeks_list variable:
execute immediate (
select '''
SELECT * FROM (
SELECT
category,
FORMAT_DATE("_%m_%d_%Y", DATE_TRUNC(PARSE_DATE("%m/%d/%Y", closed_date), WEEK)) as closed_week
FROM input_table)
PIVOT(count(*) FOR closed_week IN (''' || string_agg(distinct '"' || closed_week || '"') || '''))
'''
from (
select
category,
FORMAT_DATE("_%m_%d_%Y", DATE_TRUNC(PARSE_DATE("%m/%d/%Y", closed_date), WEEK)) as closed_week
from input_table
)
);
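To sanity-check which weekly column each closed date lands in, the bucketing can be reproduced outside BigQuery. The following is a minimal Python sketch, not part of either answer. It assumes weeks start on Sunday, which is what DATE_TRUNC(..., WEEK) does by default, and the helper names week_start and pivot_counts are made up for illustration. The rows are the closed dates from input_table above.

```python
from collections import Counter
from datetime import date, datetime, timedelta


def week_start(closed: str) -> date:
    """Return the Sunday on or before a date given as MM/DD/YYYY."""
    d = datetime.strptime(closed, "%m/%d/%Y").date()
    # weekday() is 0 for Monday .. 6 for Sunday, so shift to a Sunday start.
    return d - timedelta(days=(d.weekday() + 1) % 7)


def pivot_counts(rows):
    """Count rows per (category, week start) and lay them out as a pivot."""
    counts = Counter((category, week_start(closed)) for category, closed in rows)
    weeks = sorted({week for _, week in counts})
    categories = sorted({category for category, _ in counts})
    # Counter returns 0 for combinations that never occurred.
    return weeks, {c: [counts[(c, w)] for w in weeks] for c in categories}


rows = [
    ("Sales", "06/02/2021"), ("Product", "06/07/2021"), ("Feedback", "06/14/2021"),
    ("Sales", "05/23/2021"), ("Product", "06/01/2021"), ("Feedback", "06/05/2021"),
    ("Sales", "05/24/2021"), ("Product", "05/26/2021"), ("Product", "06/02/2021"),
    ("Feedback", "06/13/2021"), ("Sales", "06/06/2021"), ("Product", "06/07/2021"),
]

weeks, table = pivot_counts(rows)
print("Category", *[w.strftime("%m/%d/%Y") for w in weeks])
for category, values in table.items():
    print(category, *values)
```

With a Sunday week start, 06/05/2021 (a Saturday) falls in the week of 05/30/2021. If the dashboard expects a different boundary, the SQL would need a different week part, for example WEEK(MONDAY).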

Related

Is it possible to replace values in row to values from the same table by the reference in Oracle SQL

There is a table that contains values used in formulas. Some are simple variables that contain no expression; others are formulas that combine simple variables. I need to find out whether a SELECT query can produce a readable formula from the aliases it contains. Each alias can also be used in other formulas.
Let's say that there are two tables:
ITEM TABLE
ID | Name | FORMULA_ID
1 | Item name 1 | f_3
2 | Item name 2 | f_26
FORMULA TABLE
ID | EXPRESSION | ALIASE | NAME
f_1 | null | var_100 | Ticket
f_2 | null | var_200 | Amount
f_3 | var_100 * var_200 | var_300 | Some description
Is there a way to write a query that returns a result like this?
ITEM_NAME | READABLE_EXPRESSION
Item name 1 | Ticket * Amount

Try this:
with items(ID,Name,Formula_Id) AS (
select 1, 'Item name 1', 'f_3' from dual union all
select 2, 'Item name 2', 'f_26' from dual
),
formulas (ID, EXPRESSION, ALIAS, NAME) as (
select 'f_1', null, 'var_100', 'Ticket' from dual union all
select 'f_2', null, 'var_200', 'Amount' from dual union all
select 'f_3', 'var_100 * var_200', 'var_300', 'Some description' from dual
),
rnformulas (id, EXPRESSION, ALIAS, NAME, rn) as (
select fm.*, row_number() over(order by id) as rn from formulas fm
),
recsubstitute( lvl, item_id, rn, expression ) as (
select 1, it.id, 0, fm.expression
from items it
join rnformulas fm on it.formula_id = fm.id
union all
select lvl+1, item_id, fm.rn, replace(r.expression, fm.alias, fm.name)
from recsubstitute r
join rnformulas fm on instr(r.expression, fm.alias) > 0 and fm.rn > r.rn
)
select item_id, expression from (
select item_id, expression, row_number() over(partition by item_id order by lvl desc, rn asc) as rn
from recsubstitute
)
where rn = 1
;
ITEM_ID EXPRESSION
---------- ------------------------------------------------------------
1 Ticket * Amount
Note that it is far from bulletproof, especially against recursive aliases.

An improved version, with another set of data, that also expands aliases which refer to other formulas:
with items(ID,Name,Formula_Id) AS (
select 1, 'Item name 1', 'f_3' from dual union all
select 2, 'Item name 2', 'f_4' from dual
),
formulas (ID, EXPRESSION, ALIAS, NAME) as (
select 'f_1', null, 'var_100', 'Ticket' from dual union all
select 'f_2', null, 'var_200', 'Amount' from dual union all
select 'f_3', 'var_100 * var_200', 'var_300', 'Some description' from dual union all
select 'f_4', 'var_300', null, 'Other description' from dual
),
rnformulas (id, EXPRESSION, ALIAS, NAME, rn) as (
select fm.*, row_number() over(order by id) as rn from formulas fm
),
recsubstitute( lvl, item_id, rn, expression ) as (
select 1, it.id, 0, fm.expression
from items it
join rnformulas fm on it.formula_id = fm.id
union all
select lvl+1, item_id, fm.rn, replace(r.expression, fm.alias, nvl(fm.expression,fm.name))
from recsubstitute r
join rnformulas fm on instr(r.expression, fm.alias) > 0
)
select item_id, expression from (
select item_id, expression, row_number() over(partition by item_id order by lvl desc, rn asc) as rn
from recsubstitute
)
where rn = 1
;
1 Ticket * Amount
2 Ticket * Amount
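
The substitution rule used above (replace an alias by its expression if it has one, otherwise by its name) can be checked outside the database. This is only a Python sketch of that rule, not a translation of the query. It assumes every alias looks like var_<digits>, and the FORMULAS dictionary and the readable function are illustrative names. Unlike the SQL, it raises an error on circular aliases instead of recursing forever.

```python
import re

# alias -> (expression, name); expression is None for simple variables.
FORMULAS = {
    "var_100": (None, "Ticket"),
    "var_200": (None, "Amount"),
    "var_300": ("var_100 * var_200", "Some description"),
}

# Assumption: aliases always have the form var_<digits>.
ALIAS_PATTERN = re.compile(r"var_\d+")


def readable(expression, formulas, _seen=frozenset()):
    """Replace each alias by its name, expanding nested formulas first."""

    def substitute(match):
        alias = match.group(0)
        if alias not in formulas:
            return alias  # unknown alias: leave it untouched
        if alias in _seen:
            raise ValueError(f"circular reference through {alias}")
        nested, name = formulas[alias]
        if nested is None:
            return name
        return readable(nested, formulas, _seen | {alias})

    return ALIAS_PATTERN.sub(substitute, expression)


print(readable("var_100 * var_200", FORMULAS))  # Ticket * Amount
print(readable("var_300", FORMULAS))            # Ticket * Amount
```

Matching whole tokens with a regex also avoids a pitfall of plain REPLACE: an alias such as var_10 would otherwise be rewritten inside var_100.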

Split the space-delimited formulas into rows. Join the expression parts to the aliases and replace the alias with the name. Join this to the item_table using LISTAGG to concatenate the rows back into a single column.
WITH formula_split AS (
SELECT DISTINCT ft.id
,level lvl
,regexp_substr(ft.expression,'[^ ]+',1,level) expression_part
FROM formula_table ft
CONNECT BY ( ft.id = ft.id
AND level <= length(ft.expression) - length(replace(ft.expression,' ')) + 1 ) START WITH ft.expression IS NOT NULL
),readable_tbl AS (
SELECT ft.id
,ft.lvl
,replace(ft.expression_part,ftn1.aliase,ftn1.name) readable_expression
FROM formula_split ft
LEFT JOIN formula_table ftn1 ON ( ft.expression_part = ftn1.aliase )
)
SELECT it.name item_name
,LISTAGG(readable_expression,' ') WITHIN GROUP(ORDER BY lvl) readable_expression
FROM item_table it
JOIN readable_tbl rt ON ( it.formula_id = rt.id )
GROUP BY it.name

With the sample data, create a CTE (calc_data) for modeling:
WITH
items (ITEM_ID, ITEM_NAME, FORMULA_ID) AS
(
Select 1, 'Item name 1', 'f_3' From Dual Union All
Select 2, 'Item name 2', 'f_26' From Dual
),
formulas (FORMULA_ID, EXPRESSION, ALIAS, ELEMENT_NAME) AS
(
Select 'f_1', null, 'var_100', 'Ticket' From Dual Union All
Select 'f_2', null, 'var_200', 'Amount' From Dual Union All
Select 'f_3', 'var_100 * var_200', 'var_300', 'Some description' From Dual
),
calc_data AS
( SELECT e.ITEM_NAME, e.FORMULA_ID, e.FORMULA, e.X, e.OPERAND, e.Y,
ROW_NUMBER() OVER(Partition By e.ITEM_NAME Order By e.FORMULA_ID) "RN", f.ELEMENT_NAME
FROM( Select CAST('.' as VARCHAR2(32)) "FORMULA", i.ITEM_NAME, f.FORMULA_ID,
SubStr(Replace(f.EXPRESSION, ' ', ''), 1, InStr(Replace(f.EXPRESSION, ' ', ''), '*') - 1) "X",
CASE
WHEN InStr(f.EXPRESSION, '+') > 0 THEN '+'
WHEN InStr(f.EXPRESSION, '-') > 0 THEN '-'
WHEN InStr(f.EXPRESSION, '*') > 0 THEN '*'
WHEN InStr(f.EXPRESSION, '/') > 0 THEN '/'
END "OPERAND",
--
SubStr(Replace(f.EXPRESSION, ' ', ''), InStr(Replace(f.EXPRESSION, ' ', ''), '*') + 1) "Y"
From formulas f
Inner Join items i ON(f.FORMULA_ID = i.FORMULA_ID)
) e
Inner Join formulas f ON(f.FORMULA_ID <> e.FORMULA_ID)
)
Main SQL with the MODEL clause:
SELECT ITEM_NAME, FORMULA
FROM ( SELECT *
FROM calc_data
MODEL
PARTITION BY (ITEM_NAME)
DIMENSION BY (RN)
MEASURES (X, OPERAND, Y, FORMULA, ELEMENT_NAME)
RULES ( FORMULA[1] = ELEMENT_NAME[1] || ' ' || OPERAND[1] || ' ' || ELEMENT_NAME[2] )
)
WHERE RN = 1
Result:
ITEM_NAME | FORMULA
Item name 1 | Amount * Ticket
Just as an option: the same result without any analytic functions, pseudocolumns, unions, etc., just selecting over and over again. Not very readable, though.
Select
i.ITEM_NAME,
REPLACE( REPLACE( (Select EXPRESSION From formulas Where FORMULA_ID = f.FORMULA_ID),
(Select Min(ALIAS) From formulas Where FORMULA_ID <> f.FORMULA_ID),
(Select ELEMENT_NAME From formulas Where FORMULA_ID <> f.FORMULA_ID And ALIAS = (Select Min(ALIAS) From formulas Where FORMULA_ID <> f.FORMULA_ID) )
) ||
REPLACE( (Select EXPRESSION From formulas Where FORMULA_ID = f.FORMULA_ID),
(Select Max(ALIAS) From formulas Where FORMULA_ID <> f.FORMULA_ID),
(Select ELEMENT_NAME From formulas Where FORMULA_ID <> f.FORMULA_ID And ALIAS = (Select Max(ALIAS) From formulas Where FORMULA_ID <> f.FORMULA_ID) )
),
(SELECT Max(ALIAS) From formulas Where FORMULA_ID <> f.FORMULA_ID ) || (Select Min(ALIAS) From formulas Where FORMULA_ID <> f.FORMULA_ID) ||
SubStr(f.EXPRESSION, InStr(f.EXPRESSION, ' ', 1, 1), (InStr(f.EXPRESSION, ' ', 1, 2) - InStr(f.EXPRESSION, ' ', 1, 1)) + 1 ), ''
) "FORMULA"
From
formulas f
Left Join
items i ON(i.FORMULA_ID = f.FORMULA_ID)
Where i.ITEM_NAME Is Not Null

Thank you all for your answers!
I've decided to create a PL/SQL function that rewrites a formula into a readable string. The function finds the variables with a regex and uses their positions to replace each variable with its name.
CREATE OR REPLACE FUNCTION READABLE_EXPRESSION(inExpression IN VARCHAR2)
RETURN VARCHAR2
IS
matchesCount INTEGER;
toReplace VARCHAR2(32767);
readableExpression VARCHAR2(32767);
selectString VARCHAR2(32767);
BEGIN
-- Aliases are assumed to look like var_<digits>; the pattern has to match
-- the whole alias so that the lookup on F.ALIASE below can find it.
matchesCount := REGEXP_COUNT(inExpression, 'var_[0-9]+');
IF matchesCount IS NOT NULL AND matchesCount > 0 THEN
readableExpression := inExpression;
FOR i in 1..matchesCount
LOOP
toReplace := substr(inExpression, REGEXP_INSTR(inExpression, 'var_[0-9]+', 1, i, 0),
REGEXP_INSTR(inExpression, 'var_[0-9]+', 1, i, 1) -
REGEXP_INSTR(inExpression, 'var_[0-9]+', 1, i, 0)
);
SELECT DISTINCT F.NAME
INTO selectString
FROM FORMULA F
WHERE F.ALIASE = toReplace FETCH FIRST 1 ROW ONLY;
readableExpression := REPLACE(readableExpression,
toReplace,
selectString
);
end loop;
end if;
return readableExpression;
END;
The function returns one result row with replaced values for each input row of FORMULA. All you need to do is join the ITEM and FORMULA tables in the SELECT:
SELECT item.name, READABLE_EXPRESSION(formula.expression)
FROM item
JOIN formula ON item.formula_id = formula.id;
Please note that the tables are fictitious so as not to reveal the actual data structure, so there might be some inaccuracies. But the general idea should be clear.

Oracle SQL: How to remove duplicate in listagg

After using listagg to combine my data, there are many duplicates that I want to remove.
Original Table
There are only 3 types of technology in total, and they appear in no specific pattern in my data. Is it possible to remove all the duplicates and keep each type only once per row?
select
NAME,
RTRIM(
REGEXP_REPLACE(
(LISTAGG(
NVL2(Membrane_column, 'Membrane, ', NULL)
|| NVL2(SNR_column, 'SNR, ', NULL)
|| NVL2(SMR_column, 'SMR, ', NULL)
) within group (ORDER BY name)),
'Membrane, |SNR, |SMR, ', '', '1', '1', 'c')
', ')
as TECHNOLOGY
from
Table A
The current table I have for now
Name | Technology
A | SNR, SMR, SMR, SNR
B | Membrane, SNR, SMR, Membrane
C | SMR, SMR, Membrane
Desired Table
Name | Technology
A | SNR, SMR
B | Membrane, SNR, SMR
C | SMR, Membrane

This could be an easy way:
select name, listagg(technology, ', ') within group (order by 1) -- or whatever order you need
from
(
select distinct name, technology
from tableA
)
group by name
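
The query above removes duplicates before aggregating. If only the already aggregated string is available, for example in application code, the same clean-up can be done on the string itself. A small Python sketch under that assumption follows; the function name dedupe_listagg is made up, and it keeps the first occurrence of each item rather than sorting.

```python
def dedupe_listagg(value: str) -> str:
    """Drop repeated items from a comma-separated string, keeping first-seen order."""
    items = (item.strip() for item in value.split(","))
    # dict preserves insertion order, so fromkeys works as an ordered set.
    return ", ".join(dict.fromkeys(item for item in items if item))


for name, technology in [
    ("A", "SNR, SMR, SMR, SNR"),
    ("B", "Membrane, SNR, SMR, Membrane"),
    ("C", "SMR, SMR, Membrane"),
]:
    print(name, "|", dedupe_listagg(technology))
```

Keeping first-seen order reproduces the desired table from the question; the SQL version orders the items by whatever the WITHIN GROUP clause specifies.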

Maybe just create the SUM of the SNR/SMR/Membrane columns, group them by name, and replace the numbers with the strings that you want to see in the output.
Query (first step ...)
select name
, sum( snr_column ), sum( smr_column ), sum( membrane_column )
from original
group by name
;
-- output
NAME SUM(SNR_COLUMN) SUM(SMR_COLUMN) SUM(MEMBRANE_COLUMN)
2 1 1 2
3 null 2 1
1 2 2 null
Replace the sums, concatenate, remove the trailing comma with RTRIM()
select
name
, rtrim(
case when sum( snr_column ) >= 1 then 'SNR, ' end
|| case when sum( smr_column ) >= 1 then 'SMR, ' end
|| case when sum( membrane_column ) >= 1 then 'Membrane' end
, ' ,'
) as technology
from original
group by name
order by name
;
-- output
NAME TECHNOLOGY
1 SNR, SMR
2 SNR, SMR, Membrane
3 SMR, Membrane
Code the CASEs in the required order.

Starting with Oracle 19c, LISTAGG supports the DISTINCT keyword, and the WITHIN GROUP clause became optional.
with a as (
select column_value as a
from table(sys.odcivarchar2list('A', 'B', 'A', 'B', 'C')) q
)
select listagg(distinct a, ',')
from a
LISTAGG(DISTINCTA,',')
----------------------
A,B,C

Show Zero if there is no record count - ORACLE SQL query

The Oracle query below lists the different errors by error_message and serial_num.
When there are no such errors, it returns a blank/null result. How can I get a row that shows zero instead? I tried NVL(error_message, 0) and COALESCE(SUM(total), 0), but did not get the desired output.
Expected output:
1 Different Errors: 0
Oracle SQL Query:
SELECT
1 as Index_Num,
CONCAT('Different Errors: ', error_message || '# ' || serial_num),
SUM(total)
FROM (
SELECT error_message, serial_num, COUNT(*) total
FROM Table1
WHERE error_message NOT LIKE '%INVALID%'
GROUP BY error_message, serial_num
)
GROUP BY error_message, serial_num

Create a CTE for the subquery and use UNION ALL with NOT EXISTS to cover the case that the CTE does not return any rows:
WITH cte AS (
SELECT error_message, serial_num, COUNT(*) total
FROM Table1
WHERE error_message NOT LIKE '%INVALID%'
GROUP BY error_message, serial_num
)
SELECT
1 as Index_Num,
CONCAT(
'Different Errors: ',
LISTAGG(error_message || '# ' || serial_num) within group (order by error_message)
),
SUM(total)
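FROM cte
-- an aggregate query without GROUP BY returns one row even when cte is empty;
-- HAVING suppresses that row so only the fallback row below is returned
HAVING COUNT(*) > 0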
UNION ALL
SELECT 1, 'Different Errors: ', 0
FROM dual
WHERE NOT EXISTS (SELECT 1 FROM cte)
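
The UNION ALL ... NOT EXISTS fallback can be tried without an Oracle instance. The sketch below runs a per-error variant of the idea against an in-memory SQLite database from Python. It is an illustration under assumptions, not the query above: SQLite has no dual and no LISTAGG, so each error gets its own row, strings are joined with ||, and the fallback row spells out "Different Errors: 0" as in the expected output. The run helper is a made-up name; table1 and its columns come from the question.

```python
import sqlite3

QUERY = """
WITH cte AS (
    SELECT error_message, serial_num, COUNT(*) AS total
    FROM table1
    WHERE error_message NOT LIKE '%INVALID%'
    GROUP BY error_message, serial_num
)
SELECT 1 AS index_num,
       'Different Errors: ' || error_message || '# ' || serial_num AS summary,
       total
FROM cte
UNION ALL
-- fallback row, produced only when the CTE returned nothing
SELECT 1, 'Different Errors: 0', 0
WHERE NOT EXISTS (SELECT 1 FROM cte)
"""


def run(rows):
    """Load rows into a throwaway table1 and return the query result."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (error_message TEXT, serial_num TEXT)")
    conn.executemany("INSERT INTO table1 VALUES (?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


print(run([]))                                      # only the fallback row
print(run([("Timeout", "S1"), ("Timeout", "S1")]))  # one real row, no fallback
```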

D'oh! Looks like I took too long. Here's another option for posterity:
SELECT
1,
CONCAT(
'Different Errors: ',
CASE
WHEN src.error_message IS NULL THEN ''
ELSE src.error_message || ' # ' || src.serial_num
END
) Summary,
COALESCE(src.total, 0) AS total
FROM dual -- Get a seed row (in case there are no rows in error table)
LEFT JOIN (
SELECT error_message, serial_num, COUNT(*) total
FROM Table1
WHERE error_message NOT LIKE '%INVALID%'
GROUP BY error_message, serial_num
) src ON 0=0

It is not exactly what you are asking for, but might prove useful. You can easily add a row with the total number of errors, using grouping sets:
SELECT 1 as Index_Num,
('Different Errors: ' || error_message || '# ' || serial_num),
COUNT(*) as total
FROM Table1
WHERE error_message NOT LIKE '%INVALID%'
GROUP BY GROUPING SETS ( (error_message, serial_num), () );
Alas, this produces the summary row even when there are errors. It occurs to me that you might find this useful.

How to find the changes happened between rows?

I have two tables that I need to find the difference between.
What's required is a table of a summary of what fields have changed (ignoring id columns). Also, I don't know which columns have changed.
e.g. Source table [fields that have changed are {name}, {location}; {id} is ignored]
id || name || location || description
1 || aaaa || ddd || abc
2 || bbbb || eee || abc
e.g. Output Table [outputting {name}, {location} as they have changed]
Table_name || Field_changed || field_was || field_now
Source table || name || aaaa || bbbb
Source table || location || ddd || eee
I have tried to use lag(); but that only gives me the columns I selected. Eventually I'd want to see all changes in all columns as I am not sure what columns are changed.
Also please note that the table has about 150 columns - so one of the biggest issues is how to find the ones that changed

As your table can contain multiple changes in a single row and it needs to be calculated in the result as multiple rows, I have created a query to incorporate them separately as follows:
WITH DATAA(ID, NAME, LOCATION, DESCRIPTION)
AS
(SELECT 1, 'aaaa', 'ddd', 'abc' FROM DUAL UNION ALL
SELECT 2, 'bbbb', 'eee', 'abc' FROM DUAL),
-- YOUR QUERY WILL START FROM HERE
CTE AS (SELECT NAME,
LAG(NAME,1) OVER (ORDER BY ID) PREV_NAME,
LOCATION,
LAG(LOCATION,1) OVER (ORDER BY ID) PREV_LOCATION,
DESCRIPTION,
LAG(DESCRIPTION,1) OVER (ORDER BY ID) PREV_DESCRIPTION
FROM DATAA)
--
SELECT
'Source table' AS TABLE_NAME,
FIELD_CHANGED,
FIELD_WAS,
FIELD_NOW
FROM
(
SELECT
'name' AS FIELD_CHANGED,
PREV_NAME AS FIELD_WAS,
NAME AS FIELD_NOW
FROM
CTE
WHERE
NAME <> PREV_NAME
UNION ALL
SELECT
'location' AS FIELD_CHANGED,
PREV_LOCATION AS FIELD_WAS,
LOCATION AS FIELD_NOW
FROM
CTE
WHERE
LOCATION <> PREV_LOCATION
UNION ALL
SELECT
'description' AS FIELD_CHANGED,
PREV_DESCRIPTION AS FIELD_WAS,
DESCRIPTION AS FIELD_NOW
FROM
CTE
WHERE
DESCRIPTION <> PREV_DESCRIPTION
);
Cheers!!
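
With about 150 columns, writing one UNION ALL branch per column by hand gets tedious, but the comparison itself does not depend on the column names. As a rough illustration of that idea only, here is a Python sketch that compares two rows fetched as dictionaries. The function name changed_fields and the dictionary row format are assumptions; both rows are expected to have the same columns.

```python
def changed_fields(previous, current, table_name="Source table", ignore=("id",)):
    """List (table, field, was, now) for every column whose value differs."""
    return [
        (table_name, field, previous[field], current[field])
        for field in previous
        if field not in ignore and previous[field] != current[field]
    ]


row_1 = {"id": 1, "name": "aaaa", "location": "ddd", "description": "abc"}
row_2 = {"id": 2, "name": "bbbb", "location": "eee", "description": "abc"}

for change in changed_fields(row_1, row_2):
    print(" || ".join(change))
```

The same loop over column names could also be used to generate the UNION ALL branches of the query above instead of typing them out.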

get count of words in column sql

After running the following queries:
SELECT * FROM table;
SELECT REGEXP_REPLACE(description || '!', '[^[:punct:]]')
FROM table;
SELECT REGEXP_REPLACE ( description, '[' || REGEXP_REPLACE ( description || '!', '[^[:punct:]]') || ']') test
FROM table;
SELECT REGEXP_REPLACE(UPPER(TEST), ' ', '#') test
FROM (SELECT REGEXP_REPLACE (description, '[' || REGEXP_REPLACE (description || '!', '[^[:punct:]]') || ']') test
FROM table);
I have a column in an Oracle table that looks like this:
TEST
---------------------------------------------
SPOKE#WITH#MR#SMITHS#ASSISTANT
EMAILED#FOR#VISIT
SCHEDULING#OFFICE#LM#FOR#VISIT
LM#FOR#VISIT
LM#FOR#VISIT
PHONE#CALL
---------------------------------------------
All of the words are separated by # characters. I would like to count the occurrences of each word, for example:
word | count
------------
LM | 3
FOR | 4
VISIT| 4
PHONE| 1
And so on. I'm new to Oracle SQL and only familiar with rudimentary MySQL commands, so any help or pointers to tutorials would be appreciated. Thank you.
Edit: there are approximately 1500 rows with about 250 unique responses that I'm trying to account for.

WITH mydata AS
( SELECT 'SPOKE#WITH#MR#SMITHS#ASSISTANT' AS str FROM dual
UNION ALL
SELECT 'EMAILED#FOR#VISIT' FROM dual
UNION ALL
SELECT 'SCHEDULING#OFFICE#LM#FOR#VISIT' FROM dual
UNION ALL
SELECT 'LM#FOR#VISIT' FROM dual
UNION ALL
SELECT 'LM#FOR#VISIT' FROM dual
UNION ALL
SELECT 'PHONE#CALL' FROM dual
),
splitted_words AS
(
SELECT REGEXP_SUBSTR(str,'[^#]+', 1, level) AS word
FROM mydata
CONNECT BY level <= LENGTH(regexp_replace(str,'[^#]')) + 1
AND PRIOR str = str
AND PRIOR sys_guid() IS NOT NULL
)
SELECT word,
COUNT(1)
FROM splitted_words
GROUP BY word;
If your table is YOUR_TABLE and the column is YOUR_COLUMN:
WITH splitted_words AS
(
SELECT REGEXP_SUBSTR(YOUR_COLUMN,'[^#]+', 1, level) AS word
FROM YOUR_TABLE
CONNECT BY level <= LENGTH(regexp_replace(YOUR_COLUMN,'[^#]')) + 1
AND PRIOR YOUR_COLUMN = YOUR_COLUMN
AND PRIOR sys_guid() IS NOT NULL
)
SELECT word,
COUNT(1)
FROM splitted_words
GROUP BY word;
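
To cross-check the counts the query should produce, the same split-and-count can be done in a few lines of Python. This is just a verification sketch over the sample strings from the question; word_counts is an illustrative name.

```python
from collections import Counter


def word_counts(values, sep="#"):
    """Split each string on the separator and count every word."""
    counts = Counter()
    for value in values:
        # Skip empty pieces caused by leading, trailing or doubled separators.
        counts.update(word for word in value.split(sep) if word)
    return counts


sample = [
    "SPOKE#WITH#MR#SMITHS#ASSISTANT",
    "EMAILED#FOR#VISIT",
    "SCHEDULING#OFFICE#LM#FOR#VISIT",
    "LM#FOR#VISIT",
    "LM#FOR#VISIT",
    "PHONE#CALL",
]

for word, count in word_counts(sample).most_common():
    print(f"{word} | {count}")
```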
