Pivot BigQuery table using multiple rows

Pivot BigQuery table using multiple rows - google-bigquery

In order to pivot my big query table, I found this code
SELECT 'SELECT id, ' ||
STRING_AGG(
'MAX(IF(key = "' || key || '", value, NULL)) as `' || key || '`'
)
|| ' FROM `project.dataset.table` GROUP BY id ORDER BY id'
FROM (
SELECT key
FROM `project.dataset.table`
GROUP BY key
ORDER BY key
But even if I apply EXECUTE IMMEDIATE function, it returns a string of the code above.
What did I missed in that function ?
Thanks for your help

Use below
EXECUTE IMMEDIATE(
SELECT 'SELECT id, ' ||
STRING_AGG(
'MAX(IF(key = "' || key || '", value, NULL)) as `' || key || '`'
)
|| ' FROM `project.dataset.table` GROUP BY id ORDER BY id'
FROM (
SELECT key
FROM `project.dataset.table`
GROUP BY key
ORDER BY key
)
);

Related

Convert ` LISTAGG` TO `XMLAGG`

How to convert LISTAGG with case statements to XMLAGG equivalent, so as to avoid the concatenation error.
#ECHO ${cols_2 ||32767||varchar2}$ --Declare variable
SELECT LISTAGG( 'MAX(CASE WHEN CATEGORY = '''||CATEGORY||''' THEN "'||"LEVEL"||'" END) AS "'||"LEVEL"||'_'||CATEGORY||'"' , ',' )
WITHIN GROUP( ORDER BY CATEGORY, "LEVEL" DESC )
INTO cols_2
FROM (
SELECT DISTINCT "LEVEL", CATEGORY
FROM temp
);
I tried this and I'm getting an error saying missing keyword
#ECHO ${cols_2 ||32767||varchar2}$ --Declare variable
select rtrim (
xmlagg (xmlelement (e, 'MAX(CASE WHEN CATEGORY = '''||CATEGORY||''' THEN "'||"LEVEL"||'" END) AS "'||LEVEL||'_'||CATEGORY||'"', ',') order by 1,2 desc).extract (
'//text()'),
', ')
INTO cols_2
FROM (
SELECT DISTINCT "LEVEL", CATEGORY
temp
);
I have tried this an declared the cols_2 as clob type :-
SELECT DBMS_XMLGEN.CONVERT (
RTRIM (
XMLAGG (XMLELEMENT (
e,
'MAX(CASE WHEN CATEGORY = '''
|| CATEGORY
|| ''' THEN "'
|| "LEVEL"
|| '" END) AS "'
|| "LEVEL"
|| '_'
|| CATEGORY
|| '"',
',')
ORDER BY 1, DESC).EXTRACT('//text()').getclobval(),','),1)
', '),
1)
INTO cols_2
FROM (SELECT DISTINCT "LEVEL", CATEGORY
FROM temp);
Yet my issue is not resolved ,Im getting an error while trying to execute it as a procedure like :-
Error in concatenation of `LISTAGG` function[Not a duplicate question]

You are getting the missing keyword error because you are most likely attempting to run the second query as a standalone query instead of in a PL/SQL block. When you are doing that, you have to remove your into cols_2 clause. That is your immediate issue that should resolve your error.
Also, based on your prior question, using the XML functions will escape your ' and " characters so you will want to make sure to unescape them back to their original characters so you can use them in your dynamic sql query like this:
SELECT DBMS_XMLGEN.CONVERT (
RTRIM (
XMLAGG (XMLELEMENT (
e,
'MAX(CASE WHEN CATEGORY = '''
|| CATEGORY
|| ''' THEN "'
|| "LEVEL"
|| '" END) AS "'
|| "LEVEL"
|| '_'
|| CATEGORY
|| '"',
',')
ORDER BY 1, 2 DESC).EXTRACT ('//text()'),
', '),
1)
--INTO cols_2
FROM (SELECT DISTINCT "LEVEL", CATEGORY
FROM temp);

Get count of rows from multiple tables Redshift SQL?

I have a redshift database that is being updated with new tables so I can't just manually list the tables I want. I want to get a count of the rows of all the tables from my query. So far I have:
select 'SELECT ''' || table_name || ''' as table_name, count(*) As con ' ||
'FROM ' || table_name ||
CASE WHEN lead(table_name) OVER (order by table_name ) IS NOT NULL
THEN ' UNION ALL ' END
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_NAME LIKE '%results%'
but when I do this I get the error:
Specified types or functions (one per INFO message) not supported on Redshift tables.
I've searched a lot but I can't seem to find a solution for my problem. Any help would be greatly appreciated. Thanks!
EDIT:
I've changed my approach to this and decided to use a for loop in R to get the row counts of each but I'm running into the issue that 'row_counts' is only saving one number, not the count of each row like I want. Here is the code:
schema <- "x"
table_prefix <- "results"
geos <- ad_districts %>% filter(geo != "geo")
row_count <- list()
i = 1
for (geo in geos){
table_name <- paste0(schema, ".", table_prefix, geo)
row_count[[i]] <- dbGetQuery(con,
paste("SELECT COUNT(*) FROM", table_name))
i = i + 1
}

Your query is doing a select * for all tables, this will take a lot of time and resources. Instead use a system table to get the same info
select name, sum(rows) as rows
from stv_tbl_perm
where name like '%results%'
group by 1

[EDIT] - I think this is the root cause - some sql functions are only supported on the leader node. Try connecting to that node and re-run your SQL.
https://docs.aws.amazon.com/redshift/latest/dg/c_sql-functions-leader-node.html
Hope this helps.
select 'select count(*) as "' || table_schema || '.' || table_name || '" from ' || table_schema || '.' || table_name || ' ;' as sql_text
from information_schema.tables
;
[EDIT - refined this a bit to generate a series of statements that can be run at once]
select rownum, case when rownum > 1 then sql_text else replace(sql_text, 'union all', '') end as sql_text
from
(
select rank() over (order by sql_text DESC) as rownum,
sql_text
from
(
select 'select ''' || table_schema || ' ' || table_name || ''' , count(*) as "' || table_schema || '.' || table_name || '" from ' || table_schema || '.' || table_name || ' union all ' as sql_text
from information_schema.tables
where table_schema = 'public'
order by table_schema, table_name
)X
)Y
order by rownum desc ;

SELECT ' Select count(*) , '''+ tablename + ''' from '+'"' + tablename +'"' +' Union ALL '
FROM pg_table_def
GROUP BY tablename
Above query eliminates any table name with space. Remove UNION ALL at the end of the query and query will be ready to be executed.

Comparing column by column between two rows in Oracle DB

I need to write a query to compare column by column (ie: find differences) between two rows in the database. For example:
row1: 10 40 sometext 24
row2: 10 25 sometext 24
After the query executed, it should shows only the fields that have difference (ie: the second field)
Here's what I have done so far:
select table1.column1, table1.column2, table1.column3, table1.column4
from table1
where somefield in (field1, field2);
The above query will show me two rows one above another like this:
10 40 sometext 24
10 25 sometext 24
Then I have to manually do the comparison and it takes a lot of time b/c the row contains a lot of column.
So again my question is: How can I write a query that will show me only the columns that have differences??
Thanks

Use UNPIVOT clause (see http://www.oracle-developer.net/display.php?id=506) to turn columns into rows, then filter out the same rows (using GROUP BY HAVING COUNT and finally use PIVOT to get rows with different columns only.

To do this easily you need to query the metadata for the table to get each row. You can use the following code as a script.
Replace the define table_name with your table name and define yes_drop_it = NO. Put your raw WHERE syntax into the where_clause. The comparison logic always compares the first two rows returned for the where clause.
whenever sqlerror exit failure rollback;
set linesize 150
define test_tab_name = tst_cf_cols
define yes_drop_it = YES
define order_by = 1, 2
define where_clause = 1 = 1
define tab_owner = user
<<clearfirst>> begin
for clearout in (
select 'drop table ' || table_name as cmd
from all_tables
where owner = &&tab_owner and table_name = upper('&&test_tab_name')
and '&&yes_drop_it' = 'YES'
) loop
execute immediate clearout.cmd;
execute immediate '
create table &&test_tab_name as
select 10 as column1, 40 as column2, ''sometext'' as column3, 24 as column4 from dual
union all
select 10 as column1, 25 as column2, ''sometext'' as column3, 24 as column4 from dual
';
end loop;
end;
/
column cfsynt format a4000 word_wrap new_value comparison_syntax
with parms as (select 'parmquery' as cte_name, 'row_a' as corr_name_1, 'row_b' as corr_name_2 from dual)
select
'select * from (select ' || LISTAGG(cfcol || ' AS cf_' || trim (to_char (column_id, '000')) || '_' || column_name
, chr(13) || ', ') WITHIN GROUP (order by column_id)
|| chr(13) || ' from (select * from parmquery where row_number = 1) ' || corr_name_1
|| chr(13) || ', (select * from parmquery where row_number = 2) ' || corr_name_2
|| chr(13) || ') where ''DIFFERENT'' IN (' || LISTAGG ('cf_' || trim (to_char (column_id, '000')) || '_' || column_name, chr(13) || ', ') within group (order by column_id) || ')'
as cfsynt
from parms, (
select
'decode (' || corr_name_1 || '.' || column_name || ', ' || corr_name_2
|| '.' || column_name || ', ''SAME'', ''DIFFERENT'')'
as cfcol,
column_name,
column_id
from
parms,
all_tab_columns
where
owner = &&tab_owner and table_name = upper ('&&test_tab_name')
);
with parmquery as (select rownum as row_number, vals.* from (
select * from &&test_tab_name
where &&where_clause
order by &&order_by
) vals
) &&comparison_syntax
;

SQL scripts to generate SQL scripts

Assume that the DBA_TAB_COLUMNS looks like this:
I'd like to write a SQL or PL/SQL script to generate following text:
select 'NULL' as A1, B1, QUERY, RECORD_KEY from SMHIST.probsummarym1
union all
select 'NULL' as A1, 'NULL' as B1, QUERY, RECORD_KEY from SMHIST_EIT200.probsummarym1
union all
select A1, 'NULL' as B1, QUERY, RECORD_KEY from SMHIST_EIT300.probsummarym1
the requirements are:
If the table under any of the SMHIST% schemas do not have that column, then insert a default NULL alias for that columns.
the column list is in alphabetical order.
so can anybody tell me how to write this script?

EDIT: Added better alias names and en explicit CROSS JOIN. Added XMLAGG version.
NB: LISTAGG exists from Oracle version 11.2 and onwards and returns VARCHAR2. If the output string is larger than 4000K or if on a prior version you can use XMLAGG which is a bit more cumbersome to work with (eg. http://psoug.org/definition/xmlagg.htm).
With LISTAGG (returning VARCHAR2):
SELECT LISTAGG (line,
CHR (13) || CHR (10) || 'union all' || CHR (13) || CHR (10))
WITHIN GROUP (ORDER BY sortorder)
script
FROM (SELECT line, ROWNUM sortorder
FROM ( SELECT 'select '
|| LISTAGG (
CASE
WHEN tc.column_name IS NULL
THEN
'''NULL'' as '
END
|| col_join.column_name,
', ')
WITHIN GROUP (ORDER BY col_join.column_name)
|| ' from '
|| col_join.owner
|| '.'
|| col_join.table_name
line
FROM dba_tab_columns tc
RIGHT OUTER JOIN
(SELECT DISTINCT
owner, table_name, col_list.column_name
FROM dba_tab_columns
CROSS JOIN
(SELECT DISTINCT column_name
FROM dba_tab_columns
WHERE owner LIKE 'SMHIST%') col_list
WHERE owner LIKE 'SMHIST%') col_join
ON tc.owner = col_join.owner
AND tc.table_name = col_join.table_name
AND tc.column_name = col_join.column_name
GROUP BY col_join.owner, col_join.table_name
ORDER BY col_join.owner, col_join.table_name))
With XMLAGG (returning CLOB by adding .getclobval (), note: RTRIM works here because table names cannot include ',' and ' ' (space)):
SELECT REPLACE (SUBSTR (script, 1, LENGTH (script) - 12),
'&' || 'apos;',
'''')
FROM (SELECT XMLAGG (
XMLELEMENT (
e,
line,
CHR (13)
|| CHR (10)
|| 'union all'
|| CHR (13)
|| CHR (10))).EXTRACT ('//text()').getclobval ()
script
FROM (SELECT line, ROWNUM sortorder
FROM ( SELECT 'select '
|| RTRIM (
REPLACE (
XMLAGG (XMLELEMENT (
e,
CASE
WHEN tc.column_name
IS NULL
THEN
'''NULL'' as '
END
|| col_join.column_name,
', ') ORDER BY
col_join.column_name).EXTRACT (
'//text()').getclobval (),
'&' || 'apos;',
''''),
', ')
|| ' from '
|| col_join.owner
|| '.'
|| col_join.table_name
line
FROM dba_tab_columns tc
RIGHT OUTER JOIN
(SELECT DISTINCT
owner,
table_name,
col_list.column_name
FROM dba_tab_columns
CROSS JOIN
(SELECT DISTINCT column_name
FROM dba_tab_columns
WHERE owner LIKE 'SMHIST%') col_list
WHERE owner LIKE 'SMHIST%') col_join
ON tc.owner = col_join.owner
AND tc.table_name = col_join.table_name
AND tc.column_name = col_join.column_name
GROUP BY col_join.owner, col_join.table_name
ORDER BY col_join.owner, col_join.table_name)))

How to sort data in Oracle SQL with sub query and wm_concat

Below is a sub-query of a bigger query, what I am trying to do is to get last 5 documents sorted by SL_DT in descending.
I always get an error that the right parenthesis is missing, I have also considered using row_number() over (order by pa.last_modified_date desc) but it doesn't work.
SELECT REPLACE (
wm_concat( SL_TXN_CODE
|| ' - '
|| SL_NO
|| '('
|| SL_DT
|| ') - '
|| SUM (SL_QTY)),
',',
' ,'
)
FROM STK_LEDGER
WHERE ROWNUM <= 5
AND SL_ITEM_CODE =
(SELECT IDH_ITEM_CODE
FROM AA_ITEM_DEFINATION_HEAD
WHERE IDH_SUPP_BC_1 = '111' OR IDH_ITEM_CODE = '111')
AND SL_TXN_TYPE IN ('SARTN', 'GRN', 'LTRFI')
AND SL_LOCN_CODE NOT IN ('D2', 'D4', 'D5')
GROUP BY SL_TXN_CODE, SL_NO, SL_DT
ORDER BY SL_DT DESC
Please suggest the best way to sort SL_DT in descending and getting the 5 records only. As you can see that I need all data in one single field.
The database is Oracle 10g.
Thanks in advance.

SELECT VALUE
FROM (SELECT VALUE, ROWNUM AS ROW_NUM
FROM (SELECT REPLACE(WM_CONCAT(SL_TXN_CODE || ' - ' || SL_NO || '(' ||
SL_DT || ') - ' || SUM(SL_QTY)),
',',
' ,') AS VALUE
FROM STK_LEDGER
WHERE SL_ITEM_CODE =
(SELECT IDH_ITEM_CODE
FROM AA_ITEM_DEFINATION_HEAD
WHERE IDH_SUPP_BC_1 = '111'
OR IDH_ITEM_CODE = '111')
AND SL_TXN_TYPE IN ('SARTN', 'GRN', 'LTRFI')
AND SL_LOCN_CODE NOT IN ('D2', 'D4', 'D5')
GROUP BY SL_TXN_CODE, SL_NO, SL_DT
ORDER BY SL_DT DESC))
WHERE ROW_NUM <= 5

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

Pivot BigQuery table using multiple rows - google-bigquery

Use below EXECUTE IMMEDIATE( SELECT 'SELECT id, ' || STRING_AGG( 'MAX(IF(key = "' || key || '", value, NULL)) as `' || key || '`' ) || ' FROM `project.dataset.table` GROUP BY id ORDER BY id' FROM ( SELECT key FROM `project.dataset.table` GROUP BY key ORDER BY key ) );

Related

Convert ` LISTAGG` TO `XMLAGG`

Get count of rows from multiple tables Redshift SQL?

Comparing column by column between two rows in Oracle DB

SQL scripts to generate SQL scripts

How to sort data in Oracle SQL with sub query and wm_concat

Categories

Resources