PostgreSQL convert columns to rows? Transpose? - sql

I have a PostgreSQL function (or table) which gives me the following output:
Sl.no username Designation salary etc..
1 A XYZ 10000 ...
2 B RTS 50000 ...
3 C QWE 20000 ...
4 D HGD 34343 ...
Now I want the Output as below:
Sl.no 1 2 3 4 ...
Username A B C D ...
Designation XYZ RTS QWE HGD ...
Salary 10000 50000 20000 34343 ...
How to do this?

SELECT
unnest(array['Sl.no', 'username', 'Designation','salary']) AS "Columns",
unnest(array[Sl.no, username, value3Count,salary]) AS "Values"
FROM view_name
ORDER BY "Columns"
Reference : convertingColumnsToRows

Basing my answer on a table of the form:
CREATE TABLE tbl (
sl_no int
, username text
, designation text
, salary int
);
Each row results in a new column to return. With a dynamic return type like this, it's hardly possible to make this completely dynamic with a single call to the database. Demonstrating solutions with two steps:
Generate query
Execute generated query
Generally, this is limited by the maximum number of columns a table can hold. So not an option for tables with more than 1600 rows (or fewer). Details:
What is the maximum number of columns in a PostgreSQL select query
Postgres 9.4+
Dynamic solution with crosstab()
Use the first one you can. Beats the rest.
SELECT 'SELECT *
FROM crosstab(
$ct$SELECT u.attnum, t.rn, u.val
FROM (SELECT row_number() OVER () AS rn, * FROM '
|| attrelid::regclass || ') t
, unnest(ARRAY[' || string_agg(quote_ident(attname)
|| '::text', ',') || '])
WITH ORDINALITY u(val, attnum)
ORDER BY 1, 2$ct$
) t (attnum bigint, '
|| (SELECT string_agg('r'|| rn ||' text', ', ')
FROM (SELECT row_number() OVER () AS rn FROM tbl) t)
|| ')' AS sql
FROM pg_attribute
WHERE attrelid = 'tbl'::regclass
AND attnum > 0
AND NOT attisdropped
GROUP BY attrelid;
Operating with attnum instead of actual column names. Simpler and faster. Join the result to pg_attribute once more or integrate column names like in the pg 9.3 example.
Generates a query of the form:
SELECT *
FROM crosstab(
$ct$
SELECT u.attnum, t.rn, u.val
FROM (SELECT row_number() OVER () AS rn, * FROM tbl) t
, unnest(ARRAY[sl_no::text,username::text,designation::text,salary::text]) WITH ORDINALITY u(val, attnum)
ORDER BY 1, 2$ct$
) t (attnum bigint, r1 text, r2 text, r3 text, r4 text);
This uses a whole range of advanced features. Just too much to explain.
Simple solution with unnest()
One unnest() can now take multiple arrays to unnest in parallel.
SELECT 'SELECT * FROM unnest(
''{sl_no, username, designation, salary}''::text[]
, ' || string_agg(quote_literal(ARRAY[sl_no::text, username::text, designation::text, salary::text])
|| '::text[]', E'\n, ')
|| E') \n AS t(col,' || string_agg('row' || sl_no, ',') || ')' AS sql
FROM tbl;
Result:
SELECT * FROM unnest(
'{sl_no, username, designation, salary}'::text[]
,'{10,Joe,Music,1234}'::text[]
,'{11,Bob,Movie,2345}'::text[]
,'{12,Dave,Theatre,2356}'::text[])
AS t(col,row1,row2,row3,row4);
db<>fiddle here
Old sqlfiddle
Postgres 9.3 or older
Dynamic solution with crosstab()
Completely dynamic, works for any table. Provide the table name in two places:
SELECT 'SELECT *
FROM crosstab(
''SELECT unnest(''' || quote_literal(array_agg(attname))
|| '''::text[]) AS col
, row_number() OVER ()
, unnest(ARRAY[' || string_agg(quote_ident(attname)
|| '::text', ',') || ']) AS val
FROM ' || attrelid::regclass || '
ORDER BY generate_series(1,' || count(*) || '), 2''
) t (col text, '
|| (SELECT string_agg('r'|| rn ||' text', ',')
FROM (SELECT row_number() OVER () AS rn FROM tbl) t)
|| ')' AS sql
FROM pg_attribute
WHERE attrelid = 'tbl'::regclass
AND attnum > 0
AND NOT attisdropped
GROUP BY attrelid;
Could be wrapped into a function with a single parameter ...
Generates a query of the form:
SELECT *
FROM crosstab(
'SELECT unnest(''{sl_no,username,designation,salary}''::text[]) AS col
, row_number() OVER ()
, unnest(ARRAY[sl_no::text,username::text,designation::text,salary::text]) AS val
FROM tbl
ORDER BY generate_series(1,4), 2'
) t (col text, r1 text,r2 text,r3 text,r4 text);
Produces the desired result:
col r1 r2 r3 r4
-----------------------------------
sl_no 1 2 3 4
username A B C D
designation XYZ RTS QWE HGD
salary 10000 50000 20000 34343
Simple solution with unnest()
SELECT 'SELECT unnest(''{sl_no, username, designation, salary}''::text[] AS col)
, ' || string_agg('unnest('
|| quote_literal(ARRAY[sl_no::text, username::text, designation::text, salary::text])
|| '::text[]) AS row' || sl_no, E'\n , ') AS sql
FROM tbl;
Slow for tables with more than a couple of columns.
Generates a query of the form:
SELECT unnest('{sl_no, username, designation, salary}'::text[]) AS col
, unnest('{10,Joe,Music,1234}'::text[]) AS row1
, unnest('{11,Bob,Movie,2345}'::text[]) AS row2
, unnest('{12,Dave,Theatre,2356}'::text[]) AS row3
, unnest('{4,D,HGD,34343}'::text[]) AS row4
Same result.

If (like me) you were needing this information from a bash script, note there is a simple command-line switch for psql to tell it to output table columns as rows:
psql mydbname -x -A -F= -c "SELECT * FROM foo WHERE id=123"
The -x option is the key to getting psql to output columns as rows.

I have a simpler approach than Erwin pointed above, that worked for me with Postgres (and I think that it should work with all major relational databases whose support SQL standard)
You can use simply UNION instead of crosstab:
SELECT text 'a' AS "text" UNION SELECT 'b';
text
------
a
b
(2 rows)
Of course that depends on the case in which you are going to apply this. Considering that you know beforehand what fields you need, you can take this approach even for querying different tables. I.e.:
SELECT 'My first metric' as name, count(*) as total from first_table UNION
SELECT 'My second metric' as name, count(*) as total from second_table
name | Total
------------------|--------
My first metric | 10
My second metric | 20
(2 rows)
It's a more maintainable approach, IMHO. Look at this page for more information: https://www.postgresql.org/docs/current/typeconv-union-case.html

There is no proper way to do this in plain SQL or PL/pgSQL.
It will be way better to do this in the application, that gets the data from the DB.

Related

How to merge two tables with different column number in Snowflake?

I am querying TABLE_SCHEMA,TABLE_NAME,CREATED,LAST_ALTERED columns from Snowflake information schema. VIEWS. Next, I would like to MERGE that table with row count for the view. Below are my queries I am running in Snowflake my issue is I am not sure how to combine these two table in 1 table ?
Note: I am new to Snowflake. Please provide code with explanation.
Thanks in advance for help!
Query 1
SELECT TABLE_SCHEMA,TABLE_NAME,CREATED,LAST_ALTERED FROM DB.SCHEMA.VIEWS
WHERE TABLE_SCHEMA="MY_SHEMA" AND TABLE_NAME IN ('VIEW_TABLE1','VIEW_TABLE2','VIEW_TABLE3')
Query 2
SELECT COUNT(*) FROM DB.SCHEMA.VIEW_TABLE1
UNION ALL SELECT COUNT(*) FROM DB.SCHEMA.VIEW_TABLE2
To get result of the COUNT(*) needs to be built dynamically and attached to the "driving query".
Sample data:
CREATE VIEW VIEW_TABLE1(c)
AS
SELECT 1;
CREATE VIEW VIEW_TABLE2(e)
AS
SELECT 2 UNION ALL SELECT 4;
CREATE VIEW VIEW_TABLE3(f)
AS
SELECT 3;
Full query:
DECLARE
QUERY STRING;
RES RESULTSET;
BEGIN
SELECT
LISTAGG(
REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
$$SELECT '<TABLE_SCHEMA>' AS TABLE_SCHEMA,
'<TABLE_NAME>' AS TABLE_NAME,
'<CREATED>' AS CREATED,
'<LAST_ALTERED>' AS LAST_ALTERED,
COUNT(*) AS cnt
FROM <tab_name>
$$,
'<TABLE_SCHEMA>', v.TABLE_SCHEMA),
'<TABLE_NAME>', v.TABLE_NAME),
'<CREATED>', v.CREATED),
'<LAST_ALTERED>', v.LAST_ALTERED),
'<tab_name>', CONCAT_WS('.', v.table_catalog, v.table_schema, v.table_name)),
' UNION ALL ') WITHIN GROUP (ORDER BY CONCAT_WS('.', v.table_catalog, v.table_schema, v.table_name))
INTO :QUERY
FROM INFORMATION_SCHEMA.VIEWS v
WHERE TABLE_SCHEMA='PUBLIC'
AND TABLE_NAME IN ('VIEW_TABLE1','VIEW_TABLE2','VIEW_TABLE3');
RES := (EXECUTE IMMEDIATE :QUERY);
RETURN TABLE(RES);
END;
Output:
Rationale:
The ideal query would be(pseudocode):
SELECT TABLE_SCHEMA,TABLE_NAME,CREATED,LAST_ALTERED,
EVAL('SELECT COUNT(*) FROM ' || view_name) AS row_count
FROM INFORMATION_SCHEMA.VIEWS
WHERE TABLE_SCHEMA='MY_SHEMA'
AND TABLE_NAME IN ('VIEW_TABLE1','VIEW_TABLE2','VIEW_TABLE3');
Such construct EVAL(dynamic query) at SELECT list does not exist as it would require building a query on the fly and execute per each row. Though for some RDBMSes are workaround like dbms_xmlgen.getxmltype
Include table/view names as string in your count(*) queries and then you can join.
Example below -
select * from
(SELECT TABLE_SCHEMA,TABLE_NAME,CREATED FROM information_schema.tables
WHERE TABLE_SCHEMA='PUBLIC' AND TABLE_NAME IN ('D1','D2')) t1
left join
(
SELECT 'D1' table_name, COUNT(*) FROM d1
UNION ALL SELECT 'D2',COUNT(*) FROM d2) t2
on t1.table_name = t2.table_name ;
TABLE_SCHEMA
TABLE_NAME
CREATED
TABLE_NAME
COUNT(*)
PUBLIC
D1
2022-04-06 14:24:56.224 -0700
D1
12
PUBLIC
D2
2022-04-06 14:25:27.276 -0700
D2
5

Query select with next id order

I have a table with ID and NextID like this:
MainID || NextID
1 || 2
2 || 3
3 || 5
4 || 6
5 || 4
6 || ...
... || ...
what I want to achieve is select data into like this
MainID || NextID
1 || 2
2 || 3
3 || 5
5 || 4
4 || 6
6 || ...
... || ...
what i've tried is simple query like :
SELECT * FROM 'table' ORDER BY NextID
but of course it didn't meet my needs,
I have an idea to create a temp table and insert with loop but takes too much time to complete :
WHILE #NextID IS NOT NULL
BEGIN
INSERT INTO 'table'(MainID, NextID)
SELECT MainID, NextId
FROM 'table' WHERE MainID=#NextID
END
Can anyone help me?
Thanks
Recursive cte will return rows in the order of nodes visited
with t as (
select f.*, right('00000000'+cast(f.mainId as varchar(max)),9) path
from yourtable f
where MainID=1
union all
select f.*, path + '->' + right('00000000'+cast(f.mainId as varchar(max)),9)
from t
join yourtable f on t.NextID = f.MainID
)
select *
from t
order by path
db<>fiddle
where MainId=1 is an arbitrary start. You may wish also start with
where not exists (select 1 from yourtable f2 where f2.Nextid = f.MainId)
Edit
Added explicit order by
For this particular case you may use right join with some ordering
select t2.*
from some_table t1
right join some_table t2
on t1. main_id = t2.next_id
order by case when t2.next_id is null then 9999999 else t2.main_id + t2.next_id end;
the 999999 in the "order by" part is to place last line (6, null) to the end of the output.
Good luck with adopting the query to your real data

In BigQuery, identify when columns do not match on UNION ALL

with
table1 as (
select 'joe' as name, 17 as age, 25 as speed
),
table2 as (
select 'nick' as name, 21 as speed, 23 as strength
)
select * from table1
union all
select * from table2
In Google BigQuery, this union all does not throw an error because both tables have the same number of columns (3 each). However I receive bad data output because the columns do not match. Rather than outputting a new table with 4 columns name, age, speed, strength with correct values + nulls for missing values (which would probably be preferred), the union all keeps the 3 columns from the top row.
Is there a good way to catch that the columns do not match, rather than the query silently returning bad data? Is there any way for this to return an error perhaps, as opposed to a successful table? I'm not sure how to check in SQL that the columns in the 2 tables match.
Edit: in this example it is clear to see that the columns do not match, however in our data we have 100+ columns and we want to avoid a situation where we make an error in a UNION ALL
Below is for BigQuery Standard SQL and using scripting feature of BQ
DECLARE statement STRING;
SET statement = (
WITH table1_columns AS (
SELECT column FROM (SELECT * FROM `project.dataset.table1` LIMIT 1) t,
UNNEST(REGEXP_EXTRACT_ALL(TRIM(TO_JSON_STRING(t), '{}'), r'"([^"]*)":')) column
), table2_columns AS (
SELECT column FROM (SELECT * FROM `project.dataset.table2` LIMIT 1) t,
UNNEST(REGEXP_EXTRACT_ALL(TRIM(TO_JSON_STRING(t), '{}'), r'"([^"]*)":')) column
), all_columns AS (
SELECT column FROM table1_columns UNION DISTINCT SELECT column FROM table2_columns
)
SELECT (
SELECT 'SELECT ' || STRING_AGG(IF(t.column IS NULL, 'NULL as ', '') || a.column, ', ') || ' FROM `project.dataset.table1` UNION ALL '
FROM all_columns a LEFT JOIN table1_columns t USING(column)
) || (
SELECT 'SELECT ' || STRING_AGG(IF(t.column IS NULL, 'NULL as ', '') || a.column, ', ') || ' FROM `project.dataset.table2`'
FROM all_columns a LEFT JOIN table2_columns t USING(column)
)
);
EXECUTE IMMEDIATE statement;
when applied to sample data from your question - output is
Row name age speed strength
1 joe 17 25 null
2 nick null 21 23
After saving table1 and table2 as 2 tables in a dataset in BigQuery, I then used the metadata using INFORMATION_SCHEMA to check that the columns matched.
SELECT *
FROM models.INFORMATION_SCHEMA.COLUMNS
where table_name = 'table1'
SELECT *
FROM models.INFORMATION_SCHEMA.COLUMNS
where table_name = 'table2'
INFORMATION_SCHEMA.COLUMNS returns information including the column names and their positioning. I can join these 2 tables together then to check that the names match...

Unable to query multiple tables via XML: Error occurred in XML processing

I would like to get a column_name and table_name for a list from all_tab_columns (which is not a problem till here) and also for each given column I want to go to the original table/column and see what is the top value with highest occurence.
With the query below I get the desired value for 1 example of column in 1 table:
select col1
from (SELECT col1, rank () over (order by count(*) desc) as rnk
from T1
Group by col1
)
where rnk = 1
now I want something like this:
select table_name,
column_name,
xmlquery('/ROWSET/ROW/C/text()'
passing xmltype(dbms_xmlgen.getxml( 'select ' || column_name || ' from (select ' || column_name ||', rank () over (order by count(*) desc) as rnk from '
|| table_name || ' Group by ' || column_name || ') where rnk = 1;'))
returning content) as C
from all_tab_columns
where owner = 'S1'
and table_name in ('T1', 'T2', 'T3', 'T4')
;
but it does not work. This is the error I get:
ORA-19202: Error occurred in XML processing
ORA-00933: SQL command not properly ended
ORA-06512: at "SYS.DBMS_XMLGEN", line 176
ORA-06512: at line 1
19202. 00000 - "Error occurred in XML processing%s"
*Cause: An error occurred when processing the XML function
*Action: Check the given error message and fix the appropriate problem
I make an example. These are my two tables, for instance; T1:
col.1 col.2 col.3
----- ---------- -----
y m1
y 22 m2
n 45 m2
y 10 m5
and T2:
col.1 col.2 col.3
----- ------- -----
1 germany xxx
2 england xxx
3 germany uzt
3 germany vvx
8 US XXX
so
from T1/Col.1 I should get 'y'
from T1/col.3 I should get 'm2'
from T2/col.3 I should get 'xxx'
and so on.
The important error in what has been reported to you is this one:
ORA-00933: SQL command not properly ended
Remove the semicolon from the query inside the dbms_xmlgen.getxml() call:
select table_name,
column_name,
xmlquery('/ROWSET/ROW/C/text()'
passing xmltype(dbms_xmlgen.getxml( 'select ' || column_name || ' from (select ' || column_name ||', rank () over (order by count(*) desc) as rnk from '
|| table_name || ' Group by ' || column_name || ') where rnk = 1'))
-------^ no semicolon here
returning content) as C
from all_tab_columns
...
Your XPath seems to be wrong too though; you're looking for /ROWSET/ROW/C, but C is the column alias for the entire expression, not the column being counted. You need to alias the column name within the query, and use that in the XPath:
select table_name,
column_name,
xmlquery('/ROWSET/ROW/COL/text()'
-- ^^^
passing xmltype(dbms_xmlgen.getxml( 'select ' || column_name || ' as col from (select ' || column_name ||', rank () over (order by count(*) desc) as rnk from '
-- ^^^^^^
|| table_name || ' Group by ' || column_name || ') where rnk = 1'))
returning content) as C
from all_tab_columns
...
With your sample data that gets:
TABLE_NAME COLUMN_NAME C
------------------------------ ------------------------------ ----------
T1 col.1 y
T1 col.2 224510
T1 col.3 m2
T2 col.1 3
T2 col.2 germany
T2 col.3 xxx
db<>fiddle
The XMLQuery is returning an XMLtype result, which your client is apparently showing as (XMLTYPE). You can probably change that behaviour - e.g. in Sql Developer from Tool->Preferences->Database->Advanced->DIsplay XMl Value in Grid. But you can also convert the reult to a string, using getStringVal() to return a varchar2 (or getClobVal() if you have CLOB values, which might cause you other issues):
select table_name,
column_name,
xmlquery('/ROWSET/ROW/COL/text()'
passing xmltype(dbms_xmlgen.getxml( 'select ' || column_name || ' as col from (select ' || column_name ||', rank () over (order by count(*) desc) as rnk from '
|| table_name || ' Group by ' || column_name || ') where rnk = 1'))
returning content).getStringVal() as C
-- ^^^^^^^^^^^^^^^
from all_tab_columns
...
As you can see, this doesn't do quite what you might expect when there are ties due to equal counts - in your example, there are found different values for T1."col.2" (null, 10, 22, 45) which each appear once; and the XMLQuery is sticking them all together in one result. You need to decide what you want to happen in that case; if you only want to see one then you need to specify how to decide to break ties, within the analytic order by clause.
I actually want to see all results but I expected to see them in different rows
An alternative approach that allows that is to use XMLTable instead of XMLQuery:
select table_name, column_name, value
from (
select atc.table_name, atc.column_name, x.value, x.value_count,
rank() over (partition by atc.table_name, atc.column_name
order by x.value_count desc) as rnk
from all_tab_columns atc
cross join xmltable(
'/ROWSET/ROW'
passing xmltype(dbms_xmlgen.getxml(
'select "' || column_name || '" as value, count(*) as value_count '
|| 'from ' || table_name || ' '
|| 'group by "' || column_name || '"'))
columns value varchar2(4000) path 'VALUE',
value_count number path 'VALUE_COUNT'
) x
where atc.owner = user
and atc.table_name in ('T1', 'T2', 'T3', 'T4')
)
where rnk = 1;
The inner query cross-joins all_tab_columns to an XMLTable which does a simpler dbms_xmlgen.get_xml() call to just get every value and its count, extracts the values and counts as relational data from the generated XML, and includes the ranking function as part of that subquery rather than within the XML generation. If you run the subquery on its own you'll see all possibel values and their counts, along with each values' ranking.
The outer query then just filters on the ranking, and shows you the relevant columns from the subquery for the first-ranked result.
db<>fiddle

Listagg Overflow function implementation (Oracle SQL)

I am using LISTAGG function for my query, however, it returned an ORA-01489: result of string concatenation is too long error. So I googled that error and found out I can use ON OVERFLOW TRUNCATE and I implemented that into my SQL but now it generates missing right parenthesis error and I can't seem to figure out why?
My query
SELECT DISTINCT cust_id, acct_no, state, language_indicator, billing_system, market_code,
EMAIL_ADDR, DATE_OF_CHANGE, TO_CHAR(DATE_LOADED, 'DD-MM-YYYY') DATE_LOADED,
(SELECT LISTAGG( SUBSTR(mtn, 7, 4),'<br>' ON OVERFLOW TRUNCATE '***' )
WITHIN GROUP (ORDER BY cust_id || acct_no) mtnlist
FROM process.feature WHERE date_loaded BETWEEN TO_DATE('02-08-2018','MM-dd-yyyy')
AND TO_DATE('02-09-2018', 'MM-dd-yyyy') AND cust_id = ffsr.cust_id
AND acct_no = ffsr.acct_no AND filename = 'FEATURE.VB2B.201802090040'
GROUP BY cust_id||acct_no) mtnlist
FROM process.feature ffsr WHERE date_loaded BETWEEN TO_DATE('02-08-2018','MM-dd-yyyy')
AND TO_DATE('02-09-2018','MM-dd-yyyy') AND cust_id BETWEEN 0542185146 AND 0942025571
AND src_ind = 'B' AND filename = 'FEATURE.VB2B.201802090040'
AND letter_type = 'FA' ORDER BY cust_id;
With a little bit of help by XML, you might get it work. Example is based on HR schema.
SQL> select
2 listagg(s.department_name, ',') within group (order by null) result
3 from departments s, departments d;
from departments s, departments d
*
ERROR at line 3:
ORA-01489: result of string concatenation is too long
SQL>
SQL> select
2 rtrim(xmlagg(xmlelement (e, s.department_name || ',')).extract
3 ('//text()').getclobval(), ',') result
4 from departments s, departments d;
RESULT
--------------------------------------------------------------------------------
Administration,Administration,Administration,Administration,Administration,Admin
SQL>
This demo sourced from livesql.oracle.com
-- Create table with 93 strings of different lengths, plus one NULL string. Notice the only ASCII character not used is '!', so I will use it as a delimiter in LISTAGG.
create table strings as
with letters as (
select level num,
chr(ascii('!')+level) let
from dual
connect by level <= 126 - ascii('!')
union all
select 1, null from dual
)
select rpad(let,num,let) str from letters;
-- Note the use of LENGTHB to get the length in bytes, not characters.
select str,
sum(lengthb(str)+1) over(order by str rows unbounded preceding) - 1 cumul_lengthb,
sum(lengthb(str)+1) over() - 1 total_lengthb,
count(*) over() num_values
from strings
where str is not null;
-- This statement implements the ON OVERFLOW TRUNCATE WITH COUNT option of LISTAGG in 12.2. If there is no overflow, the result is the same as a normal LISTAGG.
select listagg(str, '!') within group(order by str) ||
case when max(total_lengthb) > 4000 then
'! ... (' || (max(num_values) - count(*)) || ')'
end str_list
from (
select str,
sum(lengthb(str)+1) over(order by str) - 1 cumul_lengthb,
sum(lengthb(str)+1) over() - 1 total_lengthb,
count(*) over() num_values
from strings
where str is not null
)
where total_lengthb <= 4000
or cumul_lengthb <= 4000 - length('! ... (' || num_values || ')');