I have a table like:
Key  type  value
-----------------
40   A     12.34
41   A     10.24
41   B     12.89
I want it in the format:
Types  40     41     42  (keys)
--------------------------------
A      12.34  10.24  XXX
B      YYY    12.89  ZZZ
How can this be done with an SQL query? CASE statements, DECODE?
What you're looking for is called a "pivot" (see also "Pivoting Operations" in the Oracle Database Data Warehousing Guide):
SELECT *
FROM tbl
PIVOT(SUM(value) FOR Key IN (40, 41, 42))
It was added to Oracle in 11g. Note that you need to specify the result columns (the values from the unpivoted column that become the pivoted column names) in the pivot clause. Any columns not specified in the pivot are implicitly grouped by. If you have columns in the original table that you don't wish to group by, select from a view or subquery, rather than from the table.
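For example, a minimal sketch (assuming a hypothetical extra column audit_ts that you don't want in the implicit GROUP BY):
SELECT *
FROM (SELECT Key, type, value FROM tbl)
PIVOT(SUM(value) FOR Key IN (40, 41, 42))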
You can engage in a bit of wizardry and get Oracle to create the statement for you, so that you don't need to figure out what column values to pivot on. In 11g, when you know the column values are numeric:
SELECT
'SELECT * FROM tbl PIVOT(SUM(value) FOR Key IN ('
|| LISTAGG(Key, ',') WITHIN GROUP (ORDER BY Key)
|| ');'
FROM tbl;
If the column values might not be numeric:
SELECT
'SELECT * FROM tbl PIVOT(SUM(value) FOR Key IN ('''
|| LISTAGG(Key, ''',''') WITHIN GROUP (ORDER BY Key)
|| '''));'
FROM tbl;
LISTAGG repeats duplicate values (it has no DISTINCT option in 11g), so you need to aggregate over the distinct keys:
SELECT
'SELECT * FROM tbl PIVOT(SUM(value) FOR Key IN ('''
|| LISTAGG(Key, ''',''') WITHIN GROUP (ORDER BY Key)
|| '''));'
FROM (SELECT DISTINCT Key FROM tbl);
You could go further, defining a function that takes a table name, aggregate expression and pivot column name that returns a pivot statement by first producing then evaluating the above statement. You could then define a procedure that takes the same arguments and produces the pivoted result. I don't have access to Oracle 11g to test it, but I believe it would look something like:
CREATE PACKAGE dynamic_pivot AS
  -- creates a PIVOT statement dynamically
  FUNCTION pivot_stmt (tbl_name     IN VARCHAR2,
                       pivot_col    IN VARCHAR2,
                       aggr_expr    IN VARCHAR2,
                       quote_values IN BOOLEAN DEFAULT TRUE)
    RETURN VARCHAR2;
  PRAGMA RESTRICT_REFERENCES (pivot_stmt, WNDS, RNPS, TRUST);

  -- creates & executes a PIVOT
  PROCEDURE pivot_table (tbl_name     IN VARCHAR2,
                         pivot_col    IN VARCHAR2,
                         aggr_expr    IN VARCHAR2,
                         quote_values IN BOOLEAN DEFAULT TRUE);
END dynamic_pivot;
/
CREATE PACKAGE BODY dynamic_pivot AS
  FUNCTION pivot_stmt (
    tbl_name     IN VARCHAR2,
    pivot_col    IN VARCHAR2,
    aggr_expr    IN VARCHAR2,
    quote_values IN BOOLEAN DEFAULT TRUE
  ) RETURN VARCHAR2
  IS
    stmt  VARCHAR2(4000);
    quote VARCHAR2(2) DEFAULT '';
  BEGIN
    IF quote_values THEN
      quote := '''''';  -- two quote characters, as they must appear inside the generated literal
    END IF;
    -- The doubled quotes escape quote characters inside the dynamic statement string.
    -- The input fields aren't sanitized, so this is vulnerable to injection.
    EXECUTE IMMEDIATE 'SELECT ''SELECT * FROM ' || tbl_name
      || ' PIVOT(' || aggr_expr || ' FOR ' || pivot_col
      || ' IN (' || quote || ''' || LISTAGG(' || pivot_col
      || ', ''' || quote || ',' || quote
      || ''') WITHIN GROUP (ORDER BY ' || pivot_col || ') || ''' || quote
      || '))'' FROM (SELECT DISTINCT ' || pivot_col || ' FROM ' || tbl_name || ')'
      INTO stmt;
    RETURN stmt;
  END pivot_stmt;

  PROCEDURE pivot_table (
    tbl_name     IN VARCHAR2,
    pivot_col    IN VARCHAR2,
    aggr_expr    IN VARCHAR2,
    quote_values IN BOOLEAN DEFAULT TRUE
  ) IS
  BEGIN
    -- the generated statement has no trailing semicolon, so it can be run here
    EXECUTE IMMEDIATE pivot_stmt(tbl_name, pivot_col, aggr_expr, quote_values);
  END pivot_table;
END dynamic_pivot;
/
Note: PL/SQL formal parameters cannot carry a length constraint, so the VARCHAR2 parameters above are unconstrained; size any local buffers from the maximum table and column name lengths. Note also that the function is vulnerable to SQL injection.
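A sketch of how the package might be called (untested, like the package itself; the BOOLEAN parameter keeps it out of plain SQL):
BEGIN
  -- print the generated statement (keys are numeric, so no quoting)
  DBMS_OUTPUT.PUT_LINE(dynamic_pivot.pivot_stmt('TBL', 'KEY', 'SUM(VALUE)', FALSE));
  -- or generate and execute it in one go
  dynamic_pivot.pivot_table('TBL', 'KEY', 'SUM(VALUE)', FALSE);
END;
/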
In pre-11g, you can apply MySQL pivot statement generation techniques (which produce the type of query others have posted, based on explicitly defining a separate column for each pivot value).
PIVOT does simplify things greatly, but before 11g you need to do it manually:
select
type,
sum(case when key = 40 then value end) as val_40,
sum(case when key = 41 then value end) as val_41,
sum(case when key = 42 then value end) as val_42
from my_table
group by type;
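With the sample data from the question, this returns one row per type, with NULL where no value exists:
TYPE  VAL_40  VAL_41  VAL_42
----  ------  ------  ------
A      12.34   10.24  (null)
B     (null)   12.89  (null)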
I've never tried it, but it seems at least Oracle 11 has a PIVOT clause.
If you do not have access to 11g, you can use string aggregation and grouping to approximate what you are looking for, such as:
with data as(
SELECT 40 KEY , 'A' TYPE , 12.34 VALUE FROM DUAL UNION
SELECT 41 KEY , 'A' TYPE , 10.24 VALUE FROM DUAL UNION
SELECT 41 KEY , 'B' TYPE , 12.89 VALUE FROM DUAL
)
select
TYPE ,
wm_concat(KEY) KEY ,
wm_concat(VALUE) VALUE
from data
GROUP BY TYPE;
TYPE   KEY     VALUE
------ ------- -----------
A      40,41   12.34,10.24
B      41      12.89
This is based on wm_concat, as shown here: http://www.oracle-base.com/articles/misc/StringAggregationTechniques.php. Note that WM_CONCAT is undocumented and was removed in Oracle 12c, so LISTAGG is the supported route on 11gR2 and later.
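For reference, a LISTAGG version of the same query (a sketch against the data CTE above):
select
TYPE,
listagg(KEY, ',') within group (order by KEY) KEY_LIST,
listagg(VALUE, ',') within group (order by KEY) VALUE_LIST
from data
GROUP BY TYPE;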
I'm going to leave this here just in case it helps, but after re-looking at your sample results, I think PIVOT or MikeyByCrikey's answer would best suit your needs.
If a customer has first_name = 'Monika' and last_name = 'Awasthi', then I am using the query below to return the value in JSON format:
SELECT *
FROM
(
SELECT JSON_ARRAYAGG(JSON_OBJECT('CODE' IS '1','VALUE' IS 'Monika'||' '||'Awasthi'))
FROM DUAL
);
It works fine and gives the output below:
[{"CODE":"1","VALUE":"Monika Awasthi"}]
But I want one more value with the names reversed, so the output should be:
[{"CODE":"1","VALUE":"Monika Awasthi"},{"CODE":"2","VALUE":"Awasthi Monika"}]
Kindly give me some suggestions. Thank You
Another approach is to use a CTE to generate the two codes and values; your original version could be written to get the name data from a table or CTE:
-- CTE for sample data
WITH cte (first_name, last_name) AS (
SELECT 'Monika', 'Awasthi' FROM DUAL
)
-- query against CTE or table
SELECT JSON_ARRAYAGG(JSON_OBJECT('CODE' IS '1','VALUE' IS last_name ||' '|| first_name))
FROM cte;
And you could then extend that with a CTE that generates the value with the names in both orders:
WITH cte1 (first_name, last_name) AS (
SELECT 'Monika', 'Awasthi' FROM DUAL
),
cte2 (code, value) AS (
SELECT 1 AS code, first_name || ' ' || last_name FROM cte1
UNION ALL
SELECT 2 AS code, last_name || ' ' || first_name FROM cte1
)
SELECT JSON_ARRAYAGG(JSON_OBJECT('CODE' IS code,'VALUE' IS value))
FROM cte2;
which gives:
JSON_ARRAYAGG(JSON_OBJECT('CODE'ISCODE,'VALUE'ISVALUE))
-------------------------------------------------------------------------
[{"CODE":1,"VALUE":"Monika Awasthi"},{"CODE":2,"VALUE":"Awasthi Monika"}]
db<>fiddle
A simple SQL-only approach (without PL/SQL) to generate the code values, usable only for two columns as in this case, might be:
SELECT JSON_ARRAYAGG(
JSON_OBJECT('CODE' IS tt.column_id,
'VALUE' IS CASE WHEN column_id=1
THEN name||' '||surname
ELSE surname||' '||name
END)
) AS result
FROM t
CROSS JOIN (SELECT column_id FROM user_tab_cols WHERE table_name = 'T') tt
where t is a table which holds the name and surname columns
Demo
A more resilient solution might be provided through PL/SQL, even when more columns exist within the data source, such as:
DECLARE
    v_jso   VARCHAR2(4000);
    v_arr   OWA.VC_ARR;
    v_arr_t JSON_ARRAY_T := JSON_ARRAY_T();
BEGIN
    FOR c IN ( SELECT column_id FROM user_tab_cols WHERE table_name = 'T' )
    LOOP
        SELECT 'JSON_OBJECT( ''CODE'' IS ' || MAX(c.column_id) || ',
                ''VALUE'' IS ' || LISTAGG(column_name, '||'' ''||')
                                  WITHIN GROUP (ORDER BY ABS(column_id - c.column_id)) || ' )'
          INTO v_arr(c.column_id)
          FROM ( SELECT * FROM user_tab_cols WHERE table_name = 'T' );
        EXECUTE IMMEDIATE 'SELECT ' || v_arr(c.column_id) || ' FROM t' INTO v_jso;
        v_arr_t.APPEND(JSON_OBJECT_T(v_jso));
    END LOOP;
    DBMS_OUTPUT.PUT_LINE(v_arr_t.STRINGIFY);
END;
/
Demo
As I explained in a comment under your question, I am not clear on how you define the CODE values for your JSON string (assuming you have more than one customer).
Other than that, if you need to create a JSON array of objects from individual strings (as in your attempt), you probably need to use JSON_ARRAY rather than JSON_ARRAYAGG. Something like I show below. Incidentally, I also don't know why you needed to SELECT * FROM (subquery) - the outer SELECT seems entirely unnecessary.
So, if you don't actually aggregate over a table, but just need to build a JSON array from individual pieces:
select json_array
(
json_object('CODE' is '1', 'VALUE' is first_name || ' ' || last_name ),
json_object('CODE' is '2', 'VALUE' is last_name || ' ' || first_name)
) as result
from ( select 'Monika' as first_name, 'Awasthi' as last_name from dual )
;
RESULT
------------------------------------------------------------------------------
[{"CODE":"1","VALUE":"Monika Awasthi"},{"CODE":"2","VALUE":"Awasthi Monika"}]
I have tables like this:
Table-1
Table-2
Table-3
Table-4
Table-5
Each table has many columns, and one of the columns is employee_id.
Now I want to write a query which will
1) return all the tables that have this column, and
2) show, for a passed employee_id, whether the column holds a value or is empty in each of those tables.
e.g. show table name, column name from Table-1, Table-2, Table-3, ... where employee_id = '1234'.
If one of the tables doesn't have this column, it does not need to be shown.
I have verified with link, but it shows only the table name and column name, without passing a column value to it.
Also verified this, but that checks the entire schema, which I don't want to do.
UPDATE:
Found a solution, but it uses xmlsequence, which is deprecated:
1) How do I rewrite this code with xmltable?
2) If there are no values in the table, the output should be empty/null, or default to a "YES" value.
WITH char_cols AS (
    SELECT /*+ materialize */ table_name, column_name
    FROM cols
    WHERE data_type IN ('CHAR', 'VARCHAR2')
      AND table_name IN ('Table-1','Table-2','Table-3','Table-4','Table-5')
)
SELECT DISTINCT SUBSTR (:val, 1, 11) "Employee_ID",
       SUBSTR (table_name, 1, 14) "Table",
       SUBSTR (column_name, 1, 14) "Column"
FROM char_cols,
     TABLE (xmlsequence (dbms_xmlgen.getxmltype (
               'select "' || column_name || '" from "' || table_name
               || '" where upper("' || column_name
               || '") like upper(''%' || :val || '%'')'
            ).extract ('ROWSET/ROW/*'))) t
ORDER BY "Table"
/
This query can be done in one step using the (non-deprecated) XMLTABLE.
Sample Schema
--Table-1 and Table-2 match the criteria.
--Table-3 has the right column but not the right value.
--Table-4 does not have the right column.
create table "Table-1" as select '1234' employee_id from dual;
create table "Table-2" as select '1234' employee_id from dual;
create table "Table-3" as select '4321' employee_id from dual;
create table "Table-4" as select 1 id from dual;
Query
--All tables with the column EMPLOYEE_ID, and the number of rows where EMPLOYEE_ID = '1234'.
select table_name, total
from
(
--Get XML results of dynamic query on relevant tables and columns.
select
dbms_xmlgen.getXMLType(
(
--Create a SELECT statement on each table, UNION ALL'ed together.
select listagg(
'select '''||table_name||''' table_name, count(*) total
from "'||table_name||'" where employee_id = ''1234'''
,' union all'||chr(10)) within group (order by table_name) v_sql
from user_tab_columns
where column_name = 'EMPLOYEE_ID'
)
) xml
from dual
) x
cross join
--Convert the XML data to relational.
xmltable('/ROWSET/ROW'
passing x.xml
columns
table_name varchar2(128) path 'TABLE_NAME',
total number path 'TOTAL'
);
Results
TABLE_NAME TOTAL
---------- -----
Table-1 1
Table-2 1
Table-3 0
Just try the code below.
Note that it may be necessary to specify the schema name in the loop.
This code works on my local DB.
set serveroutput on;
DECLARE
    ex_query VARCHAR2(300);
    num      NUMBER;
    emp_id   NUMBER;
BEGIN
    emp_id := <put your value>;
    FOR rec IN (SELECT table_name
                  FROM all_tab_columns
                 WHERE column_name = upper('employee_id'))
    LOOP
        num := 0;
        ex_query := 'select count(*) from ' || rec.table_name
                    || ' where employee_id = ' || emp_id;
        EXECUTE IMMEDIATE ex_query INTO num;
        IF num > 0 THEN
            DBMS_OUTPUT.PUT_LINE(rec.table_name);
        END IF;
    END LOOP;
END;
/
I tried the XML approach, but I get an error I cannot solve. Something about a zero-size result. How difficult would it be to handle this instead of raising an exception?! Ask Oracle.
Anyway.
What you can do is use the COLS view to find out which tables have the EMPLOYEE_ID column.
1) Which tables from TABLE_LIKE_THIS (I assume the column holding the table names is C) have this column?
select *
from COLS, TABLE_LIKE_THIS t
where cols.table_name = t.c
  and cols.column_name = 'EMPLOYEE_ID'
-- think Oracle metadata / think upper case
2) Which one has the value you are looking for: write a little chunk of Dynamic PL/SQL with EXECUTE IMMEDIATE to count the tables matching above condition
declare
    v_id  varchar2(10) := 'JP1829';      -- value you are looking for
    v_col varchar2(20) := 'EMPLOYEE_ID'; -- column
    n_c   number := 0;
begin
    for x in (
        select table_name
        from all_tab_columns cols, TABLE_LIKE_THIS t
        where cols.table_name = t.c
          and cols.column_name = v_col
    ) loop
        EXECUTE IMMEDIATE
            'select count(1) from ' || x.table_name
            || ' where Nvl(' || v_col || ', ''##'') = ''' || v_id || '''' -- adding quotes around a string is a little specific
            INTO n_c;
        if n_c > 0 then
            dbms_output.put_line(n_c || ' in ' || x.table_name || ' has ' || v_col || '=' || v_id);
        end if;
        -- idem for null values
        -- ... || ' where ' || v_col || ' is null '
        -- or
        -- ... || ' where Nvl(' || v_col || ', ''##'') = ''##'' '
    end loop;
    dbms_output.put_line('done.');
end;
/
Hope this helps
I need to count the number of null values of all the columns in a table in Oracle.
For instance, I execute the following statements to create a table TEST and insert data.
CREATE TABLE TEST
( A VARCHAR2(20 BYTE),
B VARCHAR2(20 BYTE),
C VARCHAR2(20 BYTE)
);
Insert into TEST (A) values ('a');
Insert into TEST (B) values ('b');
Insert into TEST (C) values ('c');
Now, I write the following code to compute the number of null values in the table TEST:
declare
cnt number :=0;
temp number :=0;
begin
for r in ( select column_name, data_type
from user_tab_columns
where table_name = upper('test')
order by column_id )
loop
if r.data_type <> 'NOT NULL' then
select count(*) into temp FROM TEST where r.column_name IS NULL;
cnt := cnt + temp;
END IF;
end loop;
dbms_output.put_line('Total: '||cnt);
end;
/
It returns 0, when the expected value is 6.
Where is the error?
Thanks in advance.
Counting NULLs for each column
In order to count NULL values for all columns of a table T you could run
SELECT COUNT(*) - COUNT(col1) col1_nulls
, COUNT(*) - COUNT(col2) col2_nulls
,..
, COUNT(*) - COUNT(colN) colN_nulls
, COUNT(*) total_rows
FROM T
/
Where col1, col2, .., colN should be replaced with actual names of columns of T table.
Aggregate functions -like COUNT()- ignore NULL values, so COUNT(*) - COUNT(col) will give you how many nulls for each column.
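Applied to the TEST table from the question, for example:
SELECT COUNT(*) - COUNT(A) a_nulls
     , COUNT(*) - COUNT(B) b_nulls
     , COUNT(*) - COUNT(C) c_nulls
     , COUNT(*) total_rows
FROM TEST;

   A_NULLS    B_NULLS    C_NULLS TOTAL_ROWS
---------- ---------- ---------- ----------
         2          2          2          3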
Summarize all NULLs of a table
If you want to know how many fields are NULL in total, meaning every NULL of every record, you can:
WITH d as (
SELECT COUNT(*) - COUNT(col1) col1_nulls
, COUNT(*) - COUNT(col2) col2_nulls
,..
, COUNT(*) - COUNT(colN) colN_nulls
, COUNT(*) total_rows
FROM T
) SELECT col1_nulls + col2_nulls +..+ colN_nulls
FROM d
/
Summarize all NULLs of a table (using Oracle dictionary tables)
Following is an improvement in which you need to know nothing but the table name, and it is very easy to code a function based on it:
DECLARE
    T    VARCHAR2(64) := '<YOUR TABLE NAME>';
    expr VARCHAR2(32767);
    q    INTEGER;
BEGIN
    SELECT 'SELECT /*+ FULL(t) PARALLEL(t) */ ' || COUNT(*) || ' * COUNT(*) - '
           || LISTAGG('COUNT(' || COLUMN_NAME || ')', ' - ') WITHIN GROUP (ORDER BY COLUMN_ID)
           || ' FROM ' || T || ' t'
      INTO expr
      FROM USER_TAB_COLUMNS
     WHERE TABLE_NAME = T;
    -- This line is for debugging purposes only
    DBMS_OUTPUT.PUT_LINE(expr);
    EXECUTE IMMEDIATE expr INTO q;
    DBMS_OUTPUT.PUT_LINE(q);
END;
/
Because the calculation implies a full table scan, the code produced in the expr variable includes hints for parallel execution.
User defined function null_fields
Function version, also includes an optional parameter to be able to run on other schemas.
CREATE OR REPLACE FUNCTION null_fields(table_name IN VARCHAR2, owner IN VARCHAR2 DEFAULT USER)
RETURN INTEGER IS
    T    VARCHAR2(64) := UPPER(table_name);
    o    VARCHAR2(64) := UPPER(owner);
    expr VARCHAR2(32767);
    q    INTEGER;
BEGIN
    SELECT 'SELECT /*+ FULL(t) PARALLEL(t) */ ' || COUNT(*) || ' * COUNT(*) - '
           || LISTAGG('COUNT(' || column_name || ')', ' - ') WITHIN GROUP (ORDER BY column_id)
           || ' FROM ' || o || '.' || T || ' t'
      INTO expr
      FROM all_tab_columns
     WHERE table_name = T
       AND owner = o;
    EXECUTE IMMEDIATE expr INTO q;
    RETURN q;
END;
/
-- Usage 1
SELECT null_fields('<your table name>') FROM dual
/
-- Usage 2
SELECT null_fields('<your table name>', '<table owner>') FROM dual
/
Thank you @Lord Peter:
The PL/SQL script below works:
declare
    cnt  number := 0;
    temp number := 0;
begin
    for r in ( select column_name, nullable
               from user_tab_columns
               where table_name = upper('test')
               order by column_id )
    loop
        if r.nullable = 'Y' then
            EXECUTE IMMEDIATE 'SELECT count(*) FROM test where ' || r.column_name || ' IS NULL' into temp;
            cnt := cnt + temp;
        end if;
    end loop;
    dbms_output.put_line('Total: ' || cnt);
end;
/
The table name test may be replaced with the name of the table of your interest.
I hope this solution is useful!
The dynamic SQL you execute (this is the string used in EXECUTE IMMEDIATE) should be
select sum(
decode(a,null,1,0)
+decode(b,null,1,0)
+decode(c,null,1,0)
) nullcols
from test;
Where each summand corresponds to a nullable column.
Here only one table scan is necessary to get the result.
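A hedged sketch of how this statement could be built from the data dictionary and executed in one go (assuming 11g+ for LISTAGG):
DECLARE
    stmt VARCHAR2(4000);
    cnt  NUMBER;
BEGIN
    SELECT 'select sum('
           || LISTAGG('decode(' || column_name || ',null,1,0)', '+') WITHIN GROUP (ORDER BY column_id)
           || ') nullcols from test'
      INTO stmt
      FROM user_tab_columns
     WHERE table_name = 'TEST'
       AND nullable = 'Y';
    EXECUTE IMMEDIATE stmt INTO cnt;
    DBMS_OUTPUT.PUT_LINE('Total: ' || cnt); -- 6 for the sample data
END;
/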
Use the data dictionary to find the number of NULL values almost instantly:
select sum(num_nulls) sum_num_nulls
from all_tab_columns
where owner = user
and table_name = 'TEST';
SUM_NUM_NULLS
-------------
6
The values will only be correct if optimizer statistics were gathered recently and if they were gathered with the default value for the sample size.
Those may seem like large caveats but it's worth becoming familiar with your database's statistics gathering process anyway. If your database is not automatically gathering statistics or if your database is not using the default sample size those are likely huge problems you need to be aware of.
To manually gather stats for a specific table a statement like this will work:
begin
dbms_stats.gather_table_stats(user, 'TEST');
end;
/
select COUNT(1) TOTAL from table where COLUMN is NULL;
I have a data model like the following which is simplified to show you only this problem (SQL Fiddle Link at the bottom):
A person is represented in the database as a row in the meta table with a name, plus multiple attributes which are stored in the data table as key-value pairs (key and value are in separate columns).
Expected Result
Now I would like to retrieve all users with all their attributes. The attributes should be returned as json object in a separate column. For example:
name, data
Florian, { "age":23, "color":"blue" }
Markus, { "age":24, "color":"green" }
My Approach
Now my problem is that I couldn't find a way to create a key-value pair in Postgres. I tried the following:
SELECT
name,
array_to_json(array_agg(row(d.key, d.value))) AS data
FROM meta AS m
JOIN (
SELECT d.fk_id, d.key, d.value AS value FROM data AS d
) AS d
ON d.fk_id = m.id
GROUP BY m.name;
But it returns this as data column:
[{"f1":"age","f2":24},{"f1":"color","f2":"blue"}]
Other Solutions
I know there is the crosstab function, which would let me turn the data table into a form with the key as a column and the value as a row. But that is not dynamic, and I don't know how many attributes a person has in the data table, so this is not an option.
I could also build a JSON-like string from the two row values and aggregate those strings, but maybe there is a nicer solution.
And no, it is not possible to change the data model, because the real data model is already in use by multiple parties.
SQLFiddle
Check and test out the fiddle I've created for this question:
http://sqlfiddle.com/#!15/bd579/14
Use the aggregate function json_object_agg(key, value):
select
name,
json_object_agg(key, value) as data
from data
join meta on fk_id = id
group by 1;
Db<>Fiddle.
The function was introduced in Postgres 9.4.
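If you are stuck on 9.3 (as in the fiddle), where json_object_agg does not exist yet, a hedged workaround is to assemble the object body with string aggregation; to_json() takes care of quoting and escaping, though every value comes out JSON-quoted when the value column is text:
select
  name,
  ('{' || string_agg(to_json(key)::text || ':' || to_json(value)::text, ',') || '}')::json as data
from data
join meta on fk_id = id
group by name;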
I found a way to return crosstab data with dynamic columns. Maybe rewriting this to suit your needs will work better:
CREATE OR REPLACE FUNCTION report.usp_pivot_query_amount_generate(
i_group_id INT[],
i_start_date TIMESTAMPTZ,
i_end_date TIMESTAMPTZ,
i_interval INT
) RETURNS TABLE (
tab TEXT
) AS $ab$
DECLARE
_key_id TEXT;
_text_op TEXT = '';
_ret TEXT;
BEGIN
-- SELECT DISTINCT for query results
FOR _key_id IN
SELECT DISTINCT at_name
FROM report.company_data_date cd
JOIN report.company_data_amount cda ON cd.id = cda.company_data_date_id
JOIN report.amount_types at ON cda.amount_type_id = at.id
WHERE date_start BETWEEN i_start_date AND i_end_date
AND group_id = ANY (i_group_id)
AND interval_type_id = i_interval
LOOP
-- build function_call with datatype of column
IF char_length(_text_op) > 1 THEN
_text_op := _text_op || ', ' || _key_id || ' NUMERIC(20,2)';
ELSE
_text_op := _text_op || _key_id || ' NUMERIC(20,2)';
END IF;
END LOOP;
-- build query with parameter filters
_ret = '
SELECT * FROM crosstab(''SELECT date_start, at.at_name, cda.amount ct
FROM report.company_data_date cd
JOIN report.company_data_amount cda ON cd.id = cda.company_data_date_id
JOIN report.amount_types at ON cda.amount_type_id = at.id
WHERE date_start between $$' || i_start_date::TEXT || '$$ AND $$' || i_end_date::TEXT || '$$
AND interval_type_id = ' || i_interval::TEXT || '
AND group_id = ANY (ARRAY[' || array_to_string(i_group_id, ',') || '])
ORDER BY date_start'')
AS ct (date_start timestamptz, ' || _text_op || ')';
RETURN QUERY
SELECT _ret;
END;
$ab$ LANGUAGE 'plpgsql';
Call the function to get the string, then execute the string. I think I tried executing it inside the function, but it didn't work well.
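A hypothetical call (the parameter values are made up for illustration):
SELECT tab
FROM report.usp_pivot_query_amount_generate(ARRAY[1,2], '2020-01-01', '2020-03-31', 1);
-- then run the returned SQL text, e.g. with \gexec in newer psql or via EXECUTE in PL/pgSQL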
I ran into the same problem when I needed to update some JSON and remove a few elements in my database. The query below worked well enough for me, as it preserves the string quotes but does not add them to numbers.
select
'{' || substr(x.arr, 3, length(x.arr) - 4) || '}'
from
(
select
replace(replace(cast(array_agg(xx) as varchar), '\"', '"'), '","', ', ') as arr
from
(
select
elem.key,
elem.value,
'"' || elem.key || '":' || elem.value as xx
from
quote q
cross join
json_each(q.detail::json -> 'bQuoteDetail'-> 'quoteHC'->0) as elem
where
elem.key != 'allRiskItems'
) f
) x
Just a silly example:
Table A:
eggs
bread
cheese
Table B (when they are eaten):
Egg | date
Bread | date
Egg | date
cheese | date
Bread | date
For statistics purposes, I need to have statistics per date per food type, looking like this:
Table Statistics:
        egg   bread   cheese
date1     2       1        0
date2     6       4        2
date3     2       0        0
I need the column headers to be dynamic in the report (if new ones are added, they should automatically appear).
Any idea how to do this in Postgres?
Thanks.
Based on the answer Postgres dynamic column headers (from another table) (the work of Eric Vallabh Minikel), I improved the function to be more flexible and convenient. I think it might be useful for others too, especially as it only relies on pl/pgsql and does not need installation of extensions, as other derivations of Eric's work (i.e. plpython) do. Tested with 9.3.5 but it should also work at least down to 9.2.
Improvements:
deal with pivoted column names containing spaces
deal with multiple row header columns
deal with aggregate function in pivot cell as well as non-aggregated pivot cell (last parameter might be 'sum(cellval)' as well as 'cellval' in case the underlying table/view already does aggregation)
auto detect data type of pivot cell (no need to pass it to the function any more)
Usage:
SELECT get_crosstab_statement('table_to_pivot', ARRAY['rowname' [, <other_row_header_columns_as_well>], 'colname', 'max(cellval)');
Code:
CREATE OR REPLACE FUNCTION get_crosstab_statement(tablename character varying, row_header_columns character varying[], pivot_headers_column character varying, pivot_values character varying)
RETURNS character varying AS
$BODY$
--returns the sql statement to use for pivoting the table
--based on: http://www.cureffi.org/2013/03/19/automatically-creating-pivot-table-column-names-in-postgresql/
--based on: https://stackoverflow.com/questions/4104508/postgres-dynamic-column-headers-from-another-table
--based on: http://www.postgresonline.com/journal/categories/24-tablefunc
DECLARE
arrayname CONSTANT character varying := 'r';
row_headers_simple character varying;
row_headers_quoted character varying;
row_headers_castdown character varying;
row_headers_castup character varying;
row_header_count smallint;
row_header record;
pivot_values_columnname character varying;
pivot_values_datatype character varying;
pivot_headers_definition character varying;
pivot_headers_simple character varying;
sql_row_headers character varying;
sql_pivot_headers character varying;
sql_crosstab_result character varying;
BEGIN
-- 1. create row header definitions
row_headers_simple := array_to_string(row_header_columns, ', ');
row_headers_quoted := '''' || array_to_string(row_header_columns, ''', ''') || '''';
row_headers_castdown := array_to_string(row_header_columns, '::text, ') || '::text';
row_header_count := 0;
sql_row_headers := 'SELECT column_name, data_type
FROM information_schema.columns
WHERE table_name = ''' || tablename || ''' AND column_name IN (' || row_headers_quoted || ')';
FOR row_header IN EXECUTE sql_row_headers LOOP
row_header_count := row_header_count + 1;
row_headers_castup := COALESCE(row_headers_castup || ', ', '') || arrayname || '[' || row_header_count || ']::' || row_header.data_type || ' AS ' || row_header.column_name;
END LOOP;
-- 2. retrieve basic column name in case an aggregate function is used
SELECT coalesce(substring(pivot_values FROM '.*\((.*)\)'), pivot_values)
INTO pivot_values_columnname;
-- 3. retrieve pivot values datatype
SELECT data_type
FROM information_schema.columns
WHERE table_name = tablename AND column_name = pivot_values_columnname
INTO pivot_values_datatype;
-- 4. retrieve list of pivot column names.
sql_pivot_headers := 'SELECT string_agg(DISTINCT quote_ident(' || pivot_headers_column || '), '', '' ORDER BY quote_ident(' || pivot_headers_column || ')) as names, string_agg(DISTINCT quote_ident(' || pivot_headers_column || ') || '' ' || pivot_values_datatype || ''', '', '' ORDER BY quote_ident(' || pivot_headers_column || ') || '' ' || pivot_values_datatype || ''') as definitions FROM ' || tablename || ';';
EXECUTE sql_pivot_headers INTO pivot_headers_simple, pivot_headers_definition;
-- 5. set up the crosstab query
sql_crosstab_result := 'SELECT ' || replace (row_headers_castup || ', ' || pivot_headers_simple, ', ', ',
') || '
FROM crosstab (
''SELECT ARRAY[' || row_headers_castdown || '] AS ' || arrayname || ', ' || pivot_headers_column || ', ' || pivot_values || '
FROM ' || tablename || '
GROUP BY ' || row_headers_simple || ', ' || pivot_headers_column || (CASE pivot_values_columnname=pivot_values WHEN true THEN ', ' || pivot_values ELSE '' END) || '
ORDER BY ' || row_headers_simple || '''
,
''SELECT DISTINCT ' || pivot_headers_column || '
FROM ' || tablename || '
ORDER BY ' || pivot_headers_column || '''
) AS newtable (
' || arrayname || ' varchar[]' || ',
' || replace(pivot_headers_definition, ', ', ',
') || '
);';
RETURN sql_crosstab_result;
END
$BODY$
LANGUAGE plpgsql STABLE
COST 100;
I came across the same problem, and found an alternative solution. I'd like to ask for comments here.
The idea is to create the "output type" string for crosstab dynamically. The end result can't be returned by a plpgsql function, because that function would either need to have a static return type (which we don't have) or return setof record, thus having no advantage over the original crosstab function. Therefore my function saves the output cross-table in a view. Likewise, the input table of "pivot" data not yet in cross-table format is taken from an existing view or table.
The usage would be like this, using your example (I added the "fat" field to illustrate the sorting feature):
CREATE TABLE food_fat (name character varying(20) NOT NULL, fat integer);
CREATE TABLE eaten (food character varying(20) NOT NULL, day date NOT NULL);
-- This view will be formatted as cross-table.
-- ORDER BY is important, otherwise crosstab won't merge rows
CREATE TEMPORARY VIEW mymeals AS
SELECT day,food,COUNT(*) AS meals FROM eaten
GROUP BY day, food ORDER BY day;
SELECT auto_crosstab_ordered('mymeals_cross',
'mymeals', 'day', 'food', 'meals', -- what table to convert to cross-table
'food_fat', 'name', 'fat'); -- where to take the columns from
The last statement creates a view mymeals_cross that looks like this:
SELECT * FROM mymeals_cross;
day | bread | cheese | eggs
------------+-------+--------+------
2012-06-01 | 3 | 3 | 2
2012-06-02 | 2 | 1 | 3
(2 rows)
Here comes my implementation:
-- FUNCTION get_col_type(tab, col)
-- returns the data type of column <col> in table <tab> as string
DROP FUNCTION get_col_type(TEXT, TEXT);
CREATE FUNCTION get_col_type(tab TEXT, col TEXT, OUT ret TEXT)
AS $BODY$ BEGIN
EXECUTE $f$
SELECT atttypid::regtype::text
FROM pg_catalog.pg_attribute
WHERE attrelid='$f$||quote_ident(tab)||$f$'::regclass
AND attname='$f$||quote_ident(col)||$f$'
$f$ INTO ret;
END;
$BODY$ LANGUAGE plpgsql;
-- FUNCTION get_crosstab_type(tab, row, val, cattab, catcol, catord)
--
-- This function generates the output type expression for the crosstab(text, text)
-- function from the PostgreSQL tablefunc module when the "categories"
-- (cross table column labels) can be looked up in some view or table.
--
-- See auto_crosstab below for parameters
DROP FUNCTION get_crosstab_type(TEXT, TEXT, TEXT, TEXT, TEXT, TEXT);
CREATE FUNCTION get_crosstab_type(tab TEXT, rw TEXT, val TEXT, cattab TEXT,
catcol TEXT, catord TEXT, OUT ret TEXT)
AS $BODY$ BEGIN
EXECUTE $f$
SELECT '"$f$||quote_ident(rw)||$f$" $f$
||get_col_type(quote_ident(tab), quote_ident(rw))||$f$'
|| string_agg(',"'||_values._v||'" $f$
||get_col_type(quote_ident(tab), quote_ident(val))||$f$')
FROM (SELECT DISTINCT ON(_t.$f$||quote_ident(catord)||$f$) _t.$f$||quote_ident(catcol)||$f$ AS _v
FROM $f$||quote_ident(cattab)||$f$ _t
ORDER BY _t.$f$||quote_ident(catord)||$f$) _values
$f$ INTO ret;
END; $BODY$ LANGUAGE plpgsql;
-- FUNCTION auto_crosstab_ordered(view_name, tab, row, cat, val, cattab, catcol, catord)
--
-- This function creates a VIEW containing a cross-table of input table.
-- It fetches the column names of the cross table ("categories") from
-- another table.
--
-- view_name - name of VIEW to be created
-- tab - input table. This table / view must have 3 columns:
-- "row", "category", "value".
-- row - column name of the "row" column
-- cat - column name of the "category" column
-- val - column name of the "value" column
-- cattab - another table holding the possible categories
-- catcol - column name in cattab to use as column label in the cross table
-- catord - column name in cattab to sort columns in the cross table
DROP FUNCTION auto_crosstab_ordered(TEXT, TEXT, TEXT, TEXT, TEXT, TEXT, TEXT, TEXT);
CREATE FUNCTION auto_crosstab_ordered(view_name TEXT, tab TEXT, rw TEXT,
cat TEXT, val TEXT, cattab TEXT, catcol TEXT, catord TEXT) RETURNS void
AS $BODY$ BEGIN
EXECUTE $f$
CREATE VIEW $f$||quote_ident(view_name)||$f$ AS
SELECT * FROM crosstab(
'SELECT $f$||quote_ident(rw)||$f$,
$f$||quote_ident(cat)||$f$,
$f$||quote_ident(val)||$f$
FROM $f$||quote_ident(tab)||$f$',
'SELECT DISTINCT ON($f$||quote_ident(catord)||$f$) $f$||quote_ident(catcol)||$f$
FROM $f$||quote_ident(cattab)||$f$ m
ORDER BY $f$||quote_ident(catord)||$f$'
) AS ($f$||get_crosstab_type(tab, rw, val, cattab, catcol, catord)||$f$)$f$;
END; $BODY$ LANGUAGE plpgsql;
-- FUNCTION auto_crosstab(view_name, tab, row, cat, val)
--
-- This function creates a VIEW containing a cross-table of input table.
-- It fetches the column names of the cross table ("categories") from
-- DISTINCT values of the 2nd column of the input table.
--
-- view_name - name of VIEW to be created
-- tab - input table. This table / view must have 3 columns:
-- "row", "category", "value".
-- row - column name of the "row" column
-- cat - column name of the "category" column
-- val - column name of the "value" column
DROP FUNCTION auto_crosstab(TEXT, TEXT, TEXT, TEXT, TEXT);
CREATE FUNCTION auto_crosstab(view_name TEXT, tab TEXT, rw TEXT, cat TEXT, val TEXT) RETURNS void
AS $$ BEGIN
PERFORM auto_crosstab_ordered(view_name, tab, rw, cat, val, tab, cat, cat);
END; $$ LANGUAGE plpgsql;
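For completeness, the unordered variant is called the same way; a hypothetical call against the mymeals view from above (the columns then sort by the food name itself):
SELECT auto_crosstab('mymeals_cross2', 'mymeals', 'day', 'food', 'meals');
SELECT * FROM mymeals_cross2;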
The code below works fine and returns the dynamic query as output:
CREATE OR REPLACE FUNCTION public.pivotcode(tablename character varying, rowc character varying, colc character varying, cellc character varying, celldatatype character varying)
RETURNS character varying AS
$BODY$
declare
dynsql1 varchar;
dynsql2 varchar;
columnlist varchar;
begin
-- 1. retrieve list of column names.
dynsql1 = 'select string_agg(distinct ''_''||'||colc||'||'' '||celldatatype||''','','' order by ''_''||'||colc||'||'' '||celldatatype||''') from '||tablename||';';
execute dynsql1 into columnlist;
-- 2. set up the crosstab query
--create temp table temp as
dynsql2 = 'select * from crosstab ( ''select '||rowc||','||colc||','||cellc||' from '||tablename||' group by 1,2 order by 1,2'', ''select distinct '||colc||' from '||tablename||' order by 1'' ) as newtable ( '||rowc||' varchar,'||columnlist||' );';
return dynsql2;
end
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
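A hypothetical call, reusing the eaten table from the earlier answer:
SELECT pivotcode('eaten', 'day', 'food', 'count(*)', 'int');
-- returns a crosstab query string; execute that string to get the pivoted result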
This is an example of pivoting data - in PostgreSQL 9 there is a crosstab function to do this: http://www.postgresql.org/docs/current/static/tablefunc.html
I solved it following this article:
http://okbob.blogspot.com/2008/08/using-cursors-for-generating-cross.html
In short, I used a function that, for every result in a query, dynamically creates the values for the next query, and returns the result needed as a refcursor. This solved the SQL part; now I need to figure out the Java part, but that isn't connected much to the question :)