How to convert two rows into a key-value JSON object in PostgreSQL?

I have a data model like the following, simplified to show only this problem (SQL Fiddle link at the bottom):
A person is represented in the database as a row in the meta table with a name, and has multiple attributes which are stored in the data table as key-value pairs (key and value are in separate columns).
Expected Result
Now I would like to retrieve all users with all their attributes. The attributes should be returned as a JSON object in a separate column. For example:
name, data
Florian, { "age":23, "color":"blue" }
Markus, { "age":24, "color":"green" }
My Approach
Now my problem is that I couldn't find a way to create such a key-value object in Postgres. I tried the following:
SELECT
name,
array_to_json(array_agg(row(d.key, d.value))) AS data
FROM meta AS m
JOIN (
SELECT d.fk_id, d.key, d.value AS value FROM data AS d
) AS d
ON d.fk_id = m.id
GROUP BY m.name;
But it returns this in the data column:
[{"f1":"age","f2":24},{"f1":"color","f2":"blue"}]
Other Solutions
I know there is the crosstab function, which lets me turn the keys of the data table into columns and the values into rows. But it is not dynamic, and I don't know how many attributes a person has in the data table, so this is not an option.
I could also build a JSON-like string from the key and value of each row and aggregate those strings, but maybe there is a nicer solution.
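A rough sketch of what that string-building idea could look like (shown only for comparison; it assumes the keys never contain characters that would need JSON escaping):
SELECT m.name,
       ('{' || string_agg(format('"%s":%s', d.key, to_json(d.value)), ',') || '}')::json AS data
FROM meta AS m
JOIN data AS d ON d.fk_id = m.id
GROUP BY m.name;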
And no, it is not possible to change the data model because the real data model is already in use by multiple parties.
SQLFiddle
Check out the fiddle I've created for this question:
http://sqlfiddle.com/#!15/bd579/14

Use the aggregate function json_object_agg(key, value):
select
name,
json_object_agg(key, value) as data
from data
join meta on fk_id = id
group by 1;
Db<>Fiddle.
The function was introduced in Postgres 9.4.
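If persons without any attributes should also appear in the result (an assumption beyond what the question asks for), a LEFT JOIN plus a fallback to an empty object is a small extension; a sketch:
select name,
       coalesce(json_object_agg(key, value) filter (where key is not null), '{}'::json) as data
from meta
left join data on fk_id = id
group by name;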

I found a way to return crosstab data with dynamic columns. Maybe rewriting this to suit your needs will work better:
CREATE OR REPLACE FUNCTION report.usp_pivot_query_amount_generate(
    i_group_id   INT[],
    i_start_date TIMESTAMPTZ,
    i_end_date   TIMESTAMPTZ,
    i_interval   INT
) RETURNS TABLE (
    tab TEXT
) AS $ab$
DECLARE
    _key_id  TEXT;
    _text_op TEXT = '';
    _ret     TEXT;
BEGIN
    -- SELECT DISTINCT key names for the result columns
    FOR _key_id IN
        SELECT DISTINCT at_name
        FROM report.company_data_date cd
        JOIN report.company_data_amount cda ON cd.id = cda.company_data_date_id
        JOIN report.amount_types at ON cda.amount_type_id = at.id
        WHERE date_start BETWEEN i_start_date AND i_end_date
          AND group_id = ANY (i_group_id)
          AND interval_type_id = i_interval
    LOOP
        -- build the column definition list with the data type of each column
        IF char_length(_text_op) > 1 THEN
            _text_op := _text_op || ', ' || _key_id || ' NUMERIC(20,2)';
        ELSE
            _text_op := _text_op || _key_id || ' NUMERIC(20,2)';
        END IF;
    END LOOP;
    -- build the crosstab query with the parameter filters
    _ret = '
        SELECT * FROM crosstab(''SELECT date_start, at.at_name, cda.amount ct
            FROM report.company_data_date cd
            JOIN report.company_data_amount cda ON cd.id = cda.company_data_date_id
            JOIN report.amount_types at ON cda.amount_type_id = at.id
            WHERE date_start between $$' || i_start_date::TEXT || '$$ AND $$' || i_end_date::TEXT || '$$
              AND interval_type_id = ' || i_interval::TEXT || '
              AND group_id = ANY (ARRAY[' || array_to_string(i_group_id, ',') || '])
            ORDER BY date_start'')
        AS ct (date_start timestamptz, ' || _text_op || ')';
    RETURN QUERY
    SELECT _ret;
END;
$ab$ LANGUAGE plpgsql;
Call the function to get the string, then execute the string. I think I tried executing it in the function, but it didn't work well.
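A sketch of the two-step usage (the parameter values here are made up):
-- 1) generate the crosstab statement
SELECT tab
FROM report.usp_pivot_query_amount_generate(ARRAY[1, 2],
                                             '2020-01-01'::timestamptz,
                                             '2020-12-31'::timestamptz,
                                             1);

-- 2) run the returned string, e.g. with \gexec in psql,
--    or via EXECUTE inside a DO block or another plpgsql function.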

I ran into the same problem when I needed to update some JSON and remove a few elements in my database. The query below worked well enough for me, as it preserves the quotes around strings but does not add them to numbers.
select '{' || substr(x.arr, 3, length(x.arr) - 4) || '}'
from (
    select replace(replace(cast(array_agg(xx) as varchar), '\"', '"'), '","', ', ') as arr
    from (
        select
            elem.key,
            elem.value,
            '"' || elem.key || '":' || elem.value as xx
        from quote q
        cross join json_each(q.detail::json -> 'bQuoteDetail' -> 'quoteHC' -> 0) as elem
        where elem.key != 'allRiskItems'
    ) f
) x
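For what it's worth, on Postgres 9.5+ the same key removal can be done directly on jsonb with the - operator instead of rebuilding the string (same table and path as above):
select (q.detail::jsonb -> 'bQuoteDetail' -> 'quoteHC' -> 0) - 'allRiskItems' as quote_hc
from quote q;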

Related

Select query using json format value

If customer first_name = 'Monika',
last_name = 'Awasthi',
then I am using the below query to return the value in JSON format:
SELECT *
FROM
(
SELECT JSON_ARRAYAGG(JSON_OBJECT('CODE' IS '1','VALUE' IS 'Monika'||' '||'Awasthi'))
FROM DUAL
);
It is working fine & gives the below output:
[{"CODE":"1","VALUE":"Monika Awasthi"}]
But I want one more value with the names reversed, meaning the output should be:
[{"CODE":"1","VALUE":"Monika Awasthi"},{"CODE":"2","VALUE":"Awasthi Monika"}]
Kindly give me some suggestions. Thank You
Another approach is to use a CTE to generate the two codes and values; your original version could be written to get the name data from a table or CTE:
-- CTE for sample data
WITH cte (first_name, last_name) AS (
SELECT 'Monika', 'Awasthi' FROM DUAL
)
-- query against CTE or table
SELECT JSON_ARRAYAGG(JSON_OBJECT('CODE' IS '1','VALUE' IS last_name ||' '|| first_name))
FROM cte;
And you could then extend that with a CTE that generates the value with the names in both orders:
WITH cte1 (first_name, last_name) AS (
SELECT 'Monika', 'Awasthi' FROM DUAL
),
cte2 (code, value) AS (
SELECT 1 AS code, first_name || ' ' || last_name FROM cte1
UNION ALL
SELECT 2 AS code, last_name || ' ' || first_name FROM cte1
)
SELECT JSON_ARRAYAGG(JSON_OBJECT('CODE' IS code,'VALUE' IS value))
FROM cte2;
which gives:
JSON_ARRAYAGG(JSON_OBJECT('CODE'ISCODE,'VALUE'ISVALUE))
-------------------------------------------------------------------------
[{"CODE":1,"VALUE":"Monika Awasthi"},{"CODE":2,"VALUE":"Awasthi Monika"}]
db<>fiddle
A simple approach using plain SQL (without PL/SQL) to generate the code values, usable only for two columns as in this case, might be:
SELECT JSON_ARRAYAGG(
JSON_OBJECT('CODE' IS tt.column_id,
'VALUE' IS CASE WHEN column_id=1
THEN name||' '||surname
ELSE surname||' '||name
END)
) AS result
FROM t
CROSS JOIN (SELECT column_id FROM user_tab_cols WHERE table_name = 'T') tt
where t is a table which holds name and surname columns
Demo
A more resilient solution might be provided through use of PL/SQL, even if more columns exist within the data source, such as:
DECLARE
v_jso VARCHAR2(4000);
v_arr OWA.VC_ARR;
v_arr_t JSON_ARRAY_T := JSON_ARRAY_T();
BEGIN
FOR c IN ( SELECT column_id FROM user_tab_cols WHERE table_name = 'T' )
LOOP
SELECT 'JSON_OBJECT( ''CODE'' IS '||MAX(c.column_id)||',
''VALUE'' IS '||LISTAGG(column_name,'||'' ''||')
WITHIN GROUP (ORDER BY ABS(column_id-c.column_id))
||' )'
INTO v_arr(c.column_id)
FROM ( SELECT * FROM user_tab_cols WHERE table_name = 'T' );
EXECUTE IMMEDIATE 'SELECT '||v_arr(c.column_id)||' FROM t' INTO v_jso;
v_arr_t.APPEND(JSON_OBJECT_T(v_jso));
END LOOP;
DBMS_OUTPUT.PUT_LINE(v_arr_t.STRINGIFY);
END;
/
Demo
As I explained in a comment under your question, I am not clear on how you define the CODE values for your JSON string (assuming you have more than one customer).
Other than that, if you need to create a JSON array of objects from individual strings (as in your attempt), you probably need to use JSON_ARRAY rather than JSON_ARRAYAGG. Something like I show below. Incidentally, I also don't know why you needed to SELECT * FROM (subquery) - the outer SELECT seems entirely unnecessary.
So, if you don't actually aggregate over a table, but just need to build a JSON array from individual pieces:
select json_array
(
json_object('CODE' is '1', 'VALUE' is first_name || ' ' || last_name ),
json_object('CODE' is '2', 'VALUE' is last_name || ' ' || first_name)
) as result
from ( select 'Monika' as first_name, 'Awasthi' as last_name from dual )
;
RESULT
------------------------------------------------------------------------------
[{"CODE":"1","VALUE":"Monika Awasthi"},{"CODE":"2","VALUE":"Awasthi Monika"}]

PostgreSQL plpgsql with subquery and dynamic SQL

I am having trouble with the plpgsql below. What I am trying to do is:
I have a table, name, with people's names
I am trying to write a PostgreSQL function that copies non-null columns from one person to another (as part of a larger merge)
There are a lot of columns in the name table, and other tables where I want to do the same thing.
In order to limit the amount of code that needs to be written, I am trying to iterate through an array and generate dynamic SQL
However I cannot get this to work.
What I have so far is:
CREATE OR REPLACE FUNCTION test(first_id bigint, second_id bigint) RETURNS boolean AS $$
DECLARE
    first_name  name%ROWTYPE;
    second_name name%ROWTYPE;
    col_name    VARCHAR(100);
    sql_block   VARCHAR(500);
BEGIN
    SELECT * INTO first_name FROM name WHERE person_id = first_id;
    SELECT * INTO second_name FROM name WHERE person_id = second_id;
    FOREACH col_name IN ARRAY ARRAY['column1', 'column2', 'column3', 'column4']
    LOOP
        ---- The following line is not working and keeps giving a syntax error
        EXECUTE 'if (first_name.' || col_name || ' IS NULL and second_name.' ||
                col_name || ' IS NOT NULL) THEN UPDATE name set ' || col_name ||
                ' = second_name.' || col_name || ' where name.id = first_name.id; END IF;';
    END LOOP;
    RETURN TRUE;
END;
$$ LANGUAGE plpgsql;
First - you need a query that will update a single column in a single table:
UPDATE table_name t1
SET column1 = COALESCE(t1.column1, t2.column1)
FROM table_name t2
WHERE t1.id = first_id
AND t2.id = second_id
This query updates the column on the first row only if it is empty (NULL).
Next - you can split this query into 3 parts:
UPDATE table_name t1
SET
As a header.
column1 = COALESCE(t1.column1, t2.column1),
column2 = COALESCE(t1.column2, t2.column2),
...
As a generated list of columns to update.
FROM table_name t2
WHERE t1.id = ?
AND t2.id = ?
As a footer. You need to replace the parameter values with placeholders ($1, $2) here.
Next - just do something like:
EXECUTE header||columns||footer USING first_id, second_id;
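Putting the three parts together, a minimal sketch of the whole function could look like this (it keeps your original signature; the column list and the person_id key are assumptions about your schema):
CREATE OR REPLACE FUNCTION test(first_id bigint, second_id bigint) RETURNS boolean AS $$
DECLARE
    col_list text;
BEGIN
    -- build "col = COALESCE(t1.col, t2.col)" for every column to copy
    SELECT string_agg(format('%1$I = COALESCE(t1.%1$I, t2.%1$I)', c), ', ')
    INTO col_list
    FROM unnest(ARRAY['column1', 'column2', 'column3', 'column4']) AS cols(c);

    EXECUTE 'UPDATE name AS t1 SET ' || col_list
         || ' FROM name AS t2 WHERE t1.person_id = $1 AND t2.person_id = $2'
    USING first_id, second_id;

    RETURN TRUE;
END;
$$ LANGUAGE plpgsql;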

PostgreSQL: Get values of a register as multiple rows

Using PostgreSQL 9.3, I'm creating a Jasper Reports template to make a PDF report. I want to create reports of different tables, with multiple columns, all with the same template. A solution could be to get the values of a register as pairs of column name and value per id.
For example, if I had a table like:
id | Column1 | Column2 | Column3
-------------------------------------------------
1 | Register1C1 | Register1C2 | Register1C3
I would like to get the register as:
Id | ColumnName | Value
-----------------------------
1 | Column1 | Register1C1
1 | Column2 | Register1C2
1 | Column3 | Register1C3
The data type of value columns can vary!
Is it possible? How can I do this?
If all your columns share the same data type and order of rows does not have to be enforced:
SELECT t.id, v.*
FROM tbl t, LATERAL (
VALUES
('col1', col1)
, ('col2', col2)
, ('col3', col3)
-- etc.
) v(col, val);
About LATERAL (requires Postgres 9.3 or later):
What is the difference between LATERAL and a subquery in PostgreSQL?
Combining it with a VALUES expression:
Crosstab transpose query request
SELECT DISTINCT on multiple columns
For varying data types, the common denominator would be text, since every type can be cast to text. Plus, order enforced:
SELECT t.id, v.col, v.val
FROM tbl t, LATERAL (
VALUES
(1, 'col1', col1::text)
, (2, 'col2', col2::text)
, (3, 'col3', col3::text)
-- etc.
) v(rank, col, val)
ORDER BY t.id, v.rank;
In Postgres 9.4 or later use the new unnest() for multiple arrays:
SELECT t.id, v.*
FROM tbl t, unnest('{col1,col2,col3}'::text[]
, ARRAY[col1,col2,col3]) v(col, val);
-- , ARRAY[col1::text,col2::text,col3::text]) v(col, val);
The commented alternative for varying data types.
Full automation for Postgres 9.4:
The query above is convenient to automate for a dynamic set of columns:
CREATE OR REPLACE FUNCTION f_transpose (_tbl regclass, VARIADIC _cols text[])
    RETURNS TABLE (id int, col text, val text) AS
$func$
BEGIN
    RETURN QUERY EXECUTE format(
        'SELECT t.id, v.* FROM %s t, unnest($1, ARRAY[%s]) v'
        , _tbl, array_to_string(_cols, '::text,') || '::text')
     -- , _tbl, array_to_string(_cols, ','))  -- simple alternative for only text
    USING _cols;
END
$func$ LANGUAGE plpgsql;
Call - with table name and any number of column names, any data types:
SELECT * FROM f_transpose('table_name', 'column1', 'column2', 'column3');
Weakness: the list of column names is not safe against SQL injection. You could gather column names from pg_attribute instead. Example:
How to perform the same aggregation on every column, without listing the columns?
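A sketch of gathering the column names from pg_attribute instead (excluding the id column, which the function returns separately):
SELECT array_agg(attname::text ORDER BY attnum) AS cols
FROM pg_attribute
WHERE attrelid = 'table_name'::regclass
AND attnum > 0
AND NOT attisdropped
AND attname <> 'id';
The resulting array can then be handed to f_transpose with the VARIADIC keyword, e.g. from a plpgsql variable: SELECT * FROM f_transpose('table_name', VARIADIC _cols);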
SELECT id
,unnest(string_to_array('col1,col2,col3', ',')) col_name
,unnest(string_to_array(col1 || ',' || col2 || ',' || col3, ',')) val
FROM t
Try the following method:
My sample table name is t; to get the column names you can use this query:
select string_agg(column_name,',') cols from information_schema.columns where
table_name='t' and column_name<>'id'
This query selects all columns in your table except the id column. If you want to specify a schema name, add table_schema = 'your_schema_name' to the WHERE clause.
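For example, with the schema filter added (the schema name here is just a placeholder):
select string_agg(column_name, ',') cols
from information_schema.columns
where table_schema = 'your_schema_name'
and table_name = 't'
and column_name <> 'id';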
To create the select query dynamically:
SELECT 'select id,unnest(string_to_array(''' || cols || ''','','')) col_name,unnest(string_to_array(' || cols1 || ','','')) val from t'
FROM (
SELECT string_agg(column_name, ',') cols -- here we'll get all the columns in table t
,string_agg(column_name, '||'',''||') cols1
FROM information_schema.columns
WHERE table_name = 't'
AND column_name <> 'id'
) tb;
And the following plpgsql function dynamically creates the SELECT id, unnest(string_to_array('....')) col_name, unnest(string_to_array(.., ',')) val FROM t query and executes it.
CREATE OR replace FUNCTION fn ()
RETURNS TABLE (
id INT
,columname TEXT
,columnvalues TEXT
) AS $$
DECLARE qry TEXT;
BEGIN
SELECT 'select id,unnest(string_to_array(''' || cols || ''','','')) col_name,unnest(string_to_array(' || cols1 || ','','')) val from t'
INTO qry
FROM (
SELECT string_agg(column_name, ',') cols
,string_agg(column_name, '||'',''||') cols1
FROM information_schema.columns
WHERE table_name = 't'
AND column_name <> 'id'
) tb;
RETURN QUERY
EXECUTE format(qry);
END;
$$ LANGUAGE plpgsql;
Call this function like select * from fn()

A column-safe `INSERT INTO t1 SELECT * FROM ...`?

Is there some way to do an INSERT INTO t1 SELECT * FROM ... such that it fails if the column names do not coincide?
I'm using PostgreSQL 9.x. The column names are not known in advance.
Motivation: I'm doing a periodic refresh of materialized views by the (quite standard) PL/pgSQL procedure:
CREATE OR REPLACE FUNCTION matview_refresh(name) RETURNS void AS
$BODY$
DECLARE
    matview ALIAS FOR $1;
    entry   matviews%ROWTYPE;
BEGIN
    SELECT * INTO entry FROM matviews WHERE mv_name = matview;
    IF NOT FOUND THEN
        RAISE EXCEPTION 'Materialized view % does not exist.', matview;
    END IF;
    EXECUTE 'TRUNCATE TABLE ' || matview;
    EXECUTE 'INSERT INTO ' || matview || ' SELECT * FROM ' || entry.v_name;
    UPDATE matviews SET last_refresh = CURRENT_TIMESTAMP WHERE mv_name = matview;
    RETURN;
END
$BODY$ LANGUAGE plpgsql;
I preferred a TRUNCATE followed by an INSERT ... SELECT instead of a DROP/CREATE because it seemed lighter and more concurrency-friendly. It would fail if someone adds/removes columns from the view (then I would do the DROP/CREATE), but that doesn't matter: in that case the refresh would not complete and we would catch the problem soon. What does matter is what happened today: someone changed the order of two columns of the view (of the same type), and the refresh inserted bogus data.
Build this into your plpgsql function to verify that view and table share the same column names in the same sequence exactly:
IF EXISTS (
SELECT 1
FROM (
SELECT *
FROM pg_attribute
WHERE attrelid = matview::regclass
AND attisdropped = FALSE
AND attnum > 0
) t
FULL OUTER JOIN (
SELECT *
FROM pg_attribute
WHERE attrelid = entry.v_name::regclass
AND attisdropped = FALSE
AND attnum > 0
) v USING (attnum, attname) -- atttypid to check for type, too
WHERE t.attname IS NULL
OR v.attname IS NULL
) THEN
RAISE EXCEPTION 'Mismatch between table and view!';
END IF;
The FULL OUTER JOIN adds a row with NULL values for any mismatch between the list of column names. So, if EXISTS finds a row, something is off.
And the cast to ::regclass would raise an exception right away if either the table or the view does not exist (or is out of scope - not in the search_path and not schema-qualified).
If you also want to check data types of the columns, just add atttypid to the USING clause.
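That is, the join line of the check above would become (a one-line change):
) v USING (attnum, attname, atttypid) -- also requires identical data types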
As an aside: Querying pg_catalog tables is regularly faster by an order of magnitude than querying the bloated views in information_schema - information_schema is only good for SQL standard compliance and portability of code. Since you are writing 100% Postgres-specific code, neither is relevant here.
You can query information_schema.columns to get columns in the right order:
SELECT INTO cols array_to_string(array_agg(column_name::text), ',')
FROM (
SELECT column_name
FROM information_schema.columns
WHERE table_name = 'matview'
ORDER BY ordinal_position
) AS x;
EXECUTE 'INSERT INTO ' || matview || ' SELECT ' || cols || ' FROM ' || entry.v_name;
You can get column list directly from pg_attribute -- just replace inner SELECT from information_schema.columns by:
SELECT attname AS column_name
FROM pg_attribute
WHERE attrelid = 'matview'::regclass AND attisdropped = false
ORDER BY attnum;

Transposing a table through select query

I have a table like:
Key type value
---------------------
40 A 12.34
41 A 10.24
41 B 12.89
I want it in the format:
Types 40 41 42 (keys)
---------------------------------
A 12.34 10.24 XXX
B YYY 12.89 ZZZ
How can this be done through an SQL query? Case statements, decode?
What you're looking for is called a "pivot" (see also "Pivoting Operations" in the Oracle Database Data Warehousing Guide):
SELECT *
FROM tbl
PIVOT(SUM(value) FOR Key IN (40, 41, 42))
It was added to Oracle in 11g. Note that you need to specify the result columns (the values from the unpivoted column that become the pivoted column names) in the pivot clause. Any columns not specified in the pivot are implicitly grouped by. If you have columns in the original table that you don't wish to group by, select from a view or subquery, rather than from the table.
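For example, against the sample data, selecting from a subquery so that only the relevant columns take part (a sketch; table and column names as in the question):
SELECT *
FROM (SELECT type, key, value FROM tbl)
PIVOT (SUM(value) FOR key IN (40, 41, 42));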
You can engage in a bit of wizardry and get Oracle to create the statement for you, so that you don't need to figure out what column values to pivot on. In 11g, when you know the column values are numeric:
SELECT
'SELECT * FROM tbl PIVOT(SUM(value) FOR Key IN ('
|| LISTAGG(Key, ',') WITHIN GROUP (ORDER BY Key)
|| ');'
FROM tbl;
If the column values might not be numeric:
SELECT
'SELECT * FROM tbl PIVOT(SUM(value) FOR Key IN (\''
|| LISTAGG(Key, '\',\'') WITHIN GROUP (ORDER BY Key)
|| '\'));'
FROM tbl;
LISTAGG probably repeats duplicates (would someone test this?), in which case you'd need:
SELECT
'SELECT * FROM tbl PIVOT(SUM(value) FOR Key IN (\''
|| LISTAGG(Key, '\',\'') WITHIN GROUP (ORDER BY Key)
|| '\'));'
FROM (SELECT DISTINCT Key FROM tbl);
You could go further, defining a function that takes a table name, aggregate expression and pivot column name that returns a pivot statement by first producing then evaluating the above statement. You could then define a procedure that takes the same arguments and produces the pivoted result. I don't have access to Oracle 11g to test it, but I believe it would look something like:
CREATE PACKAGE dynamic_pivot AS
    -- creates a PIVOT statement dynamically
    FUNCTION pivot_stmt (tbl_name     IN varchar2(30),
                         pivot_col    IN varchar2(30),
                         aggr_expr    IN varchar2(40),
                         quote_values IN BOOLEAN DEFAULT TRUE)
        RETURN varchar2(300);
    PRAGMA RESTRICT_REFERENCES (pivot_stmt, WNDS, RNPS);
    -- creates & executes a PIVOT
    PROCEDURE pivot_table (tbl_name     IN varchar2(30),
                           pivot_col    IN varchar2(30),
                           aggr_expr    IN varchar2(40),
                           quote_values IN BOOLEAN DEFAULT TRUE);
END dynamic_pivot;
CREATE PACKAGE BODY dynamic_pivot AS
FUNCTION pivot_stmt (
tbl_name IN varchar2(30),
pivot_col IN varchar2(30),
aggr_expr IN varchar2(40),
quote_values IN BOOLEAN DEFAULT TRUE
) RETURN varchar2(300)
IS
stmt VARCHAR2(400);
quote VARCHAR2(2) DEFAULT '';
BEGIN
IF quote_values THEN
quote := '\\\'';
END IF;
-- "\||" shows that you are still in the dynamic statement string
-- The input fields aren't sanitized, so this is vulnerable to injection
EXECUTE IMMEDIATE 'SELECT \'SELECT * FROM ' || tbl_name
|| ' PIVOT(' || aggr_expr || ' FOR ' || pivot_col
|| ' IN (' || quote || '\' \|| LISTAGG(' || pivot_col
|| ', \'' || quote || ',' || quote
|| '\') WITHIN GROUP (ORDER BY ' || pivot_col || ') \|| \'' || quote
|| '));\' FROM (SELECT DISTINCT ' || pivot_col || ' FROM ' || tbl_name || ');'
INTO stmt;
RETURN stmt;
END pivot_stmt;
PROCEDURE pivot_table (tbl_name IN varchar2(30), pivot_col IN varchar2(30), aggr_expr IN varchar2(40), quote_values IN BOOLEAN DEFAULT TRUE) IS
BEGIN
EXECUTE IMMEDIATE pivot_stmt(tbl_name, pivot_col, aggr_expr, quote_values);
END pivot_table;
END dynamic_pivot;
Note: the length of the tbl_name, pivot_col and aggr_expr parameters comes from the maximum table and column name length. Note also that the function is vulnerable to SQL injection.
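One possible mitigation (not part of the original package, just a suggestion) is to validate the identifier arguments with DBMS_ASSERT before concatenating them, e.g. inside pivot_stmt:
-- l_tbl_name / l_pivot_col would be local variables used in place of the raw
-- parameters; SIMPLE_SQL_NAME raises ORA-44003 for anything that is not a
-- simple SQL name
l_tbl_name  := DBMS_ASSERT.SIMPLE_SQL_NAME(tbl_name);
l_pivot_col := DBMS_ASSERT.SIMPLE_SQL_NAME(pivot_col);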
In pre-11g, you can apply MySQL pivot statement generation techniques (which produces the type of query others have posted, based on explicitly defining a separate column for each pivot value).
Pivot does simplify things greatly. Before 11g however, you need to do this manually.
select
type,
sum(case when key = 40 then value end) as val_40,
sum(case when key = 41 then value end) as val_41,
sum(case when key = 42 then value end) as val_42
from my_table
group by type;
I've never tried it, but it seems at least Oracle 11 has a PIVOT clause.
If you do not have access to 11g, you can use string aggregation and grouping to approximate what you are looking for, such as:
with data as(
SELECT 40 KEY , 'A' TYPE , 12.34 VALUE FROM DUAL UNION
SELECT 41 KEY , 'A' TYPE , 10.24 VALUE FROM DUAL UNION
SELECT 41 KEY , 'B' TYPE , 12.89 VALUE FROM DUAL
)
select
TYPE ,
wm_concat(KEY) KEY ,
wm_concat(VALUE) VALUE
from data
GROUP BY TYPE;
type KEY VALUE
------ ------- -----------
A 40,41 12.34,10.24
B 41 12.89
This is based on wm_concat as shown here: http://www.oracle-base.com/articles/misc/StringAggregationTechniques.php
I'm going to leave this here just in case it helps, but I think PIVOT or MikeyByCrikey's answers would best suit your needs after re-looking at your sample results.