Convert one row into multiple rows with fewer columns - SQL

I'd like to convert single rows into multiple rows in PostgreSQL, where some of the columns are removed. Here's an example of the current output:
name | st | ot | dt |
-----|----|----|----|
Fred | 8 | 2 | 3 |
Jane | 8 | 1 | 0 |
Samm | 8 | 0 | 6 |
Alex | 8 | 0 | 0 |
Using the following query:
SELECT name, st, ot, dt
FROM   times;
And here's what I want:
name | t | val |
-----|----|-----|
Fred | st | 8 |
Fred | ot | 2 |
Fred | dt | 3 |
Jane | st | 8 |
Jane | ot | 1 |
Samm | st | 8 |
Samm | dt | 6 |
Alex | st | 8 |
How can I modify the query to get the above desired output?

SELECT times.name, x.t, x.val
FROM   times
CROSS  JOIN LATERAL (VALUES ('st', st), ('ot', ot), ('dt', dt)) AS x(t, val)
WHERE  x.val <> 0;

The core problem is the reverse of a pivot / crosstab operation, sometimes called "unpivot".
Basically, Abelisto's query is the way to go in Postgres 9.3 or later. Related:
SELECT DISTINCT on multiple columns
You may want to use LEFT JOIN LATERAL ... ON u.val <> 0 to include names without valid values in the result (and shorten the syntax a bit), as sketched below.
What is the difference between LATERAL JOIN and a subquery in PostgreSQL?
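A minimal sketch of that LEFT JOIN LATERAL variant, using the same times table (names whose values are all 0 keep a single row, with NULL in t and val):
SELECT t.name, u.t, u.val
FROM   times t
LEFT   JOIN LATERAL (VALUES ('st', t.st), ('ot', t.ot), ('dt', t.dt)) u(t, val)
       ON u.val <> 0;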
If you have more than a few value columns (or varying lists of columns) you may want to use a function to build and execute the query automatically:
CREATE OR REPLACE FUNCTION f_unpivot_columns(VARIADIC _cols text[])
  RETURNS TABLE(name text, t text, val int)
  LANGUAGE plpgsql AS
$func$
BEGIN
   RETURN QUERY EXECUTE (
      SELECT 'SELECT t.name, u.t, u.val
              FROM   times t
              LEFT   JOIN LATERAL (VALUES '
          || string_agg(format('(%L, t.%I)', c, c), ', ')
          || ') u(t, val) ON (u.val <> 0)'
      FROM   unnest(_cols) c
      );
END
$func$;
Call:
SELECT * FROM f_unpivot_columns(VARIADIC '{st, ot, dt}');
Or:
SELECT * FROM f_unpivot_columns('ot', 'dt');
Column names are provided as string literals and must be in correct (case-sensitive!) spelling with no extra double-quotes. See:
Are PostgreSQL column names case-sensitive?
Related with more examples and explanation:
How to unpivot a table in PostgreSQL

One way:
with times(name, st, ot, dt) as (
  select 'Fred', 8, 2, 3 union all
  select 'Jane', 8, 1, 0 union all
  select 'Samm', 8, 0, 6 union all
  select 'Alex', 8, 0, 0
)
select name, key as t, value::int as val
from (
  select name, json_build_object('st', st, 'ot', ot, 'dt', dt) as j
  from   times
) t
join lateral json_each_text(j) on true
where value <> '0'
-- order by name, case when key = 'st' then 0 when key = 'ot' then 1 when key = 'dt' then 2 end
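The same idea works with jsonb in Postgres 9.5 or later; only the function names change (jsonb_build_object() and jsonb_each_text()):
select name, key as t, value::int as val
from (
  select name, jsonb_build_object('st', st, 'ot', ot, 'dt', dt) as j
  from   times
) t
join lateral jsonb_each_text(j) on true
where value <> '0'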


SQL Match group of records to another group of records

Is there a SQL statement to match up multiple records to an exact match of multiple records in another table?
Let's say I have table A:
ID | List# | Item
1 | 5 | A
2 | 5 | C
3 | 5 | B
4 | 6 | A
5 | 6 | D
*I purposely listed the items 'ABC' out of order, as the records I receive may arrive in any order.
Table B
ID | Group | Item
1 | AAA | A
2 | AAA | B
3 | AAA | C
4 | AAA | D
5 | BBB | A
6 | BBB | B
7 | BBB | C
8 | DDD | A
Looking at the first table, I would want List# 5 to return a match only for group 'BBB', as all three records (and only those) match.
The simplest way is to aggregate into a string or array and join. Standard SQL supports listagg(), so you can do:
select a.list, b.grp, a.items
from (select list, listagg(item, ',') within group (order by item) as items
      from a
      group by list
     ) a join
     (select grp, listagg(item, ',') within group (order by item) as items
      from b
      group by grp
     ) b
     on a.items = b.items;
Not all databases support listagg(), but many have similar functionality, and this is simpler than the "standard" SQL approach.
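In PostgreSQL, for example, the equivalent aggregate is string_agg(); a sketch against the same tables (with table B's group column called grp, since GROUP is a reserved word):
select a.list, b.grp, a.items
from (select list, string_agg(item, ',' order by item) as items
      from a
      group by list
     ) a join
     (select grp, string_agg(item, ',' order by item) as items
      from b
      group by grp
     ) b
     on a.items = b.items;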
You can simulate relational division. It's a little bit cumbersome, but here it is:
with
x as (
  select item
  from a
  where a.list = 5
),
y as (
  select b.grp, count(*) as cnt
  from b
  join x on x.item = b.item
  group by b.grp
)
select y.grp
from y
where y.cnt = (select count(*) from x)                     -- the group covers every item in the list
  and y.cnt = (select count(*) from b where b.grp = y.grp) -- ...and contains no extra items

Redshift create all the combinations of any length for the values in one column

How can we create all the combinations of any length for the values in one column and return the distinct count of another column for that combination?
Table:
+------+--------+
| Type | Name |
+------+--------+
| A | Tom |
| A | Ben |
| B | Ben |
| B | Justin |
| C | Ben |
+------+--------+
Output Table:
+-------------+-------+
| Combination | Count |
+-------------+-------+
| A | 2 |
| B | 2 |
| C | 1 |
| AB | 3 |
| BC | 2 |
| AC | 2 |
| ABC | 3 |
+-------------+-------+
When the combination is only A, there are Tom and Ben so it's 2.
When the combination is only B, 2 distinct names so it's 2.
When the combination is A and B, 3 distinct names: Tom, Ben, Justin so it's 3.
I'm working in Amazon Redshift. Thank you!
NOTE: This answers the original version of the question which was tagged Postgres.
You can generate all combinations with this code:
with recursive td as (
      select distinct type
      from t
     ),
     cte as (
      select td.type, td.type as lasttype, 1 as len
      from td
      union all
      select cte.type || td.type, td.type as lasttype, cte.len + 1
      from cte join
           td
           on td.type > cte.lasttype
     )
You can then use this in a join:
with recursive td as (
      select distinct type
      from t
     ),
     cte as (
      select td.type, td.type as lasttype, 1 as len
      from td
      union all
      select cte.type || td.type, td.type as lasttype, cte.len + 1
      from cte join
           td
           on td.type > cte.lasttype
     )
select cte.type as combination, count(distinct t.name) as count
from cte join
     t
     on cte.type like '%' || t.type || '%'
group by cte.type
order by cte.type;
There is no way to generate all possible combinations (A, B, C, AB, AC, BC, etc) in Amazon Redshift.
(Well, you could select each unique value, smoosh them into one string, send it to a User-Defined Function, extract the result into multiple rows and then join it against a big query, but that really isn't something you'd like to attempt.)
One approach would be to create a table containing all possible combinations (you'd need to write a little program to do that, eg using itertools in Python). Then you could join the data against that table reasonably easily to get the desired result (eg testing whether 'ABC' LIKE '%A%'), as sketched below.
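A minimal sketch of that join, assuming a hypothetical pre-built table combos(combination) and the source data in a table t(type, name):
-- combos is assumed to hold 'A', 'B', 'C', 'AB', 'AC', 'BC', 'ABC', ...
SELECT c.combination,
       COUNT(DISTINCT t.name) AS count  -- distinct names seen under any type in the combination
FROM   combos c
JOIN   t ON c.combination LIKE '%' || t.type || '%'
GROUP  BY c.combination
ORDER  BY c.combination;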

Oracle SQL Get unique symbols from table

I have a table with descriptions of something. For example:
My_Table
id description
================
1 ABC
2 ABB
3 OPAC
4 APEЧ
I need to get all unique symbols from all the "description" values.
Result should look like that:
symbol
================
A
B
C
O
P
E
Ч
And it should work for all languages, so, as I see it, regular expressions can't help.
Please help me. Thanks.
with cte (c,description_suffix) as
(
select substr(description,1,1)
,substr(description,2)
from mytable
where description is not null
union all
select substr(description_suffix,1,1)
,substr(description_suffix,2)
from cte
where description_suffix is not null
)
select c
,count(*) as cnt
from cte
group by c
order by c
or
with cte(n) as
(
select level
from dual
connect by level <= (select max(length(description)) from mytable)
)
select substr(t.description,c.n,1) as c
,count(*) as cnt
from mytable t
join cte c
on c.n <= length(description)
group by substr(t.description,c.n,1)
order by c
+---+-----+
| C | CNT |
+---+-----+
| A | 4 |
| B | 3 |
| C | 2 |
| E | 1 |
| O | 1 |
| P | 2 |
| Ч | 1 |
+---+-----+
Create a numbers table and populate it with all the relevant ids you'd need (in this case 1 .. the maximum string length):
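A sketch of that one-time setup in Oracle (the table name numbers is an assumption):
CREATE TABLE numbers (id NUMBER PRIMARY KEY);

INSERT INTO numbers (id)
SELECT LEVEL                             -- 1, 2, 3, ... up to the longest description
FROM   dual
CONNECT BY LEVEL <= (SELECT MAX(LENGTH(description)) FROM your_table);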
SELECT DISTINCT
       SUBSTR(your_table.description, numbers.id, 1) AS symbol
FROM
       your_table
INNER JOIN
       numbers
    ON numbers.id >= 1
   AND numbers.id <= LENGTH(your_table.description)
SELECT DISTINCT SUBSTR(ll, LEVEL, 1) AS op  -- each distinct character, one per row, as required
FROM
(
  SELECT LISTAGG(description, '')
         WITHIN GROUP (ORDER BY id) AS ll  -- concatenates all description values into one string, no separator
  FROM My_Table
)
CONNECT BY LEVEL <= LENGTH(ll);

Check if NULL exists in Postgres array

Similar to this question, how can I find if a NULL value exists in an array?
Here are some attempts.
SELECT num, ar, expected,
ar #> ARRAY[NULL]::int[] AS test1,
NULL = ANY (ar) AS test2,
array_to_string(ar, ', ') <> array_to_string(ar, ', ', '(null)') AS test3
FROM (
SELECT 1 AS num, '{1,2,NULL}'::int[] AS ar, true AS expected
UNION SELECT 2, '{1,2,3}'::int[], false
) td ORDER BY num;
num | ar | expected | test1 | test2 | test3
-----+------------+----------+-------+-------+-------
1 | {1,2,NULL} | t | f | | t
2 | {1,2,3} | f | f | | f
(2 rows)
Only a trick with array_to_string shows the expected value. Is there a better way to test this?
Postgres 9.5 or later
Use array_position(). Basically:
SELECT array_position(arr, NULL) IS NOT NULL AS array_has_null
See demo below.
Postgres 9.3 or later
You can test with the built-in functions array_remove() or array_replace().
Postgres 9.1 or any version
If you know a single element that can never exist in your arrays, you can use this fast expression. Say, you have an array of positive numbers, and -1 can never be in it:
-1 = ANY(arr) IS NULL
(= ANY returns true on a match, false if no element matches and none is NULL, and NULL if there is no match but at least one compared element is NULL.)
Related answer with detailed explanation:
Is array all NULLs in PostgreSQL
If you cannot be absolutely sure, you could fall back to one of the expensive but safe methods with unnest(). Like:
(SELECT bool_or(x IS NULL) FROM unnest(arr) x)
or:
EXISTS (SELECT 1 FROM unnest(arr) x WHERE x IS NULL)
But you can have fast and safe with a CASE expression. Use an unlikely number and fall back to the safe method if it should exist. You may want to treat the case arr IS NULL separately. See demo below.
Demo
SELECT num, arr, expect
, -1 = ANY(arr) IS NULL AS t_1 -- 50 ms
, (SELECT bool_or(x IS NULL) FROM unnest(arr) x) AS t_2 -- 754 ms
, EXISTS (SELECT 1 FROM unnest(arr) x WHERE x IS NULL) AS t_3 -- 521 ms
, CASE -1 = ANY(arr)
WHEN FALSE THEN FALSE
WHEN TRUE THEN EXISTS (SELECT 1 FROM unnest(arr) x WHERE x IS NULL)
ELSE NULLIF(arr IS NOT NULL, FALSE) -- catch arr IS NULL -- 55 ms
-- ELSE TRUE -- simpler for columns defined NOT NULL -- 51 ms
END AS t_91
, array_replace(arr, NULL, 0) <> arr AS t_93a -- 99 ms
, array_remove(arr, NULL) <> arr AS t_93b -- 96 ms
, cardinality(array_remove(arr, NULL)) <> cardinality(arr) AS t_94 -- 81 ms
, COALESCE(array_position(arr, NULL::int), 0) > 0 AS t_95a -- 49 ms
, array_position(arr, NULL) IS NOT NULL AS t_95b -- 45 ms
, CASE WHEN arr IS NOT NULL
THEN array_position(arr, NULL) IS NOT NULL END AS t_95c -- 48 ms
FROM (
VALUES (1, '{1,2,NULL}'::int[], true) -- extended test case
, (2, '{-1,NULL,2}' , true)
, (3, '{NULL}' , true)
, (4, '{1,2,3}' , false)
, (5, '{-1,2,3}' , false)
, (6, NULL , null)
) t(num, arr, expect);
Result:
num | arr | expect | t_1 | t_2 | t_3 | t_91 | t_93a | t_93b | t_94 | t_95a | t_95b | t_95c
-----+-------------+--------+--------+------+-----+------+-------+-------+------+-------+-------+-------
1 | {1,2,NULL} | t | t | t | t | t | t | t | t | t | t | t
2 | {-1,NULL,2} | t | f --!! | t | t | t | t | t | t | t | t | t
3 | {NULL} | t | t | t | t | t | t | t | t | t | t | t
4 | {1,2,3} | f | f | f | f | f | f | f | f | f | f | f
5 | {-1,2,3} | f | f | f | f | f | f | f | f | f | f | f
6 | NULL | NULL | t --!! | NULL | f | NULL | NULL | NULL | NULL | f | f | NULL
Note that array_remove() and array_position() are not allowed for multi-dimensional arrays. All expressions from t_93a onward only work for 1-dimensional arrays.
Benchmark setup
The added times are from a benchmark test with 200k rows in Postgres 9.5. This is my setup:
CREATE TABLE t AS
SELECT row_number() OVER() AS num
, array_agg(elem) AS arr
, bool_or(elem IS NULL) AS expected
FROM (
SELECT CASE WHEN random() > .95 THEN NULL ELSE g END AS elem -- 5% NULL VALUES
, count(*) FILTER (WHERE random() > .8)
OVER (ORDER BY g) AS grp -- avg 5 element per array
FROM generate_series (1, 1000000) g -- increase for big test case
) sub
GROUP BY grp;
Function wrapper
For repeated use, I would create a function in Postgres 9.5 like this:
CREATE OR REPLACE FUNCTION f_array_has_null (anyarray)
RETURNS bool
LANGUAGE sql IMMUTABLE PARALLEL SAFE AS
'SELECT array_position($1, NULL) IS NOT NULL';
PARALLEL SAFE only for Postgres 9.6 or later.
Using a polymorphic input type this works for any array type, not just int[].
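For example (ad-hoc calls; both array types go through the same function):
SELECT f_array_has_null('{1,2,NULL}'::int[]);  -- true
SELECT f_array_has_null('{a,b,c}'::text[]);    -- false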
Make it IMMUTABLE to allow performance optimization and index expressions.
Does PostgreSQL support "accent insensitive" collations?
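Being IMMUTABLE also allows the function in an index expression; a minimal sketch, assuming a hypothetical table tbl with an int[] column arr:
CREATE INDEX tbl_arr_has_null_idx ON tbl (f_array_has_null(arr));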
But don't make it STRICT, which would disable "function inlining" and impair performance because array_position() is not STRICT itself. See:
Function executes faster without STRICT modifier?
If you need to catch the case arr IS NULL:
CREATE OR REPLACE FUNCTION f_array_has_null (anyarray)
RETURNS bool
LANGUAGE sql IMMUTABLE PARALLEL SAFE AS
'SELECT CASE WHEN $1 IS NOT NULL
THEN array_position($1, NULL) IS NOT NULL END';
For Postgres 9.1 use the t_91 expression from above. The rest applies unchanged.
Closely related:
How to determine if NULL is contained in an array in Postgres?
PostgreSQL's unnest() function is a better choice. You can write a simple function like the one below to check an array for NULL values.
create or replace function NULL_EXISTS(val anyarray) returns boolean as
$$
  select exists (
    select 1 from unnest(val) arr(el) where el is null
  );
$$
language sql;
For example,
SELECT NULL_EXISTS(array [1,2,NULL])
,NULL_EXISTS(array [1,2,3]);
Result:
 null_exists | null_exists
-------------+-------------
 t           | f
So you can use the NULL_EXISTS() function in your query like below.
SELECT num, ar, expected,NULL_EXISTS(ar)
FROM (
SELECT 1 AS num, '{1,2,NULL}'::int[] AS ar, true AS expected
UNION SELECT 2, '{1,2,3}'::int[], false
) td ORDER BY num;
PostgreSQL 9.5 (I know you specified 9.1, but anyway) has the array_position() function to do just what you want, without having to use the horribly inefficient unnest() for something as trivial as this (see test4):
patrick#puny:~$ psql -d test
psql (9.5.0)
Type "help" for help.
test=# SELECT num, ar, expected,
ar #> ARRAY[NULL]::int[] AS test1,
NULL = ANY (ar) AS test2,
array_to_string(ar, ', ') <> array_to_string(ar, ', ', '(null)') AS test3,
coalesce(array_position(ar, NULL::int), 0) > 0 AS test4
FROM (
SELECT 1 AS num, '{1,2,NULL}'::int[] AS ar, true AS expected
UNION SELECT 2, '{1,2,3}'::int[], false
) td ORDER BY num;
num | ar | expected | test1 | test2 | test3 | test4
-----+------------+----------+-------+-------+-------+-------
1 | {1,2,NULL} | t | f | | t | t
2 | {1,2,3} | f | f | | f | f
(2 rows)
I use this:
select
array_position(array[1,null], null) is not null
array_position returns the subscript of the first occurrence of the second argument in the array, starting at the element indicated by the third argument or at the first element (the array must be one-dimensional).

Subtract data from a single column

I have a database table with the columns piece, diff, and type.
Here's what the table looks like
id | piece | diff | type
1 | 20 | NULL | cake
2 | 15 | NULL | cake
3 | 10 | NULL | cake
I want 20 - 15 = 5, then 15 - 10 = 5, and so on, with type as the grouping condition.
The result should look like this:
id | piece | diff | type
1 | 20 | 0 | cake
2 | 15 | 5 | cake
3 | 10 | 5 | cake
Here's the code I have so far, but I don't think I'm on the right track:
SELECT
    tableblabla.id,
    (cast(tableblabla.piece as decimal(7, 2)) - cast(t.piece as decimal(7, 2))) as diff
FROM
    tableblabla
INNER JOIN
    tableblabla as t ON tableblabla.id = t.id + 1
Thanks for the help
Use the LAG/LEAD window function.
This assumes you want the difference per type; otherwise remove PARTITION BY from the window function.
select id, piece,
       isnull(lag(piece) over(partition by type order by id) - piece, 0) as diff,
       type
from   yourtable
If you are using SQL Server prior to 2012, use this:
;WITH cte
 AS (SELECT Row_number() OVER(PARTITION BY type ORDER BY id) rn, *
     FROM   yourtable)
SELECT a.id,
       a.piece,
       Isnull(b.piece - a.piece, 0) AS diff,
       a.type
FROM   cte a
       LEFT JOIN cte b
              ON a.rn = b.rn + 1
             AND a.type = b.type  -- row numbers restart per type, so match the type too
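Outside SQL Server, the same LAG() approach works with the standard COALESCE() in place of ISNULL(); a sketch (same hypothetical yourtable):
SELECT id, piece,
       COALESCE(LAG(piece) OVER (PARTITION BY type ORDER BY id) - piece, 0) AS diff,
       type
FROM   yourtable;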