Grouping by column and rows


I have a table like this:
+----+--------------+--------+----------+
| id | name | weight | some_key |
+----+--------------+--------+----------+
| 1 | strawberries | 12 | 1 |
| 2 | blueberries | 7 | 1 |
| 3 | elderberries | 0 | 1 |
| 4 | cranberries | 8 | 2 |
| 5 | raspberries | 18 | 2 |
+----+--------------+--------+----------+
I'm looking for a generic query that returns all berries for which there are exactly three entries with the same some_key and at least one of those three entries has weight = 0.
For the sample table, the expected output would be:
1 strawberries
2 blueberries
3 elderberries

As you want to include non-grouped columns, I would approach this with window functions:
select id, name
from (
select id,
name,
count(*) over w as key_count,
count(*) filter (where weight = 0) over w as num_zero_weight
from fruits
window w as (partition by some_key)
) x
where x.key_count = 3
and x.num_zero_weight >= 1
The count(*) over w counts the number of rows in that group (= partition) and the count(*) filter (where weight = 0) over w counts how many of those have a weight of zero.
The window w as ... avoids repeating the same partition by clause for the window functions.
Online example: https://rextester.com/SGWFI49589
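To try the window-function approach without a PostgreSQL server, here is a minimal sketch using Python's built-in sqlite3 module. It assumes SQLite 3.25 or later (for window functions) and replaces the count(*) filter (...) expression with a SUM(CASE ...) so that it also runs on SQLite versions without FILTER support; the function name berries_in_matching_groups is made up for this example.

```python
import sqlite3

# Sample rows copied from the question; the table name "fruits" follows the answer.
ROWS = [
    (1, "strawberries", 12, 1),
    (2, "blueberries", 7, 1),
    (3, "elderberries", 0, 1),
    (4, "cranberries", 8, 2),
    (5, "raspberries", 18, 2),
]


def berries_in_matching_groups(rows):
    """Return (id, name) for rows whose some_key group has 3 rows and a zero weight."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE fruits (id INT, name TEXT, weight INT, some_key INT)")
    con.executemany("INSERT INTO fruits VALUES (?, ?, ?, ?)", rows)
    query = """
        SELECT id, name
        FROM (
            SELECT id, name,
                   COUNT(*) OVER (PARTITION BY some_key) AS key_count,
                   -- stand-in for: count(*) filter (where weight = 0) over w
                   SUM(CASE WHEN weight = 0 THEN 1 ELSE 0 END)
                       OVER (PARTITION BY some_key) AS num_zero_weight
            FROM fruits
        ) AS x
        WHERE key_count = 3
          AND num_zero_weight >= 1
        ORDER BY id
    """
    result = con.execute(query).fetchall()
    con.close()
    return result


print(berries_in_matching_groups(ROWS))
```

With the sample data this prints the three rows of some_key 1 (strawberries, blueberries, elderberries).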

Try this-
SELECT some_key,
SUM(weight) --Sample aggregations on column
FROM your_table
GROUP BY some_key
HAVING COUNT(*) = 3 -- If you want at least 3, use >= 3
AND SUM(CASE WHEN weight = 0 THEN 1 ELSE 0 END) >= 1
Following your edited question, you can try the query below:
SELECT id, name
FROM your_table
WHERE some_key IN (
SELECT some_key
FROM your_table
GROUP BY some_key
HAVING COUNT(*) = 3 -- If you want at least 3, use >= 3
AND SUM(CASE WHEN weight = 0 THEN 1 ELSE 0 END) >= 1
)
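The subquery version can be checked against a few edge cases with a small sqlite3 sketch. The rows for some_key 3 and 4 are hypothetical additions (they are not in the question) that cover "three rows but no zero weight" and "a zero weight but four rows"; the function name is also made up.

```python
import sqlite3

ROWS = [
    # rows from the question
    (1, "strawberries", 12, 1),
    (2, "blueberries", 7, 1),
    (3, "elderberries", 0, 1),
    (4, "cranberries", 8, 2),
    (5, "raspberries", 18, 2),
    # hypothetical rows: three entries, none with weight 0
    (6, "gooseberries", 3, 3),
    (7, "blackberries", 4, 3),
    (8, "mulberries", 5, 3),
    # hypothetical rows: a zero weight, but four entries
    (9, "bilberries", 0, 4),
    (10, "lingonberries", 1, 4),
    (11, "boysenberries", 2, 4),
    (12, "cloudberries", 3, 4),
]


def select_rows_in_matching_groups(rows):
    """Run the GROUP BY / HAVING subquery from the answer on an in-memory table."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE your_table (id INT, name TEXT, weight INT, some_key INT)")
    con.executemany("INSERT INTO your_table VALUES (?, ?, ?, ?)", rows)
    query = """
        SELECT id, name
        FROM your_table
        WHERE some_key IN (
            SELECT some_key
            FROM your_table
            GROUP BY some_key
            HAVING COUNT(*) = 3
               AND SUM(CASE WHEN weight = 0 THEN 1 ELSE 0 END) >= 1
        )
        ORDER BY id
    """
    result = con.execute(query).fetchall()
    con.close()
    return result


print(select_rows_in_matching_groups(ROWS))
```

Only the rows of some_key 1 come back; switching the HAVING condition to COUNT(*) >= 3 would also return the four hypothetical rows of some_key 4.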

Try doing this.
Table structure and sample data
CREATE TABLE tmp (
id int,
name varchar(50),
weight int,
some_key int
);
INSERT INTO tmp
VALUES
('1', 'strawberries', '12', '1'),
('2', 'blueberries', '7', '1'),
('3', 'elderberries', '0', '1'),
('4', 'cranberries', '8', '2'),
('5', 'raspberries', '18', '2');
Query
SELECT t1.*
FROM tmp t1
INNER JOIN (SELECT some_key
FROM tmp
GROUP BY some_key
HAVING Count(some_key) >= 3
AND Min(Abs(weight)) = 0) t2
ON t1.some_key = t2.some_key;
Output
+-----+---------------+---------+----------+
| id | name | weight | some_key |
+-----+---------------+---------+----------+
| 1 | strawberries | 12 | 1 |
| 2 | blueberries | 7 | 1 |
| 3 | elderberries | 0 | 1 |
+-----+---------------+---------+----------+
Online Demo: http://sqlfiddle.com/#!15/70cca/26/0
Thank you, @mkRabbani, for reminding me about the negative values.

Related

Postgresql - Looping through array_agg

I have a table from which I need to calculate the number of times intent_level changes for each id.
Sample Table format :
id | start_time | intent_level
----+------------+--------------
1 | 2 | status
1 | 3 | status
1 | 1 |
1 | 4 | category
2 | 5 | status
2 | 8 |
2 | 7 | status
I couldn't figure out how to loop through the result of array_agg and compare consecutive elements. This is my attempt so far:
select
id,
array_agg (intent_level ORDER BY start_time)
FROM temp.chats
GROUP BY id;
which gives output :
id | array_agg
----+-----------------------------
1 | {"",status,status,category}
2 | {status,status,""}
Desired output is :
id | array_agg
----+-----------------------------
1 | 2
2 | 1
For id 1 the result is 2, since the value changes from "" to status (rows 1 to 2) and from status to category (rows 3 to 4).
For id 2 the result is 1, since the value changes from status to "" (rows 2 to 3).
CREATE AND INSERT QUERIES :
create table temp.chats (
id varchar(5),
start_time varchar(5),
intent_level varchar(20)
);
insert into temp.chats values
('1', '2', 'status'),
('1', '3', 'status'),
('1', '1', ''),
('1', '4', 'category'),
('2', '5', 'status'),
('2', '8', ''),
('2', '7', 'status');

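Use lag() and aggregate. The third argument of lag() makes the first row of each id compare equal to itself; without it, the NULL returned for the first row would count as a change:
select id, count(*)
from (select c.*,
lag(intent_level, 1, intent_level) over (partition by id order by start_time) as prev_intent_level
from temp.chats c
) c
where prev_intent_level is distinct from intent_level
group by id;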
Arrays seem quite unnecessary for this.
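A minimal sqlite3 sketch of the lag() idea, run against the sample rows from the question. Two things differ from the PostgreSQL query and are assumptions of this sketch: SQLite's null-safe IS NOT stands in for IS DISTINCT FROM, and lag() is given the current value as its default so that the first row of each id is not counted as a change. The table is called chats (no temp schema) and the function name is made up.

```python
import sqlite3

# Sample rows from the question (all columns are text there).
CHATS = [
    ("1", "2", "status"),
    ("1", "3", "status"),
    ("1", "1", ""),
    ("1", "4", "category"),
    ("2", "5", "status"),
    ("2", "8", ""),
    ("2", "7", "status"),
]


def count_intent_changes(rows):
    """Return {id: number of intent_level changes}, ordered by start_time."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE chats (id TEXT, start_time TEXT, intent_level TEXT)")
    con.executemany("INSERT INTO chats VALUES (?, ?, ?)", rows)
    query = """
        SELECT id, COUNT(*)
        FROM (
            SELECT id, intent_level,
                   -- default = current value, so the first row never differs
                   LAG(intent_level, 1, intent_level)
                       OVER (PARTITION BY id ORDER BY start_time) AS prev_intent_level
            FROM chats
        ) AS c
        WHERE prev_intent_level IS NOT intent_level  -- null-safe "differs from"
        GROUP BY id
        ORDER BY id
    """
    result = dict(con.execute(query).fetchall())
    con.close()
    return result


print(count_intent_changes(CHATS))
```

For the sample data this gives 2 changes for id 1 and 1 change for id 2, matching the desired output. An id whose value never changes does not appear in the result at all.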

Possible to use a column name in a UDF in SQL?

I have a query in which a series of steps is repeated constantly over different columns, for example:
SELECT DISTINCT
MAX (
CASE
WHEN table_2."GRP1_MINIMUM_DATE" <= cohort."ANCHOR_DATE" THEN 1
ELSE 0
END)
OVER (PARTITION BY cohort."USER_ID")
AS "GRP1_MINIMUM_DATE",
MAX (
CASE
WHEN table_2."GRP2_MINIMUM_DATE" <= cohort."ANCHOR_DATE" THEN 1
ELSE 0
END)
OVER (PARTITION BY cohort."USER_ID")
AS "GRP2_MINIMUM_DATE"
FROM INPUT_COHORT cohort
LEFT JOIN INVOLVE_EVER table_2 ON cohort."USER_ID" = table_2."USER_ID"
I was considering writing a function to accomplish this as doing so would save on space in my query. I have been reading a bit about UDF in SQL but don't yet understand if it is possible to pass a column name in as a parameter (i.e. simply switch out "GRP1_MINIMUM_DATE" for "GRP2_MINIMUM_DATE" etc.). What I would like is a query which looks like this
SELECT DISTINCT
FUNCTION(table_2."GRP1_MINIMUM_DATE") AS "GRP1_MINIMUM_DATE",
FUNCTION(table_2."GRP2_MINIMUM_DATE") AS "GRP2_MINIMUM_DATE",
FUNCTION(table_2."GRP3_MINIMUM_DATE") AS "GRP3_MINIMUM_DATE",
FUNCTION(table_2."GRP4_MINIMUM_DATE") AS "GRP4_MINIMUM_DATE"
FROM INPUT_COHORT cohort
LEFT JOIN INVOLVE_EVER table_2 ON cohort."USER_ID" = table_2."USER_ID"
Can anyone tell me if this is possible/point me to some resource that might help me out here?
Thanks!

There is no direct way to do this, as @Tejash already stated, but your database model does not look ideal: it would be better to have a table with USER_ID and GRP_ID as keys and MINIMUM_DATE as a separate field.
Without changing the table structure, you can use an UNPIVOT query to mimic this design:
WITH INVOLVE_EVER(USER_ID, GRP1_MINIMUM_DATE, GRP2_MINIMUM_DATE, GRP3_MINIMUM_DATE, GRP4_MINIMUM_DATE)
AS (SELECT 1, SYSDATE, SYSDATE, SYSDATE, SYSDATE FROM dual UNION ALL
SELECT 2, SYSDATE-1, SYSDATE-2, SYSDATE-3, SYSDATE-4 FROM dual)
SELECT *
FROM INVOLVE_EVER
unpivot ( minimum_date FOR grp_id IN ( GRP1_MINIMUM_DATE AS 1, GRP2_MINIMUM_DATE AS 2, GRP3_MINIMUM_DATE AS 3, GRP4_MINIMUM_DATE AS 4))
Result:
| USER_ID | GRP_ID | MINIMUM_DATE |
|---------|--------|--------------|
| 1 | 1 | 09/09/19 |
| 1 | 2 | 09/09/19 |
| 1 | 3 | 09/09/19 |
| 1 | 4 | 09/09/19 |
| 2 | 1 | 09/08/19 |
| 2 | 2 | 09/07/19 |
| 2 | 3 | 09/06/19 |
| 2 | 4 | 09/05/19 |
With this you can write your query without further code duplication and, if needed, use the PIVOT syntax to get one row per USER_ID.
The final query could then look like this:
WITH INVOLVE_EVER(USER_ID, GRP1_MINIMUM_DATE, GRP2_MINIMUM_DATE, GRP3_MINIMUM_DATE, GRP4_MINIMUM_DATE)
AS (SELECT 1, SYSDATE, SYSDATE, SYSDATE, SYSDATE FROM dual UNION ALL
SELECT 2, SYSDATE-1, SYSDATE-2, SYSDATE-3, SYSDATE-4 FROM dual)
, INPUT_COHORT(USER_ID, ANCHOR_DATE)
AS (SELECT 1, SYSDATE-1 FROM dual UNION ALL
SELECT 2, SYSDATE-2 FROM dual UNION ALL
SELECT 3, SYSDATE-3 FROM dual)
-- Above is sample data; the query starts from here:
, unpiv AS (SELECT *
FROM INVOLVE_EVER
unpivot ( minimum_date FOR grp_id IN ( GRP1_MINIMUM_DATE AS 1, GRP2_MINIMUM_DATE AS 2, GRP3_MINIMUM_DATE AS 3, GRP4_MINIMUM_DATE AS 4)))
SELECT qcsj_c000000001000000 user_id, GRP1_MINIMUM_DATE, GRP2_MINIMUM_DATE, GRP3_MINIMUM_DATE, GRP4_MINIMUM_DATE
FROM INPUT_COHORT cohort
LEFT JOIN unpiv table_2
ON cohort.USER_ID = table_2.USER_ID
pivot (MAX(CASE WHEN minimum_date <= cohort."ANCHOR_DATE" THEN 1 ELSE 0 END) AS MINIMUM_DATE
FOR grp_id IN (1 AS GRP1,2 AS GRP2,3 AS GRP3,4 AS GRP4))
Result:
| USER_ID | GRP1_MINIMUM_DATE | GRP2_MINIMUM_DATE | GRP3_MINIMUM_DATE | GRP4_MINIMUM_DATE |
|---------|-------------------|-------------------|-------------------|-------------------|
| 3 | | | | |
| 1 | 0 | 0 | 0 | 0 |
| 2 | 0 | 1 | 1 | 1 |
This way you only have to write your calculation logic once (see line starting with pivot).
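For readers without an Oracle instance, the same "unpivot, compare once, pivot back" idea can be sketched with Python's sqlite3 module. SQLite has no UNPIVOT or PIVOT, so this sketch swaps in UNION ALL for the unpivot step and conditional aggregation for the pivot step. Dates are replaced by plain day numbers (0 stands in for SYSDATE), and the shortened column names grp1 to grp4 are assumptions of the example.

```python
import sqlite3


def flags_per_user():
    """Return one row per cohort user with a 0/1 flag per group (None if no match)."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE involve_ever (user_id INT, grp1 INT, grp2 INT, grp3 INT, grp4 INT);
        CREATE TABLE input_cohort (user_id INT, anchor_date INT);
        -- same shape as the sample data above, with day 0 standing in for SYSDATE
        INSERT INTO involve_ever VALUES (1, 0, 0, 0, 0), (2, -1, -2, -3, -4);
        INSERT INTO input_cohort VALUES (1, -1), (2, -2), (3, -3);
    """)
    query = """
        WITH unpiv AS (              -- UNION ALL plays the role of UNPIVOT
            SELECT user_id, 1 AS grp_id, grp1 AS minimum_date FROM involve_ever
            UNION ALL SELECT user_id, 2, grp2 FROM involve_ever
            UNION ALL SELECT user_id, 3, grp3 FROM involve_ever
            UNION ALL SELECT user_id, 4, grp4 FROM involve_ever
        ),
        flagged AS (                 -- the comparison is written only once
            SELECT c.user_id, u.grp_id,
                   u.minimum_date <= c.anchor_date AS flag
            FROM input_cohort AS c
            LEFT JOIN unpiv AS u ON u.user_id = c.user_id
        )
        SELECT user_id,              -- conditional aggregation plays the role of PIVOT
               MAX(CASE WHEN grp_id = 1 THEN flag END) AS grp1_minimum_date,
               MAX(CASE WHEN grp_id = 2 THEN flag END) AS grp2_minimum_date,
               MAX(CASE WHEN grp_id = 3 THEN flag END) AS grp3_minimum_date,
               MAX(CASE WHEN grp_id = 4 THEN flag END) AS grp4_minimum_date
        FROM flagged
        GROUP BY user_id
        ORDER BY user_id
    """
    result = con.execute(query).fetchall()
    con.close()
    return result


for row in flags_per_user():
    print(row)
```

The flags match the result table above: all zeros for user 1, 0/1/1/1 for user 2, and empty values for user 3, who has no row in involve_ever.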

Sql query to partition and sum the records grouping by their bill number and Product code

Below are two tables. Table A contains parent bill numbers such as 1, 4 and 8, whose ref is empty (NULL). Each parent is referenced by one or more child bill numbers; for example, parent bill 1 is referenced by child bills 2, 3 and 6.
Table B also has the bill no column, plus a prod code that is either an actual service (ST values) or an associated service (SV values). SV costs are additional costs on top of an ST.
The same ST may occur under multiple bill numbers; only the bill number is unique.
For example, ST1 appears in bill numbers 1 and 8. The same SV may also reference the same or different STs.
SV1, SV2 and SV3 reference ST1 of bill no. 1, and SV2 and SV4 reference ST2 of bill no. 4.
How can we get below expected output?
Table A:
| bill no | ref |
+----------------------------------------+
| 1 | |
| 2 | 1 |
| 3 | 1 |
| 4 | |
| 5 | 4 |
| 6 | 1 |
| 7 | 4 |
| 8 | |
| 9 | 8 |
Table B:
| bill no | Prod code | cost |
+---------+-----------+------+
| 1 | ST1 | 10 |
| 2 | SV1 | 20 |
| 3 | SV2 | 30 |
| 4 | ST2 | 10 |
| 5 | SV2 | 20 |
| 6 | SV3 | 30 |
| 7 | SV4 | 40 |
| 8 | ST1 | 50 |
| 9 | SV1 | 10 |
Expected output:
| bill no | Prod code | ST_cost | SV1 | SV2 | SV3 |
+---------------------------------------------------------------------------------------------+
| 1 | ST1 | 10 | 20 | 30 | 30 |
| 4 | ST2 | 10 | 20 | 40 | |
| 8 | ST1 | 50 | 10 | | |

Here's a script that should get you there:
USE tempdb;
GO
DROP TABLE IF EXISTS dbo.TableA;
CREATE TABLE dbo.TableA
(
BillNumber int NOT NULL PRIMARY KEY,
Reference int NULL
);
GO
INSERT dbo.TableA (BillNumber, Reference)
SELECT *
FROM (VALUES (1,NULL),
(2,1),
(3,1),
(4,NULL),
(5,4),
(6,1),
(7,4),
(8,NULL),
(9,8)) AS a(BillNumber, Reference);
GO
DROP TABLE IF EXISTS dbo.TableB;
CREATE TABLE dbo.TableB
(
BillNumber int NOT NULL PRIMARY KEY,
ProductCode varchar(10) NOT NULL,
Cost int NOT NULL
);
GO
INSERT dbo.TableB (BillNumber, ProductCode, Cost)
SELECT BillNumber, ProductCode, Cost
FROM (VALUES (1, 'ST1', 10),
(2, 'SV1', 20),
(3, 'SV2', 30),
(4, 'ST2', 10),
(5, 'SV2', 20),
(6, 'SV3', 30),
(7, 'SV4', 40),
(8, 'ST1', 50),
(9, 'SV1', 10)) AS b(BillNumber, ProductCode, Cost);
GO
WITH ParentBills
AS
(
SELECT b.BillNumber, b.ProductCode, b.Cost AS STCost
FROM dbo.TableB AS b
INNER JOIN dbo.TableA AS a
ON b.BillNumber = a.BillNumber
WHERE a.Reference IS NULL
),
SubBills
AS
(
SELECT pb.BillNumber, pb.ProductCode, pb.STCost,
b.ProductCode AS ChildProduct, b.Cost AS ChildCost
FROM ParentBills AS pb
INNER JOIN dbo.TableA AS a
ON a.Reference = pb.BillNumber
INNER JOIN dbo.TableB AS b
ON b.BillNumber = a.BillNumber
)
SELECT sb.BillNumber, sb.ProductCode, sb.STCost,
MAX(CASE WHEN sb.ChildProduct = 'SV1' THEN sb.ChildCost END) AS [SV1],
MAX(CASE WHEN sb.ChildProduct = 'SV2' THEN sb.ChildCost END) AS [SV2],
MAX(CASE WHEN sb.ChildProduct = 'SV3' THEN sb.ChildCost END) AS [SV3]
FROM SubBills AS sb
GROUP BY sb.BillNumber, sb.ProductCode, sb.STCost
ORDER BY sb.BillNumber;

You could write a function that builds your query based on the SV numbers, then use EXECUTE IMMEDIATE to run the query string and PIPE ROW to return the result.

I don't understand where the "SV1" value comes from on the second row.
But your problem is basically conditional aggregation:
with ab as (
select a.*, b.productcode, b.cost,
coalesce(a.reference, a.billnumber) as parent_billnumber
from a join
b
on b.billnumber = a.billnumber
)
select parent_billnumber,
max(case when reference is null then productcode end) as st,
sum(case when reference is null then cost end) as st_cost,
sum(case when productcode = 'SV1' then cost end) as sv1,
sum(case when productcode = 'SV2' then cost end) as sv2,
sum(case when productcode = 'SV3' then cost end) as sv3
from ab
group by parent_billnumber
order by parent_billnumber;
Note that this works because you have only one level of child relationships. If there can be more levels, recursive CTEs are needed; in that case I would recommend asking a new question.
The CTE doesn't actually add much to the query, so you can also write:
select coalesce(a.reference, a.billnumber) as parent_billnumber ,
max(case when a.reference is null then productcode end) as st,
sum(case when a.reference is null then b.cost end) as st_cost,
sum(case when b.productcode = 'SV1' then b.cost end) as sv1,
sum(case when b.productcode = 'SV2' then b.cost end) as sv2,
sum(case when b.productcode = 'SV3' then b.cost end) as sv3
from a join
b
on b.billnumber = a.billnumber
group by coalesce(a.reference, a.billnumber)
order by parent_billnumber;
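The shorter query can be run against the sample data with a small sqlite3 sketch. The table names a and b and the column names follow the query above; the function name is made up.

```python
import sqlite3


def parent_bill_summary():
    """Return one row per parent bill: (parent, st, st_cost, sv1, sv2, sv3)."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE a (billnumber INT PRIMARY KEY, reference INT);
        CREATE TABLE b (billnumber INT PRIMARY KEY, productcode TEXT, cost INT);
        INSERT INTO a VALUES (1, NULL), (2, 1), (3, 1), (4, NULL), (5, 4),
                             (6, 1), (7, 4), (8, NULL), (9, 8);
        INSERT INTO b VALUES (1, 'ST1', 10), (2, 'SV1', 20), (3, 'SV2', 30),
                             (4, 'ST2', 10), (5, 'SV2', 20), (6, 'SV3', 30),
                             (7, 'SV4', 40), (8, 'ST1', 50), (9, 'SV1', 10);
    """)
    query = """
        SELECT COALESCE(a.reference, a.billnumber) AS parent_billnumber,
               MAX(CASE WHEN a.reference IS NULL THEN b.productcode END) AS st,
               SUM(CASE WHEN a.reference IS NULL THEN b.cost END) AS st_cost,
               SUM(CASE WHEN b.productcode = 'SV1' THEN b.cost END) AS sv1,
               SUM(CASE WHEN b.productcode = 'SV2' THEN b.cost END) AS sv2,
               SUM(CASE WHEN b.productcode = 'SV3' THEN b.cost END) AS sv3
        FROM a
        JOIN b ON b.billnumber = a.billnumber
        GROUP BY COALESCE(a.reference, a.billnumber)
        ORDER BY parent_billnumber
    """
    result = con.execute(query).fetchall()
    con.close()
    return result


for row in parent_bill_summary():
    print(row)
```

For parent bill 4 this yields an empty SV1 and SV2 = 20, which differs from the expected output in the question; the SV4 cost of 40 has no column of its own in this query.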

How to query the previous record that is in another table?

I have a view that shows something like the following:
View VW
| ID | DT | VAL|
|----|------------|----|
| 1 | 2016-09-01 | 7 |
| 2 | 2016-08-01 | 5 |
| 3 | 2016-07-01 | 8 |
I have a table with historical data that has something like:
Table HIST
| ID | DT | VAL|
|----|------------|----|
| 1 | 2016-06-27 | 4 |
| 1 | 2016-06-29 | 3 |
| 1 | 2016-07-15 | 0 |
| 1 | 2016-09-12 | 8 |
| 2 | 2016-05-05 | 3 |
What I need is to add another column to my view with a boolean that means "the immediately previous record exists in history and has a related value greater than zero".
The expected output is the following:
| ID | DT | VAL| FLAG |
|----|------------|----|------|
| 1 | 2016-09-01 | 7 | false| -- previous is '2016-07-15' and value is zero. '2016-09-12' in hist is greater than '2016-09-01' in view, so it is not the previous
| 2 | 2016-08-01 | 5 | true | -- previous is '2016-05-05' and value is 3
| 3 | 2016-07-01 | 8 | false| -- there is no previous value in HIST table
What have I tried
I've used the query below. It works for small loads, but it performs badly in production because my view is extremely complex and the historical table is too large. Is it possible to write this query without using the view multiple times? (If so, performance should be better and I won't see any more timeouts.)
You can test here http://rextester.com/l/sql_server_online_compiler
create table vw (id int, dt date, val int);
insert into vw values (1, '2016-09-01', 7), (2, '2016-08-01', 5), (3, '2016-07-01', 8);
create table hist (id int, dt date, val int);
insert into hist values (1, '2016-06-27', 4), (1, '2016-06-29', 3), (1, '2016-07-15', 0), (1, '2016-09-12', 8), (2, '2016-05-05', 3);
select vw.id, vw.dt, vw.val, (case when hist_with_flag.flag = 'true' then 'true' else 'false' end)
from vw
left join
(
select hist.id, (case when hist.val > 0 then 'true' else 'false' end) flag
from
(
select hist.id, max(hist.dt) as dt
from hist
inner join vw on vw.id = hist.id
where hist.dt < vw.dt
group by hist.id
) hist_with_max_dt
inner join hist
on hist.id = hist_with_max_dt.id and hist.dt = hist_with_max_dt.dt
) hist_with_flag
on vw.id = hist_with_flag.id

You can use OUTER APPLY in order to get the immediately previous record:
SELECT v.ID, v.DT, v.VAL,
IIF(t.VAL IS NULL OR t.VAL = 0, 'false', 'true') AS FLAG
FROM Vw AS v
OUTER APPLY (
SELECT TOP 1 VAL, DT
FROM Hist AS h
WHERE v.ID = h.ID AND v.DT > h.DT
ORDER BY h.DT DESC) AS t
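OUTER APPLY is specific to SQL Server. As a rough sketch of the same "latest earlier row" lookup, here is the equivalent correlated subquery run through Python's sqlite3 module with the sample data from the question. The swap from OUTER APPLY and TOP 1 to a scalar subquery with LIMIT 1 is an assumption of this example, as is the function name.

```python
import sqlite3


def flag_previous_history():
    """Return (id, dt, val, flag) where flag reflects the latest earlier hist row."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE vw (id INT, dt TEXT, val INT);
        CREATE TABLE hist (id INT, dt TEXT, val INT);
        INSERT INTO vw VALUES (1, '2016-09-01', 7), (2, '2016-08-01', 5),
                              (3, '2016-07-01', 8);
        INSERT INTO hist VALUES (1, '2016-06-27', 4), (1, '2016-06-29', 3),
                                (1, '2016-07-15', 0), (1, '2016-09-12', 8),
                                (2, '2016-05-05', 3);
    """)
    # ISO-formatted date strings compare correctly as text.
    query = """
        SELECT v.id, v.dt, v.val,
               CASE WHEN (SELECT h.val
                          FROM hist AS h
                          WHERE h.id = v.id AND h.dt < v.dt
                          ORDER BY h.dt DESC
                          LIMIT 1) > 0
                    THEN 'true' ELSE 'false' END AS flag
        FROM vw AS v
        ORDER BY v.id
    """
    result = con.execute(query).fetchall()
    con.close()
    return result


for row in flag_previous_history():
    print(row)
```

The view is read only once here, and each lookup touches only the hist rows of one id, so an index on hist (id, dt) is likely to help.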

Can you please try this query? It returns the same result as yours and should perform well:
SELECT vw.id, MAX(vw.dt) dt,
MAX(vw.val) val,
case when MAX(h.val) > 0 then 'true' else 'false' END flag
FROM vw
OUTER APPLY(SELECT MAX(dt) dt FROM hist WHERE vw.id = hist.id
AND dt<vw.dt GROUP BY hist.id) t
LEFT JOIN hist h ON vw.id = h.id AND h.dt = t.dt
GROUP BY vw.id

You can avoid multiple JOIN using a simple CTE with 'ROW_NUMBER'.
;with cte_1
as
(select vw.id, vw.dt, vw.val,hist.val HistVal,hist.dt HistDt,ROW_NUMBER()OVER (PARTITION BY vw.id,vw.dt ORDER BY vw.id,vw.dt,hist.dt desc) RNO
FROM vw
left join hist
on hist.id = vw.id and hist.dt < vw.dt
)
SELECT Id,Dt,Val,case when ISNULL(HistVal,0)=0 THEN 'FALSE' ELSE 'TRUE' END as FLAG
FROM cte_1 WHERE RNO=1

How to pivot or 'merge' rows with column names?

I have the following table:
crit_id | criterium | val1 | val2
----------+------------+-------+--------
1 | T01 | 9 | 9
2 | T02 | 3 | 5
3 | T03 | 4 | 9
4 | T01 | 2 | 3
5 | T02 | 5 | 1
6 | T03 | 6 | 1
I need to convert the values in 'criterium' into columns as a 'cross product' with val1 and val2. So the result has to look like:
T01_val1 |T01_val2 |T02_val1 |T02_val2 | T03_val1 | T03_val2
---------+---------+---------+---------+----------+---------
9 | 9 | 3 | 5 | 4 | 9
2 | 3 | 5 | 1 | 6 | 1
Or to say differently: I need every value for all criteria to be in one row.
This is my current approach:
select
case when criterium = 'T01' then val1 else null end as T01_val1,
case when criterium = 'T01' then val2 else null end as T01_val2,
case when criterium = 'T02' then val1 else null end as T02_val1,
case when criterium = 'T02' then val2 else null end as T02_val2,
case when criterium = 'T03' then val1 else null end as T03_val1,
case when criterium = 'T03' then val2 else null end as T03_val2
from crit_table;
But the result does not look the way I want:
T01_val1 |T01_val2 |T02_val1 |T02_val2 | T03_val1 | T03_val2
---------+---------+---------+---------+----------+---------
9 | 9 | null | null | null | null
null | null | 3 | 5 | null | null
null | null | null | null | 4 | 9
What's the fastest way to achieve my goal?
Bonus question:
I have 77 criteria and seven different kinds of values for every criterium, so I have to write 539 case statements. What's the best way to create them dynamically?
I'm working with PostgreSql 9.4

Prepare for crosstab
To use the crosstab() function, the data must be reorganized. You need a dataset with three columns (row number, criterium, value). To get all values into one column you must unpivot the last two columns, renaming the criteria at the same time. As the row number you can use the rank() function over partitions by the new criteria.
select rank() over (partition by criterium order by crit_id), criterium, val
from (
select crit_id, criterium || '_v1' criterium, val1 val
from crit
union
select crit_id, criterium || '_v2' criterium, val2 val
from crit
) sub
order by 1, 2
rank | criterium | val
------+-----------+-----
1 | T01_v1 | 9
1 | T01_v2 | 9
1 | T02_v1 | 3
1 | T02_v2 | 5
1 | T03_v1 | 4
1 | T03_v2 | 9
2 | T01_v1 | 2
2 | T01_v2 | 3
2 | T02_v1 | 5
2 | T02_v2 | 1
2 | T03_v1 | 6
2 | T03_v2 | 1
(12 rows)
This dataset can be used in crosstab():
create extension if not exists tablefunc;
select * from crosstab($ct$
select rank() over (partition by criterium order by crit_id), criterium, val
from (
select crit_id, criterium || '_v1' criterium, val1 val
from crit
union
select crit_id, criterium || '_v2' criterium, val2 val
from crit
) sub
order by 1, 2
$ct$)
as ct (rank bigint, "T01_v1" int, "T01_v2" int,
"T02_v1" int, "T02_v2" int,
"T03_v1" int, "T03_v2" int);
rank | T01_v1 | T01_v2 | T02_v1 | T02_v2 | T03_v1 | T03_v2
------+--------+--------+--------+--------+--------+--------
1 | 9 | 9 | 3 | 5 | 4 | 9
2 | 2 | 3 | 5 | 1 | 6 | 1
(2 rows)
Alternative solution
For 77 criteria * 7 parameters the above query may be troublesome. If you can accept a slightly different way of presenting the data, the issue becomes much easier.
select * from crosstab($ct$
select
rank() over (partition by criterium order by crit_id),
criterium,
concat_ws(' | ', val1, val2) vals
from crit
order by 1, 2
$ct$)
as ct (rank bigint, "T01" text, "T02" text, "T03" text);
rank | T01 | T02 | T03
------+-------+-------+-------
1 | 9 | 9 | 3 | 5 | 4 | 9
2 | 2 | 3 | 5 | 1 | 6 | 1
(2 rows)

DECLARE @Table1 TABLE
(crit_id int, criterium varchar(3), val1 int, val2 int)
;
INSERT INTO @Table1
(crit_id, criterium, val1, val2)
VALUES
(1, 'T01', 9, 9),
(2, 'T02', 3, 5),
(3, 'T03', 4, 9),
(4, 'T01', 2, 3),
(5, 'T02', 5, 1),
(6, 'T03', 6, 1)
;
select [T01] As [T01_val1], [T01-1] As [T01_val2],
[T02] As [T02_val1], [T02-1] As [T02_val2],
[T03] As [T03_val1], [T03-1] As [T03_val2] from (
select T.criterium,T.val1,ROW_NUMBER()OVER(PARTITION BY T.criterium ORDER BY (SELECT NULL)) RN from (
select criterium, val1 from @Table1
UNION ALL
select criterium+'-'+'1', val2 from @Table1)T)PP
PIVOT (MAX(val1) FOR criterium IN([T01],[T02],[T03],[T01-1],[T02-1],[T03-1]))P

I agree with Michael's comment that this requirement looks a bit weird, but if you really need it that way, you were on the right track with your solution. It just needs a little additional code (and small corrections wherever val_1 and val_2 were mixed up):
select
sum(case when criterium = 'T01' then val_1 else null end) as T01_val1,
sum(case when criterium = 'T01' then val_2 else null end) as T01_val2,
sum(case when criterium = 'T02' then val_1 else null end) as T02_val1,
sum(case when criterium = 'T02' then val_2 else null end) as T02_val2,
sum(case when criterium = 'T03' then val_1 else null end) as T03_val1,
sum(case when criterium = 'T03' then val_2 else null end) as T03_val2
from
crit_table
group by
trunc((crit_id-1)/3.0)
order by
trunc((crit_id-1)/3.0);
This works as follows. To aggregate the result you posted into the result you would like to have, the first helpful observation is that the desired result has fewer rows than your preliminary one. So some kind of grouping is necessary, and the key question is: "What's the grouping criterion?" In this case it is rather non-obvious: it is the criterion ID (minus 1, to start counting at 0) divided by 3 and truncated. The 3 comes from the number of different criteria. Once that puzzle is solved, it is easy to see that among the input rows aggregated into the same result row there is only one non-null value per column. That means the choice of aggregate function is not so important, as it only needs to return the single non-null value. I used sum in my code snippet, but you could just as well use min or max.
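The grouping trick can be checked on the sample data with a short sqlite3 sketch. It uses the column names val1 and val2 from the question and relies on SQLite's integer division, which truncates (crit_id - 1) / 3 just like the trunc() call does; the function name and the extra block column are assumptions added for illustration.

```python
import sqlite3

# Sample rows from the question: (crit_id, criterium, val1, val2).
CRIT_ROWS = [
    (1, "T01", 9, 9),
    (2, "T02", 3, 5),
    (3, "T03", 4, 9),
    (4, "T01", 2, 3),
    (5, "T02", 5, 1),
    (6, "T03", 6, 1),
]


def pivot_by_block(rows):
    """Collapse each run of three crit_ids into one row via conditional aggregation."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE crit_table (crit_id INT, criterium TEXT, val1 INT, val2 INT)")
    con.executemany("INSERT INTO crit_table VALUES (?, ?, ?, ?)", rows)
    query = """
        SELECT (crit_id - 1) / 3 AS block,   -- integer division: ids 1-3 -> 0, 4-6 -> 1
               MAX(CASE WHEN criterium = 'T01' THEN val1 END) AS T01_val1,
               MAX(CASE WHEN criterium = 'T01' THEN val2 END) AS T01_val2,
               MAX(CASE WHEN criterium = 'T02' THEN val1 END) AS T02_val1,
               MAX(CASE WHEN criterium = 'T02' THEN val2 END) AS T02_val2,
               MAX(CASE WHEN criterium = 'T03' THEN val1 END) AS T03_val1,
               MAX(CASE WHEN criterium = 'T03' THEN val2 END) AS T03_val2
        FROM crit_table
        GROUP BY block
        ORDER BY block
    """
    result = con.execute(query).fetchall()
    con.close()
    return result


for row in pivot_by_block(CRIT_ROWS):
    print(row)
```

This only works while the crit_id values are gap-free and every block contains each criterium exactly once.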
As for the bonus question: Use a code generator query that generates the query you need. The code looks like this (with only three types of values to keep it brief):
with value_table as /* possible kinds of values, add the remaining ones here */
(select 'val_1' value_type union
select 'val_2' value_type union
select 'val_3' value_type )
select contents from (
select 0 order_id, 'select' contents
union
select row_number() over () order_id,
'max(case when criterium = '''||criterium||''' then '||value_type||' else null end) '||criterium||'_'||value_type||',' contents
from (select distinct criterium from crit_table) crit /* each criterium only once */
cross join value_table
union select 9999999 order_id,
' from crit_table group by trunc((crit_id-1)/3.0) order by trunc((crit_id-1)/3.0);' contents
) v
order by order_id;
This basically uses a string template of your query and inserts the appropriate combinations of criteria and val columns. You could even get rid of the with clause by reading the column names from information_schema.columns, but I think the basic idea is clearer in the version above. Note that the generated code contains one comma too many, directly after the last column (before the from clause). It is easier to delete that by hand afterwards than to correct it in the generator.
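If the generator does not have to be written in SQL, the same string template is easy to build in a host language. The sketch below does it in Python; the function name build_pivot_query and its parameters are made up for this example, and joining the column expressions with a separator leaves no comma after the last column. It assumes the block size equals the number of criteria, as in the query above.

```python
def build_pivot_query(criteria, value_columns, table="crit_table"):
    """Build the conditional-aggregation query as a string.

    Names are interpolated directly into the SQL text, so pass only
    trusted identifiers (for example ones read from the catalog).
    """
    columns = [
        f"max(case when criterium = '{crit}' then {col} end) as {crit}_{col}"
        for crit in criteria
        for col in value_columns
    ]
    # One result row per run of len(criteria) consecutive crit_ids.
    block = f"trunc((crit_id - 1) / {len(criteria)}.0)"
    return (
        "select\n  "
        + ",\n  ".join(columns)
        + f"\nfrom {table}\ngroup by {block}\norder by {block};"
    )


print(build_pivot_query(["T01", "T02", "T03"], ["val_1", "val_2"]))
```

For 77 criteria and seven value columns, the two lists would be filled accordingly (for instance from a select distinct criterium query) and the printed statement pasted back into psql.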
