Putting two rows together on the same row in Oracle - sql

Take this table as an example. It associates each product with the number
of stars it received from customer feedback (it happens to be ordered by number of stars):
| ProductID | Stars |
|-----------|---------|
| 23 | 10 |
| 12 | 10 |
| 17 | 9 |
| 5 | 8 |
| 20 | 8 |
| 18 | 7 |
How would I select (showing them on the same row) the IDs of the products
in pairs?
Like this:
| Product1 | Product2 |
|-----------|-------------|
| 23 | 12 |
| 17 | 5 |
| 20 | 18 |
Or like this:
| Products |
|------------------|
| 23 12 |
| 17 5 |
| 20 18 |

Use listagg to produce option 2:
select stars, listagg(ProductID, ' ') within group (order by ProductID) as Products
from Table1
group by Stars

It is not clear why in your output you picked 5 before 20; remember that rows in a table are NOT ORDERED. In my solution I order by stars and then by productid; if your rows are ordered by something else as well, you may use that instead.
You may change 2 to 7 in the divisions by 2 if you want to group 7 values at a time.
This is offered only to show that this CAN be done in Oracle SQL (in the database). It SHOULDN'T, but that's for you to decide.
with
inputs ( productid, stars) as (
select 23, 10 from dual union all
select 12, 10 from dual union all
select 17, 9 from dual union all
select 5, 8 from dual union all
select 20, 8 from dual union all
select 18, 7 from dual
)
-- end of test data, solution begins below
select listagg(productid, ' ') within group (order by rn) as result
from ( select productid, stars,
row_number() over (order by stars desc, productid desc) as rn
from inputs
)
group by ceil(rn/2)
order by ceil(rn/2)
;
RESULT
------
23 12
17 20
5 18
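The grouping step in the query above (number the rows, then group by ceil(rn/2)) can be mirrored outside the database. The following is a minimal Python sketch, not part of the original answer. The function name `pair_up` and its `size` parameter are hypothetical, and it assumes the same ordering as the SQL (stars descending, then productid descending).

```python
from math import ceil

def pair_up(rows, size=2):
    """Group product IDs into chunks of `size`, mimicking GROUP BY ceil(rn/size)."""
    # Same ordering as the SQL: stars desc, then productid desc.
    ordered = sorted(rows, key=lambda r: (-r[1], -r[0]))
    groups = {}
    for rn, (productid, _stars) in enumerate(ordered, start=1):
        groups.setdefault(ceil(rn / size), []).append(productid)
    # dicts preserve insertion order, so groups come out ordered by ceil(rn/size).
    return [" ".join(str(p) for p in ids) for ids in groups.values()]

inputs = [(23, 10), (12, 10), (17, 9), (5, 8), (20, 8), (18, 7)]
result = pair_up(inputs)
```

Passing `size=7` corresponds to changing the divisions by 2 into divisions by 7.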

WITH t
AS (SELECT 23 product_id, 10 stars FROM DUAL
UNION ALL
SELECT 12, 10 FROM DUAL
UNION ALL
SELECT 17, 9 FROM DUAL
UNION ALL
SELECT 5, 8 FROM DUAL
UNION ALL
SELECT 20, 8 FROM DUAL
UNION ALL
SELECT 18, 7 FROM DUAL),
t2
AS ( SELECT product_id,
stars,
ROW_NUMBER () OVER (ORDER BY stars DESC)
+ MOD (ROW_NUMBER () OVER (ORDER BY stars DESC), 2)
grp
FROM t
ORDER BY stars DESC)
SELECT LISTAGG (product_id, ' ') WITHIN GROUP (ORDER BY stars DESC, ROWNUM)
AS product_id
FROM t2
GROUP BY grp
result
23 12
17 5
20 18

In Oracle you can use a function called listagg, which works somewhat like join in C# (and other languages). For it to work you need something to group by; it seems you want to group every other row when ordered by a column. First compute a row number over that column, then group by that row number divided by 2:
SELECT LISTAGG(ProductID, ' ') WITHIN GROUP (ORDER BY ProductID) AS ProductList
FROM (
SELECT ProductID, FLOOR((ROW_NUMBER() OVER (ORDER BY Stars)+1)/2) as GroupByMe FROM Table
) X
GROUP BY GroupByMe
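If you want groups of 3, use FLOOR((ROW_NUMBER() OVER (ORDER BY Stars)+2)/3) instead; the offset must be one less than the divisor, otherwise the first group comes out short.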
More interesting (because more people want to see it) is getting a list for every amount of stars, that looks like this:
SELECT Stars as StarCount, LISTAGG(ProductID, ' ') WITHIN GROUP (ORDER BY ProductID) As ProductList
FROM Table
GROUP BY Stars
Of course, this looks like a prior answer, because it is what people would expect you to want.

Related

Counting Rows under a Specific Header Row

I am trying to count the number of rows under specific "header rows" - for example, I have a table that looks like this:
Row # | Description | Repair_Code | Data Type
1 | FRONT LAMP | (null) | Header
2 | left head lamp | 1235 | Database
3 | right head lamp | 1236 | Database
4 | ROOF | (null) | Header
5 | headliner | 1567 | Database
6 | WHEELS | (null) | Header
7 | right wheel | 1145 | Database
Rows 1, 4 and 6 are header rows (categories) and the others are descriptors under each of those categories. The Data Type column denotes if the row is a header or not.
I want to be able to count the number of rows under the header rows to return something that looks like:
Header | Occurrences
FRONT LAMP | 2
ROOF | 1
WHEELS | 1
Thank you for the help!
Data model looks wrong. If that's some kind of a hierarchy, table should have yet another column which represents a "parent row#".
The way it is now, it's kind of questionable whether you can - or can not - do what you wanted. The only thing you can rely on is row#, which is sequential in your example. If that's not the case, then you have a problem.
So: if you use a lead analytic function for all header rows, then you could do something like this (sample data in rows #1 - 7; query that might help begins at line #8):
SQL> with test (rn, description, code) as
2 (select 1, 'front lamp' , null from dual union all
3 select 2, 'left head lamp' , 1235 from dual union all
4 select 3, 'right head lamp', 1236 from dual union all
5 select 4, 'roof' , null from dual union all
6 select 5, 'headliner' , 1567 from dual
7 ),
8 hdr as
9 -- header rows
10 (select rn,
11 description,
12 lead(rn) over (order by rn) next_rn
13 from test
14 where code is null
15 )
16 select h.description,
17 count(*)
18 from hdr h join test t on t.rn > h.rn
19 and (t.rn < h.next_rn or h.next_rn is null)
20 group by h.description;
DESCRIPTION COUNT(*)
--------------- ----------
front lamp 2
roof 1
SQL>
If the data model were different (note the parent_rn column), you wouldn't depend on sequential row# values and could use a hierarchical query instead:
SQL> with test (rn, description, code, parent_rn) as
2 (select 0, 'items' , null, null from dual union all
3 select 1, 'front lamp' , null, 0 from dual union all
4 select 2, 'left head lamp' , 1235, 1 from dual union all
5 select 3, 'right head lamp', 1236, 1 from dual union all
6 select 4, 'roof' , null, 0 from dual union all
7 select 5, 'headliner' , 1567, 4 from dual
8 ),
9 calc as
10 (select parent_rn,
11 sum(case when code is null then 0 else 1 end) cnt
12 from test
13 connect by prior rn = parent_rn
14 start with parent_rn is null
15 group by parent_rn
16 )
17 select t.description,
18 c.cnt
19 from test t join calc c on c.parent_rn = t.rn
20 where nvl(c.parent_rn, 0) <> 0;
DESCRIPTION CNT
--------------- ----------
front lamp 2
roof 1
SQL>
I would approach this using window functions. Assign a group to each header by doing a cumulative count of the NULL values of repair_code. Then aggregate:
select max(case when repair_code is null then description end) as description,
count(repair_code) as cnt
from (select t.*,
sum(case when repair_code is null then 1 else 0 end) over (order by row#) as grp
from t
) t
group by grp
order by min(row#);
Here is a db<>fiddle.
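The cumulative-count idea can be sketched in plain Python to show what the windowed SUM does: each header row bumps a running counter, and that counter serves as the group key. This is only an illustration under the assumption that rows are ordered by row# and that a NULL repair_code (here `None`) marks a header; the function name `count_under_headers` is hypothetical.

```python
from itertools import accumulate

def count_under_headers(rows):
    """rows: (row_no, description, repair_code); repair_code None marks a header."""
    rows = sorted(rows, key=lambda r: r[0])
    # Running count of header rows = group number (the windowed SUM in the SQL).
    grp = list(accumulate(1 if code is None else 0 for _, _, code in rows))
    result = {}
    for g, (_, description, code) in zip(grp, rows):
        entry = result.setdefault(g, [None, 0])
        if code is None:
            entry[0] = description   # header row names the group
        else:
            entry[1] += 1            # non-header rows are counted
    return [tuple(v) for v in result.values()]

table = [
    (1, "FRONT LAMP", None), (2, "left head lamp", 1235),
    (3, "right head lamp", 1236), (4, "ROOF", None),
    (5, "headliner", 1567), (6, "WHEELS", None), (7, "right wheel", 1145),
]
counts = count_under_headers(table)
```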

create window group based on value of preceding row

I have a table like so:
#standardSQL
WITH k AS (
SELECT 1 id, 1 subgrp, 'stuff1' content UNION ALL
SELECT 2, 2, 'stuff2' UNION ALL
SELECT 3, 3, 'stuff3' UNION ALL
SELECT 4, 4, 'stuff4' UNION ALL
SELECT 5, 1, 'ostuff1' UNION ALL
SELECT 6, 2, 'ostuff2' UNION ALL
SELECT 7, 3, 'ostuff3' UNION ALL
SELECT 8, 4, 'ostuff4'
)
and I'd like to group based on the subgrp value to re-create the missing grp: if the subgrp value is larger than the previous row's, the row belongs to the same group; otherwise it starts a new one.
Intermediate result would be:
| id | grp | subgrp | content |
| 1 | 1 | 1 | stuff1 |
| 2 | 1 | 2 | stuff2 |
| 3 | 1 | 3 | stuff3 |
| 4 | 1 | 4 | stuff4 |
| 5 | 2 | 1 | ostuff1 |
| 6 | 2 | 2 | ostuff2 |
| 7 | 2 | 3 | ostuff3 |
| 8 | 2 | 4 | ostuff4 |
on which I can then apply
SELECT grp, ARRAY_AGG(STRUCT(subgrp, content)) rcd
FROM k GROUP BY grp ORDER BY grp
to have a nice nested structure.
Notes:
with 'id' ordered, subgrp is always in sequence so no 3 before 2
groups are not always 4 subgrp's - this is just to illustrate so cannot hardcode
Problem: how can I (re)create the grp column here ? I played with several Window functions to no avail.
EDIT
Although Gordon's answer works, it took 3 min to run over 104M records, and I had to remove an ORDER BY on the final result set because of this error: Resources exceeded during execution: The query could not be executed in the allotted memory. ORDER BY operator used too much memory.
Does anyone have an alternative solution for large datasets?
A simple way to assign the group is to do a cumulative count of the subgrp = 1 values:
select k.*,
sum(case when subgrp = 1 then 1 else 0 end) over (order by id) as grp
from k;
You can also do it your way, using lag() and a cumulative sum. That requires a subquery:
select k.*,
sum(case when prev_subgrp < subgrp then 0 else 1 end) over (order by id) as grp
from (select k.*,
lag(subgrp) over (order by id) as prev_subgrp
from k
) k
The query below can potentially perform better, but it has a limitation: it assumes there are no gaps in the numbering within subgroups or in the respective ids.
#standardSQL
WITH k AS (
SELECT 1 id, 1 subgrp, 'stuff1' content UNION ALL
SELECT 2, 2, 'stuff2' UNION ALL
SELECT 3, 3, 'stuff3' UNION ALL
SELECT 4, 4, 'stuff4' UNION ALL
SELECT 5, 1, 'ostuff1' UNION ALL
SELECT 6, 2, 'ostuff2' UNION ALL
SELECT 7, 3, 'ostuff3' UNION ALL
SELECT 8, 4, 'ostuff4'
)
SELECT
ROW_NUMBER() OVER(ORDER BY id) grp,
rcd
FROM (
SELECT
MIN(id) id,
ARRAY_AGG(STRUCT(subgrp, content)) rcd
FROM k
GROUP BY id - subgrp
)
result is
Row grp rcd.subgrp rcd.content
1 1 1 stuff1
2 stuff2
3 stuff3
4 stuff4
2 2 1 ostuff1
2 ostuff2
3 ostuff3
4 ostuff4
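Why grouping by id - subgrp works: inside one group both id and subgrp increase by 1 per row, so their difference stays constant, and it changes when subgrp restarts at 1. A minimal Python sketch of that idea follows; the function name `group_by_offset` is hypothetical, and the sketch relies on the same no-gaps assumption as the query.

```python
def group_by_offset(rows):
    """rows: (id, subgrp, content). Rows of one group share the same id - subgrp."""
    buckets = {}
    for id_, subgrp, content in rows:
        buckets.setdefault(id_ - subgrp, []).append((id_, subgrp, content))
    # Number the buckets by their smallest id, like ROW_NUMBER() OVER (ORDER BY id)
    # applied to MIN(id) in the query.
    ordered = sorted(buckets.values(), key=lambda b: min(r[0] for r in b))
    return [(grp, [(s, c) for _, s, c in sorted(b)])
            for grp, b in enumerate(ordered, start=1)]

k = [(1, 1, "stuff1"), (2, 2, "stuff2"), (3, 3, "stuff3"), (4, 4, "stuff4"),
     (5, 1, "ostuff1"), (6, 2, "ostuff2"), (7, 3, "ostuff3"), (8, 4, "ostuff4")]
nested = group_by_offset(k)
```

A gap in the ids (for example ids 1, 2, 4 with subgrp 1, 2, 3) changes the difference mid-group and splits it, which is the limitation mentioned above.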

Show columns not in GROUP BY clause without applying aggregate function on it

I know the question was asked before, but I might have a different case,
I have this table :
| PK_DATA | EVENT_TYPE | DATE |
-------------------------------------
| 123 | D | 12 DEC |
| 123 | I | 11 DEC |
| 123 | U | 10 DEC |
| 124 | D | 11 JAN |
| 124 | U | 12 JAN |
| 125 | I | 1 JAN |
-------------------------------------
I want a query that gives max(DATE) grouped by PK_DATA and at the same time gives the corresponding EVENT_TYPE, i.e.:
| 123 | D | 12 DEC |
| 124 | U | 12 JAN |
| 125 | I | 1 JAN |
I thought of grouping by PK_DATA and selecting max(DATE), but then EVENT_TYPE won't be displayed unless I either apply an aggregate function to it or add it to the GROUP BY clause, and neither does what I want. Any help?
By the way, I want to avoid any nested query. I know it can be done in two steps: a nested query to group, then a join of the main table with the query result.
You can use the KEEP clause; it is significantly faster and less resource-intensive than running a window function (if your data set is large):
WITH data (PK_DATA, EVENT_TYPE, "DATE") AS (
SELECT 123, 'D', DATE'2015-12-12' FROM DUAL UNION ALL
SELECT 123, 'I', DATE'2015-12-11' FROM DUAL UNION ALL
SELECT 123, 'U', DATE'2015-12-10' FROM DUAL UNION ALL
SELECT 124, 'D', DATE'2015-01-11' FROM DUAL UNION ALL
SELECT 124, 'U', DATE'2015-01-12' FROM DUAL UNION ALL
SELECT 125, 'I', DATE'2015-01-01' FROM DUAL)
SELECT
PK_DATA,
MAX(EVENT_TYPE) KEEP (DENSE_RANK LAST ORDER BY "DATE") EVENT_TYPE,
MAX("DATE") "DATE"
FROM
data
GROUP BY
PK_DATA
EDIT: Here is comparison between ROW_NUMBER and KEEP:
PANELMANAGEMENT#panel_management> set autot trace stat
PANELMANAGEMENT#panel_management> SELECT
2 INVOICEDATE,
3 MAX(CREATED) V1,
4 MAX(TOTALCOST) KEEP (DENSE_RANK LAST ORDER BY ORDER_ID) V2
5 FROM
6 ORDERS
7 GROUP BY
8 INVOICEDATE
9 ORDER BY
10 INVOICEDATE;
269 rows selected.
Elapsed: 00:00:05.03
PANELMANAGEMENT#panel_management> SELECT
2 INVOICEDATE,
3 CREATED V1,
4 TOTALCOST V2
5 FROM (
6 SELECT
7 INVOICEDATE,
8 CREATED,
9 TOTALCOST,
10 ROW_NUMBER() OVER (PARTITION BY INVOICEDATE ORDER BY ORDER_ID DESC) FILTER
11 FROM
12 ORDERS)
13 WHERE
14 FILTER = 1
15 ORDER BY
16 INVOICEDATE;
269 rows selected.
Elapsed: 00:00:21.82
The ORDERS table has around 10 million records and 1 GB of data. The main difference is that the analytic function needs to allocate much more memory, because it has to assign a row number to all 10 million rows, which are only afterwards filtered down to the resulting 269 rows. With KEEP, Oracle knows that it needs to allocate just one row per INVOICEDATE. When you sort 10 million rows, you need memory to store all of them; but if you only need to keep a single record for each group, you can allocate a single record and replace it whenever a greater/smaller one comes along. In this case the analytic function required around 100 MB of memory, whereas KEEP required "none".
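The "keep a single record per group and replace it" idea can be illustrated with a short Python sketch. This is only a model of the reasoning above, not a description of how Oracle implements KEEP internally; the function name `keep_last` is hypothetical, and the sample rows are the ones from the question.

```python
from datetime import date

def keep_last(rows):
    """rows: (group_key, order_key, value).

    One pass, one stored record per group: roughly what
    MAX(value) KEEP (DENSE_RANK LAST ORDER BY order_key) computes.
    """
    best = {}
    for group_key, order_key, value in rows:
        current = best.get(group_key)
        if current is None or order_key > current[0]:
            best[group_key] = (order_key, value)
        elif order_key == current[0]:
            # Tie on the ordering key: MAX picks the larger value.
            best[group_key] = (order_key, max(current[1], value))
    return {k: v[1] for k, v in best.items()}

rows = [
    (123, date(2015, 12, 12), "D"),
    (123, date(2015, 12, 11), "I"),
    (123, date(2015, 12, 10), "U"),
    (124, date(2015, 1, 11), "D"),
    (124, date(2015, 1, 12), "U"),
    (125, date(2015, 1, 1), "I"),
]
latest_event = keep_last(rows)
```

Memory use grows with the number of groups, not with the number of input rows, which matches the argument made for KEEP.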
You can use a window function for this to establish a row_number for each group:
select *
from (
select pk_data, event_type, "DATE",
row_number() over (partition by pk_data order by "DATE" desc) rn
from yourtable
) t
where rn = 1
If ties are a concern, use rank instead of row_number.
I found a solution, but I'm not sure yet whether it performs better than Husqiv's, so I will post it to spread the knowledge:
WITH data (PK_DATA, EVENT_TYPE, "DATE") AS (
SELECT 123, 'D', DATE'2015-12-12' FROM DUAL UNION ALL
SELECT 123, 'I', DATE'2015-12-11' FROM DUAL UNION ALL
SELECT 123, 'U', DATE'2015-12-10' FROM DUAL UNION ALL
SELECT 124, 'D', DATE'2015-01-11' FROM DUAL UNION ALL
SELECT 124, 'U', DATE'2015-01-12' FROM DUAL UNION ALL
SELECT 125, 'I', DATE'2015-01-01' FROM DUAL)
select EVENT_TYPE, "DATE", PK_DATA
from (select EVENT_TYPE, "DATE", PK_DATA, max("DATE") over (PARTITION BY PK_DATA) max_date
from data ) where "DATE" = max_date;

How to use distinct and sum both together in oracle?

For example my table contains the following data:
ID price
-------------
1 10
1 10
1 20
2 20
2 20
3 30
3 30
4 5
4 5
4 15
So given the example above, the expected result is:
ID price
-------------
1 30
2 20
3 30
4 20
-----------
ID 100
How do I write this query in Oracle? First sum(distinct price) grouped by id, then sum(all price).
I would be very careful with a data structure like this. First, check that all ids have exactly one price:
select id
from table t
group by id
having count(distinct price) > 1;
I think the safest method is to extract a particular price for each id (say the maximum) and then do the aggregation:
select sum(price)
from (select id, max(price) as price
from table t
group by id
) t;
Then, go fix your data so you don't have a repeated additive dimension. There should be a table with one row per id and price (or perhaps with duplicates but controlled by effective and end dates).
The data is messed up; you should not assume that the price is the same on all rows for a given id. You need to check that every time you use the fields, until you fix the data.
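The two steps suggested above (find ids with more than one distinct price, then sum one chosen price per id) can be sketched in Python as follows. The function name `price_report` is hypothetical, and the rows are the sample data from the question.

```python
from collections import defaultdict

def price_report(rows):
    """rows: (id, price). Returns (ids with conflicting prices, sum of max price per id)."""
    prices = defaultdict(set)
    for id_, price in rows:
        prices[id_].add(price)
    # Equivalent of HAVING COUNT(DISTINCT price) > 1
    conflicting = sorted(i for i, p in prices.items() if len(p) > 1)
    # Equivalent of SUM over (SELECT id, MAX(price) ... GROUP BY id)
    total = sum(max(p) for p in prices.values())
    return conflicting, total

rows = [(1, 10), (1, 10), (1, 20), (2, 20), (2, 20),
        (3, 30), (3, 30), (4, 5), (4, 5), (4, 15)]
conflicting, total = price_report(rows)
```

On the sample data, ids 1 and 4 carry more than one price, so taking the maximum per id gives a different total than summing the distinct prices.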
first sum(distinct price) group by id then sum(all price)
Looking at your desired output, it seems you also need the final sum(similar to ROLLUP), however, ROLLUP won't directly work in your case.
If you want to format your output in exactly the way you have posted your desired output, i.e. with a header for the last row of total sum, then you could set the PAGESIZE in SQL*Plus.
Using UNION ALL
For example,
SQL> set pagesize 7
SQL> WITH DATA AS(
2 SELECT ID, SUM(DISTINCT price) AS price
3 FROM t
4 GROUP BY id
5 )
6 SELECT to_char(ID) id, price FROM DATA
7 UNION ALL
8 SELECT 'ID' id, sum(price) FROM DATA
9 ORDER BY ID
10 /
ID PRICE
--- ----------
1 30
2 20
3 30
4 20
ID PRICE
--- ----------
ID 100
SQL>
So, you have an additional row in the end with the total SUM of price.
Using ROLLUP
Alternatively, you could use ROLLUP to get the total sum as follows:
SQL> set pagesize 7
SQL> WITH DATA AS
2 ( SELECT ID, SUM(DISTINCT price) AS price FROM t GROUP BY id
3 )
4 SELECT ID, SUM(price) price
5 FROM DATA
6 GROUP BY ROLLUP(id);
ID PRICE
---------- ----------
1 30
2 20
3 30
4 20
ID PRICE
---------- ----------
100
SQL>
First do the DISTINCT and then a ROLLUP
SELECT ID, SUM(price) -- sum of the distinct prices
FROM
(
SELECT DISTINCT ID, price -- distinct prices per ID
FROM tab
) dt
GROUP BY ROLLUP(ID) -- two levels of aggregation, per ID and total sum
SELECT ID,SUM(price) as price
FROM
(SELECT ID,price
FROM TableName
GROUP BY ID,price) T
GROUP BY ID
Explanation:
The inner query selects the distinct prices for each id,
i.e.,
ID price
-------------
1 10
1 20
2 20
3 30
4 5
4 15
Then the outer query will select SUM of those prices for each id.
Final Result :
ID price
----------
1 30
2 20
3 30
4 20
Result in SQL Fiddle.
SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE MYTABLE ( ID, price ) AS
SELECT 1, 10 FROM DUAL
UNION ALL SELECT 1, 10 FROM DUAL
UNION ALL SELECT 1, 20 FROM DUAL
UNION ALL SELECT 2, 20 FROM DUAL
UNION ALL SELECT 2, 20 FROM DUAL
UNION ALL SELECT 3, 30 FROM DUAL
UNION ALL SELECT 3, 30 FROM DUAL
UNION ALL SELECT 4, 5 FROM DUAL
UNION ALL SELECT 4, 5 FROM DUAL
UNION ALL SELECT 4, 15 FROM DUAL;
Query 1:
SELECT COALESCE( TO_CHAR(ID), 'ID' ) AS ID,
SUM( PRICE ) AS PRICE
FROM ( SELECT DISTINCT ID, PRICE FROM MYTABLE )
GROUP BY ROLLUP ( ID )
ORDER BY ID
Results:
| ID | PRICE |
|----|-------|
| 1 | 30 |
| 2 | 20 |
| 3 | 30 |
| 4 | 20 |
| ID | 100 |
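For comparison, the same "distinct first, then sum per ID plus a grand total" logic can be written as a small Python sketch. The function name `sum_distinct_with_total` is hypothetical; the `"ID"` label for the total row follows the COALESCE in the query above.

```python
def sum_distinct_with_total(rows):
    """rows: (id, price). Sum distinct prices per id, then append a grand total row."""
    distinct = set(rows)                      # SELECT DISTINCT id, price
    per_id = {}
    for id_, price in sorted(distinct):
        per_id[id_] = per_id.get(id_, 0) + price
    report = [(str(i), s) for i, s in per_id.items()]
    report.append(("ID", sum(per_id.values())))   # the ROLLUP total row
    return report

rows = [(1, 10), (1, 10), (1, 20), (2, 20), (2, 20),
        (3, 30), (3, 30), (4, 5), (4, 5), (4, 15)]
report = sum_distinct_with_total(rows)
```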

Sorting by max value [duplicate]

This question already has answers here:
How to select records with maximum values in two columns?
(2 answers)
Closed 9 years ago.
I have a table that looks like this in an Oracle DB:
TransactionID Customer_id Sequence Activity
---------- ------------- ---------- -----------
1 85 1 Forms
2 51 2 Factory
3 51 1 Forms
4 51 3 Listing
5 321 1 Forms
6 321 2 Forms
7 28 1 Text
8 74 1 Escalate
And I want to be able to select all rows where sequence is the highest for each customer_id.
Is there a MAX() function I could use on sequence, but based on customer_id somehow?
I would like the result of the query to look like this:
TransactionID Customer_id Sequence Activity
---------- ------------- ---------- -----------
1 85 1 Forms
4 51 3 Listing
6 321 2 Forms
7 28 1 Text
8 74 1 Escalate
select t1.*
from your_table t1
inner join
(
select customer_id, max(Sequence) mseq
from your_table
group by customer_id
) t2 on t1.customer_id = t2.customer_id and t1.sequence = t2.mseq
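The join above works in two steps: compute the maximum sequence per customer, then keep only the rows that match it. A minimal Python sketch of the same two passes, using the sample rows from the question (the function name `latest_per_customer` is hypothetical):

```python
def latest_per_customer(rows):
    """rows: (transaction_id, customer_id, sequence, activity)."""
    # Pass 1: the derived table t2 (customer_id, max(Sequence)).
    max_seq = {}
    for _, customer_id, sequence, _ in rows:
        max_seq[customer_id] = max(sequence, max_seq.get(customer_id, sequence))
    # Pass 2: the join condition, keeping rows whose sequence equals the maximum.
    return [r for r in rows if r[2] == max_seq[r[1]]]

rows = [
    (1, 85, 1, "Forms"), (2, 51, 2, "Factory"), (3, 51, 1, "Forms"),
    (4, 51, 3, "Listing"), (5, 321, 1, "Forms"), (6, 321, 2, "Forms"),
    (7, 28, 1, "Text"), (8, 74, 1, "Escalate"),
]
latest = latest_per_customer(rows)
```

As with the SQL join, a customer with two rows sharing the maximum sequence would get both rows back.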
SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE tbl ( TransactionID, Customer_id, Sequence, Activity ) AS
SELECT 1, 85, 1, 'Forms' FROM DUAL
UNION ALL SELECT 2, 51, 2, 'Factory' FROM DUAL
UNION ALL SELECT 3, 51, 1, 'Forms' FROM DUAL
UNION ALL SELECT 4, 51, 3, 'Listing' FROM DUAL
UNION ALL SELECT 5, 321, 1, 'Forms' FROM DUAL
UNION ALL SELECT 6, 321, 2, 'Forms' FROM DUAL
UNION ALL SELECT 7, 28, 1, 'Text' FROM DUAL
UNION ALL SELECT 8, 74, 1, 'Escalate' FROM DUAL;
Query 1:
SELECT
MAX( TransactionID ) KEEP ( DENSE_RANK LAST ORDER BY Sequence ) AS TransactionID,
Customer_ID,
MAX( Sequence ) KEEP ( DENSE_RANK LAST ORDER BY Sequence ) AS Sequence,
MAX( Activity ) KEEP ( DENSE_RANK LAST ORDER BY Sequence ) AS Activity
FROM tbl
GROUP BY Customer_ID
ORDER BY TransactionID
Results:
| TRANSACTIONID | CUSTOMER_ID | SEQUENCE | ACTIVITY |
|---------------|-------------|----------|----------|
| 1 | 85 | 1 | Forms |
| 4 | 51 | 3 | Listing |
| 6 | 321 | 2 | Forms |
| 7 | 28 | 1 | Text |
| 8 | 74 | 1 | Escalate |
Please try this:
with cte as
(
select Customer_id, MAX(Sequence) as p from Tablename group by Customer_id
)
select b.* from cte a join Tablename b on a.p = b.Sequence and a.Customer_id = b.Customer_id order by b.TransactionID