PIVOT multiple rows into columns - sql

I have a table like the one below. Consider the query
select invoice_mth, inv_amt from xdetails
where mobile_number=9080808080
Data in the table:
mobile_number invoice_mth inv_amt
9080808080 2010-10 20
9080808080 2010-11 30
9080808080 2010-12 40
I need to display the data as shown below, with each invoice month and each amount in its own column.
MOBILE_NUMBER inv_m1  inv_m2  inv_m3  amt1 amt2 amt3
------------- ------- ------- ------- ---- ---- ----
9080808080    2010-10 2010-11 2010-12 20   30   40
What do I have to do to display the data like this?

You could play around with the standard PIVOT query:
SQL> SELECT * FROM t;
MOBILE_NUMBER INVOICE INV_AMT
------------- ------- ----------
9080808080 2010-10 20
9080808080 2010-11 30
9080808080 2010-12 40
SQL>
SQL> SELECT *
     FROM
       (SELECT mobile_number, invoice_mth, inv_amt FROM t
       ) PIVOT (MIN(invoice_mth) AS inv_mth,
                SUM(inv_amt) AS inv_amt
                FOR (invoice_mth) IN ('2010-10' AS m1, '2010-11' AS m2, '2010-12' AS m3))
     ORDER BY mobile_number;
MOBILE_NUMBER M1_INV_ M1_INV_AMT M2_INV_ M2_INV_AMT M3_INV_ M3_INV_AMT
------------- ------- ---------- ------- ---------- ------- ----------
9080808080 2010-10 20 2010-11 30 2010-12 40
SQL>

If you want a fixed number of columns in the output:
Oracle 11g R2 Schema Setup:
CREATE TABLE TABLE_NAME ( mobile_number, invoice_mth, inv_amt ) AS
SELECT 9080808080, '2010-10', 20 FROM DUAL
UNION ALL SELECT 9080808080, '2010-11', 30 FROM DUAL
UNION ALL SELECT 9080808080, '2010-12', 40 FROM DUAL;
Query 1:
SELECT mobile_number,
MAX( CASE RN WHEN 1 THEN invoice_mth END ) AS inv_m1,
MAX( CASE RN WHEN 2 THEN invoice_mth END ) AS inv_m2,
MAX( CASE RN WHEN 3 THEN invoice_mth END ) AS inv_m3,
MAX( CASE RN WHEN 1 THEN inv_amt END ) AS amt1,
MAX( CASE RN WHEN 2 THEN inv_amt END ) AS amt2,
MAX( CASE RN WHEN 3 THEN inv_amt END ) AS amt3
FROM (
SELECT t.*,
ROW_NUMBER() OVER ( PARTITION BY mobile_number ORDER BY invoice_mth ASC ) AS rn
FROM TABLE_NAME t
)
GROUP BY mobile_number
Results:
| MOBILE_NUMBER | INV_M1 | INV_M2 | INV_M3 | AMT1 | AMT2 | AMT3 |
|---------------|---------|---------|---------|------|------|------|
| 9080808080 | 2010-10 | 2010-11 | 2010-12 | 20 | 30 | 40 |
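
The ROW_NUMBER plus conditional aggregation approach above can be tried without an Oracle instance. The following is a minimal sketch, assuming Python's built-in sqlite3 module (SQLite 3.25 or later for window functions) as a stand-in for Oracle; the table name xdetails comes from the question, while the column types are assumptions.

```python
import sqlite3

# In-memory SQLite database standing in for Oracle (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE xdetails (mobile_number INTEGER, invoice_mth TEXT, inv_amt INTEGER);
    INSERT INTO xdetails VALUES
        (9080808080, '2010-10', 20),
        (9080808080, '2010-11', 30),
        (9080808080, '2010-12', 40);
""")

# Number the months per mobile_number, then fold each numbered row into its own column.
pivot_sql = """
    SELECT mobile_number,
           MAX(CASE rn WHEN 1 THEN invoice_mth END) AS inv_m1,
           MAX(CASE rn WHEN 2 THEN invoice_mth END) AS inv_m2,
           MAX(CASE rn WHEN 3 THEN invoice_mth END) AS inv_m3,
           MAX(CASE rn WHEN 1 THEN inv_amt END)     AS amt1,
           MAX(CASE rn WHEN 2 THEN inv_amt END)     AS amt2,
           MAX(CASE rn WHEN 3 THEN inv_amt END)     AS amt3
    FROM (
        SELECT t.*,
               ROW_NUMBER() OVER (PARTITION BY mobile_number ORDER BY invoice_mth) AS rn
        FROM xdetails t
    )
    GROUP BY mobile_number
"""
pivot_rows = conn.execute(pivot_sql).fetchall()
print(pivot_rows)
# [(9080808080, '2010-10', '2010-11', '2010-12', 20, 30, 40)]
```

Because the columns are keyed on the row number rather than on literal month values, the same query works for any three months; a fourth month for the same number would simply be left out of the output.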


How to evaluate rows and get the max value based on multiple columns

I have a SQL question where I want to evaluate and get the latest work order for a location based on multiple criteria
The table looks something like this
location  work order  create dt  status      result
1         123         3/1/22     complete    positive
1         124         3/2/22     incomplete  null
2         231         2/1/22     cancelled   null
2         232         2/3/22     incomplete  null
The requirement is as follows
For each location, find the latest work order based on the following criteria
If there are multiple work orders with results, pick the one with the latest date
If there are multiple work orders but one with result and one with no result, pick the one with the result - even if it is not latest
If there are multiple work orders, but none have result, pick the latest one that is not cancelled
If there are multiple work orders, but all are cancelled, pick the latest one
The result would be something like this
location  work order
1         123
2         232
For location 1, we pick the earlier one, since it has the result.
For location 2, we pick the later one, since it is not cancelled.
Thanks
Here's the source data
SQL> select * from t;
LOCATION WORKORDER CREATED STATUS RESULT
---------- ---------- ---------- -------------------- --------------------
1 123 03/01/2022 complete positive
1 124 03/02/2022 incomplete
2 231 02/01/2022 cancelled
2 232 02/03/2022 incomplete
We can pick up some additional data on a per location basis
SQL> select
       t.*,
       max(created) over ( partition by location) as last_date,
       count(result) over ( partition by location) result_count,
       max(case when result is not null then created end) over ( partition by location) result_date,
       max(case when status != 'cancelled' then created end) over ( partition by location) non_cancelled_date
     from t
     /
LOCATION WORKORDER CREATED STATUS RESULT LAST_DATE RESULT_COUNT RESULT_DAT NON_CANCEL
---------- ---------- ---------- -------------------- -------------------- ---------- ------------ ---------- ----------
1 123 03/01/2022 complete positive 03/02/2022 1 03/01/2022 03/02/2022
1 124 03/02/2022 incomplete 03/02/2022 1 03/01/2022 03/02/2022
2 231 02/01/2022 cancelled 02/03/2022 0 02/03/2022
2 232 02/03/2022 incomplete 02/03/2022 0 02/03/2022
and use that to apply our rules
SQL> select *
     from
     (
     select
       t.*,
       max(created) over ( partition by location) as last_date,
       count(result) over ( partition by location) result_count,
       max(case when result is not null then created end) over ( partition by location) result_date,
       max(case when status != 'cancelled' then created end) over ( partition by location) non_cancelled_date
     from t
     )
     where ( result_count > 1 and created = result_date ) -- rule1
     or ( result_count = 1 and created = result_date ) -- rule2
     or ( result_count = 0 and non_cancelled_date = created ) -- rule3
     or ( result_count = 0 and non_cancelled_date is null and created = last_date ) -- rule4
     /
LOCATION WORKORDER CREATED STATUS RESULT LAST_DATE RESULT_COUNT RESULT_DAT NON_CANCEL
---------- ---------- ---------- -------------------- -------------------- ---------- ------------ ---------- ----------
1 123 03/01/2022 complete positive 03/02/2022 1 03/01/2022 03/02/2022
2 232 02/03/2022 incomplete 02/03/2022 0 02/03/2022
If you're unfamiliar with these "OVER" functions, here's my tutorial series on them https://www.youtube.com/watch?v=0cjxYMxa1e4&list=PLJMaoEWvHwFIUwMrF4HLnRksF0H8DHGtt
You can use the ROW_NUMBER analytic function:
SELECT *
FROM (
SELECT t.*,
ROW_NUMBER() OVER (
PARTITION BY location
ORDER BY
CASE
WHEN result IS NOT NULL THEN 0
WHEN status = 'cancelled' THEN 2
ELSE 1
END ASC,
create_dt DESC
) AS rn
FROM table_name t
)
WHERE rn = 1;
Which, for the sample data:
CREATE TABLE table_name (location, work_order, create_dt, status, result) AS
SELECT 1, 123, DATE '2022-01-03', 'complete', 'positive' FROM DUAL UNION ALL
SELECT 1, 124, DATE '2022-02-03', 'incomplete', null FROM DUAL UNION ALL
SELECT 2, 231, DATE '2022-01-02', 'cancelled', null FROM DUAL UNION ALL
SELECT 2, 232, DATE '2022-03-02', 'incomplete', null FROM DUAL;
Outputs:
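| LOCATION | WORK_ORDER | CREATE_DT | STATUS | RESULT | RN |
|----------|------------|-----------|------------|----------|----|
| 1 | 123 | 03-JAN-22 | complete | positive | 1 |
| 2 | 232 | 02-MAR-22 | incomplete | null | 1 |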
db<>fiddle here
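
To see how the priority ordering covers all four rules, here is a small sketch that runs the same ROW_NUMBER idea through Python's sqlite3 module (an assumption, used only so the example runs without Oracle). The table name work_orders and the rows for location 3 are hypothetical; location 3 is added to exercise the "all cancelled" rule, which the question's sample data does not cover.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE work_orders (
        location INTEGER, work_order INTEGER, create_dt TEXT, status TEXT, result TEXT
    );
    INSERT INTO work_orders VALUES
        (1, 123, '2022-03-01', 'complete',   'positive'),
        (1, 124, '2022-03-02', 'incomplete', NULL),
        (2, 231, '2022-02-01', 'cancelled',  NULL),
        (2, 232, '2022-02-03', 'incomplete', NULL),
        -- hypothetical rows: every work order for this location is cancelled
        (3, 301, '2022-01-05', 'cancelled',  NULL),
        (3, 302, '2022-01-09', 'cancelled',  NULL);
""")

# Priority 0: has a result; 1: no result, not cancelled; 2: cancelled.
# Within the best available priority, the latest create_dt wins.
latest_sql = """
    SELECT location, work_order
    FROM (
        SELECT t.*,
               ROW_NUMBER() OVER (
                   PARTITION BY location
                   ORDER BY CASE
                                WHEN result IS NOT NULL THEN 0
                                WHEN status = 'cancelled' THEN 2
                                ELSE 1
                            END,
                            create_dt DESC
               ) AS rn
        FROM work_orders t
    )
    WHERE rn = 1
    ORDER BY location
"""
latest = conn.execute(latest_sql).fetchall()
print(latest)
# [(1, 123), (2, 232), (3, 302)]
```

Ranking the rules as a single sort key keeps the query to one pass over the table, and a new rule can be added as one more WHEN branch.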

SQL - Order Data on a Column without including it in ranking

So I have a scenario where I need to order data on a column without including it in dense_rank(). Here is my sample data set:
This is the table:
create table temp
(
id integer,
prod_name varchar(max),
source_system integer,
source_date date,
col1 integer,
col2 integer);
This is the dataset:
insert into temp
(id,prod_name,source_system,source_date,col1,col2)
values
(1,'ABC',123,'01/01/2021',50,60),
(2,'ABC',123,'01/15/2021',50,60),
(3,'ABC',123,'01/30/2021',40,60),
(4,'ABC',123,'01/30/2021',40,70),
(5,'XYZ',456,'01/10/2021',80,30),
(6,'XYZ',456,'01/12/2021',75,30),
(7,'XYZ',456,'01/20/2021',75,30),
(8,'XYZ',456,'01/20/2021',99,30);
Now, I want to do dense_rank() on the data in such a way that for a combination of "prod_name and source_system", the rank gets incremented only if there is a change in col1 or col2 but the data should still be in ascending order of source_date.
Here is the expected result:
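| id | prod_name | source_system | source_date | col1 | col2 | Dense_Rank |
|----|-----------|---------------|-------------|------|------|------------|
| 1 | ABC | 123 | 01-01-21 | 50 | 60 | 1 |
| 2 | ABC | 123 | 15-01-21 | 50 | 60 | 1 |
| 3 | ABC | 123 | 30-01-21 | 40 | 60 | 2 |
| 4 | ABC | 123 | 30-01-21 | 40 | 70 | 3 |
| 5 | XYZ | 456 | 10-01-21 | 80 | 30 | 1 |
| 6 | XYZ | 456 | 12-01-21 | 75 | 30 | 2 |
| 7 | XYZ | 456 | 20-01-21 | 75 | 30 | 2 |
| 8 | XYZ | 456 | 20-01-21 | 99 | 30 | 3 |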
As you can see above, the dates are changing but the expectation is that rank should only change if there is any change in either col1 or col2.
If I use this query
select id,prod_name,source_system,source_date,col1,col2,
dense_rank() over(partition by prod_name,source_system order by source_date,col1,col2) as rnk
from temp;
Then the result would come as:
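| id | prod_name | source_system | source_date | col1 | col2 | rnk |
|----|-----------|---------------|-------------|------|------|-----|
| 1 | ABC | 123 | 01-01-21 | 50 | 60 | 1 |
| 2 | ABC | 123 | 15-01-21 | 50 | 60 | 2 |
| 3 | ABC | 123 | 30-01-21 | 40 | 60 | 3 |
| 4 | ABC | 123 | 30-01-21 | 40 | 70 | 4 |
| 5 | XYZ | 456 | 10-01-21 | 80 | 30 | 1 |
| 6 | XYZ | 456 | 12-01-21 | 75 | 30 | 2 |
| 7 | XYZ | 456 | 20-01-21 | 75 | 30 | 3 |
| 8 | XYZ | 456 | 20-01-21 | 99 | 30 | 4 |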
And, if I exclude source_date from order by in rank function i.e.
select id,prod_name,source_system,source_date,col1,col2,
dense_rank() over(partition by prod_name,source_system order by col1,col2) as rnk
from temp;
Then my result is coming as:
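| id | prod_name | source_system | source_date | col1 | col2 | rnk |
|----|-----------|---------------|-------------|------|------|-----|
| 3 | ABC | 123 | 30-01-21 | 40 | 60 | 1 |
| 4 | ABC | 123 | 30-01-21 | 40 | 70 | 2 |
| 1 | ABC | 123 | 01-01-21 | 50 | 60 | 3 |
| 2 | ABC | 123 | 15-01-21 | 50 | 60 | 3 |
| 6 | XYZ | 456 | 12-01-21 | 75 | 30 | 1 |
| 7 | XYZ | 456 | 20-01-21 | 75 | 30 | 1 |
| 5 | XYZ | 456 | 10-01-21 | 80 | 30 | 2 |
| 8 | XYZ | 456 | 20-01-21 | 99 | 30 | 3 |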
Both the results are incorrect. How can I get the expected result? Any guidance would be helpful.
WITH cte AS (
SELECT *,
LAG(col1) OVER (PARTITION BY prod_name, source_system ORDER BY source_date, id) lag1,
LAG(col2) OVER (PARTITION BY prod_name, source_system ORDER BY source_date, id) lag2
FROM temp
)
SELECT *,
SUM(CASE WHEN (col1, col2) = (lag1, lag2)
THEN 0
ELSE 1
END) OVER (PARTITION BY prod_name, source_system ORDER BY source_date, id) AS `Dense_Rank`
FROM cte
ORDER BY id;
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=ac70104c7c5dfb49c75a8635c25716e6
When comparing multiple columns, I like to look at the previous values of the ordering column, rather than the individual columns. This makes it much simpler to add more and more columns.
The basic idea is to do a cumulative sum of changes for each prod/source system. In Redshift, I would phrase this as:
select t.*,
       sum(case when prev_date = prev_date_2 then 0 else 1 end) over (
           partition by prod_name, source_system
           order by source_date, id
           rows between unbounded preceding and current row
       ) as rnk
from (select t.*,
             lag(source_date) over (partition by prod_name, source_system order by source_date, id) as prev_date,
             lag(source_date) over (partition by prod_name, source_system, col1, col2 order by source_date, id) as prev_date_2
      from temp t
     ) t
order by id;
I think I have the syntax right for Redshift. Here is a db<>fiddle using Postgres.
Note that ties on the date can cause a problem -- regardless of the solution. This uses the id to break the ties. Perhaps id can just be used in general, but your code is using the date, so this uses the date with the id.
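
As a way to check the "cumulative sum of changes" idea against the expected result, here is a minimal sketch using Python's sqlite3 module (an assumption, standing in for Redshift or MySQL). It follows the LAG-on-each-column variant; the table is named prod_history here (a hypothetical name, chosen to avoid the keyword temp in SQLite) and dates are stored as ISO text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE prod_history (
        id INTEGER, prod_name TEXT, source_system INTEGER,
        source_date TEXT, col1 INTEGER, col2 INTEGER
    );
    INSERT INTO prod_history VALUES
        (1, 'ABC', 123, '2021-01-01', 50, 60),
        (2, 'ABC', 123, '2021-01-15', 50, 60),
        (3, 'ABC', 123, '2021-01-30', 40, 60),
        (4, 'ABC', 123, '2021-01-30', 40, 70),
        (5, 'XYZ', 456, '2021-01-10', 80, 30),
        (6, 'XYZ', 456, '2021-01-12', 75, 30),
        (7, 'XYZ', 456, '2021-01-20', 75, 30),
        (8, 'XYZ', 456, '2021-01-20', 99, 30);
""")

# Step 1: fetch the previous row's col1/col2 within each product/source system.
# Step 2: flag a row with 1 when it differs from its predecessor, then take a running sum.
rank_sql = """
    WITH cte AS (
        SELECT t.*,
               LAG(col1) OVER (PARTITION BY prod_name, source_system
                               ORDER BY source_date, id) AS lag1,
               LAG(col2) OVER (PARTITION BY prod_name, source_system
                               ORDER BY source_date, id) AS lag2
        FROM prod_history t
    )
    SELECT id,
           SUM(CASE WHEN col1 = lag1 AND col2 = lag2 THEN 0 ELSE 1 END)
               OVER (PARTITION BY prod_name, source_system
                     ORDER BY source_date, id) AS rnk
    FROM cte
    ORDER BY id
"""
ranks = conn.execute(rank_sql).fetchall()
print(ranks)
# [(1, 1), (2, 1), (3, 2), (4, 3), (5, 1), (6, 2), (7, 2), (8, 3)]
```

The first row of each partition has no predecessor, so its comparison is not true and it starts the count at 1. The same applies to NULL values in col1 or col2: in this sketch a NULL always counts as a change.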

Oracle SQL - update 2 columns in row with the oldest date

I am attempting to update 2 columns in a row. The row that should be updated is the row that has the oldest duedate
The table chorecompletion is described as:
Name Null? Type
----------------------------------------- -------- ----------------------------
CHOREID NOT NULL NUMBER(38)
GROUPID NOT NULL NUMBER(38)
DUEDATE NOT NULL DATE
COMPLETEDDATE DATE
COMPLETEDBY NUMBER(38)
This query returns the row that I want to update
select *
from
(
select choreid, duedate, row_number()
over (partition by choreid order by duedate) as rn
from chorecompletion where choreid = 12 and groupid = 6
)
where rn = 1;
Where I could use some help is how to use this query in my update statement, specifically my where clause
my current attempt:
update chorecompletion
set completeddate = sysdate, completedby=1
where --How to get the result of the previous query here?
Any help on my logic would be hugely appreciated.
Example desired result:
Before Update:
CHOREID GROUPID DUEDATE COMPLETEDDATE COMPLETEDBY
-------------------------------------------------------------------
12 6 2018-11-1
12 6 2018-10-1
After Update
CHOREID GROUPID DUEDATE COMPLETEDDATE COMPLETEDBY
-------------------------------------------------------------------
12 6 2018-11-1
12 6 2018-10-1 2018-09-30 1
Something like this?
SQL> create table test
       (choreid number,
        groupid number,
        duedate date,
        completeddate date,
        completedby number
       );
Table created.
SQL> insert into test
       select 12, 6, date '2018-01-11', null, null from dual union all
       select 12, 6, date '2018-01-10', null, null from dual;
2 rows created.
SQL> update test t set
       t.completeddate = sysdate,
       t.completedby = 1
     where t.duedate = (select min(t1.duedate)
                        from test t1
                        where t1.choreid = t.choreid
                        and t1.groupid = t.groupid)
     and t.choreid = 12
     and t.groupid = 6;
1 row updated.
SQL> select * From test;
CHOREID GROUPID DUEDATE COMPLETEDD COMPLETEDBY
---------- ---------- ---------- ---------- -----------
12 6 2018-01-11
12 6 2018-01-10 2018-09-30 1
SQL>
You can use a MERGE statement and join on the ROWID pseudo-column so that you can correlate directly to the matched row:
Oracle 11g R2 Schema Setup:
CREATE TABLE chorecompletion ( choreid, groupid, duedate, completeddate, completedby ) AS
SELECT 12, 6, DATE '2018-09-29', CAST( null AS DATE ), CAST( null AS NUMBER ) FROM DUAL UNION ALL
SELECT 12, 6, DATE '2018-09-30', null, null FROM DUAL;
Query 1:
MERGE INTO chorecompletion dst
USING (
SELECT ROWID AS rid
FROM (
SELECT *
FROM chorecompletion
WHERE choreid = 12
AND groupid = 6
ORDER BY duedate ASC
)
WHERE ROWNUM = 1
) src
ON ( src.RID = dst.ROWID )
WHEN MATCHED THEN
UPDATE
SET completeddate = sysdate,
completedby = 1
Results:
1 Row Updated.
Query 2:
SELECT * FROM chorecompletion
Results:
| CHOREID | GROUPID | DUEDATE | COMPLETEDDATE | COMPLETEDBY |
|---------|---------|----------------------|----------------------|-------------|
| 12 | 6 | 2018-09-29T00:00:00Z | 2018-09-30T18:42:45Z | 1 |
| 12 | 6 | 2018-09-30T00:00:00Z | (null) | (null) |
Query 3: You could also use an UPDATE statement with the ROWID pseudo-column:
UPDATE chorecompletion dst
SET completeddate = sysdate,
    completedby = 2
WHERE ROWID = (
  SELECT rid
  FROM (
    -- expose the ROWID explicitly so the outer query can select it
    SELECT ROWID AS rid,
           ROW_NUMBER() OVER ( PARTITION BY choreid ORDER BY duedate ) rn
    FROM chorecompletion
    WHERE choreid = 12
    AND groupid = 6
  )
  WHERE rn = 1
)
Results:
1 Row Updated.
Query 4:
SELECT * FROM chorecompletion
Results:
| CHOREID | GROUPID | DUEDATE | COMPLETEDDATE | COMPLETEDBY |
|---------|---------|----------------------|----------------------|-------------|
| 12 | 6 | 2018-09-29T00:00:00Z | 2018-09-30T18:42:45Z | 2 |
| 12 | 6 | 2018-09-30T00:00:00Z | (null) | (null) |
You can use a correlated subquery. If I understand correctly:
update chorecompletion
set completeddate = (select min(duedate)
from chorecompletion cc
where cc.choreid = chorecompletion.choreid
)
where choreid = 12 and groupid = 6
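
For a quick check of the "update only the row with the oldest duedate" logic, here is a minimal sketch using Python's sqlite3 module. This is an assumption made for illustration: SQLite's rowid plays the role of Oracle's ROWID, and ORDER BY with LIMIT 1 replaces the ROWNUM and ROW_NUMBER filters, so the statement is not Oracle syntax. The completion date is passed as a parameter instead of sysdate to keep the output deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE chorecompletion (
        choreid INTEGER NOT NULL, groupid INTEGER NOT NULL, duedate TEXT NOT NULL,
        completeddate TEXT, completedby INTEGER
    );
    INSERT INTO chorecompletion (choreid, groupid, duedate) VALUES
        (12, 6, '2018-11-01'),
        (12, 6, '2018-10-01');
""")

# Pick the single row with the earliest duedate by its rowid, then update only that row.
cur = conn.execute(
    """
    UPDATE chorecompletion
    SET completeddate = ?, completedby = ?
    WHERE rowid = (
        SELECT rowid
        FROM chorecompletion
        WHERE choreid = ? AND groupid = ?
        ORDER BY duedate
        LIMIT 1
    )
    """,
    ("2018-09-30", 1, 12, 6),
)
updated = cur.rowcount
after = conn.execute(
    "SELECT duedate, completeddate, completedby FROM chorecompletion ORDER BY duedate"
).fetchall()
print(updated, after)
# 1 [('2018-10-01', '2018-09-30', 1), ('2018-11-01', None, None)]
```

Matching on the row identifier rather than on duedate means exactly one row is updated even if two rows share the same oldest duedate, which the min(duedate) comparison would not guarantee.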

Oracle 12c - sql to find out of order rows

I have a table with following columns:
FILE_NAME VARCHAR2(30);
STATUS VARCHAR2(2);
DEPT_ID NUMBER;
DEPT_SUB_ID NUMBER;
CREATE_DATE DATE;
sample data:
FILE_NAME STATUS DEPT_ID DEPT_SUB_ID CREATE_DATE
--------- ------- -------- ----------- ----------
TEST_20180806222127 C 1 10 07-AUG-18 01.04.47.821795000 AM
TEST_20180806221940 C 1 10 07-AUG-18 04.12.20.957400000 AM
TEST_20180806221733 C 1 10 07-AUG-18 03.35.27.809494000 AM
TEST_20180805202020 C 1 20 06-AUG-18 02.24.47.821795000 AM
TEST_20180805201640 C 1 20 06-AUG-18 00.42.20.957400000 AM
TEST_20180805201530 C 1 20 06-AUG-18 03.55.27.809494000 AM
FILE_NAME consists of: <TYPE>_<DATETIME>
I want to write a query for each DEPT_ID, DEPT_SUB_ID to determine which files with STATUS = 'C' were created out of order based on the <DATETIME> on FILE_NAME and CREATE_DATE field. In this example, for DEPT_SUB_ID = 10, file TEST_20180806222127 was created before the other 2 based on the DATE_TIME on the file name so I would need to return only this file name in result for DEPT_SUB_ID = 10. For DEPT_SUB_ID = 20, result should contain TEST_20180805201640 and TEST_20180805202020 since both were created before TEST_20180805201530, which is considered out of order.
The expected result of the query is every file name that was created before its turn in the run order.
You can assign two rankings to each row, one based on the order of the timestamp embedded in the file name, the other on the order of the creation date:
select yt.*,
row_number() over (partition by dept_id, dept_sub_id
order by to_date(substr(file_name, -14), 'YYYYMMDDHH24MISS')) as rn_file_name,
row_number() over (partition by dept_id, dept_sub_id
order by create_date) as rn_create_date
from your_table yt;
FILE_NAME S DEPT_ID DEPT_SUB_ID CREATE_DATE RN_FILE_NAME RN_CREATE_DATE
------------------- - ---------- ----------- ----------------------------- ------------ --------------
TEST_20180806221733 C 1 10 2018-08-07 03:35:27.809494000 1 2
TEST_20180806221940 C 1 10 2018-08-07 04:12:20.957400000 2 3
TEST_20180806222127 C 1 10 2018-08-07 01:04:47.821795000 3 1
TEST_20180805201530 C 1 20 2018-08-06 03:55:27.809494000 1 3
TEST_20180805201640 C 1 20 2018-08-06 00:42:20.957400000 2 1
TEST_20180805202020 C 1 20 2018-08-06 02:24:47.821795000 3 2
Then filter to see the mismatches:
select file_name, status, dept_id, dept_sub_id, create_date
from (
select yt.*,
row_number() over (partition by dept_id, dept_sub_id
order by to_date(substr(file_name, -14), 'YYYYMMDDHH24MISS')) as rn_file_name,
row_number() over (partition by dept_id, dept_sub_id
order by create_date) as rn_create_date
from your_table yt
)
where rn_file_name > rn_create_date;
FILE_NAME S DEPT_ID DEPT_SUB_ID CREATE_DATE
------------------- - ---------- ----------- -----------------------------
TEST_20180806222127 C 1 10 2018-08-07 01:04:47.821795000
TEST_20180805201640 C 1 20 2018-08-06 00:42:20.957400000
TEST_20180805202020 C 1 20 2018-08-06 02:24:47.821795000
And you can add a filter for a specific ID or sub-ID, either in the inner or outer query, if you don't want to see them all at once.
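
The two-ranking comparison can be reproduced on the question's sample data with the sketch below. It assumes Python's sqlite3 module in place of Oracle, so the 14-digit suffix is sorted as text via substr instead of being converted with to_date, and create_date is stored as ISO text truncated to seconds. The table name files is hypothetical, and the status = 'C' filter is taken from the question rather than from the answer's query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE files (
        file_name TEXT, status TEXT, dept_id INTEGER, dept_sub_id INTEGER, create_date TEXT
    );
    INSERT INTO files VALUES
        ('TEST_20180806222127', 'C', 1, 10, '2018-08-07 01:04:47'),
        ('TEST_20180806221940', 'C', 1, 10, '2018-08-07 04:12:20'),
        ('TEST_20180806221733', 'C', 1, 10, '2018-08-07 03:35:27'),
        ('TEST_20180805202020', 'C', 1, 20, '2018-08-06 02:24:47'),
        ('TEST_20180805201640', 'C', 1, 20, '2018-08-06 00:42:20'),
        ('TEST_20180805201530', 'C', 1, 20, '2018-08-06 03:55:27');
""")

# Rank each file twice per department/sub-department: once by the timestamp in its
# name, once by when it was actually created. A file is out of order when it was
# created earlier than its name-based position allows.
out_of_order_sql = """
    SELECT file_name
    FROM (
        SELECT f.*,
               ROW_NUMBER() OVER (PARTITION BY dept_id, dept_sub_id
                                  ORDER BY substr(file_name, -14)) AS rn_file_name,
               ROW_NUMBER() OVER (PARTITION BY dept_id, dept_sub_id
                                  ORDER BY create_date) AS rn_create_date
        FROM files f
        WHERE status = 'C'
    )
    WHERE rn_file_name > rn_create_date
    ORDER BY file_name
"""
out_of_order = [row[0] for row in conn.execute(out_of_order_sql)]
print(out_of_order)
# ['TEST_20180805201640', 'TEST_20180805202020', 'TEST_20180806222127']
```

Sorting the suffix as text works here only because it is a fixed-width, zero-padded YYYYMMDDHH24MISS string.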

SQL check if group contains certain values of given column (ORACLE)

I have table audit_log with these records:
log_id | request_id | status_id
1 | 2 | 5
2 | 2 | 10
3 | 2 | 20
4 | 3 | 10
5 | 3 | 20
I would like to know if there exists request_ids having status_id 5 and 10 at the same time. So this query should return request_id = 2 as its column status_id has values 5 and 10 (request_id 3 is omitted because status_id column has only value of 10 without 5).
How could I do this with SQL?
I think I should use group by request_id, but I don't know how to check if group has status_id with values 5 and 10?
Thanks,
mismas
This could be a way:
/* input data */
with yourTable(log_id , request_id , status_id) as (
select 1 , 2 , 5 from dual union all
select 2 , 2 , 10 from dual union all
select 3 , 2 , 20 from dual union all
select 4 , 3 , 10 from dual union all
select 5 , 3 , 20 from dual
)
/* query */
select request_id
from yourTable
group by request_id
having count( distinct case when status_id in (5,10) then status_id end) = 2
How it works:
select request_id,
case when status_id in (5,10) then status_id end as checkColumn
from yourTable
gives
REQUEST_ID CHECKCOLUMN
---------- -----------
2 5
2 10
2
3 10
3
So the condition count (distinct ...) = 2 does the work
SELECT request_id
FROM table_name
GROUP BY request_id
HAVING COUNT( CASE status_id WHEN 5 THEN 1 END ) > 0
AND COUNT( CASE status_id WHEN 10 THEN 1 END ) > 0
To check if both values exist (without regard to additional values), you can filter before aggregation:
select request_id
from yourTable
where status_id in (5,10)
group by request_id
having count(*) = 2 -- if status_id is unique within each request_id
-- or, if status_id can occur multiple times within a request_id:
-- having count(distinct status_id) = 2
This should do it:
select
log5.*, log10.status_id
from
audit_log log5
join audit_log log10 on log10.request_id = log5.request_id
where
log5.status_id = 5
and log10.status_id = 10
order by
log5.request_id
;
Here's the output:
+ ----------- + --------------- + -------------- + -------------- +
| log_id | request_id | status_id | status_id |
+ ----------- + --------------- + -------------- + -------------- +
| 1 | 2 | 5 | 10 |
+ ----------- + --------------- + -------------- + -------------- +
1 rows
And here's the sql to set up the example:
create table audit_log (
log_id int,
request_id int,
status_id int
);
insert into audit_log values (1,2,5);
insert into audit_log values (2,2,10);
insert into audit_log values (3,2,20);
insert into audit_log values (4,3,10);
insert into audit_log values (5,3,20);
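
The grouping answers above can be checked quickly with the following sketch. It assumes Python's sqlite3 module instead of Oracle and uses the filter-then-count variant with COUNT(DISTINCT ...), which stays correct even if a status_id repeats within a request.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE audit_log (log_id INTEGER, request_id INTEGER, status_id INTEGER);
    INSERT INTO audit_log VALUES
        (1, 2, 5),
        (2, 2, 10),
        (3, 2, 20),
        (4, 3, 10),
        (5, 3, 20);
""")

# Keep only the statuses of interest, then require both distinct values per request.
both_sql = """
    SELECT request_id
    FROM audit_log
    WHERE status_id IN (5, 10)
    GROUP BY request_id
    HAVING COUNT(DISTINCT status_id) = 2
    ORDER BY request_id
"""
matching = [row[0] for row in conn.execute(both_sql)]
print(matching)
# [2]
```

Request 3 is dropped because only status 10 survives the WHERE filter for it, so its distinct count is 1.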