Compare column value with previous record in Oracle - sql

My oracle table data is as below.
Org_ID  Product_ID  Order_Month  Amount
------  ----------  -----------  ------
101     201         JAN-2021     2000
101     201         FEB-2021     2000
101     201         MAR-2021     2000
101     201         APR-2021     1500
101     201         MAY-2021     2000
101     202         JUN-2021     2000
101     202         JUL-2021     2000
We need to compare each amount with the previous month's amount for the same Product_ID and find the records where the amount changes.
My output should look like the table below, with consecutive months that share an amount collapsed into one range. I tried using LAG but couldn't find a solution. Can someone please suggest how to approach this?
Org_ID  Product_ID  Order_Month            Amount
------  ----------  ---------------------  ------
101     201         JAN-2021 to MAR-2021   2000
101     201         APR-2021 to APR-2021   1500
101     201         MAY-2021 to MAY-2021   2000
101     202         JUN-2021 to JUL-2021   2000

You may try the following
SELECT
    "Org_ID",
    "Product_ID",
    CONCAT(
        CONCAT(Order_Month_Group, ' to '),
        TO_CHAR(MAX(actual_date), 'MON-YYYY')
    ) AS Order_Month,
    "Amount"
FROM (
    SELECT
        t1.*,
        LAG(
            "Order_Month",
            CASE WHEN continued = 0 THEN 0 ELSE seq_num - 1 END,
            "Order_Month"
        ) OVER (
            PARTITION BY "Org_ID", "Product_ID", "Amount"
            ORDER BY actual_date
        ) AS Order_Month_Group
    FROM (
        SELECT
            t.*,
            TO_DATE(t."Order_Month", 'MON-YYYY') AS actual_date,
            ROW_NUMBER() OVER (
                PARTITION BY t."Org_ID", t."Product_ID", t."Amount"
                ORDER BY TO_DATE("Order_Month", 'MON-YYYY')
            ) AS seq_num,
            CASE
                WHEN t."Amount" = LAG(t."Amount", 1, t."Amount") OVER (
                    PARTITION BY t."Org_ID", t."Product_ID"
                    ORDER BY TO_DATE("Order_Month", 'MON-YYYY')
                ) THEN 1
                ELSE 0
            END AS continued
        FROM
            my_oracle_table t
    ) t1
) t2
GROUP BY "Org_ID", "Product_ID", Order_Month_Group, "Amount"
ORDER BY MIN(actual_date)
or
SELECT
    "Org_ID",
    "Product_ID",
    CONCAT(
        CONCAT(TO_CHAR(MIN(actual_date), 'MON-YYYY'), ' to '),
        TO_CHAR(MAX(actual_date), 'MON-YYYY')
    ) AS Order_Month,
    "Amount"
FROM (
    SELECT
        t1.*,
        -- running total of the change flags, restarted for each Org_ID/Product_ID
        SUM(continued) OVER (
            PARTITION BY "Org_ID", "Product_ID"
            ORDER BY actual_date
        ) AS grp
    FROM (
        SELECT
            t.*,
            TO_DATE("Order_Month", 'MON-YYYY') AS actual_date,
            -- here continued = 1 marks a row whose amount differs from the previous month
            CASE
                WHEN t."Amount" = LAG(t."Amount", 1, t."Amount") OVER (
                    PARTITION BY t."Org_ID", t."Product_ID"
                    ORDER BY TO_DATE("Order_Month", 'MON-YYYY')
                ) THEN 0
                ELSE 1
            END AS continued
        FROM
            my_oracle_table t
    ) t1
) t2
GROUP BY "Org_ID", "Product_ID", grp, "Amount"
ORDER BY MIN(actual_date)
Let me know if this works for you.

This is a type of gaps-and-islands problem. In this case, the simplest solution is probably the difference of row numbers. The following assumes that order_month is actually a string (rather than a date):
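select org_id, product_id, amount,
       -- convert to dates before min/max; comparing the strings would sort the months alphabetically
       to_char(min(to_date(order_month, 'MON-YYYY')), 'MON-YYYY') as first_month,
       to_char(max(to_date(order_month, 'MON-YYYY')), 'MON-YYYY') as last_month
from (select t.*,
             row_number() over (partition by org_id, product_id order by to_date(order_month, 'MON-YYYY')) as seqnum,
             row_number() over (partition by org_id, product_id, amount order by to_date(order_month, 'MON-YYYY')) as seqnum_2
      from t
     ) t
group by org_id, product_id, amount, (seqnum - seqnum_2);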
Why this works is a little tricky to explain. However, if you look at the results of the subquery, you will see that the difference between these two values is constant across adjacent months that share the same amount.
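To make the difference-of-row-numbers idea easier to follow, here is a minimal Python sketch that traces it on the question's sample rows. It is not Oracle code: the function names `month_key` and `islands` are made up for illustration, and the two dictionaries only imitate the two ROW_NUMBER() calls.

```python
MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]

rows = [
    (101, 201, "JAN-2021", 2000),
    (101, 201, "FEB-2021", 2000),
    (101, 201, "MAR-2021", 2000),
    (101, 201, "APR-2021", 1500),
    (101, 201, "MAY-2021", 2000),
    (101, 202, "JUN-2021", 2000),
    (101, 202, "JUL-2021", 2000),
]


def month_key(text):
    """Turn 'JAN-2021' into a sortable (year, month index) pair."""
    mon, year = text.split("-")
    return (int(year), MONTHS.index(mon))


def islands(rows):
    """Group consecutive months with the same amount per (org, product)."""
    ordered = sorted(rows, key=lambda r: (r[0], r[1], month_key(r[2])))
    seqnum = {}    # imitates ROW_NUMBER() per (org, product)
    seqnum_2 = {}  # imitates ROW_NUMBER() per (org, product, amount)
    groups = {}
    for org, product, month, amount in ordered:
        k1 = (org, product)
        k2 = (org, product, amount)
        seqnum[k1] = seqnum.get(k1, 0) + 1
        seqnum_2[k2] = seqnum_2.get(k2, 0) + 1
        diff = seqnum[k1] - seqnum_2[k2]  # constant within one island
        groups.setdefault((org, product, amount, diff), []).append(month)
    return [
        (org, product, f"{months[0]} to {months[-1]}", amount)
        for (org, product, amount, _diff), months in groups.items()
    ]


for line in islands(rows):
    print(line)
```

For product 201 the differences come out as 0, 0, 0 for January to March, 3 for April and 1 for May, so the May row does not merge with the earlier 2000 rows even though the amount is the same.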

Related

SQL Server : create group of N rows each and give group number for each group

I want to create a SQL query that selects an ID column and adds an extra column containing a group number, as shown in the output below.
Each group consists of 3 rows and should use the MIN(ID) of the group as its group number. The rows should be ordered by the ID column ascending.
ID GroupNr
------------
100 100
101 100
102 100
103 103
104 103
105 103
106 106
107 106
108 106
I've tried solutions with ROW_NUMBER() and DENSE_RANK(). And also this query:
SELECT
*, MIN(ID) OVER (ORDER BY ID ASC ROWS 2 PRECEDING) AS Groupnr
FROM
Table
ORDER BY
ID ASC
Use row_number() to enumerate the rows, arithmetic to assign the group and then take the minimum of the id:
SELECT t.*, MIN(ID) OVER (PARTITION BY grp) as groupnumber
FROM (SELECT t.*,
             -- integer division: rows 1-3 get grp 0, rows 4-6 get grp 1, and so on
             ( (ROW_NUMBER() OVER (ORDER BY ID) - 1) / 3) as grp
      FROM Table t
     ) t
ORDER BY ID ASC;
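The arithmetic in the query above can be traced with a short Python sketch. The function name `group_numbers` and its `size` parameter are hypothetical; the loop index stands in for ROW_NUMBER() - 1 and `//` for the integer division.

```python
def group_numbers(ids, size=3):
    """Pair each ID with the smallest ID of its block of `size` rows."""
    ordered = sorted(ids)
    result = []
    for index, value in enumerate(ordered):  # index == ROW_NUMBER() - 1
        grp = index // size                  # block number: 0, 0, 0, 1, 1, 1, ...
        group_min = ordered[grp * size]      # first (smallest) ID in that block
        result.append((value, group_min))
    return result


for pair in group_numbers(range(100, 109)):
    print(pair)
```

Because the IDs are sorted, the first row of each block is also its minimum, which is what MIN(ID) OVER (PARTITION BY grp) returns in the SQL version.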
It is possible to do this without a subquery, but the logic is rather messy:
select t.*,
(case when row_number() over (order by id) % 3 = 0
then lag(id, 2) over (order by id)
when row_number() over (order by id) % 3 = 2
then lag(id, 1) over (order by id)
else id
end) as groupnumber
from table t
order by id;
Assuming you want the lowest value in each group, and the groups always contain 3 rows, then rather than using NTILE (as Saravantn suggests, which splits the data into a given number of even(ish) groups), you could use a couple of window functions:
WITH Grps AS(
SELECT V.ID,
(ROW_NUMBER() OVER (ORDER BY V.ID) -1) / 3 AS Grp
FROM (VALUES(100),
(101),
(102),
(103),
(104),
(105),
(106),
(107),
(108))V(ID))
SELECT G.ID,
MIN(G.ID) OVER (PARTITION BY G.Grp) AS GroupNr
FROM Grps G;
SELECT T2.ID, T1.ID
FROM (
SELECT MIN(ID) AS ID, GroupNr
FROM
(
SELECT ID, ( Row_number()OVER(ORDER BY ID) - 1 ) / 3 + 1 AS GroupNr
FROM Table
) AS T1
GROUP BY GroupNr
) AS T1
INNER JOIN (
SELECT ID, ( Row_number()OVER(ORDER BY ID) - 1 ) / 3 + 1 AS GroupNr
FROM Table
) T2 ON T1.GroupNr = T2.GroupNr

How to build SQL to capture most unique value?

I am trying to build a query result with SQL. Here is my table:
CUST_ID ORDER_ID STORE_FREQUENCY
---------- ----------- ---------------
100 20122 500
100 20100 500
100 20100 737
200 20119 287
300 20130 434
300 20150 434
300 20130 434
300 20120 120
The expected output is:
CUST_ID UNIQUE_ORDERS TOP_STORE
--------- ----------------- ---------
100 2 737
200 1 287
300 3 434
The requirement for the output is:
TOP_STORE = Per CUST_ID, sort the STORE_FREQUENCY column by DESC and get the greatest store frequency
UNIQUE_ORDERS = Per CUST_ID, the number of unique ORDER_IDs in the column
I have started this SELECT statement, but am having difficulty completing it so that it returns both columns correctly:
Select cust_id, Count(order_id) as unique_orders
From ORDERS_TABLE
Group By Order_ID
Can you help me complete the 2 columns?
Use aggregate functions such as COUNT(DISTINCT ...) and MAX()
SELECT CUST_ID, COUNT(DISTINCT ORDER_ID), MAX(STORE_FREQUENCY )
FROM TableName
GROUP BY CUST_ID
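As a rough cross-check of what COUNT(DISTINCT ...) and MAX() compute per customer, here is a small Python sketch over the question's sample rows. The name `summarize` is made up for illustration.

```python
orders = [
    (100, 20122, 500),
    (100, 20100, 500),
    (100, 20100, 737),
    (200, 20119, 287),
    (300, 20130, 434),
    (300, 20150, 434),
    (300, 20130, 434),
    (300, 20120, 120),
]


def summarize(orders):
    """Return (cust_id, unique order count, highest store frequency) per customer."""
    order_ids = {}  # cust_id -> set of distinct ORDER_IDs
    top_store = {}  # cust_id -> largest STORE_FREQUENCY seen so far
    for cust, order, freq in orders:
        order_ids.setdefault(cust, set()).add(order)
        top_store[cust] = max(top_store.get(cust, freq), freq)
    return [(cust, len(order_ids[cust]), top_store[cust])
            for cust in sorted(order_ids)]


for row in summarize(orders):
    print(row)
```

The set plays the role of DISTINCT: customer 100 has three rows but only two different order IDs.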
It seems to me that the top store should be the store with the greatest number of orders. If so, then CUST_ID 100 should have store 500 as the top store, not 737. In other words, I would expect the following output:
CUST_ID UNIQUE_ORDERS TOP_STORE
--------- ----------------- ---------
100 2 500
200 1 287
300 3 434
This requirement changes the query strategy, because we no longer can just do a single simple aggregation over the entire table. One approach is to do a separate calculation to find the top store for each customer, then join that result to a query similar to the other answers.
WITH cte AS (
SELECT CUST_ID, STORE_FREQUENCY, cnt,
ROW_NUMBER() OVER (PARTITION BY CUST_ID ORDER BY cnt DESC) rn
FROM
(
SELECT CUST_ID, STORE_FREQUENCY,
COUNT(*) OVER (PARTITION BY CUST_ID, STORE_FREQUENCY) cnt
FROM yourTable
) t
)
SELECT
t1.CUST_ID,
t1.UNIQUE_ORDERS,
t2.TOP_STORE
FROM
(
SELECT CUST_ID, COUNT(DISTINCT ORDER_ID) AS UNIQUE_ORDERS
FROM yourTable
GROUP BY CUST_ID
) t1
INNER JOIN
(
SELECT CUST_ID, STORE_FREQUENCY AS TOP_STORE
FROM cte
WHERE rn = 1
) t2
ON t1.CUST_ID = t2.CUST_ID;
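Under the alternative reading above (top store means the store that appears most often for the customer), the per-customer calculation can be sketched in Python as follows. The name `top_store_by_count` is hypothetical, and ties are broken by first appearance here, whereas ROW_NUMBER() in the SQL breaks them arbitrarily.

```python
from collections import Counter

orders = [
    (100, 20122, 500),
    (100, 20100, 500),
    (100, 20100, 737),
    (200, 20119, 287),
    (300, 20130, 434),
    (300, 20150, 434),
    (300, 20130, 434),
    (300, 20120, 120),
]


def top_store_by_count(orders):
    """Map each customer to the store value that occurs in the most rows."""
    counts = {}  # cust_id -> Counter of store values
    for cust, _order, store in orders:
        counts.setdefault(cust, Counter())[store] += 1
    # most_common(1) returns the single (store, count) pair with the highest count
    return {cust: counter.most_common(1)[0][0]
            for cust, counter in counts.items()}


print(top_store_by_count(orders))
```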

how to use same column twice with different criteria with one common column in sql

I have a table
ID P_ID Cost
1 101 1000
2 101 1050
3 101 1100
4 102 5000
5 102 2000
6 102 6000
7 103 3000
8 103 5000
9 103 4000
I want to use 'Cost' column twice to fetch first and last inserted value in cost corresponding to each P_ID
I want output as:
P_ID First_Cost Last_Cost
101 1000 1100
102 5000 6000
103 3000 4000
;WITH t AS
(
SELECT P_ID, Cost,
f = ROW_NUMBER() OVER (PARTITION BY P_ID ORDER BY ID),
l = ROW_NUMBER() OVER (PARTITION BY P_ID ORDER BY ID DESC)
FROM dbo.tablename
)
SELECT t.P_ID, t.Cost, t2.Cost
FROM t INNER JOIN t AS t2
ON t.P_ID = t2.P_ID
WHERE t.f = 1 AND t2.l = 1;
In SQL Server 2012 and later you can use FIRST_VALUE():
SELECT DISTINCT
P_ID,
FIRST_VALUE(Cost) OVER (PARTITION BY P_ID ORDER BY ID),
FIRST_VALUE(Cost) OVER (PARTITION BY P_ID ORDER BY ID DESC)
FROM dbo.tablename;
You get a slightly more favorable plan if you remove the DISTINCT and instead use ROW_NUMBER() with the same partitioning to eliminate multiple rows with the same P_ID:
;WITH t AS
(
SELECT
P_ID,
f = FIRST_VALUE(Cost) OVER (PARTITION BY P_ID ORDER BY ID),
l = FIRST_VALUE(Cost) OVER (PARTITION BY P_ID ORDER BY ID DESC),
r = ROW_NUMBER() OVER (PARTITION BY P_ID ORDER BY ID)
FROM dbo.tablename
)
SELECT P_ID, f, l FROM t WHERE r = 1;
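Why not LAST_VALUE(), you ask? Well, it doesn't work like you might expect. With an ORDER BY and no explicit frame, the window defaults to RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, so LAST_VALUE() returns the value from the current row (or its last peer) unless you specify ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING. For more details, see the comments under the documentation.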
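The LAST_VALUE() surprise comes from the window frame: when ORDER BY is given without an explicit frame, the frame ends at the current row. The Python sketch below imitates that with list slices on the question's data; `last_value` and its `full_frame` flag are hypothetical names, and because ID is unique the default RANGE frame is modelled simply as "all rows up to the current one".

```python
rows = [
    (1, 101, 1000), (2, 101, 1050), (3, 101, 1100),
    (4, 102, 5000), (5, 102, 2000), (6, 102, 6000),
    (7, 103, 3000), (8, 103, 5000), (9, 103, 4000),
]


def last_value(rows, full_frame):
    """Emulate LAST_VALUE(Cost) OVER (PARTITION BY P_ID ORDER BY ID)."""
    partitions = {}
    for row_id, p_id, cost in sorted(rows):  # sorted by ID
        partitions.setdefault(p_id, []).append((row_id, cost))
    out = []
    for p_id, part in partitions.items():
        for position, (row_id, _cost) in enumerate(part):
            # default frame stops at the current row; the full frame spans the partition
            frame = part if full_frame else part[: position + 1]
            out.append((row_id, p_id, frame[-1][1]))
    return out


print([cost for _, _, cost in last_value(rows, full_frame=False)])
print([cost for _, _, cost in last_value(rows, full_frame=True)])
```

With the default frame every row just gets its own cost back; only the full frame yields 1100, 6000 and 4000 for the three P_ID values.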
SELECT t.P_ID,
SUM(CASE WHEN ID = t.minID THEN Cost ELSE 0 END) as FirstCost,
SUM(CASE WHEN ID = t.maxID THEN Cost ELSE 0 END) as LastCost
FROM myTable
JOIN (
SELECT P_ID, MIN(ID) as minID, MAX(ID) as maxID
FROM myTable
GROUP BY P_ID) t ON myTable.ID IN (t.minID, t.maxID)
GROUP BY t.P_ID
Admittedly, @AaronBertrand's approach is cleaner here. However, this solution will work on older versions of SQL Server (that don't support CTEs or window functions), or on pretty much any other DBMS.
Do you want first and last in terms of Min and Max, or do you want which one was entered first and which one was entered last? If you want Min and max you can group by.
SELECT P_ID, MIN(Cost), MAX(Cost) FROM table_name GROUP BY P_ID
I believe this also does the job without self-joins or subqueries, provided "first" and "last" mean the minimum and maximum cost:
SELECT DISTINCT
P_ID
,MIN(Cost) OVER (PARTITION BY P_ID) as FirstCost
,MAX(Cost) OVER (PARTITION BY P_ID) as LastCost
FROM Table

Get the row with max(column) for distinct key

I have some data like
code amount month
aaa1 100 1
aaa1 200 2
aaa1 300 3
aaa4 450 2
aaa4 400 3
aaa6 0 2
From the above, for each code I want to get the row with max(month)
code amount month
aaa1 300 3
aaa4 400 3
aaa6 0 2
How can I write a SQL Server (MS SQL) query for this?
;WITH MyCTE AS
(
SELECT code,
amount,
month,
ROW_NUMBER() OVER(PARTITION BY code ORDER BY code,month DESC) AS rownum
FROM table
)
SELECT *
FROM MyCTE
WHERE rownum = 1
You can use the ranking function ROW_NUMBER() with PARTITION BY code ORDER BY month DESC to do this:
WITH CTE
AS
(
SELECT
code, amount, month,
ROW_NUMBER() OVER(PARTITION BY code
ORDER BY month DESC) AS RN
FROM Tablename
)
SELECT code, amount, month
FROM CTE
WHERE RN = 1;
This will give you the row with the maximum month for each code.
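The same "keep the row ranked first within each code" idea can be traced in Python on the question's data. This is only a sketch; `latest_rows` is a made-up name.

```python
data = [
    ("aaa1", 100, 1),
    ("aaa1", 200, 2),
    ("aaa1", 300, 3),
    ("aaa4", 450, 2),
    ("aaa4", 400, 3),
    ("aaa6", 0, 2),
]


def latest_rows(data):
    """Keep, for each code, the whole row that has the largest month."""
    best = {}  # code -> row with the highest month seen so far
    for code, amount, month in data:
        if code not in best or month > best[code][2]:
            best[code] = (code, amount, month)
    return [best[code] for code in sorted(best)]


for row in latest_rows(data):
    print(row)
```

Note that the amount comes along with the winning row, which is why a plain GROUP BY code with MAX(month) is not enough on its own.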
Try this
SELECT *
FROM
(SELECT MAX(MONTH) month, code
FROM table1
GROUP BY code) res
JOIN table1
ON res.month = table1.month
AND res.code = table1.code

Rows inside the greatest streak?

Given the Rows
symbol_id profit date
1 100 2009-08-18 01:01:00
1 100 2009-08-18 01:01:01
1 156 2009-08-18 01:01:04
1 -56 2009-08-18 01:01:06
1 18 2009-08-18 01:01:07
How would I most efficiently select the rows that are involved in the greatest streak (of profit)?
The greatest streak would be the first 3 rows, and I would want those rows. The query I came up with is just a bunch of nested queries and derived tables. I am looking for an efficient way to do this using common table expressions or something more advanced.
You haven't defined how 0 profit should be treated or what happens if there is a tie for longest streak. But something like...
;WITH T1 AS
(
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY symbol_id ORDER BY date) -
           ROW_NUMBER() OVER (PARTITION BY symbol_id, SIGN(profit)
                              ORDER BY date) AS Grp
    FROM Data
), T2 AS
(
    SELECT *,
           -- SIGN(profit) is part of the key: Grp alone can repeat across a gain run and a loss run
           COUNT(*) OVER (PARTITION BY symbol_id, SIGN(profit), Grp) AS StreakLen
    FROM T1
)
SELECT TOP 1 WITH TIES *
FROM T2
ORDER BY StreakLen DESC
Or, if you are looking for the most profitable streak:
;WITH T1 AS
(
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY symbol_id ORDER BY date) -
           ROW_NUMBER() OVER (PARTITION BY symbol_id, CASE WHEN profit >= 0 THEN 1 END
                              ORDER BY date) AS Grp
    FROM Data
), T2 AS
(
    SELECT *,
           -- the CASE expression is part of the key so gain and loss runs never share a group
           SUM(profit) OVER (PARTITION BY symbol_id, CASE WHEN profit >= 0 THEN 1 END, Grp) AS StreakProfit
    FROM T1
)
SELECT TOP 1 WITH TIES *
FROM T2
ORDER BY StreakProfit DESC
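For readers who want to see the streak grouping step by step, here is a Python sketch of the longest-streak variant on the question's rows. The names `sign` and `longest_streak` are hypothetical, and the sketch keys each run on the sign of the profit as well as on the row-number difference, since the difference by itself can repeat between a run of gains and a run of losses.

```python
def sign(value):
    """Return -1, 0 or 1, like SIGN() in SQL."""
    return (value > 0) - (value < 0)


def longest_streak(rows):
    """Return the rows that belong to the longest run of same-sign profits."""
    ordered = sorted(rows, key=lambda r: (r[0], r[2]))  # by symbol_id, then date
    seq_all = {}   # row number per symbol_id
    seq_sign = {}  # row number per (symbol_id, sign of profit)
    runs = {}
    for symbol, profit, date in ordered:
        s = sign(profit)
        seq_all[symbol] = seq_all.get(symbol, 0) + 1
        seq_sign[(symbol, s)] = seq_sign.get((symbol, s), 0) + 1
        grp = seq_all[symbol] - seq_sign[(symbol, s)]
        runs.setdefault((symbol, s, grp), []).append((symbol, profit, date))
    longest = max(len(run) for run in runs.values())
    # keep every run that ties for the longest length, like TOP 1 WITH TIES
    return [row for run in runs.values() if len(run) == longest for row in run]


rows = [
    (1, 100, "2009-08-18 01:01:00"),
    (1, 100, "2009-08-18 01:01:01"),
    (1, 156, "2009-08-18 01:01:04"),
    (1, -56, "2009-08-18 01:01:06"),
    (1, 18, "2009-08-18 01:01:07"),
]

for row in longest_streak(rows):
    print(row)
```

The dates are kept as ISO-formatted strings here, which sort correctly as text; a real table would use a datetime column.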
declare @T table
(
    symbol_id int,
    profit int,
    [date] datetime
)
insert into @T values
(1, 100, '2009-08-18 01:01:00'),
(1, 100, '2009-08-18 01:01:01'),
(1, 156, '2009-08-18 01:01:04'),
(1, -56, '2009-08-18 01:01:06'),
(1, 18 , '2009-08-18 01:01:07')
;with C1 as
(
    select *,
           row_number() over(order by [date]) as rn
    from @T
),
C2 as
(
    -- only non-negative rows are numbered again, so rn minus that number is constant within a streak
    select *,
           rn - row_number() over(order by rn) as grp
    from C1
    where profit >= 0
)
select top 1 with ties *
from C2
order by sum(profit) over(partition by grp) desc
Result:
symbol_id  profit  date                     rn  grp
---------  ------  -----------------------  --  ---
1          100     2009-08-18 01:01:00.000  1   0
1          100     2009-08-18 01:01:01.000  2   0
1          156     2009-08-18 01:01:04.000  3   0
If this is SQL Server, consider using TOP 3 in your SELECT clause together with ORDER BY profit DESC.
On MySQL or PostgreSQL, consider using LIMIT with the same ORDER BY.
Hope this helps.