Specific Order by Clause Oracle - sql

Let's say I have a grocery shop with two types of customer, Regular (R) and Corporate (C), with whom I have price agreements based on dates. Sample data would look like:
Type(C/R)  CustID  From Date   Cost
C                  1/11/2017   10
C          1       1/11/2017   12
                   1/11/2017   14
R                  1/11/2017   9
C          1       10/11/2017  11
C                  11/11/2017  15
From the table you can see that Type and CustID are not mandatory. My rate picker matches the maximum number of matching columns from the input, based on from date, to give me the cost to apply.
Sample input (the input will always have type, custid and from date):
Case 1: Type - C, Cust ID - 1, dealdate(fromdate) - 2/11/2017
Output: Row number 2 with price 12
Case 2: Type - C, Cust ID - 2, dealdate(fromdate) - 2/11/2017
Output: Row number 1 with price 10
Case 3: Type - C, Cust ID - 2, dealdate(fromdate) - 12/11/2017
Output: Row number 6 with price 15
My output should match the maximum number of matching columns first and only then check the date; matching columns have higher priority than from date (but the period must still be valid).
My Approach:
select *
from (select row_number() over (partition by partition_column
                                order by from_date desc, type, custid) rn,
             a.*
      from (select r.*, '1' as partition_column
            from rate r
            where from_date <= :d_date
              and (type = :type or type is null)
              and (custid = :custid or custid is null)) a)
where rn = 1;
I am not getting the desired result. Can anyone help, please?

Ok, I think I understand what you want.
create table rate(
type varchar2(5)
,cust_id number
,from_date date not null
,cost number not null
);
insert into rate(type, cust_id, from_date, cost) values('C', null, date '2017-11-01', 10);
insert into rate(type, cust_id, from_date, cost) values('C', 1, date '2017-11-01', 12);
insert into rate(type, cust_id, from_date, cost) values(null, null, date '2017-11-01', 14);
insert into rate(type, cust_id, from_date, cost) values('R', null, date '2017-11-01', 9);
insert into rate(type, cust_id, from_date, cost) values('C', 1, date '2017-11-10', 11);
insert into rate(type, cust_id, from_date, cost) values('C', null, date '2017-11-11', 15);
This statement works by finding records that match either type or customer. The date input must always be satisfied. Higher priority is given to customer ID than to customer type, and if multiple records remain, the one with the most recent from date is picked.
select type, cust_id, cost, from_date
from (select r.*
,case when cust_id = 2 then 1 end as cust_id_matches
,case when type = 'C' then 1 end as type_matches
from rate r
where (type = 'C' or cust_id = 2) -- Either attribute may match
and from_date <= date '2017-11-12' -- Mandatory, must be valid
order
by cust_id_matches asc nulls last -- Order customer ID matches first
,type_matches asc nulls last -- Then matches for type
,from_date desc -- Pick most recent if multiple records
)
where rownum = 1;
Here is a SQL Fiddle
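For readers without an Oracle instance handy, here is a minimal sqlite3 sketch of the same idea (`rownum` becomes `LIMIT 1`, and `NULLS LAST` becomes `CASE` expressions). Note it keeps the question's `IS NULL` wildcard filter, so a row with a non-matching literal `cust_id` is excluded outright rather than left to a tie-break:

```python
import sqlite3

# In-memory copy of the rate table from the DDL above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE rate(type TEXT, cust_id INTEGER, from_date TEXT NOT NULL, cost INTEGER NOT NULL);
INSERT INTO rate VALUES
  ('C',  NULL, '2017-11-01', 10),
  ('C',  1,    '2017-11-01', 12),
  (NULL, NULL, '2017-11-01', 14),
  ('R',  NULL, '2017-11-01',  9),
  ('C',  1,    '2017-11-10', 11),
  ('C',  NULL, '2017-11-11', 15);
""")

def pick_rate(ctype, cust_id, deal_date):
    """NULL columns act as wildcards; exact matches outrank wildcards,
    cust_id outranks type, and the most recent valid from_date wins."""
    row = conn.execute("""
        SELECT cost
        FROM rate
        WHERE from_date <= ?                                -- period must be valid
          AND (type IS NULL OR type = ?)                    -- wildcard or exact match
          AND (cust_id IS NULL OR cust_id = ?)
        ORDER BY CASE WHEN cust_id = ? THEN 0 ELSE 1 END,   -- cust_id match first
                 CASE WHEN type    = ? THEN 0 ELSE 1 END,   -- then type match
                 from_date DESC                             -- then most recent rate
        LIMIT 1
    """, (deal_date, ctype, cust_id, cust_id, ctype)).fetchone()
    return row[0] if row else None

print(pick_rate('C', 1, '2017-11-02'))  # case 1 -> 12
print(pick_rate('C', 2, '2017-11-02'))  # case 2 -> 10
print(pick_rate('C', 2, '2017-11-12'))  # case 3 -> 15
```

This reproduces all three sample cases from the question.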

Related

Select latest available SQL entry state

Consider this DDL:
CREATE TABLE cash_depot_state
(
id INTEGER NOT NULL PRIMARY KEY,
date DATE,
amount REAL,
cash_depot_id INTEGER
);
INSERT INTO cash_depot_state (date, amount, cash_depot_id)
VALUES (DATE('2022-03-02'), 382489, 5);
INSERT INTO cash_depot_state (date, amount, cash_depot_id)
VALUES (DATE('2022-03-03'), 750, 2);
INSERT INTO cash_depot_state (date, amount, cash_depot_id)
VALUES (DATE('2022-03-04'), 750, 3);
INSERT INTO cash_depot_state (date, amount, cash_depot_id)
VALUES (DATE('2022-03-05'), 0, 5);
For an array of dates I need to select sum of all cash depots' actual amounts:
2022-03-01 - no data available - expect 0
2022-03-02 - cash depot #5 changed its value to 382489 - expect 382489
2022-03-03 - cash depot #2 changed its value to 750 - expect 382489 + 750
2022-03-04 - cash depot #3 changed its value to 750 - expect 382489 + 750 + 750
2022-03-05 - cash depot #5 changed its value to 0 - expect 0 + 750 + 750
My best attempt: http://sqlfiddle.com/#!5/94ad0d/1
But I can't figure out how to pick the winner of a subgroup.
You could define the latest amount per cash depot as the record that has row number 1, when you divvy up records by cash_depot_id, and order them descending by date:
SELECT
id,
cash_depot_id,
date,
amount,
ROW_NUMBER() OVER (PARTITION BY cash_depot_id ORDER BY date DESC) rn
FROM
cash_depot_state
This will highlight the latest data from your table - all the relevant rows will have rn = 1:
id  cash_depot_id  date        amount    rn
2   2              2022-03-03  750.0     1
3   3              2022-03-04  750.0     1
4   5              2022-03-05  0.0       1
1   5              2022-03-02  382489.0  2
Now you can use a WHERE clause to filter records to a certain date, e.g. WHERE date <= '2022-03-05':
SELECT
SUM(amount) sum_amount
FROM
(
SELECT amount, ROW_NUMBER() OVER (PARTITION BY cash_depot_id ORDER BY date DESC) rn
FROM cash_depot_state
WHERE date <= '2022-03-05'
) latest
WHERE
rn = 1;
This returns 1500.
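You can check this end to end with Python's built-in sqlite3 module (the fiddle dialect is SQLite); a minimal sketch:

```python
import sqlite3

# Recreate the question's table and data in memory.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cash_depot_state (
  id INTEGER NOT NULL PRIMARY KEY,
  date DATE,
  amount REAL,
  cash_depot_id INTEGER
);
INSERT INTO cash_depot_state (date, amount, cash_depot_id) VALUES
  ('2022-03-02', 382489, 5),
  ('2022-03-03', 750, 2),
  ('2022-03-04', 750, 3),
  ('2022-03-05', 0, 5);
""")

# Latest row per depot (rn = 1) as of 2022-03-05, then sum the amounts.
total = conn.execute("""
    SELECT SUM(amount)
    FROM (SELECT amount,
                 ROW_NUMBER() OVER (PARTITION BY cash_depot_id ORDER BY date DESC) rn
          FROM cash_depot_state
          WHERE date <= '2022-03-05') latest
    WHERE rn = 1
""").fetchone()[0]
print(total)  # 1500.0
```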
A more traditional way to solve this would be a correlated sub-query:
SELECT
SUM(amount) sum_amount
FROM
cash_depot_state s
WHERE
date = (
SELECT MAX(date)
FROM cash_depot_state
WHERE date <= '2022-03-05' AND cash_depot_id = s.cash_depot_id
)
or a join against a materialized sub-query:
SELECT
SUM(amount) sum_amount
FROM
cash_depot_state s
INNER JOIN (
SELECT MAX(date) date, cash_depot_id
FROM cash_depot_state
WHERE date <= '2022-03-05'
GROUP BY cash_depot_id
) latest ON latest.cash_depot_id = s.cash_depot_id AND latest.date = s.date
In large tables, these are potentially faster than the ROW_NUMBER() variant. YMMV, take measurements.
An index that covers date, cash_depot_id, and amount helps all shown approaches:
CREATE INDEX ix_latest_cash ON cash_depot_state (date DESC, cash_depot_id ASC, amount);
To run against a CTE that produces a calendar, any of the above can be correlated as a subquery
WITH RECURSIVE dates(date) AS (
SELECT '2022-03-01'
UNION ALL
SELECT date(date, '+1 day') FROM dates WHERE date < DATE('now')
)
SELECT
date,
IFNULL(
(
-- any of the above approaches with `WHERE date <= dates.date`
), 0
) balance
FROM
dates;
e.g. http://sqlfiddle.com/#!5/94ad0d/12
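As a concrete sketch of that pattern (hypothetical, with the correlated sub-query variant plugged into the calendar CTE and a fixed end date instead of DATE('now') so the output is reproducible):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cash_depot_state (
  id INTEGER NOT NULL PRIMARY KEY,
  date DATE,
  amount REAL,
  cash_depot_id INTEGER
);
INSERT INTO cash_depot_state (date, amount, cash_depot_id) VALUES
  ('2022-03-02', 382489, 5),
  ('2022-03-03', 750, 2),
  ('2022-03-04', 750, 3),
  ('2022-03-05', 0, 5);
""")

rows = conn.execute("""
WITH RECURSIVE dates(date) AS (
  SELECT '2022-03-01'
  UNION ALL
  SELECT date(date, '+1 day') FROM dates WHERE date < '2022-03-05'
)
SELECT dates.date,
       IFNULL((SELECT SUM(amount)
               FROM cash_depot_state s
               WHERE s.date = (SELECT MAX(date)
                               FROM cash_depot_state
                               WHERE date <= dates.date
                                 AND cash_depot_id = s.cash_depot_id)), 0) AS balance
FROM dates
""").fetchall()
for d, balance in rows:
    print(d, balance)
```

Each day's balance is the sum of every depot's most recent amount as of that day, with 0 where no data exists yet.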

SQL: Insert rows where the sum of the DLY rows is less than the WKLY row

Requirement: insert rows ONLY for those cases where the SUM of the DLY rows is less than the WKLY value and the dates of the DLY rows fall within the date range of the WKLY row.
DDL:
create or replace table table_a
(
ID number,
qty number,
date_from date,
date_to date,
grain String
);
insert into table_a values (1,102,'2020-07-04','2020-07-04','DLY');
insert into table_a values (1,1028,'2020-07-05','2020-07-05','DLY');
insert into table_a values (1,2828,'2020-07-06','2020-07-06','DLY');
insert into table_a values (1,3870,'2020-07-05','2020-07-11','WKLY');
I need to insert a new row (highlighted yellow in my screenshot) containing the difference between the SUM of the DLY rows (orange) and the WKLY row (green).
Tried :
select ID , sum(impression) over(partition by id , time_grain),date_from,date_to,time_grain
from tempdw.test_impress;
I don't have access to Snowflake, but here's an example worked out (and tested with your sample data) using PostgreSQL. Hopefully you can tweak it for your own flavour of SQL.
INSERT INTO table_a
SELECT id,
missing_qty,
missing_date,
missing_date,
'DLY' AS grain
FROM ( /* NOTE: Use Average of b.qty because the value repeats on each row selected */
SELECT a.id,
Cast(Avg(b.qty) - Sum(a.qty) AS INTEGER) AS missing_qty,
( /* NOTE: Find an unused date in the week */
SELECT date(date_from + interval '1 day')
FROM table_a
WHERE grain = 'DLY'
AND date_from + interval '1 day' NOT IN
(
SELECT date_from
FROM table_a
WHERE id = a.id
AND grain <> 'WKLY') ) AS missing_date
FROM table_a a
JOIN table_a b
ON a.id = b.id
AND a.date_from BETWEEN b.date_from AND b.date_to
AND a.grain = 'DLY'
AND b.grain = 'WKLY'
GROUP BY a.id ) x
WHERE missing_qty > 0
This seems to work based on the data you've provided:
alter session set week_start = 7; -- Sets start of week to Sunday
insert into table_a (ID, qty, date_from, date_to, grain)
with t1 as (
select *
, concat(year(date_from),'-',week(date_from)) as year_week -- Week used to group records
, max(date_to) over (partition by grain, year_week) as max_dly_date -- Max date already used within week
,dateadd(day,1,max_dly_date) as new_dly_date -- Next date after the max date
,sum(qty) over (partition by grain, year_week) as sum_dly_qty -- Total qty by week and grain
from table_a
)
select dly.ID, (wkly.qty - dly.sum_dly_qty), dly.new_dly_date, dly.new_dly_date, 'DLY'
from t1 dly
inner join t1 wkly on dly.year_week = wkly.year_week and wkly.grain = 'WKLY'
where dly.grain = 'DLY' and dly.date_to = dly.max_dly_date; -- We only need one DLY record in each week
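Both answers boil down to the same arithmetic; a small pure-Python trace of it (dates and quantities taken from the sample data) may help check the expected new row:

```python
from datetime import date, timedelta

# DLY rows as (date_from, qty); for DLY grain, date_to equals date_from.
dly = [(date(2020, 7, 4), 102), (date(2020, 7, 5), 1028), (date(2020, 7, 6), 2828)]
wkly_from, wkly_to, wkly_qty = date(2020, 7, 5), date(2020, 7, 11), 3870

# Only DLY rows inside the WKLY date range count towards the weekly total.
in_week = [(d, q) for d, q in dly if wkly_from <= d <= wkly_to]
missing_qty = wkly_qty - sum(q for _, q in in_week)        # 3870 - (1028 + 2828)
new_date = max(d for d, _ in in_week) + timedelta(days=1)  # day after last used date

print(missing_qty, new_date)  # 14 2020-07-07
```

So the inserted row would be (1, 14, '2020-07-07', '2020-07-07', 'DLY').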

Using SQL Server 2012 how to iterate through an unknown number of rows and calculate date differences

I need to calculate the average number of days between dates when there are two or more dates for an ID: the days between date1 and date2, date2 and date3, etc. The output should be the average interval length per ID. I am looking for a solution that iterates through each date for each ID and then averages the number of days.
I could create a row number and partition by the id but in the actual data there can be up to 20 rows for each ID.
CREATE TABLE #ATABLE(
ID INTEGER NOT NULL
,DATE DATE NOT NULL
);
INSERT INTO #ATABLE(ID,DATE) VALUES (1,'1/1/2019');
INSERT INTO #ATABLE(ID,DATE) VALUES (2,'1/1/2019');
INSERT INTO #ATABLE(ID,DATE) VALUES (2,'1/10/2019');
INSERT INTO #ATABLE(ID,DATE) VALUES (2,'1/20/2019');
INSERT INTO #ATABLE(ID,DATE) VALUES (2,'1/30/2019');
INSERT INTO #ATABLE(ID,DATE) VALUES (3,'1/1/2019');
INSERT INTO #ATABLE(ID,DATE) VALUES (3,'1/10/2019');
--get avg days between orders
DROP TABLE #ATABLE
The output for the above would be:
ID AvgDatediff
1 Null
2 10
3 9
You can use lag to get the previous row (per row), and then find the diff between it and the current row. Then, you can average them out:
SELECT id, AVG(diff)
FROM (SELECT id,
DATEDIFF(DAY, date, LAG(date) OVER (PARTITION BY id
ORDER BY date DESC)) AS diff
FROM #atable) t
GROUP BY id;
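The same idea translates to sqlite3 (DATEDIFF becomes a julianday subtraction, and here LAG runs over an ascending order with the operands flipped); a hypothetical runnable sketch:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE atable(id INTEGER NOT NULL, d DATE NOT NULL);
INSERT INTO atable VALUES
  (1, '2019-01-01'),
  (2, '2019-01-01'), (2, '2019-01-10'), (2, '2019-01-20'), (2, '2019-01-30'),
  (3, '2019-01-01'), (3, '2019-01-10');
""")

# Per row: days since the previous date for the same id, then average per id.
rows = conn.execute("""
    SELECT id, AVG(diff)
    FROM (SELECT id,
                 julianday(d) - julianday(LAG(d) OVER (PARTITION BY id
                                                       ORDER BY d)) AS diff
          FROM atable) t
    GROUP BY id
    ORDER BY id
""").fetchall()
print(rows)  # id 1 has a single date, so its average is None (NULL)
```

AVG ignores the NULL produced by each partition's first row, so only real gaps are averaged.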
The simplest way to get the average difference is:
SELECT id, DATEDIFF(DAY, MIN(date), MAX(date)) / NULLIF(COUNT(*) - 1, 0)
FROM #atable
GROUP BY id;
Note: You may want a * 1.0 if you don't want an integer average.
In other words, the average difference is the latest date minus the earliest date divided by one less than the count. Try it. It works.
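That equivalence is just a telescoping sum: the consecutive gaps add up to max minus min. A quick check in Python with the ID 2 dates:

```python
from datetime import date

dates = sorted([date(2019, 1, 1), date(2019, 1, 10),
                date(2019, 1, 20), date(2019, 1, 30)])

# Consecutive gaps between adjacent dates.
gaps = [(b - a).days for a, b in zip(dates, dates[1:])]   # [9, 10, 10]
avg_of_gaps = sum(gaps) / len(gaps)

# Telescoped form: (latest - earliest) / (count - 1).
telescoped = (dates[-1] - dates[0]).days / (len(dates) - 1)

print(avg_of_gaps, telescoped)  # both are 29/3
```

Note that Python's / is float division; SQL Server's integer division would give 9 here, which is why the * 1.0 remark above matters.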
SELECT id, AVG(DayDiff)
FROM (
SELECT id,
DATEDIFF(dd, date, LEAD(date) OVER (PARTITION BY id ORDER BY date)) AS DayDiff
FROM #atable
) as AA
GROUP BY id;
LEAD(source_column) picks the next row's value on the basis of the ORDER BY clause, i.e. here date.

Merge the records for overlapping dates

I have data as below and want to merge the records for overlapping dates. MIN and MAX of start and end dates for overlapping records should be the Start and end date of merged record.
Before merge:
Item Code Start_date End_date
============== =========== ===========
111 15-May-2004 20-Jun-2004
111 22-May-2004 07-Jun-2004
111 20-Jun-2004 13-Aug-2004
111 27-May-2004 30-Aug-2004
111 02-Sep-2004 23-Dec-2004
222 21-May-2004 19-Aug-2004
Required output:
Item Code Start_date End_date
============== =========== ===========
111 15-May-2004 30-Aug-2004
111 02-Sep-2004 23-Dec-2004
222 21-May-2004 19-Aug-2004
You can create sample data using:
create table item(item_code number, start_date date, end_date date);
insert into item values (111,to_date('15-May-2004','DD-Mon-YYYY'),to_date('20-Jun-2004','DD-Mon-YYYY'));
insert into item values (111,to_date('22-May-2004','DD-Mon-YYYY'),to_date('07-Jun-2004','DD-Mon-YYYY'));
insert into item values (111,to_date('20-Jun-2004','DD-Mon-YYYY'),to_date('13-Aug-2004','DD-Mon-YYYY'));
insert into item values (111,to_date('27-May-2004','DD-Mon-YYYY'),to_date('30-Aug-2004','DD-Mon-YYYY'));
insert into item values (111,to_date('02-Sep-2004','DD-Mon-YYYY'),to_date('23-Dec-2004','DD-Mon-YYYY'));
insert into item values (222,to_date('21-May-2004','DD-Mon-YYYY'),to_date('19-Aug-2004','DD-Mon-YYYY'));
commit;
The code for this type of problem is rather tricky. Here is one approach that works pretty well:
with item (item_code, start_date, end_date) as (
select 111,to_date('15-05-2004','DD-MM-YYYY'),to_date('20-06-2004','DD-MM-YYYY') from dual union all
select 111,to_date('22-05-2004','DD-MM-YYYY'),to_date('07-06-2004','DD-MM-YYYY') from dual union all
select 111,to_date('20-06-2004','DD-MM-YYYY'),to_date('13-08-2004','DD-MM-YYYY') from dual union all
select 111,to_date('27-05-2004','DD-MM-YYYY'),to_date('30-08-2004','DD-MM-YYYY') from dual union all
select 111,to_date('02-09-2004','DD-MM-YYYY'),to_date('23-12-2004','DD-MM-YYYY') from dual union all
select 222,to_date('21-05-2004','DD-MM-YYYY'),to_date('19-08-2004','DD-MM-YYYY') from dual
),
id as (
select item_code, start_date as dte, count(*) as inc
from item
group by item_code, start_date
union all
select item_code, end_date, - count(*) as inc
from item
group by item_code, end_date
),
id2 as (
select id.*, sum(inc) over (partition by item_code order by dte) as running_inc
from id
),
id3 as (
select id2.*, sum(case when running_inc = 0 then 1 else 0 end) over (partition by item_code order by dte desc) as grp
from id2
)
select item_code, min(dte) as start_date, max(dte) as end_date
from id3
group by item_code, grp;
And a rextester to validate it.
What is this doing? Good question. The idea in these problems is to define the adjacent groups. This method does so by counting the number of "starts" and "ends" up to a given date. When the value is 0, a group ends.
The specific steps are as follows:
(1) Break out all the dates onto separate rows along with an indicator of whether the date is a start date or end date. This indicator is key to defining the ranges -- +1 to "enter" and "-1" to exit.
(2) Calculate the running total of the indicators. The 0s in this total are the ends of overlapping ranges.
(3) Do a reverse cumulative sum of the 0s to identify the groups.
(4) Aggregate to get the final results.
You can look at each of the CTEs to see what is happening in the data.
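The same start/end counting can be sketched outside the database; a hypothetical Python version of steps (1)-(3), sweeping +1/-1 events and closing a group whenever the running counter returns to zero:

```python
from datetime import date
from collections import defaultdict

rows = [
    (111, date(2004, 5, 15), date(2004, 6, 20)),
    (111, date(2004, 5, 22), date(2004, 6, 7)),
    (111, date(2004, 6, 20), date(2004, 8, 13)),
    (111, date(2004, 5, 27), date(2004, 8, 30)),
    (111, date(2004, 9, 2),  date(2004, 12, 23)),
    (222, date(2004, 5, 21), date(2004, 8, 19)),
]

def merge_ranges(rows):
    merged = []
    for code in sorted({c for c, _, _ in rows}):
        # Net +1 for each start and -1 for each end on a given date;
        # a date that is both an end and a start (like 20-Jun) nets to 0,
        # so it never breaks the group.
        delta = defaultdict(int)
        for c, s, e in rows:
            if c == code:
                delta[s] += 1
                delta[e] -= 1
        running, start = 0, None
        for d in sorted(delta):
            if running == 0:
                start = d                        # counter was zero: new group begins
            running += delta[d]
            if running == 0:
                merged.append((code, start, d))  # counter back to zero: group ends
    return merged

print(merge_ranges(rows))
```

For item 111 this yields (15-May, 30-Aug) and (02-Sep, 23-Dec), matching the required output.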
It's a variation of a gaps&islands problem. First calculate the maximum previous end date for each row. Then filter the rows where the current row's start date is greater than that max date, this is the start of a new group and the group's end date is found in the next row.
WITH max_dates AS
(
SELECT
item_code
,start_date
,Max(end_date) -- get the maximum prevous end_date
Over (PARTITION BY item_code
ORDER BY start_date
ROWS BETWEEN Unbounded Preceding AND 1 Preceding) AS max_prev_date
,Max(end_date) -- get the maximum overall date (only needed for the last group)
Over (PARTITION BY item_code) AS max_date
FROM item
)
SELECT
item_code
,start_date
,Coalesce(Lead(max_prev_date) -- next row got the end date for the current row
Over (PARTITION BY item_code
ORDER BY start_date)
,max_date ) AS end_date -- no next row for the last row --> overall maximum end_date
FROM max_dates
WHERE max_prev_date < start_date -- maximum previous end date is less than current start date --> start of a new group
OR max_prev_date IS NULL -- first row
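This query also runs on SQLite as-is (ISO date strings compare correctly there); a small sqlite3 sketch to verify it against the sample data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item(item_code INT, start_date TEXT, end_date TEXT);
INSERT INTO item VALUES
  (111, '2004-05-15', '2004-06-20'),
  (111, '2004-05-22', '2004-06-07'),
  (111, '2004-06-20', '2004-08-13'),
  (111, '2004-05-27', '2004-08-30'),
  (111, '2004-09-02', '2004-12-23'),
  (222, '2004-05-21', '2004-08-19');
""")

rows = conn.execute("""
WITH max_dates AS (
  SELECT item_code, start_date,
         MAX(end_date) OVER (PARTITION BY item_code ORDER BY start_date
                             ROWS BETWEEN UNBOUNDED PRECEDING
                                      AND 1 PRECEDING) AS max_prev_date,
         MAX(end_date) OVER (PARTITION BY item_code) AS max_date
  FROM item
)
SELECT item_code, start_date,
       COALESCE(LEAD(max_prev_date) OVER (PARTITION BY item_code
                                          ORDER BY start_date),
                max_date) AS end_date
FROM max_dates
WHERE max_prev_date < start_date   -- start of a new group
   OR max_prev_date IS NULL        -- first row per item
ORDER BY item_code, start_date
""").fetchall()
print(rows)
```

Because window functions in the outer SELECT are evaluated after the WHERE clause, LEAD here reads the next surviving group-start row, which carries the current group's end date.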
In SQL Server you can try this. It will give your desired output, but from a performance point of view the query might slow down when there is a large amount of data to be checked.
DECLARE @item TABLE(item_code int, start_date date, end_date date);
insert into @item values (111,'15-May-2004','20-Jun-2004');
insert into @item values (111,'22-May-2004','07-Jun-2004');
insert into @item values (111,'20-Jun-2004','13-Aug-2004');
insert into @item values (111,'27-May-2004','30-Aug-2004');
insert into @item values (111,'02-Sep-2004','23-Dec-2004');
insert into @item values (222,'21-May-2004','19-Aug-2004');
SELECT * FROM @item WHERE item_code IN (SELECT item_code FROM @item GROUP BY item_code) AND
(start_date IN (SELECT max(start_date) FROM @item GROUP BY item_code) or start_date IN (SELECT min(start_date) FROM @item GROUP BY item_code))
With the help of the above answers I was able to simplify this as below:
WITH max_dates AS
(
SELECT
item_code
,start_date
,end_date
,Max(end_date)
Over (PARTITION BY item_code
ORDER BY start_date
) AS max_date
FROM item
) ,
max_dates1 as
(
select max_dates.* , lag(max_date) over(partition by item_code order by 1) as MPD from max_dates
)
select ITEM_CODE,start_date,end_date from max_dates1
WHERE MPD < start_date
OR MPD IS NULL

Oracle SQL - Putting together potentially contradictory or overlapping date ranges

I have a table like this:
Id Begin_Date End_date
1 01-JAN-12 05-JAN-12
1 01-FEB-12 01-MAR-12
1 15-FEB-12 05-MAR-12
For a given Id, it gives a set of date ranges. Let's say that if a date is between the begin and end date for that Id, then that Id is "on". Otherwise, "off"
The problem here is the last two rows -- the date ranges overlap and contradict each other. The second row claims that the 1 was "on" between 01-FEB-12 and 01-MAR-12, but the third row claims that 1 was off before 15-FEB-12. Similarly, the second row claims that 1 was off on 02-MAR-12, but row 3 claims it was on.
The reconciliation logic I'd like to apply is that, in cases of contradictions, pick the earliest possible begin date and the earliest possible end date after it. The result would therefore be:
Id Begin_Date End_date
1 01-JAN-12 05-JAN-12
1 01-FEB-12 01-MAR-12
I was able to pull this off with the lag analytical function, but I ran into difficulty with other use cases. Take this input data set.
Id Begin_Date End_date
1 01-JAN-12 10-JAN-12
1 5-JAN-12 8-JAN-12
1 12-JAN-12 15-JAN-12
1 1-JAN-12 14-JAN-12
What I expect here as output is:
Id Begin_Date End_date
1 01-JAN-12 8-JAN-12
1 12-JAN-12 14-JAN-12
...because the first row is the earliest begin date, and its end date is the earliest end date after that. The next row is the earliest begin date after the previous end date, and the end date of that row is the earliest end date after that. There are no begin dates after 14-JAN-12, so I'm done.
I'm having very little luck solving this problem. One approach I tried was getting the rank partitioned by id and compare it to the max rank. I then used the lag function to compare to previous ranks. However, this strategy totally fails for use cases above.
Any suggestions?
Well, the critical requirement rests on this:
The reconciliation logic I'd like to apply is that, in cases of
contradictions, pick the earliest possible begin date and the earliest
possible end date after it.
sqlfiddle here
CREATE TABLE table1
(
id INT,
DateStart DATE,
DateEnd DATE
);
INSERT INTO table1
VALUES
(1, TO_DATE('20110101','YYYYMMdd'), TO_DATE('20110110','YYYYMMdd'));
INSERT INTO table1
VALUES
(2, TO_DATE('20110105','YYYYMMdd'), TO_DATE('20110108','YYYYMMdd'));
INSERT INTO table1
VALUES
(3, TO_DATE('20110112','YYYYMMdd'), TO_DATE('20110115','YYYYMMdd'));
INSERT INTO table1
VALUES
(4, TO_DATE('20110101','YYYYMMdd'), TO_DATE('20110114','YYYYMMdd'));
INSERT INTO table1
VALUES
(5, TO_DATE('20110206','YYYYMMdd'), TO_DATE('20110208','YYYYMMdd'));
INSERT INTO table1
VALUES
(6, TO_DATE('20110201','YYYYMMdd'), TO_DATE('20110207','YYYYMMdd'));
The select statement:
SELECT ID, DATESTART, DATEEND
FROM
(
SELECT ID, TYPE, DATES AS DATESTART,
LEAD(DATES) OVER (ORDER BY DATES) AS DATEEND
FROM
(
SELECT ID, TYPE,DATES,
LAG(ID) OVER (ORDER BY DATES) AS LASTID,
LAG(TYPE) OVER (ORDER BY DATES) AS LASTTYPE,
LAG(DATES) OVER (ORDER BY DATES) AS LASTDATES
FROM
(
SELECT ID,'START' AS TYPE,DATESTART AS DATES
FROM table1
UNION ALL
SELECT ID,'END',DATEEND
FROM table1
)
) H
WHERE TYPE != LASTTYPE OR LASTTYPE IS NULL
)
WHERE TYPE = 'START'
ORDER BY DATESTART
Here's a step by step for each subquery:
explode each row's date start and date end into one column
copy the last row using LAG and put it in current row
filter out the rows which are in the middle (e.g. given 1,2,3,4 remove 2,3)
get the end date in the next row because these are either first or last rows
extract only the useful rows, those which have TYPE = START
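Those five steps can be traced in plain Python against the table1 data, as a hypothetical sanity check of what the query computes:

```python
from datetime import date

# table1 rows as (id, date_start, date_end), from the DDL above.
table1 = [
    (1, date(2011, 1, 1),  date(2011, 1, 10)),
    (2, date(2011, 1, 5),  date(2011, 1, 8)),
    (3, date(2011, 1, 12), date(2011, 1, 15)),
    (4, date(2011, 1, 1),  date(2011, 1, 14)),
    (5, date(2011, 2, 6),  date(2011, 2, 8)),
    (6, date(2011, 2, 1),  date(2011, 2, 7)),
]

# (1) explode every start and end date into one sorted event stream
events = sorted([(s, 'START') for _, s, _ in table1] +
                [(e, 'END') for _, _, e in table1],
                key=lambda t: t[0])

# (2)+(3) keep only events whose type differs from the previous event's type
kept, last_type = [], None
for d, typ in events:
    if typ != last_type:
        kept.append((d, typ))
    last_type = typ

# (4)+(5) pair each surviving START with the event that follows it
ranges = [(d, kept[i + 1][0]) for i, (d, typ) in enumerate(kept) if typ == 'START']
print(ranges)
```

For this data the surviving events are 01-JAN START, 08-JAN END, 12-JAN START, 14-JAN END, 01-FEB START, 07-FEB END, which pair up into three reconciled ranges.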
For the second data set:
Id Begin_Date End_date
1 01-JAN-12 10-JAN-12
1 5-JAN-12 8-JAN-12
1 12-JAN-12 15-JAN-12
1 1-JAN-12 14-JAN-12
After your reconciliation logic, the result would be:
Id Begin_Date End_date
1 01-JAN-12 8-JAN-12 (covers rows 1, 2 and 4 -> minimum begin_date is 1-JAN, minimum end_date after it is 8-JAN)
1 12-JAN-12 14-JAN-12 (starts at row 3's begin date, the earliest begin date after 8-JAN; the earliest end date after that is 14-JAN, from row 4)