Add remaining value to next rows in SQL Server

I have a table, shown below, that contains each customer's electricity volume per period. The available data looks like this:
OwnerID StartDate EndDate Volume
1 2019-01-01 2019-01-15 10.40
1 2019-01-16 2019-01-31 5.80
1 2019-02-01 2019-02-10 7.90
1 2019-02-11 2019-02-28 8.50
2 2019-03-01 2019-03-04 10.50
Another table holds each customer's existing remaining volume. Both tables are connected by the column OwnerID.
OwnerID ExistingVolume
1 0.90
2 0.60
Now I want to apply the ExistingVolume to the current Volume (first table) as follows:
Calculate the new volume as a whole number and carry the remaining decimal value over to the customer's next period.
So the expected result set should look like this:
OwnerId StartDate EndDate CalulatedVolume RemainingExistingVolume
1 2019-01-01 2019-01-15 11 0.30
1 2019-01-16 2019-01-31 6 0.10
1 2019-02-01 2019-02-10 8 0.00
1 2019-02-11 2019-02-28 8 0.50
2 2019-03-01 2019-03-04 11 0.10
Don't round off the CalulatedVolume. Just take the whole-number part of table1.Volume + table2.ExistingVolume.
The remaining decimal value (from the 1st row) should then be added to table1.Volume in the next row.
Could someone suggest how to achieve this in a SQL query?

If I understand correctly, you want to accumulate the "error" from rounding and apply it against the value in the second table.
You can use a cumulative sum for this purpose -- along with some arithmetic:
select t1.ownerid, t1.startdate, t1.enddate,
round(t1.volume, 0) as calculatedvolume,
( sum( t1.volume - round(t1.volume, 0) ) over (partition by t1.ownerid order by t1.startdate) +
t2.existingvolume
) as remainingexisting
from table1 t1 left join
table2 t2
on t1.ownerid = t2.ownerid;
You have a non-standard definition of rounding. This can be implemented as ceiling(x - 0.5). With this definition, the code is:
select t1.ownerid, t1.startdate, t1.enddate,
ceiling(t1.volume - 0.5) as calculatedvolume,
( sum( t1.volume - ceiling(t1.volume - 0.5) ) over (partition by t1.ownerid order by t1.startdate) +
t2.existingvolume
) as remainingexisting
from table1 t1 left join
table2 t2
on t1.ownerid = t2.ownerid;
Here is a db<>fiddle.
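The carry-forward rule from the question can be replayed outside the database to check any query against the expected result set. The following is a minimal Python sketch, not a SQL solution; the function name `apply_carry` and the tuple layout are assumptions made for illustration. It takes the whole-number part of Volume plus the carried remainder and passes the decimal part on to the owner's next period.

```python
from decimal import Decimal

# Sample rows from the question: (OwnerID, StartDate, Volume), ordered by StartDate.
volumes = [
    (1, "2019-01-01", Decimal("10.40")),
    (1, "2019-01-16", Decimal("5.80")),
    (1, "2019-02-01", Decimal("7.90")),
    (1, "2019-02-11", Decimal("8.50")),
    (2, "2019-03-01", Decimal("10.50")),
]
# Second table: OwnerID -> ExistingVolume.
existing = {1: Decimal("0.90"), 2: Decimal("0.60")}


def apply_carry(volumes, existing):
    """Return (OwnerID, StartDate, whole volume, remaining decimal) per row."""
    carry = dict(existing)  # running remainder per owner
    out = []
    for owner, start, volume in volumes:
        total = volume + carry.get(owner, Decimal("0"))
        whole = int(total)            # truncation, no rounding
        carry[owner] = total - whole  # decimal part moves to the next period
        out.append((owner, start, whole, carry[owner]))
    return out


result = apply_carry(volumes, existing)
for row in result:
    print(row)
```

On the sample data this reproduces the expected rows from the question, for example 11 and 0.30 for the first period of owner 1.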

Related

SQL: Left join on calendar table (Spark SQL)

I am trying to join data to a calendar table cross joined with user id, to get the other columns corresponding to each date and user. I have tried joining both with and without the date condition, and I created a cross-joined master table to left join the other data onto. However, it seems like I am missing something.
DATE_TBL looks like:
CAL_DT BUYER_ID
2019-03-31 1
2019-03-31 2
2019-03-31 3
2019-03-30 1
2019-03-30 2
2019-03-30 3
2019-03-29 1
2019-03-29 2
2019-03-29 3 ......
DATA2 looks like:
CREATED_DT BUYER_ID ITEM_PRICE
2019-03-31 1 10
2019-03-30 2 12
2019-03-29 3 45
2019-03-29 2 13 ........
Here is my code:
WITH DATE_TBL AS
(
SELECT CAL.CAL_DT, CK.BUYER_ID
FROM DATA1 CAL
CROSS JOIN DATA2 CK
WHERE cal.CAL_DT BETWEEN '2018-01-01' AND '2019-03-31'
AND CK.BYR_CNTRY_ID IN (1,2,3) AND CK.CREATED_DT BETWEEN '2019-03-01' AND '2019-03-31'
GROUP BY 1,2
)
,
REVENUE_CALC AS
(
SELECT CAL.CAL_DT
,CK.BYR_CNTRY_ID
,CK.BUYER_ID
,CK.CREATED_DT AS CREATED_DT
,SUM(CK.ITEM_PRICE) AS ITEM_PRICE
,SUM(CK.QUANTITY) AS QUANTITY
,MAX(COALESCE(I.CURNCY_PLAN_RATE, 1)) AS CURNCY_PLAN_RATE
,SUM(CK.ITEM_PRICE *CK.QUANTITY *I.CURNCY_PLAN_RATE) AS REVENUE
FROM DATE_TBL CAL
LEFT JOIN DATA2 CK
ON CAL.BUYER_ID = CK.BUYER_ID AND CAL.CAL_DT = CK.CREATED_DT
LEFT JOIN DATA3 I
ON I.CURNCY_ID = CK.LSTG_CURNCY_ID
GROUP BY 1,2,3,4
ORDER BY CAL.CAL_DT DESC, CK.BUYER_ID
)
SELECT *
FROM REVENUE_CALC
The desired result must look like:
CAL_DT BUYER_ID ITEM_PRICE
2019-03-31 1 10
2019-03-31 2 null
2019-03-31 3 null
2019-03-30 1 null
2019-03-30 2 12
2019-03-30 3 null
2019-03-29 1 null
2019-03-29 2 13
2019-03-29 3 45 ......
What I get is only the data for common dates. Could someone help me understand what I am doing wrong?
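To make the target shape concrete, here is a minimal Python sketch of the intended result: every (date, buyer) cell of the grid is kept, and the price is filled in only where a matching row exists. It does not diagnose the Spark query above; the variable names and the three-date, three-buyer grid are taken from the sample tables and are otherwise assumptions.

```python
from itertools import product

# Calendar dates and buyers from the DATE_TBL sample.
cal_dates = ["2019-03-31", "2019-03-30", "2019-03-29"]
buyer_ids = [1, 2, 3]

# DATA2 sample: (CREATED_DT, BUYER_ID, ITEM_PRICE).
data2 = [
    ("2019-03-31", 1, 10),
    ("2019-03-30", 2, 12),
    ("2019-03-29", 3, 45),
    ("2019-03-29", 2, 13),
]

# Aggregate the price per (date, buyer), like SUM(ITEM_PRICE) ... GROUP BY.
price_by_key = {}
for created_dt, buyer_id, item_price in data2:
    key = (created_dt, buyer_id)
    price_by_key[key] = price_by_key.get(key, 0) + item_price

# Left join: the grid drives the result, unmatched cells get None (null).
rows = [(d, b, price_by_key.get((d, b))) for d, b in product(cal_dates, buyer_ids)]

for row in rows:
    print(row)
```

The point of the sketch is that the grid side decides how many rows come out; anything that filters or groups on columns from the joined side can shrink the result back to the matching dates.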

T-SQL Override special rates and generate final date range

I have a transaction table which has a date range and a basic rate for that range. I have another table for special rates, which has the date range of each special rate and its rate. I would like to split my original transaction into multiple records if a special rate falls within the transaction date range.
Just for simplicity I have created two tables with limited columns:
CREATE TABLE #ClientTrx (ClientId int, StartDate Date, EndDate Date, Rate decimal(10,2))
CREATE TABLE #SpecialRate (ClientId int, StartDate Date, EndDate Date, Rate decimal(10,2))
insert into #ClientTrx select 1, '1/1/2020', '1/15/2020', 10
insert into #ClientTrx select 1, '1/16/2020', '1/31/2020', 10
insert into #ClientTrx select 2, '1/1/2020', '1/15/2020', 20
insert into #ClientTrx select 2, '1/16/2020', '1/31/2020', 20
insert into #ClientTrx select 2, '2/1/2020', '2/13/2020', 20
insert into #SpecialRate select 1, '12/25/2019', '1/3/2020', 13
insert into #SpecialRate select 1, '1/4/2020', '1/6/2020', 15
insert into #SpecialRate select 1, '1/11/2020', '1/18/2020', 12
insert into #SpecialRate select 2, '1/25/2020', '1/31/2020', 23
insert into #SpecialRate select 2, '2/4/2020', '2/8/2020', 25
insert into #SpecialRate select 2, '2/11/2020', '2/29/2020', 22
I need help writing a query which produces the following results:
ClientId StartDate EndDate Rate
1 2020-01-01 2020-01-03 13.00 special rate
1 2020-01-04 2020-01-06 15.00 special rate
1 2020-01-07 2020-01-10 10.00 regular rate
1 2020-01-11 2020-01-15 12.00 special rate
1 2020-01-16 2020-01-18 12.00 special rate splitting pay period
1 2020-01-19 2020-01-31 10.00 regular rate
2 2020-01-01 2020-01-15 20.00 regular rate
2 2020-01-16 2020-01-24 20.00 regular rate
2 2020-01-25 2020-01-31 23.00 special rate
2 2020-02-01 2020-02-03 20.00 regular rate
2 2020-02-04 2020-02-08 25.00 special rate
2 2020-02-09 2020-02-10 20.00 regular rate
2 2020-02-11 2020-02-13 22.00 special rate
I think it's possible using a CTE, but I can't figure it out. Can anyone please help?
Note: I have made some changes to my input and expected output. I think I need one more group level, can you please help?
This is an approach which uses an ad-hoc tally table to expand the datasets and then applies Gaps-and-Islands for the final summary.
Example
;with cte as (
Select A.ClientId
,D
,Rate = coalesce(NewRate,A.Rate)
,Grp = datediff(day,'1900-01-01',D) - row_number() over (partition by ClientID,coalesce(NewRate,A.Rate) Order by D)
From #ClientTrx A
Cross Apply (
Select Top (DateDiff(DAY,StartDate,EndDate)+1) D=DateAdd(DAY,-1+Row_Number() Over (Order By (Select Null)),StartDate)
From master..spt_values n1,master..spt_values n2
) B
Outer Apply (
Select NewRate=Rate
From #SpecialRate
Where D between StartDate and EndDate
and ClientId=A.ClientID
) C
)
Select ClientID
,StartDate= min(D)
,EndDate = max(D)
,Rate = Rate
From cte
Group By ClientID,Grp,Rate
Order by ClientID,min(D)
Returns
ClientID StartDate EndDate Rate
1 2020-01-01 2020-01-03 13.00
1 2020-01-04 2020-01-06 15.00
1 2020-01-07 2020-01-10 10.00
1 2020-01-11 2020-01-18 12.00
1 2020-01-19 2020-01-31 10.00
2 2020-01-01 2020-01-24 20.00
2 2020-01-25 2020-01-31 23.00
2 2020-02-01 2020-02-03 20.00
2 2020-02-04 2020-02-08 25.00
2 2020-02-09 2020-02-10 20.00
2 2020-02-11 2020-02-13 22.00
Notes:
Cross Apply B generates a record for each date between StartDate and EndDate in #ClientTrx.
Outer Apply C attempts to find the exception rate (NewRate).
The CTE generates one record per date and picks either the default or the exception rate.
Grp stays constant across consecutive dates that share a rate and changes as soon as there is a gap. This is a simple technique to "feed" the Gaps-and-Islands.
Then it becomes a small matter to group the results from cte by ClientID and Grp.
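The same expand-then-group idea can be sketched in Python to see why the Grp value works. This is only an illustration under assumptions: the function names `daily_rates` and `islands` are hypothetical, and the date's ordinal stands in for datediff(day,'1900-01-01',D). Within one (client, rate) partition, the day number minus the row number stays constant while dates are consecutive.

```python
from datetime import date, timedelta

# Sample data from the question: (ClientId, StartDate, EndDate, Rate).
client_trx = [
    (1, date(2020, 1, 1), date(2020, 1, 15), 10),
    (1, date(2020, 1, 16), date(2020, 1, 31), 10),
    (2, date(2020, 1, 1), date(2020, 1, 15), 20),
    (2, date(2020, 1, 16), date(2020, 1, 31), 20),
    (2, date(2020, 2, 1), date(2020, 2, 13), 20),
]
special_rate = [
    (1, date(2019, 12, 25), date(2020, 1, 3), 13),
    (1, date(2020, 1, 4), date(2020, 1, 6), 15),
    (1, date(2020, 1, 11), date(2020, 1, 18), 12),
    (2, date(2020, 1, 25), date(2020, 1, 31), 23),
    (2, date(2020, 2, 4), date(2020, 2, 8), 25),
    (2, date(2020, 2, 11), date(2020, 2, 29), 22),
]


def daily_rates(client_trx, special_rate):
    """One (client, day, rate) row per day, preferring a special rate."""
    rows = []
    for client, start, end, rate in client_trx:
        for offset in range((end - start).days + 1):
            day = start + timedelta(days=offset)
            override = next(
                (r for c, s, e, r in special_rate if c == client and s <= day <= e),
                None,
            )
            rows.append((client, day, rate if override is None else override))
    return rows


def islands(rows):
    """Group consecutive days per (client, rate) using day number - row number."""
    counters = {}  # row_number() per (client, rate)
    groups = {}    # (client, grp, rate) -> (first day, last day)
    for client, day, rate in sorted(rows):
        n = counters[(client, rate)] = counters.get((client, rate), 0) + 1
        grp = day.toordinal() - n
        lo, hi = groups.get((client, grp, rate), (day, day))
        groups[(client, grp, rate)] = (min(lo, day), max(hi, day))
    return sorted((c, lo, hi, r) for (c, _, r), (lo, hi) in groups.items())


result = islands(daily_rates(client_trx, special_rate))
for row in result:
    print(row)
```

Like the query, this merges adjacent pay periods that end up with the same rate, so it does not split 2020-01-11 to 2020-01-18 at the pay-period boundary.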

How to duplicate rows in SQL based on the difference between date columns and divide an aggregated column per duplicated row?

I have a table with some records about fuel consumption. The important columns in the table are CONSUME_DATE_FROM and CONSUM_DATE_TO.
I want to calculate average fuel consumption per car on a monthly basis, but some rows do not fall within a single month. For example, some span three months and the total gas in litres is aggregated in a single row.
So I need to find the records where CONSUME_DATE_FROM and CONSUM_DATE_TO are in different months, duplicate them (in the current table or a second table) once per month covered, and divide the total gas in litres between the resulting rows.
I've this table with the following data:
ID VehicleId CONSUME_DATE_FROM CONSUM_DATE_TO GAS_PER_LITER
1 100 2018-10-25 2018-12-01 600
2 101 2018-07-19 2018-07-24 100
3 102 2018-12-31 2019-01-01 400
4 103 2018-03-29 2018-05-29 200
5 104 2018-02-05 2018-02-09 50
The expected output table should be as below
ID VehicleId CONSUME_DATE_FROM CONSUM_DATE_TO GAS_PER_LITER
1 100 2018-10-25 2018-12-01 200
1 100 2018-10-25 2018-12-01 200
1 100 2018-10-25 2018-12-01 200
2 101 2018-07-19 2018-07-24 100
3 102 2018-12-31 2019-01-01 200
3 102 2018-12-31 2019-01-01 200
4 103 2018-03-29 2018-05-29 66.66
4 103 2018-03-29 2018-05-29 66.66
4 103 2018-03-29 2018-05-29 66.66
5 104 2018-02-05 2018-02-09 50
Or as below
ID VehicleId CONSUME_DATE_FROM CONSUM_DATE_TO GAS_PER_LITER DATE_RELOAD_GAS
1 100 2018-10-25 2018-12-01 200 2018-10-01
1 100 2018-10-25 2018-12-01 200 2018-11-01
1 100 2018-10-25 2018-12-01 200 2018-12-01
2 101 2018-07-19 2018-07-24 100 2018-07-01
3 102 2018-12-31 2019-01-01 200 2018-12-01
3 102 2018-12-31 2019-01-01 200 2019-01-01
4 103 2018-03-29 2018-05-29 66.66 2018-03-01
4 103 2018-03-29 2018-05-29 66.66 2018-04-01
4 103 2018-03-29 2018-05-29 66.66 2018-05-01
5 104 2018-02-05 2018-02-09 50 2018-02-01
Can someone please help me out with this query?
I'm using oracle database
Your business rule treats the difference between CONSUME_DATE_FROM and CONSUM_DATE_TO as whole calendar months. So you expect the difference between 2018-10-25 and 2018-12-01 to be three months, whereas the difference in days actually equates to about 1.2 months. So we can't use simple date arithmetic to get your desired output; we need to do some additional massaging of the dates.
The query below implements your desired logic by deriving the first day of the month for CONSUME_DATE_FROM and the last day of the month for CONSUM_DATE_TO, then using ceil() to round the difference up to the nearest whole number of months.
This is calculated in a subquery, which is used in the main query with the old connect by level trick to multiply a record by level number of times:
with cte as (
select f.*
, ceil(months_between(last_day(CONSUM_DATE_TO)
, trunc(CONSUME_DATE_FROM,'mm'))) as diff
from fuel_consumption f
)
select cte.id
, cte.VehicleId
, cte.CONSUME_DATE_FROM
, cte.CONSUM_DATE_TO
, cte.GAS_PER_LITER/cte.diff as GAS_PER_LITER
, add_months(trunc(cte.CONSUME_DATE_FROM, 'mm'), level-1) as DATE_RELOAD_GAS
from cte
connect by level <= cte.diff
and prior cte.id = cte.id
and prior sys_guid() is not null
;
"what about if add a additional column "DATE_RELOAD_GAS" that display difference date for similar rows"
From your posted sample it seems like DATE_RELOAD_GAS is the first day of the month for each month bounded by CONSUME_DATE_FROM and CONSUM_DATE_TO. I have amended my solution to implement this rule.
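The month-counting rule can be checked against the sample rows with a short Python sketch. It is not the Oracle query; the function name `split_by_month` is an assumption, and instead of ceil(months_between(...)) it counts the calendar months touched directly, which gives the same counts on the sample data.

```python
from datetime import date


def split_by_month(consume_from, consume_to, gas):
    """Return one (first day of month, share of gas) row per calendar month touched."""
    months = (
        (consume_to.year - consume_from.year) * 12
        + (consume_to.month - consume_from.month)
        + 1
    )
    share = gas / months
    rows = []
    year, month = consume_from.year, consume_from.month
    for _ in range(months):
        rows.append((date(year, month, 1), share))  # DATE_RELOAD_GAS, GAS_PER_LITER
        month += 1
        if month > 12:  # roll over into the next year
            year, month = year + 1, 1
    return rows


# Rows with ID 1 and ID 3 from the sample table.
print(split_by_month(date(2018, 10, 25), date(2018, 12, 1), 600))
print(split_by_month(date(2018, 12, 31), date(2019, 1, 1), 400))
```

Note that the share is not truncated here, so ID 4 would show 66.67 after rounding rather than the 66.66 in the expected output.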
Using a connect by level structure and treating to_char(c.CONSUME_DATE_FROM + level - 1,'yyyymm') as the month, I was able to solve it as below:
select ID, VehicleId, myMonth, CONSUME_DATE_FROM, CONSUM_DATE_TO,
trunc(GAS_PER_LITER/max(rn) over (partition by ID order by ID),2) as GAS_PER_LITER,
'01.'||substr(myMonth,5,2)||'.'||substr(myMonth,1,4) as DATE_RELOAD_GAS
from
(
with consumption( ID, VehicleId, CONSUME_DATE_FROM, CONSUM_DATE_TO, GAS_PER_LITER ) as
(
select 1,100,date'2018-10-25',date'2018-12-01',600 from dual union all
select 2,101,date'2018-07-19',date'2018-07-24',100 from dual union all
select 3,102,date'2018-12-31',date'2019-01-01',400 from dual union all
select 4,103,date'2018-03-29',date'2018-05-29',200 from dual union all
select 5,104,date'2018-02-05',date'2018-02-09', 50 from dual
)
select ID, to_char(c.CONSUME_DATE_FROM + level - 1,'yyyymm') myMonth,
VehicleId, c.CONSUME_DATE_FROM, c.CONSUM_DATE_TO, GAS_PER_LITER,
row_number() over (partition by ID order by ID) as rn
from dual join consumption c
on c.ID >= 2
group by ID, to_char(c.CONSUME_DATE_FROM + level - 1,'yyyymm'), VehicleId,
c.CONSUME_DATE_FROM, c.CONSUM_DATE_TO, c.GAS_PER_LITER
connect by level <= c.CONSUM_DATE_TO - c.CONSUME_DATE_FROM + 1
union all
select ID, to_char(c.CONSUME_DATE_FROM + level - 1,'yyyymm') myMonth,
VehicleId, c.CONSUME_DATE_FROM, c.CONSUM_DATE_TO, GAS_PER_LITER,
row_number() over (partition by ID order by ID) as rn
from dual join consumption c
on c.ID = 1
group by ID, to_char(c.CONSUME_DATE_FROM + level - 1,'yyyymm'), VehicleId,
c.CONSUME_DATE_FROM, c.CONSUM_DATE_TO, c.GAS_PER_LITER
connect by level <= c.CONSUM_DATE_TO - c.CONSUME_DATE_FROM + 1
) q
group by ID, VehicleId, myMonth, CONSUME_DATE_FROM, CONSUM_DATE_TO, GAS_PER_LITER, rn
order by ID, myMonth;
I hit an interesting issue: if I use c.ID >= 1 as the join condition in the subquery, the query hangs for a very long time, so I split it into two parts with union all,
using c.ID >= 2 and c.ID = 1.
Rextester Demo

What is the best method to identify which pairs of rows have identical Products, Customers and Measures, and overlapping date ranges?

Image of the sample question, where I have to identify duplicate rows and then make the date ranges not overlap.
Rows 1 and 2 overlap, like this:
20130101 |--------------------| 20130401
20130301 |----------------------| 20131231
You can do this in T-SQL on MS SQL Server:
select t1a.id , t1.id second_id,
t1.valid_from_day , t1.valid_to_day ,
t1a.valid_from_day second_valid_from_day ,
t1a.valid_to_day second_valid_to_day
from t1 t1a
cross apply
(
select * from t1
where t1.product = t1a.product
and t1.customer = t1a.customer
and t1.measure = t1a.measure
and t1.id <> t1a.id
and t1.valid_from_day >= t1a.valid_from_day -- overlap
and t1.valid_to_day >= t1a.valid_to_day
) t1
The result of the query is:
id second_id valid_from_day valid_to_day second_valid_from_day second_valid_to_day
1 2 2013-03-01 2013-12-31 2013-01-01 2013-04-01
4 5 2013-03-01 2014-04-01 2013-01-01 2013-04-01
9 10 2014-04-01 2015-01-01 2013-03-01 2013-12-31
So the identical pairs are:
pair 1,2
pair 4,5
pair 9,10
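For reference, the usual test for two closed date ranges overlapping is that each one starts no later than the other ends. Below is a minimal Python sketch of that test applied pairwise. The sample table was only posted as an image, so the product, customer and measure values and the third row are hypothetical; only the two date ranges for rows 1 and 2 come from the diagram in the question.

```python
from datetime import date
from itertools import combinations


def overlaps(a_from, a_to, b_from, b_to):
    """Closed ranges overlap when each starts on or before the other's end."""
    return a_from <= b_to and b_from <= a_to


def overlapping_pairs(rows):
    """rows: (id, product, customer, measure, valid_from, valid_to)."""
    pairs = []
    for a, b in combinations(sorted(rows), 2):  # each unordered pair once
        same_keys = a[1:4] == b[1:4]
        if same_keys and overlaps(a[4], a[5], b[4], b[5]):
            pairs.append((a[0], b[0]))
    return pairs


# Hypothetical rows; only the dates of ids 1 and 2 are from the question.
rows = [
    (1, "Widgets", "Tesco", "Gross Sales Price", date(2013, 1, 1), date(2013, 4, 1)),
    (2, "Widgets", "Tesco", "Gross Sales Price", date(2013, 3, 1), date(2013, 12, 31)),
    (3, "Widgets", "Tesco", "Gross Sales Price", date(2014, 4, 1), date(2015, 1, 1)),
]

pairs = overlapping_pairs(rows)
print(pairs)
```

The comparison against the other row's end date matters: checking only that one range starts and ends later than the other would also match a range that begins after the first one has already finished.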

SQL query to calculate distance traveled in a day based on timestamp and lat/long

I have a table with a date and latitude/longitude values in each row. I want a SQL query to calculate the distance traveled in a day.
Say for date 2013-03-01 I want the total distance traveled:
ID DATE LAT LONG V_ID
---------------------------------------------------
123 2013-03-01 06:05:24 45.544 86.544 1
124 2013-03-01 06:15:17 45.676 86.676 1
125 2013-03-01 06:25:24 46.544 86.544 2
126 2013-03-01 06:38:14 46.651 86.651 2
127 2013-03-02 07:12:04 46.876 86.876 1
128 2013-03-02 10:38:14 46.871 86.871 1
129 2013-03-02 10:56:14 46.871 86.671 2
130 2013-03-02 15:28:02 46.243 86.871 2
To calculate the distance I wrote a SQL function:
CREATE FUNCTION [dbo].[fnCalcDistanceKM](@lat1 FLOAT, @lat2 FLOAT, @lon1 FLOAT, @lon2 FLOAT)
RETURNS FLOAT
AS
BEGIN
-- Spherical law of cosines; 6371 is the Earth's mean radius in km
RETURN ACOS(SIN(PI()*@lat1/180.0)*SIN(PI()*@lat2/180.0)+COS(PI()*@lat1/180.0)*COS(PI()*@lat2/180.0)*COS(PI()*@lon2/180.0-PI()*@lon1/180.0))*6371
END
But I want the total distance traveled in a day. For 2013-03-01 I have the top four rows, and I want the total distance traveled on that date.
Similarly, 2013-03-02 has the last four rows. How do I calculate the distance for these rows?
You can use self join with ROW_NUMBER() to get distance travelled like this.
SQL Fiddle
Query
;WITH CTE AS
(
SELECT *,CONVERT(DATE,[Date]) as tday,ROW_NUMBER()OVER(PARTITION BY CONVERT(DATE,[Date]) ORDER BY [Date] ASC) rn
FROM Travel
)
SELECT T1.tday,SUM([dbo].[fnCalcDistanceKM](T1.lat,T2.lat,T1.long,T2.long)) as dist
FROM CTE T1
INNER JOIN CTE T2
ON T1.tday = T2.tday
AND T1.rn = T2.rn -1
GROUP BY T1.tday
Output
| tday | dist |
|------------|--------------------|
| 2013-03-01 | 129.40048639456964 |
| 2013-03-02 | 87.36216677343607 |
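The consecutive-row pairing can also be sketched in Python. This is an illustration under one stated assumption: distances are accumulated per day and per vehicle (V_ID), so that positions of different vehicles are never chained together, which the query above does not do. The function names are hypothetical; `distance_km` keeps the formula and argument order of fnCalcDistanceKM.

```python
from collections import defaultdict
from math import acos, cos, radians, sin

EARTH_RADIUS_KM = 6371


def distance_km(lat1, lat2, lon1, lon2):
    """Spherical law of cosines, same argument order as fnCalcDistanceKM."""
    p1, p2 = radians(lat1), radians(lat2)
    c = sin(p1) * sin(p2) + cos(p1) * cos(p2) * cos(radians(lon2) - radians(lon1))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    return acos(max(-1.0, min(1.0, c))) * EARTH_RADIUS_KM


# Sample rows from the question: (ID, DATE, LAT, LONG, V_ID).
points = [
    (123, "2013-03-01 06:05:24", 45.544, 86.544, 1),
    (124, "2013-03-01 06:15:17", 45.676, 86.676, 1),
    (125, "2013-03-01 06:25:24", 46.544, 86.544, 2),
    (126, "2013-03-01 06:38:14", 46.651, 86.651, 2),
    (127, "2013-03-02 07:12:04", 46.876, 86.876, 1),
    (128, "2013-03-02 10:38:14", 46.871, 86.871, 1),
    (129, "2013-03-02 10:56:14", 46.871, 86.671, 2),
    (130, "2013-03-02 15:28:02", 46.243, 86.871, 2),
]


def daily_distance(points):
    """Sum the distance between consecutive positions per (day, vehicle)."""
    totals = defaultdict(float)
    previous = {}  # (day, vehicle) -> last seen (lat, lon)
    for _, ts, lat, lon, v_id in sorted(points, key=lambda p: p[1]):
        key = (ts[:10], v_id)
        if key in previous:
            prev_lat, prev_lon = previous[key]
            totals[key] += distance_km(prev_lat, lat, prev_lon, lon)
        previous[key] = (lat, lon)
    return dict(totals)


totals = daily_distance(points)
for key in sorted(totals):
    print(key, round(totals[key], 2))
```

Because the grouping includes the vehicle, the totals here are not expected to equal the per-day figures in the output above.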