I want to write a query in SQL Server 2014 against two tables that have no relation to each other.
The first one represents the demands, and the second one represents the supplies for them.
Demands(
[DemandId] [int] NOT NULL,
[ItemCode] [nvarchar](50) NULL,
[TotalCount] [int] NULL,
[Date] [datetime] NULL)
Supplies(
[SupplyId] [int] NOT NULL,
[ItemCode] [nvarchar](50) NULL,
[Count] [int] NULL,
[Date] [datetime] NULL)
For example, we have a demand with (TotalCount = 1000, ItemCode = 1, Date = d1)
and two supplies with (Date = d2, Count = 300, ItemCode = 1) and (Date = d3, Count = 700, ItemCode = 1).
The demand is fulfilled on date d3, so I want a query that shows the date on which the supplies have fulfilled each demand.
Consider the following data:
The result should be:
Item01 2020-01-07
Item02 2020-01-06
I appreciate any help.
A simple summary could be...
treat a demand as a negative amount of supply
combine the two datasets into a single time series
use a cumulative sum to see the net availability
Such as...
WITH
NetContribution AS
(
SELECT [ItemCode], [Date], [Count] FROM Supplies
UNION ALL
SELECT [ItemCode], [Date], -[TotalCount] FROM Demands
)
SELECT
[ItemCode],
[Date],
[Count] AS NetAvailabilityChange,
SUM([Count])
OVER (PARTITION BY [ItemCode]
ORDER BY [Date],
[Count] DESC
)
AS NetAvailability
FROM
NetContribution
While the NetAvailability is negative, Supply has not yet met Demand. While it's positive, Supply has exceeded Demand.
EDIT: In response to your question edit...
Just use the above query and add a WHERE clause...
WITH
NetContribution AS
(
SELECT [ItemCode], [Date], [Count] FROM Supplies
UNION ALL
SELECT [ItemCode], [Date], -[TotalCount] FROM Demands
),
NetAvailability AS
(
SELECT
[ItemCode],
[Date],
[Count] AS Delta,
SUM([Count])
OVER (PARTITION BY [ItemCode]
ORDER BY [Date],
[Count] DESC
)
AS Amount
FROM
NetContribution
)
SELECT
*
FROM
NetAvailability
WHERE
Amount >= 0
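If only one date per item is wanted, one possible refinement is to take the earliest date on which the running total is no longer negative. The following is a minimal sketch, not part of the original answer: the table variables stand in for the real tables, and the three dates are hypothetical placeholders for d1, d2 and d3 from the question's example.

```sql
-- Table variables standing in for the Demands and Supplies tables.
-- The dates are hypothetical stand-ins for d1, d2 and d3.
DECLARE @Demands  TABLE (DemandId int, ItemCode nvarchar(50), TotalCount int, [Date] datetime);
DECLARE @Supplies TABLE (SupplyId int, ItemCode nvarchar(50), [Count] int, [Date] datetime);

INSERT INTO @Demands  VALUES (1, N'1', 1000, '20200101');
INSERT INTO @Supplies VALUES (1, N'1', 300, '20200103'),
                             (2, N'1', 700, '20200107');

WITH NetContribution AS
(
    -- Supplies count as positive amounts, demands as negative amounts.
    SELECT ItemCode, [Date], [Count] FROM @Supplies
    UNION ALL
    SELECT ItemCode, [Date], -TotalCount FROM @Demands
),
NetAvailability AS
(
    -- Running total per item, ordered by date.
    SELECT ItemCode,
           [Date],
           SUM([Count]) OVER (PARTITION BY ItemCode
                              ORDER BY [Date], [Count] DESC) AS Amount
    FROM NetContribution
)
-- Earliest date per item on which the running total is zero or positive.
SELECT ItemCode, MIN([Date]) AS DemandMetDate
FROM NetAvailability
WHERE Amount >= 0
GROUP BY ItemCode;
```

With these rows the running total is -1000, -700 and 0, so the query returns the third date for item 1. Note that if a supply arrives before any demand, the earliest non-negative date is that supply date, which may or may not be what you want.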
This is my source data
Demand :
'1', 'A', '1000', '2020-12-01'
'4', 'B', '2000', '2020-12-01'
Supply :
'2', 'A', '700', '2020-12-05'
'3', 'A', '300', '2020-12-08'
'5', 'B', '1000', '2020-12-05'
'6', 'B', '1000','2020-12-08'
I ran the query below:
select a.itemcode, case when totaldemand - totalsupply = 0 then endsupplydate
else null end enddate from
(
select 'demand' type,itemcode,sum(quantity) totaldemand,min(demanddate) as
date from demand b group by type,itemcode ) b
inner join (
select 'supply' type,itemcode,sum(quantity) totalsupply,max(supplydate) as
endsupplydate from supply group by type,itemcode) a
on a.itemcode = b.itemcode;
Output you will get:
ItemCode, EndDate
'A', '2020-12-08'
'B', '2020-12-08'
If SUM() OVER() is not available to generate a cumulative sum, you can use a triangular join (join the current row to all preceding rows), but it is very slow on large data sets...
WITH
NetContribution AS
(
SELECT [ItemCode], [Date], SUM([Count]) AS [Count]
FROM (
SELECT [ItemCode], [Date], [Count] FROM Supplies
UNION ALL
SELECT [ItemCode], [Date], -[TotalCount] FROM Demands
)
combined
GROUP BY [ItemCode], [Date]
),
NetAvailability AS
(
SELECT
a.[ItemCode],
a.[Date],
a.[Count] AS Delta,
SUM(b.[Count]) AS Amount
FROM
NetContribution AS a
INNER JOIN
NetContribution AS b
ON a.[ItemCode] = b.[ItemCode]
AND a.[Date] >= b.[Date]
GROUP BY
a.[ItemCode],
a.[Date],
a.[Count]
)
SELECT
*
FROM
NetAvailability
WHERE
Amount >= 0
https://dbfiddle.uk/?rdbms=sqlserver_2014&fiddle=48660224fc63bcb2803f5a08b8b1311e
I have the following problem: from the table of pays and dues, I need to find the date of the last overdue. Here is the table and data for example:
create table t (
Id int
, [date] date
, Customer varchar(6)
, Deal varchar(6)
, Currency varchar(3)
, [Sum] int
);
insert into t values
(1, '2017-12-12', '1110', '111111', 'USD', 12000)
, (2, '2017-12-25', '1110', '111111', 'USD', 5000)
, (3, '2017-12-13', '1110', '122222', 'USD', 10000)
, (4, '2018-01-13', '1110', '111111', 'USD', -10100)
, (5, '2017-11-20', '2200', '222221', 'USD', 25000)
, (6, '2017-12-20', '2200', '222221', 'USD', 20000)
, (7, '2017-12-31', '2201', '222221', 'USD', -10000)
, (8, '2017-12-29', '1110', '122222', 'USD', -10000)
, (9, '2017-11-28', '2201', '222221', 'USD', -30000);
If the value of "Sum" is positive - it means overdue has begun; if "Sum" is negative - it means someone paid on this Deal.
In the example above, the overdue on Deal '122222' starts on 2017-12-13 and ends on 2017-12-29, so it shouldn't be in the result.
For Deal '222221', the first overdue of 25000, which started on 2017-11-20, was completely paid on 2017-11-28, so the last date of the current overdue (the one we are interested in) is 2017-12-31.
I've written this query to sum up all the payments, and got stuck here:
WITH cte AS (
SELECT *,
SUM([Sum]) OVER(PARTITION BY Deal ORDER BY [Date]) AS Debt_balance
FROM t
)
SELECT * FROM cte;
Apparently I need to find (for each Deal) the minimum of the dates if there is no zero or negative Debt_balance, and otherwise the next date after the last zero balance.
I would be grateful for any tips and ideas on the subject.
Thanks!
UPDATE
My version of solution:
WITH cte AS (
SELECT ROW_NUMBER() OVER (ORDER BY Deal, [Date]) id,
Deal, [Date], [Sum],
SUM([Sum]) OVER(PARTITION BY Deal ORDER BY [Date]) AS Debt_balance
FROM t
)
SELECT a.Deal,
SUM(a.Sum) AS NET_Debt,
isnull(max(b.date), min(a.date)),
datediff(day, isnull(max(b.date), min(a.date)), getdate())
FROM cte as a
LEFT OUTER JOIN cte AS b
ON a.Deal = b.Deal AND a.Debt_balance <= 0 AND b.Id=a.Id+1
GROUP BY a.Deal
HAVING SUM(a.Sum) > 0
I believe you are trying to use running sum and keep track of when it changes to positive, and it can change to positive multiple times and you want the last date at which it became positive. You need LAG() in addition to running sum:
WITH cte1 AS (
-- running balance column
SELECT *
, SUM([Sum]) OVER (PARTITION BY Deal ORDER BY [Date], Id) AS RunningBalance
FROM t
), cte2 AS (
-- overdue begun column - set whenever running balance changes from l.t.e. zero to g.t. zero
SELECT *
, CASE WHEN LAG(RunningBalance, 1, 0) OVER (PARTITION BY Deal ORDER BY [Date], Id) <= 0 AND RunningBalance > 0 THEN 1 END AS OverdueBegun
FROM cte1
)
-- eliminate groups that are paid i.e. sum = 0
SELECT Deal, MAX(CASE WHEN OverdueBegun = 1 THEN [Date] END) AS RecentOverdueDate
FROM cte2
GROUP BY Deal
HAVING SUM([Sum]) <> 0
Demo on db<>fiddle
You can use window functions. These can calculate intermediate values:
Last day when the sum is negative (i.e. last "good" record).
Last sum
Then you can combine these:
select deal, min(date) as last_overdue_start_date
from (select t.*,
first_value(sum) over (partition by deal order by date desc) as last_sum,
max(case when sum < 0 then date end) over (partition by deal order by date) as max_date_neg
from t
) t
where last_sum > 0 and date > max_date_neg
group by deal;
Actually, the value on the last date is not necessary. So this simplifies to:
select deal, min(date) as last_overdue_start_date
from (select t.*,
max(case when sum < 0 then date end) over (partition by deal order by date) as max_date_neg
from t
) t
where date > max_date_neg
group by deal;
I am creating a sample query that converts rows to columns, something like the following:
Person_Id | Total Earned Leave | Earned Leave Enjoyed | Remaining Earned Leave | Total Casual Leave | Casual Leave Enjoyed | Remaining Casual Leave
1001 | 20 | 10 | 10 | 20 | 4 | 16
The above is the output I get, using multiple subqueries in the following query:
SELECT DISTINCT m.Person_Id, (SELECT k.Leave_Allocation FROM LeaveDetails k WHERE k.Leave_Name = 'Earn Leave'
AND k.Person_Id = 1001 AND k.[Year] = '2017') AS 'Total Earned Leave',
(SELECT o.Leave_Enjoy FROM LeaveDetails o WHERE o.Leave_Name = 'Earn Leave'
AND o.Person_Id = 1001 AND o.[Year] = '2017') AS 'Earned Leave Enjoyed',
(SELECT p.Leave_Remain FROM LeaveDetails p WHERE p.Leave_Name = 'Earn Leave'
AND p.Person_Id = 1001 AND p.[Year] = '2017') AS 'Remaining Earned Leave',
(SELECT k.Leave_Allocation FROM LeaveDetails k WHERE k.Leave_Name = 'Casual Leave'
AND k.Person_Id = 1001 AND k.[Year] = '2017') AS 'Total Casual Leave',
(SELECT o.Leave_Enjoy FROM LeaveDetails o WHERE o.Leave_Name = 'Casual Leave'
AND o.Person_Id = 1001 AND o.[Year] = '2017') AS 'Casual Leave Enjoyed',
(SELECT p.Leave_Remain FROM LeaveDetails p WHERE p.Leave_Name = 'Casual Leave'
AND p.Person_Id = 1001 AND p.[Year] = '2017') AS 'Remaining Casual Leave'
FROM LeaveDetails m WHERE m.Person_Id = 1001 AND m.[Year] = '2017'
I am not sure whether I will have performance issues here, as there will be a lot of data, and I was wondering whether this is better than PIVOT or creating a table at run time. I just want to make sure this is a good choice for what I am trying to accomplish. Feel free to share your ideas, as well as samples using SQL Server, MySQL or Oracle, for better performance. Thanks.
Sample Table and Data:
CREATE TABLE [dbo].[LeaveDetails](
[Id] [int] IDENTITY(1,1) NOT NULL,
[Person_Id] [nvarchar](20) NULL,
[Leave_Name] [nvarchar](40) NULL,
[Leave_Allocation] [float] NULL,
[Leave_Enjoy] [float] NULL,
[Leave_Remain] [float] NULL,
[Details] [nvarchar](100) NULL,
[Year] [nvarchar](10) NULL,
[Status] [bit] NULL
)
INSERT [dbo].[LeaveDetails] ([Id], [Person_Id], [Leave_Name], [Leave_Allocation], [Leave_Enjoy], [Leave_Remain], [Details], [Year], [Status]) VALUES (1, N'1001', N'Earn Leave', 20, 10, 10, NULL, N'2017', 1)
INSERT [dbo].[LeaveDetails] ([Id], [Person_Id], [Leave_Name], [Leave_Allocation], [Leave_Enjoy], [Leave_Remain], [Details], [Year], [Status]) VALUES (2, N'1001', N'Casual Leave', 20, 4, 16, NULL, N'2017', 1)
Use conditional aggregation:
SELECT m.Person_Id,
MAX(CASE WHEN m.Leave_Name = 'Earn Leave' THEN m.Leave_Allocation END) as [Total Earned Leave],
MAX(CASE WHEN m.Leave_Name = 'Earn Leave' THEN m.Leave_Enjoy END) as [Earned Leave Enjoyed],
MAX(CASE WHEN m.Leave_Name = 'Earn Leave' THEN m.Leave_Remain END) as [Remaining Earned Leave],
MAX(CASE WHEN m.Leave_Name = 'Casual Leave' THEN m.Leave_Allocation END) as [Total Casual Leave],
MAX(CASE WHEN m.Leave_Name = 'Casual Leave' THEN m.Leave_Enjoy END) as [Casual Leave Enjoyed],
MAX(CASE WHEN m.Leave_Name = 'Casual Leave' THEN m.Leave_Remain END) as [Remaining Casual Leave]
FROM LeaveDetails m
WHERE m.Person_Id = 1001 AND m.[Year] = '2017'
GROUP BY m.Person_Id;
Note: I do not advocate having special characters (such as spaces) in column aliases. If you do, use the proper escape character (square brackets). Only use single quotes for string and date constants.
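As a small illustration of that note, the sketch below contrasts the three cases. The literal values are placeholders chosen for the example, not data from the question.

```sql
-- Illustrative only: the literal values are placeholders.
SELECT 20           AS [Total Earned Leave],  -- square brackets delimit an alias that contains spaces
       10           AS EarnedLeaveEnjoyed,    -- no delimiter is needed without special characters
       'Earn Leave' AS Leave_Name;            -- single quotes are kept for the string constant
```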
PIVOT would work, but it looks like this is simply a single row that you want pivoted to a columnar output and the column names are known explicitly. If that's the case, you could just UNION the single column results together:
SELECT 'Person_ID' as col_name, Person_Id as col_value FROM LeaveDetails WHERE Person_Id = 1001 AND [Year] = '2017'
UNION
SELECT 'Leave_Enjoy' as col_name, Leave_Enjoy as col_value FROM LeaveDetails WHERE Person_Id = 1001 AND [Year] = '2017'
UNION
...
It's a lot simpler to write, cleaner to read, and should run a little faster - there is still one table scan for each column. Is the table indexed on Person_ID and Year?
If speed is an issue you could create a temp table of the one row:
SELECT * into #ld_temp FROM LeaveDetails WHERE Person_Id = 1001 AND [Year] = '2017'
then select from the temp table in the SELECT/UNION code:
SELECT 'Person_ID' as col_name, Person_Id as col_value FROM #ld_temp
UNION
SELECT 'Leave_Enjoy' as col_name, Leave_Enjoy as col_value FROM #ld_temp
UNION
...
Now you're down to just a single scan of the big table.
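Another way to get one row per column in a single pass, not mentioned in the answer above, is CROSS APPLY with a VALUES list. The following is a minimal sketch under assumptions: the table variable stands in for LeaveDetails and holds only the columns needed here, and the three unpivoted columns are all float so no casting is required.

```sql
-- Table variable standing in for LeaveDetails (only the columns used below).
DECLARE @LeaveDetails TABLE
(
    Person_Id        nvarchar(20),
    Leave_Name       nvarchar(40),
    Leave_Allocation float,
    Leave_Enjoy      float,
    Leave_Remain     float,
    [Year]           nvarchar(10)
);

INSERT INTO @LeaveDetails VALUES
    (N'1001', N'Earn Leave',   20, 10, 10, N'2017'),
    (N'1001', N'Casual Leave', 20,  4, 16, N'2017');

-- Each source row is read once and expanded into three (col_name, col_value) rows.
SELECT ld.Leave_Name, v.col_name, v.col_value
FROM @LeaveDetails AS ld
CROSS APPLY (VALUES
    (N'Leave_Allocation', ld.Leave_Allocation),
    (N'Leave_Enjoy',      ld.Leave_Enjoy),
    (N'Leave_Remain',     ld.Leave_Remain)
) AS v (col_name, col_value)
WHERE ld.Person_Id = N'1001' AND ld.[Year] = N'2017';
```

If columns of different types are mixed in the VALUES list, they would need an explicit CAST to a common type.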
I hope this helps.
I have an attendance data list, shown below. Now I am trying to find data for a specific date range (01/05/2016 to 07/05/2016) with a Total Present column. Total Present is calculated from the previous present data (P). Suppose today is 04/05/2016: if a person has status 'P' on 01, 02, 03 and 04, then the row for 04-05-2016 shows a total present of 4.
Could you help me find the total present from this result set?
You can check this example, which has the logic to calculate the running sum.
declare @t table (employeeid int, datecol date, status varchar(2) )
insert into @t values (10001, '01-05-2016', 'P'),
(10001, '02-05-2016', 'P'),
(10001, '03-05-2016', 'P'),
(10001, '04-05-2016', 'P'),
(10001, '05-05-2016', 'A'),
(10001, '06-05-2016', 'P'),
(10001, '07-05-2016', 'P'),
(10001, '08-05-2016', 'L'),
(10002, '07-05-2016', 'P'),
(10002, '08-05-2016', 'L')
--select * from @t
select * ,
SUM(case when status = 'P' then 1 else 0 end) OVER (PARTITION BY employeeid ORDER BY employeeid, datecol
ROWS BETWEEN UNBOUNDED PRECEDING
AND current row) AS TotalPresent
from
@t
Another twist on the same thing via a CTE (since you mentioned SQL Server 2012: the solution below only works in SQL Server 2012 and above)
;with cte as
(
select employeeid , datecol , ROW_NUMBER() over(partition by employeeid order by employeeid, datecol) rowno
from
@t where status = 'P'
)
select t.*, cte.rowno ,
case when ( isnull(cte.rowno, 0) = 0)
then LAG(cte.rowno) OVER (ORDER BY t.employeeid, t.datecol)
else cte.rowno
end LagValue
from @t t left join cte on t.employeeid = cte.employeeid and t.datecol = cte.datecol
order by t.employeeid, t.datecol
You could use a subquery to calculate TotalPresent for each row:
SELECT
main.EmployeeID,
main.[Date],
main.[Status],
(
SELECT SUM(CASE WHEN t.[Status] = 'P' THEN 1 ELSE 0 END)
FROM [TableName] t
WHERE t.EmployeeID = main.EmployeeID AND t.[Date] <= main.[Date]
) as TotalPresent
FROM [TableName] main
ORDER BY
main.EmployeeID,
main.[Date]
Here I used a subquery to sum over the records that have the same EmployeeID and a date less than or equal to the date of the current row. If the status of a record is 'P', 1 is added to the sum, otherwise 0, so only records with status 'P' are counted.
Interesting question, this should work:
select *
, (select count(retail) from p g
where g.date <= p.date and g.id = p.id and retail = 'P')
from p
order by ID, Date;
So I believe I understand correctly: you would like to count the occurrences of P per ID, date-wise.
This makes a lot of sense. That is why the first occurrence of ID2 was L and the Total is 0. This query will count P status for each occurrence, pause at non-P for each ID.
Here is an example
I have a table like this:
declare @data table
(
id int not null,
groupid int not null,
startDate datetime not null,
endDate datetime not null
)
insert into @data values
(1, 1, '20150101', '20150131'),
(2, 1, '20150114', '20150131'),
(3, 1, '20150201', '20150228');
and my current select statement is:
select groupid, 'some data', min(id), count(*)
from @data
group by groupid
But now I need to group records whose periods intersect.
Desired result:
1, 'some data', 1, 2
1, 'some data', 3, 1
Does anyone know how to do this?
One method is to identify the beginning of each group -- because it doesn't overlap with the previous one. Then, count the number of these as a group identifier.
with overlaps as (
-- rows that start a new group: no earlier-starting period in the same groupid covers their start date
select id
from @data d
where not exists (select 1
from @data d2
where d.groupid = d2.groupid and
d.startDate > d2.startDate and -- strict ">" so that a row does not match itself
d.startDate < d2.endDate
)
),
groups as (
select d.*,
count(o.id) over (partition by groupid
order by d.startDate) as grpnum
from @data d left join
overlaps o
on d.id = o.id
)
select groupid, min(id), count(*),
min(startDate) as startDate, max(endDate) as endDate
from groups
group by grpnum, groupid;
Notes: This is using cumulative counts, which are available in SQL Server 2012+. You can do something similar with a correlated subquery or apply in earlier versions.
Also, this query assumes that the start dates are unique. If they are not, the query can be tweaked, but the logic becomes a bit more complicated.
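For versions without cumulative window aggregates, the correlated-subquery variant mentioned in the notes might look like the sketch below. It is an illustration under assumptions rather than the answerer's code: the CTE name `starts` is made up here, it assumes unique start dates as noted above, and the multi-row INSERT needs SQL Server 2008 or later.

```sql
DECLARE @data TABLE
(
    id        int      NOT NULL,
    groupid   int      NOT NULL,
    startDate datetime NOT NULL,
    endDate   datetime NOT NULL
);

INSERT INTO @data VALUES
    (1, 1, '20150101', '20150131'),
    (2, 1, '20150114', '20150131'),
    (3, 1, '20150201', '20150228');

WITH starts AS
(
    -- Rows whose start date is not covered by an earlier-starting period of the same groupid.
    SELECT d.id, d.groupid, d.startDate
    FROM @data d
    WHERE NOT EXISTS (SELECT 1
                      FROM @data d2
                      WHERE d2.groupid   = d.groupid
                        AND d2.startDate < d.startDate
                        AND d.startDate  < d2.endDate)
),
groups AS
(
    -- The group number is the count of group starts at or before this row's start date.
    SELECT d.*,
           (SELECT COUNT(*)
            FROM starts s
            WHERE s.groupid   = d.groupid
              AND s.startDate <= d.startDate) AS grpnum
    FROM @data d
)
SELECT groupid, MIN(id) AS first_id, COUNT(*) AS cnt,
       MIN(startDate) AS startDate, MAX(endDate) AS endDate
FROM groups
GROUP BY groupid, grpnum;
```

On the three sample rows, rows 1 and 2 fall into the first group and row 3 into the second, which matches the desired result in the question.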
So I have a Visitor table, and a Visitor_activity table. Say:
Visitor
Visitor_ID Int
Visitor_name varchar(20)
Visitor_Activity
ID Int
Visitor_ID Int
Activity_Type char(3) -- values IN or OUT
Activity_Time datetime
Visitors might sign in and out multiple times in a day.
I'd like a nice query to tell me all visitors who are in: i.e. the last activity for today (on activity_time) was an "IN" not an "OUT". Any advice much appreciated.
It's T-SQL by the way, but I think it's more of an in-principle question.
One way to solve this is to use a correlated not exists predicate:
select Activity_Time, Visitor_ID
from Visitor_Activity t1
where Activity_Type = 'IN'
and not exists (
select 1
from Visitor_Activity
where Activity_Type = 'OUT'
and Visitor_ID = t1.Visitor_ID
and Activity_Time > t1.Activity_Time
and cast(Activity_Time as date) = cast(t1.Activity_Time as date)
)
This basically says: get all Visitor_IDs that have type = IN for which there doesn't exist any type = OUT record with a later time (on the same date).
Sample SQL Fiddle
SELECT
v.*
FROM
Visitor v
JOIN Visitor_Activity va ON va.Visitor_ID = v.Visitor_ID
WHERE
va.Activity_Type = 'IN'
AND NOT EXISTS ( SELECT
*
FROM
Visitor_Activity va_out
WHERE
va_out.Visitor_ID = va.Visitor_ID
AND va_out.Activity_Type = 'OUT'
AND va_out.Activity_Time > va.Activity_Time )
with visitorsInOut as (
select Visitor_id,
max(case when Activity_Type = 'in' then Activity_Time else null end) inTime,
max(case when Activity_Type = 'out' then Activity_Time else null end) outTime
from Visitor_Activity
where datediff(dd, Activity_Time, getdate()) = 0
group by Visitor_id)
select Visitor_id
from visitorsInOut
where inTime > outTime or outTime is null
This uses a CTE to find the activity record with the greatest Activity_Time where the Activity_Type = 'IN' and assigns it RowNum 1. Then you can INNER JOIN the CTE to the Visitor table, filtering by the CTE results where RowNum = 1.
; WITH VisAct AS(
SELECT act.Visitor_ID
, ROW_NUMBER() OVER(PARTITION BY Visitor_ID ORDER BY Activity_Time DESC) AS RowNum
FROM Visitor_Activity act
WHERE act.Activity_Type = 'IN'
AND act.Activity_Time >= CAST(GETDATE() AS DATE)
)
SELECT vis.Visitor_ID, vis.Visitor_name
FROM Visitor vis
INNER JOIN VisAct act
ON act.Visitor_ID = vis.Visitor_ID
WHERE act.RowNum = 1
You can pull the most recent action for each visitor, and then only return those where the last action for today was to check in.
SELECT v.Visitor_ID, v.Visitor_Name, va.Activity_Type, va.Activity_Time
FROM Visitor AS v
INNER JOIN (SELECT Visitor_ID, Activity_Type, Activity_Time, RANK() OVER (PARTITION BY Visitor_ID ORDER BY Activity_Time DESC) AS LastAction
FROM Visitor_Activity
-- checks for today, can be omitted if you still want
-- to see someone checked in from yesterday
WHERE DATEDIFF(d, 0, Activity_Time) = DATEDIFF(d, 0, getdate())
) AS va ON va.Visitor_ID = v.Visitor_ID
WHERE LastAction = 1
AND Activity_Type = 'IN'
With CROSS APPLY:
DECLARE @d DATE = '20150320'
DECLARE @v TABLE
(
visitor_id INT ,
visitor_name NVARCHAR(MAX)
)
DECLARE @a TABLE
(
visitor_id INT ,
type CHAR(3) ,
time DATETIME
)
INSERT INTO @v
VALUES ( 1, 'A' ),
( 2, 'B' ),
( 3, 'C' )
INSERT INTO @a
VALUES ( 1, 'in', '2015-03-20 19:32:27.513' ),
( 1, 'out', '2015-03-20 19:32:27.514' ),
( 1, 'in', '2015-03-20 19:32:27.515' ),
( 2, 'in', '2015-03-20 19:32:27.516' ),
( 2, 'out', '2015-03-20 19:32:27.517' ),
( 3, 'in', '2015-03-20 19:32:27.518' ),
( 3, 'out', '2015-03-20 19:32:27.519' ),
( 3, 'in', '2015-03-20 19:32:27.523' )
SELECT *
FROM @v v
CROSS APPLY ( SELECT *
FROM ( SELECT TOP 1
type
FROM @a a
WHERE a.visitor_id = v.visitor_id
AND a.time >= @d
AND a.time < DATEADD(dd, 1, @d)
ORDER BY time DESC
) i
WHERE type = 'in'
) c
Output:
visitor_id visitor_name type
1 A in
3 C in
The principle:
First, you select all visitors.
Then you apply each visitor's last activity for the day:
SELECT TOP 1
type
FROM @a a
WHERE a.visitor_id = v.visitor_id
AND a.time >= @d
AND a.time < DATEADD(dd, 1, @d)
ORDER BY time DESC
Then you select from the previous step with a filter, so that the result is an empty set for visitors whose last activity was not 'in'. If the last activity was 'in', you get one row in the result and the apply finds a match. If the last activity was 'out', the filtered subquery returns an empty set, and by design CROSS APPLY eliminates such a visitor.
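To see that elimination in action, it can help to run the same subquery with OUTER APPLY, which keeps every visitor and returns NULL where the applied set is empty. This is a minimal sketch with made-up sample rows (two visitors and hypothetical timestamps), not the data from the answer above.

```sql
DECLARE @d date = '20150320';
DECLARE @v TABLE (visitor_id int, visitor_name nvarchar(50));
DECLARE @a TABLE (visitor_id int, [type] char(3), [time] datetime);

-- Hypothetical rows: visitor 1 is still in, visitor 2 has signed out.
INSERT INTO @v VALUES (1, N'A'), (2, N'B');
INSERT INTO @a VALUES (1, 'in',  '2015-03-20T09:00:00'),
                      (2, 'in',  '2015-03-20T09:05:00'),
                      (2, 'out', '2015-03-20T17:00:00');

-- OUTER APPLY keeps visitors whose applied set is empty; [type] is NULL for them.
SELECT v.visitor_id, v.visitor_name, c.[type]
FROM @v v
OUTER APPLY ( SELECT i.[type]
              FROM ( SELECT TOP 1 a.[type]
                     FROM @a a
                     WHERE a.visitor_id = v.visitor_id
                       AND a.[time] >= @d
                       AND a.[time] < DATEADD(dd, 1, @d)
                     ORDER BY a.[time] DESC
                   ) i
              WHERE i.[type] = 'in'
            ) c;
```

Here visitor 1 comes back with type 'in' and visitor 2 with NULL. Switching OUTER APPLY back to CROSS APPLY drops the NULL row, which is exactly the filtering the answer relies on.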