Identify Continuous Periods of Time - sql

Today my issue has to do with marking continuous periods of time where a given criterion is met. My raw data of interest looks like this.
Salesman ID Pay Period ID Total Commissionable Sales (US dollars)
1 101 525
1 102 473
1 103 672
1 104 766
2 101 630
2 101 625
.....
I want to mark continuous periods of time where a salesman has achieved $500 of sales or more. My ideal result should look like this.
[Salesman ID] [Start time] [End time] [# Periods] [Average Sales]
1 101 101 1 525
1 103 107 5 621
2 101 103 3 635
3 104 106 3 538
I know how to do everything else, but I cannot figure out an inexpensive way to identify the start and end dates. Help!

Try something like this. The innermost SELECT adds a column to the original table that flags where a new group begins. The middle query turns this flag into a running total that numbers the groups; we call this column [Group ID]. All that is left is to filter out the rows where [Sales] < 500 and group by [Salesman ID] and [Group ID]. Note that the self-join on [Pay Period ID] - 1 assumes pay period IDs are consecutive integers.
SELECT [Salesman ID],
       MIN([Pay Period ID]) AS [Start time],
       MAX([Pay Period ID]) AS [End time],
       COUNT(*) AS [# of periods],
       AVG([Sales]) AS [Average Sales]
FROM (
    SELECT [Salesman ID], [Pay Period ID], [Sales],
           SUM(NewGroup) OVER (PARTITION BY [Salesman ID] ORDER BY [Pay Period ID]
                               ROWS UNBOUNDED PRECEDING) AS [Group ID]
    FROM (
        SELECT T1.*,
               CASE WHEN T1.[Sales] >= 500 AND (Prev.[Sales] < 500 OR Prev.[Sales] IS NULL)
                    THEN 1 ELSE 0 END AS [NewGroup]
        FROM MyTable T1
        LEFT JOIN MyTable Prev ON Prev.[Salesman ID] = T1.[Salesman ID]
                              AND Prev.[Pay Period ID] = T1.[Pay Period ID] - 1
    ) AS InnerQ
) AS MiddleQ
WHERE [Sales] >= 500
GROUP BY [Salesman ID], [Group ID]
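The flag-plus-running-total idea can be checked by hand outside the database. The following is only a sketch in Python, not part of the answer: the function name `find_streaks` is hypothetical, and the sample rows are just the four complete rows for salesman 1 shown in the question.

```python
# Rows copied from the question: (salesman_id, pay_period_id, sales).
rows = [(1, 101, 525), (1, 102, 473), (1, 103, 672), (1, 104, 766)]


def find_streaks(rows, threshold=500):
    """Return (salesman, start, end, n_periods, avg_sales) per qualifying streak.

    Mirrors the SQL: flag the rows that start a new group, keep a running
    total of the flags per salesman, then aggregate the qualifying rows.
    Like the SQL self-join, it assumes period IDs are consecutive integers.
    """
    rows = sorted(rows)
    sales_by_key = {(s, p): v for s, p, v in rows}
    group_id = {}   # running total of new-group flags, per salesman
    streaks = {}    # (salesman, group id) -> list of (period, sales)
    for salesman, period, sales in rows:
        prev = sales_by_key.get((salesman, period - 1))
        new_group = sales >= threshold and (prev is None or prev < threshold)
        group_id[salesman] = group_id.get(salesman, 0) + int(new_group)
        if sales >= threshold:
            streaks.setdefault((salesman, group_id[salesman]), []).append((period, sales))
    return [
        (salesman, members[0][0], members[-1][0], len(members),
         sum(v for _, v in members) / len(members))
        for (salesman, _), members in sorted(streaks.items())
    ]


result = find_streaks(rows)
for row in result:
    print(row)
```

For these four rows the sketch yields one single-period streak at 101 and one two-period streak covering 103 to 104.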

Related

MS Access - Sub Query with Running Total using DSUM with filter

In order to generate running total of Sales Qty in MS Access, I used below query, it is working as expected
SELECT ID, [Product Line], DSUM("[Qty]","[SalesData]","[Product Line] like '*Electronics*' AND [ID] <=" & [ID]) AS RunningTotal FROM SalesData WHERE ([Product Line]) Like '*Electronics*';
Now I need to keep only the records with RunningTotal < 100,
so I ran the subquery below:
SELECT * FROM(
SELECT ID, [Product Line], DSUM("[Qty]","[SalesData]","[Product Line] like '*Electronics*' AND [ID] <=" & [ID]) AS RunningTotal, FROM SalesData WHERE ([Product Line]) Like '*Electronics*')
DSUM("[Qty]","[","[Product Line] like '*Electronics*' AND [ID] <=" & [ID]) < 100;
It does not work, and the table freezes repeatedly while the query runs.
Data Table
ID Product Line Qty RunningTotal
1 Electronics 15 15
2 R.K. Electricals 20 20
3 Samsung Electronics 10 25
4 Electricals 30 50
5 Electricals 45 95
6 Electronics Components 18 43
7 Electricals 25 120
8 Electronics 50 93
9 Electricals Machines 65 185
10 Electronics 15 108
11 ABC Electronics Ltd 52 160
12 Electricals 15 200
Here RunningTotal is a calculated field, not a table field.
The running total for Electricals and the running total for Electronics are computed separately.
Expected output for Product Line like Electronics with RunningTotal < 100
ID Product Line Qty RunningTotal
1 Electronics 15 15
3 Samsung Electronics 10 25
6 Electronics Components 18 43
8 Electronics 50 93
Could you please help me to rectify the above query?
Thanks in advance.
Rather than using domain aggregate functions such as DSum, which are notoriously slow, I would suggest a correlated subquery such as the following:
select q.* from
(
select t.id, t.[product line], t.qty,
(
select sum(u.qty)
from salesdata u
where u.[product line] = t.[product line] and u.id <= t.id
) as runningtotal
from salesdata t
where t.[product line] like "*Electronics*"
) q
where q.runningtotal < 100
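EDIT: The query above accumulates only over rows with exactly the same product line. To accumulate over every product line matching the pattern, as in the expected output, use: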
select t.*, q.runningtotal from salesdata t inner join
(
select t.id,
(
select sum(u.qty)
from salesdata u
where u.[product line] like "*Electronics*" and u.id <= t.id
) as runningtotal
from salesdata t
) q on t.id = q.id
where q.runningtotal < 100 and t.[product line] like "*Electronics*"
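To see the correlated-subquery approach reproduce the expected output, here is a small sketch that runs an equivalent query against the question's data in an in-memory SQLite database. This is an illustration under assumptions, not the Access query itself: SQLite uses `%` rather than `*` as the LIKE wildcard, and the column name `product_line` is a stand-in for `[Product Line]`.

```python
import sqlite3

sales_conn = sqlite3.connect(":memory:")
sales_conn.execute(
    "CREATE TABLE salesdata (id INTEGER PRIMARY KEY, product_line TEXT, qty INTEGER)"
)
# Rows copied from the question's data table.
sales_conn.executemany(
    "INSERT INTO salesdata VALUES (?, ?, ?)",
    [
        (1, "Electronics", 15), (2, "R.K. Electricals", 20),
        (3, "Samsung Electronics", 10), (4, "Electricals", 30),
        (5, "Electricals", 45), (6, "Electronics Components", 18),
        (7, "Electricals", 25), (8, "Electronics", 50),
        (9, "Electricals Machines", 65), (10, "Electronics", 15),
        (11, "ABC Electronics Ltd", 52), (12, "Electricals", 15),
    ],
)

# The inner query computes the running total over all matching product
# lines; the outer query filters on the computed column.
query = """
SELECT q.id, q.product_line, q.qty, q.runningtotal
FROM (
    SELECT t.id, t.product_line, t.qty,
           (SELECT SUM(u.qty)
            FROM salesdata u
            WHERE u.product_line LIKE '%Electronics%' AND u.id <= t.id
           ) AS runningtotal
    FROM salesdata t
    WHERE t.product_line LIKE '%Electronics%'
) q
WHERE q.runningtotal < 100
ORDER BY q.id
"""
running = sales_conn.execute(query).fetchall()
for row in running:
    print(row)
```

The rows returned are IDs 1, 3, 6 and 8 with running totals 15, 25, 43 and 93, matching the expected output in the question.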

Summing up of columns from different tables

Here is my data:
Table 1:
STORAGE HANDLING TOTAL BILLING
--------------------------------------
1300 10900
0 10950
0 6000
0 5950
Table 2:
LINER REVENUE
---------------
1300
250
3000
200
I need to calculate Total Billing:
Total Billing = Storage+Handling+Liner Revenue.
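Can someone guide me to a query for this?
Hope this helps. The two tables share no key, so this pairs their rows by a generated row number; the pairing follows whatever order the engine returns the rows in, not a guaranteed one: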
SELECT h.Storage+h.Handling+j.[Liner Revenue]
From(SELECT
STORAGE
,HANDLING
,[TOTAL BILLING]
,ROW_NUMBER () over(order by rand()) as p
FROM [Table 1] )h
INNER JOIN
(SELECT
[LINER REVENUE]
,ROW_NUMBER () over(order by rand()) as p
FROM [Table 2] )j
on h.p=j.p
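The join above amounts to pairing the rows of the two tables by position. A minimal sketch of that pairing in Python, assuming the two numbers in each Table 1 row are STORAGE and HANDLING and that the rows of both tables are already in matching order (the function name `total_billing` is hypothetical):

```python
# Values copied from the question.
table1 = [(1300, 10900), (0, 10950), (0, 6000), (0, 5950)]  # (storage, handling)
table2 = [1300, 250, 3000, 200]                             # liner revenue


def total_billing(storage_handling, liner_revenue):
    """Pair rows by position and add storage + handling + liner revenue."""
    if len(storage_handling) != len(liner_revenue):
        raise ValueError("both tables must have the same number of rows")
    return [s + h + l for (s, h), l in zip(storage_handling, liner_revenue)]


totals = total_billing(table1, table2)
print(totals)
```

Positional pairing is only meaningful when both tables have a reliable row order. If the tables have a shared key or a sequence column, joining on that is the safer choice.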

How to produce such results using SQL

I edited my question as it seems like people misunderstood what I wanted.
I have a table which has the following columns:
Company
Transaction ID
Transaction Date
The result I want is:
| COMPANY | Transaction ID |Transaction Date | GROUP
|---------------------|------------------|------------------|----------
| Company A | t_0001 | 01-01-2014 | 1
| Company A | t_0002 | 02-01-2014 | 1
| Company A | t_0003 | 04-01-2014 | 1
| Company A | t_0003 | 10-01-2014 | 2
| Company B | t_0004 | 02-01-2014 | 1
| Company B | t_0005 | 02-01-2014 | 1
| Company C | t_0006 | 03-01-2014 | 1
| Company C | t_0007 | 05-01-2014 | 2
where the transactions and dates are first grouped by company. The transactions within each company are sorted from earliest to latest. The transactions are then checked, row by row, to see whether the previous transaction was performed less than 3 days earlier, in a moving window.
For example, t_0002 and t_0001 are less than 3 days apart, so they fall under group 1. t_0003 and t_0002 are less than 3 days apart, so they fall under group 1 even though t_0003 and t_0001 are >= 3 days apart.
I figured the way to do this is to group the data by company first and then sort the transactions by date, but I got stuck after that. What methods could I use to produce these results? Any help on this?
P.S. I am using SQL Server 2014.
I computed the difference in days between consecutive transactions within each company, ordered by transaction ID. If the difference is less than 3 days, the row goes to group 1; otherwise it goes to group 2. Adjust the lag clause to suit your requirement.
select *,isnull(
case when datediff(day,
lag([Transaction Date]) over(partition by company order by [transaction id]),[Transaction Date])>=2
then
2
end ,1)group1
from #Table1
If you don't care about the numbering in groups, use
select *,
dense_rank() over(partition by company order by transaction_date) -
(select count(distinct transaction_date) from t
where t1.company=company
and datediff(dd,transaction_date,t1.transaction_date) between 1 and 2) grp
from t t1
order by 1,3
If continuous numbers are needed for groups, use
select company,transaction_id,transaction_date,
dense_rank() over(partition by company order by grp) grp
from (select *, dense_rank() over(partition by company order by transaction_date) -
(select count(distinct transaction_date) from t
where t1.company=company
and datediff(dd,transaction_date,t1.transaction_date) between 1 and 2) grp
from t t1
) x
order by 1,3
create table xn (
[Company] char(1),
[Transaction ID] char(6),
[Transaction Date] date,
primary key ([Company], [Transaction ID], [Transaction Date])
);
insert into xn values
('A', 't_0001', '2014-01-01'),
('A', 't_0002', '2014-01-02'),
('A', 't_0003', '2014-01-04'),
('A', 't_0003', '2014-01-10'),
('B', 't_0004', '2014-01-02'),
('B', 't_0005', '2014-01-02'),
('C', 't_0006', '2014-01-03'),
('C', 't_0007', '2014-01-05');
Each query builds on the one before. There are more concise ways to write queries like this, but I think this way helps when you're learning window functions like lag(...) over (...).
The first one here brings the previous transaction date into the "current" row.
select
[Company],
[Transaction ID],
[Transaction Date],
lag ([Transaction Date]) over (partition by [Company] order by [Transaction Date]) as [Prev Transaction Date]
from xn
This query determines the number of days between the "current" transaction date and the previous transaction date.
select
[Company],
[Transaction ID],
[Transaction Date],
[Prev Transaction Date],
DateDiff(d, [Prev Transaction Date], [Transaction Date]) as [Days Between]
from (select
[Company],
[Transaction ID],
[Transaction Date],
lag ([Transaction Date]) over (partition by [Company] order by [Transaction Date]) as [Prev Transaction Date]
from xn) x
This does the grouping based on the number of days.
select
[Company],
[Transaction ID],
[Transaction Date],
case when [Days Between] between 0 and 3 then 1
when [Days Between] is null then 1
when [Days Between] > 3 then 2
else 'Ummm'
end as [Group Num]
from (
select
[Company],
[Transaction ID],
[Transaction Date],
[Prev Transaction Date],
DateDiff(d, [Prev Transaction Date], [Transaction Date]) as [Days Between]
from (select
[Company],
[Transaction ID],
[Transaction Date],
lag ([Transaction Date]) over (partition by [Company] order by [Transaction Date]) as [Prev Transaction Date]
from xn) x
) y;
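The CASE expression above only ever produces 1 or 2. If a company can have more than two groups, one possible reading of the question's rule is a counter that increases every time the gap to the previous transaction is 3 days or more. The sketch below shows that reading in Python; it is an assumption about the intent, not the answer's method, and the name `assign_groups` is hypothetical. The rows are the Company A and Company B rows from the insert statement.

```python
from datetime import date
from itertools import groupby

# (company, transaction_id, transaction_date), copied from the sample data.
transactions = [
    ("A", "t_0001", date(2014, 1, 1)),
    ("A", "t_0002", date(2014, 1, 2)),
    ("A", "t_0003", date(2014, 1, 4)),
    ("A", "t_0003", date(2014, 1, 10)),
    ("B", "t_0004", date(2014, 1, 2)),
    ("B", "t_0005", date(2014, 1, 2)),
]


def assign_groups(transactions, min_gap_days=3):
    """Number groups per company; a gap of min_gap_days or more starts a new one."""
    ordered = sorted(transactions, key=lambda r: (r[0], r[2], r[1]))
    out = []
    for company, company_rows in groupby(ordered, key=lambda r: r[0]):
        group_num, prev_date = 1, None
        for _, txn_id, txn_date in company_rows:
            # Equivalent of lag(...) over (partition by company order by date).
            if prev_date is not None and (txn_date - prev_date).days >= min_gap_days:
                group_num += 1
            out.append((company, txn_id, txn_date, group_num))
            prev_date = txn_date
    return out


grouped = assign_groups(transactions)
for row in grouped:
    print(row)
```

In SQL the same effect is usually obtained by flagging the rows where the gap is large and taking a running SUM of the flag within each company.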

Getting the first value from a list of values in SQL

I have the following table
ID Name Activity date total time
1 AB 1/10/2015 209
1 AB 1/11/2015 1234
1 AB 1/12/2015 10
2 CD 1/10/2015 2347
2 CD 1/11/2015 0
2 CD 1/12/2015 0
2 CD 1/13/2015 5
3 EF 1/10/2015 53
3 EF 1/11/2015 14
4 XY 1/11/2015 76
I need the following result from it
ID Name Activity date total time
1 AB 1/10/2015 209
2 CD 1/10/2015 2347
3 EF 1/10/2015 53
4 XY 1/11/2015 76
Basically, I need the first row for each name. I used the query below, but it returns a blank Name value:
SELECT distinct(Id),
FIRST_VALUE(Name) OVER (ORDER BY Id Asc) AS Name,
FIRST_VALUE(ActivityDate) OVER (ORDER BY Id Asc) AS Date,
FIRST_VALUE(TimeInQueue) OVER (ORDER BY Id Asc) AS Totaltime
FROM Historytable
GROUP BY Id,ActivityDate,Name
Use a window function:
SELECT ID,
NAME,
[Activity date],
[total time]
FROM (SELECT Row_number() OVER(partition BY Id ORDER BY [Activity date]) Rn,
ID,
NAME,
[Activity date],
[total time] from yourtable) A
WHERE rn = 1
Or find the minimum [Activity date] per ID and join the result back to the table on ID and [Activity date]:
SELECT a.ID,
a.NAME,
a.[Activity date],
a.[total time]
FROM yourtable A
JOIN (SELECT Min([Activity date]) [Activity date],
ID
FROM yourtable
GROUP BY ID) B
ON a.id = b.id
AND a.[Activity date] = b.[Activity date]
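The join-back approach can be tried on the question's data with an in-memory SQLite database. This is a sketch under assumptions: the table and column names (`history`, `activity_date`, `total_time`) are stand-ins, and the dates are rewritten as ISO strings (reading 1/10/2015 as 10 January 2015) so that text comparison follows date order.

```python
import sqlite3

hist_conn = sqlite3.connect(":memory:")
hist_conn.execute(
    "CREATE TABLE history (id INTEGER, name TEXT, activity_date TEXT, total_time INTEGER)"
)
hist_conn.executemany(
    "INSERT INTO history VALUES (?, ?, ?, ?)",
    [
        (1, "AB", "2015-01-10", 209), (1, "AB", "2015-01-11", 1234),
        (1, "AB", "2015-01-12", 10), (2, "CD", "2015-01-10", 2347),
        (2, "CD", "2015-01-11", 0), (2, "CD", "2015-01-12", 0),
        (2, "CD", "2015-01-13", 5), (3, "EF", "2015-01-10", 53),
        (3, "EF", "2015-01-11", 14), (4, "XY", "2015-01-11", 76),
    ],
)

# The derived table finds the earliest date per id (note the GROUP BY);
# joining back on id and date recovers the full first row.
first_rows = hist_conn.execute(
    """
    SELECT a.id, a.name, a.activity_date, a.total_time
    FROM history a
    JOIN (SELECT id, MIN(activity_date) AS activity_date
          FROM history
          GROUP BY id) b
      ON a.id = b.id AND a.activity_date = b.activity_date
    ORDER BY a.id
    """
).fetchall()
for row in first_rows:
    print(row)
```

If two rows for the same ID share the earliest date, this form returns both, whereas the ROW_NUMBER form keeps exactly one.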

SQL calculating sum based on another column

I have a table with the following data in it:
Account number Amount
13 40
34 30
14 30
13 60
14 10
I would like to know how I can write a query to return the following results
Account number Total amount
13 100
14 40
34 30
The query should calculate the sum of all of the amounts in the amount column that share the same account number.
Any help would be much appreciated!
Use Group By + SUM
SELECT [Account number],
SUM(Amount) As [Total Amount]
FROM dbo.Table1
GROUP BY [Account Number]
ORDER BY SUM(Amount) DESC
Please try:
select
[Account Number],
sum(Amount) Amount
from
YourTable
Group by [Account Number]
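For a quick check of the GROUP BY answer against the question's data, here is a small sketch using an in-memory SQLite database. The table name `accounts` and the column names are assumptions made for the example.

```python
import sqlite3

acct_conn = sqlite3.connect(":memory:")
acct_conn.execute("CREATE TABLE accounts (account_number INTEGER, amount INTEGER)")
# Rows copied from the question.
acct_conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)",
    [(13, 40), (34, 30), (14, 30), (13, 60), (14, 10)],
)

# One output row per account number, with the amounts summed.
account_totals = acct_conn.execute(
    """
    SELECT account_number, SUM(amount) AS total_amount
    FROM accounts
    GROUP BY account_number
    ORDER BY total_amount DESC
    """
).fetchall()
print(account_totals)
```

This returns the three accounts with totals 100, 40 and 30, as in the desired result.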