How to pick the 2 highest prices paid - sql

This is the input data:
Dept      | Company                | Code | Payment Amt
Gardeners | Sort:Insurance Carrier | 100  | 20.00
Gardeners | Sort:Insurance Carrier | 100  | 22.00
Gardeners | Sort:Insurance Carrier | 100  | 21.00
Gardeners | Sort:Insurance Carrier | 100  | 20.00
Gardeners | Sort:Insurance Carrier | 100  | 22.00
I want to return:
Sort:Insurance Carrier 100 - 22.00 and 21.00
and not 22.00 and 22.00. I fear my code is returning 22 and 22, which are arguably the two top prices paid, but not the two highest distinct prices.
I have this SQL:
SELECT
[DEPT], [Sort: Procedure Code] as Code, [Sort: Insurance Carrier],
SUM(CASE WHEN num = 1 THEN [Pmt Amount] ELSE 0 END) AS [first high],
SUM(CASE WHEN num = 2 THEN [Pmt Amount] ELSE 0 END) AS [second high]
FROM
(
SELECT ROW_NUMBER() OVER(PARTITION BY
[DEPT], [Sort: Procedure Code], [Sort: Insurance Carrier]
ORDER BY [Pmt Amount] DESC) AS num,
[DEPT], [Sort: Procedure Code], [Sort: Insurance Carrier],
[Pmt Amount]
FROM
[revenuedetail$]
) AS t
WHERE num IN (1, 2)
GROUP BY [DEPT], [Sort: Procedure Code], [Sort: Insurance Carrier]

If you want the same value to have the same number, then you should use dense_rank() instead of row_number(). But you are on the right track!
Also change sum() to max() to avoid summing the values with the same dense_rank().
Try this:
select
[dept]
, [Sort: Procedure Code] as Code
, [Sort: Insurance Carrier]
, max(case when num = 1 then [Pmt Amount] else 0 end) as [first high]
, max(case when num = 2 then [Pmt Amount] else 0 end) as [second high]
from (
select
dense_rank() over(
partition by [dept], [Sort: Procedure Code], [Sort: Insurance Carrier]
order by [Pmt Amount] desc
) as num
, [dept]
, [Sort: Procedure Code]
, [Sort: Insurance Carrier]
, [Pmt Amount]
from [revenuedetail$]
) as t
where num in (1, 2)
group by [dept], [Sort: Procedure Code], [Sort: Insurance Carrier]
rextester demo: http://rextester.com/PJCDDC90476
returns:
+-----------+------+-------------------------+------------+-------------+
| dept | Code | Sort: Insurance Carrier | first high | second high |
+-----------+------+-------------------------+------------+-------------+
| Gardeners | 100 | Sort:Insurance Carrier | 22.00 | 21.00 |
+-----------+------+-------------------------+------------+-------------+
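To see the difference between the two ranking functions on the sample rows, here is a minimal sketch that runs the same query shape in SQLite through Python's sqlite3 module. SQLite is only a stand-in for SQL Server so that the example is self-contained, the table name payments and its column names are hypothetical, and window functions need SQLite 3.25 or later.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE payments (dept TEXT, carrier TEXT, code INTEGER, amt REAL)"
)
conn.executemany(
    "INSERT INTO payments VALUES ('Gardeners', 'Carrier', 100, ?)",
    [(20.0,), (22.0,), (21.0,), (20.0,), (22.0,)],
)


def top_two(rank_fn):
    """Return (first high, second high) using the given ranking function."""
    sql = f"""
        SELECT MAX(CASE WHEN num = 1 THEN amt END) AS first_high,
               MAX(CASE WHEN num = 2 THEN amt END) AS second_high
        FROM (
            SELECT dept, code, carrier, amt,
                   {rank_fn}() OVER (PARTITION BY dept, code, carrier
                                     ORDER BY amt DESC) AS num
            FROM payments
        ) AS t
        WHERE num IN (1, 2)
        GROUP BY dept, code, carrier
    """
    return conn.execute(sql).fetchone()


by_row_number = top_two("row_number")
by_dense_rank = top_two("dense_rank")
print(by_row_number)  # ties get different numbers, so 22.0 shows up twice
print(by_dense_rank)  # ties share a rank, so the second value is 21.0
```

MAX is used in both cases so that several rows sharing a rank collapse to a single value instead of being added together.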

You seem to want dense_rank() rather than row_number():
SELECT [DEPT], [Sort: Procedure Code] as Code, [Sort: Insurance Carrier],
MAX(CASE WHEN num = 1 THEN [Pmt Amount] END) AS [first high],
MAX(CASE WHEN num = 2 THEN [Pmt Amount] END) AS [second high]
FROM (SELECT DENSE_RANK() OVER (PARTITION BY [DEPT], [Sort: Procedure Code], [Sort: Insurance Carrier]
ORDER BY [Pmt Amount] DESC
) AS num,
rd.*
FROM [revenuedetail$] rd
) rd
WHERE num IN (1, 2)
GROUP BY [DEPT], [Sort: Procedure Code], [Sort: Insurance Carrier];
Notes:
I removed the ELSE 0. If there is no second value, then this version returns NULL rather than 0. I find that more intuitive (add the ELSE 0 back if that is not the behavior you want).
I added more meaningful table aliases. rd makes more sense than t.
I used rd.* in the subquery. That actually shortens the query and makes it easier to modify.
You should reconsider your column names. All the square brackets just make the code harder to write and to read.

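If the SQL Server version is 2012 or later, LEAD() can be used. Be aware that TOP 1 returns a single row for the whole table rather than one per group, and that LEAD() picks the next row, so tied top values (22.00 and 22.00 here) still come back as both highs: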
select Top 1 [DEPT], [Sort: Procedure Code], [Sort: Insurance Carrier],
[Pmt Amount] AS [first high],
Lead([Pmt Amount],1)over(partition by [DEPT], [Sort: Procedure Code],
[Sort: Insurance Carrier] ORDER BY [Pmt Amount] DESC)AS [Second high]
from [revenuedetail$] order by [Pmt Amount] desc

Related

Incorrect 4th & 1st Quarter Sales Values in query

I've been writing a query to group sales by year with other columns containing quarterly sales, growth per quarter in percentage, quarter on quarter change in quarterly sales and total annual sales in the last column from the .
I ran the following query:
WITH Sales_By_Quarter AS
(
SELECT
DATEPART(YEAR, OrderDate) AS [Year],
DATEPART(QUARTER, OrderDate) AS [Quarter],
SUM(TotalDue) AS [Quarterly Sales],
SUM(TotalDue) - LAG(SUM(TotalDue)) OVER (PARTITION BY DATEPART(YEAR, OrderDate) ORDER BY DATEPART(QUARTER, OrderDate)) AS [Change]
FROM Sales.SalesOrderHeader
GROUP BY DATEPART(YEAR, OrderDate), DATEPART(QUARTER, OrderDate)
),
Annual_Sales AS
(
SELECT
[Year],
SUM([Quarterly Sales]) AS [Total Annual Sales],
SUM([Quarterly Sales]) - LAG(SUM([Quarterly Sales])) OVER (ORDER BY [Year]) AS [Annual Growth]
FROM Sales_By_Quarter
GROUP BY [Year]
)
-- SELECT * FROM Annual_Sales;
SELECT
Sales_By_Quarter.[Year],
SUM(CASE WHEN Sales_By_Quarter.[Quarter] = 1 THEN Sales_By_Quarter.[Quarterly Sales] END) AS [Q1],
SUM(CASE WHEN Sales_By_Quarter.[Quarter] = 1 THEN (Sales_By_Quarter.[Quarterly Sales]/Annual_Sales.[Total Annual Sales]*100) END) as [Annual %],
SUM(CASE WHEN Sales_By_Quarter.[Quarter] = 1 THEN Sales_By_Quarter.[Quarterly Sales] END) AS [4 to 1],
SUM(CASE WHEN Sales_By_Quarter.[Quarter] = 2 THEN Sales_By_Quarter.[Quarterly Sales] END) AS [Q2],
SUM(CASE WHEN Sales_By_Quarter.[Quarter] = 2 THEN Sales_By_Quarter.[Quarterly Sales]/Annual_Sales.[Total Annual Sales]*100 END) as [Annual %],
SUM(CASE WHEN Sales_By_Quarter.[Quarter] = 2 THEN Sales_By_Quarter.[Change] END) AS [1 to 2],
SUM(CASE WHEN Sales_By_Quarter.[Quarter] = 3 THEN Sales_By_Quarter.[Quarterly Sales] END) AS [Q3],
SUM(CASE WHEN Sales_By_Quarter.[Quarter] = 3 THEN Sales_By_Quarter.[Quarterly Sales]/Annual_Sales.[Total Annual Sales]*100 END) as [Annual %],
SUM(CASE WHEN Sales_By_Quarter.[Quarter] = 3 THEN Sales_By_Quarter.[Change] END) AS [2 to 3],
SUM(CASE WHEN Sales_By_Quarter.[Quarter] = 4 THEN Sales_By_Quarter.[Quarterly Sales] END) AS [Q4],
SUM(CASE WHEN Sales_By_Quarter.[Quarter] = 4 THEN Sales_By_Quarter.[Quarterly Sales]/Annual_Sales.[Total Annual Sales]*100 END) as [Annual %],
SUM(CASE WHEN Sales_By_Quarter.[Quarter] = 4 THEN Sales_By_Quarter.[Change] END) AS [3 to 4],
Annual_Sales.[Total Annual Sales]
FROM Sales_By_Quarter
JOIN Annual_Sales ON Sales_By_Quarter.[Year] = Annual_Sales.[Year]
GROUP BY Sales_By_Quarter.[Year], Annual_Sales.[Total Annual Sales], Annual_Sales.[Annual Growth]
ORDER BY Sales_By_Quarter.[Year];
I am getting the right values in all columns except the [4 to 1] column. I need some help fixing this query.
Fixed the definition of the quarterly change by ordering by year and quarter instead of partitioning by year, so that LAG() can reach back from Q1 to the previous year's Q4.
Removed the unnecessary CTE (and the join on it) for yearly figures.
Corrected the column being pivoted for [4 to 1].
Ensured all columns have unique names.
WITH
Sales_By_Quarter AS
(
SELECT
DATEPART(YEAR, OrderDate) AS [Year],
DATEPART(QUARTER, OrderDate) AS [Quarter],
SUM(TotalDue) AS [Quarterly Sales],
SUM(TotalDue) - LAG(SUM(TotalDue)) OVER (ORDER BY DATEPART(YEAR, OrderDate), DATEPART(QUARTER, OrderDate)) AS [Change]
FROM
Sales.SalesOrderHeader
GROUP BY
DATEPART(YEAR, OrderDate),
DATEPART(QUARTER, OrderDate)
)
SELECT
Q.[Year],
SUM(CASE WHEN Q.[Quarter] = 1 THEN Q.[Quarterly Sales] END) AS [Q1],
SUM(CASE WHEN Q.[Quarter] = 1 THEN Q.[Quarterly Sales] END) * 100.0 / SUM(Q.[Quarterly Sales]) AS [Q1 Annual %],
SUM(CASE WHEN Q.[Quarter] = 1 THEN Q.[Change] END) AS [4 to 1],
SUM(CASE WHEN Q.[Quarter] = 2 THEN Q.[Quarterly Sales] END) AS [Q2],
SUM(CASE WHEN Q.[Quarter] = 2 THEN Q.[Quarterly Sales] END) * 100.0 / SUM(Q.[Quarterly Sales]) AS [Q2 Annual %],
SUM(CASE WHEN Q.[Quarter] = 2 THEN Q.[Change] END) AS [1 to 2],
SUM(CASE WHEN Q.[Quarter] = 3 THEN Q.[Quarterly Sales] END) AS [Q3],
SUM(CASE WHEN Q.[Quarter] = 3 THEN Q.[Quarterly Sales] END) * 100.0 / SUM(Q.[Quarterly Sales]) AS [Q3 Annual %],
SUM(CASE WHEN Q.[Quarter] = 3 THEN Q.[Change] END) AS [2 to 3],
SUM(CASE WHEN Q.[Quarter] = 4 THEN Q.[Quarterly Sales] END) AS [Q4],
SUM(CASE WHEN Q.[Quarter] = 4 THEN Q.[Quarterly Sales] END) * 100.0 / SUM(Q.[Quarterly Sales]) AS [Q4 Annual %],
SUM(CASE WHEN Q.[Quarter] = 4 THEN Q.[Change] END) AS [3 to 4],
SUM(Q.[Quarterly Sales]) AS [Total Annual Sales],
SUM(Q.[Quarterly Sales]) - LAG(SUM(Q.[Quarterly Sales])) OVER (ORDER BY Q.[Year]) AS [Annual Growth]
FROM
Sales_By_Quarter As Q
GROUP BY
Q.[Year]
ORDER BY
Q.[Year]
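The effect of dropping the PARTITION BY can be checked on a tiny data set. The following is a sketch only: it uses SQLite through Python's sqlite3 module instead of SQL Server, and the table quarterly, its columns and its figures are made up for illustration.

```python
import sqlite3

# Hypothetical quarterly totals spanning a year boundary.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE quarterly (yr INTEGER, qtr INTEGER, sales INTEGER)")
conn.executemany(
    "INSERT INTO quarterly VALUES (?, ?, ?)",
    [(2012, 3, 100), (2012, 4, 150), (2013, 1, 120), (2013, 2, 130)],
)

lag_rows = conn.execute(
    """
    SELECT yr, qtr,
           -- LAG restarts in every year, so Q1 has nothing to compare with
           sales - LAG(sales) OVER (PARTITION BY yr ORDER BY qtr) AS per_year,
           -- LAG runs over one continuous timeline, so Q1 sees last year's Q4
           sales - LAG(sales) OVER (ORDER BY yr, qtr) AS across_years
    FROM quarterly
    ORDER BY yr, qtr
    """
).fetchall()

for row in lag_rows:
    print(row)
```

For the 2013 Q1 row the partitioned version yields NULL (None in Python), while the continuous version yields 120 - 150 = -30, which is the [4 to 1] value the question was missing.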

Display Percentage of grouped SUM to total SUM

I currently have results like
total sales | total cost | total profit | department
----------------------------------------------------
100 50 50 A
80 20 60 B
250 120 130 C
Using columns from tables
Invoice_Itemized
itemnum | costper | priceper | quantity | invoice_number
--------------------------------------------------------
Invoice_Totals
invoice_number | datetime
---------------------------
Inventory
itemnum | dept_id
------------------
Departments
dept_id | description
----------------------
with the following code
select sum(invoice_itemized.priceper* invoice_itemized.quantity) as "Total Sales",
sum(invoice_itemized.quantity*inventory.cost) as "Total Cost",
sum(invoice_itemized.priceper* invoice_itemized.quantity)-
sum(invoice_itemized.quantity*inventory.cost) as "Total Profit",
departments.description as Department
from invoice_itemized, invoice_totals, inventory, departments
where invoice_itemized.invoice_number=invoice_totals.invoice_number
and year(invoice_totals.datetime)=2018 and month(invoice_totals.datetime)=10
and inventory.itemnum=invoice_itemized.itemnum
and inventory.dept_id=departments.dept_id
and departments.description<>'shop use'
and departments.description<>'none'
and departments.description<>'ingredients'
group by departments.description
order by "total profit" desc
I would like results like
total sales | total cost | total profit | percentage total profit | department
-------------------------------------------------------------------------------
100 50 50 20.83 A
80 20 60 25 B
250 120 130 54.17 C
The problem I encounter is that I'm trying to divide the grouped result of a SUM - SUM by the total of the same SUM - SUM. I've tried something similar to the suggestion made in
Percentage from Total SUM after GROUP BY SQL Server
but that didn't seem to work for me. I was getting binding errors. Any suggestions?
You can do this with window functions:
with t as (
<your query here>
)
select t.*,
t.[Total Profit] * 100.0 / sum(t.[Total Profit]) over () as profit_percentage
from t;
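As a rough check of the window-function approach, the sketch below reproduces the percentages from the question (20.83, 25 and 54.17). It runs in SQLite through Python's sqlite3 module rather than SQL Server, and the table line_items with its two columns is a hypothetical simplification of the four joined tables.

```python
import sqlite3

# Hypothetical per-line profits that add up to 50, 60 and 130 per department.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE line_items (department TEXT, profit REAL)")
conn.executemany(
    "INSERT INTO line_items VALUES (?, ?)",
    [("A", 20), ("A", 30), ("B", 60), ("C", 100), ("C", 30)],
)

pct_rows = conn.execute(
    """
    WITH t AS (
        SELECT department, SUM(profit) AS total_profit
        FROM line_items
        GROUP BY department
    )
    SELECT department,
           total_profit,
           -- SUM(...) OVER () is the grand total, repeated on every row
           ROUND(total_profit * 100.0 / SUM(total_profit) OVER (), 2) AS pct
    FROM t
    ORDER BY department
    """
).fetchall()

for row in pct_rows:
    print(row)
```

Because the empty OVER () clause spans the whole result of the CTE, the grouped query only has to be written once.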
This should work:
Select q.[Total Sales],
q.[Total Cost],
q.[Total Profit],
q.[Total Profit] * 100.0 / q1.[Total Profit] as [Percentage Total Profit],
q.Department
from (
select sum(invoice_itemized.priceper* invoice_itemized.quantity) as [Total Sales],
sum(invoice_itemized.quantity*inventory.cost) as [Total Cost],
sum(invoice_itemized.priceper* invoice_itemized.quantity) - sum(invoice_itemized.quantity*inventory.cost) as [Total Profit],
departments.description as Department
from invoice_itemized, invoice_totals, inventory, departments
where invoice_itemized.invoice_number=invoice_totals.invoice_number
and year(invoice_totals.datetime)=2018 and month(invoice_totals.datetime)=10
and inventory.itemnum=invoice_itemized.itemnum
and inventory.dept_id=departments.dept_id
and departments.description<>'shop use'
and departments.description<>'none'
and departments.description<>'ingredients'
group by departments.description) q
join (
select sum(t.[Total Profit]) as [Total Profit]
from (select sum(invoice_itemized.priceper* invoice_itemized.quantity) as [Total Sales],
sum(invoice_itemized.quantity*inventory.cost) as [Total Cost],
sum(invoice_itemized.priceper* invoice_itemized.quantity) - sum(invoice_itemized.quantity*inventory.cost) as [Total Profit],
departments.description as Department
from invoice_itemized, invoice_totals, inventory, departments
where invoice_itemized.invoice_number=invoice_totals.invoice_number
and year(invoice_totals.datetime)=2018 and month(invoice_totals.datetime)=10
and inventory.itemnum=invoice_itemized.itemnum
and inventory.dept_id=departments.dept_id
and departments.description<>'shop use'
and departments.description<>'none'
and departments.description<>'ingredients'
group by departments.description) t
) q1 on 1 = 1 -- q1 has a single row, so this attaches the grand total to every department
order by q.[Total Profit] desc

How to pivot on multiple columns?

I'm trying to pivot on multiple columns and I'm using SQL Server 2014, however, I cannot figure out how to do that. Here's what I've tried so far:
CREATE TABLE #Table (
Name NVARCHAR(MAX),
TypeId INT,
TotalOrders INT,
GrandTotal MONEY
)
INSERT INTO #Table
(Name, TypeId, TotalOrders, GrandTotal)
VALUES
('This Month', 1, 10, 1),
('This Month', 2, 5, 7),
('This Week', 1, 8, 3),
('Last Week', 1, 8, 12),
('Yesterday', 1, 10, 1),
('Yesterday', 2, 1, 5)
Which produces the following result:
Name TypeId TotalOrders GrandTotal
-------------------------------- ----------- ----------- ---------------------
This Month 1 10 1.00
This Month 2 5 7.00
This Week 1 8 3.00
Last Week 1 8 12.00
Yesterday 1 10 1.00
Yesterday 2 1 5.00
To bring those rows into columns, I've tried this:
SELECT
TypeId,
ISNULL([Yesterday], 0) AS YesterdayTotalOrders,
ISNULL([This Week], 0) AS ThisWeekTotalOrders,
ISNULL([Last Week], 0) AS LastWeekTotalOrders,
ISNULL([This Month], 0) AS ThisMonthTotalOrders
FROM
(SELECT Name, TypeId, TotalOrders FROM #Table) AS src
PIVOT (
SUM(TotalOrders) FOR Name IN (
[Yesterday],
[This Week],
[Last Week],
[This Month]
)
) AS p1
Which produces the following result set:
TypeId YesterdayTotalOrders ThisWeekTotalOrders LastWeekTotalOrders ThisMonthTotalOrders
----------- -------------------- ------------------- ------------------- --------------------
1 10 8 8 10
2 1 0 0 5
Now I need a few more columns for GrandTotal, such as YesterdayGrandTotal, ThisWeekGrandTotal, and so on, but I can't figure out how to achieve this.
Any help would be highly appreciated.
UPDATE#1: Here's the expected result set:
TypeId YesterdayTotalOrders ThisWeekTotalOrders LastWeekTotalOrders ThisMonthTotalOrders YesterdayGrandTotal ThisWeekGrandTotal LastWeekGrandTotal ThisMonthGrandTotal
----------- -------------------- ------------------- ------------------- -------------------- --------------------- --------------------- --------------------- ---------------------
1 10 8 8 10 1.00 3.00 12.00 1.00
2 1 0 0 5 5.00 0.00 0.00 7.00
Conditional aggregation may be a solution:
select typeID,
SUM(case when name = 'Yesterday' then totalOrders else 0 end) as YesterdayTotalOrders,
SUM(case when name = 'This Week' then totalOrders else 0 end) as ThisWeekTotalOrders,
SUM(case when name = 'Last Week' then totalOrders else 0 end) as LastWeekTotalOrders,
SUM(case when name = 'This Month' then totalOrders else 0 end) as ThisMonthTotalOrders,
SUM(case when name = 'Yesterday' then GrandTotal else 0 end) as YesterdayGrandTotal,
SUM(case when name = 'This Week' then GrandTotal else 0 end) as ThisWeekGrandTotal,
SUM(case when name = 'Last Week' then GrandTotal else 0 end) as LastWeekGrandTotal,
SUM(case when name = 'This Month' then GrandTotal else 0 end) as ThisMonthGrandTotal
from #table
group by typeID
Or you can use CROSS APPLY and PIVOT like this:
SELECT
TypeId,
ISNULL([Yesterday], 0) AS YesterdayTotalOrders,
ISNULL([This Week], 0) AS ThisWeekTotalOrders,
ISNULL([Last Week], 0) AS LastWeekTotalOrders,
ISNULL([This Month], 0) AS ThisMonthTotalOrders,
ISNULL([grant Yesterday], 0) AS YesterdayGrandTotal,
ISNULL([grant This Week], 0) AS ThisWeekGrandTotal,
ISNULL([grant Last Week], 0) AS LastWeekGrandTotal,
ISNULL([grant This Month], 0) AS ThisMonthGrandTotal
FROM
(
SELECT t.*
FROM #Table
CROSS APPLY (values(Name, TypeId, TotalOrders),
('grant ' + Name, TypeId, GrandTotal))
t(Name, TypeId, TotalOrders)
) AS src
PIVOT (
SUM(TotalOrders) FOR Name IN (
[Yesterday],
[This Week],
[Last Week],
[This Month],
[grant Yesterday],
[grant This Week],
[grant Last Week],
[grant This Month]
)
) AS p1
Both solutions will scan the input table just once and they have a very similar query plan. Both solutions are better than JOIN of two pivots (the solution that I have originally provided) since two pivots need to scan the input table twice.
You could also use CTEs to separate the pivots: one pivot for TotalOrders and one for GrandTotal.
;WITH CTE AS
(
SELECT
P1.TypeId,
ISNULL(P1.[Yesterday], 0) AS YesterdayTotalOrders,
ISNULL(P1.[This Week], 0) AS ThisWeekTotalOrders,
ISNULL(P1.[Last Week], 0) AS LastWeekTotalOrders,
ISNULL(P1.[This Month], 0) AS ThisMonthTotalOrders
FROM
(SELECT Name, TypeId, TotalOrders FROM #Table) AS src
PIVOT (SUM(TotalOrders) FOR src.Name IN (
[Yesterday],
[This Week],
[Last Week],
[This Month])) AS P1
), CTE1 AS
(
SELECT
P1.TypeId,
ISNULL(P1.[Yesterday], 0) AS YesterdayGrandTotal,
ISNULL(P1.[This Week], 0) AS ThisWeekTGrandTotal,
ISNULL(P1.[Last Week], 0) AS LastWeekGrandTotal,
ISNULL(P1.[This Month], 0) AS ThisMonthGrandTotal
FROM
(SELECT Name, TypeId, GrandTotal FROM #Table) AS src
PIVOT (SUM(GrandTotal) FOR src.Name IN (
[Yesterday],
[This Week],
[Last Week],
[This Month])) AS P1
)
SELECT C.TypeId , C.YesterdayTotalOrders, C.ThisWeekTotalOrders, C.LastWeekTotalOrders, C.ThisMonthTotalOrders , C1.YesterdayGrandTotal , C1.ThisWeekTGrandTotal ,C1.LastWeekGrandTotal , C1.ThisMonthGrandTotal FROM CTE C
INNER JOIN CTE1 C1 ON C1.TypeId = C.TypeId
Result :
TypeId YesterdayTotalOrders ThisWeekTotalOrders LastWeekTotalOrders ThisMonthTotalOrders YesterdayGrandTotal ThisWeekGrandTotal LastWeekGrandTotal ThisMonthGrandTotal
----------- -------------------- ------------------- ------------------- -------------------- --------------------- --------------------- --------------------- ---------------------
1 10 8 8 10 1.00 3.00 12.00 1.00
2 1 0 0 5 5.00 0.00 0.00 7.00
In Oracle I would do something like this; it works fine for me:
select * from (
SELECT Name, TypeId,TotalOrders
--here
,GrandTotal
FROM test_b )
PIVOT ( SUM(TotalOrders)TotalOrders,
--here
SUM(grandtotal) grandtotal FOR Name IN
('Yesterday'Yesterday,'This Week'ThisWeek,'Last Week'LastWeek,'This Month'ThisMonth )
) ;
So try something similar in SQL Server. Note that T-SQL's PIVOT accepts only a single aggregate and no aliases in the IN list, so the query below is not expected to run there unchanged:
SELECT *
/* TypeId,
ISNULL([Yesterday], 0) AS YesterdayTotalOrders,
ISNULL([This Week], 0) AS ThisWeekTotalOrders,
ISNULL([Last Week], 0) AS LastWeekTotalOrders,
ISNULL([This Month], 0) AS ThisMonthTotalOrders*/
FROM
(SELECT Name, TypeId, TotalOrders
,grandtotal
FROM #Table) AS src
PIVOT (
SUM(TotalOrders)TotalOrders, SUM(grandtotal)grandtotal FOR Name IN (
[Yesterday]Yesterday,
[This Week]ThisWeek,
[Last Week]LastWeek,
[This Month]ThisMonth
)
) AS p1

How to produce such results using SQL

I edited my question as it seems like people misunderstood what I wanted.
I have a table which has the following columns:
Company
Transaction ID
Transaction Date
The result I want is:
| COMPANY | Transaction ID |Transaction Date | GROUP
|---------------------|------------------|------------------|----------
| Company A | t_0001 | 01-01-2014 | 1
| Company A | t_0002 | 02-01-2014 | 1
| Company A | t_0003 | 04-01-2014 | 1
| Company A | t_0003 | 10-01-2014 | 2
| Company B | t_0004 | 02-01-2014 | 1
| Company B | t_0005 | 02-01-2014 | 1
| Company C | t_0006 | 03-01-2014 | 1
| Company C | t_0007 | 05-01-2014 | 2
where the transactions and dates are first grouped by company. Within each company, the transactions are sorted from earliest to latest. The transactions are then checked row by row to see whether the previous transaction was performed less than 3 days earlier, in a moving window.
For example, t_0002 and t_0001 are less than 3 days apart, so they fall under group 1. t_0003 and t_0002 are less than 3 days apart, so t_0003 also falls under group 1, even though t_0003 and t_0001 are >= 3 days apart.
I figured the way to do this is to group the data by company first and then sort the transactions by date, but I got stuck after that. What methods could I use to produce these results? Any help?
P.S. I am using SQL Server 2014.
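I have determined the difference in days between consecutive transactions within each company, ordered by transaction ID. If the difference is small enough (the code below uses a cutoff of 2 days), the row goes to group 1; otherwise it goes to group 2. Alter the LAG clause and the cutoff to suit your requirement.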
select *,isnull(
case when datediff(day,
lag([Transaction Date]) over(partition by company order by [transaction id]),[Transaction Date])>=2
then
2
end ,1)group1
from #Table1
If you don't care about the numbering in groups, use
select *,
dense_rank() over(partition by company order by transaction_date) -
(select count(distinct transaction_date) from t
where t1.company=company
and datediff(dd,transaction_date,t1.transaction_date) between 1 and 2) grp
from t t1
order by 1,3
If continuous numbers are needed for groups, use
select company,transaction_id,transaction_date,
dense_rank() over(partition by company order by grp) grp
from (select *, dense_rank() over(partition by company order by transaction_date) -
(select count(distinct transaction_date) from t
where t1.company=company
and datediff(dd,transaction_date,t1.transaction_date) between 1 and 2) grp
from t t1
) x
order by 1,3
create table xn (
[Company] char(1),
[Transaction ID] char(6),
[Transaction Date] date,
primary key ([Company], [Transaction ID], [Transaction Date])
);
insert into xn values
('A', 't_0001', '2014-01-01'),
('A', 't_0002', '2014-01-02'),
('A', 't_0003', '2014-01-04'),
('A', 't_0003', '2014-01-10'),
('B', 't_0004', '2014-01-02'),
('B', 't_0005', '2014-01-02'),
('C', 't_0006', '2014-01-03'),
('C', 't_0007', '2014-01-05');
Each query builds on the one before. There are more concise ways to write queries like this, but I think this way helps when you're learning window functions like lag(...) over (...).
The first one here brings the previous transaction date into the "current" row.
select
[Company],
[Transaction ID],
[Transaction Date],
lag ([Transaction Date]) over (partition by [Company] order by [Transaction Date]) as [Prev Transaction Date]
from xn
This query determines the number of days between the "current" transaction date and the previous transaction date.
select
[Company],
[Transaction ID],
[Transaction Date],
[Prev Transaction Date],
DateDiff(d, [Prev Transaction Date], [Transaction Date]) as [Days Between]
from (select
[Company],
[Transaction ID],
[Transaction Date],
lag ([Transaction Date]) over (partition by [Company] order by [Transaction Date]) as [Prev Transaction Date]
from xn) x
This does the grouping based on the number of days.
select
[Company],
[Transaction ID],
[Transaction Date],
case when [Days Between] between 0 and 3 then 1
when [Days Between] is null then 1
when [Days Between] > 3 then 2
else null -- not expected to be reached; NULL keeps the CASE result an integer
end as [Group Num]
from (
select
[Company],
[Transaction ID],
[Transaction Date],
[Prev Transaction Date],
DateDiff(d, [Prev Transaction Date], [Transaction Date]) as [Days Between]
from (select
[Company],
[Transaction ID],
[Transaction Date],
lag ([Transaction Date]) over (partition by [Company] order by [Transaction Date]) as [Prev Transaction Date]
from xn) x
) y;
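The CASE expression above only ever produces 1 or 2. If a company can have more than two runs of transactions, one possible extension is to flag each row that starts a new run and take a running total of the flags. The sketch below is an assumption-laden illustration, not the answer's method: it runs in SQLite through Python's sqlite3 module, renames the columns to company, txn_id and txn_date, uses julianday() in place of DATEDIFF, and follows the rule as written in the question (a gap of 3 or more days starts a new group), which does not match every row of the question's expected output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE xn (company TEXT, txn_id TEXT, txn_date TEXT)")
conn.executemany(
    "INSERT INTO xn VALUES (?, ?, ?)",
    [
        ("A", "t_0001", "2014-01-01"),
        ("A", "t_0002", "2014-01-02"),
        ("A", "t_0003", "2014-01-04"),
        ("A", "t_0003", "2014-01-10"),
        ("B", "t_0004", "2014-01-02"),
        ("B", "t_0005", "2014-01-02"),
        ("C", "t_0006", "2014-01-03"),
        ("C", "t_0007", "2014-01-05"),
    ],
)

group_rows = conn.execute(
    """
    SELECT company, txn_id, txn_date,
           -- running count of breaks so far, plus 1, gives the group number
           1 + SUM(is_break) OVER (
                   PARTITION BY company
                   ORDER BY txn_date, txn_id
                   ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS grp
    FROM (
        SELECT company, txn_id, txn_date,
               -- 1 when the gap to the previous transaction is 3+ days;
               -- the first row of a company has no predecessor and gets 0
               CASE WHEN julianday(txn_date) - julianday(
                             LAG(txn_date) OVER (PARTITION BY company
                                                 ORDER BY txn_date, txn_id)) >= 3
                    THEN 1 ELSE 0 END AS is_break
        FROM xn
    ) AS flagged
    ORDER BY company, txn_date, txn_id
    """
).fetchall()

for row in group_rows:
    print(row)
```

With this rule the 2014-01-10 row of company A is the only one that opens a second group; lowering the cutoff changes where the breaks fall.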

Identify Continuous Periods of Time

Today my issue has to do with marking continuous periods of time where a given criteria is met. My raw data of interest looks like this.
Salesman ID Pay Period ID Total Commissionable Sales (US dollars)
1 101 525
1 102 473
1 103 672
1 104 766
2 101 630
2 101 625
.....
I want to mark continuous periods of time where a salesman has achieved $500 of sales or more. My ideal result should look like this.
[Salesman ID] [Start time] [End time] [# Periods] [Average Sales]
1 101 101 1 525
1 103 107 5 621
2 101 103 3 635
3 104 106 3 538
I know how to do everything else, but I cannot figure out an inexpensive way to identify start and end dates. Help!
Try something like this. The innermost select-statement basically adds a new column to the original table with a flag determining when a new group begins. Outside this statement, we use this flag in a running total, that then enumerates the groups - we call this column [Group ID]. All that is left, is then to filter out the rows where [Sales] < 500, and group by [Salesman ID] and [Group ID].
SELECT [Salesman ID], MIN([Pay Period ID]) AS [Start time],
MAX([Pay Period ID]) AS [End time], COUNT(*) AS [# of periods],
AVG([Sales]) AS [Average Sales]
FROM (
SELECT [Salesman ID], [Pay Period ID], [Sales],
SUM(NewGroup) OVER (PARTITION BY [Salesman ID] ORDER BY [Pay Period ID]
ROWS UNBOUNDED PRECEDING) AS [Group ID]
FROM (
SELECT T1.*,
CASE WHEN T1.[Sales] >= 500 AND (Prev.[Sales] < 500 OR Prev.[Sales] IS NULL)
THEN 1 ELSE 0 END AS [NewGroup]
FROM MyTable T1
LEFT JOIN MyTable Prev ON Prev.[Salesman ID] = T1.[Salesman ID]
AND Prev.[Pay Period ID] = T1.[Pay Period ID] - 1
) AS InnerQ
) AS MiddleQ
WHERE [Sales] >= 500
GROUP BY [Salesman ID], [Group ID]
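The same flag-and-running-total idea can also be written with LAG() instead of the self-join. The following is a sketch under stated assumptions: it runs in SQLite through Python's sqlite3 module rather than SQL Server, and the table sales, its column names and the second salesman's rows are hypothetical. Unlike the self-join on [Pay Period ID] - 1, LAG() looks at the previous existing row, so a missing pay period would not break a streak here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (salesman_id INTEGER, period_id INTEGER, sales INTEGER)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        (1, 101, 525), (1, 102, 473), (1, 103, 672), (1, 104, 766),
        (2, 101, 630), (2, 102, 625),
    ],
)

streaks = conn.execute(
    """
    SELECT salesman_id,
           MIN(period_id) AS start_period,
           MAX(period_id) AS end_period,
           COUNT(*)       AS periods,
           AVG(sales)     AS average_sales
    FROM (
        SELECT salesman_id, period_id, sales,
               -- running total of the flags numbers the streaks
               SUM(new_group) OVER (
                   PARTITION BY salesman_id
                   ORDER BY period_id
                   ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS group_id
        FROM (
            SELECT salesman_id, period_id, sales,
                   -- a streak starts when this row qualifies and the previous
                   -- row (or the absence of one) does not
                   CASE WHEN sales >= 500
                         AND COALESCE(LAG(sales) OVER (PARTITION BY salesman_id
                                                       ORDER BY period_id), 0) < 500
                        THEN 1 ELSE 0 END AS new_group
            FROM sales
        ) AS flagged
    ) AS numbered
    WHERE sales >= 500
    GROUP BY salesman_id, group_id
    ORDER BY salesman_id, start_period
    """
).fetchall()

for row in streaks:
    print(row)
```

Salesman 1 ends up with two streaks (period 101 alone, then 103 to 104) because period 102 falls below 500, and salesman 2 has a single streak covering both periods.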