SQL statement using WHERE from a GROUP or RANK - sql

I have a sales snapshot with about 35,000 rows. Let's call the columns:
Sales Rep | Account ID | Total Contract Value | Date
I need to group everything by Sales Rep and then from there, select that Sales Rep's top 35 accounts based off of Total Contract Value where the Total Contract Value is >= $10,000 for the Month (Date) of January 2013.
So for example, say John Doe had 294 accounts in this table from January; I only want to see his top 35 accounts >= $10,000, and the same for Jane Doe, etc. It's very important that the query be as efficient in its resource usage as possible.
Thoughts?

The answer is already in your title: partition by Sales Rep and rank by Total Contract Value.
A SQL Server solution will look like:
DECLARE @minimumValue decimal(20,2) = 10000
DECLARE @numberOfAccounts int = 35
DECLARE @from datetime = '1/1/2013'
DECLARE @till datetime = DATEADD(MONTH, 1, @from)
SELECT
    [sub].[Sales Rep],
    [sub].[Rank],
    [sub].[Account ID],
    [sub].[Total Contract Value]
FROM
(
    SELECT
        [Sales Rep],
        [Account ID],
        [Total Contract Value],
        DENSE_RANK() OVER (PARTITION BY [Sales Rep] ORDER BY [Total Contract Value] DESC) AS [Rank]
    FROM [Sales]
    WHERE
        [Total Contract Value] >= @minimumValue
        AND [Date] >= @from -- half-open range: includes the first day of the month
        AND [Date] < @till
) AS [sub]
WHERE [sub].[Rank] <= @numberOfAccounts
ORDER BY
    [Sales Rep] ASC,
    [Rank] ASC
Here is a (simple) Sql Fiddle.

For this, you want to use a function called row_number():
select ss.*
from (select ss.*, row_number() over (partition by salesrep order by TotalContractValue desc) as seqnum
from snapshot ss
where TotalContractValue >= 10000 and date between '2013-01-01' and '2013-01-31'
) ss
where seqnum <= 35
You don't specify the database you are using. In databases that don't have row_number(), there are alternatives that are less efficient.
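The two answers above differ in one detail that matters when contract values tie: ROW_NUMBER() always returns at most N rows per rep, while DENSE_RANK() can return more. The following is a minimal sketch of that difference, assuming SQLite 3.25 or later (for window functions) driven from Python; the table name, column names, and sample rows are made up for illustration, and the limit is 2 rather than 35 to keep the data small.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Hypothetical miniature of the sales snapshot: one rep, five accounts.
con.execute("CREATE TABLE sales (rep TEXT, account INTEGER, value INTEGER)")
con.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("John", 1, 50000), ("John", 2, 30000), ("John", 3, 30000),
     ("John", 4, 20000), ("John", 5, 5000)],
)

# Filter first, then rank inside each rep's partition both ways.
ranked = """
    SELECT account,
           ROW_NUMBER() OVER (PARTITION BY rep ORDER BY value DESC, account) AS row_num,
           DENSE_RANK() OVER (PARTITION BY rep ORDER BY value DESC) AS dense_rnk
    FROM sales
    WHERE value >= 10000
"""

by_row_number = [r[0] for r in con.execute(
    f"SELECT account FROM ({ranked}) WHERE row_num <= 2 ORDER BY account")]
by_dense_rank = [r[0] for r in con.execute(
    f"SELECT account FROM ({ranked}) WHERE dense_rnk <= 2 ORDER BY account")]

print(by_row_number)  # [1, 2]
print(by_dense_rank)  # [1, 2, 3]  (accounts 2 and 3 tie for rank 2)
```

If exactly 35 rows per rep are required, ROW_NUMBER() with a tie-breaking column in the ORDER BY is the safer choice; DENSE_RANK() is appropriate when tied accounts should all be shown.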

Related

Returning First/Last Date and PIVOT

I am trying to put a view together in SQL that uses transactional type data to find the first transaction date, last transaction date, and whether it was a credit or debit. This is what it looks like currently:
Account Number | Date | Credit/Debit
123 | 1-1-22 | Debit
123 | 1-2-22 | Credit
456 | 1-1-22 | Debit
456 | 1-2-22 | Credit
I want it to look like this:
Account Number | FirstDate | LastDate | First Credit/Debit | Last Credit/Debit
123 | 1-1-22 | 1-2-22 | Debit | Credit
456 | 1-1-22 | 1-2-22 | Debit | Credit
I have created something close with the following code, but am having trouble figuring out how to bring in the First/Last Credit/Debit columns.
SELECT * FROM
(
SELECT * FROM
(
SELECT 'Earliest' as [TransDate], [Account], [Date], [Credit/Debit],
ROW_NUMBER() OVER (PARTITION BY [Account] ORDER BY [Date]) as rn
FROM DataTable
) e
WHERE e.rn = 1
UNION ALL
SELECT * FROM
(
SELECT 'Latest' as [TransDate], [Account], [Date], [Credit/Debit],
ROW_NUMBER() OVER (PARTITION BY [Account] ORDER BY [Date] DESC) as rn
FROM DataTable
) l
WHERE l.rn = 1
) t1
PIVOT (min([Date]) FOR [TransDate] in ([Latest], [Earliest])
) P
You can use ROW_NUMBER() function to get the first and the last row for each account number
;with
dd as (
select *,
ROW_NUMBER() over (partition by [Account Number] order by [Date]) rFirst,
ROW_NUMBER() over (partition by [Account Number] order by [Date] desc) rLast
from DataTable
)
select
d1.[Account Number],
d1.[Date] FirstDate, d2.[Date] LastDate,
d1.[Credit/Debit] [First Credit/Debit], d2.[Credit/Debit] [Last Credit/Debit]
from dd d1
join dd d2 on d1.[Account Number] = d2.[Account Number]
and d1.rFirst=1 and d2.rLast=1
Note: I think there is an error in your sample data; you wrote 455 instead of 456.
SELECT
DISTINCT
Account_Number,
FIRST_VALUE(DATE) OVER (PARTITION BY Account_Number ORDER BY DATE ASC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING ) AS FirstDate,
LAST_VALUE(DATE) OVER (PARTITION BY Account_Number ORDER BY DATE ASC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS LastDate,
FIRST_VALUE(Credit_Debit) OVER (PARTITION BY Account_Number ORDER BY DATE ASC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS FirstCreditDebit,
LAST_VALUE(Credit_Debit) OVER (PARTITION BY Account_Number ORDER BY DATE ASC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS LastCreditDebit
FROM t
We can achieve this result with group by. No need to pivot.
select Account_Number
,min(date) as FirstDate
,max(date) as LastDate
,min(fcd) as "First Credit/Debit"
,min(lcd) as "Last Credit/Debit"
from (
select *
,first_value(Credit_Debit) over(partition by Account_Number order by Date) as fcd
,last_value(Credit_Debit) over(partition by Account_Number order by Date rows between unbounded preceding and unbounded following) as lcd
from t
) t
group by Account_Number
Account_Number | FirstDate | LastDate | First Credit/Debit | Last Credit/Debit
123 | 2022-01-01 | 2022-01-02 | Debit | Credit
456 | 2022-01-01 | 2022-01-02 | Debit | Credit
Fiddle
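A point worth checking in any of the LAST_VALUE() answers is the window frame. With an ORDER BY and no explicit frame, the default frame ends at the current row, so LAST_VALUE() returns the current row's value rather than the partition's last one. Here is a small sketch of that behaviour, assuming SQLite 3.25 or later run from Python; the table layout and the two sample rows are invented, and the order is deliberately Credit then Debit so that the mistake is visible.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Hypothetical data: the account's first transaction is a Credit, its last a Debit.
con.execute("CREATE TABLE t (account INTEGER, dt TEXT, credit_debit TEXT)")
con.executemany("INSERT INTO t VALUES (?, ?, ?)", [
    (123, "2022-01-01", "Credit"),
    (123, "2022-01-02", "Debit"),
])

rows = con.execute("""
    SELECT dt,
           -- Default frame: from the partition start up to the current row.
           LAST_VALUE(credit_debit) OVER (
               PARTITION BY account ORDER BY dt
           ) AS default_frame,
           -- Explicit frame: the whole partition.
           LAST_VALUE(credit_debit) OVER (
               PARTITION BY account ORDER BY dt
               ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
           ) AS full_frame
    FROM t
    ORDER BY dt
""").fetchall()

for row in rows:
    print(row)
# ('2022-01-01', 'Credit', 'Debit')
# ('2022-01-02', 'Debit', 'Debit')
```

Only the full-frame column is constant across the partition, which is what makes it safe to collapse with DISTINCT or with MIN()/MAX() in a GROUP BY.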

Return only the highest row number for a partitioned column

I'm trying to partition a list of submitted machining jobs by the date they were submitted and return a maximum row number for each partition.
I have tried using Group By, but I want to retain all rows in the result. Partition By does what I need, but I want to display all rows except the maximum row number as blank.
SELECT [Created Date]
,row_number() over(partition by format([Created Date],'d','en-gb') order by [Created Date] desc) AS [Jobs Submitted That Day]
FROM [UK_App].[dbo].[rvxDevMCRequests]
order by [Created Date] desc
Results:
Created Date Jobs Submitted That Day
31/12/2014 1
31/10/2019 1
31/10/2019 2
31/10/2019 3
31/10/2018 1
31/10/2018 2
The order by function is not working correctly, and I can't figure out how to display only the highest row number. I would like it to output this:
Created Date Jobs Submitted That Day
31/12/2014 1
31/10/2018
31/10/2018 2
31/10/2019
31/10/2019
31/10/2019 3
Not an elegant solution:
SELECT [Created Date]
, case when row_number() over(partition by format([Created Date],'d','en-gb') order by [Created Date] desc)
= count(*) over(partition by format([Created Date],'d','en-gb'))
then count(*) over(partition by format([Created Date],'d','en-gb'))
else null end AS [Jobs Submitted That Day]
FROM [UK_App].[dbo].[rvxDevMCRequests]
order by [Created Date] desc
Try this one:
SELECT
a.CreatedDate,
CASE
WHEN y.rnum IS NULL
THEN ''
ELSE
a.JobsSubmitted
END AS JobsSubmitted
FROM
input a
LEFT OUTER JOIN
(
SELECT
x.CreatedDate, x.JobsSubmitted, x.rnum
FROM (
SELECT
a.*,
ROW_NUMBER() OVER(PARTITION BY a.CreatedDate ORDER BY a.JobsSubmitted DESC) AS rnum
FROM
input a
) x
WHERE
x.rnum = 1
) y
ON (
a.CreatedDate = y.CreatedDate
AND a.JobsSubmitted = y.JobsSubmitted
);
SQL Fiddle link for demo: http://www.sqlfiddle.com/#!18/511abf/17
Why are you using format()? There is no reason to convert a date to a string, especially in this case.
One significant issue is that the column [Created Date] has duplicates. When you order by that column, the duplicates can be in any order. In fact, two different order bys on the column in the same query can result in different ordering.
The solution to that is to capture the ordering once in a subquery and then use that:
select [Created Date],
(case when cnt = seqnum then seqnum
end) as [Jobs Submitted That Day]
from (select r.*,
row_number() over (partition by [Created Date] order by [Created Date] desc) as seqnum,
count(*) over (partition by [Created Date]) as cnt
from [UK_App].[dbo].[rvxDevMCRequests]
) r
order by [Created Date] desc, seqnum;
In the above query, seqnum captures the ordering, so it is used for the outer order by.
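The same idea can be tried end to end on a toy table. This is a sketch only, assuming SQLite 3.25 or later run from Python; the requests table and its id column are hypothetical, with id standing in for any unique column that gives duplicate dates a stable order.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Hypothetical table: id is an assumed unique key used only to break ties.
con.execute("CREATE TABLE requests (id INTEGER PRIMARY KEY, created_date TEXT)")
con.executemany("INSERT INTO requests (created_date) VALUES (?)", [
    ("2014-12-31",), ("2018-10-31",), ("2018-10-31",),
    ("2019-10-31",), ("2019-10-31",), ("2019-10-31",),
])

rows = con.execute("""
    SELECT created_date,
           -- Show the count only on the last row of each day; NULL elsewhere.
           CASE WHEN seqnum = cnt THEN cnt END AS jobs_submitted
    FROM (
        SELECT created_date,
               ROW_NUMBER() OVER (PARTITION BY created_date ORDER BY id) AS seqnum,
               COUNT(*)     OVER (PARTITION BY created_date)             AS cnt
        FROM requests
    )
    ORDER BY created_date, seqnum
""").fetchall()

for row in rows:
    print(row)
# ('2014-12-31', 1)
# ('2018-10-31', None)
# ('2018-10-31', 2)
# ('2019-10-31', None)
# ('2019-10-31', None)
# ('2019-10-31', 3)
```

Because seqnum is computed once in the subquery and reused in the outer ORDER BY, the row that carries the count is always the last one displayed for its date.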

How to select max date over the year function

I am trying to select the max date over the year, but it is not working. Any ideas on what to do?
SELECT a.tkinit [TK ID],
YEAR(a.tkeffdate) [Rate Year],
max(a.tkeffdate) [Max Date],
tkrt03 [Standard Rate]
FROM stageElite.dbo.timerate a
join stageElite.dbo.timekeep b ON b.tkinit = a.tkinit
WHERE a.tkinit = '02672'
and tkeffdate BETWEEN '2014-01-01' and '12-31-2014'
GROUP BY a.tkinit,
tkrt03,
a.tkeffdate
Perhaps you only want it by year and not rolled up by calendar date. For SQL Server you can try this:
SELECT
…
MaxDate = MAX(a.tkeffdate) OVER (PARTITION BY a.tkinit, YEAR(a.tkeffdate))
…
Or you could modify the query above to group by the year instead of the date:
GROUP BY a.tkinit,
tkrt03,
YEAR(a.tkeffdate)
You seem to want only one row and all the columns. Use ORDER BY and TOP:
SELECT TOP (1) tr.tkinit as [TK ID],
YEAR(tr.tkeffdate) as [Rate Year],
tr.tkeffdate as [Max Date],
tkrt03 as [Standard Rate]
FROM stageElite.dbo.timerate tr JOIN
stageElite.dbo.timekeep tk
ON tk.tkinit = tr.tkinit
WHERE tr.tkinit = '02672' AND
tr.tkeffdate >= '2014-01-01' AND
tr.tkeffdate < '2015-01-01'
ORDER BY tr.tkeffdate DESC;
Note that I also fixed your date comparisons and table aliases.
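When more than one year is in play, the windowed MAX() from the earlier answer can be combined with a filter to keep the latest-dated row, and its rate, for each year. The following is a rough sketch under several assumptions: SQLite 3.25 or later run from Python, strftime('%Y', ...) standing in for SQL Server's YEAR(), and an invented three-row timerate table with a simplified rate column.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Hypothetical rate history for one timekeeper across two years.
con.execute("CREATE TABLE timerate (tkinit TEXT, tkeffdate TEXT, rate INTEGER)")
con.executemany("INSERT INTO timerate VALUES (?, ?, ?)", [
    ("02672", "2014-01-01", 100),
    ("02672", "2014-07-01", 120),
    ("02672", "2015-01-01", 130),
])

latest_per_year = con.execute("""
    SELECT tkinit, rate_year, tkeffdate, rate
    FROM (
        SELECT tkinit,
               strftime('%Y', tkeffdate) AS rate_year,
               tkeffdate,
               rate,
               -- Latest effective date within each (timekeeper, year) partition.
               MAX(tkeffdate) OVER (
                   PARTITION BY tkinit, strftime('%Y', tkeffdate)
               ) AS max_date
        FROM timerate
    )
    WHERE tkeffdate = max_date
    ORDER BY rate_year
""").fetchall()

print(latest_per_year)
# [('02672', '2014', '2014-07-01', 120), ('02672', '2015', '2015-01-01', 130)]
```

Unlike grouping by the rate column, this keeps one row per year even when the rate changed during that year.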

SQL Calculate Percentage in Group By

I have an SQL query that is used as the basis for a report. The report shows the amount of fuel used grouped by Year, Month and Fuel Type. I would like to calculate the percentage of the total for each fuel type, but I'm not having much luck. In order to calculate the percentage of the whole, I need to be able to get the total amount of fuel used regardless of the group it is in and I can't seem to figure out how to do this. Here is my query:
SELECT Year([DT1].[TransactionDate]) AS [Year], Month([DT1].[TransactionDate]) AS [Month], DT1.FuelType, Format(Sum(DT1.Used),"#.0") AS [Total Used]
FROM (SELECT TransactionDate, FuelType, Round([MeterAfter]-[MeterBefore],2) AS Used FROM FuelLog) AS DT1
WHERE (((DT1.TransactionDate) Between [Start Date] And [End Date]))
GROUP BY Year([DT1].[TransactionDate]), Month([DT1].[TransactionDate]), DT1.FuelType
ORDER BY Year([DT1].[TransactionDate]), Month(DT1.TransactionDate), DT1.FuelType;
I tried adding the following as a subquery but I get an error saying the subquery returns more than one result.
(SELECT Sum(Round([MeterAfter]-[MeterBefore],2)) AS Test
FROM Fuellog
WHERE Year([Year]) and Month([Month])
GROUP BY Year([TransactionDate]), Month([TransactionDate]))
Once I get the total of all fuel I will need to divide the amount of fuel used by the total amount of both fuel types. Should I be approaching this a different way?
Try this
SELECT A.[Year]
,A.[Month]
,A.[FuelType]
,A.[Total Used]
,(A.[Total Used] / B.[Total By Year Month]) * 100 AS Percentage
FROM
(
SELECT Year([DT1].[TransactionDate]) AS [Year]
, Month([DT1].[TransactionDate]) AS [Month]
, DT1.FuelType
, Format(Sum(DT1.Used),"#.0") AS [Total Used]
FROM (
SELECT TransactionDate
, FuelType
, Round([MeterAfter]-[MeterBefore],2) AS Used
FROM FuelLog
) AS DT1
WHERE (((DT1.TransactionDate) Between [Start Date] And [End Date]))
GROUP BY Year([DT1].[TransactionDate]), Month([DT1].[TransactionDate]), DT1.FuelType
ORDER BY Year([DT1].[TransactionDate]), Month(DT1.TransactionDate), DT1.FuelType
) A
INNER JOIN
(
SELECT Sum(Round([MeterAfter]-[MeterBefore],2)) AS [Total By Year Month]
, Year([TransactionDate]) AS [Year]
, Month([TransactionDate]) AS [Month]
FROM Fuellog
GROUP
BY Year([TransactionDate])
, Month([TransactionDate])
) B
ON A.[Year] = B.[Year]
AND A.[Month] = B.[Month]
You need to join to the totals -- something like this (untested might have typos)
SELECT
Year([DT1].[TransactionDate]) AS [Year],
Month([DT1].[TransactionDate]) AS [Month],
DT1.FuelType,
Format(Sum(DT1.Used),"#.0") AS [Total Used],
(Sum(DT1.Used) / FT.Total) * 100 AS Percent
FROM (
SELECT
TransactionDate,
FuelType,
Round([MeterAfter]-[MeterBefore],2) AS Used
FROM FuelLog
) AS DT1
JOIN (
SELECT
Sum(Round([MeterAfter]-[MeterBefore],2)) AS Total,
FuelType
FROM Fuellog
WHERE TransactionDate Between [Start Date] And [End Date]
GROUP BY FuelType
) FT ON DT1.FuelType = FT.FuelType
WHERE DT1.TransactionDate Between [Start Date] And [End Date]
GROUP BY Year([DT1].[TransactionDate]), Month([DT1].[TransactionDate]), DT1.FuelType, FT.Total
ORDER BY Year([DT1].[TransactionDate]), Month(DT1.TransactionDate), DT1.FuelType, FT.Total;
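On engines that support window functions, the join to a totals subquery can be replaced by a SUM() OVER on the grouped result. The question's syntax looks like Access, which does not offer window functions, so treat this purely as a sketch for other engines: it assumes SQLite 3.25 or later run from Python, and the fuel_log table, its snake_case columns, and the sample readings are invented.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Hypothetical fuel log with whole-number meter readings.
con.execute("""CREATE TABLE fuel_log (
    transaction_date TEXT, fuel_type TEXT,
    meter_before INTEGER, meter_after INTEGER)""")
con.executemany("INSERT INTO fuel_log VALUES (?, ?, ?, ?)", [
    ("2023-01-05", "Diesel", 0, 30),
    ("2023-01-20", "Diesel", 30, 60),
    ("2023-01-10", "Petrol", 0, 40),
    ("2023-02-01", "Diesel", 60, 110),
])

rows = con.execute("""
    SELECT yr, mon, fuel_type, used,
           -- Share of this fuel type in the month's total across all fuel types.
           ROUND(100.0 * used / SUM(used) OVER (PARTITION BY yr, mon), 1) AS pct
    FROM (
        SELECT strftime('%Y', transaction_date) AS yr,
               strftime('%m', transaction_date) AS mon,
               fuel_type,
               SUM(meter_after - meter_before) AS used
        FROM fuel_log
        GROUP BY yr, mon, fuel_type
    )
    ORDER BY yr, mon, fuel_type
""").fetchall()

for row in rows:
    print(row)
# ('2023', '01', 'Diesel', 60, 60.0)
# ('2023', '01', 'Petrol', 40, 40.0)
# ('2023', '02', 'Diesel', 50, 100.0)
```

The percentage is computed from the numeric sum rather than from a formatted string, so formatting for display can be applied afterwards without affecting the arithmetic.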

Grouping and retrieving most recent entry in a table for each group

First off, please bear with me if I don't state the SQL question correctly.
I have a table that has multiple columns of data. The selection criteria for my table groups based on column 1(order #). There could be multiple items on each order, but the item #'s are not grouped together.
Example:
Order Customer Order Date Order Time Item Quantity
123456 45 01/02/2010 08:00 140 4
123456 45 01/02/2010 08:30 270 29
123456 45 03/03/2010 09:00 140 6
123456 45 04/02/2010 09:30 140 10
123456 45 04/02/2010 10:00 270 35
What I need is a result like:
Order Customer Order Date Order Time Item Quantity
123456 45 04/02/2010 09:30 140 10
123456 45 04/02/2010 10:00 270 35
This result shows that after all the changes the final order includes 10 of Item 140 and 35 of Item 270.
Is this possible?
Since you didn't mention it, I'll assume you're using Oracle:
SELECT ORDER, CUSTOMER, ITEM, MAX(ORDER_TIMESTAMP), MAX(QUANTITY)
FROM (SELECT ORDER,
CUSTOMER,
ITEM,
TO_DATE(TO_CHAR(ORDER_DATE, 'YYYY-MM-DD') || ' ' ||
TO_CHAR(ORDER_TIME, 'HH:MI:SS'), 'YYYY-MM-DD HH:MI:SS')
AS ORDER_TIMESTAMP,
QUANTITY
FROM MY_TABLE)
GROUP BY ORDER, CUSTOMER, ITEM;
Share and enjoy.
Bob's answer looks good, but from reading the query it looks like what is wanted is the maximum of the quantity column rather than the sum, which would mean changing the "SUM(QUANTITY)" aggregate expression to "MAX(QUANTITY)".
Since you did not specify which database product or version, I'll show a solution that would work in SQL Server 2005 or higher:
With RankedItems As
(
Select Order, Customer, [Order Date], [Order Time], Item, Quantity
, Row_Number() Over( Partition By Order, Customer, Item Order By [Order Date] Desc, [Order Time] Desc ) As Num
From Table
)
Select Order, Customer, [Order Date], [Order Time], Item, Quantity
From RankedItems
Where Num = 1
Here is a more database-agnostic solution:
Select T.Order, T.Customer, T.[Order Date], T.[Order Time], T.Item, T.Quantity
From Table As T
Where T.[Order Date] = (
    Select Max(T1.[Order Date])
    From Table As T1
    Where T1.Order = T.Order
    And T1.Customer = T.Customer
    And T1.Item = T.Item
    )
And T.[Order Time] = (
    Select Max(T1.[Order Time])
    From Table As T1
    Where T1.Order = T.Order
    And T1.Customer = T.Customer
    And T1.Item = T.Item
    And T1.[Order Date] = T.[Order Date]
    )
The catch in this latter solution is that if there are multiple rows with the same Order, Customer, Item, Order Date and Order Time, you will get multiple rows in the above output.
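The ROW_NUMBER() approach can be checked against the sample rows from the question. This is a minimal sketch, assuming SQLite 3.25 or later run from Python; the order_lines table and its column names are hypothetical, and the dates are rewritten as ISO strings on the assumption that the question uses MM/DD/YYYY (which is consistent with its expected output).

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE order_lines (
    order_no INTEGER, customer INTEGER, order_date TEXT,
    order_time TEXT, item INTEGER, quantity INTEGER)""")
# Sample rows from the question, with dates converted to ISO format.
con.executemany("INSERT INTO order_lines VALUES (?, ?, ?, ?, ?, ?)", [
    (123456, 45, "2010-01-02", "08:00", 140, 4),
    (123456, 45, "2010-01-02", "08:30", 270, 29),
    (123456, 45, "2010-03-03", "09:00", 140, 6),
    (123456, 45, "2010-04-02", "09:30", 140, 10),
    (123456, 45, "2010-04-02", "10:00", 270, 35),
])

latest = con.execute("""
    SELECT order_no, item, order_date, order_time, quantity
    FROM (
        SELECT *,
               -- Number each item's rows from newest to oldest.
               ROW_NUMBER() OVER (
                   PARTITION BY order_no, customer, item
                   ORDER BY order_date DESC, order_time DESC
               ) AS num
        FROM order_lines
    )
    WHERE num = 1
    ORDER BY item
""").fetchall()

for row in latest:
    print(row)
# (123456, 140, '2010-04-02', '09:30', 10)
# (123456, 270, '2010-04-02', '10:00', 35)
```

Because ROW_NUMBER() assigns exactly one row the value 1 in each partition, this form returns a single row per item even when two rows share the same date and time, although which of the tied rows wins is then arbitrary.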