SQL statement using WHERE from a GROUP or RANK (part 2) - sql

I recently posted a question about a SQL Where Statement/Grouping here:
SQL statement using WHERE from a GROUP or RANK
Now I've got somewhat of a follow-up.
So similar to the previous question, let's assume I have a table of say 35,000 rows with these columns:
Sales Rep | Parent Account ID| Account ID | Total Contract Value | Date
Each row is individual by account id but multiple account IDs can fall under a parent account ID.
Similar to the responses on the first question, this is probably going to be a table w/i a table. So first, everything has to be grouped by Sales Rep. From that, everything needs to be grouped by Parent Account ID where the grouped total contract value of all the accounts is >= 10,000. Then everything will be displayed and ranked by the total TCV of the Parent account ID and I need the top 35 Parent account IDs by agent.
So the first couple of lines of data may look like this:
Sales Rep | Parent Account ID| Account ID | Total Contract Value | Date | Rank
John Doe | ParentABC12345 | ABC425 | 5,000 | 1/2/2013 |1
John Doe | ParentABC12345 | ABC426 | 10,000 | 1/2/2013 |1
John Doe | ParentDJE12345 | DJE523 | 11,000 | 1/2/2013 |2
John Doe | ParentFBC12345 | FBC6723 | 4,000 | 1/2/2013 |3
John Doe | ParentFBC12345 | FBC6727 | 4,000 | 1/2/2013 |3
Notice how the ranking is based on the Parent Account ID. The account ID DJE523 has the single greatest TCV, but it's ranked second because the grouped value of parent account ID ParentABC12345 is greater. So there would be a ranking of 35 parent account IDs, but within that ranking there could be, say, 100+ lines of actual data.
Any thoughts?

Always nice to follow up. The "parent rank" is added as an INNER JOIN.
Edit: As correctly mentioned by Dan Bracuk, my first answer was not correct. I altered the query to meet the correct conditions. I also applied the timespan to the Parent Accounts.
DECLARE @minimumValue decimal(20,2) = 10000
DECLARE @numberOfAccounts int = 35
DECLARE @from datetime = '1/1/2013'
DECLARE @till datetime = DATEADD(MONTH, 1, @from)
SELECT
[sub].[Sales Rep],
[sub].[Rank],
[sub].[Account ID],
[sub].[Total Contract Value],
[sub].[Parent Account ID],
[sub].[Total],
[sub].[ParentRank]
FROM
(
SELECT
[s].[Sales Rep],
[s].[Account ID],
[s].[Total Contract Value],
DENSE_RANK() OVER (PARTITION BY [s].[Sales Rep] ORDER BY [s].[Total Contract Value] DESC) AS [Rank],
[p].[Parent Account ID],
[p].[Total],
[p].[ParentRank]
FROM [Sales] [s]
INNER JOIN
(
SELECT
[Parent Account ID],
SUM([Total Contract Value]) AS [Total],
RANK() OVER(ORDER BY SUM([Total Contract Value]) DESC) AS [ParentRank]
FROM [Sales]
WHERE [Date] > @from AND [Date] < @till
GROUP BY [Parent Account ID]
HAVING SUM([Total Contract Value]) > @minimumValue
) AS [p] ON [s].[Parent Account ID] = [p].[Parent Account ID]
WHERE [Date] > @from AND [Date] < @till
) AS [sub]
WHERE [sub].[Rank] <= @numberOfAccounts
ORDER BY
[Sales Rep] ASC,
[ParentRank] ASC,
[Rank] ASC
And here is a new Fiddle.
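For comparison, here is a minimal sketch of the ranking the question describes: sum TCV per parent account within each sales rep, drop parents below the threshold, rank the parents per rep, and join back to the detail rows. It runs against an in-memory SQLite database (3.25 or later, for window functions) as a stand-in for SQL Server. The snake_case table and column names and the `rank_by_parent` helper are made up for illustration, and the date filter is left out.

```python
import sqlite3

# Sample rows from the question: (sales_rep, parent_account_id, account_id, tcv).
ROWS = [
    ("John Doe", "ParentABC12345", "ABC425", 5000),
    ("John Doe", "ParentABC12345", "ABC426", 10000),
    ("John Doe", "ParentDJE12345", "DJE523", 11000),
    ("John Doe", "ParentFBC12345", "FBC6723", 4000),
    ("John Doe", "ParentFBC12345", "FBC6727", 4000),
]

QUERY = """
WITH parent_totals AS (
    -- One row per (rep, parent) that meets the minimum grouped TCV.
    SELECT sales_rep, parent_account_id, SUM(tcv) AS parent_total
    FROM sales
    GROUP BY sales_rep, parent_account_id
    HAVING SUM(tcv) >= :min_total
),
ranked AS (
    -- Rank parents inside each rep by their grouped TCV.
    SELECT sales_rep, parent_account_id, parent_total,
           DENSE_RANK() OVER (PARTITION BY sales_rep
                              ORDER BY parent_total DESC) AS parent_rank
    FROM parent_totals
)
SELECT s.sales_rep, s.parent_account_id, s.account_id, s.tcv, r.parent_rank
FROM sales AS s
JOIN ranked AS r
  ON r.sales_rep = s.sales_rep
 AND r.parent_account_id = s.parent_account_id
WHERE r.parent_rank <= :top_n
ORDER BY s.sales_rep, r.parent_rank, s.tcv DESC
"""


def rank_by_parent(rows, min_total, top_n):
    """Return detail rows tagged with the rank of their parent account (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales (sales_rep TEXT, parent_account_id TEXT, "
        "account_id TEXT, tcv INTEGER)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY, {"min_total": min_total, "top_n": top_n}).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in rank_by_parent(ROWS, min_total=10000, top_n=35):
        print(row)
```

With the 10,000 threshold, ParentFBC12345 (grouped total 8,000) drops out, even though the sample output in the question lists it with rank 3. Because `DENSE_RANK` gives tied parents the same rank, a rep can end up with more than 35 parents when totals tie.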

I think this will do it for you, if you're using SQL Server:
Select top 35
SalesRep,
ParentAccountId,
sum(TotalContractValue) from Table
group by SalesRep, ParentAccountId
order by sum(TotalContractValue) desc

Related

Hello , currently learning and I am in need of some assistance. I need to write a query that will return a table with select columns [closed]

Write a query that will return a table with the following columns:
User ID, Site ID, User Name, Total Sales, Total Refunds, Net Amount Collected
I need to write a query that returns this table, and at the moment I'm still trying to figure it out. Thanks.
I tried a SELECT statement but it failed.
I agree with the others that simply handing this over will not help your learning much. There are a bunch of concepts that you need to learn here. Submitting this answer for your homework might be awkward (and result in a score of 0) if you can't explain it!
Common Table Expressions
Aggregate Functions
Outer Joins
with cte_sales as
(
select
t.[User Id],
t.[Site Id],
sum(t.Amount) as [Total Sales]
from Transactions t
where t.[Transaction Type] = 'Sale'
group by t.[User Id],
t.[Site Id]
),
cte_refunds as
(
select
t.[User Id],
t.[Site Id],
sum(t.Amount) as [Total Refunds]
from Transactions t
where t.[Transaction Type] = 'Refund'
group by t.[User Id],
t.[Site Id]
)
select
u.[User Id],
u.[Site Id],
u.[Name] as [User Name],
coalesce(s.[Total Sales],0) as [Total Sales],
abs(coalesce(r.[Total Refunds],0)) as [Total Refunds],
(coalesce(s.[Total Sales],0) + coalesce(r.[Total Refunds],0)) as [Net Amount Collected]
from Users u
left join cte_sales s on s.[User Id] = u.[User Id]
and s.[Site Id] = u.[Site Id]
left join cte_refunds r on r.[User Id] = u.[User Id]
and r.[Site Id] = u.[Site Id]
order by u.[User Id],
u.[Site Id];
Demo
| User Id | Site Id | User Name | Total Sales | Total Refunds | Net Amount Collected |
|---------|---------|-----------|-------------|---------------|----------------------|
| 1 | 1 | Arthur | 120 | 120 | 0 |
| 2 | 1 | Aaron | 90 | 30 | 60 |
| 2 | 2 | Brett | 90 | 0 | 90 |
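As a variation on the same three concepts, the two CTEs can be collapsed into one pass with conditional aggregation (`SUM(CASE ...)`) over a single outer join. The sketch below runs in an in-memory SQLite database. The snake_case names, the `net_collected` helper, and the transaction rows are assumptions chosen to reproduce the demo output, with refunds stored as negative amounts, which is what the `abs()` in the answer implies.

```python
import sqlite3


def net_collected():
    """Return per-user totals using conditional aggregation (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users (user_id INTEGER, site_id INTEGER, name TEXT);
        CREATE TABLE transactions (user_id INTEGER, site_id INTEGER,
                                   transaction_type TEXT, amount INTEGER);
        -- Made-up rows; refunds are negative amounts (an assumption).
        INSERT INTO users VALUES (1, 1, 'Arthur'), (2, 1, 'Aaron'), (2, 2, 'Brett');
        INSERT INTO transactions VALUES
            (1, 1, 'Sale', 120), (1, 1, 'Refund', -120),
            (2, 1, 'Sale', 50), (2, 1, 'Sale', 40), (2, 1, 'Refund', -30),
            (2, 2, 'Sale', 90);
    """)
    rows = conn.execute("""
        SELECT u.user_id, u.site_id, u.name,
               -- CASE yields NULL for non-matching rows, which SUM ignores.
               COALESCE(SUM(CASE WHEN t.transaction_type = 'Sale'
                                 THEN t.amount END), 0) AS total_sales,
               ABS(COALESCE(SUM(CASE WHEN t.transaction_type = 'Refund'
                                     THEN t.amount END), 0)) AS total_refunds,
               COALESCE(SUM(CASE WHEN t.transaction_type IN ('Sale', 'Refund')
                                 THEN t.amount END), 0) AS net_amount_collected
        FROM users AS u
        -- LEFT JOIN keeps users that have no transactions at all.
        LEFT JOIN transactions AS t
               ON t.user_id = u.user_id AND t.site_id = u.site_id
        GROUP BY u.user_id, u.site_id, u.name
        ORDER BY u.user_id, u.site_id
    """).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in net_collected():
        print(row)
```

The `COALESCE` calls matter for rows like Brett's: with no refund rows, the conditional sum is NULL and would otherwise leak into the output.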

Access SQL appears to be treating date as dd/mm/yyyy?

I have a table in MS Access which holds staff details (tblStaff):
| Employee Number | Employee Name | Dept |
------------------------------------------------
| 205147 | Joe Bloggs | IT |
| 205442 | John Doe | Accounts |
I refresh this table with new data weekly and if any records have changed (e.g. changed dept) then they are archived in another table (tblArchiveStaff) along with the dates that that record was valid from and to.
| Employee Number | Employee Name | Dept | DateFrom | DateTo |
----------------------------------------------------------------------
| 205147 | Joe Bloggs | HR | 03/01/16 | 01/06/17 |
I am trying to write a query that will select records from either of these tables based on which ones were valid for a given date
We can assume that records in tblStaff are valid from the dateTo + 1 of the last entry for that employee in tblArchiveStaff, or from 03 Jan 16 if they have no archived records.
So far I have come up with the below query into which I have hardcoded the date condition of #07/01/2017#:
SELECT [Employee Number], [Dept], DateFrom, DateTo
FROM
(
SELECT ts.[Employee Number], ts.[Dept], nz(ta2.DateTo,#01/02/16#)+1 AS DateFrom, date() AS DateTo
FROM tblStaff ts LEFT JOIN tblArchiveStaff ta2 ON ts.[Employee Number] = ta2.[Employee Number]
UNION ALL
SELECT ta.[Employee Number], ta.[Dept], ta.DateFrom, ta.DateTo
FROM tblArchiveStaff ta
) AS tblUnion
WHERE #07/01/2017# BETWEEN DateFrom AND DateTo;
As I understand it the above query should return records valid on July 1st 2017 which would be both records in tblStaff, it is however returning the record in tblArchiveStaff. It is almost like it is treating the date condition as 7 January which would mean it is formatted dd/mm/yyyy which I thought was impossible.
Can anyone explain this please?
That is because you don't play by the rules. Always handle dates as date values, not strings, not numbers, no exceptions.
So, when you use Nz and even add 1, the union query cannot figure out the datatype, thus it falls back to return the result as text. Then DateFrom becomes Text while DateTo is Date which makes filtering a wild guess.
Edit: I have also amended the first part of the union query to ensure the current record follows on from the most recent archived record.
Corrected like this:
SELECT
tblUnion.[Employee Number],
tblUnion.[Dept],
tblUnion.DateFrom,
tblUnion.DateTo
FROM
(SELECT
ts.[Employee Number],
ts.[Dept],
DateAdd("d", 1, Nz(ta2.MaxDateTo, #01/02/16#)) AS DateFrom,
Date() AS DateTo
FROM
tblStaff ts
LEFT JOIN
(SELECT
[Employee Number],
max(DateTo) AS MaxDateTo
FROM
tblArchiveStaff
GROUP BY
[Employee Number]) ta2
ON ts.[Employee Number] = ta2.[Employee Number]
UNION ALL
SELECT
ta.[Employee Number],
ta.[Dept],
ta.DateFrom,
ta.DateTo
FROM
tblArchiveStaff ta) AS tblUnion
WHERE
#7/1/2017# Between [DateFrom] And [DateTo];
and you will get the desired output:
| Employee Number | Dept     | DateFrom   | DateTo     |
--------------------------------------------------------
| 205147          | IT       | 2017-01-07 | 2017-07-05 |
| 205442          | Accounts | 2016-01-03 | 2017-07-05 |
Standard date flipping errors. Try to always use neutral date formats in your code like
d MMM yyyy
1 Jul 2017 (viable but not recommended because of language differences)
or
yyyy-MM-dd
2017-07-01 (recommended, will work everywhere)
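Both points above, the ambiguity of a slash-separated literal and the danger of dates being compared as text, can be seen in a few lines of Python. This sketch illustrates the general principle only; it does not model how Access resolves types internally, and the helper names are made up.

```python
from datetime import date, datetime


def parse_month_first(text):
    """Read a slash-separated date as mm/dd/yyyy."""
    return datetime.strptime(text, "%m/%d/%Y").date()


def parse_day_first(text):
    """Read the same text as dd/mm/yyyy."""
    return datetime.strptime(text, "%d/%m/%Y").date()


literal = "07/01/2017"
as_us = parse_month_first(literal)  # 1 July 2017
as_eu = parse_day_first(literal)    # 7 January 2017

# Three dates in chronological order.
dates = [date(2016, 12, 31), date(2017, 1, 7), date(2017, 7, 1)]

# As mm/dd/yyyy text they sort by month first, so the 2016 date lands last.
us_strings = [d.strftime("%m/%d/%Y") for d in dates]

# As yyyy-MM-dd text, string order and chronological order agree.
iso_strings = [d.isoformat() for d in dates]

if __name__ == "__main__":
    print(as_us, as_eu)
    print(sorted(us_strings))
    print(sorted(iso_strings))
```

Once one side of a `BETWEEN` has silently become text, the comparison follows the string ordering shown for `us_strings`, which is why the filter appears to pick the wrong records.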

SQL to find max of sum of data in one table, with extra columns

Apologies if this has been asked elsewhere. I have been looking on Stackoverflow all day and haven't found an answer yet. I am struggling to write the query to find the highest month's sales for each state from this example data.
The data looks like this:
| order_id | month | cust_id | state | prod_id | order_total |
+-----------+--------+----------+--------+----------+--------------+
| 67212 | June | 10001 | ca | 909 | 13 |
| 69090 | June | 10011 | fl | 44 | 76 |
... etc ...
My query
SELECT `month`, `state`, SUM(order_total) AS sales
FROM orders GROUP BY `month`, `state`
ORDER BY sales;
| month | state | sales |
+------------+--------+--------+
| September | wy | 435 |
| January | wy | 631 |
... etc ...
returns a few hundred rows: the sum of sales for each month for each state. I want it to only return the month with the highest sum of sales, but for each state. It might be a different month for different states.
This query
SELECT `state`, MAX(order_sum) as topmonth
FROM (SELECT `state`, SUM(order_total) order_sum FROM orders GROUP BY `month`,`state`)
GROUP BY `state`;
| state | topmonth |
+--------+-----------+
| ca | 119586 |
| ga | 30140 |
returns the correct number of rows with the correct data. BUT I would also like the query to give me the month column. Whatever I try with GROUP BY, I cannot find a way to limit the results to one record per state. I have tried PartitionBy without success, and have also tried unsuccessfully to do a join.
TL;DR: one query gives me the correct columns but too many rows; the other query gives me the correct number of rows (and the correct data) but insufficient columns.
Any suggestions to make this work would be most gratefully received.
I am using Apache Drill, which is apparently ANSI-SQL compliant. Hopefully that doesn't make much difference - I am assuming that the solution would be similar across all SQL engines.
This one should do the trick
SELECT t1.`month`, t1.`state`, t1.`sales`
FROM (
/* this one selects month, state and sales*/
SELECT `month`, `state`, SUM(order_total) AS sales
FROM orders
GROUP BY `month`, `state`
) AS t1
JOIN (
/* this one selects the best value for each state */
SELECT `state`, MAX(sales) AS best_month
FROM (
SELECT `month`, `state`, SUM(order_total) AS sales
FROM orders
GROUP BY `month`, `state`
)
GROUP BY `state`
) AS t2
ON t1.`state` = t2.`state` AND
t1.`sales` = t2.`best_month`
It's basically the combination of the two queries you wrote.
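If the engine supports window functions, the same result can be had without repeating the aggregate: number the monthly totals within each state and keep the first. The sketch below runs in an in-memory SQLite database (3.25 or later) rather than Apache Drill, so treat the exact syntax as an assumption for Drill. The order rows and the `best_month_per_state` helper are made up.

```python
import sqlite3

# Hypothetical order rows: (month, state, order_total).
ORDERS = [
    ("June", "ca", 13), ("June", "ca", 20), ("July", "ca", 50),
    ("June", "fl", 76), ("July", "fl", 10),
    ("January", "wy", 631), ("September", "wy", 435),
]

QUERY = """
SELECT month, state, sales
FROM (
    SELECT month, state, sales,
           -- 1 marks the highest monthly total within each state.
           ROW_NUMBER() OVER (PARTITION BY state ORDER BY sales DESC) AS rn
    FROM (
        SELECT month, state, SUM(order_total) AS sales
        FROM orders
        GROUP BY month, state
    ) AS monthly
) AS ranked
WHERE rn = 1
ORDER BY state
"""


def best_month_per_state(orders):
    """Return one (month, state, sales) row per state (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (month TEXT, state TEXT, order_total INTEGER)")
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", orders)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in best_month_per_state(ORDERS):
        print(row)
```

One difference from the join version: when two months tie for a state, `ROW_NUMBER` keeps only one of them (which one is not defined), whereas the join returns both. Swapping in `RANK()` keeps the ties.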
Try this:
SELECT `month`, `state`, SUM(order_total) FROM orders WHERE `month` IN
( SELECT TOP 1 t.month FROM ( SELECT `month` AS month, SUM(order_total) order_sum FROM orders GROUP BY `month`
ORDER BY order_sum DESC) t)
GROUP BY `month`, state ;

Access Query - Sum of Field based on newest for each day

I have the following data:
Site # | Site Name | Product | Reading Date | Volume
1 | Cambridge | Regular | 02/21/17 08:00 | 40000
2 | Cambridge | Regular | 02/22/17 07:00 | 35000
3 | Cambridge | Regular | 02/22/17 10:00 | 30000
What I want to achieve is get the SUM of [Volume] of the last 30 days while taking the newest reading EACH day possible since its pretty inconsistent whether one day there are 1,2 or 3 readings. I have tried a couple of things but can't get it to work.
This is what I've tried:
SELECT [Site #], Product, Sum(Volume) AS SumOfVolume, DatePart("d",[Inventory Date]) AS Day
FROM [Circle K New]
GROUP BY [Site #], Product, Day
HAVING (([Site #]=852446) AND (Product ="Diesel Lows"))
ORDER BY DatePart("d",[Inventory Date]) DESC;
Result:
It adds the two readings of the same day. I was/am thinking about just getting a daily average then finding the monthly average from that. But I'm unsure if the value changes affect average numbers.
Based on your description:
select sum(d.volume)
from data as d
where d.readingdate in (select max(d2.readingdate)
from data as d2
group by int(d2.readingdate)
) and
d.readingdate >= dateadd("d", -30, date());
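To check the idea against the sample readings, here is a small sketch that keeps only the newest reading per site, product, and day, then sums the volumes over the last 30 days. It uses an in-memory SQLite database instead of Access. The snake_case names and the `sum_latest_daily` helper are made up, the timestamps are rewritten as ISO 8601 text, and "today" is passed in so the result is reproducible.

```python
import sqlite3

# Readings from the question: (site_name, product, reading_date, volume).
READINGS = [
    ("Cambridge", "Regular", "2017-02-21 08:00", 40000),
    ("Cambridge", "Regular", "2017-02-22 07:00", 35000),
    ("Cambridge", "Regular", "2017-02-22 10:00", 30000),
]

QUERY = """
SELECT SUM(r.volume)
FROM readings AS r
JOIN (
    -- Newest timestamp per site, product and calendar day in the window.
    SELECT site_name, product, MAX(reading_date) AS latest
    FROM readings
    WHERE reading_date >= date(:today, '-30 days')
    GROUP BY site_name, product, date(reading_date)
) AS last_per_day
  ON last_per_day.site_name = r.site_name
 AND last_per_day.product = r.product
 AND last_per_day.latest = r.reading_date
WHERE r.site_name = :site AND r.product = :product
"""


def sum_latest_daily(readings, site, product, today):
    """Sum the newest reading of each day in the 30 days before `today` (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE readings (site_name TEXT, product TEXT, "
        "reading_date TEXT, volume INTEGER)"
    )
    conn.executemany("INSERT INTO readings VALUES (?, ?, ?, ?)", readings)
    params = {"site": site, "product": product, "today": today}
    (total,) = conn.execute(QUERY, params).fetchone()
    conn.close()
    return total  # None when no reading falls inside the window


if __name__ == "__main__":
    print(sum_latest_daily(READINGS, "Cambridge", "Regular", "2017-03-01"))
```

Grouping the inner query by site and product as well as by day keeps readings from different sites or products from displacing each other when several of them report on the same day.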

SQL server 2008: How to get the start date and end date of my data?

As a newb, I already know that I will be berated for asking this question, but I did not find the answer on the site here and could use some help...
I have a table that lists data by the day, and by type. For example
Transaction | Date | Type
-----------------------------
Update | 11/7/2008 | Cash-out
Update | 11/10/2008 | Wrote-check
Deposit | 11/11/2009 | Cashed Check
Update | 11/18/2008 | Wrote check
Deposit | 11/19/2009 | Cashed Check
What I'm trying to do, is find the very first occurrence of each transaction type, and the very last occurrence of each transaction type.
so I'm trying to figure out an sql statement that I can write that will return something like this:
Transaction | First Date | Last Date |
----------------------------------------------
Update | 11/7/2008 | 11/18/2008 |
Deposit | 11/11/2009 | 11/19/2009 |
any ideas?
SELECT [Transaction], Min([Date]) AS [First Date], Max([Date]) AS [Last Date]
FROM myTable GROUP BY [Transaction]
SELECT
[Transaction],
MIN([Date]) AS [First Date],
MAX([Date]) AS [Last Date]
FROM
My_Table
GROUP BY
[Transaction]