Except does not make correct intersection between 2 sets - sql

I want to get the best sellers from March 2019 while excluding the top 3 sellers of January. I tried using EXCEPT, where the first SELECT gives the best sellers of March (all of them) and the second SELECT gives the top 3 of January.
SELECT * FROM (SELECT fullname, SUM(sale) sales
FROM mytable
WHERE oredrdate BETWEEN '2019-03-01' AND '2019-03-31'
GROUP BY fullname
ORDER BY sales DESC) X
EXCEPT
SELECT * FROM (SELECT fullname, SUM(sale) sales
FROM mytable
WHERE oredrdate BETWEEN '2019-01-01' AND '2019-01-31'
GROUP BY fullname
ORDER BY sales DESC
LIMIT 3) Y;
The problem is that EXCEPT does not intersect as I wished it would. What each SELECT returns and my desired output with data:
First SELECT returns:
fullname sales
Tommy Williams 8320
Ryan Atkinson 7310
Petey Cruiser 6200
Anna Mull 5840
Gail Forcewind 4120
Paige Turner 3300
Bob Frapples 2100
... ...
Second SELECT returns:
fullname sales
Tommy Williams 9220
Anna Mull 8100
Greta Life 7891
Desired OUTPUT:
fullname sales
Ryan Atkinson 7310
Petey Cruiser 6200
Gail Forcewind 4120
Paige Turner 3300
Bob Frapples 2100
... ...
How should I change my code to achieve this?

This can be done with a LEFT JOIN where you exclude the matching rows:
SELECT X.*
FROM (
SELECT fullname, SUM(sale) sales
FROM mytable
WHERE oredrdate BETWEEN '2019-03-01' AND '2019-03-31'
GROUP BY fullname
) X LEFT JOIN (
SELECT fullname, SUM(sale) sales
FROM mytable
WHERE oredrdate BETWEEN '2019-01-01' AND '2019-01-31'
GROUP BY fullname
ORDER BY sales DESC
LIMIT 3
) Y ON Y.fullname = X.fullname
WHERE Y.fullname IS NULL
ORDER BY X.sales DESC
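To check the anti-join on a small data set, here is a minimal sketch that runs the same query through Python's built-in sqlite3 module. The rows are made up for illustration, and SQLite is only a stand-in for whatever database you actually use.

```python
import sqlite3

# Hypothetical sample rows: (fullname, sale, oredrdate). Column names follow the question.
rows = [
    ("Tommy Williams", 9220, "2019-01-10"),
    ("Anna Mull", 8100, "2019-01-12"),
    ("Greta Life", 7891, "2019-01-15"),
    ("Ryan Atkinson", 500, "2019-01-20"),
    ("Tommy Williams", 8320, "2019-03-05"),
    ("Ryan Atkinson", 7310, "2019-03-06"),
    ("Petey Cruiser", 6200, "2019-03-07"),
    ("Anna Mull", 5840, "2019-03-08"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (fullname TEXT, sale INTEGER, oredrdate TEXT)")
conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", rows)

# March totals (X) left-joined to the January top 3 (Y); keep only rows with no match in Y.
anti_join = """
SELECT X.fullname, X.sales
FROM (
    SELECT fullname, SUM(sale) AS sales
    FROM mytable
    WHERE oredrdate BETWEEN '2019-03-01' AND '2019-03-31'
    GROUP BY fullname
) X LEFT JOIN (
    SELECT fullname, SUM(sale) AS sales
    FROM mytable
    WHERE oredrdate BETWEEN '2019-01-01' AND '2019-01-31'
    GROUP BY fullname
    ORDER BY sales DESC
    LIMIT 3
) Y ON Y.fullname = X.fullname
WHERE Y.fullname IS NULL
ORDER BY X.sales DESC
"""
result = conn.execute(anti_join).fetchall()
print(result)
```

Tommy Williams and Anna Mull drop out because they are in the January top 3, while Ryan Atkinson stays because his January total only ranks fourth.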

You could use:
SELECT fullname, SUM(sale) AS total
FROM mytable
WHERE oredrdate BETWEEN '2019-03-01' AND '2019-03-31'
AND fullname NOT IN (SELECT fullname
FROM (SELECT fullname, SUM(sale) AS total
FROM mytable
WHERE oredrdate BETWEEN '2019-01-01' AND '2019-01-31'
AND fullname IS NOT NULL
GROUP BY fullname
ORDER BY total DESC LIMIT 3) jan_top3)
GROUP BY fullname
ORDER BY total DESC;
The subquery used with NOT IN must return a single column, so the aggregate is wrapped in a derived table and only fullname is selected from it.
I would group by some kind of unique column such as employee_id, since two people could have the same name.

The problem is that EXCEPT is considering both the name and the amount columns. It is unlikely that the second would match.
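To see the whole-row comparison in action, here is a small sketch using Python's sqlite3 module. The two tables march and jan_top3 are hypothetical stand-ins for the results of the two SELECTs in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE march (fullname TEXT, sales INTEGER);
CREATE TABLE jan_top3 (fullname TEXT, sales INTEGER);
INSERT INTO march VALUES
    ('Tommy Williams', 8320), ('Ryan Atkinson', 7310), ('Anna Mull', 5840);
INSERT INTO jan_top3 VALUES
    ('Tommy Williams', 9220), ('Anna Mull', 8100), ('Greta Life', 7891);
""")

# EXCEPT over (fullname, sales): no row is identical in both sets,
# so nothing is removed from the March result.
both_columns = conn.execute(
    "SELECT fullname, sales FROM march "
    "EXCEPT SELECT fullname, sales FROM jan_top3 "
    "ORDER BY fullname"
).fetchall()

# EXCEPT over fullname only: matching names are removed,
# but the sales column is no longer available.
name_only = conn.execute(
    "SELECT fullname FROM march EXCEPT SELECT fullname FROM jan_top3"
).fetchall()

print(both_columns)
print(name_only)
```

Comparing on the name alone gives the right set of people but loses the totals, which is why the rewrites here filter on fullname and compute the March sum separately.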
One way to write this is:
WITH jan3 as (
SELECT TOP (3) fullname, SUM(sale) as sales
FROM mytable
WHERE orderdate >= '2019-01-01' AND
orderdate < '2019-02-01'
GROUP BY fullname
ORDER BY sales DESC
)
SELECT m.fullname, SUM(m.sale) as sales
FROM mytable m
WHERE m.orderdate >= '2019-03-01' AND
m.orderdate < '2019-04-01' AND
NOT EXISTS (SELECT 1
FROM jan3
WHERE jan3.fullname = m.fullname
)
GROUP BY fullname
ORDER BY sales DESC;
Note that this changes the date comparisons to use >= and <. This is considered a best practice, because it works for dates and datetime (timestamp) values.
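A quick way to see the difference is a sketch like the following, run through Python's sqlite3 module with made-up rows. SQLite stores these values as text, which is an assumption of this example, but engines with a real datetime type show the same pitfall because '2019-03-31' is treated as midnight at the start of that day.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (fullname TEXT, sale INTEGER, orderdate TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        ("Ryan Atkinson", 100, "2019-03-01 00:00:00"),
        ("Ryan Atkinson", 200, "2019-03-31 15:30:00"),  # last day of March, with a time part
        ("Ryan Atkinson", 400, "2019-04-01 00:00:00"),  # April, must be excluded
    ],
)

# BETWEEN stops at '2019-03-31', so the 15:30 sale on that day is missed.
between_total = conn.execute(
    "SELECT SUM(sale) FROM mytable "
    "WHERE orderdate BETWEEN '2019-03-01' AND '2019-03-31'"
).fetchone()[0]

# The half-open range covers all of March and nothing from April.
half_open_total = conn.execute(
    "SELECT SUM(sale) FROM mytable "
    "WHERE orderdate >= '2019-03-01' AND orderdate < '2019-04-01'"
).fetchone()[0]

print(between_total, half_open_total)
```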
There are other ways of writing this using only a single aggregation. For instance:
WITH s as (
SELECT m.fullname,
SUM(CASE WHEN m.orderdate < '2019-02-01' THEN m.sale END) as sales_jan,
SUM(CASE WHEN m.orderdate >= '2019-03-01' THEN m.sale END) as sales_mar
FROM mytable m
WHERE m.orderdate >= '2019-01-01' AND
m.orderdate < '2019-04-01'
GROUP BY m.fullname
)
SELECT s.*
FROM (SELECT s.*,
ROW_NUMBER() OVER (ORDER BY sales_jan DESC) as seqnum_jan
FROM s
) s
WHERE seqnum_jan > 3
ORDER BY s.sales_mar DESC;

Related

SQL - How to count number of distinct values (payments), after sum of rows where they have another column value (Due Date) in common

My 'deals_payments' table is:
Due Date Payment ID
1-Mar-19 1,000.00 123
1-Apr-19 1,000.00 123
1-May-19 1,000.00 123
1-Jun-19 1,000.00 123
1-Jul-19 1,000.00 123
1-Aug-19 1,000.00 123
1-Jun-19 500.00 456
1-Jul-19 500.00 456
1-Aug-19 500.00 456
I have the SQL code:
select
count(*), payment
from (select deals_payments.*,
(row_number() over (order by due_date) -
row_number() over (partition by payment order by due_date)
) as grp
from deals_payments
where id = 123
) deals_payments
group by grp, payment
order by grp
which gives me what I want - the number of payments on each distinct amount - (here I only asked for ID 123):
COUNT(*) PAYMENT
6 1000.00
But now I need the sum of the payments of the two IDs (123 and 456) where the due dates are the same, and then the count of payments for each distinct amount, like this:
COUNT(*) PAYMENT
3 1000.00
3 1500.00
I tried the below but it gives me the 'missing right parenthesis' error. What is wrong??
select
count(*),
(select
sum(total) total
from (select distinct
due_date,
(select
sum(payment)
from deals_payments
where (due_date = a.due_date)) as total
from deals_payments a
where a.id in (123, 456)
and payment > 0)
group by due_date
order by due_date) b
from (select deals_payments.*,
(row_number() over (order by due_date) -
row_number() over (partition by payment order by due_date)
) as grp
from deals_payments
where id = 123
) deals_payments
group by grp, payment
order by grp
Taking your earlier comments into consideration, I agree that the SQL can be simplified to get the intended result. My understanding is that the expected output is the frequency of the total payment of a subset of IDs on any given date.
select count(*) as PaymentFrequency, TotalPaidOnDueDate from
(
select due_date, sum(payment) as TotalPaidOnDueDate from #deals_payments
where ID in (123, 456)
group by due_date
) a
group by a.TotalPaidOnDueDate
Here is a sql fiddle I used to verify: http://sqlfiddle.com/#!18/6b04f/1
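For anyone who wants to reproduce this locally, here is a minimal sketch of the same two-level aggregation using Python's sqlite3 module. The table layout and ISO-formatted dates are assumptions made for the example, and an ordinary table replaces the temp table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE deals_payments (due_date TEXT, payment REAL, id INTEGER)")

months = ["2019-03-01", "2019-04-01", "2019-05-01",
          "2019-06-01", "2019-07-01", "2019-08-01"]
# ID 123 pays 1000 from March to August; ID 456 pays 500 from June to August.
rows = [(d, 1000.0, 123) for d in months] + [(d, 500.0, 456) for d in months[3:]]
conn.executemany("INSERT INTO deals_payments VALUES (?, ?, ?)", rows)

# Inner query: total paid per due date. Outer query: how often each total occurs.
query = """
SELECT COUNT(*) AS PaymentFrequency, TotalPaidOnDueDate
FROM (
    SELECT due_date, SUM(payment) AS TotalPaidOnDueDate
    FROM deals_payments
    WHERE id IN (123, 456)
    GROUP BY due_date
) a
GROUP BY a.TotalPaidOnDueDate
ORDER BY a.TotalPaidOnDueDate
"""
result = conn.execute(query).fetchall()
print(result)
```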
This seems really strange. I don't understand why your logic is so complicated.
How about this?
select id, count(*), max(payment)
from (select dp.*,
count(*) over (partition by due_date) as cnt
from deals_payments dp
where dp.id in (123, 456)
) dp
where cnt = 2
group by id;
An interesting question. Could this do the trick???
select payment, count(*)
from deals_payments
where due_date in
(select due_date
from deals_payments
group by due_date
having count(*) > 1)
group by payment;
You can add a filter by id if you want, of course.

get max date when sum of a field equals a value

I have a problem with writing a query.
The raw data is as follows:
DATE CUSTOMER_ID AMOUNT
20170101 1 150
20170201 1 50
20170203 1 200
20170204 1 250
20170101 2 300
20170201 2 70
I want to know when (on which date) the running sum of amount for each customer_id becomes more than 350.
How can I write a query that gives a result like this?
CUSTOMER_ID MAX_DATE
1 20170203
2 20170201
Thanks,
Simply use ANSI/ISO standard window functions to calculate the running sum:
select t.*
from (select t.*,
sum(t.amount) over (partition by t.customer_id order by t.date) as running_amount
from t
) t
where running_amount - amount < 350 and
running_amount >= 350;
If for some reason, your database doesn't support this functionality, you can use a correlated subquery:
select t.*
from (select t.*,
(select sum(t2.amount)
from t t2
where t2.customer_id = t.customer_id and
t2.date <= t.date
) as running_amount
from t
) t
where running_amount - amount < 350 and
running_amount >= 350;
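Here is a small sketch of the correlated-subquery version, run against the question's sample data with Python's sqlite3 module. The table name t and the aliases cur, prev and r are just illustrative choices.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (date TEXT, customer_id INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("20170101", 1, 150),
        ("20170201", 1, 50),
        ("20170203", 1, 200),
        ("20170204", 1, 250),
        ("20170101", 2, 300),
        ("20170201", 2, 70),
    ],
)

# running_amount is the sum of all amounts for the same customer up to and
# including the current row. The outer filter keeps only the row on which
# the running total first reaches 350.
query = """
SELECT r.customer_id, r.date
FROM (
    SELECT cur.*,
           (SELECT SUM(prev.amount)
            FROM t AS prev
            WHERE prev.customer_id = cur.customer_id
              AND prev.date <= cur.date) AS running_amount
    FROM t AS cur
) AS r
WHERE r.running_amount - r.amount < 350
  AND r.running_amount >= 350
ORDER BY r.customer_id
"""
result = conn.execute(query).fetchall()
print(result)
```

The condition running_amount - amount < 350 is what picks out the crossing row: before that row the total was still under the threshold.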
ANSI SQL
Used for the test: TSQL and MS SQL Server 2012
select
"CUSTOMER_ID",
min("DATE")
FROM
(
select
"CUSTOMER_ID",
"DATE",
(
SELECT
sum(T02."AMOUNT") AMOUNT
FROM "TABLE01" T02
WHERE
T01."CUSTOMER_ID" = T02."CUSTOMER_ID"
AND T02."DATE" <= T01."DATE"
) "AMOUNT"
from "TABLE01" T01
) T03
where
T03."AMOUNT" > 350
group by
"CUSTOMER_ID"
GO
CUSTOMER_ID | (No column name)
----------: | :------------------
1 | 03/02/2017 00:00:00
2 | 01/02/2017 00:00:00
SELECT
tmp.`CUSTOMER_ID`,
MIN(tmp.`DATE`) as MAX_DATE
FROM
(
SELECT
`DATE`,
`CUSTOMER_ID`,
`AMOUNT`,
(
SELECT SUM(`AMOUNT`) FROM tbl t2 WHERE t2.`DATE` <= t1.`DATE` AND `CUSTOMER_ID` = t1.`CUSTOMER_ID`
) AS SUM_UP
FROM
`tbl` t1
ORDER BY
`DATE` ASC
) tmp
WHERE
tmp.`SUM_UP` > 350
GROUP BY
tmp.`CUSTOMER_ID`
Explanation:
For each row, a correlated subquery sums the AMOUNT of all rows for the same customer whose DATE is on or before the current row's DATE, which gives a running total. From this derived table I select the MIN date whose running total is greater than 350.
I don't think this is an easy calculation, so I want to work through it step by step, even if the result looks a little convoluted. Once the approach works for your scenario, I believe its performance can be improved. If anybody can improve my query, please edit my post.
Unfortunately I cannot try this on a computer right now, but I expect the solution below to give you the expected result:
-- [TABLE] stands for your own table name
-- Get the start date of each customer
SELECT CUSTOMER_ID
,MIN(DATE) AS DATE
INTO #table
FROM [TABLE]
GROUP BY CUSTOMER_ID
-- For each customer and date, sum the amounts from the start date up to that date
SELECT t1.CUSTOMER_ID
,(SELECT SUM(t3.AMOUNT) FROM [TABLE] t3
WHERE t3.CUSTOMER_ID = t1.CUSTOMER_ID
AND t3.DATE BETWEEN t1.DATE AND t2.DATE) AS total
,t2.DATE AS DATE
INTO #tableCalculated
FROM #table t1
INNER JOIN [TABLE] t2 ON t1.CUSTOMER_ID = t2.CUSTOMER_ID
-- Select the earliest date per customer on which the total exceeds 350
SELECT CUSTOMER_ID, MIN(DATE) AS DATE
FROM #tableCalculated
WHERE total > 350
GROUP BY CUSTOMER_ID
SELECT CUSTOMER_ID, MIN(DATE) AS GOALDATE
FROM ( SELECT cd1.*, (SELECT SUM(AMOUNT)
FROM CustData cd2
WHERE cd2.CUSTOMER_ID = cd1.CUSTOMER_ID
AND cd2.DATE <= cd1.DATE) AS RUNNINGTOTAL
FROM CustData cd1) AS custdata2
WHERE RUNNINGTOTAL >= 350
GROUP BY CUSTOMER_ID

SQL Join two tables by unrelated date

I’m looking to join two tables that do not have a common data point, but common value (date). I want a table that lists the date and total number of hired/terminated employees on that day. Example is below:
Table 1
Hire Date Employee Number Employee Name
--------------------------------------------
5/5/2018 10078 Joe
5/5/2018 10077 Adam
5/5/2018 10078 Steve
5/8/2018 10079 Jane
5/8/2018 10080 Mary
Table 2
Termination Date Employee Number Employee Name
----------------------------------------------------
5/5/2018 10010 Tony
5/6/2018 10025 Jonathan
5/6/2018 10035 Mark
5/8/2018 10052 Chris
5/9/2018 10037 Sam
Desired result:
Date Total Hired Total Terminated
--------------------------------------
5/5/2018 3 1
5/6/2018 0 2
5/7/2018 0 0
5/8/2018 2 1
5/9/2018 0 1
Getting the total count is easy; I'm just unsure of the best approach for "adding" a date column.
If you need all dates within some window then you need to join the data to a calendar. You can then left join the per-day counts from each table to it.
DECLARE @StartDate DATETIME = (SELECT MIN(ActionDate) FROM(SELECT ActionDate = MIN(HireDate) FROM Table1 UNION SELECT ActionDate = MIN(TerminationDate) FROM Table2)AS X)
DECLARE @EndDate DATETIME = (SELECT MAX(ActionDate) FROM(SELECT ActionDate = MAX(HireDate) FROM Table1 UNION SELECT ActionDate = MAX(TerminationDate) FROM Table2)AS X)
;WITH AllDates AS
(
SELECT CalendarDate=@StartDate
UNION ALL
SELECT DATEADD(DAY, 1, CalendarDate)
FROM AllDates
WHERE DATEADD(DAY, 1, CalendarDate) <= @EndDate
)
SELECT
D.CalendarDate,
TotalHired = COALESCE(H.TotalHired, 0),
TotalTerminated = COALESCE(T.TotalTerminated, 0)
FROM
AllDates D
-- Aggregate each table per day before joining: joining the raw rows of both tables
-- on the same date would pair every hire with every termination and inflate both counts.
LEFT OUTER JOIN (SELECT HireDate, TotalHired = COUNT(*) FROM Table1 GROUP BY HireDate) H ON H.HireDate = D.CalendarDate
LEFT OUTER JOIN (SELECT TerminationDate, TotalTerminated = COUNT(*) FROM Table2 GROUP BY TerminationDate) T ON T.TerminationDate = D.CalendarDate
/* If you only want dates with data points then uncomment the where clause
WHERE
NOT (H.HireDate IS NULL AND T.TerminationDate IS NULL)
*/
ORDER BY
D.CalendarDate
I would do this with a union all and aggregations:
select dte, sum(is_hired) as num_hired, sum(is_termed) as num_termed
from (select hiredate as dte, 1 as is_hired, 0 as is_termed from table1
union all
select terminationdate, 0 as is_hired, 1 as is_termed from table2
) ht
group by dte
order by dte;
This does not include the "missing" dates. If you want those, a calendar or recursive CTE works. For instance:
with ht as (
select dte, sum(is_hired) as num_hired, sum(is_termed) as num_termed
from (select hiredate as dte, 1 as is_hired, 0 as is_termed from table1
union all
select terminationdate, 0 as is_hired, 1 as is_termed from table2
) ht
group by dte
),
d as (
select min(dte) as dte, max(dte) as max_dte
from ht
union all
select dateadd(day, 1, dte), max_dte
from d
where dte < max_dte
)
select d.dte, coalesce(ht.num_hired, 0) as num_hired, coalesce(ht.num_termed, 0) as num_termed
from d left join
ht
on d.dte = ht.dte
order by d.dte;
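As a rough check of the calendar approach, here is a sketch of the same idea in SQLite via Python's sqlite3 module. It uses SQLite's WITH RECURSIVE and date(x, '+1 day') in place of the SQL Server syntax, keeps only the date columns of the two tables, and assumes ISO-formatted dates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (hiredate TEXT);
CREATE TABLE table2 (terminationdate TEXT);
INSERT INTO table1 VALUES
    ('2018-05-05'), ('2018-05-05'), ('2018-05-05'), ('2018-05-08'), ('2018-05-08');
INSERT INTO table2 VALUES
    ('2018-05-05'), ('2018-05-06'), ('2018-05-06'), ('2018-05-08'), ('2018-05-09');
""")

# ht: per-day counts from both tables. d: every day between the first and last event.
query = """
WITH RECURSIVE ht AS (
    SELECT dte, SUM(is_hired) AS num_hired, SUM(is_termed) AS num_termed
    FROM (SELECT hiredate AS dte, 1 AS is_hired, 0 AS is_termed FROM table1
          UNION ALL
          SELECT terminationdate, 0, 1 FROM table2) x
    GROUP BY dte
),
d(dte, max_dte) AS (
    SELECT MIN(dte), MAX(dte) FROM ht
    UNION ALL
    SELECT date(dte, '+1 day'), max_dte FROM d WHERE dte < max_dte
)
SELECT d.dte, COALESCE(ht.num_hired, 0), COALESCE(ht.num_termed, 0)
FROM d LEFT JOIN ht ON d.dte = ht.dte
ORDER BY d.dte
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

The row for 2018-05-07 comes only from the generated calendar, which is why both counts need COALESCE to turn the missing match into 0.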
Try this one
SELECT ISNULL(a.THE_DATE, b.THE_DATE) as Date,
ISNULL(a.Total_Hire,0) as Total_Hire,
ISNULL (b.Total_Terminate,0) as Total_terminate
FROM (SELECT Hire_date as the_date, COUNT(1) as Total_Hire
FROM TABLE_HIRE GROUP BY HIRE_DATE) a
FULL OUTER JOIN (SELECT Termination_Date as the_date, COUNT(1) as Total_Terminate
FROM TABLE_TERMINATE GROUP BY Termination_Date) b
ON a.the_date = b.the_date

SQL subtract from different table

I have two different queries from two tables. The first query I have is:
select sum(total_amount) as total_amount, supplier_name
from tbL_supplierAccountLedger
where DATE >= '2017-01-01' and DATE <= '2017-12-31' group by supplier_name
The output of this is
Total Amount | Supplier name
4000 A
5000 B
8000 C
9000 D
Here is my other query, against a different table:
SELECT SUM(RET_AMOUNT)as returnamount, SUPPLIER_NAME
FROM tbl_PurchaseReturns
where CAST(date as DATE) >= '2017-01-01' and
CAST(date as DATE) <= '2017-12-31'
group by SUPPLIER_NAME
The output of this is
Return Amount | Supplier name
1000 A
2000 B
500 C
I want a query that automatically subtracts table B from table A.
Below is the expected output.
total amount | Supplier Name
3000 A
3000 B
7500 C
9000 D
Use a derived table that unions both results, with the RET_AMOUNT of tbl_PurchaseReturns as a negative value, and finally group by supplier_name:
SELECT SUM(total_amount), supplier_name
FROM
(
SELECT sum(total_amount) as total_amount, supplier_name
from tbL_supplierAccountLedger
where DATE >= '2017-01-01' and DATE <= '2017-12-31'
group by supplier_name
UNION ALL
SELECT SUM(-RET_AMOUNT) as returnamount, supplier_name
FROM tbl_PurchaseReturns
where CAST(date as DATE) >= '2017-01-01'
and CAST(date as DATE) <= '2017-12-31'
group by supplier_name
) AS D
GROUP BY supplier_name
Do the JOINs
SELECT s.supplier_name,
s.total_amount - coalesce(r.returnamount, 0) as amount from
(
SELECT supplier_name , SUM(total_amount) as total_amount
FROM tbL_supplierAccountLedger
WHERE ...
GROUP BY supplier_name
)s LEFT JOIN (
SELECT SUPPLIER_NAME , SUM(RET_AMOUNT)as returnamount
FROM tbl_PurchaseReturns
WHERE ...
GROUP BY SUPPLIER_NAME
) r on r.SUPPLIER_NAME= s.supplier_name
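Here is a small sketch of the LEFT JOIN version with the date filters filled in, run through Python's sqlite3 module. The individual rows are invented so that they add up to the totals shown in the question, and the column layout is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbL_supplierAccountLedger (supplier_name TEXT, total_amount INTEGER, date TEXT);
CREATE TABLE tbl_PurchaseReturns (supplier_name TEXT, ret_amount INTEGER, date TEXT);
INSERT INTO tbL_supplierAccountLedger VALUES
    ('A', 1500, '2017-02-01'), ('A', 2500, '2017-06-01'),
    ('B', 5000, '2017-03-01'), ('C', 8000, '2017-04-01'),
    ('D', 9000, '2017-05-01'), ('D', 100, '2016-12-31');
INSERT INTO tbl_PurchaseReturns VALUES
    ('A', 1000, '2017-07-01'), ('B', 2000, '2017-08-01'), ('C', 500, '2017-09-01');
""")

# Aggregate each table per supplier first, then subtract.
# COALESCE turns a missing return (supplier D) into 0.
query = """
SELECT s.supplier_name,
       s.total_amount - COALESCE(r.returnamount, 0) AS amount
FROM (
    SELECT supplier_name, SUM(total_amount) AS total_amount
    FROM tbL_supplierAccountLedger
    WHERE date >= '2017-01-01' AND date <= '2017-12-31'
    GROUP BY supplier_name
) s LEFT JOIN (
    SELECT supplier_name, SUM(ret_amount) AS returnamount
    FROM tbl_PurchaseReturns
    WHERE date >= '2017-01-01' AND date <= '2017-12-31'
    GROUP BY supplier_name
) r ON r.supplier_name = s.supplier_name
ORDER BY s.supplier_name
"""
result = conn.execute(query).fetchall()
print(result)
```

Supplier D has no returns, so without COALESCE the subtraction would yield NULL instead of 9000.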

How do I get the highest sum per day for last X days?

This is probably an easy one, but for the life of me I can't seem to figure it out.
Here is my table:
Date User Amount
---------- ----- ------
01/01/2010 User1 2
01/01/2010 User2 2
01/01/2010 User1 4
01/01/2010 User2 1
01/02/2010 User2 2
01/02/2010 User1 2
01/02/2010 User2 4
01/02/2010 User2 1
And so on for the past several months. I need to get the following results:
Date User Amount
---------- ----- ------
01/01/2010 User1 6
01/02/2010 User2 7
Basically, the user with Max(SUM(Amount)) for each day.
I would appreciate any hints you guys can offer.
Thanks.
SELECT MAX(amt),`Date`,`User` FROM
(SELECT SUM(`Amount`) as amt,`Date`,`User` .... GROUP BY `Date`,`User`) t
GROUP BY `Date`
select t.*
from (
select Date, Max(Amount) as MaxAmount
from MyTable
group by Date
) tm
inner join MyTable t on tm.Date = t.Date and tm.MaxAmount = t.Amount
Note: this will give you both user records if there are two users with the same max amount on a given day.
I actually ended up going with the following:
WITH ranked AS
(
SELECT ROW_NUMBER() OVER (ORDER BY SUM(Amount), Date, User) as 'rank', SUM(Amount) AS Amount, User, Date FROM MyTable GROUP BY Date, User
)
SELECT Date, User, Amount
FROM ranked
WHERE rank IN ( select MAX(rank) from ranked group by Date)
ORDER BY Date DESC
This can be less verbose with RANK ... OVER, but the following is the straightforward solution:
WITH summary_user_date
AS (SELECT Date, User, SUM(Amount) AS SumAmount
FROM MyTable
GROUP BY Date, User
)
, summary_date
AS (SELECT Date, MAX(SumAmount) AS SumAmount
FROM summary_user_date
GROUP BY Date
)
SELECT summary_user_date.*
FROM summary_user_date
INNER JOIN summary_date
ON summary_date.Date = summary_user_date.Date
AND summary_date.SumAmount = summary_user_date.SumAmount
It should be mentioned that if more than one user has the same maximum amount, all of them will be shown. If this is not desired, then a RANK-based solution should be used instead.
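To illustrate the tie behaviour, here is a minimal sketch of the two-CTE query run through Python's sqlite3 module. The first two days are the question's data; the third day is a made-up addition where both users total 5.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (Date TEXT, User TEXT, Amount INTEGER)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?)",
    [
        ("2010-01-01", "User1", 2), ("2010-01-01", "User2", 2),
        ("2010-01-01", "User1", 4), ("2010-01-01", "User2", 1),
        ("2010-01-02", "User2", 2), ("2010-01-02", "User1", 2),
        ("2010-01-02", "User2", 4), ("2010-01-02", "User2", 1),
        # Hypothetical third day with a tie: both users total 5.
        ("2010-01-03", "User1", 5), ("2010-01-03", "User2", 3),
        ("2010-01-03", "User2", 2),
    ],
)

# Sum per (Date, User), take the maximum sum per Date, then join back.
query = """
WITH summary_user_date AS (
    SELECT Date, User, SUM(Amount) AS SumAmount
    FROM MyTable
    GROUP BY Date, User
),
summary_date AS (
    SELECT Date, MAX(SumAmount) AS SumAmount
    FROM summary_user_date
    GROUP BY Date
)
SELECT s.Date, s.User, s.SumAmount
FROM summary_user_date s
INNER JOIN summary_date m
    ON m.Date = s.Date AND m.SumAmount = s.SumAmount
ORDER BY s.Date, s.User
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

The tied day returns two rows, one per user, which is exactly the behaviour described above.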
Using CTEs you could do something like:
With DailyTotals As
(
Select [Date], [User], Sum(Amount) As Total
From #Test
Group By [Date], [User]
)
Select [Date],[User],Total
From DailyTotals As DT
Where Total = (
Select Max(Total)
From DailyTotals As DT1
Where DT1.[Date] = DT.[Date]
)
Order By DT.[Date]
A non-CTE solution would be:
Select [Date],[User],Total
From (
Select [Date], [User], Sum(Amount) As Total
From #Test
Group By [Date], [User]
) As DT
Where DT.Total = (
Select Max(DT1.Total)
From (
Select [Date], [User], Sum(Amount) As Total
From #Test
Group By [Date], [User]
) As DT1
Where DT1.[Date] = DT.[Date]
)
Order By DT.[Date]