SQL Join Not Returning What I Expect - sql

I am building a query off a temp table in the following format. It contains a record for every CustomerID, Year, and Month for several years.
#T
Customer | CustomerID | Year | Month
ex.
Foo | 12345 | 2008 | 12
Foo | 12345 | 2008 | 11
Bar | 11224 | 2007 | 7
When I join this temp table to another table of the following format I get many more results than I am expecting.
Event
EventID | CustomerID | DateOpened
ex.
1100 | 12345 | '2008-12-11 10:15:43'
1100 | 12345 | '2008-12-11 11:25:17'
I am trying to get a result set of the count of events along with the Customer, Year, and Month like this.
SELECT COUNT(EventID), Customer, Year, Month
FROM [Event]
JOIN #T ON [Event].CustomerID = #T.CustomerID
WHERE [Event].DateOpened BETWEEN '2008-12-01' AND '2008-12-31'
GROUP BY Customer, Year, Month
ORDER BY Year, Month
I am getting a record for every Year and Month instead of only for December 2008.

You're specifying the date on the event table but not on the join -- so it's joining all records from the temp table with a matching customerid.
Try this:
SELECT COUNT(e.EventID), T.Customer, T.Year, T.Month
FROM [Event] e
INNER JOIN #T T ON (
T.CustomerID = e.CustomerID and
T.Year = year(e.DateOpened) and
T.Month = month(e.DateOpened)
)
WHERE T.Year = 2008
and T.Month = 12
GROUP BY T.Customer, T.Year, T.Month
ORDER BY T.Year, T.Month
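To see the fan-out for yourself, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for SQL Server. That is an assumption on my part: the temp table becomes a plain table named T, and strftime replaces YEAR() and MONTH(). The rows are the sample rows from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T (Customer TEXT, CustomerID INTEGER, Year INTEGER, Month INTEGER);
    CREATE TABLE Event (EventID INTEGER, CustomerID INTEGER, DateOpened TEXT);
    INSERT INTO T VALUES ('Foo', 12345, 2008, 12), ('Foo', 12345, 2008, 11), ('Bar', 11224, 2007, 7);
    INSERT INTO Event VALUES (1100, 12345, '2008-12-11 10:15:43'), (1100, 12345, '2008-12-11 11:25:17');
""")

# Join on CustomerID only: every event pairs with every (Year, Month) row of that customer.
loose = conn.execute("""
    SELECT COUNT(e.EventID), T.Customer, T.Year, T.Month
    FROM Event e
    JOIN T ON e.CustomerID = T.CustomerID
    GROUP BY T.Customer, T.Year, T.Month
    ORDER BY T.Year, T.Month
""").fetchall()

# Join on CustomerID plus the event's own year and month: one matching row per event.
strict = conn.execute("""
    SELECT COUNT(e.EventID), T.Customer, T.Year, T.Month
    FROM Event e
    JOIN T ON T.CustomerID = e.CustomerID
          AND T.Year  = CAST(strftime('%Y', e.DateOpened) AS INTEGER)
          AND T.Month = CAST(strftime('%m', e.DateOpened) AS INTEGER)
    GROUP BY T.Customer, T.Year, T.Month
    ORDER BY T.Year, T.Month
""").fetchall()

print(loose)   # the December events are also counted under November
print(strict)  # only the December 2008 row remains
```

With the loose join the two December events show up under both 2008-11 and 2008-12, which is the extra-rows symptom described in the question.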

Perhaps what you mean is:
SELECT COUNT(EventID)
,Customer
,Year
,Month
FROM [Event]
INNER JOIN #T
ON [Event].CustomerID = #T.CustomerID
AND YEAR([Event].DateOpened) = #T.YEAR
AND MONTH([Event].DateOpened) = #T.MONTH
WHERE [Event].DateOpened >= '2008-12-01'
AND [Event].DateOpened < '2009-01-01'
GROUP BY Customer
,Year
,Month
ORDER BY Year
,Month
Note, I've fixed another latent bug in your code: your BETWEEN is going to exclude datetimes like '2008-12-31 10:15:43', because the upper bound '2008-12-31' means midnight at the start of that day. You can use this half-open range or a similar technique.
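A small sketch of the BETWEEN issue, again assuming Python's sqlite3 as a stand-in. SQLite compares these values as text rather than as datetimes, but the outcome is the same as in SQL Server: anything after midnight on the last day falls outside the BETWEEN range. The three rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Event (EventID INTEGER, DateOpened TEXT)")
conn.executemany("INSERT INTO Event VALUES (?, ?)", [
    (1, "2008-12-01 00:00:00"),
    (2, "2008-12-31 10:15:43"),   # last day of the month, after midnight
    (3, "2009-01-01 00:00:00"),   # first instant of the next month
])

# Closed range: the upper bound stops at the start of December 31.
between_ids = [row[0] for row in conn.execute("""
    SELECT EventID FROM Event
    WHERE DateOpened BETWEEN '2008-12-01' AND '2008-12-31'
    ORDER BY EventID
""")]

# Half-open range: everything from December 1 up to, but not including, January 1.
half_open_ids = [row[0] for row in conn.execute("""
    SELECT EventID FROM Event
    WHERE DateOpened >= '2008-12-01' AND DateOpened < '2009-01-01'
    ORDER BY EventID
""")]

print(between_ids)
print(half_open_ids)
```

Event 2 is missing from the BETWEEN result and present in the half-open one, while event 3 is excluded by both.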

The problem is that there are two rows in #T with CustomerID = 12345. Each of those rows joins with each of the rows in Event. If you only want the CustomerID in December, then you need to filter #T too:
SELECT COUNT(EventID), Customer, Year, Month
FROM [Event]
JOIN #T ON [Event].CustomerID = #T.CustomerID
WHERE [Event].DateOpened BETWEEN '2008-12-01' AND '2008-12-31'
AND #T.Year = 2008
AND #T.Month = 12
GROUP BY Customer, Year, Month
ORDER BY Year, Month
If you have some other expectation, you'd better clarify your question.

It seems that #T has redundant information if you only want the customer name.
You can solve it with a subquery, taking the year and month from the event date itself:
SELECT COUNT(EventID), (select TOP 1 #T.Customer from #T Where #T.CustomerID = [Event].CustomerID ) AS Customer, YEAR(DateOpened) AS [Year], MONTH(DateOpened) AS [Month]
FROM [Event]
WHERE [Event].DateOpened BETWEEN '2008-12-01' AND '2008-12-31'
GROUP BY CustomerID, YEAR(DateOpened), MONTH(DateOpened)
ORDER BY [Year], [Month]

Related

How to get value from a query of another table to create a new column (postgresql)

I am new to Postgres and I want to set a value to Y if an order (order table) is a first month order (first month order table).
The first month order table is shown below. It only contains the first order each user placed in the month:
customer_id | order_date | order_id
--------------------------------------------------
a1 | December 6, 2015, 8:30 PM | orderA1
order table is as per below. It shows all the order records:
customer_id | order_date | order_id
-----------------------------------------------------
a1 | December 6, 2020, 8:30 PM | orderA1
a1 | December 7, 2020, 8:30 PM | orderA2
a2 | December 11, 2020, 8:30 PM | orderA3
To get the first month order column in the order table, I tried using CASE as below, but it fails with the error "more than one row returned by a subquery".
SELECT DISTINCT ON (order_id) order_id, customer_id,
(CASE when (select distinct order_id from first_month_order_table) = order_id then 'Y' else 'N'
END)
FROM order_table
ORDER BY order_id;
I also tried using count, but I understand that this is quite inefficient and, I think, overworks the database.
SELECT DISTINCT ON (order_id) order_id, customer_id,
(CASE when (select count order_id from first_month_order_table) then 'Y' else 'N'
END)
FROM order_table
ORDER BY order_id;
How can I determine if the order is first month order and set the value as Y for every order in the order table efficiently?
Use the left join as follows:
SELECT o.order_id, o.customer_id,
CASE when f.order_id is not null then 'Y' else 'N' END as flag
FROM order_table o left join first_month_order_table f
on f.order_id = o.order_id
ORDER BY o.order_id;
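Here is a runnable sketch of the left join approach, assuming Python's sqlite3 in place of Postgres (the join and CASE syntax are the same in both). The order ids come from the question; the dates are rewritten in ISO form for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE order_table (customer_id TEXT, order_date TEXT, order_id TEXT);
    CREATE TABLE first_month_order_table (customer_id TEXT, order_date TEXT, order_id TEXT);
    INSERT INTO order_table VALUES
        ('a1', '2020-12-06 20:30', 'orderA1'),
        ('a1', '2020-12-07 20:30', 'orderA2'),
        ('a2', '2020-12-11 20:30', 'orderA3');
    INSERT INTO first_month_order_table VALUES
        ('a1', '2020-12-06 20:30', 'orderA1');
""")

# An order with no partner row in the first-month table gets NULLs from the
# left join, so the NULL check turns directly into the Y/N flag.
flags = conn.execute("""
    SELECT o.order_id, o.customer_id,
           CASE WHEN f.order_id IS NOT NULL THEN 'Y' ELSE 'N' END AS flag
    FROM order_table o
    LEFT JOIN first_month_order_table f ON f.order_id = o.order_id
    ORDER BY o.order_id
""").fetchall()

for row in flags:
    print(row)
```

Only orderA1 is flagged Y here because it is the only order present in the first-month table.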
If you have all orders in the orders table, you don't need the second table. Just use window functions. The following returns a boolean, which I find much more convenient than a character flag:
select o.*,
(row_number() over (partition by customer_id, date_trunc('month', order_date) order by order_date) = 1) as flag
from orders o;
If you want a character flag, then you need case:
select o.*,
(case when row_number() over (partition by customer_id, date_trunc('month', order_date) order by order_date) = 1
then 'Y' else 'N'
end) as flag
from orders o;

Count by week between dates

I'm trying to show a count by week, but I am unsure how to include the weeks between effdate and expdat that currently don't show up. How do I show the week and count shown below? Thanks.
You could use a recursive query to enumerate the weeks, then join it with the table
with cte as (
select min(effweek) week, max(expweek) max_week from mytable
union all
select week + 1, max_week from cte where week < max_week
)
select c.week, count(t.id_num) cnt
from cte c
left join mytable t on c.week between t.effweek and t.expweek
group by c.week
order by c.week
(Simplified) demo on DB Fiddle:
week | cnt
---: | --:
12 | 2
13 | 1
14 | 1
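A minimal sketch of the recursive approach, assuming Python's sqlite3 (which needs the RECURSIVE keyword, as Postgres does). The two rows are invented so that week 13 has no coverage at all; they are not the data behind the demo output above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (id_num INTEGER, effweek INTEGER, expweek INTEGER);
    -- Hypothetical rows: nothing is active during week 13.
    INSERT INTO mytable VALUES (1, 12, 12), (2, 14, 15);
""")

weekly = conn.execute("""
    WITH RECURSIVE cte(week, max_week) AS (
        SELECT MIN(effweek), MAX(expweek) FROM mytable
        UNION ALL
        SELECT week + 1, max_week FROM cte WHERE week < max_week
    )
    SELECT c.week, COUNT(t.id_num) AS cnt
    FROM cte c
    LEFT JOIN mytable t ON c.week BETWEEN t.effweek AND t.expweek
    GROUP BY c.week
    ORDER BY c.week
""").fetchall()

print(weekly)
```

Because the weeks come from the generated series and not from the table, week 13 still appears, with a count of 0 (COUNT(t.id_num) skips the NULLs produced by the left join).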

How to query total sum of sales by month by teams?

What is the best way to get the best selling teams by each month when I have tables like these:
The results should be something like this (group by total price of orders):
Month | Team | Sales
____________________
March | 2 | 3453
April | 3 | 1353
May | 2 | 5341
I have joined two tables before but for some reason joining 4 tables and grouping them by month seems difficult.
Thank you.
In Postgres, you can use distinct on -- if you want exactly one row per month:
select distinct on (date_trunc('month', o.Created))
date_trunc('month', o.Created) as month, e.Team_nr, sum(p.price) as Sales
from employee e join
orders o
on e.id = o.employee_ID join
products p
on p.id = o.product_id
group by e.Team_nr, date_trunc('month', o.Created)
order by date_trunc('month', o.Created), sum(p.price) desc
This should do it. I added the year too. It uses a CTE
with cte as
(
select Team_nr, sum(price) as Sales, date_part('month', Created) as _Month, date_part('year', Created) as _Year
from employee e
inner join orders o on e.id = o.employee_ID
inner join products p on p.id = o.product_id
Group by Team_nr, date_part('month', Created), date_part('year', Created)
)
select Team_nr, Sales, _Month, _Year
from cte a
where not exists(select 1 from cte b where
a._Month = b._Month and a._Year = b._Year and a.Team_nr <> b.Team_nr and a.Sales < b.Sales )
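Here is a small runnable sketch of the "no other team sold more that month" idea, assuming Python's sqlite3 instead of Postgres, so strftime('%Y-%m', ...) stands in for date_part. The table and column names follow the queries in this thread; the rows are made up, since the original tables are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (id INTEGER, Team_nr INTEGER);
    CREATE TABLE products (id INTEGER, price INTEGER);
    CREATE TABLE orders (id INTEGER, employee_ID INTEGER, product_id INTEGER, Created TEXT);
    -- Hypothetical sample data.
    INSERT INTO employee VALUES (1, 2), (2, 3);
    INSERT INTO products VALUES (1, 100), (2, 250);
    INSERT INTO orders VALUES
        (1, 1, 1, '2021-03-05'), (2, 2, 2, '2021-03-10'),
        (3, 1, 2, '2021-04-02'), (4, 1, 1, '2021-04-03'), (5, 2, 1, '2021-04-09');
""")

best = conn.execute("""
    WITH cte AS (
        SELECT e.Team_nr, strftime('%Y-%m', o.Created) AS month, SUM(p.price) AS Sales
        FROM employee e
        JOIN orders o ON e.id = o.employee_ID
        JOIN products p ON p.id = o.product_id
        GROUP BY e.Team_nr, strftime('%Y-%m', o.Created)
    )
    SELECT a.month, a.Team_nr, a.Sales
    FROM cte a
    -- Keep a team only if no team has higher sales in the same month.
    WHERE NOT EXISTS (
        SELECT 1 FROM cte b WHERE b.month = a.month AND b.Sales > a.Sales
    )
    ORDER BY a.month
""").fetchall()

print(best)
```

Unlike distinct on, this keeps every tied team when two teams share the top sales figure for a month.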

Query two unbalanced tables

Sum across two tables returns unwanted Sum from one table multiplied by the number of rows in the other
I have 1 table with Actual results recorded by date and the other tables contains planned results recorded by month.
Table 1(Actual)
Date Location Amount
01/01/2019 Loc1 1000
01/02/2019 Loc1 700
01/01/2019 Loc2 7500
01/02/2019 Loc2 1000
02/01/2019 Loc1 500
Table 2(Plan)
Year Month Location Amount
2019 1 Loc1 1500
2019 1 Loc2 8000
2019 2 Loc1 800
I have tried various joins using YEAR(Table1.date) and MONTH(Table1.date) and grouping by
MONTH(Table1.Date), but I keep running into the same problem: the PlanAmount is multiplied by the number of matching rows in the Actual table.
In the example of Loc1 for month 1 below, I get
Year Month Location PlanAmount ActualAmount
2019 1 Loc1 3000 1700
I would like to return the below
Year Month Location PlanAmount ActualAmount
2019 1 Loc1 1500 1700
2019 1 Loc2 8000 8500
2019 2 Loc1 800 500
Thanks in advance for any help
D
You can do this with a full join or union all/group by:
select yyyy, mm, location,
sum(actual_amount) as actual_amount,
sum(plan_amount) as plan_amount
from ((select year(date) as yyyy, month(date) as mm, location,
sum(amount) as actual_amount, 0 as plan_amount
from actual
group by year(date), month(date), location
) union all
(select year, month, location,
0 as actual_amount, sum(amount) as plan_amount
from [Plan]
group by year, month, location
)
) ap
group by yyyy, mm, location;
This ensures that you have rows, even when there are no matches in the other table.
To get the required results, first group the actuals table by year of date, month of date, and location, selecting the sum of amount for each group. Then join that result to the plan table:
SELECT
plans.year,
plans.month,
plans.location,
plans.plan_amount,
grouped_results.actual_amount
FROM plans
INNER JOIN (
SELECT
datepart(year, date) AS year,
datepart(month, date) AS month,
location,
SUM(amount) AS actual_amount
FROM actuals
GROUP BY datepart(year, date), datepart(month, date), location
) as grouped_results
ON
grouped_results.year = plans.year AND
grouped_results.month = plans.month AND
grouped_results.location = plans.location
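To check the aggregate-first idea against the question's numbers, here is a sketch assuming Python's sqlite3 in place of SQL Server, with strftime standing in for datepart. The tables are named actuals and plans, the plan column is called plan_amount, and the dates are read as month/day/year; all of that is my reading of the sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE actuals (date TEXT, location TEXT, amount INTEGER);
    CREATE TABLE plans (year INTEGER, month INTEGER, location TEXT, plan_amount INTEGER);
    INSERT INTO actuals VALUES
        ('2019-01-01', 'Loc1', 1000), ('2019-01-02', 'Loc1', 700),
        ('2019-01-01', 'Loc2', 7500), ('2019-01-02', 'Loc2', 1000),
        ('2019-02-01', 'Loc1', 500);
    INSERT INTO plans VALUES
        (2019, 1, 'Loc1', 1500), (2019, 1, 'Loc2', 8000), (2019, 2, 'Loc1', 800);
""")

YEAR = "CAST(strftime('%Y', a.date) AS INTEGER)"
MONTH = "CAST(strftime('%m', a.date) AS INTEGER)"

# Naive: join first, sum afterwards. Each plan row repeats once per actual row.
naive = conn.execute(f"""
    SELECT p.year, p.month, p.location, SUM(p.plan_amount), SUM(a.amount)
    FROM plans p
    JOIN actuals a ON {YEAR} = p.year AND {MONTH} = p.month AND a.location = p.location
    GROUP BY p.year, p.month, p.location
    ORDER BY p.year, p.month, p.location
""").fetchall()

# Aggregate the actuals to one row per month and location, then join one-to-one.
fixed = conn.execute(f"""
    SELECT p.year, p.month, p.location, p.plan_amount, g.actual_amount
    FROM plans p
    JOIN (
        SELECT {YEAR} AS year, {MONTH} AS month, a.location, SUM(a.amount) AS actual_amount
        FROM actuals a
        GROUP BY {YEAR}, {MONTH}, a.location
    ) g ON g.year = p.year AND g.month = p.month AND g.location = p.location
    ORDER BY p.year, p.month, p.location
""").fetchall()

print(naive)
print(fixed)
```

The naive query reproduces the doubled 3000 for Loc1 in month 1, while the pre-aggregated join returns the 1500 / 1700 row the question asks for.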
I think the problem is that you are using sum(PlanTable.Amount) when grouping. Try using max(PlanTable.Amount) instead.
select
p.Year,
p.Month,
p.Location,
sum(a.Amount) as actual_amount,
max(p.Amount) as plan_amount
from
[Plan] p left join Actual a
on year(a.date) = p.year
and month(a.date) = p.Month
and a.Location = p.Location
group by
p.year,
p.month,
p.Location
Get the year and month from the date and use them in the join. Most DBMSs have year and month functions you can use:
select year(t1.date) yr, month(t1.date) as monthofyr, t1.Location,
sum(t1.amount) as actual_amount,
max(t2.amount) as planamount -- max, not sum: the single plan row repeats once per actual row
from table1 t1 left join table2 t2 on
month(t1.date) = t2.Month and t1.Location = t2.Location
and year(t1.date) = t2.year
group by year(t1.date), month(t1.date), t1.Location

Sum Values From Specific Group of Rows - SQL

I am trying to sum all Sales/TXNs and count the distinct IDs for the whole month, and not just the individual row, in which the rank is "1". So for customer "ABC", I want to retrieve all their data for Jan, and for customer "DEF" I want all their data for Feb.
Below is an example table as well as what my desired result set would be (apologies for the formatting).
Sales Table:
Month|ID |Dte |TXNs|Sales|Rank
Jan |ABC|1-5-17|1 |$15 |1
Jan |ABC|1-8-17|2 |$10 |2
Feb |ABC|2-6-17|1 |$20 |3
Feb |DEF|2-6-17|2 |$10 |1
Mar |DEF|3-5-17|1 |$40 |2
May |DEF|5/2/17|3 |$20 |3
Desired Answer:
Month|IDs|TXNs|Sales
Jan |1 |3 |$25
Feb |1 |2 |$10
You can use group by with an in clause on the (month, ID) pairs that have rank 1:
select month, count(distinct ID), sum(TXNs), sum(sales)
from my_table where ( month, ID ) in (
select distinct Month, ID
from my_table
where rank = 1
)
group by month
I think the IDs you listed in your table aren't right? Should the first row in your result be ABC and the second result be DEF?
Anyhow, I think this should work:
select month, ID, sum(TXNs), sum(SALES)
from SALES_TABLE
where
(
ID='ABC'
and MONTH='Jan'
)
or (
ID='DEF'
and MONTH='Feb'
)
group by ID, MONTH
edit: I missed the count part. How's this?
select month, count(ID), sum(TXNs), sum(SALES)
from SALES_TABLE
where rank = 1
group by month
Count Distinct should get you what you're looking for:
SELECT t.Month, COUNT(DISTINCT t.ID) AS UniqueIds, SUM(t.TXNs) AS Transactions, SUM(t.Sales) AS TotalSales
FROM t
INNER JOIN (
SELECT Month, ID FROM t WHERE Rank = 1
) firstRank ON t.ID = firstRank.ID AND t.Month = firstRank.Month
GROUP BY t.Month
It's hard to follow your description, but this seems to match:
select Month
,count(*) -- number of IDs
,sum(sumTXN)
,sum(sumSales)
from
(
select Month, ID, sum(TXNs) as sumTXN, sum(Sales) as sumSales
from tab
group by Month, ID
having min(Rank) = 1 -- only those IDs with a Rank = 1 within this month
) as dt
group by Month
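As a quick check of the having min(Rank) = 1 approach against the question's sample rows, here is a sketch assuming Python's sqlite3. The table is called tab as in the query above, Rank is quoted to be safe, and the dollar amounts are stored as plain integers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE tab (Month TEXT, ID TEXT, Dte TEXT, TXNs INTEGER, Sales INTEGER, "Rank" INTEGER)')
conn.executemany("INSERT INTO tab VALUES (?, ?, ?, ?, ?, ?)", [
    ("Jan", "ABC", "1-5-17", 1, 15, 1),
    ("Jan", "ABC", "1-8-17", 2, 10, 2),
    ("Feb", "ABC", "2-6-17", 1, 20, 3),
    ("Feb", "DEF", "2-6-17", 2, 10, 1),
    ("Mar", "DEF", "3-5-17", 1, 40, 2),
    ("May", "DEF", "5-2-17", 3, 20, 3),
])

rows = conn.execute("""
    SELECT Month, COUNT(*), SUM(sumTXN), SUM(sumSales)
    FROM (
        SELECT Month, ID, SUM(TXNs) AS sumTXN, SUM(Sales) AS sumSales
        FROM tab
        GROUP BY Month, ID
        HAVING MIN("Rank") = 1   -- keep a (Month, ID) group only if it holds the rank-1 row
    ) AS dt
    GROUP BY Month
""").fetchall()

# Index by month so the check does not depend on row order.
by_month = {month: (ids, txns, sales) for month, ids, txns, sales in rows}
print(by_month)
```

For the sample data this gives Jan with 1 ID, 3 TXNs and 25 in sales, and Feb with 1 ID, 2 TXNs and 10 in sales, matching the desired answer in the question.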