Left and right joining in a query - sql

A friend asked me for help on building a query that would show how many pieces of each model were sold on each day of the month, showing zeros when no pieces were sold for a particular model on a particular day, even if no items of any model are sold on that day. I came up with the query below, but it isn't working as expected. I'm only getting records for the models that have been sold, and I don't know why.
select days_of_months.`Date`,
m.NAME as "Model",
count(t.ID) as "Count"
from MODEL m
left join APPLIANCE_UNIT a on (m.ID = a.MODEL_FK and a.NUMBER_OF_UNITS > 0)
left join NEW_TICKET t on (a.NEW_TICKET_FK = t.ID and t.TYPE = 'SALES'
and t.SALES_ORDER_FK is not null)
right join (select date(concat(2009,'-',temp_months.id,'-',temp_days.id)) as "Date"
from temp_months
inner join temp_days on temp_days.id <= temp_months.last_day
where temp_months.id = 3 -- March
) days_of_months on date(t.CREATION_DATE_TIME) =
date(days_of_months.`Date`)
group by days_of_months.`Date`,
m.ID, m.NAME
I had created the temporary tables temp_months and temp_days in order to get all the days for any month. I am using MySQL 5.1, but I am trying to make the query ANSI-compliant.

You should CROSS JOIN your dates and models so that you have exactly one record for each day-model pair no matter what, and then LEFT JOIN other tables:
SELECT days_of_months.`Date`, m.NAME, COUNT(t.ID)
FROM (
    SELECT ...
) AS days_of_months
CROSS JOIN
    MODEL m
LEFT JOIN
    APPLIANCE_UNIT a
    ON a.MODEL_FK = m.ID
    AND a.NUMBER_OF_UNITS > 0
LEFT JOIN
    NEW_TICKET t
    ON t.ID = a.NEW_TICKET_FK
    AND t.TYPE = 'SALES'
    AND t.SALES_ORDER_FK IS NOT NULL
    AND t.CREATION_DATE_TIME >= days_of_months.`Date`
    AND t.CREATION_DATE_TIME < days_of_months.`Date` + INTERVAL 1 DAY
GROUP BY
    days_of_months.`Date`, m.ID, m.NAME
The way you do it now, you get NULLs in the model columns for the days that have no sales, and those rows are all grouped together.
Note the JOIN condition:
AND t.CREATION_DATE_TIME >= days_of_months.`Date`
AND t.CREATION_DATE_TIME < days_of_months.`Date` + INTERVAL 1 DAY
instead of
DATE(t.CREATION_DATE_TIME) = DATE(days_of_months.`Date`)
This helps keep your query sargable (able to use an index on CREATION_DATE_TIME), because the column is compared directly instead of being wrapped in DATE().
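To make the CROSS JOIN idea concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than MySQL. The tables `days`, `model`, and `sale` and their columns are simplified stand-ins invented for this illustration (not the asker's schema), and `date(d.day, '+1 day')` is SQLite's spelling of the `+ INTERVAL 1 DAY` range check.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified schema for illustration only.
    CREATE TABLE days  (day TEXT PRIMARY KEY);
    CREATE TABLE model (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE sale  (id INTEGER PRIMARY KEY, model_id INTEGER, created_at TEXT);

    INSERT INTO days  VALUES ('2009-03-01'), ('2009-03-02');
    INSERT INTO model VALUES (1, 'A'), (2, 'B');
    -- Only model A sells, and only on March 1st.
    INSERT INTO sale  VALUES (1, 1, '2009-03-01 09:30:00'),
                             (2, 1, '2009-03-01 17:45:00');
""")

# Every (day, model) pair exists thanks to the CROSS JOIN; the LEFT JOIN
# then attaches sales where they exist, and COUNT(s.id) ignores the NULLs.
rows = conn.execute("""
    SELECT d.day, m.name, COUNT(s.id) AS sold
    FROM days d
    CROSS JOIN model m
    LEFT JOIN sale s
           ON s.model_id = m.id
          AND s.created_at >= d.day
          AND s.created_at <  date(d.day, '+1 day')
    GROUP BY d.day, m.name
    ORDER BY d.day, m.name
""").fetchall()

for row in rows:
    print(row)
```

Each day-model pair appears exactly once, with a count of 0 wherever nothing was sold, including on days with no sales at all.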

You need to use outer joins, as they do not require each record in the two joined tables to have a matching record.
http://dev.mysql.com/doc/refman/5.1/en/join.html

You're looking for an OUTER join. A left outer join produces a row for every record on the left side of the join, even if the right side has no record to join with. A right outer join does the same in the opposite direction: it produces a row for every record in the right-side table even if the left side has no corresponding record. Any column projected from the side that has no matching record will be NULL in the join result.
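A small sketch of that behaviour, again with Python's sqlite3 module. The `customer` and `orders` tables are made up for the example; the point is only to compare an inner join with a left outer join on the same data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables for illustration only.
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders   (id INTEGER PRIMARY KEY, customer_id INTEGER, total INTEGER);
    INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO orders   VALUES (10, 1, 50);   -- Bob has no orders
""")

# Inner join: only customers that have a matching order.
inner_rows = conn.execute("""
    SELECT c.name, o.total
    FROM customer c
    JOIN orders o ON o.customer_id = c.id
    ORDER BY c.id
""").fetchall()

# Left outer join: every customer, with NULL (None in Python) for the
# order columns when there is no match.
left_rows = conn.execute("""
    SELECT c.name, o.total
    FROM customer c
    LEFT JOIN orders o ON o.customer_id = c.id
    ORDER BY c.id
""").fetchall()

print(inner_rows)
print(left_rows)
```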

Related

Why is the SQL full outer join not presenting unmatched customers (avc_id)?

I appreciate your help in advance!
The right table avc_enr has 108K customers (b.avc_id) in it. In the 2nd table (alias a), we have about 97K customers (a.avc_id).
I tried to use right, left and full outer join but every time the count of customers shows 97K rather than 108K customers (under Total_users)... any idea why with full outer join the count function is not counting all customers even if no common match is found between two tables?
with avc_enr as
(
select
dt, avc_id, service_template_name
from
hive.thor_satellite.v_nms_inventory_nmsdb_avc_service
where
current_status = 'ACTIVE' and dt = 20220809
)
select
a.dt, a.metrics_date,
avg(a.vsat_fl_byte_count_kbps) as AUPU_Kbps,
count(b.avc_id) as Total_users
from
hive.thor_satellite.vda_satellite_nms_performance_smts_avc_pm_throughput a
full outer join
avc_enr b on a.avc_id = b.avc_id and a.dt = b.dt
where
a.dt = 20220809
group by
a.dt, a.metrics_date

Left join in view with condition on the left table but still want all records

I have a date dimension table from which I want to left join to another table in order to show records (worker_availability) that exist for dates for the next couple weeks, for instance.
Date Dimension simply has every date for the next hundred years.
SELECT
dd.date_actual,
wa.worker_id,
string_agg(sh.name, ', ')
FROM public.worker_availability wa
LEFT JOIN public.d_date dd on wa.day = dd.date_actual --and wa.worker_id = '00000000-0000-0000-0000-000000000000'
LEFT JOIN public.shift sh on sh.shift_id = wa.shift_id
where
wa.worker_id = '00000000-0000-0000-0000-000000000000' AND
dd.date_actual >= NOW()
GROUP BY dd.date_actual, wa.worker_id
ORDER BY dd.date_actual asc
LIMIT 100
If I violate a principle of left join and use a where, then the results are incorrect.
If I add WHERE (worker_id = '00000000-0000-0000-0000-000000000000' OR wa.* IS NULL) then the results are still incorrect.
I want to see every date regardless of if there is a worker_availability record for the date.
The issue is that if I put the worker_id filter in the left join condition (which works), then I can no longer make this query into a view and use it from EntityFramework, because I can no longer filter on the worker_id column in a where clause.

Joining 3 tables on 2 columns?

I've created 3 views with identical columns: Quantity, Year, and Variety. I want to join all three views on year and variety in order to do some calculations with quantities.
The problem is that a particular year/variety combo does not occur in every view.
I've tried queries like:
SELECT
*
FROM
a
left outer join
b
on a.variety = b.variety
left outer join
c
on a.variety = c.variety or b.variety = c.variety
WHERE
a.year = '2015'
and b.year = '2015'
and c.year = '2015'
Obviously this isn't the right solution. Ideally I'd like to join on both year and variety and not use a where statement at all.
The desired output would put all quantities for a matching year and variety on the same line, regardless of null values in any one view.
I really appreciate the help, thanks.
You want a full outer join, not a left join, like so:
Select coalesce(a.year, b.year, c.year) as Year
, coalesce(a.variety, b.variety, c.variety) as Variety
, a.Quantity as QuantityA, b.Quantity as QuantityB, c.Quantity as QuantityC
from tableA a
full outer join tableB b
on a.variety = b.variety
and a.year = b.year
full outer join tableC c
on coalesce(a.variety, b.variety) = c.variety
and coalesce(a.year, b.year) = c.year
where coalesce(a.year, b.year, c.year) = 2015
The left join you are using won't pick up values from b or c that aren't in a. Additionally, your where clause is dropping rows that don't have values in all three tables (because the year in those rows is null, which is not equal to 2015). The full outer join will grab rows from either table in the join, regardless of whether the other table contains a match.
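The point about the where clause can be checked with a small experiment. The sketch below uses Python's sqlite3 module and a plain LEFT JOIN instead of FULL OUTER JOIN (older SQLite builds do not support the latter); the NULL logic is the same for any outer join. The tables `a` and `b` and their rows are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical data: 'gala' exists in a but not in b.
    CREATE TABLE a (year INTEGER, variety TEXT, quantity INTEGER);
    CREATE TABLE b (year INTEGER, variety TEXT, quantity INTEGER);
    INSERT INTO a VALUES (2015, 'fuji', 10), (2015, 'gala', 20);
    INSERT INTO b VALUES (2015, 'fuji', 5);
""")

# Filtering the outer-joined table in WHERE: for 'gala', b.year is NULL,
# NULL = 2015 is not true, and the row is discarded after the join.
where_rows = conn.execute("""
    SELECT a.variety, a.quantity, b.quantity
    FROM a
    LEFT JOIN b ON a.variety = b.variety
    WHERE a.year = 2015 AND b.year = 2015
    ORDER BY a.variety
""").fetchall()

# Putting the year match in the ON clause keeps the unmatched row.
on_rows = conn.execute("""
    SELECT a.variety, a.quantity, b.quantity
    FROM a
    LEFT JOIN b ON a.variety = b.variety AND a.year = b.year
    WHERE a.year = 2015
    ORDER BY a.variety
""").fetchall()

print(where_rows)
print(on_rows)
```

The first query behaves like an inner join; the second keeps 'gala' with a NULL quantity from `b`.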

SQL server SELECT with join performance issue

Sorry about the saga here but am trying to explain everything.
We have 2 databases that I would like to join some tables in.
One database holds sales data from various stores/sites. It is quite large (over 3 million rows currently); the table in question is ItemSales.
The other holds application data from an in house web app. These tables are Departments and GroupItems
I would like to create a query that joins 2 tables from the app database with the sales database table. This is so we can group some items together for a date range and see the amount sold for example.
My first attempt was (DealId being the variable that it is grouped on in the App):
SELECT d.Id, d.ItemNo, d.UnitValue, d.NoGST, d.ItemStartDate, d.ItemEndDate,
(SELECT SUM(ItemQty) AS Expr1
FROM Sales.dbo.ItemSales AS s
WHERE (Store = d.SiteId) AND (ItemNo = d.ItemNo) AND (ItemSaleDate >= d.ItemStartDate) AND (ItemSaleDate <= d.ItemEndDate)) AS ItemsSold, Sales.dbo.ItemSales.ItemDesc, Departments.Description
FROM Departments INNER JOIN
Sales.dbo.ItemSales ON Departments.Id = Sales.dbo.ItemSales.ItemDept RIGHT OUTER JOIN
GroupItems AS d ON Sales.dbo.ItemSales.ItemNo = d.ItemNo
WHERE (d.DealId = 11)
GROUP BY d.Id, d.ItemNo, d.UnitValue, d.NoGST, d.ItemStartDate, d.ItemEndDate, ItemDesc, Departments.Description, d.SiteId
ORDER BY d.Id
This does exactly what I want which is:
- Gives me all the details from the GroupItems table (UnitValue, ItemStartDate, ItemEndDate etc)
- Gives me the SUM() on the ItemQty column for the amount sold (plus the description etc)
- Returns NULL for something with no sales for the period
It is VERY slow though. To the point that if the GroupItems table has more than about 7 items in it, it times out.
Second attempt has been:
SELECT d.Id, d.ItemNo, d.UnitValue, d.NoGST, d.ItemStartDate, d.ItemEndDate, SUM(ItemQty) AS ItemsSold, Sales.dbo.ItemSales.ItemDesc, Departments.Description
FROM Departments INNER JOIN
Sales.dbo.ItemSales ON Departments.Id = Sales.dbo.ItemSales.ItemDept RIGHT OUTER JOIN
GroupItems AS d ON Sales.dbo.ItemSales.ItemNo = d.ItemNo
WHERE (Store = d.SiteId) AND (d.DealId = 11) AND (Sales.dbo.ItemSales.ItemSaleDate >= d.ItemStartDate) AND (Sales.dbo.ItemSales.ItemSaleDate <= d.ItemEndDate)
GROUP BY d.Id, d.ItemNo, d.UnitValue, d.NoGST, d.ItemStartDate, d.ItemEndDate, ItemDesc, Departments.Description
ORDER BY d.Id
This is very quick and does not time out but does not return the NULLs for no sales items in the ItemSales table. This is a problem as we need to see nothing or 0 for a no sales item otherwise people will think we forgot to check that item.
Can someone help me come up with a query please that returns everything from the GroupItems table, shows the SUM() of items sold and doesn't time out? I have also tried a SELECT x WHERE EXISTS (Subquery) but this also didn't return the NULLs for me but I may have had that one wrong.
If you want everything from GroupItems regardless of the sales, use it as the base of the query and then use left outer joins from there. Something along these lines:
SELECT GroupItems.Id, GroupItems.ItemNo, GroupItems.UnitValue, GroupItems.NoGST,
    GroupItems.ItemStartDate, GroupItems.ItemEndDate,
    Sales.ItemDesc,
    SUM(Sales.ItemQty) AS SumOfSales,
    Departments.Description
FROM GroupItems
LEFT OUTER JOIN #tempSales AS Sales ON
    Sales.ItemNo = GroupItems.ItemNo
    AND Sales.Store = GroupItems.SiteId
    AND Sales.ItemSaleDate >= GroupItems.ItemStartDate
    AND Sales.ItemSaleDate <= GroupItems.ItemEndDate
LEFT OUTER JOIN Departments ON Departments.Id = Sales.ItemDept
WHERE GroupItems.DealId = 11
-- Group only by the non-aggregated columns; SUM() does not belong in GROUP BY.
GROUP BY GroupItems.Id, GroupItems.ItemNo, GroupItems.UnitValue, GroupItems.NoGST,
    GroupItems.ItemStartDate, GroupItems.ItemEndDate,
    Sales.ItemDesc,
    Departments.Description
ORDER BY GroupItems.Id
Does changing the INNER JOIN to Sales.dbo.ItemSales into a LEFT OUTER JOIN to Sales.dbo.ItemSales and changing the RIGHT OUTER JOIN to GroupItems into an INNER JOIN to GroupItems fix your issue?
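As a rough check of the "GroupItems as the base table" approach, the sketch below reproduces the shape of the problem with Python's sqlite3 module. The table and column names (`group_items`, `item_sales`, and so on) are simplified, hypothetical versions of the ones in the question, and the COALESCE column is an optional extra for showing 0 instead of NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified schema for illustration only.
    CREATE TABLE group_items (id INTEGER PRIMARY KEY, item_no TEXT,
                              start_date TEXT, end_date TEXT);
    CREATE TABLE item_sales  (item_no TEXT, qty INTEGER, sale_date TEXT);

    INSERT INTO group_items VALUES (1, 'X', '2020-01-01', '2020-01-31'),
                                   (2, 'Y', '2020-01-01', '2020-01-31');
    INSERT INTO item_sales  VALUES ('X', 3, '2020-01-10'),
                                   ('X', 4, '2020-01-20'),
                                   ('X', 9, '2020-02-05'),  -- outside the range
                                   ('Y', 2, '2020-03-01');  -- outside the range
""")

# The date-range conditions live in the ON clause, so an item with no
# sales in its period still produces one row, with SUM(...) = NULL.
rows = conn.execute("""
    SELECT g.id,
           g.item_no,
           SUM(s.qty)              AS sold,
           COALESCE(SUM(s.qty), 0) AS sold_or_zero
    FROM group_items g
    LEFT JOIN item_sales s
           ON s.item_no   = g.item_no
          AND s.sale_date >= g.start_date
          AND s.sale_date <= g.end_date
    GROUP BY g.id, g.item_no
    ORDER BY g.id
""").fetchall()

for row in rows:
    print(row)
```

Item Y has sales, but none inside its date range, so it is reported with NULL (or 0 via COALESCE) rather than disappearing from the result.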

SQL Table A Left Join Table B And top of table B

I'm working myself into an SQL frenzy; hopefully someone out there can help!
I've got 2 tables which are basically Records and Outcomes, I want to join the 2 tables together, count the number of outcomes per record (0 or more) which I've got quite easily with:
Select records.Id, (IsNull(Count(outcomes.Id),0)) as outcomes
from records
Left Join
outcomes
on records.Id = outcomes.Id
group by
records.Id
The outcomes table also has a timestamp in it. What I want to do is include the last outcome in my result set, but if I add that to my query it generates a record for every combination of records to outcomes.
Can any SQL expert point me in the right direction?
Cheers,
try:
SELECT
dt.Id, dt.outcomes,MAX(o.YourTimestampColumn) AS LastOne
FROM (SELECT --basically your original query, just indented differently
records.Id, (ISNULL(COUNT(outcomes.Id),0)) AS outcomes
from records
LEFT JOIN outcomes ON records.Id = outcomes.Id
GROUP BY records.Id
) dt
LEFT JOIN outcomes o ON dt.Id = o.Id --LEFT rather than INNER, so records with 0 outcomes are kept
GROUP BY dt.Id, dt.outcomes
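An alternative worth considering is to compute the count and the latest timestamp in the same grouped LEFT JOIN, without a derived table. The sketch below tries this with Python's sqlite3 module. The `record_id` and `created_at` columns are assumptions made for the example; the question does not show the real column names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema: outcomes.record_id points at records.id.
    CREATE TABLE records  (id INTEGER PRIMARY KEY);
    CREATE TABLE outcomes (id INTEGER PRIMARY KEY, record_id INTEGER,
                           created_at TEXT);
    INSERT INTO records  VALUES (1), (2);
    INSERT INTO outcomes VALUES (1, 1, '2020-01-01 10:00:00'),
                                (2, 1, '2020-01-03 08:00:00');
""")

# COUNT(o.id) skips NULLs, so a record with no outcomes gets 0;
# MAX(o.created_at) is NULL for that record.
rows = conn.execute("""
    SELECT r.id,
           COUNT(o.id)       AS outcomes,
           MAX(o.created_at) AS last_outcome
    FROM records r
    LEFT JOIN outcomes o ON o.record_id = r.id
    GROUP BY r.id
    ORDER BY r.id
""").fetchall()

for row in rows:
    print(row)
```

Because both values are aggregates over the same group, there is still exactly one row per record.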