Need to understand specific LEFT OUTER JOIN behavior in SQL SELECT - sql

I have two tables, transactions and dates. One date may have one or more transactions. I need to get a list of dates with or without transactions of a specific account (account number 111 below).
select d.the_date, t.account, t.amount from dates as d
LEFT OUTER JOIN transactions as t ON t.tx_date=d.the_date
where t.account=111 AND d.the_date>='2016-01-02' and d.the_date<='2017-12-30'
order by d.the_date;
The issue is that when I include t.account=111 in the WHERE condition, I don't get the dates on which account 111 did NOT make any transactions.
Only when I remove t.account=111 from the condition do I get the dates with no transactions (i.e. the LEFT OUTER JOIN works). Why does this happen?

Conditions on the second table need to go into the on clause:
select d.the_date, t.account, t.amount
from dates d left join
transactions t
on t.tx_date = d.the_date and t.account = 111
where d.the_date >= '2016-01-02' and d.the_date <= '2017-12-30'
order by d.the_date;
Otherwise, for dates with no matching transaction, t.account is NULL in the where clause, so t.account = 111 is not true and those rows are filtered out, turning the outer join into an inner join.
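To see the difference concretely, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows are made up for illustration, and SQLite stands in for whatever database the question uses; only the placement of t.account = 111 differs between the two queries.

```python
import sqlite3

# In-memory database with hypothetical sample data (not from the question).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dates (the_date TEXT PRIMARY KEY);
    CREATE TABLE transactions (tx_date TEXT, account INTEGER, amount REAL);
    INSERT INTO dates VALUES ('2016-01-02'), ('2016-01-03'), ('2016-01-04');
    INSERT INTO transactions VALUES
        ('2016-01-02', 111, 10.0),
        ('2016-01-03', 222, 20.0);
""")

# Filter in WHERE: rows where t.account is NULL (or another account) are dropped.
where_filter = conn.execute("""
    SELECT d.the_date, t.account, t.amount
    FROM dates AS d
    LEFT JOIN transactions AS t ON t.tx_date = d.the_date
    WHERE t.account = 111
    ORDER BY d.the_date
""").fetchall()

# Filter in ON: every date survives; non-matching dates get NULLs for t.*.
on_filter = conn.execute("""
    SELECT d.the_date, t.account, t.amount
    FROM dates AS d
    LEFT JOIN transactions AS t
        ON t.tx_date = d.the_date AND t.account = 111
    ORDER BY d.the_date
""").fetchall()

print(where_filter)
print(on_filter)
```

With this data the WHERE version returns only the one date on which account 111 has a transaction, while the ON version returns all three dates.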

Related

Full Outer Join Not including all records from both sides

I have a calendar table with dates from 2000 through 2029. I have done a full outer join with my data table on date=tdate. My goal is to count trips occurring by date and to show a zero if there are no trips. However, all dates are not showing up. If there is no trip date in the data table, the date from the Calendar table does not show up at all. I've never seen this happen before. I've tried changing it to a left outer join which didn't work either. Actual query below.
SELECT DISTINCT c.DATE,
COALESCE(COUNT(t.TripID), 0) AS tripCount
FROM dbo.Calendar c
FULL OUTER JOIN dbo.TripsMade t ON c.DATE = CONVERT(DATE, t.tripdate, 126)
WHERE (c.DATE BETWEEN '2019-09-01' AND '2019-09-30' )
AND (t.compid = 270 OR t.compid IS NULL)
GROUP BY c.DATE
ORDER BY c.DATE
If I select all dates from the Calendar Table, they are all there. There are dates missing from the Data Table, and always will have some missing. I need all dates to show up on this report and show a zero if there are no associated records on the Data Table side.
What am I missing here?
Goal
DATE TOTAL_TRIPS
2019-09-01 3
2019-09-02 5
2019-09-03 0 <== This row is currently missing completely
2019-09-04 4
2019-09-05 0 <== This row is currently missing completely
2019-09-06 9
Your WHERE clause ends up filtering non-matches out from both tables, turning them almost into INNER JOINs (it is a little complicated with the condition on t).
I don't think you need a FULL JOIN. I suspect you want a LEFT JOIN with filtering (for the second table) in the ON clause:
SELECT c.Date, COUNT(t.TripID) AS tripCount
FROM dbo.Calendar c LEFT OUTER JOIN
dbo.TripsMade t
ON c.date = CONVERT(DATE, t.tripdate, 126) AND
t.compid = 270
WHERE c.Date BETWEEN '2019-09-01' AND '2019-09-30'
GROUP BY c.Date
ORDER BY c.Date;
If compid can actually take on NULL values and you want them, then use (t.compid = 270 OR t.compid IS NULL). I suspect, though, that is not what you want. The NULL just represents a non-match on the JOIN.
Notes:
SELECT DISTINCT is almost never used with GROUP BY, and it is not needed in this case.
COUNT() does not return NULL, so there is no need for COALESCE().
Filtering on the first table does belong in the WHERE clause.
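As a rough illustration of why dates go missing, here is a sketch in Python with sqlite3. The table and column names (calendar, trips, cal_date, comp_id) and the rows are hypothetical stand-ins for the tables in the question. A date whose only trips belong to another company is joined to a row with a non-NULL compid, so the "= 270 OR IS NULL" test in WHERE removes it entirely.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE calendar (cal_date TEXT PRIMARY KEY);
    CREATE TABLE trips (trip_id INTEGER, trip_date TEXT, comp_id INTEGER);
    INSERT INTO calendar VALUES ('2019-09-01'), ('2019-09-02'), ('2019-09-03');
    INSERT INTO trips VALUES
        (1, '2019-09-01', 270),
        (2, '2019-09-01', 270),
        (3, '2019-09-02', 999);  -- a different company's trip
""")

# Company filter in WHERE: 2019-09-02 matched a comp_id 999 row, so it vanishes.
where_version = conn.execute("""
    SELECT c.cal_date, COUNT(t.trip_id)
    FROM calendar c
    LEFT JOIN trips t ON t.trip_date = c.cal_date
    WHERE t.comp_id = 270 OR t.comp_id IS NULL
    GROUP BY c.cal_date
    ORDER BY c.cal_date
""").fetchall()

# Company filter in ON: every calendar date is kept, and
# COUNT(t.trip_id) counts only non-NULL values, giving 0 for unmatched dates.
on_version = conn.execute("""
    SELECT c.cal_date, COUNT(t.trip_id)
    FROM calendar c
    LEFT JOIN trips t
        ON t.trip_date = c.cal_date AND t.comp_id = 270
    GROUP BY c.cal_date
    ORDER BY c.cal_date
""").fetchall()

print(where_version)
print(on_version)
```

Note that COUNT(t.trip_id) already yields 0 for dates with no match, which is why no COALESCE() is needed.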

Sum not selecting the values with Zero

I have two tables, CDMachine and Transaction.
CDMachine Table with columns CDMachineID, CDMachineName, InstallationDate
Transaction table with columns TransactionID,CDMachineID,TransactionTime,Amount
I am calculating revenue using the below query but it eliminates the machine without any transaction
SELECT CDMachine.CDMachineName,
SUM(Transaction.Amount)
FROM CDMachine
LEFT JOIN TRANSACTION ON CDMachine.CDMachineID = Transaction.CDMachineID
WHERE Transaction.TransactionTime BETWEEN '2019-01-01' AND '2019-01-31'
GROUP BY CDMachine.CDMachineName
ORDER BY 2
Move the WHERE condition to the ON clause:
select m.CDMachineName, sum(t.Amount)
from CDMachine m left join
Transaction t
on m.CDMachineID = t.CDMachineID and
t.TransactionTime between '2019-01-01' and '2019-01-31'
group by m.CDMachineName
order by 2;
The WHERE clause turns the outer join to an inner join -- meaning that you are losing the values that do not match.
If you want 0 rather than NULL for the sum, then use:
select m.CDMachineName, coalesce(sum(t.Amount), 0)
Even though you are using a LEFT JOIN, the fact that you have a filter on a column from the joined table causes rows that don't meet the join condition to be removed from the result set.
You need to apply the filter on transaction time to the transactions table, before joining it or as part of the join condition. I would do it like this:
SELECT CDMachine.CDMachineName,
SUM(Transaction.Amount)
FROM CDMachine
LEFT JOIN (
SELECT * FROM TRANSACTION
WHERE Transaction.TransactionTime BETWEEN '2019-01-01' AND '2019-01-31'
) AS Transaction
ON CDMachine.CDMachineID = Transaction.CDMachineID
GROUP BY CDMachine.CDMachineName
ORDER BY 2
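The two suggested fixes (filter in the ON clause, or filter in a derived table before joining) should give the same rows. Below is a small sketch in Python with sqlite3 that compares them. The table names machine and txn and the sample rows are assumptions made for the example; txn is used instead of Transaction only to avoid a reserved word.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE machine (machine_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE txn (txn_id INTEGER, machine_id INTEGER,
                      txn_time TEXT, amount REAL);
    INSERT INTO machine VALUES (1, 'A'), (2, 'B');
    INSERT INTO txn VALUES
        (1, 1, '2019-01-10', 5.0),
        (2, 1, '2019-01-20', 7.0),
        (3, 2, '2019-02-05', 9.0);  -- outside January
""")

# Date filter in WHERE: machine B has no January rows and disappears.
where_version = conn.execute("""
    SELECT m.name, SUM(t.amount)
    FROM machine m
    LEFT JOIN txn t ON t.machine_id = m.machine_id
    WHERE t.txn_time BETWEEN '2019-01-01' AND '2019-01-31'
    GROUP BY m.name
    ORDER BY m.name
""").fetchall()

# Date filter in ON: B is kept; COALESCE turns its NULL sum into 0.
on_version = conn.execute("""
    SELECT m.name, COALESCE(SUM(t.amount), 0)
    FROM machine m
    LEFT JOIN txn t
        ON t.machine_id = m.machine_id
       AND t.txn_time BETWEEN '2019-01-01' AND '2019-01-31'
    GROUP BY m.name
    ORDER BY m.name
""").fetchall()

# Date filter in a derived table that is joined afterwards.
subquery_version = conn.execute("""
    SELECT m.name, COALESCE(SUM(t.amount), 0)
    FROM machine m
    LEFT JOIN (
        SELECT * FROM txn
        WHERE txn_time BETWEEN '2019-01-01' AND '2019-01-31'
    ) AS t ON t.machine_id = m.machine_id
    GROUP BY m.name
    ORDER BY m.name
""").fetchall()

print(where_version, on_version, subquery_version)
```

Which form to prefer is mostly a matter of readability; the ON-clause form is shorter, while the derived table makes the "filter first, then join" order explicit.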

SQL Query to count the records

I am making up a SQL query which will get all the transaction types from one table, and from the other table it will count the frequency of that transaction type.
My query is this:
with CTE as
(
select a.trxType,a.created,b.transaction_key,b.description,a.mode
FROM transaction_data AS a with (nolock)
RIGHT JOIN transaction_types b with (nolock) ON b.transaction_key = a.trxType
)
SELECT COUNT (trxType) AS Frequency, description as trxType,mode
from CTE where created >='2017-04-11' and created <= '2018-04-13'
group by trxType ,description,mode
The transaction_types table contains all the types of transactions only and transaction_data contains the transactions which have occurred.
The problem I am facing is that even though it's the RIGHT join, it does not select all the records from the transaction_types table.
I need to select all the transactions from the transaction_types table and show the number of counts for each transaction, even if it's 0.
Please help.
LEFT JOIN is so much easier to follow.
I think you want:
select tt.transaction_key, tt.description, t.mode, count(t.trxType)
from transaction_types tt left join
transaction_data t
on tt.transaction_key = t.trxType and
t.created >= '2017-04-11' and t.created <= '2018-04-13'
group by tt.transaction_key, tt.description, t.mode;
Notes:
Use reasonable table aliases! a and b mean nothing. t and tt are abbreviations of the table name, so they are easier to follow.
t.mode will be NULL for non-matching rows.
The condition on dates needs to be in the ON clause. Otherwise, the outer join is turned into an inner join.
LEFT JOIN is easier to follow (at least for people whose native language reads left-to-right) because it means "keep all the rows in the table you have already read".

How To Display Customers Even If They Have No Sales (results)

I have code that pulls all of a rep's new customers and their year-to-date sales. The only problem is that it only pulls customers that have sales in the invdate range, but I need it to show all of the accounts, with a 0 if they do not have any sales. Is there any way to achieve this? I tried using COALESCE and it didn't seem to work. I also tried using left, right, and full outer joins. Any help would be appreciated!
select
a.Acctnum,
sum(a.invtotal) as total
from invoices a right join accounts b on a.acctnum = b.acctnum where
a.invdate between '1/1/2017' and '12/31/2017'
and a.sls = '78'
and b.sls = '78'
and b.activetype = 'y' and b.startdate > (getdate()-365)
group by a.acctnum
order by total desc
You are restricting your results in your WHERE clause AFTER you join your table causing records to drop. Instead, switch to a LEFT OUTER JOIN with your accounts table driving the query. Then restrict your invoices table in your ON clause so that invoices are dropped BEFORE you join.
SELECT b.acctnum,
COALESCE(sum(a.invtotal), 0) AS total
FROM accounts b
LEFT OUTER JOIN invoices a ON
a.acctnum = b.acctnum AND
--Put the restrictions on the invoices (right-hand) table here
--so they are applied BEFORE joining.
a.invdate BETWEEN '1/1/2017' AND '12/31/2017'
AND a.sls = '78'
WHERE
b.sls = '78'
AND b.activetype = 'y'
AND b.startdate > (getdate() - 365)
--Group by the accounts column: a.acctnum is NULL for accounts without invoices
GROUP BY b.acctnum
ORDER BY total DESC
It's a bit like doing a subquery on invoices before joining it in as the right-hand table. It's just easier to drop the conditions into the ON clause.
Your problem is that your WHERE clauses are changing the right join to an inner join. Put all the conditions that are aliased by a. into the ON clause.

Left and right joining in a query

A friend asked me for help on building a query that would show how many pieces of each model were sold on each day of the month, showing zeros when no pieces were sold for a particular model on a particular day, even if no items of any model are sold on that day. I came up with the query below, but it isn't working as expected. I'm only getting records for the models that have been sold, and I don't know why.
select days_of_months.`Date`,
m.NAME as "Model",
count(t.ID) as "Count"
from MODEL m
left join APPLIANCE_UNIT a on (m.ID = a.MODEL_FK and a.NUMBER_OF_UNITS > 0)
left join NEW_TICKET t on (a.NEW_TICKET_FK = t.ID and t.TYPE = 'SALES'
and t.SALES_ORDER_FK is not null)
right join (select date(concat(2009,'-',temp_months.id,'-',temp_days.id)) as "Date"
from temp_months
inner join temp_days on temp_days.id <= temp_months.last_day
where temp_months.id = 3 -- March
) days_of_months on date(t.CREATION_DATE_TIME) =
date(days_of_months.`Date`)
group by days_of_months.`Date`,
m.ID, m.NAME
I had created the temporary tables temp_months and temp_days in order to get all the days for any month. I am using MySQL 5.1, but I am trying to make the query ANSI-compliant.
You should CROSS JOIN your dates and models so that you have exactly one record for each day-model pair no matter what, and then LEFT JOIN other tables:
SELECT date, name, COUNT(t.id)
FROM (
SELECT ...
) AS days_of_months
CROSS JOIN
model m
LEFT JOIN
APPLIANCE_UNIT a
ON a.MODEL_FK = m.id
AND a.NUMBER_OF_UNITS > 0
LEFT JOIN
NEW_TICKET t
ON t.id = a.NEW_TICKET_FK
AND t.TYPE = 'SALES'
AND t.SALES_ORDER_FK IS NOT NULL
AND t.CREATION_DATE_TIME >= days_of_months.`Date`
AND t.CREATION_DATE_TIME < days_of_months.`Date` + INTERVAL 1 DAY
GROUP BY
date, name
The way you do it now, you get NULLs in m.ID and m.NAME for the days with no sales, and those rows are grouped together.
Note the JOIN condition:
AND t.CREATION_DATE_TIME >= days_of_months.`Date`
AND t.CREATION_DATE_TIME < days_of_months.`Date` + INTERVAL 1 DAY
instead of
DATE(t.CREATION_DATE_TIME) = DATE(days_of_months.`Date`)
This will help make your query sargable (able to use an index on t.CREATION_DATE_TIME).
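Here is a simplified sketch of the CROSS JOIN approach in Python with sqlite3. To keep it short, APPLIANCE_UNIT and NEW_TICKET are collapsed into one hypothetical sale table, the days table is filled by hand instead of from temp_months and temp_days, and SQLite's date(x, '+1 day') replaces MySQL's + INTERVAL 1 DAY. All names and rows are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE days (day TEXT PRIMARY KEY);
    CREATE TABLE model (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE sale (id INTEGER, model_id INTEGER, created_at TEXT);
    INSERT INTO days VALUES ('2009-03-01'), ('2009-03-02');
    INSERT INTO model VALUES (1, 'X'), (2, 'Y');
    INSERT INTO sale VALUES
        (1, 1, '2009-03-01 10:00:00'),
        (2, 1, '2009-03-01 15:30:00'),
        (3, 2, '2009-03-02 09:00:00');
""")

# CROSS JOIN builds one row per (day, model) pair first; the LEFT JOIN then
# attaches sales using a half-open range on the raw timestamp column.
rows = conn.execute("""
    SELECT d.day, m.name, COUNT(s.id)
    FROM days d
    CROSS JOIN model m
    LEFT JOIN sale s
        ON s.model_id = m.id
       AND s.created_at >= d.day
       AND s.created_at <  date(d.day, '+1 day')
    GROUP BY d.day, m.name
    ORDER BY d.day, m.name
""").fetchall()

for row in rows:
    print(row)
```

Every day-model pair appears exactly once, with a count of 0 where nothing was sold.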
You need to use outer joins, as they do not require each record in the two joined tables to have a matching record.
http://dev.mysql.com/doc/refman/5.1/en/join.html
You're looking for an OUTER join. A left outer join creates a result set with a record from the left side of the join even if the right side does not have a record to be joined with. A right outer join does the same on the opposite direction, creates a record for the right side table even if the left side does not have a corresponding record. Any column projected from the table that does not have a record will have a NULL value in the join result.