T-SQL Query Join on different column depending on case - sql

I have a legacy system that has a sales table and a customer table, CMS and CUST respectively. I need to query for the shipped to address based on different criteria. The customer table treats each address as its own customer. So if I have a billing address, then a shipping address, those will both be different CUSTNUM's. The CMS table has columns CUSTNUM and SHIPNUM. If the sales order uses the billing address as the shipping address, SHIPNUM = 0. If those 2 address are different, SHIPNUM = a different customer number than CUSTNUM. I'm trying to write a query that joins CUST to CMS based on the case of SHIPNUM being > 0 or not. My original query just used CUSTNUM, and ignored the SHIPNUM. My new query is syntactically correct and executes, but the row count returned is 2860 vs 3590 for the old query. The old join statement is just the commented out line :ON CMS.CUSTNUM = CUST.CUSTNUM.
from
KGI_LOTNOS as LOT
INNER JOIN CMS
ON LOT.ORDERNO = CMS.ORDERNO
JOIN CUST
ON CUST.CUSTNUM =
CASE
WHEN CMS.SHIPNUM > 0
THEN CMS.SHIPNUM
Else CMS.CUSTNUM
END
-- ON CMS.CUSTNUM = CUST.CUSTNUM
INNER JOIN COUNTRY as C
ON CUST.COUNTRY = C.COUNTRY
Here is an example from the CMS table;
CUSTNUM SHIPNUM ORDERNO
41863 77394 828509 <--Different billing and shipping address
43242 69291 776888 <--Different billing and shipping address
2356 0 765022 <--Same billing and shipping address
Any thoughts on how to make this work?
PS Here is the original query in its entirety.
select
CUST.CUSTNUM as Customer,
CMS.CUSTNUM,
CMS.SHIPNUM,
CUST.CTYPE,
CMS.ORDERNO,
CMS.ODR_DATE,
LTRIM(RTRIM(CUST.FIRSTNAME)) as First,
LTRIM(RTRIM(CUST.LASTNAME)) as Last,
LTRIM(RTRIM(CUST.COMPANY)) as Company,
LTRIM(RTRIM(CUST.PHONE)) as Phone,
LTRIM(RTRIM(CUST.EMAIL)) as Email,
LTRIM(RTRIM(CUST.ADDR)) as ADDR1,
LTRIM(RTRIM(CUST.ADDR2)) as ADDR2,
LTRIM(RTRIM(CUST.ADDR3)) as ADDR3,
LTRIM(RTRIM(CUST.CITY)) as City,
LTRIM(RTRIM(CUST.State)) as State,
LTRIM(RTRIM(CUST.ZIPCODE)) as Zip,
LTRIM(RTRIM(C.NAME)) as Country,
LOT.ITEMNO,
LOT.LOTNO,
COUNT(LOT.ITEMNO) as Quantity
from
KGI_LOTNOS as LOT
INNER JOIN CMS
ON LOT.ORDERNO = CMS.ORDERNO
LEFT JOIN CUST
ON CMS.CUSTNUM = CUST.CUSTNUM
INNER JOIN COUNTRY as C
ON CUST.COUNTRY = C.COUNTRY
where
(
CUST.CTYPE IN ('P','W','Z')
)
AND
(
LOT.LOTNO IN ('1000001','20001','300001')
)
GROUP BY
CMS.ORDERNO,
CUST.CUSTNUM,
CMS.CUSTNUM,
CMS.SHIPNUM,
CUST.CTYPE,
CUST.FIRSTNAME,
CMS.ODR_DATE,
CUST.LASTNAME,
CUST.COMPANY,
CUST.PHONE,
CUST.EMAIL,
CUST.ADDR,
CUST.ADDR2,
CUST.ADDR3,
LOT.ITEMNO,
CUST.CITY,
CUST.STATE,
CUST.ZIPCODE,
C.NAME,
LOT.LOTNO
ORDER BY
Customer,
CMS.ORDERNO,
LOT.ITEMNO,
LOT.LOTNO

If you use INNER JOIN you have risk to exclude raws which have no reference in another table. This could be caused by any of 2 another joins in your expression - comment them and try again. If you still receive less records you should check consistency of your data - one table has values which not correspond to values in another table.
BTW, I don't like CASE in JOIN expression simply because it looks ugly. What do you thinK about this expression which seemed to do the job too:
LEFT JOIN CUST
ON CUST.CUSTNUM = COALESCE(NULLIF(CMS.SHIPNUM, 0), CMS.CUSTNUM)

You could use a CTE like this.
WITH cte (ORDERNO, SHIPNUM) AS
(
SELECT ORDERNUM, SHIPNUM = CASE
WHEN CMS.SHIPNUM > 0
FROM CMS

Fewer records join using your altered criteria, there are some CMS.SHIPNUM values that don't have matching CUSTNUM in the CUST table.
To find the problematic entries change from INNER to OUTER join and add WHERE criteria, something like:
LEFT JOIN CUST
ON CUST.CUSTNUM = CASE WHEN CMS.SHIPNUM > 0 THEN CMS.SHIPNUM
ELSE CMS.CUSTNUM
END
WHERE CUST.CUSTNUM IS NULL
AND CMS.SHIPNUM > 0
Edit: You'll have to remove the INNER JOIN to COUNTRY to see the unmatched from your updated JOIN since it joins on a field from the customer table, and make sure to have the SHIPNUM field in your SELECT.

your query looks correct but not sure why it is not working, try left joinCUST table twiceone on shipping and the other on billing and then write the case statement for each customer column.
select
LTRIM(RTRIM(case when CMS.SHIPNUM > 0 THEN CUST.FIRSTNAME else CUST_BILL.FIRSTNAME end )) as First,
from
KGI_LOTNOS as LOT
INNER JOIN CMS
ON LOT.ORDERNO = CMS.ORDERNO
left JOIN CUST CUST
ON CUST.CUSTNUM = CMS.SHIPNUM
left JOIN CUST CUST_BILL
ON CMS.CUSTNUM = CUST_BILL.CUSTNUM
INNER JOIN COUNTRY as C
ON CUST.COUNTRY = C.COUNTRY
if it still outputs less rows then something else is wrong

Related

LEFT JOIN with condition in WHERE clause

SELECT
supplies.id,
supplierId,
supplies.date,
supplies.commodity,
supplier_payments.date AS paymentDate,
FROM
supplies
INNER JOIN suppliers ON suppliers.id = supplies.supplierId
LEFT JOIN supplier_payments ON supplier_payments.supplyId = supplies.id
WHERE supplier_payments.isDeleted = 0 AND supplierId = 1
What I am trying is to get all records from supplies table and related records from supplier_payments but the supplier_payments.isDeleted should be equal to 0. What happens now that I only get records from supplies that have at least one supplier payment because of the condition. Is there a way to get all supply records and supply payments with condition?
Consider moving the condition on the LEFT JOINed table to the ON clause of the JOIN:
SELECT
sr.id,
se.supplierId,
se.date,
se.commodity,
sp.date AS paymentDate,
FROM supplies se
INNER JOIN suppliers sr ON sr.id = se.supplierId
LEFT JOIN supplier_payments sp
ON sp.supplyId = se.id
AND sp.isDeleted = 0
WHERE se.supplierId = 1
Side notes:
in a multi-table query, always qualify each column with the table it belongs to, to make the query easier to follow and avoid ambiguity
table aliases make the query easier to read and write

SQL Get aggregate as 0 for non existing row using inner joins

I am using SQL Server to query these three tables that look like (there are some extra columns but not that relevant):
Customers -> Id, Name
Addresses -> Id, Street, StreetNo, CustomerId
Sales -> AddressId, Week, Total
And I would like to get the total sales per week and customer (showing at the same time the address details). I have come up with this query
SELECT a.Name, b.Street, b.StreetNo, c.Week, SUM (c.Total) as Total
FROM Customers a
INNER JOIN Addresses b ON a.Id = b.CustomerId
INNER JOIN Sales c ON b.Id = c.AddressId
GROUP BY a.Name, c.Week, b.Street, b.StreetNo
and even if my SQL skill are close to none it looks like it's doing its job. But now I would like to be able to show 0 whenever the one customer don't have sales for a particular week (weeks are just integers). And I wonder if somehow I should get distinct values of the weeks in the Sales table, and then loop through them (not sure how)
Any help?
Thanks
Use CROSS JOIN to generate the rows for all customers and weeks. Then use LEFT JOIN to bring in the data that is available:
SELECT c.Name, a.Street, a.StreetNo, w.Week,
COALESCE(SUM(s.Total), 0) as Total
FROM Customers c CROSS JOIN
(SELECT DISTINCT s.Week FROM sales s) w LEFT JOIN
Addresses a
ON c.CustomerId = a.CustomerId LEFT JOIN
Sales s
ON s.week = w.week AND s.AddressId = a.AddressId
GROUP BY c.Name, a.Street, a.StreetNo, w.Week;
Using table aliases is good, but the aliases should be abbreviations for the table names. So, a for Addresses not Customers.
You should generate a week numbers, rather than using DISTINCT. This is better in terms of performance and reliability. Then use a LEFT JOIN on the Sales table instead of an INNER JOIN:
SELECT a.Name
,b.Street
,b.StreetNo
,weeks.[Week]
,COALESCE(SUM(c.Total),0) as Total
FROM Customers a
INNER JOIN Addresses b ON a.Id = b.CustomerId
CROSS JOIN (
-- Generate a sequence of 52 integers (13 x 4)
SELECT ROW_NUMBER() OVER (ORDER BY a.x) AS [Week]
FROM (VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) a(x)
CROSS JOIN (SELECT x FROM (VALUES(1),(1),(1),(1)) b(x)) b
) weeks
LEFT JOIN Sales c ON b.Id = c.AddressId AND c.[Week] = weeek.[Week]
GROUP BY a.Name
,b.Street
,b.StreetNo
,weeks.[Week]
Please try the following...
SELECT Name,
Street,
StreetNo,
Week,
SUM( CASE
WHEN Total IS NULL THEN
0
ELSE
Total
END ) AS Total
FROM Customers a
JOIN Addresses b ON a.Id = b.CustomerId
RIGHT JOIN Sales c ON b.Id = c.AddressId
GROUP BY a.Name,
c.Week,
b.Street,
b.StreetNo;
I have modified your statement in three places. The first is I changed your join to Sales to a RIGHT JOIN. This will join as it would with an INNER JOIN, but it will also keep the records from the table on the right side of the JOIN that do not have a matching record or group of records on the left, placing NULL values in the resulting dataset's fields that would have come from the left of the JOIN. A LEFT JOIN works in the same way, but with any extra records in the table on the left being retained.
I have removed the word INNER from your surviving INNER JOIN. Where JOIN is not preceded by a join type, an INNER JOIN is performed. Both JOIN and INNER JOIN are considered correct, but the prevailing protocol seems to be to leave the INNER out, where the RDBMS allows it to be left out (which SQL-Server does). Which you go with is still entirely up to you - I have left it out here for illustrative purposes.
The third change is that I have added a CASE statement that tests to see if the Total field contains a NULL value, which it will if there were no sales for that Customer for that Week. If it does then SUM() would return a NULL, so the CASE statement returns a 0 instead. If Total does not contain a NULL value, then the SUM() of all values of Total for that grouping is performed.
Please note that I am assuming that Total will not have any NULL values other than from the RIGHT JOIN. Please advise me if this assumption is incorrect.
Please also note that I have assumed that either there will be no missing Weeks for a Customer in the Sales table or that you are not interested in listing them if there are. Again, please advise me if this assumption is incorrect.
If you have any questions or comments, then please feel free to post a Comment accordingly.

Excluding multiple results in specific column (SQL JOIN)

I'm taking my first steps in terms of practical SQL use in real life.
I have a few tables with contractual and financial information and the query works exactly as I need - to a certain point. It looks more or less like that:
SELECT /some columns/ from CONTRACTS
Linked 3 extra tables with INNER JOIN to add things like department names, product information etc. This all works but they all have simplish one-to-one relationship (one contract related to single department in Department table, one product information entry in the corresponding table etc).
Now this is my challenge:
I also need to add contract invoicing information doing something like:
inner join INVOICES on CONTRACTS.contnoC = INVOICES.contnoI
(and selecting also the Invoice number linked to the Contract number, although that's partly optional)
The problem I'm facing is that unlike with other tables where there's always one-to-one relationship when joining tables, INVOICES table can have multiple (or none at all) entries that correspond to a single contract no. The result is that I will get multiple query results for a single contract no (with different invoice numbers presented), needlessly crowding the query results.
Essentially I'm looking to add INVOICES table to a query to just identify if the contract no is present in the INVOICES table (contract has been invoiced or not). Invoice number itself could be presented (it is with INNER JOIN), however it's not critical as long it's somehow marked. Invoice number fields remains blank in the result with the INNER JOIN function, which is also necessary (i.e. to have the row presented even if the match is not found in INVOICES table).
SELECT DISTINCT would look to do what I need, but I seemed to face the problem that I need to levy DISTINCT criteria only for column representing contract numbers, NOT any other column (there can be same values presented, but all those should be presented).
Unfortunately I'm not totally aware of what database system I am using.
Seems like the question is still getting some attention and in an effort to provide some explanation here are a few techniques.
If you just want any contract with details from the 1 to 1 tables you can do it similarily to what you have described. the key being NOT to include any column from Invoices table in the column list.
SELECT
DISTINCT Contract, Department, ProductId .....(nothing from Invoices Table!!!)
FROM
Contracts c
INNER JOIN Departments D
ON c.departmentId = d.Department
INNER JOIN Product p
ON c.ProductId = p.ProductId
INNER JOIN Invoices i
ON c.contnoC = i.contnoI
Perhaps a Little cleaner would be to use IN or EXISTS like so:
SELECT
Contract, Department, ProductId .....(nothing from Invoices Table!!!)
FROM
Contracts c
INNER JOIN Departments D
ON c.departmentId = d.Department
INNER JOIN Product p
ON c.ProductId = p.ProductId
WHERE
EXISTS (SELECT 1 FROM Invoices i WHERE i.contnoI = c.contnoC )
SELECT
Contract, Department, ProductId .....(nothing from Invoices Table!!!)
FROM
Contracts c
INNER JOIN Departments D
ON c.departmentId = d.Department
INNER JOIN Product p
ON c.ProductId = p.ProductId
WHERE
contnoC IN (SELECT contnoI FROM Invoices)
Don't use IN if the SELECT ... list can return a NULL!!!
If you Actually want all of the contracts and just know if a contract has been invoiced you can use aggregation and a case expression:
SELECT
Contract, Department, ProductId, CASE WHEN COUNT(i.contnoI) = 0 THEN 0 ELSE 1 END as Invoiced
FROM
Contracts c
INNER JOIN Departments D
ON c.departmentId = d.Department
INNER JOIN Product p
ON c.ProductId = p.ProductId
LEFT JOIN Invoices i
ON c.contnoC = i.contnoI
GROUP BY
Contract, Department, ProductId
Then if you actually want to return details about a particular invoice you can use a technique similar to that of cybercentic87 if your RDBMS supports or you could use a calculated column with TOP or LIMIT depending on your system.
SELECT
Contract, Department, ProductId, (SELECT TOP 1 InvoiceNo FROM invoices i WHERE c.contnoC = i.contnoI ORDER BY CreateDate DESC) as LastestInvoiceNo
FROM
Contracts c
INNER JOIN Departments D
ON c.departmentId = d.Department
INNER JOIN Product p
ON c.ProductId = p.ProductId
GROUP BY
Contract, Department, ProductId
I would do it this way:
with mainquery as(
<<here goes you main query>>
),
invoices_rn as(
select *,
ROW_NUMBER() OVER (PARTITION BY contnoI order by
<<some column to decide which invoice you want to take eg. date>>) as rn
)
invoices as (
select * from invoices_rn where rn = 1
)
select * from mainquery
left join invoices i on contnoC = i.contnoI
This gives you an ability to get all of the invoice details to your query, also it gives you full control of which invoice you want see in your main query. Please read more about CTEs; they are pretty handy and much easier to understand / read than nested selects.
I still don't know what database you are using. If ROW_NUMBER is not available, I will figure out something else :)
Also with a left join you should use COALESCE function for example:
COALESCE(i.invoice_number,'0')
Of course this gives you some more possibilities, you could for example in your main select do:
CASE WHEN i.invoicenumber is null then 'NOT INVOICED'
else 'INVOICED'
END as isInvoiced
You can use
SELECT ..., invoiced = 'YES' ... where exists ...
union
SELECT ..., invoiced = 'NO' ... where not exists ...
or you can use a column like "invoiced" with a subquery into invoices to set it's value depending on whether you get a hit or not

Querying the max invoice for each related customer instead of every invoice for that customer

I need to write a query to display the customer code, customer first name, last name, full address, invoice date, and invoice total of the largest purchase made by each customer in Alabama (including any customers in Alabama who have never made a purchase; their invoice dates should be NULL and the invoice totals should display as 0).
Here is an ERD of the two required tables.
Here is what my latest query looks like.
select distinct lgcustomer.cust_code
,Cust_FName
,Cust_LName
,Cust_Street
,Cust_City
,Cust_State
,Cust_ZIP
,max(inv_total) as [Largest Invoice]
,inv_date
from lgcustomer left join lginvoice on lgcustomer.cust_code = lginvoice.cust_code
where Cust_State = 'AL'
group by lgcustomer.cust_code
,Cust_FName
,Cust_Lname
,Cust_Street
,Cust_City
,Cust_State
,Cust_ZIP
,Inv_Date
I don't understand why, despite using DISTINCT lgcustomer.cust_code as well as only the MAX(inv_total), it still returns every inv_total for that customer.
My professor says to make use of UNION, but as I understand it, that is for compiling two different tables with the same attributes...
I appreciate any responses that can point me in the right direction!
Solution
The answer we came to in class was to use a correlated subquery and union.
select c.cust_code
,cust_fname
,cust_lname
,cust_street
,cust_city
,cust_state
,inv_date
,inv_total
from lgcustomer c left outer join lginvoice i on c.cust_code = i.cust_code
where cust_state = 'AL'
and inv_total = (select max(inv_total)
from lginvoice i2
where i2.cust_code = c.cust_code)
union
select c.cust_code
,cust_fname
,cust_lname
,cust_street
,cust_city
,cust_state
,''
,0
from lgcustomer c left outer join lginvoice i on c.cust_code = i.cust_code
where cust_state = 'AL'
and inv_date is null
and inv_total is null
order by cust_lname asc

Join two tables where all child records of first table match all child records of second table

I have four tables: Customer, CustomerCategory, Limit, and LimitCategory. A customer can be in multiple categories and a limit can also have multiple categories. I need to write a query that will return the customer name and limit amount where ALL the customers categories match ALL the limit categories.
I'm guessing it would be similar to the answer here, but I can't seem to get it right. Thanks!
Edit - Here's what the tables look like:
tblCustomer
customerId
name
tblCustomerCategory
customerId
categoryId
tblLimit
limitId
limit
tblLimitCategory
limitId
categoryId
I THINK you're looking for:
SELECT *
FROM CustomerCategory
LEFT OUTER JOIN Customer
ON CustomerCategory.CustomerId = Customer.Id
INNER JOIN LimitCategory
ON CustomerCategory.CategoryId = LimitCategory.CategoryId
LEFT OUTER JOIN Limit
ON Limit.Id = LimitCategory.LimitId
Updated!
Thanks to Felix for pointing out a flaw in my existing solution (3 years after I originally posted it, hehe). After looking at it again, I think this might be correct. Here I'm getting (1) the customers and limits with matching categories, plus the number of matching categories, (2) the number of categories per customer, (3) the number of categories per limit, (4) I then ensure the number of categories for customer and limits is the same as the number of the matches between the customers and limits:
UNTESTED!
select
matches.name,
matches.limit
from (
select
c.name,
c.customerId,
l.limit,
l.limitId,
count(*) over(partition by cc.customerId, lc.limitId) as matchCount
from tblCustomer c
join tblCustomerCategory cc on c.customerId = cc.customerId
join tblLimitCategory lc on cc.categoryId = lc.categoryId
join tblLimit l on lc.limitId = l.limitId
) as matches
join (
select
cc.customerId,
count(*) as categoryCount
from tblCustomerCategory cc
group by cc.customerId
) as customerCategories
on matches.customerId = customerCategories.customerId
join (
select
lc.limitId,
count(*) as categoryCount
from tblLimitCategory lc
group by lc.limitId
) as limitCategories
on matches.limitId = limitCategories.limitId
where matches.matchCount = customerCategories.categoryCount
and matches.matchCount = limitCategories.categoryCount
I don't know if this will work or not, just a thought i had and i can't test it, I'm sures theres a nicer way! don't be too harsh :)
SELECT
c.customerId
, l.limitId
FROM
tblCustomer c
CROSS JOIN
tblLimit l
WHERE NOT EXISTS
(
SELECT
lc.limitId
FROM
tblLimitCategory lc
WHERE
lc.limitId = l.id
EXCEPT
SELECT
cc.categoryId
FROM
tblCustomerCategory cc
WHERE
cc.customerId = l.id
)