WHERE clause in an SQL query - sql

I THINK what is happening with this query is if there are no records in the GenericAttribute table associated with the Product, then that product is not displayed. See line below in WHERE clause: "AND GenericAttribute.KeyGroup = 'Product'"
Is there a way to reword so that that part of the WHERE is ignored if no associated record in the GenericAttribute table?
Also, looking at my ORDER BY clause, will a record from the product table still show up if it has no associated record in the Pvl_AdDates table?
Thanks!
SELECT DISTINCT Product_Category_Mapping.CategoryId, Product.Id, Product.Name, Product.ShortDescription, Pvl_AdDates.Caption, Pvl_AdDates.EventDateTime, convert(varchar(25), Pvl_AdDates.EventDateTime, 120) AS TheDate, Pvl_AdDates.DisplayOrder, Pvl_Urls.URL, [Address].FirstName, [Address].LastName, [Address].Email, [Address].Company, [Address].City, [Address].Address1, [Address].Address2, [Address].ZipPostalCode, [Address].PhoneNumber
FROM [Address]
RIGHT JOIN (GenericAttribute
RIGHT JOIN (Pvl_Urls RIGHT JOIN (Pvl_AdDates
RIGHT JOIN (Product_Category_Mapping
LEFT JOIN Product
ON Product_Category_Mapping.ProductId = Product.Id)
ON Pvl_AdDates.ProductId = Product.Id)
ON Pvl_Urls.ProductId = Product.Id)
ON GenericAttribute.EntityId = Product.Id)
ON Address.Id = convert(int, GenericAttribute.Value)
WHERE
Product_Category_Mapping.CategoryId=12
AND GenericAttribute.KeyGroup = 'Product'
AND Product.Published=1
AND Product.Deleted=0
AND Product.AvailableStartDateTimeUtc <= getdate()
AND Product.AvailableEndDateTimeUtc >= getdate()
ORDER BY
Pvl_AdDates.EventDateTime DESC,
Product.Id,
Pvl_AdDates.DisplayOrder

I strongly encourage you to not mix left join and right join. I have written many SQL queries and cannot think of an occasion when that was necessary.
In fact, just stick to left join.
If you want all products (or at least all products not filtered out by the where clause), then start with the products table and go from there:
FROM Products p LEFT JOIN
Product_Category_Mapping pcm
ON pcm.ProductId = p.Id LEFT JOIN
Pvl_AdDates ad
ON ad.ProductId = p.id LEFT JOIN
Pvl_Urls u
ON u.ProductId = p.id LEFT JOIN
GenericAttribute ga
ON ga.EntityId = p.id LEFT JOIN
Address a
ON a.Id = convert(int, ga.Value)
Note that I added table aliases. These make queries easier to write and to read.
I would add a caution. It looks like you are combining data along different dimensions. You are likely to get a Cartesian product of the dimension attributes for each dimension. Perhaps that is what you want or the WHERE clause takes care of the additional rows.

Yes put constraints (restrictions) on tables on the outer side of outer joins in the on conditions of the outer join, not in the where clause. Conditions in where clauses are not evaluated and applied until after the outer joins are evaluated, so where there is not record in the outer table, the predicate will be false and entire row will be eliminated, undoing the outer-ness. Conditions in the join are evaluated during the join, before the rows from the inner side are added back in, so the result set will still include them.
Second, formatting formatting, formatting! Stick to one direction of join (left is easier) and use Aliases for tables names!
SELECT DISTINCT m.CategoryId, p.Id,
p.Name, p.ShortDescription, d.Caption, d.EventDateTime,
convert(varchar(25), d.EventDateTime, 120) TheDate,
d.DisplayOrder, u.URL, a.FirstName, a.LastName,
a.Email, a.Company, a.City, a.Address1, a.Address2,
a.ZipPostalCode, a.PhoneNumber
FROM Product_Category_Mapping m
left join Product p on p.Id = m.ProductId
and p.Published=1
and p.Deleted=0
and p.AvailableStartDateTimeUtc <= getdate()
and p.AvailableEndDateTimeUtc >= getdate()
left join Pvl_AdDates d ON d.ProductId = p.Id
left join Pvl_Urls u ON u.ProductId = p.Id
left join GenericAttribute g ON g.EntityId = p.Id
and g.KeyGroup = 'Product'
left join [Address] a ON a.Id = convert(int, g.Value)
WHERE m.CategoryId=12
ORDER BY d.EventDateTime DESC, p.Id, d.DisplayOrder

Related

Multiple joins with group by (Sum)

When I using multiple JOIN, I hope to get the sum of some column in joined tables.
SELECT
A.*,
SUM(C.purchase_price) AS purcchase_total,
SUM(D.sales_price) AS sales_total,
B.user_name
FROM
PROJECT AS A
LEFT JOIN
USER AS B ON A.user_idx = B.user_idx
LEFT JOIN
PURCHASE AS C ON A.project_idx = C.project_idx
LEFT JOIN
SALES AS D ON A.project_idx = D.project_idx
GROUP BY
????
You need to use subquery as follows:
SELECT A.project_idx,
a.project_name,
A.project_category,
sum(C.purchase_price) AS purcchase_total,
sum(D.sales_price) as sales_total,
B.user_name
FROM PROJECT AS A
LEFT JOIN USER AS B ON A.user_idx = B.user_idx
LEFT JOIN (select project_idx, sum(purchase_price) as purchase_price
from PURCHASE group by project_idx ) AS C ON A.project_idx = C.project_idx
LEFT JOIN (select project_idx, sum(sale_price) as sale_price
from SALES group by project_idx) AS D ON A.project_idx = D.project_idx
I am not sure but you can use inner join of project with user instead of left join.
SELECT A.project_idx,
a.project_name,
A.project_category,
purcchase_total,
sales_total,
B.user_name
FROM PROJECT AS A
LEFT JOIN USER AS B ON A.user_idx = B.user_idx
LEFT JOIN (select project_idx, sum(purchase_price) as purchase_total
from PURCHASE group by project_idx ) AS C ON A.project_idx = C.project_idx
LEFT JOIN (select project_idx, sum(sale_price) as sale_total
from SALES group by project_idx) AS D ON A.project_idx = D.project_idx
This is working correctly on MS-SQL Server.
Thanks to Popeye
You are attempting to aggregate over two unrelated dimensions, and that throws off all the calculations.
Correlated subqueries are an alternative:
SELECT p.*,
(SELECT SUM(pu.purchase_price)
FROM PURCHASE pu
WHERE p.project_idx = pu.project_idx
) as purchase_total,
(SELECT SUM(s.sales_price)
FROM SALES s
WHERE p.project_idx = s.project_idx
) as sales_total,
u.user_name
FROM PROJECT p LEFT JOIN
USER u
ON p.user_idx = u.user_idx ;
Note that this uses meaningful table aliases so the query is easier to read. Arbitrary letters are really no better (and perhaps worse) than using the entire table name.
Correlated subqueries avoid the outer aggregation as well -- and let you select all the columns from the first table, which is what you want. They also often have better performance with the right indexes.

Trying to understand NULL operator in Query

I'm looking to see a breakdown of the total dollar business that each vendor has done (indirectly via the distributor) with each customer, where I'm trying not to use the Inner Join Syntax. I basically don't understand the difference between the two outputs produced by the two queries shown below:
Query1
select customers.cust_id, vendors.vend_id, sum(OrderItems.item_price*OrderItems.quantity) as total_business from
(((Vendors left outer join products
on vendors.vend_id = products.prod_id)
left outer join OrderItems
on products.prod_id = OrderItems.prod_id)
left outer join Orders
on OrderItems.order_num = Orders.order_num)
left outer join Customers
on Orders.cust_id = Customers.cust_id
group by Customers.cust_id, vendors.vend_id
order by total_business
I get the following output:
Query2
select customers.cust_id, Vendors.vend_id, sum(quantity*item_price) as total_business from
(((Vendors left outer join Products
on Products.vend_id = Vendors.vend_id)
left outer join OrderItems --No inner joins allowed
on OrderItems.prod_id = Products.prod_id)
left outer join Orders
on Orders.order_num = OrderItems.order_num)
left outer join Customers
on Customers.cust_id = Orders.cust_id
where Customers.cust_id is not null -- THE ONLY DIFFERENCE BETWEEN QUERY1 AND QUERY2
group by Customers.cust_id, Vendors.vend_id
order by total_business
I don't understand how there are only NULL cust_id's associated with the 1st Output when in the 2nd Output we get some non-NULL cust_ids. Why doesn't the 1st Output include these non-NULL cust_id's
Thank You
Query One is joining Vendors and Products incorrectly:
on vendors.vend_id = products.prod_id -- Vend_ID = Prod_ID
Query Two is joining Vendors and Products correctly:
on Products.vend_id = Vendors.vend_id -- Vend_ID = Vend_ID
Once that is fixed, you'll get the same IDs in both queries. Then I suggest you read Dan's answer to understand why what you were trying to do in eliminating INNER JOIN from the query is cancelled out by adding a WHERE filter to a column from the last table in the chain.
When you left join to a table, then filter on that table in the where clause, the join effectively changes to an inner join. The workaround is to apply the filter as a joining condition.
In your second query, all you have to do is is change the word "where" to "and".

Understand Sub-Queries

I was initially looking to see a breakdown of the total dollar business that each vendor has done (indirectly via the distributor) with each customer, where I'm trying not to use the Inner Join Syntax and used the Query below for this purpose:
select customers.cust_id, Vendors.vend_id, sum(quantity*item_price) as total_business from
(((Vendors left outer join Products
on Products.vend_id = Vendors.vend_id)
left outer join OrderItems --No inner joins allowed
on OrderItems.prod_id = Products.prod_id)
left outer join Orders
on Orders.order_num = OrderItems.order_num)
left outer join Customers
on Customers.cust_id = Orders.cust_id
where Customers.cust_id is not null -- THE ONLY DIFFERENCE BETWEEN QUERY1 AND QUERY2
group by Customers.cust_id, Vendors.vend_id
order by total_business
Now, I am trying to see the query output results for all vendor-customer combinations, including those combinations where there was no business transacted and am trying to write this via a single SQL Query. My teacher provided this solution, but I honestly cannot understand the logic at all, as I've never come across Sub-queries.
select
customers.cust_id,
Vendors.vend_id,
sum(OrderItems.quantity*orderitems.item_price)
from
(
customers
inner join
Vendors on 1 = 1
)
left outer join --synthetic product using joins
(
orders
join
orderitems on orders.order_num = OrderItems.order_num
join
Products on orderitems.prod_id = products.prod_id
) on
Vendors.vend_id = Products.vend_id and
customers.cust_id = orders.cust_id
group by customers.cust_id, vendors.vend_id
order by customers.cust_id
Thanks a lot
I would write this query as:
select c.cust_id, v.vend_id, coalesce(cv.total, 0)
fro Customers c cross join
Vendors v left outer join
(select o.cust_id, v.vend_id, sum(oi.quantity * oi.item_price) as total
from orders o join
orderitems oi
on o.order_num = oi.order_num join
Products p
on oi.prod_id = p.prod_id
group by o.cust_id, v.vend_id
) cv
on cv.vend_id = v.vend_id and
cv.cust_id = c.cust_id
order by c.cust_id;
The structure is quite similar. Both version start by creating a cross product between all customers and vendors. This creates all the rows in the output result set. Next, the aggregation needs to be calculated at this level. In the above query, this is done explicitly as a subquery which aggregates the values to the customer/vendor level. (In the original query, this is done in the outer query.)
The final step is joining these together.
Your teacher should be encouraging you to use table aliases, particularly table abbreviations. You should also be encouraged to use the proper join. So, although you can express a cross join as an inner join with on 1=1, a cross join is part of the SQL language, not a hack.
Similarly, parentheses in the from clause can make the logic harder to follow. Explicit subqueries are more easily read.

Left join 3 tables returning results for every match in right table

Fairly straightforward, I have 3 tables i need to join. The DB (MSSQL) should have 1 record in the first two (p and u) tables, and then multiple records in the 3rd table (a).
I only want it to return a match from the first table (is that not a left outer join?) regardless if there is a match in the second table, but if there is display that match, and then if there is a match in the 3rd table (most situations there will be multiple matches) but to only use the first match when the column appt_date is ordered DESC (giving me the most recent appointment date)
SELECT p.person_id, p.ln, p.fn, p.sex,
u.ud1_id, u.ud2_id, a.date, a.time
FROM person p LEFT OUTER JOIN person_defined u
ON p.person_id = u.person_id LEFT OUTER JOIN appointments a
ON p.person_id = a.person_id
where p.home_phone = '123456789'
ORDER BY a.appt_date DESC
You can take advantage of the OUTER APPLY operator here to find the most recent appointment for each person. This is much easier than using the combination of the GROUP BY and ROW_NUMBER() operators.
SELECT
p.person_id, p.ln, p.fn, p.sex,
u.ud1_id, u.ud2_id,
pa.date, pa.time
FROM person p
LEFT OUTER JOIN person_defined u ON p.person_id = u.person_id
OUTER APPLY
(
select top 1 a.date, a.time
from appointments a
where a.person_id = p.person_id
order by a.appt_date desc
) pa
where p.home_phone = '123456789'

Using sum with a nested select

I'm using SQL Server. This statement lists my products per menu:
SELECT menuname, productname
FROM [web].[dbo].[tblMenus]
FULL OUTER JOIN [web].[dbo].[tblProductsRelMenus]
ON [tblMenus].Id = [tblProductsRelMenus].MenuId
FULL OUTER JOIN [web].[dbo].[tblProducts]
ON [tblProductsRelMenus].ProductId = [tblProducts].ProductId
LEFT JOIN [web].[dbo].[tblOrderDetails]
ON ([tblProducts].Id = [tblOrderDetails].ProductId)
GROUP BY [tblProducts].ProductName
Some products don't have menus and vice versa. I use the following to establish what has been sold of each product.
SELECT [tblProducts].ProductName, SUM([tblOrderDetails].Ammount) as amount
FROM [web].[dbo].[tblProducts]
LEFT JOIN [web].[dbo].[tblOrderDetails]
ON ([tblProducts].ProductId = [tblOrderDetails].ProductId)
GROUP BY [tblProducts].ProductName
What I want to do is complement the top table with an amount column. That is, I want a table with the same number of rows as in the first table above but with an amount value if it exists, otherwise null.
I can't figure out how to do this. Any suggestions?
If I am not missing anything, the second query could be simplified, then incorporated into the first query like this:
SELECT
m.menuname,
p.productname,
t.amount
FROM [web].[dbo].[tblMenus] m
FULL JOIN [web].[dbo].[tblProductsRelMenus] pm ON m.Id = pm.MenuId
FULL JOIN [web].[dbo].[tblProducts] p ON pm.ProductId = p.ProductId
LEFT JOIN (
SELECT ProductId, SUM(Amount) as amount
FROM [web].[dbo].[tblOrderDetails]
GROUP BY ProductId
) t ON p.ProducId = t.ProductId