Rows which appeared only once in query results - sql

I want to have rows which appeared in 25th of February and did not appear in 6th of March. I've tried to do such query:
SELECT favfd.day, dv.title, array_to_string(array_agg(distinct "host_name"), ',') AS affected_hosts , htmltoText(dsol.fix),
count(dv.title) FROM fact_asset_vulnerability_finding_date favfd
INNER JOIN dim_vulnerability dv USING(vulnerability_id)
INNER JOIN dim_asset USING(asset_id)
INNER JOIN dim_vulnerability_solution USING(vulnerability_id)
INNER JOIN dim_solution dsol USING(solution_id)
INNER JOIN dim_solution_highest_supercedence dshs USING (solution_id)
WHERE (favfd.day='2018-02-25' OR favfd.day='2018-03-06') AND
dsol.solution_type='PATCH' AND dshs.solution_id=dshs.superceding_solution_id
GROUP BY favfd.day, dv.title, host_name, dsol.fix
ORDER BY favfd.day, dv.title
which gave me following results:
results
I've read that I need to add something like "HAVING COUNT(*)=1" but like you can see in query results my count columns looks quite weird. Here is my results with that line added:
results with having
Can you advice me what I am doing wrong?

One way is to use a HAVING clause to assert your date criteria:
SELECT
dv.title,
array_to_string(array_agg(distinct "host_name"), ',') AS affected_hosts,
htmltoText(dsol.fix),
count(dv.title)
FROM fact_asset_vulnerability_finding_date favfd
INNER JOIN dim_vulnerability dv USING(vulnerability_id)
INNER JOIN dim_asset USING(asset_id)
INNER JOIN dim_vulnerability_solution USING(vulnerability_id)
INNER JOIN dim_solution dsol USING(solution_id)
INNER JOIN dim_solution_highest_supercedence dshs USING (solution_id)
WHERE
dsol.solution_type = 'PATCH' AND
dshs.solution_id = dshs.superceding_solution_id
GROUP BY dv.title, dsol.fix
HAVING
SUM(CASE WHEN favfd.day = '2018-02-25' THEN 1 ELSE 0 END) > 0 AND
SUM(CASE WHEN favfd.day = '2018-03-06' THEN 1 ELSE 0 END) = 0
ORDER BY favfd.day, dv.title
The only major changes I made were to remove host_name from the GROUP BY clause, because you use this column in aggregate, in the select clause. And I added the HAVING clause to check your logic.
The change you should make is to replace your implicit join syntax with explicit joins. Putting join criteria into the WHERE clause is considered bad practice nowadays.

Related

SQL subquery with join to main query

I have this:
SELECT
SU.FullName as Salesperson,
COUNT(DS.new_dealsheetid) as Units,
SUM(SO.New_profits_sales_totaldealprofit) as TDP,
SUM(SO.New_profits_sales_totaldealprofit) / COUNT(DS.new_dealsheetid) as PPU,
-- opportunities subquery
(SELECT COUNT(*) FROM Opportunity O
LEFT JOIN Account A ON O.AccountId = A.AccountId
WHERE A.OwnerId = SU.SystemUserId AND
YEAR(O.CreatedOn) = 2022)
-- /opportunities subquery
FROM New_dealsheet DS
LEFT JOIN SalesOrder SO ON DS.New_DSheetId = SO.SalesOrderId
LEFT JOIN New_salespeople SP ON DS.New_SalespersonId = SP.New_salespeopleId
LEFT JOIN SystemUser SU ON SP.New_SystemUserId = SU.SystemUserId
WHERE
YEAR(SO.New_purchaseordersenddate) = 2022 AND
SP.New_SalesGroupIdName = 'LO'
GROUP BY SU.FullName
I'm getting an error from the subquery:
Column 'SystemUser.SystemUserId' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Is it possible to use the SystemUser table join from the main query in this way?
As has been mentioned extensively in the comments, the error is actually telling you the problem; SU.SystemUserId isn't in the GROUP BY nor in an aggregate function, and it appears in the SELECT of the query (albeit in the WHERE of a correlated subquery). Any columns in the SELECT must be either aggregated or in the GROUP BY when using one of the other for a query scope. As the column in question isn't aggregated nor in the GROUP BY, the error occurs.
There are, however, other problems. Like mentioned ikn the comments too, your LEFT JOINs make little sense, as many of the tables you LEFT JOIN to require a column in that table to have a non-NULL value; it is impossible for a column to have a non-NULL value if a row was not found.
You also use syntax like YEAR(<Column Name>) = <int Value> in the WHERE; this is not SARGable, and thus should be avoided. Use explicit date boundaries instead.
I assume here that SU.SystemUserId is a primary key, and so should be in the GROUP BY. This is probably a good thing anyway, as a person's full name isn't something that can be used to determine who a person is on their own (take this from someone who shared their name, date of birth and post code with another person in their youth; it caused many problems on the rudimentary IT systems of the time). This results in a query like this:
SELECT SU.FullName AS Salesperson,
COUNT(DS.new_dealsheetid) AS Units,
SUM(SO.New_profits_sales_totaldealprofit) AS TDP,
SUM(SO.New_profits_sales_totaldealprofit) / COUNT(DS.new_dealsheetid) AS PPU,
(SELECT COUNT(*)
FROM dbo.Opportunity O
JOIN dbo.Account A ON O.AccountId = A.AccountId --A.OwnerID must have a non-NULL value, so why was this a LEFT JOIN?
WHERE A.OwnerId = SU.SystemUserId
AND O.CreatedOn >= '20220101' --Don't use YEAR(<Column Name>) = <int Value> syntax, it isn't SARGable
AND O.CreatedOn < '20230101') AS SomeColumnAlias
FROM dbo.New_dealsheet DS
JOIN dbo.SalesOrder SO ON DS.New_DSheetId = SO.SalesOrderId --SO.New_purchaseordersenddate must have a non-NULL value, so why was this a LEFT JOIN?
JOIN dbo.New_salespeople SP ON DS.New_SalespersonId = SP.New_salespeopleId --SP.New_SalesGroupIdName must have a non-NULL value, so why was this a LEFT JOIN?
LEFT JOIN dbo.SystemUser SU ON SP.New_SystemUserId = SU.SystemUserId --This actually looks like it can be a LEFT JOIN.
WHERE SO.New_purchaseordersenddate >= '20220101' --Don't use YEAR(<Column Name>) = <int Value> syntax, it isn't SARGable
AND SO.New_purchaseordersenddate < '20230101'
AND SP.New_SalesGroupIdName = 'LO'
GROUP BY SU.FullName,
SU.SystemUserId;
Doing such a sub-query is bad on a performance-wise point of view
better do it like this:
SELECT
SU.FullName as Salesperson,
COUNT(DS.new_dealsheetid) as Units,
SUM(SO.New_profits_sales_totaldealprofit) as TDP,
SUM(SO.New_profits_sales_totaldealprofit) / COUNT(DS.new_dealsheetid) as PPU,
SUM(csq.cnt) as Count
FROM New_dealsheet DS
LEFT JOIN SalesOrder SO ON DS.New_DSheetId = SO.SalesOrderId
LEFT JOIN New_salespeople SP ON DS.New_SalespersonId = SP.New_salespeopleId
LEFT JOIN SystemUser SU ON SP.New_SystemUserId = SU.SystemUserId
-- Moved subquery as sub-join
LEFT JOIN (SELECT a.OwnerId, YEAR(o.CreatedOn) as year, COUNT(*) cnt FROM Opportunity O
LEFT JOIN Account A ON O.AccountId = A.AccountId
GROUP BY a.OwnerId, YEAR(o.CreatedOn) as csq ON csq.OwnerId = su.SystemUserId and csqn.Year = 2022
WHERE
YEAR(SO.New_purchaseordersenddate) = 2022 AND
SP.New_SalesGroupIdName = 'LO'
GROUP BY SU.FullName
So you have a nice join and a clean result
The query above is untested

SQL Query with counts only returning equivalent counts

I have a query that consists of 1 table and 2 sub queries. The table being a listing of all customers, 1 sub query is a listing all of the quotes given over a period of time for customers and the other sub query is a listing of all of the orders booked for a customer over the same period of time. What I am trying to do is return a result set that is a customer, the number of quotes given, and the number of orders booked over a given period of time. However what I am returning is only a listening of customers over the period of time that have an equivalent quote and order count. I feel like I am missing something obvious within the context of the query but I am unable to figure it out. Any help would be appreciated. Thank you.
Result Set should look like this
Customer-------Quotes-------Orders Placed
aaa----------------4----------------4
bbb----------------9----------------18
ccc----------------18----------------9
select
[Customer2].[Name] as [Customer2_Name],
(count( Quotes.UD03_Key3 )) as [Calculated_CustomerQuotes],
(count( Customer_Bookings.OrderHed_OrderNum )) as [Calculated_CustomerBookings]
from Erp.Customer as Customer2
left join (select
[UD03].[Key3] as [UD03_Key3],
[UD03].[Key4] as [UD03_Key4],
[UD03].[Key1] as [UD03_Key1],
[UD03].[Date02] as [UD03_Date02]
from Ice.UD03 as UD03
inner join Ice.UD02 as UD02 on
UD03.Company = UD02.Company
And
CAST(CAST(UD03.Number09 AS INT) AS VARCHAR(30)) = UD02.Key1
left outer join Erp.Customer as Customer on
UD03.Company = Customer.Company
And
UD03.Key1 = Customer.Name
left outer join Erp.SalesTer as SalesTer on
Customer.Company = SalesTer.Company
And
Customer.TerritoryID = SalesTer.TerritoryID
left outer join Erp.CustGrup as CustGrup on
Customer.Company = CustGrup.Company
And
Customer.GroupCode = CustGrup.GroupCode
where (UD03.Key3 <> '0')) as Quotes on
Customer2.Name = Quotes.UD03_Key1
left join (select
[Customer1].[Name] as [Customer1_Name],
[OrderHed].[OrderNum] as [OrderHed_OrderNum],
[OrderDtl].[OrderLine] as [OrderDtl_OrderLine],
[OrderHed].[OrderDate] as [OrderHed_OrderDate]
from Erp.OrderHed as OrderHed
inner join Erp.Customer as Customer1 on
OrderHed.Company = Customer1.Company
And
OrderHed.BTCustNum = Customer1.CustNum
inner join Erp.OrderDtl as OrderDtl on
OrderHed.Company = OrderDtl.Company
And
OrderHed.OrderNum = OrderDtl.OrderNum) as Customer_Bookings on
Customer2.Name = Customer_Bookings.Customer1_Name
where Quotes.UD03_Date02 >= '5/15/2018' and Quotes.UD03_Date02 <= '5/15/2018' and Customer_Bookings.OrderHed_OrderDate >='5/15/2018' and Customer_Bookings.OrderHed_OrderDate <= '5/15/2018'
group by [Customer2].[Name]
You have several problems going on here. The first problem is your code is so poorly formatted it is user hostile to look at. Then you have left joins being logically treated an inner joins because of the where clause. You also have date literal strings in language specific format. This should always be the ANSI format YYYYMMDD. But in your case your two predicates are contradicting each other. You have where UD03_Date02 is simultaneously greater than and less than the same date. Thankfully you have =. But if your column is a datetime you have prevented any rows from being returned again (the first being your where clause). You have this same incorrect date logic and join in the second subquery as well.
Here is what your query might look like with some formatting so you can see what is going on. Please note I fixed the logical join issue. You still have the date problems because I don't know what you are trying to accomplish there.
select
[Customer2].[Name] as [Customer2_Name],
count(Quotes.UD03_Key3) as [Calculated_CustomerQuotes],
count(Customer_Bookings.OrderHed_OrderNum) as [Calculated_CustomerBookings]
from Erp.Customer as Customer2
left join
(
select
[UD03].[Key3] as [UD03_Key3],
[UD03].[Key4] as [UD03_Key4],
[UD03].[Key1] as [UD03_Key1],
[UD03].[Date02] as [UD03_Date02]
from Ice.UD03 as UD03
inner join Ice.UD02 as UD02 on UD03.Company = UD02.Company
And CAST(CAST(UD03.Number09 AS INT) AS VARCHAR(30)) = UD02.Key1
left outer join Erp.Customer as Customer on UD03.Company = Customer.Company
And UD03.Key1 = Customer.Name
left outer join Erp.SalesTer as SalesTer on Customer.Company = SalesTer.Company
And Customer.TerritoryID = SalesTer.TerritoryID
left outer join Erp.CustGrup as CustGrup on Customer.Company = CustGrup.Company
And Customer.GroupCode = CustGrup.GroupCode
where UD03.Key3 <> '0'
) as Quotes on Customer2.Name = Quotes.UD03_Key1
and Quotes.UD03_Date02 >= '20180515'
and Quotes.UD03_Date02 <= '20180515'
left join
(
select
[Customer1].[Name] as [Customer1_Name],
[OrderHed].[OrderNum] as [OrderHed_OrderNum],
[OrderDtl].[OrderLine] as [OrderDtl_OrderLine],
[OrderHed].[OrderDate] as [OrderHed_OrderDate]
from Erp.OrderHed as OrderHed
inner join Erp.Customer as Customer1 on OrderHed.Company = Customer1.Company
And OrderHed.BTCustNum = Customer1.CustNum
inner join Erp.OrderDtl as OrderDtl on OrderHed.Company = OrderDtl.Company
And OrderHed.OrderNum = OrderDtl.OrderNum
) as Customer_Bookings on Customer2.Name = Customer_Bookings.Customer1_Name
and Customer_Bookings.OrderHed_OrderDate >= '20180515'
and Customer_Bookings.OrderHed_OrderDate <= '20180515'
group by [Customer2].[Name]
COUNT() will just give you the number of records. You'd expect this two result columns to be equal. Try structuring it like this:
SUM(CASE WHEN Quote.UD03_Key1 IS NOT NULL THEN 1 ELSE 0 END) AS QuoteCount,
SUM(CASE WHEN Customer_Bookings.Customer1_Name IS NOT NULL THEN 1 ELSE 0 END) AS custBookingCount

Using COALESCE with JOIN on a different database column

Trying to populate the location column of a query and was hoping that the use of the COALESCE function would help me get what I want.
SELECT OrderItem.Code AS ItemCode, MAX(COALESCE(OrderItem.Location, [Picklist].[dbo].[ItemData].InventoryLocation)) AS Location, SUM(OrderItem.Quantity) AS Quantity, MAX(Store.StoreName) AS Store
FROM OrderItem
INNER JOIN [Order] ON OrderItem.OrderID = [Order].OrderID
INNER JOIN [Store] ON [Order].StoreID = [Store].StoreID
LEFT JOIN [AmazonOrder] ON [AmazonOrder].OrderID = [Order].OrderID
JOIN [Picklist].[dbo].[ItemData] ON [Picklist].[dbo].[ItemData].[InventoryNumber] = [OrderItem].[Code]
WHERE (CASE WHEN [Order].[LocalStatus] = 'Recently Downloaded' AND [AmazonOrder].FulfillmentChannel = 2 THEN 1
WHEN [Order].[LocalStatus] = 'Recently Downloaded' AND [Store].StoreName != 'Amazon' THEN 1 ELSE 0 END ) = 1
GROUP BY OrderItem.Code
ORDER BY ItemCode
There will not be a location when the Store is Amazon so I need to Join on another table in another database. I don't believe I'm using this correctly. Also I do get the right Location results returned if I use :
SELECT InventoryLocation From [Picklist].[dbo].[ItemData] WHERE InventoryNumber = 'L1201-2W-EA'
Perhaps this is more like the query that you want:
SELECT oi.Code AS ItemCode, COALESCE(oi.Location, id.InventoryLocation) AS Location,
oi.Quantity, s.StoreName AS Store
FROM OrderItem oi INNER JOIN
[Order] o
ON oi.OrderID = o.OrderID INNER JOIN
[Store]
ON o.StoreID = s.StoreID LEFT JOIN
AmazonOrder ao
ON ao.OrderID = o.OrderID JOIN
[Picklist].[dbo].[ItemData] id
ON id.InventoryNumber = oi.[Code]
WHERE o.LocalStatus = 'Recently Downloaded' AND
(ao.FulfillmentChannel = 2 OR s.StoreName <> 'Amazon')
ORDER BY ItemCode
Here are the changes:
Removed the aggregation. It does not seem to be part of the question.
Introduced table aliases, so the query is easier to write and to read.
Simplified the logic in the where clause.
As the comment above says, the max seems somewhat strange, an arbitrary aggregation no doubt due to one of the joins bringing back more information than you might of expected.
Then the statement has a few issues:
The coalesce is using two fields, neither if which is in a left join, only the AmazonOrder is left joined, so that seems a bit strange, that would only work if the first field in the coalesce (OrderItem.Location) is nullable - which it might be, there is no schema posted.
The left join itself is an inner join in disguise at present - within the where clause you have given explicit conditions on a field from that table - AND [AmazonOrder].FulfillmentChannel = 2 - if the record was actually missing the left join would return null for that field, and the where clause would then drop it out of the results. If you want this to properly work as a left join, any condition on fields from that table must move into the join condition, or the where clause itself must allow for that field being null (explicitly or using a coalesce.)
SELECT OrderItem.Code AS Code,
CASE WHEN (LEN(ISNULL(MAX([OrderItem].[Location]),'')) = 1)
THEN MAX([OrderItem].[Location])
ELSE MAX([Picklist].[dbo].[ItemData].InventoryLocation)
END AS Location,
SUM(OrderItem.Quantity) AS Quantity,
MAX(Store.StoreName) AS Store
FROM OrderItem
INNER JOIN [Order] ON OrderItem.OrderID = [Order].OrderID
INNER JOIN [Store] ON [Order].StoreID = [Store].StoreID
LEFT JOIN [AmazonOrder] ON [AmazonOrder].OrderID = [Order].OrderID
LEFT JOIN [Picklist].[dbo].[ItemData] ON [Picklist].[dbo].[ItemData].[InventoryNumber] = [OrderItem].[Code] OR
[Picklist].[dbo].[ItemData].[MediaCreator] = [OrderItem].[Code]
WHERE [Order].LocalStatus = 'Recently Downloaded' AND (AmazonOrder.FulfillmentChannel = 2 OR Store.StoreName <> 'Amazon')
GROUP BY OrderItem.Code
ORDER BY OrderItem.Code
Decided to go with case statement on location column route because I could not get COALESCE to work for me. Schema, some not all data, at SQLFiddle.
I guess if someone gets COALESCE to work I'll change the answer?
#Gordon Linoff I used the re-written WHERE clause because it looked cleaner than using the CASE statement. It worked and guessed there was a simpler way to go about it but was more worried about getting COALESCE to work. As for the Aliases sometimes I like to use them but in this case since there was a lot of tables I like to code out what I'm actually working in. Just my preference .

SQL Join only if all records have a match

I have 3 tables:
CP_carthead (idOrder)
CP_cartrows (idOrder, idCartRow)
CP_shipping (idCartRow, idShipping, dateShipped)
There can be multiple idCartRows per idOrder.
I want to get all orders where all its idCartRows exist in CP_shipping. This seems like it should be simple, but I haven't found much on the web.
Here's my query now:
SELECT
s.idOrder
, s.LatestDateShipped
FROM
CP_carthead o
LEFT OUTER JOIN (
SELECT
MAX(s.dateShipped) [LatestDateShipped]
, r.idOrder
FROM
CP_shipping s
LEFT OUTER JOIN CP_cartrows r ON s.idCartRow = r.idCartRow
GROUP BY
r.idOrder
) s ON o.idOrder = s.idOrder
Your query is returning rows from "s" and not the orders. Based on your question, I came up with this query:
select o.*
from CP_Carthead o
where o.orderId in (select cr.idOrder
from cp_cartrows cr left outer join
cp_shipping s
on cr.idCartRow = s.IdCartrow
group by cr.idOrder
having count(s.idCartRow) = COUNT(*)
)
The subquery in the in statement is getting orders all of whose cartrows are in shipping.

Subquery with multiple joins involved

Still trying to get used to writing queries and I've ran into a problem.
Select count(region)
where (regionTable.A=1) in
(
select jxn.id, count(jxn.id) as counts, regionTable.A
from jxn inner join
V on jxn.id = V.id inner join
regionTable on v.regionID = regionTable.regionID
group by jxn.id, regionTable.A
)
The inner query gives an ID number in one column, the amount of times they appear in the table, and then a bit attribute if they are in region A. The outer query works but the error I get is incorrect syntax near the keyword IN. Of the inner query, I would like a number of how many of them are in region A
You must specify table name in query before where
Select count(region)
from table
where (regionTable.A=1) in
And you must choose one of them.
where regionTable.A = 1
or
where regionTable.A in (..)
Your query has several syntax errors. Based on your comments, I think there is no need for a subquery and you want this:
select jxn.id, count(jxn.id) as counts, regionTable.A
from jxn inner join
V on jxn.id = V.id inner join
regionTable on v.regionID = regionTable.regionID
where regionTable.A = 1
group by jxn.id, regionTable.A
which can be further simplified to:
select jxn.id, count(jxn.id) as counts
, 1 as A --- you can even omit this line
from jxn inner join
V on jxn.id = V.id inner join
regionTable on v.regionID = regionTable.regionID
where regionTable.A = 1
group by jxn.id
You are getting the error because of this line:
where (regionTable.A=1)
You cannot specify a condition in a where in clause, it should only be column name
Something like this may be what you want:
SELECT COUNT(*)
FROM
(
select jxn.id, count(jxn.id) as counts, regionTable.A
from
jxn inner join
V on jxn.id = V.id inner join
regionTable on v.regionID = regionTable.regionID
group by jxn.id, regionTable.A
) sq
WHERE sq.a = 1