SQL Server Select Distinct and Order By with CASE

I've seen quite a few questions/forum posts regarding this scenario but I either don't understand the solutions or the solutions provided are too specific to that particular question and I don't know how to apply it to my situation. I have the following query:
SELECT DISTINCT d.*
FROM Data d
JOIN Customers c
ON c.Customer_Name = d.Customer_Name
AND c.subMarket = d.subMarket
JOIN Sort s
ON s.Market = c.Market
ORDER BY d.Customer_Name, d.Category, d.Tab, d.SubMarket,
CASE s.sortBy
WHEN 'Comp_Rank'
THEN d.Comp_Rank
WHEN 'Market_Rank'
THEN d.Market_Rank
ELSE d.Other_Rank
END
I used that exact query on my MySQL database and it worked perfectly. We recently switched over to a SQL Server database and now it doesn't work and I get the error:
ORDER BY items must appear in the select list if SELECT DISTINCT is specified.
I've tried adding s.* to the SELECT (since s.sortBy is in the CASE) and that didn't change anything and I also tried listing out every single field in Data and Sort in SELECT and that resulted in the same exact error.
There actually aren't duplicates in Data, but when I do the joins it results in 4 exact duplicate rows for every single item and I don't know how to fix that so that's why I originally added the DISTINCT. I tried variations of LEFT JOINs, INNER JOINs, etc... and couldn't get a different result. Anyway, a solution to either issue would be fine but I'm assuming more information would be needed to figure out the JOIN duplicate issue.
Edit: I just realized that I mistakenly typed some of the fields in the ORDER BY (example, n.Category, n.Tab should have been d.Category, d.Tab). EVERYTHING in the ORDER BY is from the Data table which I've selected * from. As I said, I also tried listing out every field in the SELECT and that didn't help.

As the error suggests, when you use SELECT DISTINCT, every ORDER BY expression must also appear in the select list. Your CASE expression is the problem: it is not in the select list, and it references s.sortBy, a column that does not come from d.
You can fix this by using group by instead, and including the columns that you want to sort by. Because the case includes a column from s, you need to include the case (or at least that column) in the group by:
SELECT d.*
FROM Data d JOIN
Customers c
ON c.Customer_Name = d.Customer_Name AND
c.subMarket = d.subMarket JOIN Sort s
ON s.Market = c.Market
GROUP BY "d.*",
(CASE s.sortBy WHEN 'Comp_Rank' THEN d.Comp_Rank
WHEN 'Market_Rank' THEN d.Market_Rank
ELSE d.Other_Rank
END)
ORDER BY d.Customer_Name, d.Category, d.Tab, d.SubMarket,
(CASE s.sortBy WHEN 'Comp_Rank' THEN d.Comp_Rank
WHEN 'Market_Rank' THEN d.Market_Rank
ELSE d.Other_Rank
END)
Note that "d.*" is in quotes because it is a placeholder, not valid syntax. You need to list out all the columns of d in both the SELECT and the GROUP BY.

Try this:
SELECT DISTINCT d.Customer_Name, d.Category, d.Tab, d.SubMarket,
CASE s.sortBy
WHEN 'Comp_Rank'
THEN d.Comp_Rank
WHEN 'Market_Rank'
THEN d.Market_Rank
ELSE
d.Other_Rank
END
FROM Data d
JOIN Customers c
ON c.Customer_Name = d.Customer_Name
AND c.subMarket = d.subMarket
JOIN Sort s
ON s.Market = c.Market
ORDER BY d.Customer_Name, d.Category, d.Tab, d.SubMarket,
CASE s.sortBy
WHEN 'Comp_Rank'
THEN d.Comp_Rank
WHEN 'Market_Rank'
THEN d.Market_Rank
ELSE
d.Other_Rank
END

Just to follow up on this, you just need to add the case to your SELECT with an AS. Then include that reference in the ORDER BY list.
SELECT DISTINCT d.*, case when ... then ... else ... end AS MyCase
...
ORDER BY d.Customer_Name, d.Category, d.Tab, d.SubMarket, MyCase
This answer is similar to Pang's, but I find not having to include the case statement in both the top and bottom better as errors could arise if you modified one and not the other.
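As a rough illustration of the aliased-CASE approach, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in: it does not raise the SQL Server error, so this shows the shape of the working query rather than the failure. The sample rows and the sort_key alias are made up for the example.

```python
import sqlite3

# Hypothetical miniature versions of the Data, Customers and Sort tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Data (Customer_Name TEXT, SubMarket TEXT,
                       Comp_Rank INT, Market_Rank INT, Other_Rank INT);
    CREATE TABLE Customers (Customer_Name TEXT, SubMarket TEXT, Market TEXT);
    CREATE TABLE Sort (Market TEXT, sortBy TEXT);
    INSERT INTO Data VALUES ('Acme', 'East', 2, 1, 9), ('Acme', 'East', 1, 2, 9);
    -- Two identical customer rows make the join produce duplicates.
    INSERT INTO Customers VALUES ('Acme', 'East', 'US'), ('Acme', 'East', 'US');
    INSERT INTO Sort VALUES ('US', 'Comp_Rank');
""")

# The CASE is written once, aliased in the select list, and reused in ORDER BY.
query = """
    SELECT DISTINCT d.*,
           CASE s.sortBy
               WHEN 'Comp_Rank'   THEN d.Comp_Rank
               WHEN 'Market_Rank' THEN d.Market_Rank
               ELSE d.Other_Rank
           END AS sort_key
    FROM Data d
    JOIN Customers c ON c.Customer_Name = d.Customer_Name
                    AND c.SubMarket = d.SubMarket
    JOIN Sort s ON s.Market = c.Market
    ORDER BY d.Customer_Name, d.SubMarket, sort_key
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The join yields four rows, DISTINCT collapses them to two, and they come back ordered by sort_key. One thing to keep in mind: because sort_key is part of the DISTINCT list, rows that differ only in sort_key would not be collapsed.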

Related

Change existing sql to left join only on first match

Adding back some original info for historical purposes as I thought simplifying would help but it didn't. We have this stored procedure, in this part it is selecting records from table A (calldetail_reporting_agents) and doing a left join on table B (Intx_Participant). Apparently there are duplicate rows in table B being pulled that we DON'T want. Is there any easy way to change this up to only pick the first match on table B? Or will I need to rewrite the whole thing?
SELECT 'Agent Calls' AS CallType,
CallDate,
CallTime,
RemoteNumber,
DialedNumber,
RemoteName,
LocalUserId,
CallDurationSeconds,
Answered,
AnswerSpeed,
InvalidCall,
Intx_Participant.Duration
FROM calldetail_reporting_agents
LEFT JOIN Intx_Participant ON calldetail_reporting_agents.CallID = Intx_Participant.CallIDKey
WHERE DialedNumber IN ( SELECT DialedNumber
FROM #DialedNumbers )
AND ConnectedDate BETWEEN @LocStartDate AND @LocEndDate
AND (@LocQueue IS NULL OR AssignedWorkGroup = @LocQueue)
Simpler version: how to change below to select only first matching row from table B:
SELECT columnA, columnB FROM TableA LEFT JOIN TableB ON someColumn
I changed to this per the first answer and all data seems to look exactly as expected now. Thank you to everyone for the quick and attentive help.
SELECT 'Agent Calls' AS CallType,
CallDate,
CallTime,
RemoteNumber,
DialedNumber,
RemoteName,
LocalUserId,
CallDurationSeconds,
Answered,
AnswerSpeed,
InvalidCall,
Intx_Participant.Duration
FROM calldetail_reporting_agents
OUTER APPLY (SELECT TOP 1
*
FROM Intx_Participant ip
WHERE calldetail_reporting_agents.CallID = ip.CallIDKey
AND calldetail_reporting_agents.RemoteNumber = ip.ConnValue
AND ip.HowEnded = '9'
AND ip.Recorded = '0'
AND ip.Duration > 0
AND ip.Role = '1') Intx_Participant
WHERE DialedNumber IN ( SELECT DialedNumber
FROM #DialedNumbers )
AND ConnectedDate BETWEEN @LocStartDate AND @LocEndDate
AND (@LocQueue IS NULL OR AssignedWorkGroup = @LocQueue)
You can try to OUTER APPLY a subquery getting only one matching row.
...
FROM calldetail_reporting_agents
OUTER APPLY (SELECT TOP 1
*
FROM intx_Participant ip
WHERE ip.callidkey = calldetail_reporting_agents.callid) intx_participant
WHERE ...
You should add an ORDER BY in the subquery. Otherwise it isn't deterministic which row is taken as the first. Or maybe that's not an issue.
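To make the point about determinism concrete, here is a small sketch in Python with the built-in sqlite3 module. SQLite has no OUTER APPLY, so this swaps in a different technique: a LEFT JOIN on a correlated subquery with ORDER BY and LIMIT 1. The table names, the id column, and the choice of "longest duration, then lowest id" as the ordering are all assumptions for illustration.

```python
import sqlite3

# Hypothetical stand-ins for calldetail_reporting_agents and Intx_Participant.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE calls (CallID TEXT, RemoteNumber TEXT);
    CREATE TABLE participants (id INTEGER PRIMARY KEY, CallIDKey TEXT, Duration INT);
    INSERT INTO calls VALUES ('c1', '555-0100'), ('c2', '555-0101');
    -- Three participant rows for c1, none for c2.
    INSERT INTO participants (CallIDKey, Duration)
    VALUES ('c1', 30), ('c1', 45), ('c1', 45);
""")

# The subquery picks exactly one participant per call. The ORDER BY, with
# id as a tie-breaker, makes the choice of "first" row repeatable.
query = """
    SELECT c.CallID, p.Duration
    FROM calls c
    LEFT JOIN participants p
           ON p.id = (SELECT p2.id
                      FROM participants p2
                      WHERE p2.CallIDKey = c.CallID
                      ORDER BY p2.Duration DESC, p2.id
                      LIMIT 1)
    ORDER BY c.CallID
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Each call appears once: c1 with a single matched duration, and c2 with NULL because it has no participant rows. In SQL Server the same ORDER BY would go inside the OUTER APPLY subquery next to TOP 1.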

Using select case on a left join?

I have used a left join on two of my tables. Now I want to use CASE to identify the records from my left table that don't have a match in the right table. Such records exist and have a null value in the 'id_zeus' column of my join; however, when I execute the CASE, it is as if these fields don't exist. Where am I going wrong? I get "Present" in every row of my Disturbance column. I am using Oracle SQL Developer.
SELECT
CASE DP.ID_PRB
WHEN NULL
THEN 'Absence'
ELSE 'Present' END as Disturbance,
FROM
FIRE.WSITE WI
LEFT JOIN
(SELECT DISTINCT
DPL.ID_PERT as ID_PRB
FROM FIRE.DEPPLAN DPL
GROUP BY DPL.ID_PERT
) DPL
ON WI.ID_PERT = DP.ID_PERT
What is const? You don't seem to need it. The SELECT DISTINCT and GROUP BY are redundant, so use only one of them. And your alias on the subquery is incorrect.
But your problem is the comparison to NULL. NULL never compares equal to anything, not even to another NULL, so the simple form CASE ... WHEN NULL never matches. You need a searched CASE with IS NULL:
SELECT (CASE WHEN DP.ID_PRB IS NULL THEN 'Absence' ELSE 'Present'
END) as Disturbance
FROM FIRE.WSITE WI LEFT JOIN
(SELECT DISTINCT DPL.ID_PERT as ID_PRB
FROM FIRE.OSI_DEVIATION_PLANS DPL
) DP
ON WI.ID_PERT = DP.ID_PRB;
This query would commonly be written as:
SELECT (CASE WHEN NOT EXISTS (SELECT 1
FROM FIRE.OSI_DEVIATION_PLANS DP
WHERE WI.ID_PERT = DP.ID_PERT
)
THEN 'Absence' ELSE 'Present'
END) as Disturbance
FROM FIRE.WSITE WI ;
This offers more opportunities for optimization.
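The difference between the two CASE forms can be checked with a small sketch in Python's built-in sqlite3 module. SQLite stands in for Oracle here, and the site and plan tables are hypothetical, but the NULL comparison rule is the same.

```python
import sqlite3

# Hypothetical tables: site 1 has a matching plan, site 2 does not.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE site (id_pert INT);
    CREATE TABLE plan (id_pert INT);
    INSERT INTO site VALUES (1), (2);
    INSERT INTO plan VALUES (1);
""")

# Simple CASE: compares p.id_pert = NULL, which is never true.
simple_case = """
    SELECT s.id_pert,
           CASE p.id_pert WHEN NULL THEN 'Absence' ELSE 'Present' END
    FROM site s LEFT JOIN plan p ON p.id_pert = s.id_pert
    ORDER BY s.id_pert
"""
# Searched CASE: IS NULL is the test that actually detects the missing match.
searched_case = """
    SELECT s.id_pert,
           CASE WHEN p.id_pert IS NULL THEN 'Absence' ELSE 'Present' END
    FROM site s LEFT JOIN plan p ON p.id_pert = s.id_pert
    ORDER BY s.id_pert
"""
simple_rows = conn.execute(simple_case).fetchall()
searched_rows = conn.execute(searched_case).fetchall()
print(simple_rows)    # every row falls through to ELSE
print(searched_rows)  # the unmatched site is reported as 'Absence'
```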

CASE statement ALIAS comparison

Here is my query:
SELECT DISTINCT v.codi, m.nom, v.matricula, v.data_compra, v.color,
v.combustible, v.asseguranca,
(CASE WHEN lloguer.dataf IS NOT NULL THEN 'Si' ELSE 'Llogat' END) AS Disponible
FROM vehicle v
INNER JOIN model m on model_codi=m.codi
INNER JOIN lloguer on codi_vehicle=v.codi
WHERE Disponible='Si';
What I'm trying to do is show only those rows where "lloguer.dataf" is not NULL, but it doesn't allow me to use the "Disponible" alias in the comparison on the last line.
What can I do?
This is how the info is shown (with some more attributes) without the last line comparison.
The problem is that the alias doesn't exist yet when the WHERE clause is evaluated. So you have to repeat the full expression or wrap the query in a subquery.
SELECT *
FROM ( .... ) YourQuery
WHERE Disponible='Si';
You can read more details here https://community.oracle.com/thread/1109532?tstart=0
Remove your CASE WHEN from the SELECT block and replace your WHERE clause with:
WHERE lloguer.dataf IS NOT NULL
I'm a TSQL guy by nature, but can you do this?
Select distinct codi, nom, matricula, data_compra, color, combustible, asseguranca from
(SELECT DISTINCT v.codi, m.nom, v.matricula, v.data_compra, v.color,
v.combustible, v.asseguranca,
(CASE WHEN lloguer.dataf IS NOT NULL THEN 'Si' ELSE 'Llogat' END) AS Disponible
FROM vehicle v
INNER JOIN model m on model_codi=m.codi
INNER JOIN lloguer on codi_vehicle=v.codi) t
WHERE Disponible='Si';
As @JuanCarlosOropeza has stated, the alias doesn't exist yet when the WHERE clause is evaluated, because WHERE is processed before the SELECT list. ORDER BY is processed after the SELECT list, which is why you can use the alias there without a subquery.

Using an Alias column in the where clause in Postgresql

I have a query like this:
SELECT
jobs.*,
(
CASE
WHEN lead_informations.state IS NOT NULL THEN lead_informations.state
ELSE 'NEW'
END
) AS lead_state
FROM
jobs
LEFT JOIN lead_informations ON
lead_informations.job_id = jobs.id
AND
lead_informations.mechanic_id = 3
WHERE
lead_state = 'NEW'
Which gives the following error:
PGError: ERROR: column "lead_state" does not exist
LINE 1: ...s.id AND lead_informations.mechanic_id = 3 WHERE (lead_state...
In MySql this is valid, but apparently not in Postgresql. From what I can gather, the reason is that the SELECT part of the query is evaluated later than the WHERE part. Is there a common workaround for this problem?
I struggled on the same issue and "mysql syntax is non-standard" is not a valid argument in my opinion. PostgreSQL adds handy non-standard extensions as well, for example "INSERT ... RETURNING ..." to get auto ids after inserts. Also, repeating large queries is not an elegant solution.
However, I found the WITH statement very helpful (CTE's). It sort of creates a temporary view within the query which you can use like a usual table then. I'm not sure if I have rewritten your JOIN correctly, but in general it should work like this:
WITH jobs_refined AS (
SELECT
jobs.*,
(SELECT CASE WHEN lead_informations.state IS NOT NULL THEN lead_informations.state ELSE 'NEW' END) AS lead_state
FROM jobs
LEFT JOIN lead_informations
ON lead_informations.job_id = jobs.id
AND lead_informations.mechanic_id = 3
)
SELECT *
FROM jobs_refined
WHERE lead_state = 'NEW'
You would need to either duplicate the case statement in the where clause, or my preference is to do something like the following:
SELECT *
FROM (
SELECT
jobs.*,
(CASE WHEN lead_informations.state IS NOT NULL THEN lead_informations.state ELSE 'NEW' END) as lead_state
FROM
"jobs"
LEFT JOIN lead_informations ON lead_informations.job_id = jobs.id
AND lead_informations.mechanic_id = 3
) q1
WHERE (lead_state = 'NEW')
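MySQL's support is, as you experienced, non-standard. The portable way is to repeat the expression from the SELECT clause in the WHERE clause. Here it is simplified to a NULL test, which is equivalent only if state is never stored as 'NEW':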
SELECT
jobs.*,
CASE
WHEN lead_informations.state IS NOT NULL THEN lead_informations.state
ELSE 'NEW'
END AS lead_state
FROM
jobs
LEFT JOIN lead_informations ON
lead_informations.job_id = jobs.id
AND
lead_informations.mechanic_id = 3
WHERE
lead_informations.state IS NULL
I believe the common solution is to use an inner SELECT for the calculation (or CASE statement in this case) so that the result of the inner SELECT is available to the entire outer query by the time the execution gets to that query. Otherwise, the WHERE clause is evaluated first and knows nothing about the SELECT clause.
Subquery:
SELECT "tab_1"."BirthDate", "tab_1"."col_1" FROM (
SELECT BirthDate, DATEADD(year, 18, BirthDate) AS "col_1" FROM Employees
) AS "tab_1"
WHERE "tab_1"."col_1" >= '2000-12-31';
I used an alias in the WHERE clause like this, by way of a subquery:
Select "Vendors"."VendorId", "Vendors"."Name","Result"."Total"
From (Select "Trans"."VendorId", ("Trans"."A"+"Trans"."B"+"Trans"."C") AS "Total"
FROM "Trans"
WHERE "Trans"."Year"=2014
) As "Result"
JOIN "Vendors" ON "Result"."VendorId"="Vendors"."VendorId"
WHERE "Vendors"."Class"='I' AND "Result"."Total" > 200
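For a runnable version of the derived-table workaround from the lead_state question, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is a stand-in only: unlike PostgreSQL it tolerates a select-list alias in WHERE, so the sketch shows the portable form rather than the error. The sample rows are invented.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE jobs (id INTEGER, title TEXT);
    CREATE TABLE lead_informations (job_id INTEGER, mechanic_id INTEGER, state TEXT);
    INSERT INTO jobs VALUES (1, 'brakes'), (2, 'tyres'), (3, 'oil');
    INSERT INTO lead_informations VALUES (1, 3, 'SOLD'), (2, 3, 'NEW'), (3, 4, 'SOLD');
""")

# The inner query computes lead_state; the outer query can then filter on it
# because, from the outside, lead_state is an ordinary column of q1.
query = """
    SELECT q1.id, q1.lead_state
    FROM (
        SELECT jobs.id,
               CASE WHEN li.state IS NOT NULL THEN li.state ELSE 'NEW' END AS lead_state
        FROM jobs
        LEFT JOIN lead_informations li
               ON li.job_id = jobs.id AND li.mechanic_id = 3
    ) q1
    WHERE q1.lead_state = 'NEW'
    ORDER BY q1.id
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Job 2 is returned because its stored state is 'NEW', and job 3 because mechanic 3 has no lead row for it. A filter on state IS NULL alone would return only job 3, so the two filters agree only when 'NEW' is never stored explicitly.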

SQL Having Clause

I'm trying to get a stored procedure to work using the following syntax:
select count(sl.Item_Number)
as NumOccurrences
from spv3SalesDocument as sd
left outer join spv3saleslineitem as sl on sd.Sales_Doc_Type = sl.Sales_Doc_Type and
sd.Sales_Doc_Num = sl.Sales_Doc_Num
where
sd.Sales_Doc_Type='ORDER' and
sd.Sales_Doc_Num='OREQP0000170' and
sl.Item_Number = 'MCN-USF'
group by
sl.Item_Number
having count (distinct sl.Item_Number) = 0
In this particular case when the criteria is not met the query returns no records and the 'count' is just blank. I need a 0 returned so that I can apply a condition instead of just nothing.
I'm guessing it is a fairly simple fix but beyond my simple brain capacity.
Any help is greatly appreciated.
Wally
First, having a specific where clause on sl defeats the purpose of the left outer join -- it basically turns it into an inner join.
It sounds like you are trying to return 0 if there are no matches. I'm a T-SQL programmer, so I don't know if this will be meaningful in other flavors... and I don't know enough about the context for this query, but it sounds like you are trying to use this query for branching in an IF statement... perhaps this will help you on your way, even if it is not quite what you're looking for...
IF NOT EXISTS (SELECT 1 FROM spv3SalesDocument as sd
INNER JOIN spv3saleslineitem as sl on sd.Sales_Doc_Type = sl.Sales_Doc_Type
and sd.Sales_Doc_Num = sl.Sales_Doc_Num
WHERE sd.Sales_Doc_Type='ORDER'
and sd.Sales_Doc_Num='OREQP0000170'
and sl.Item_Number = 'MCN-USF')
BEGIN
-- Do something...
END
I didn't test these but off the top of my head give them a try:
select ISNULL(count(sl.Item_Number), 0) as NumOccurrences
If that one doesn't work, try this one:
select
CASE count(sl.Item_Number)
WHEN NULL THEN 0
WHEN '' THEN 0
ELSE count(sl.Item_Number)
END as NumOccurrences
This combination of group by and having looks pretty suspicious:
group by sl.Item_Number
having count (distinct sl.Item_Number) = 0
I'd expect this having condition to allow only groups where Item_Number is null.
To always return a row, use a union. For example:
select name, count(*) as CustomerCount
from customers
group by
name
having count(*) > 1
union all
select 'No one found!', 0
where not exists
(
select *
from customers
group by
name
having count(*) > 1
)
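Another way to get a 0 back, sketched here with Python's built-in sqlite3 module, is to drop the GROUP BY and move the line-item filter into the ON clause. An aggregate without GROUP BY always returns one row, even when nothing matches. SQLite stands in for the original database, and the doc and line tables are simplified, hypothetical versions of the two sales tables.

```python
import sqlite3

# Hypothetical data: document D1 exists but has no 'MCN-USF' line item.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE doc (doc_num TEXT);
    CREATE TABLE line (doc_num TEXT, item TEXT);
    INSERT INTO doc VALUES ('D1');
    INSERT INTO line VALUES ('D1', 'OTHER');
""")

# Shape of the original query: the WHERE filter on the right-hand table
# removes every row, so there are no groups and no result row at all.
grouped = """
    SELECT COUNT(l.item) FROM doc d
    LEFT JOIN line l ON l.doc_num = d.doc_num
    WHERE d.doc_num = 'D1' AND l.item = 'MCN-USF'
    GROUP BY l.item
"""
# Filter moved into ON, no GROUP BY: the document row survives the join
# with a NULL item, and COUNT(l.item) counts only non-NULL values.
ungrouped = """
    SELECT COUNT(l.item) FROM doc d
    LEFT JOIN line l ON l.doc_num = d.doc_num AND l.item = 'MCN-USF'
    WHERE d.doc_num = 'D1'
"""
grouped_rows = conn.execute(grouped).fetchall()
ungrouped_rows = conn.execute(ungrouped).fetchall()
print(grouped_rows)    # empty result set
print(ungrouped_rows)  # a single row holding 0
```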