Correct Nested Query - sql

When I run the following T-SQL
use xxx
select
t.Vehicle,
t.Distance,
t.FuelConsumption,
d.LastConnection,
v.Make,
v.Model
from
dbo.Trips t
left join
dbo.Vehicles v on v.Id = t.Vehicle
left join
dbo.Devices d on d.Id = v.DeviceId
where
t.Date > '2020-03-02' and Distance > 1
order by
t.Vehicle, t.FuelConsumption
I get 34 rows as a result.
The first vehicle, ID 76, has made 2 trips: one recorded fuel and the other didn't. This is what I'm trying to establish.
So I attempted the following nested query:
select
t.Vehicle,
d.LastConnection,
v.Make,
v.Model,
count(t.id) as TripCount,
sum(NoFuelRecord) as NoFuelRecord,
sum(FuelRecorded) as FuelRecorded
from
(select count(Id) as NoFuelRecord
from dbo.Trips
where Distance > 1 and FuelConsumption <= 0 and Date > '2020-03-02'
group by Vehicle) as NoFuelRecord,
(select count(Id) as FuelRecorded
from dbo.Trips
where Distance > 1 and FuelConsumption > 0 and Date > '2020-03-02'
group by Vehicle) as FuelRecorded,
dbo.Trips t
left join
dbo.Vehicles v on v.Id = t.Vehicle
left join
dbo.Devices d on d.Id = v.DeviceId
where
t.Date > '2020-03-02' and Distance > 1
group by
t.Vehicle, v.Make, v.Model, d.LastConnection
order by
t.Vehicle
The results were not what I expected. In row 1 I expected to see TripCount: 2, NoFuelRecord: 1, FuelRecorded: 1.
I'm not even close! How do I do this, please?

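Your query goes wrong mainly because the two derived tables are comma-joined to dbo.Trips with no join condition, which is a cross join: every trip row is paired with every per-vehicle count, so the totals are inflated. Based on your initial query and the expected results you described, conditional aggregation should give you what you are after: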
WITH CTE AS (
    SELECT t.Vehicle,
           t.Distance,
           t.FuelConsumption,
           t.Id,
           d.LastConnection,
           v.Make,
           v.Model
    FROM dbo.Trips t
    LEFT JOIN dbo.Vehicles v ON v.Id = t.Vehicle
    LEFT JOIN dbo.Devices d ON d.Id = v.DeviceId
    WHERE t.Date > '2020-03-02' AND t.Distance > 1
)
SELECT Vehicle,
       LastConnection,
       Make,
       Model,
       COUNT(Id) AS TripCount,
       SUM(CASE WHEN FuelConsumption > 0 THEN 1 ELSE 0 END) AS FuelRecorded,
       SUM(CASE WHEN FuelConsumption <= 0 THEN 1 ELSE 0 END) AS NoFuelRecorded
FROM CTE
GROUP BY Vehicle,
         LastConnection,
         Make,
         Model
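To see the conditional aggregation at work, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for SQL Server. The table contents are made up for illustration (only the vehicle 76 case mirrors the question), and the joins to Vehicles and Devices are left out to keep it short.

```python
import sqlite3

# Hypothetical sample data: vehicle 76 has two qualifying trips,
# one with fuel recorded and one without.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Trips (Id INTEGER PRIMARY KEY, Vehicle INTEGER,
                        Distance REAL, FuelConsumption REAL, Date TEXT);
    INSERT INTO Trips (Vehicle, Distance, FuelConsumption, Date) VALUES
        (76, 12.0, 0.0, '2020-03-03'),
        (76, 30.5, 2.4, '2020-03-04'),
        (77,  8.0, 1.1, '2020-03-05'),
        (77,  0.5, 0.0, '2020-03-05'),   -- filtered out: Distance <= 1
        (78, 15.0, 0.9, '2020-03-01');   -- filtered out: before the cutoff date
""")

# One pass over Trips: each CASE turns a row into 1 or 0, and SUM counts the 1s.
rows = conn.execute("""
    SELECT Vehicle,
           COUNT(Id) AS TripCount,
           SUM(CASE WHEN FuelConsumption > 0  THEN 1 ELSE 0 END) AS FuelRecorded,
           SUM(CASE WHEN FuelConsumption <= 0 THEN 1 ELSE 0 END) AS NoFuelRecorded
    FROM Trips
    WHERE Date > '2020-03-02' AND Distance > 1
    GROUP BY Vehicle
    ORDER BY Vehicle
""").fetchall()

print(rows)  # [(76, 2, 1, 1), (77, 1, 1, 0)]
```

Note that a trip with a NULL FuelConsumption would be counted in TripCount but in neither of the two sums, since both comparisons evaluate to unknown.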

Related

GROUP BY clause SQL invalid aggregate function

I'm trying to group by 'Condicion', but I get this error:
Column 'exercisepractice.Cantidad' is invalid in the select list
because it is not contained in either an aggregate function or the
GROUP BY clause.
What am I doing wrong here? I am getting this error with:
create view exercisepractice
as
select 'Aprobados' as Condicion, sum(case when Promedio>13.5 then 1 else 0 end) as 'Cantidad', A.Sexo
from vAlumnos A inner join vMatricula M
on(A.CodAlumno=M.CodAlumno)
inner join vNotas N
on (M.NroMatricula=N.NroMatricula)
where N.SemAcademico='2020-I'
GROUP BY A.Sexo
union all
select 'Aprobados' as Condicion, sum(case when Promedio>13.5 then 1 else 0 end) as 'Cantidad', A.Sexo
from vAlumnos A inner join vMatricula M
on(A.CodAlumno=M.CodAlumno)
inner join vNotas N
on (M.NroMatricula=N.NroMatricula)
where N.SemAcademico='2020-I'
GROUP by A.Sexo
union all
select 'Desprobados' as Condicion, sum(case when Promedio<13.5 then 1 else 0 end) as 'Cantidad', A.Sexo
from vAlumnos A inner join vMatricula M
on(A.CodAlumno=M.CodAlumno)
inner join vNotas N
on (M.NroMatricula=N.NroMatricula)
where N.SemAcademico='2020-I'
GROUP BY A.Sexo
union all
select 'Desaprobados' as Condicion, sum(case when Promedio<13.5 then 1 else 0 end) as 'Cantidad', A.Sexo
from vAlumnos A inner join vMatricula M
on(A.CodAlumno=M.CodAlumno)
inner join vNotas N
on (M.NroMatricula=N.NroMatricula)
where N.SemAcademico='2020-I'
GROUP by A.Sexo
select * from exercisepractice
GROUP by Condicion
My expected result is like this:
CONDICION     CANTIDAD  SEXO
Aprobados     XXXX      M
Aprobados     XXXX      F
Desaprobados  XXXX      M
Desaprobados  XXXX      F
LOL, I was duplicating the 'Approved' ('Aprobados') and 'Disapproved' ('Desaprobados') queries.

Subquery returned more than 1 value no clue where the issue is?

SELECT MAX(te.StoreID) AS StoreID,
SUM(te.Price * te.Quantity) AS Sales,
SUM(te.Cost * te.Quantity) AS Cost,
COUNT(DISTINCT t.TransactionNumber) AS Trxn,
SUM(te.Quantity) AS Quantity
FROM TransactionEntry te
INNER JOIN [Transaction] t
ON te.TransactionNumber = t.TransactionNumber
AND te.StoreID = t.StoreID
LEFT JOIN item i
ON te.itemID = i.ID
LEFT JOIN Department d
ON i.DepartmentID = d.ID
WHERE d.ID <> 8 AND DATEDIFF(day, t.Time, GETDATE()) = 1
AND t.WebInvoiceID <> (select WebInvoiceID from [Transaction] where WebInvoiceID>0)
GROUP BY te.StoreID
Can anyone help me with this?
The error is in this line:
AND t.WebInvoiceID <> (select WebInvoiceID from [Transaction] where WebInvoiceID > 0 )
One way to fix this is to use NOT IN since the subquery returns multiple rows.
AND t.WebInvoiceID NOT IN (select WebInvoiceID from [Transaction] where WebInvoiceID>0)
Another way is to use NOT EXISTS, which I prefer:
WHERE d.ID <> 8 AND DATEDIFF(day, t.Time, GETDATE()) = 1
AND NOT EXISTS
(
SELECT 1
FROM [Transaction] tr
WHERE t.WebInvoiceID = tr.WebInvoiceID
AND tr.WebInvoiceID > 0
)
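If I'm not mistaken, based on your original logic you don't need a subquery at all. The subquery reads the same [Transaction] table, so excluding every WebInvoiceID that is greater than zero is the same as keeping only the rows where it is not greater than zero. This can be simplified as follows (add OR t.WebInvoiceID IS NULL if rows with a NULL WebInvoiceID should be kept, as NOT EXISTS does):
WHERE d.ID <> 8 AND DATEDIFF(day, t.Time, GETDATE()) = 1
AND t.WebInvoiceID <= 0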
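To compare how the three predicates behave, here is a small sketch using Python's sqlite3 module in place of SQL Server. The table name Txn and its rows are made up for illustration. It includes one NULL WebInvoiceID, because that is where NOT IN and NOT EXISTS differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Txn is a hypothetical stand-in for the [Transaction] table.
    CREATE TABLE Txn (TransactionNumber INTEGER PRIMARY KEY, WebInvoiceID INTEGER);
    INSERT INTO Txn VALUES (1, 0), (2, 15), (3, 0), (4, 27), (5, NULL);
""")

def ids(sql):
    """Run a query and return the first column of every row as a list."""
    return [r[0] for r in conn.execute(sql)]

# NOT IN: a NULL on the left side compares as unknown, so row 5 is dropped.
not_in = ids("""
    SELECT t.TransactionNumber FROM Txn t
    WHERE t.WebInvoiceID NOT IN (SELECT WebInvoiceID FROM Txn WHERE WebInvoiceID > 0)
    ORDER BY t.TransactionNumber
""")

# NOT EXISTS: no matching row exists for a NULL, so row 5 is kept.
not_exists = ids("""
    SELECT t.TransactionNumber FROM Txn t
    WHERE NOT EXISTS (SELECT 1 FROM Txn tr
                      WHERE tr.WebInvoiceID = t.WebInvoiceID
                        AND tr.WebInvoiceID > 0)
    ORDER BY t.TransactionNumber
""")

# Plain predicate on the same table: matches the NOT IN result.
direct = ids("""
    SELECT TransactionNumber FROM Txn
    WHERE WebInvoiceID <= 0
    ORDER BY TransactionNumber
""")

print(not_in)      # [1, 3]
print(not_exists)  # [1, 3, 5]
print(direct)      # [1, 3]
```

Whichever form you pick, decide first whether rows with a NULL WebInvoiceID belong in the report.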

Select inside CASE THEN

I need to select the project rate or shift rate that has the effective date less than today.
SELECT
CASE
WHEN ISNULL(s.rate,0) = 0
THEN SELECT TOP 1 pr.rate FROM ProjectRates pr WHERE (pr.projectID = p.ID) AND (pr.effectiveDate < GETDATE()) ORDER BY pr.effectiveDate DESC
--p.rate
ELSE SELECT TOP 1 sr.rate FROM ShiftRates sr WHERE (sr.shiftID = s.ID) AND (sr.effectiveDate < GETDATE()) ORDER BY pr.effectiveDate DESC
--s.rate
END AS rate
FROM Projects p
INNER JOIN Shifts s ON (p.ID = s.projectID)
WHERE (p.ID = @projectID)
Please note that this code snippet is part of a larger stored proc and thus it must be within a CASE statement.
A subquery used as a scalar expression needs parentheses:
SELECT (CASE WHEN ISNULL(s.rate, 0) = 0
             THEN (SELECT TOP 1 pr.rate
                   FROM ProjectRates pr
                   WHERE (pr.projectID = p.ID) AND (pr.effectiveDate < GETDATE())
                   ORDER BY pr.effectiveDate DESC
                  )
             ELSE (SELECT TOP 1 sr.rate
                   FROM ShiftRates sr
                   WHERE (sr.shiftID = s.ID) AND (sr.effectiveDate < GETDATE())
                   ORDER BY sr.effectiveDate DESC -- sr, not pr: pr is not in scope here
                  ) --s.rate
        END) AS rate
FROM Projects p INNER JOIN
     Shifts s
     ON p.ID = s.projectID
WHERE p.ID = @projectID;
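As a rough check of the parenthesized form, here is a sketch that runs the same shape of query in SQLite through Python's sqlite3 module. SQLite has no TOP, ISNULL or GETDATE, so LIMIT 1, COALESCE and date('now') are swapped in; the ELSE branch orders by sr.effectiveDate, the column that belongs to that subquery. All rows are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Projects (ID INTEGER PRIMARY KEY);
    CREATE TABLE Shifts (ID INTEGER PRIMARY KEY, projectID INTEGER, rate REAL);
    CREATE TABLE ProjectRates (projectID INTEGER, rate REAL, effectiveDate TEXT);
    CREATE TABLE ShiftRates (shiftID INTEGER, rate REAL, effectiveDate TEXT);

    INSERT INTO Projects VALUES (1);
    -- Shift 10 has no rate of its own; shift 11 does.
    INSERT INTO Shifts VALUES (10, 1, NULL), (11, 1, 25.0);
    INSERT INTO ProjectRates VALUES (1, 18.0, '2019-01-01'),
                                    (1, 20.0, '2020-01-01'),
                                    (1, 99.0, '2999-01-01');  -- not yet effective
    INSERT INTO ShiftRates VALUES (11, 25.0, '2019-06-01'),
                                  (11, 30.0, '2020-06-01');
""")

# Each branch of the CASE is a scalar subquery wrapped in parentheses.
rows = conn.execute("""
    SELECT s.ID,
           CASE WHEN COALESCE(s.rate, 0) = 0
                THEN (SELECT pr.rate
                      FROM ProjectRates pr
                      WHERE pr.projectID = p.ID AND pr.effectiveDate < date('now')
                      ORDER BY pr.effectiveDate DESC
                      LIMIT 1)
                ELSE (SELECT sr.rate
                      FROM ShiftRates sr
                      WHERE sr.shiftID = s.ID AND sr.effectiveDate < date('now')
                      ORDER BY sr.effectiveDate DESC
                      LIMIT 1)
           END AS rate
    FROM Projects p
    INNER JOIN Shifts s ON p.ID = s.projectID
    WHERE p.ID = ?
    ORDER BY s.ID
""", (1,)).fetchall()

print(rows)  # [(10, 20.0), (11, 30.0)]
```

Shift 10 falls back to the latest project rate already in effect, while shift 11 picks up its own latest shift rate.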

HAVING clause on SUM column

I want to put a condition on my score column, which I get from SUM, but HAVING score =< 1 does not work when I put it after GROUP BY. It should show me the projects that have a good score. I am using HSQLDB; what's going wrong? I get 'user lacks privilege or object not found: SCORE'
SELECT p.id, p.project_name, SUM(CASE r.type_code
WHEN 'GOOD' THEN 1
WHEN 'VERY_GOOD' THEN 1
WHEN 'BAD' THEN -1
WHEN 'VERY_BAD' THEN -1
ELSE 0 END) AS score
FROM record_project AS rp
JOIN project AS p ON p.id = rp.project_id
JOIN record AS r ON r.id = rp.record_id
GROUP BY p.id, p.project_name
HAVING score =< 1 <<<---- wrong?!
ORDER BY score DESC LIMIT 1
HAVING cannot see the alias score here, so repeat the whole calculated expression (and write the operator as <=, not =<):
SELECT p.id, p.project_name,
SUM(CASE WHEN r.type_code IN ('GOOD','VERY_GOOD') THEN 1
WHEN r.type_code IN ('BAD','VERY_BAD') THEN -1
ELSE 0 END) score
FROM record_project AS rp
JOIN project AS p ON p.id = rp.project_id
JOIN record AS r ON r.id = rp.record_id
GROUP BY p.id, p.project_name
HAVING SUM(CASE WHEN r.type_code IN ('GOOD','VERY_GOOD') THEN 1
WHEN r.type_code IN ('BAD','VERY_BAD') THEN -1
ELSE 0 END) <= 1
ORDER BY score DESC
-- LIMIT 1
You can incorporate the HAVING as a WHERE over a subquery:
SELECT * FROM (
SELECT p.id, p.project_name, SUM(CASE r.type_code
WHEN 'GOOD' THEN 1
WHEN 'VERY_GOOD' THEN 1
WHEN 'BAD' THEN -1
WHEN 'VERY_BAD' THEN -1
ELSE 0 END) AS score
FROM record_project AS rp
JOIN project AS p ON p.id = rp.project_id
JOIN record AS r ON r.id = rp.record_id
GROUP BY p.id, p.project_name) x
WHERE score <= 1
ORDER BY score DESC
LIMIT 1
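Both approaches can be compared side by side. The sketch below uses Python's sqlite3 module with made-up projects and records. SQLite is only a stand-in for HSQLDB (it is more lenient about aliases in HAVING), so the point here is just that the two portable forms return the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE project (id INTEGER PRIMARY KEY, project_name TEXT);
    CREATE TABLE record (id INTEGER PRIMARY KEY, type_code TEXT);
    CREATE TABLE record_project (record_id INTEGER, project_id INTEGER);

    INSERT INTO project VALUES (1, 'alpha'), (2, 'beta'), (3, 'gamma');
    INSERT INTO record VALUES (1, 'GOOD'), (2, 'VERY_GOOD'), (3, 'BAD'),
                              (4, 'VERY_BAD'), (5, 'NEUTRAL');
    INSERT INTO record_project VALUES
        (1, 1), (2, 1), (5, 1),      -- alpha: +1 +1 +0 = 2
        (1, 2), (3, 2), (2, 2),      -- beta:  +1 -1 +1 = 1
        (3, 3), (4, 3);              -- gamma: -1 -1    = -2
""")

# The aggregate expression, written once and reused in SELECT and HAVING.
SCORE = """SUM(CASE WHEN r.type_code IN ('GOOD', 'VERY_GOOD') THEN 1
                    WHEN r.type_code IN ('BAD', 'VERY_BAD') THEN -1
                    ELSE 0 END)"""

BASE = f"""
    SELECT p.id, p.project_name, {SCORE} AS score
    FROM record_project AS rp
    JOIN project AS p ON p.id = rp.project_id
    JOIN record AS r ON r.id = rp.record_id
    GROUP BY p.id, p.project_name
"""

# Form 1: repeat the full aggregate in HAVING.
having_rows = conn.execute(
    f"{BASE} HAVING {SCORE} <= 1 ORDER BY score DESC"
).fetchall()

# Form 2: wrap the grouped query and filter on the alias from outside.
wrapped_rows = conn.execute(
    f"SELECT * FROM ({BASE}) x WHERE score <= 1 ORDER BY score DESC"
).fetchall()

print(having_rows)   # [(2, 'beta', 1), (3, 'gamma', -2)]
print(wrapped_rows)  # [(2, 'beta', 1), (3, 'gamma', -2)]
```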

Query for logistic regression, multiple where exists

The input to a logistic regression is composed of a uniquely identifying number followed by multiple binary variables (always 1 or 0), each based on whether or not a person meets certain criteria. Below I have a query that lists several of these binary conditions. With only four such criteria, the query takes longer to run than I would expect. Is there a more efficient approach than the one below? Note: tblicd is a large lookup table of 15k+ rows with text descriptions. The query makes no real sense; it is just a proof of concept. I have the proper indexes on my composite keys.
select patient.patientid
,case when exists
(
select c.patientid from tblclaims as c
inner join patient as p on p.patientid=c.patientid
and c.admissiondate = p.admissiondate
and c.dischargedate = p.dischargedate
where patient.patientid = p.patientid
group by c.patientid
having count(*) > 1000
)
then '1' else '0'
end as moreThan1000
,case when exists
(
select c.patientid from tblclaims as c
inner join patient as p on p.patientid=c.patientid
and c.admissiondate = p.admissiondate
and c.dischargedate = p.dischargedate
where patient.patientid = p.patientid
group by c.patientid
having count(*) > 1500
)
then '1' else '0'
end as moreThan1500
,case when exists
(
select distinct picd.patientid from patienticd as picd
inner join patient as p on p.patientid= picd.patientid
and picd.admissiondate = p.admissiondate
and picd.dischargedate = p.dischargedate
inner join tblicd as t on t.icd_id = picd.icd_id
where t.descrip like '%diabetes%' and patient.patientid = picd.patientid
)
then '1' else '0'
end as diabetes
,case when exists
(
select r.patientid, count(*) from patient as r
where r.patientid = patient.patientid
group by r.patientid
having count(*) >1
)
then '1' else '0'
end
from patient
order by moreThan1000 desc
I would start by using subqueries in the from clause:
select p.patientid,
       coalesce(q.moreThan1000, 0) as moreThan1000, -- 0 when the patient has no matching claims
       coalesce(q.moreThan1500, 0) as moreThan1500,
       (case when d.patientid is not null then 1 else 0 end) as diabetes,
       (case when pc.patientid is not null then 1 else 0 end)
from patient p left outer join
     (select c.patientid,
             (case when count(*) > 1000 then 1 else 0 end) as moreThan1000,
             (case when count(*) > 1500 then 1 else 0 end) as moreThan1500
      from tblclaims as c inner join
           patient as p
           on p.patientid = c.patientid and
              c.admissiondate = p.admissiondate and
              c.dischargedate = p.dischargedate
      group by c.patientid
     ) q
     on p.patientid = q.patientid left outer join
     (select distinct picd.patientid
      from patienticd as picd inner join
           patient as p
           on p.patientid = picd.patientid and
              picd.admissiondate = p.admissiondate and
              picd.dischargedate = p.dischargedate inner join
           tblicd as t
           on t.icd_id = picd.icd_id
      where t.descrip like '%diabetes%'
     ) d
     on p.patientid = d.patientid left outer join
     (select r.patientid, count(*) as cnt
      from patient as r
      group by r.patientid
      having count(*) > 1
     ) pc
     on p.patientid = pc.patientid
order by 2 desc
You can then probably simplify these subqueries more by combining them (for instance "p" and "pc" on the outer query can be combined into one). However, without the correlated subqueries, SQL Server should find it easier to optimize the queries.
Example of left joins as requested...
SELECT
patient.patientid,
ISNULL(CondA.ConditionA,0) as IsConditionA,
ISNULL(CondB.ConditionB,0) as IsConditionB,
....
FROM
patient
LEFT JOIN
(SELECT DISTINCT patientid, 1 as ConditionA from ... where ... ) CondA
ON patient.patientid = CondA.patientID
LEFT JOIN
(SELECT DISTINCT patientid, 1 as ConditionB from ... where ... ) CondB
ON patient.patientid = CondB.patientID
If your condition queries return at most one row per patient, you can drop the DISTINCT and simplify them to:
(SELECT patientid, 1 as ConditionA from ... where ... ) CondA
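To check that the left-join form produces the same flags as the correlated EXISTS form, here is a small sketch using Python's sqlite3 module as a stand-in for SQL Server. The schema is a simplified, hypothetical version of the one in the question (no admission or discharge dates), and the second condition, asthma, is invented purely for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE patient (patientid INTEGER PRIMARY KEY);
    CREATE TABLE patienticd (patientid INTEGER, icd_id INTEGER);
    CREATE TABLE tblicd (icd_id INTEGER PRIMARY KEY, descrip TEXT);

    INSERT INTO patient VALUES (1), (2), (3);
    INSERT INTO tblicd VALUES (100, 'Type 2 diabetes mellitus'), (200, 'Asthma');
    -- Patient 1 has a duplicated diabetes code, which is why DISTINCT matters.
    INSERT INTO patienticd VALUES (1, 100), (1, 100), (1, 200), (2, 200);
""")

# Correlated form: one EXISTS probe per patient per condition.
correlated = conn.execute("""
    SELECT p.patientid,
           CASE WHEN EXISTS (SELECT 1 FROM patienticd picd
                             JOIN tblicd t ON t.icd_id = picd.icd_id
                             WHERE picd.patientid = p.patientid
                               AND t.descrip LIKE '%diabetes%')
                THEN 1 ELSE 0 END AS diabetes,
           CASE WHEN EXISTS (SELECT 1 FROM patienticd picd
                             JOIN tblicd t ON t.icd_id = picd.icd_id
                             WHERE picd.patientid = p.patientid
                               AND t.descrip LIKE '%asthma%')
                THEN 1 ELSE 0 END AS asthma
    FROM patient p
    ORDER BY p.patientid
""").fetchall()

# Join form: each condition is computed once, then left-joined to patient.
joined = conn.execute("""
    SELECT p.patientid,
           COALESCE(d.flag, 0) AS diabetes,
           COALESCE(a.flag, 0) AS asthma
    FROM patient p
    LEFT JOIN (SELECT DISTINCT picd.patientid, 1 AS flag
               FROM patienticd picd
               JOIN tblicd t ON t.icd_id = picd.icd_id
               WHERE t.descrip LIKE '%diabetes%') d ON d.patientid = p.patientid
    LEFT JOIN (SELECT DISTINCT picd.patientid, 1 AS flag
               FROM patienticd picd
               JOIN tblicd t ON t.icd_id = picd.icd_id
               WHERE t.descrip LIKE '%asthma%') a ON a.patientid = p.patientid
    ORDER BY p.patientid
""").fetchall()

print(correlated)  # [(1, 1, 1), (2, 0, 1), (3, 0, 0)]
print(joined)      # [(1, 1, 1), (2, 0, 1), (3, 0, 0)]
```

Without the DISTINCT, patient 1 would appear twice in the join form because of the duplicated code row. Whether the join form is actually faster depends on the data and indexes, so compare the execution plans on your own tables.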