How can I SELECT the row with max value? - sql

select NATIONALPLAYERS.FIRSTNAME||' '|| NATIONALPLAYERS.LASTNAME as PlayerName, NATIONALPLAYERS.EXPERIENCEID, experience
from NATIONALPLAYERS
INNER JOIN (select experienceid, MAX( experiences.nationalgames + experiences.internationalgames ) as experience
from experiences
group by experiences.EXPERIENCEID)plexperience
on plexperience.experienceid = NATIONALPLAYERS.experienceid;
This displays all the records from PlayerName, ExperienceID, and Experience even though I asked only for the maximum value of experience.

What you are missing is the max experience. You are checking the experience_id but not making sure that the experience value is the max value
The query should be something like:
SELECT
NATIONALPLAYERS.FIRSTNAME||' '|| NATIONALPLAYERS.LASTNAME as PlayerName,
NATIONALPLAYERS.EXPERIENCEID,
experience
from NATIONALPLAYERS
INNER JOIN (
select
experienceid,
MAX( experiences.nationalgames + experiences.internationalgames ) as experience
from experiences
group by experiences.EXPERIENCEID) AS plexperience
on
plexperience.experienceid = NATIONALPLAYERS.experienceid
**plexperience.experience = NATIONALPLAYERS.experience;**

You are grouping by EXPERIENCEID and getting the maximum EXPERIENCE for each of those - you are not finding the EXPERIENCEID which corresponds to the maximum EXPERIENCE.
If you want to find the values for the maximum experience then you want something like:
SELECT *
FROM (
select n.FIRSTNAME ||' '|| n.LASTNAME as PlayerName
n.experienceid,
e.nationalgames + e.internationgames as Experience
from NATIONALPLAYERS n
INNER JOIN
EXPERIENCE e
ON ( e.experienceid = n.experienceid )
ORDER BY Experience DESC
)
WHERE ROWNUM = 1;
or, if there are multiple highest rows:
SELECT *
FROM (
select n.FIRSTNAME ||' '|| n.LASTNAME as PlayerName
n.experienceid,
e.nationalgames + e.internationgames as Experience,
RANK() OVER ( ORDER BY e.nationalgames + e.internationgames DESC )
AS rn
from NATIONALPLAYERS n
INNER JOIN
EXPERIENCE e
ON ( e.experienceid = n.experienceid )
ORDER BY Experience DESC
)
WHERE rn = 1;

Related

How to select a single row for each unique ID

SQL novice here learning on the job, still a greenhorn. I have a problem I don't know how to overcome. Using IBM Netezza and Aginity Workbench.
My current output will try to return one row per case number based on when a task was created. It will only keep the row with the newest task. This gets me about 85% of the way there. The issue is that sometimes multiple tasks have a create day of the same day.
I would like to incorporate Task Followup Date to only keep the newest row if there are multiple rows with the same Case Number. I posted an example of what my current code outputs and what i would like it to output.
Current code
SELECT
A.PS_CASE_ID AS Case_Number
,D.CASE_TASK_TYPE_NM AS Task
,C.TASK_CRTE_TMS
,C.TASK_FLWUP_DT AS Task_Followup_Date
FROM VW_CC_CASE A
INNER JOIN VW_CASE_TASK C ON (A.CASE_ID = C.CASE_ID)
INNER JOIN VW_CASE_TASK_TYPE D ON (C.CASE_TASK_TYPE_ID = D.CASE_TASK_TYPE_ID)
INNER JOIN ADMIN.VW_RSN_CTGY B ON (A.RSN_CTGY_ID = B.RSN_CTGY_ID)
WHERE
(A.PS_Z_SPSR_ID LIKE '%EFT' OR A.PS_Z_SPSR_ID LIKE '%CRDT')
AND CAST(A.CASE_CRTE_TMS AS DATE) >= '2020-01-01'
AND B.RSN_CTGY_NM = 'Chargeback Initiation'
AND CAST(C.TASK_CRTE_TMS AS DATE) = (SELECT MAX(CAST(C2.TASK_CRTE_TMS AS DATE)) from VW_CASE_TASK C2 WHERE C2.CASE_ID = C.CASE_ID)
GROUP BY
A.PS_CASE_ID
,D.CASE_TASK_TYPE_NM
,C.TASK_CRTE_TMS
,C.TASK_FLWUP_DT
Current output
Desired output
You could use ROW_NUMBER here:
WITH cte AS (
SELECT DISTINCT A.PS_CASE_ID AS Case_Number, D.CASE_TASK_TYPE_NM AS Task,
C.TASK_CRTE_TMS, C.TASK_FLWUP_DT AS Task_Followup_Date,
ROW_NUMBER() OVER (PARTITION BY A.PS_CASE_ID ORDER BY C.TASK_FLWUP_DT DESC) rn
FROM VW_CC_CASE A
INNER JOIN VW_CASE_TASK C ON A.CASE_ID = C.CASE_ID
INNER JOIN VW_CASE_TASK_TYPE D ON C.CASE_TASK_TYPE_ID = D.CASE_TASK_TYPE_ID
INNER JOIN ADMIN.VW_RSN_CTGY B ON A.RSN_CTGY_ID = B.RSN_CTGY_ID
WHERE (A.PS_Z_SPSR_ID LIKE '%EFT' OR A.PS_Z_SPSR_ID LIKE '%CRDT') AND
CAST(A.CASE_CRTE_TMS AS DATE) >= '2020-01-01' AND
B.RSN_CTGY_NM = 'Chargeback Initiation' AND
CAST(C.TASK_CRTE_TMS AS DATE) = (SELECT MAX(CAST(C2.TASK_CRTE_TMS AS DATE))
FROM VW_CASE_TASK C2
WHERE C2.CASE_ID = C.CASE_ID)
)
SELECT
Case_Number,
Task,
TASK_CRTE_TMS,
Task_Followup_Date
FROM cte
WHERE rn = 1;
One method used window functions:
with cte as (
< your query here >
)
select x.*
from (select cte.*,
row_number() over (partition by case_number, Task_Followup_Date
order by TASK_CRTE_TMS asc
) as seqnum
from cte
) x
where seqnum = 1;

Use of MAX function in SQL query to filter data

The code below joins two tables and I need to extract only the latest date per account, though it holds multiple accounts and history records. I wanted to use the MAX function, but not sure how to incorporate it for this case. I am using My SQL server.
Appreciate any help !
select
PROP.FileName,PROP.InsName, PROP.Status,
PROP.FileTime, PROP.SubmissionNo, PROP.PolNo,
PROP.EffDate,PROP.ExpDate, PROP.Region,
PROP.Underwriter, PROP_DATA.Data , PROP_DATA.Label
from
Property.dbo.PROP
inner join
Property.dbo.PROP_DATA on Property.dbo.PROP.FileID = Actuarial.dbo.PROP_DATA.FileID
where
(PROP_DATA.Label in ('Occupancy' , 'OccupancyTIV'))
and (PROP.EffDate >= '42278' and PROP.EffDate <= '42643')
and (PROP.Status = 'Bound')
and (Prop.FileTime = Max(Prop.FileTime))
order by
PROP.EffDate DESC
Assuming your DBMS supports windowing functions and the with clause, a max windowing function would work:
with all_data as (
select
PROP.FileName,PROP.InsName, PROP.Status,
PROP.FileTime, PROP.SubmissionNo, PROP.PolNo,
PROP.EffDate,PROP.ExpDate, PROP.Region,
PROP.Underwriter, PROP_DATA.Data , PROP_DATA.Label,
max (PROP.EffDate) over (partition by PROP.PolNo) as max_date
from Actuarial.dbo.PROP
inner join Actuarial.dbo.PROP_DATA
on Actuarial.dbo.PROP.FileID = Actuarial.dbo.PROP_DATA.FileID
where (PROP_DATA.Label in ('Occupancy' , 'OccupancyTIV'))
and (PROP.EffDate >= '42278' and PROP.EffDate <= '42643')
and (PROP.Status = 'Bound')
and (Prop.FileTime = Max(Prop.FileTime))
)
select
FileName, InsName, Status, FileTime, SubmissionNo,
PolNo, EffDate, ExpDate, Region, UnderWriter, Data, Label
from all_data
where EffDate = max_date
ORDER BY EffDate DESC
This also presupposes than any given account would not have two records on the same EffDate. If that's the case, and there is no other objective means to determine the latest account, you could also use row_numer to pick a somewhat arbitrary record in the case of a tie.
Using straight SQL, you can use a self-join in a subquery in your where clause to eliminate values smaller than the max, or smaller than the top n largest, and so on. Just set the number in <= 1 to the number of top values you want per group.
Something like the following might do the trick, for example:
select
p.FileName
, p.InsName
, p.Status
, p.FileTime
, p.SubmissionNo
, p.PolNo
, p.EffDate
, p.ExpDate
, p.Region
, p.Underwriter
, pd.Data
, pd.Label
from Actuarial.dbo.PROP p
inner join Actuarial.dbo.PROP_DATA pd
on p.FileID = pd.FileID
where (
select count(*)
from Actuarial.dbo.PROP p2
where p2.FileID = p.FileID
and p2.EffDate <= p.EffDate
) <= 1
and (
pd.Label in ('Occupancy' , 'OccupancyTIV')
and p.Status = 'Bound'
)
ORDER BY p.EffDate DESC
Have a look at this stackoverflow question for a full working example.
Not tested
with temp1 as
(
select foo
from bar
whre xy = MAX(xy)
)
select PROP.FileName,PROP.InsName, PROP.Status,
PROP.FileTime, PROP.SubmissionNo, PROP.PolNo,
PROP.EffDate,PROP.ExpDate, PROP.Region,
PROP.Underwriter, PROP_DATA.Data , PROP_DATA.Label
from Actuarial.dbo.PROP
inner join temp1 t
on Actuarial.dbo.PROP.FileID = t.dbo.PROP_DATA.FileID
ORDER BY PROP.EffDate DESC

How Do I Group Rows Together In A Query?

I am trying to distinguish the physical servers uptime from virtual ones by looking at OS. I am able to pull out the result of the VMWare OS, however, I'd like to group the physical servers as one row.
Here is the code I have so far:
SELECT TOP (100) PERCENT Avg(dbo.tblserveruptime.uptime) AS Uptime,
Count(*) AS Total
FROM dbo.server
INNER JOIN dbo.tblserveruptime
ON dbo.server.name = dbo.tblserveruptime.name
WHERE ( dbo.server.status = N'production' )
AND ( dbo.server.server_env = N'prod' )
AND ( dbo.server.os_type <> N'vmware' )
GROUP BY dbo.tblserveruptime.month,
dbo.tblserveruptime.year
HAVING ( dbo.tblserveruptime.month = 4 )
AND ( dbo.tblserveruptime.year = 2013 )
UNION
SELECT TOP (100) PERCENT Avg(dbo.tblserveruptime.uptime) AS Uptime,
Count(*) AS Total
FROM dbo.server
INNER JOIN dbo.tblserveruptime
ON dbo.server.name = dbo.tblserveruptime.name
WHERE ( dbo.server.status = N'production' )
AND ( dbo.server.server_env = N'prod' )
AND ( dbo.server.os_type = N'vmware' )
GROUP BY dbo.tblserveruptime.month,
dbo.tblserveruptime.year
HAVING ( dbo.tblserveruptime.month = 4 )
AND ( dbo.tblserveruptime.year = 2013 )
Whatever field it is that gives you the physical server name, add that field to your GROUP BY statement and put it first. It will then group by Server, Month and Year (as you currently have it). My guess is you might want to swap Year and Month as it makes more sense to do it that way than by Month and then Year.
Okay, thanks to all who tried to help. I tried putting images of my before and after results, but I couldn't since I don't have 10 reputation points. Anyways, I think I got it. Here it is:
SELECT AVG(dbo.tblServerUptime.Uptime) AS Uptime,
CASE WHEN OS_TYPE = 'VMWare' THEN 'Virtual' ELSE 'Physical' END AS [Physical vs VM]
FROM dbo.Server INNER JOIN
dbo.tblServerUptime ON dbo.Server.NAME = dbo.tblServerUptime.NAME
GROUP BY CASE WHEN OS_TYPE = 'VMWare' THEN 'Virtual' ELSE 'Physical' END

Fastest way to check if the the most recent result for a patient has a certain value

Mssql < 2005
I have a complex database with lots of tables, but for now only the patient table and the measurements table matter.
What I need is the number of patient where the most recent value of 'code' matches a certain value. Also, datemeasurement has to be after '2012-04-01'. I have fixed this in two different ways:
SELECT
COUNT(P.patid)
FROM T_Patients P
WHERE P.patid IN (SELECT patid
FROM T_Measurements M WHERE (M.code ='xxxx' AND result= 'xx')
AND datemeasurement =
(SELECT MAX(datemeasurement) FROM T_Measurements
WHERE datemeasurement > '2012-01-04' AND patid = M.patid
GROUP BY patid
GROUP by patid)
AND:
SELECT
COUNT(P.patid)
FROM T_Patient P
WHERE 1 = (SELECT TOP 1 case when result = 'xx' then 1 else 0 end
FROM T_Measurements M
WHERE (M.code ='xxxx') AND datemeasurement > '2012-01-04' AND patid = P.patid
ORDER by datemeasurement DESC
)
This works just fine, but it makes the query incredibly slow because it has to join the outer table on the subquery (if you know what I mean). The query takes 10 seconds without the most recent check, and 3 minutes with the most recent check.
I'm pretty sure this can be done a lot more efficient, so please enlighten me if you will :).
I tried implementing HAVING datemeasurment=MAX(datemeasurement) but that keeps throwing errors at me.
So my approach would be to write a query just getting all the last patient results since 01-04-2012, and then filtering that for your codes and results. So something like
select
count(1)
from
T_Measurements M
inner join (
SELECT PATID, MAX(datemeasurement) as lastMeasuredDate from
T_Measurements M
where datemeasurement > '01-04-2012'
group by patID
) lastMeasurements
on lastMeasurements.lastmeasuredDate = M.datemeasurement
and lastMeasurements.PatID = M.PatID
where
M.Code = 'Xxxx' and M.result = 'XX'
The fastest way may be to use row_number():
SELECT COUNT(m.patid)
from (select m.*,
ROW_NUMBER() over (partition by patid order by datemeasurement desc) as seqnum
FROM T_Measurements m
where datemeasurement > '2012-01-04'
) m
where seqnum = 1 and code = 'XXX' and result = 'xx'
Row_number() enumerates the records for each patient, so the most recent gets a value of 1. The result is just a selection.

How to create an aggregate function for median?

I need to create an aggregate function in Advantage-Database to calculate the median value.
SELECT
group_field
, MEDIAN(value_field)
FROM
table_name
GROUP BY
group_field
Seems the solutions I am finding are quite specific to the sql engine used.
There is no built-in median aggregate function in ADS as you can see in the help file:
http://devzone.advantagedatabase.com/dz/webhelp/Advantage10.1/index.html
I'm afraid that you have to write your own stored procedure or sql script to solve this problem.
The accepted answer to the following question might be a solution for you:
Simple way to calculate median with MySQL
I've updated this answer with a solution that avoids the join in favor of storing some data in a json object.
SOLUTION #1 (two selects and a join, one to get counts, one to get rankings)
This is a little lengthy, but it does work, and it's reasonably fast.
SELECT x.group_field,
avg(
if(
x.rank - y.vol/2 BETWEEN 0 AND 1,
value_field,
null
)
) as median
FROM (
SELECT group_field, value_field,
#r:= IF(#current=group_field, #r+1, 1) as rank,
#current:=group_field
FROM (
SELECT group_field, value_field
FROM table_name
ORDER BY group_field, value_field
) z, (SELECT #r:=0, #current:='') v
) x, (
SELECT group_field, count(*) as vol
FROM table_name
GROUP BY group_field
) y WHERE x.group_field = y.group_field
GROUP BY x.group_field
SOLUTION #2 (uses a json object to store the counts and avoids the join)
SELECT group_field,
avg(
if(
rank - json_extract(#vols, path)/2 BETWEEN 0 AND 1,
value_field,
null
)
) as median
FROM (
SELECT group_field, value_field, path,
#rnk := if(#curr = group_field, #rnk+1, 1) as rank,
#vols := json_set(
#vols,
path,
coalesce(json_extract(#vols, path), 0) + 1
) as vols,
#curr := group_field
FROM (
SELECT p.group_field, p.value_field, concat('$.', p.group_field) as path
FROM table_name
JOIN (SELECT #curr:='', #rnk:=1, #vols:=json_object()) v
ORDER BY group_field, value_field DESC
) z
) y GROUP BY group_field;