How to get the most recent of several records with the same ID in a table while joining another table? - sql

SELECT tblSign.sigdate,tblSign.sigtime,tblSign.sigact,tblSign.esignature,tblEmpl.fname,tblEmpl.lname,tblEmpl.location, tblEmpl.estatus,tblLocs.unit,tblLocs.descript,TblLocs.addr1,tblLocs.city,tblLocs.state, tblLocs.zip
FROM tblEmpl
LEFT JOIN tblSign
ON tblSign.eight_id = tblEmpl.eight_id
AND tblSign.formid = '9648'
AND tblSign.sigact <> 'O'
AND tblSign.sigdate >= '2022-11-01'
LEFT JOIN tblLocs
ON tblEmpl.location = tblLocs.location
WHERE tblEmpl.estatus = 'A'
AND tblEmpl.location = '013'
ORDER BY
tblSign.sigdate ASC;
My table tblSign has multiple records with the same eight_id, so I'm trying to join the tables while getting only the most recent record from tblSign for each eight_id.
Data I get:
Sigdate     fname   lname  location  sigact
2022-11-01  Bill    Lee    023       A
2022-10-01  Bill    Lee    023       A
2022-11-01  Carter  Hill   555       A
This is what I want:
Sigdate     fname   lname  location  sigact
2022-11-01  Bill    Lee    023       A
2022-11-01  Carter  Hill   555       A

Start by getting into better code-writing habits. Putting all the column names on one long line hurts readability and, consequently, troubleshooting.
You can select the most recent record per group from a table by using the ROW_NUMBER function. I took your code, cleaned it up, and added a derived table that carries a ROW_NUMBER column. I can't validate that the query works because you didn't post example source data from your tblEmpl, tblSign, and tblLocs tables. I'm also not sure whether the sigdate >= '2022-11-01' condition belongs in the derived table, because it's not clear if you were just limiting the date range or if that was your attempt to retrieve the most recent date.
SELECT
ts.sigdate
, ts.sigtime
, ts.sigact
, ts.esignature
, tblEmpl.fname
, tblEmpl.lname
, tblEmpl.location
, tblEmpl.estatus
, tblLocs.unit
, tblLocs.descript
, tblLocs.addr1
, tblLocs.city
, tblLocs.state
, tblLocs.zip
FROM tblEmpl
LEFT JOIN (
SELECT s.*
--Used to order the records for each eight_id by the date.
--Most recent date for each eight_id will have row_num = 1.
, ROW_NUMBER() OVER(PARTITION BY s.eight_id ORDER BY s.sigdate DESC) as row_num
FROM tblSign as s
WHERE s.formid = '9648'
AND s.sigact <> 'O'
AND s.sigdate >= '2022-11-01' --Not clear if this is just to limit results or an attempt to get most recent date in the failed original code.
) as ts
ON ts.eight_id = tblEmpl.eight_id
AND ts.row_num = 1 --Use to limit to most recent date.
LEFT JOIN tblLocs
ON tblEmpl.location = tblLocs.location
WHERE tblEmpl.estatus = 'A'
AND tblEmpl.location = '013'
ORDER BY
ts.sigdate ASC
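
Since no source data was posted, here is a minimal sketch of the derived-table pattern that can actually be run. It uses Python's built-in sqlite3 module as a stand-in for the real database (window functions need SQLite 3.25 or later), keeps only a few of the columns, and invents the eight_id values 'E1' to 'E3' and a third employee with no signature; all of those are assumptions for illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE tblEmpl (eight_id TEXT, fname TEXT, lname TEXT, location TEXT);
CREATE TABLE tblSign (eight_id TEXT, sigdate TEXT, sigact TEXT);
-- eight_id values and the third employee are hypothetical
INSERT INTO tblEmpl VALUES
  ('E1', 'Bill',   'Lee',  '023'),
  ('E2', 'Carter', 'Hill', '555'),
  ('E3', 'Dana',   'Roe',  '023');
INSERT INTO tblSign VALUES
  ('E1', '2022-11-01', 'A'),
  ('E1', '2022-10-01', 'A'),
  ('E2', '2022-11-01', 'A');
""")

# Number the signatures per employee, newest first, and join only row 1.
latest_rows = con.execute("""
SELECT ts.sigdate, e.fname, e.lname, e.location, ts.sigact
FROM tblEmpl AS e
LEFT JOIN (
    SELECT s.*,
           ROW_NUMBER() OVER (PARTITION BY eight_id ORDER BY sigdate DESC) AS row_num
    FROM tblSign AS s
) AS ts
  ON ts.eight_id = e.eight_id
 AND ts.row_num = 1
ORDER BY e.fname
""").fetchall()

for row in latest_rows:
    print(row)
```

Because the row_num = 1 test sits in the ON clause rather than in WHERE, employees without any signature still come back, with NULLs in the signature columns.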

You can use ROW_NUMBER to get the latest entry, in my case for every esignature, as I assumed it is unique to each signer:
WITH CTE AS
(SELECT
tblSign.sigdate,
tblSign.sigtime,
tblSign.sigact,
tblSign.esignature,
tblEmpl.fname,
tblEmpl.lname,
tblEmpl.location,
tblEmpl.estatus,
tblLocs.unit,
tblLocs.descript,
TblLocs.addr1,
tblLocs.city,
tblLocs.state,
tblLocs.zip,
ROW_NUMBER() OVER(PARTITION BY tblSign.esignature ORDER BY tblSign.sigdate DESC) rn
FROM
tblEmpl
LEFT JOIN
tblSign ON tblSign.eight_id = tblEmpl.eight_id
AND tblSign.formid = '9648'
AND tblSign.sigact <> 'O'
AND tblSign.sigdate >= '2022-11-01'
LEFT JOIN
tblLocs ON tblEmpl.location = tblLocs.location
WHERE
tblEmpl.estatus = 'A'
AND tblEmpl.location = '013')
SELECT sigdate,
sigtime,
sigact,
esignature,
fname,
lname,
location,
estatus,
unit,
descript,
addr1,
city,
state,
zip
FROM CTE
WHERE rn = 1
ORDER BY sigdate ASC;

Related

Combine multiple rows with different dates with overlapping variables (to capture first and last change dates)

I have the following data represented in a table like this:
User  Type     Date
A     Mobile   2019-01-10
A     Mobile   2019-01-20
A     Desktop  2019-03-01
A     Desktop  2019-03-20
A     Email    2021-01-01
A     Email    2020-01-02
A     Desktop  2021-01-03
A     Desktop  2021-01-04
A     Desktop  2021-01-05
Using PostgreSQL - I want to achieve the following:
User  First_Type  First_Type_Initial_Date  Last_Type  Last_Type_Initial_Date
A     Mobile      2019-01-10               Desktop    2021-01-03
So for each user, I want to capture the initial date and type, and also, on the same row (but in different columns), the last type they "switched" to, together with the first date on which that switch occurred rather than the last record of activity on that type.
Consider using a LAG window function and conditional aggregation, combined via multiple CTEs and self-joins:
WITH sub AS (
SELECT "user"
, "type"
, "date"
, CASE
WHEN LAG("type") OVER(PARTITION BY "user" ORDER BY "date") = "type"
THEN 0
ELSE 1
END "shift"
FROM myTable
), agg AS (
SELECT "user"
, MIN(CASE WHEN shift = 1 THEN "date" END) AS min_shift_dt
, MAX(CASE WHEN shift = 1 THEN "date" END) AS max_shift_dt
FROM sub
GROUP BY "user"
)
SELECT agg."user"
, s1."type" AS first_type
, s1."date" AS first_type_initial_date
, s2."type" AS last_type
, s2."date" AS last_type_initial_date
FROM agg
INNER JOIN sub AS s1
ON agg."user" = s1."user"
AND agg.min_shift_dt = s1."date"
INNER JOIN sub AS s2
ON agg."user" = s2."user"
AND agg.max_shift_dt = s2."date"
Online demo output:
user  first_type  first_type_initial_date  last_type  last_type_initial_date
A     Mobile      2019-01-10 00:00:00      Desktop    2021-01-03 00:00:00
Here is my solution with only window functions and no joins. The window is ordered by "Date" and framed over the whole partition so that first_value and last_value are deterministic:
with
prep as (
select *,
lag("Type") over(partition by "User" order by "Date") as "Lasttype"
from your_table_name
)
select distinct "User",
first_value("Type") over w as "First_Type",
first_value("Date") over w as "First_Type_Initial_Date",
last_value("Type") over w as "Last_Type",
last_value("Date") over w as "Last_Type_Initial_Date"
from prep
where "Type" <> "Lasttype" or "Lasttype" is null
window w as (partition by "User" order by "Date"
rows between unbounded preceding and unbounded following)
;
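I think this will work, but it sure feels ugly. There might be a better way to do this. Note that it picks the earliest date on which the last type ever appears, so if a user returns to a type they used before (as with Desktop in the sample), the date comes from the first run of that type, not from the latest switch.
SELECT a.User, a.Type AS First_Type, a.Date AS FirstTypeInitialDate, b.Type AS Last_Type, b.Date AS LastTypeInitialDate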
FROM table a
INNER JOIN table b ON a.User = b.User
WHERE a.Date = (SELECT MIN(c.Date) FROM table c WHERE c.User = a.User)
AND b.Date = (SELECT MIN(d.Date) FROM table d WHERE d.User = b.User
AND d.Type = (SELECT e.Type FROM table e WHERE e.User = d.User
AND e.Date = (SELECT MAX(f.Date) FROM table f WHERE f.User = e.User)))
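
To check the "switch" logic against the sample rows, here is a small runnable sketch. It uses Python's sqlite3 module in place of PostgreSQL (SQLite 3.25 or later for window functions), and the table name activity and column name user_name are made up for the example. It flags each row whose type differs from the previous one with LAG, then keeps the first and last flagged row per user.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE activity (user_name TEXT, type TEXT, date TEXT);
INSERT INTO activity VALUES
  ('A', 'Mobile',  '2019-01-10'), ('A', 'Mobile',  '2019-01-20'),
  ('A', 'Desktop', '2019-03-01'), ('A', 'Desktop', '2019-03-20'),
  ('A', 'Email',   '2021-01-01'), ('A', 'Email',   '2020-01-02'),
  ('A', 'Desktop', '2021-01-03'), ('A', 'Desktop', '2021-01-04'),
  ('A', 'Desktop', '2021-01-05');
""")

summary = con.execute("""
WITH flagged AS (
    -- previous type per user, in date order (NULL on the first row)
    SELECT user_name, type, date,
           LAG(type) OVER (PARTITION BY user_name ORDER BY date) AS prev_type
    FROM activity
), switches AS (
    -- keep only rows that start a new run of a type, numbered from both ends
    SELECT user_name, type, date,
           ROW_NUMBER() OVER (PARTITION BY user_name ORDER BY date)      AS rn_first,
           ROW_NUMBER() OVER (PARTITION BY user_name ORDER BY date DESC) AS rn_last
    FROM flagged
    WHERE prev_type IS NULL OR prev_type <> type
)
SELECT user_name,
       MAX(CASE WHEN rn_first = 1 THEN type END) AS first_type,
       MAX(CASE WHEN rn_first = 1 THEN date END) AS first_type_initial_date,
       MAX(CASE WHEN rn_last  = 1 THEN type END) AS last_type,
       MAX(CASE WHEN rn_last  = 1 THEN date END) AS last_type_initial_date
FROM switches
GROUP BY user_name
""").fetchall()

print(summary)
```

On the sample data the last switch is the move back to Desktop on 2021-01-03, even though Desktop first appeared in 2019.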

How to add a count/sum and group by in a CTE

Just a question on displaying one row per flight together with a count of how many crew members are on that flight.
I want to change the output so it displays a single record at flight level with two additional columns. One column (cabincrew) is the count of crew members that have 'CREWTYPE' = 'F' and the other column (cockpitcrew) is the count of crew members that have 'CREWTYPE' = 'C'.
So the query result should look like:
Flight DepartureDate DepartureAirport CREWBASE CockpitCrew CabinCrew
LS361 2016-05-19 BFS BFS 0 3
Can I have a little help tweaking the below query please:
WITH CTE AS (
SELECT cd.*, c.*, l.Carrier, l.FlightNumber, l.Suffix, l.ScheduledDepartureDate, l.ScheduledDepartureAirport
FROM
(SELECT *, ROW_NUMBER() OVER(PARTITION BY LegKey ORDER BY UpdateID DESC) AS RowNumber FROM Data.Crew) c
INNER JOIN
Data.CrewDetail cd
ON c.UpdateID = cd.CrewUpdateID
AND cd.IsPassive = 0
AND RowNumber = 1
INNER JOIN
Data.Leg l
ON c.LegKey = l.LegKey
)
SELECT
sac.Airline + CAST(sac.FlightNumber AS VARCHAR) + sac.Suffix AS Flight
, sac.DepartureDate
, sac.DepartureAirport
, sac.CREWBASE
, sac.CREWTYPE
, sac.EMPNO
, sac.FIRSTNAME
, sac.LASTNAME
, sac.SEX
FROM
Staging.SabreAssignedCrew sac
LEFT JOIN CTE cte
ON sac.Airline + CAST(sac.FlightNumber AS VARCHAR) + sac.Suffix = cte.Carrier + CAST(cte.FlightNumber AS VARCHAR) + cte.Suffix
AND sac.DepartureDate = cte.ScheduledDepartureDate
Please try this:
SELECT Flight,
DepartureDate,
DepartureAirport,
CREWBASE,
SUM(CASE WHEN CREWTYPE = 'F' THEN 1 ELSE 0 END) AS CabinCrew ,
SUM(CASE WHEN CREWTYPE = 'C' THEN 1 ELSE 0 END) AS CockpitCrew
FROM #Table
GROUP BY Flight, DepartureDate, DepartureAirport, CREWBASE
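Please try this (Flight is not a column of Staging.SabreAssignedCrew, so it is built from Airline, FlightNumber and Suffix as in the question):
select Airline + CAST(FlightNumber AS VARCHAR) + Suffix as Flight, DepartureDate, DepartureAirport, CREWBASE,
count(case when CREWTYPE='F' then 1 end) as CabinCrew,
count(case when CREWTYPE='C' then 1 end) as CockpitCrew
from Staging.SabreAssignedCrew
group by Airline + CAST(FlightNumber AS VARCHAR) + Suffix, DepartureDate, DepartureAirport, CREWBASE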
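
Both answers rely on conditional aggregation: a CASE expression inside SUM or COUNT, so each crew type is counted in its own column. A minimal runnable sketch, using Python's sqlite3 module as a stand-in for SQL Server and a simplified, hypothetical assigned_crew table (flight LS362 and all employee numbers are invented), could look like this:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE assigned_crew (
    flight TEXT, departure_date TEXT, departure_airport TEXT,
    crewbase TEXT, crewtype TEXT, empno INTEGER
);
-- LS361 mirrors the expected output in the question; LS362 is hypothetical
INSERT INTO assigned_crew VALUES
  ('LS361', '2016-05-19', 'BFS', 'BFS', 'F', 1),
  ('LS361', '2016-05-19', 'BFS', 'BFS', 'F', 2),
  ('LS361', '2016-05-19', 'BFS', 'BFS', 'F', 3),
  ('LS362', '2016-05-19', 'BFS', 'BFS', 'C', 4),
  ('LS362', '2016-05-19', 'BFS', 'BFS', 'C', 5),
  ('LS362', '2016-05-19', 'BFS', 'BFS', 'F', 6);
""")

# One row per flight; each SUM only counts rows of one crew type.
crew_counts = con.execute("""
SELECT flight, departure_date, departure_airport, crewbase,
       SUM(CASE WHEN crewtype = 'C' THEN 1 ELSE 0 END) AS cockpit_crew,
       SUM(CASE WHEN crewtype = 'F' THEN 1 ELSE 0 END) AS cabin_crew
FROM assigned_crew
GROUP BY flight, departure_date, departure_airport, crewbase
ORDER BY flight
""").fetchall()

for row in crew_counts:
    print(row)
```

Every non-aggregated column in the SELECT list also appears in GROUP BY, which is what collapses the crew rows to a single flight-level record.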

Use of MAX function in SQL query to filter data

The code below joins two tables, and I need to extract only the latest date per account, even though the tables hold multiple accounts and history records. I wanted to use the MAX function, but I'm not sure how to incorporate it in this case. I am using My SQL server.
Appreciate any help!
select
PROP.FileName,PROP.InsName, PROP.Status,
PROP.FileTime, PROP.SubmissionNo, PROP.PolNo,
PROP.EffDate,PROP.ExpDate, PROP.Region,
PROP.Underwriter, PROP_DATA.Data , PROP_DATA.Label
from
Property.dbo.PROP
inner join
Property.dbo.PROP_DATA on Property.dbo.PROP.FileID = Actuarial.dbo.PROP_DATA.FileID
where
(PROP_DATA.Label in ('Occupancy' , 'OccupancyTIV'))
and (PROP.EffDate >= '42278' and PROP.EffDate <= '42643')
and (PROP.Status = 'Bound')
and (Prop.FileTime = Max(Prop.FileTime))
order by
PROP.EffDate DESC
Assuming your DBMS supports windowing functions and the with clause, a max windowing function would work:
with all_data as (
select
PROP.FileName,PROP.InsName, PROP.Status,
PROP.FileTime, PROP.SubmissionNo, PROP.PolNo,
PROP.EffDate,PROP.ExpDate, PROP.Region,
PROP.Underwriter, PROP_DATA.Data , PROP_DATA.Label,
max (PROP.EffDate) over (partition by PROP.PolNo) as max_date
from Actuarial.dbo.PROP
inner join Actuarial.dbo.PROP_DATA
on Actuarial.dbo.PROP.FileID = Actuarial.dbo.PROP_DATA.FileID
where (PROP_DATA.Label in ('Occupancy' , 'OccupancyTIV'))
and (PROP.EffDate >= '42278' and PROP.EffDate <= '42643')
and (PROP.Status = 'Bound')
)
select
FileName, InsName, Status, FileTime, SubmissionNo,
PolNo, EffDate, ExpDate, Region, UnderWriter, Data, Label
from all_data
where EffDate = max_date
ORDER BY EffDate DESC
This also presupposes that any given account would not have two records on the same EffDate. If that can happen, and there is no other objective means to determine the latest record, you could also use row_number to pick a somewhat arbitrary record in the case of a tie.
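
A minimal sketch of that tie-breaking idea, run through Python's sqlite3 module rather than the asker's server: the simplified prop table, its snake_case columns and its rows are all invented, and using FileTime as the secondary sort key is an assumption (the answer only says the pick may be somewhat arbitrary).

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE prop (file_id INTEGER, pol_no TEXT, eff_date INTEGER, file_time TEXT);
-- hypothetical rows: policy P-100 has two records on the same eff_date
INSERT INTO prop VALUES
  (1, 'P-100', 42300, '2016-01-05 09:00'),
  (2, 'P-100', 42500, '2016-07-01 10:00'),
  (3, 'P-100', 42500, '2016-07-01 15:30'),
  (4, 'P-200', 42400, '2016-03-10 08:00');
""")

# ROW_NUMBER returns exactly one row per policy even when eff_date ties;
# file_time (an assumed tie-breaker) decides which of the tied rows wins.
latest_per_policy = con.execute("""
SELECT file_id, pol_no, eff_date, file_time
FROM (
    SELECT p.*,
           ROW_NUMBER() OVER (
               PARTITION BY pol_no
               ORDER BY eff_date DESC, file_time DESC
           ) AS rn
    FROM prop AS p
) AS ranked
WHERE rn = 1
ORDER BY pol_no
""").fetchall()

print(latest_per_policy)
```

Comparing EffDate to a windowed MAX would return both tied rows for P-100; numbering the rows is what guarantees a single one.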
Using straight SQL, you can use a self-join in a subquery in your where clause to eliminate values smaller than the max, or smaller than the top n largest, and so on. Just set the number in <= 1 to the number of top values you want per group.
Something like the following might do the trick, for example:
select
p.FileName
, p.InsName
, p.Status
, p.FileTime
, p.SubmissionNo
, p.PolNo
, p.EffDate
, p.ExpDate
, p.Region
, p.Underwriter
, pd.Data
, pd.Label
from Actuarial.dbo.PROP p
inner join Actuarial.dbo.PROP_DATA pd
on p.FileID = pd.FileID
where (
select count(*)
from Actuarial.dbo.PROP p2
where p2.FileID = p.FileID
and p2.EffDate >= p.EffDate
) <= 1
and (
pd.Label in ('Occupancy' , 'OccupancyTIV')
and p.Status = 'Bound'
)
ORDER BY p.EffDate DESC
Have a look at this stackoverflow question for a full working example.
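
For a self-contained illustration of the counting trick, here is a sketch using Python's sqlite3 module with a hypothetical two-column prop table. For each row, the subquery counts the rows in the same group whose date is greater than or equal to its own; a row is among the top n exactly when that count is at most n (assuming no duplicate dates within a group).

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE prop (pol_no TEXT, eff_date INTEGER);
-- hypothetical sample rows
INSERT INTO prop VALUES
  ('P-100', 42300), ('P-100', 42500), ('P-100', 42400),
  ('P-200', 42600), ('P-200', 42350);
""")

def top_n_per_policy(n):
    """Return the n most recent eff_date rows for each pol_no."""
    return con.execute("""
        SELECT p.pol_no, p.eff_date
        FROM prop AS p
        WHERE (
            SELECT COUNT(*)
            FROM prop AS p2
            WHERE p2.pol_no = p.pol_no
              AND p2.eff_date >= p.eff_date
        ) <= ?
        ORDER BY p.pol_no, p.eff_date DESC
    """, (n,)).fetchall()

print(top_n_per_policy(1))
print(top_n_per_policy(2))
```

The correlated subquery runs once per outer row, so on large tables a window function is usually the cheaper choice where the DBMS supports it.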
Not tested (foo, bar and xy are placeholders):
with temp1 as
(
select foo
from bar
where xy = (select MAX(xy) from bar)
)
select PROP.FileName,PROP.InsName, PROP.Status,
PROP.FileTime, PROP.SubmissionNo, PROP.PolNo,
PROP.EffDate,PROP.ExpDate, PROP.Region,
PROP.Underwriter, PROP_DATA.Data , PROP_DATA.Label
from Actuarial.dbo.PROP
inner join temp1 t
on Actuarial.dbo.PROP.FileID = t.dbo.PROP_DATA.FileID
ORDER BY PROP.EffDate DESC

Group by based on 3 tables that are joined using left join

Below is my query that fetches the fields based on the left join of 3 tables. My requirement is to get all the fields for the most recent SystemDateTime in table Debug.T. For example, if I try it for HardwareId = 550803413, it returns 2 records with 2 different SystemDateTime values. I need to filter it so that I get only 1 record per HardwareId, the one with the most recent SystemDateTime. Data is stored in Google BigQuery.
Any help would be appreciated.
SELECT HardwareId, e.Carrier, max(d.SystemDateTime) as DateTime,
CASE
WHEN lower(DebugData) LIKE 'veri%' THEN 'Verizon'
WHEN REGEXP_MATCH(lower(DebugData),'\\d+') THEN c.Network
END
AS ActualData
FROM (
SELECT
HardwareId, SystemDateTime, max(SystemDateTime) as max_date,
INTEGER(RTRIM(SUBSTR(REGEXP_REPLACE(REGEXP_REPLACE(DebugData,'\\"',' '), '\\?',' ') ,0,3))) AS d1,
INTEGER(RTRIM(SUBSTR(REGEXP_REPLACE(DebugData,'[^a-zA-Z0-9]',' '),4,LENGTH(DebugData)-3))) AS d2
FROM TABLE_DATE_RANGE([Debug.T],TIMESTAMP('2016-05-16'),TIMESTAMP('2016-05-16'))
GROUP BY HardwareId, DebugReason, DebugData, SystemDateTime
HAVING DebugReason = 31) AS d
LEFT JOIN
(
SELECT Mcc, Mnc as Mnc, Network from [Debug.Carrier]
) As c
ON c.Mcc = d.d1 and c.Mnc = d.d2
INNER JOIN
(
SELECT VehicleId, APNCarrier FROM [Info_20160516]
) As e
ON d.HardwareId = e.VehicleId
GROUP BY HardwareId, ActualData, e.Carrier
HAVING HardwareId = 550803413
Current output:
HardwareId DebugReason DebugData e_APNCarrier DateTime ActualDebugData
550473814 50013 23430"? Unknown 2016-05-16 08:09:09.534597 Everyth. Ev.wh./T-Mobile
550473814 50013 23410"? Unknown 2016-05-16 07:50:48.526288 O2 Ltd.
550473814 50013 23415"? Unknown 2016-05-16 23:54:37.487154 Vodafone
Expected output:
Since the most recent SystemDateTime is 23:54:37.487154, the query should keep only the record with that SystemDateTime:
HardwareId DebugReason DebugData e_APNCarrier DateTime ActualDebugData
550473814 50013 23415"? Unknown 2016-05-16 23:54:37.487154 Vodafone
So you just want the latest record per HardwareId based on DateTime? Try this:
SELECT * FROM (
SELECT HardwareId, e.Carrier, d.SystemDateTime as DateTime,
CASE
WHEN lower(DebugData) LIKE 'veri%' THEN 'Verizon'
WHEN REGEXP_MATCH(lower(DebugData),'\\d+') THEN c.Network
END
AS ActualData,
ROW_NUMBER() OVER (PARTITION BY HARDWAREID ORDER BY d.SystemDateTime desc) RN
FROM (
SELECT
HardwareId, SystemDateTime, max(SystemDateTime) as max_date,
INTEGER(RTRIM(SUBSTR(REGEXP_REPLACE(REGEXP_REPLACE(DebugData,'\\"',' '), '\\?',' ') ,0,3))) AS d1,
INTEGER(RTRIM(SUBSTR(REGEXP_REPLACE(DebugData,'[^a-zA-Z0-9]',' '),4,LENGTH(DebugData)-3))) AS d2
FROM TABLE_DATE_RANGE([Debug.T],TIMESTAMP('2016-05-16'),TIMESTAMP('2016-05-16'))
GROUP BY HardwareId, DebugReason, DebugData, SystemDateTime
HAVING DebugReason = 31) AS d
LEFT JOIN
(
SELECT Mcc, Mnc as Mnc, Network from [Debug.Carrier]
) As c
ON c.Mcc = d.d1 and c.Mnc = d.d2
INNER JOIN
(
SELECT VehicleId, APNCarrier FROM [Info_20160516]
) As e
ON d.HardwareId = e.VehicleId
HAVING HardwareId = 550803413
)
WHERE RN = 1

Fastest way to check if the most recent result for a patient has a certain value

Mssql < 2005
I have a complex database with lots of tables, but for now only the patient table and the measurements table matter.
What I need is the number of patients where the most recent value of 'code' matches a certain value. Also, datemeasurement has to be after '2012-04-01'. I have solved this in two different ways:
SELECT
COUNT(P.patid)
FROM T_Patients P
WHERE P.patid IN (SELECT patid
FROM T_Measurements M WHERE (M.code ='xxxx' AND result= 'xx')
AND datemeasurement =
(SELECT MAX(datemeasurement) FROM T_Measurements
WHERE datemeasurement > '2012-01-04' AND patid = M.patid
GROUP BY patid)
GROUP by patid)
AND:
SELECT
COUNT(P.patid)
FROM T_Patient P
WHERE 1 = (SELECT TOP 1 case when result = 'xx' then 1 else 0 end
FROM T_Measurements M
WHERE (M.code ='xxxx') AND datemeasurement > '2012-01-04' AND patid = P.patid
ORDER by datemeasurement DESC
)
This works just fine, but it makes the query incredibly slow because it has to join the outer table on the subquery (if you know what I mean). The query takes 10 seconds without the most recent check, and 3 minutes with the most recent check.
I'm pretty sure this can be done a lot more efficiently, so please enlighten me if you will :).
I tried implementing HAVING datemeasurement=MAX(datemeasurement) but that keeps throwing errors at me.
So my approach would be to write a query just getting all the last patient results since 01-04-2012, and then filtering that for your codes and results. So something like
select
count(1)
from
T_Measurements M
inner join (
SELECT PATID, MAX(datemeasurement) as lastMeasuredDate from
T_Measurements M
where datemeasurement > '01-04-2012'
group by patID
) lastMeasurements
on lastMeasurements.lastmeasuredDate = M.datemeasurement
and lastMeasurements.PatID = M.PatID
where
M.Code = 'Xxxx' and M.result = 'XX'
The fastest way may be to use row_number() (available from SQL Server 2005 onward):
SELECT COUNT(m.patid)
from (select m.*,
ROW_NUMBER() over (partition by patid order by datemeasurement desc) as seqnum
FROM T_Measurements m
where datemeasurement > '2012-01-04'
) m
where seqnum = 1 and code = 'XXX' and result = 'xx'
Row_number() enumerates the records for each patient, so the most recent gets a value of 1. The result is just a selection.
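
To see the enumeration at work, here is a small runnable sketch using Python's sqlite3 module as a stand-in for SQL Server (SQLite 3.25 or later). The five measurement rows are invented, and 'xxxx' / 'xx' are the placeholder values from the question.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE T_Measurements (patid INTEGER, code TEXT, result TEXT, datemeasurement TEXT);
-- hypothetical sample rows
INSERT INTO T_Measurements VALUES
  (1, 'xxxx', 'xx', '2012-02-01'),
  (1, 'xxxx', 'yy', '2012-03-01'),  -- patient 1: latest result is 'yy', not counted
  (2, 'xxxx', 'yy', '2012-02-10'),
  (2, 'xxxx', 'xx', '2012-05-01'),  -- patient 2: latest result is 'xx', counted
  (3, 'xxxx', 'xx', '2011-12-31');  -- patient 3: before the cutoff, ignored
""")

# seqnum = 1 marks each patient's most recent measurement after the cutoff;
# the outer filter then checks whether that single row has the wanted values.
(matching_patients,) = con.execute("""
SELECT COUNT(*)
FROM (
    SELECT m.*,
           ROW_NUMBER() OVER (PARTITION BY patid ORDER BY datemeasurement DESC) AS seqnum
    FROM T_Measurements AS m
    WHERE datemeasurement > '2012-01-04'
) AS latest
WHERE seqnum = 1 AND code = 'xxxx' AND result = 'xx'
""").fetchone()

print(matching_patients)
```

Note that the code and result filters are applied after the numbering, so a patient whose latest measurement has a different code or result is not counted, even if an older row would match.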