Group by based on 3 tables that are joined using left join - sql

Below is my query, which fetches fields from a left join of 3 tables. My requirement is to get all the fields for the most recent SystemDateTime in table Debug.T. For example, if I run it for HardwareId = 550803413, it returns 2 records with 2 different SystemDateTime values. I need to filter it so that I get only 1 record per HardwareId, based on the most recent SystemDateTime. The data is stored in Google BigQuery.
Any help would be appreciated.
SELECT HardwareId, e.Carrier, max(d.SystemDateTime) as DateTime,
CASE
WHEN lower(DebugData) LIKE 'veri%' THEN 'Verizon'
WHEN REGEXP_MATCH(lower(DebugData),'\\d+') THEN c.Network
END
AS ActualData
FROM (
SELECT
HardwareId, SystemDateTime, max(SystemDateTime) as max_date,
INTEGER(RTRIM(SUBSTR(REGEXP_REPLACE(REGEXP_REPLACE(DebugData,'\\"',' '), '\\?',' ') ,0,3))) AS d1,
INTEGER(RTRIM(SUBSTR(REGEXP_REPLACE(DebugData,'[^a-zA-Z0-9]',' '),4,LENGTH(DebugData)-3))) AS d2
FROM TABLE_DATE_RANGE([Debug.T],TIMESTAMP('2016-05-16'),TIMESTAMP('2016-05-16'))
GROUP BY HardwareId, DebugReason, DebugData, SystemDateTime
HAVING DebugReason = 31) AS d
LEFT JOIN
(
SELECT Mcc, Mnc as Mnc, Network from [Debug.Carrier]
) As c
ON c.Mcc = d.d1 and c.Mnc = d.d2
INNER JOIN
(
SELECT VehicleId, APNCarrier FROM [Info_20160516]
) As e
ON d.HardwareId = e.VehicleId
GROUP BY HardwareId, ActualData, e.Carrier
HAVING HardwareId = 550803413
Current output:
HardwareId DebugReason DebugData e_APNCarrier DateTime ActualDebugData
550473814 50013 23430"? Unknown 2016-05-16 08:09:09.534597 Everyth. Ev.wh./T-Mobile
550473814 50013 23410"? Unknown 2016-05-16 07:50:48.526288 O2 Ltd.
550473814 50013 23415"? Unknown 2016-05-16 23:54:37.487154 Vodafone
Expected output:
Since the most recent SystemDateTime is 23:54:37.487154, the query should keep only the record with that SystemDateTime.
HardwareId DebugReason DebugData e_APNCarrier DateTime ActualDebugData
550473814 50013 23415"? Unknown 2016-05-16 23:54:37.487154 Vodafone

So you just want the latest record per HardwareId based on DateTime? Try this:
SELECT * FROM (
SELECT HardwareId, e.APNCarrier, d.SystemDateTime as DateTime,
CASE
WHEN lower(DebugData) LIKE 'veri%' THEN 'Verizon'
WHEN REGEXP_MATCH(lower(DebugData),'\\d+') THEN c.Network
END
AS ActualData,
ROW_NUMBER() OVER (PARTITION BY HARDWAREID ORDER BY d.SystemDateTime desc) RN
FROM (
SELECT
HardwareId, SystemDateTime, max(SystemDateTime) as max_date,
INTEGER(RTRIM(SUBSTR(REGEXP_REPLACE(REGEXP_REPLACE(DebugData,'\\"',' '), '\\?',' ') ,0,3))) AS d1,
INTEGER(RTRIM(SUBSTR(REGEXP_REPLACE(DebugData,'[^a-zA-Z0-9]',' '),4,LENGTH(DebugData)-3))) AS d2
FROM TABLE_DATE_RANGE([Debug.T],TIMESTAMP('2016-05-16'),TIMESTAMP('2016-05-16'))
GROUP BY HardwareId, DebugReason, DebugData, SystemDateTime
HAVING DebugReason = 31) AS d
LEFT JOIN
(
SELECT Mcc, Mnc as Mnc, Network from [Debug.Carrier]
) As c
ON c.Mcc = d.d1 and c.Mnc = d.d2
INNER JOIN
(
SELECT VehicleId, APNCarrier FROM [Info_20160516]
) As e
ON d.HardwareId = e.VehicleId
HAVING HardwareId = 550803413
)
WHERE RN = 1
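The ROW_NUMBER pattern above can be tried in isolation. Below is a minimal sketch that runs it against an in-memory SQLite database through Python's sqlite3 module instead of BigQuery. The table name debug_rows and the fourth row are made up for illustration; the first three rows reuse the timestamps from the question.

```python
import sqlite3

# In-memory database; window functions need SQLite 3.25 or newer.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE debug_rows (HardwareId INTEGER, DebugData TEXT, SystemDateTime TEXT);
INSERT INTO debug_rows VALUES
  (550473814, '23430', '2016-05-16 08:09:09.534597'),
  (550473814, '23410', '2016-05-16 07:50:48.526288'),
  (550473814, '23415', '2016-05-16 23:54:37.487154'),
  (550803413, '23415', '2016-05-16 10:00:00.000000');  -- hypothetical row
""")

# Number the rows of each HardwareId from newest to oldest, then keep row 1.
latest = conn.execute("""
SELECT HardwareId, DebugData, SystemDateTime
FROM (
  SELECT *,
         ROW_NUMBER() OVER (PARTITION BY HardwareId
                            ORDER BY SystemDateTime DESC) AS rn
  FROM debug_rows
)
WHERE rn = 1
ORDER BY HardwareId
""").fetchall()

for row in latest:
    print(row)
```

Each HardwareId comes back exactly once, with the row that has the greatest SystemDateTime. The timestamps are stored as ISO-formatted text here, which sorts correctly as strings; a real timestamp column would be ordered natively.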

Related

How to get the most recent record of multiple of the same records in a table while joining another table?

SELECT tblSign.sigdate,tblSign.sigtime,tblSign.sigact,tblSign.esignature,tblEmpl.fname,tblEmpl.lname,tblEmpl.location, tblEmpl.estatus,tblLocs.unit,tblLocs.descript,TblLocs.addr1,tblLocs.city,tblLocs.state, tblLocs.zip
FROM tblEmpl
LEFT JOIN tblSign
ON tblSign.eight_id = tblEmpl.eight_id
AND tblSign.formid = '9648'
AND tblSign.sigact <> 'O'
AND tblSign.sigdate >= '2022-11-01'
LEFT JOIN tblLocs
ON tblEmpl.location = tblLocs.location
WHERE tblEmpl.estatus = 'A'
AND tblEmpl.location = '013'
ORDER BY
tblSign.sigdate ASC;
tblSign has multiple records with the same eight_id, so I'm trying to join the tables and get only the most recent record from tblSign for each eight_id.
Data I get
Sigdate     fname   lname  location  sigact
2022-11-01  Bill    Lee    023       A
2022-10-01  Bill    Lee    023       A
2022-11-01  Carter  Hill   555       A
This is what I want :
Sigdate     fname   lname  location  sigact
2022-11-01  Bill    Lee    023       A
2022-11-01  Carter  Hill   555       A
Start by getting into better code-writing habits. Putting all the column names on one long line hurts readability and, consequently, troubleshooting. You can select the most recent record from a table by using the ROW_NUMBER function. I took your code, cleaned it up, and added a derived table that contains a ROW_NUMBER column. I can't validate that the query works because you didn't post example source data from your tblEmpl, tblSign, and tblLocs tables. I'm not sure whether the tblSign.sigdate >= '2022-11-01' condition belongs in the derived table, because it's not clear if you were just limiting the date range or if that was your attempt to retrieve the most recent date.
SELECT
ts.sigdate
, ts.sigtime
, ts.sigact
, ts.esignature
, tblEmpl.fname
, tblEmpl.lname
, tblEmpl.location
, tblEmpl.estatus
, tblLocs.unit
, tblLocs.descript
, tblLocs.addr1
, tblLocs.city
, tblLocs.state
, tblLocs.zip
FROM tblEmpl
LEFT JOIN (
SELECT s.*
--Used to order the records for each eight_id by the date.
--Most recent date for each eight_id will have row_num = 1.
, ROW_NUMBER() OVER(PARTITION BY s.eight_id ORDER BY s.sigdate DESC) as row_num
FROM tblSign as s
WHERE s.formid = '9648'
AND s.sigact <> 'O'
AND s.sigdate >= '2022-11-01' --Not clear if this is just to limit results or an attempt to get most recent date in the failed original code.
) as ts
ON ts.eight_id = tblEmpl.eight_id
AND ts.row_num = 1 --Use to limit to most recent date.
LEFT JOIN tblLocs
ON tblEmpl.location = tblLocs.location
WHERE tblEmpl.estatus = 'A'
AND tblEmpl.location = '013'
ORDER BY
ts.sigdate ASC
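Since no source data was posted, here is a small self-contained sketch of the same derived-table idea, run on SQLite through Python's sqlite3 module rather than on the original server. The tables are cut down to a few columns, and the eight_id values and the employee Dana Roe (who has no signature at all) are invented for illustration.

```python
import sqlite3

# In-memory database; window functions need SQLite 3.25 or newer.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblEmpl (eight_id TEXT, fname TEXT, lname TEXT);
CREATE TABLE tblSign (eight_id TEXT, sigdate TEXT);
INSERT INTO tblEmpl VALUES
  ('E1', 'Bill', 'Lee'), ('E2', 'Carter', 'Hill'), ('E3', 'Dana', 'Roe');
INSERT INTO tblSign VALUES
  ('E1', '2022-11-01'), ('E1', '2022-10-01'), ('E2', '2022-11-01');
""")

# The derived table numbers each employee's signatures from newest to oldest.
# Filtering on row_num inside the ON clause keeps the LEFT JOIN semantics,
# so employees without any signature still appear (with NULL sigdate).
rows = conn.execute("""
SELECT e.fname, e.lname, ts.sigdate
FROM tblEmpl AS e
LEFT JOIN (
  SELECT eight_id, sigdate,
         ROW_NUMBER() OVER (PARTITION BY eight_id
                            ORDER BY sigdate DESC) AS row_num
  FROM tblSign
) AS ts
  ON ts.eight_id = e.eight_id
 AND ts.row_num = 1
ORDER BY e.lname
""").fetchall()

for row in rows:
    print(row)
```

Putting row_num = 1 in the ON clause rather than in WHERE is a deliberate choice: in WHERE it would drop employees with no matching signature, which turns the LEFT JOIN into an inner join.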
You can use ROW_NUMBER to get the last entry. In my case I partition by esignature, as I assumed it must be unique.
WITH CTE AS
(SELECT
tblSign.sigdate,
tblSign.sigtime,
tblSign.sigact,
tblSign.esignature,
tblEmpl.fname,
tblEmpl.lname,
tblEmpl.location,
tblEmpl.estatus,
tblLocs.unit,
tblLocs.descript,
TblLocs.addr1,
tblLocs.city,
tblLocs.state,
tblLocs.zip,
ROW_NUMBER() OVER(PARTITION BY tblSign.esignature ORDER BY tblSign.sigdate DESC) rn
FROM
tblEmpl
LEFT JOIN
tblSign ON tblSign.eight_id = tblEmpl.eight_id
AND tblSign.formid = '9648'
AND tblSign.sigact <> 'O'
AND tblSign.sigdate >= '2022-11-01'
LEFT JOIN
tblLocs ON tblEmpl.location = tblLocs.location
WHERE
tblEmpl.estatus = 'A'
AND tblEmpl.location = '013')
SELECT sigdate,
sigtime,
sigact,
esignature,
fname,
lname,
location,
estatus,
unit,
descript,
addr1,
city,
state,
zip
FROM CTE
WHERE rn = 1
ORDER BY sigdate ASC;

Query error: Column name ICUSTAY_ID is ambiguous. Using multiple subqueries in BigQuery

Hi, I receive the error "Query error: Column name ICUSTAY_ID is ambiguous", which points at the third-to-last line of the code below. Can you please help me? Thank you so much!
I am an SQL beginner.
WITH t AS
(
SELECT
*
FROM
(
SELECT *,
DATETIME_DIFF(CHARTTIME, INTIME, MINUTE) AS pi_recorded
FROM
(
SELECT
*
FROM
(
SELECT * FROM
(SELECT i.SUBJECT_ID, p.dob, i.hadm_id, p.GENDER, a.ETHNICITY, a.ADMITTIME, a.INSURANCE, i.ICUSTAY_ID,
i.DBSOURCE, i.INTIME, DATETIME_DIFF(a.ADMITTIME, p.DOB, DAY) AS age,
CASE
WHEN DATETIME_DIFF(a.ADMITTIME, p.DOB, DAY) <= 32485
THEN 'adult'
WHEN DATETIME_DIFF(a.ADMITTIME, p.DOB, DAY) > 32485
then '>89'
END AS age_group
FROM `project.mimic3.ICUSTAYS` AS i
INNER JOIN `project.mimic3.PATIENTS` AS p ON i.SUBJECT_ID = p.SUBJECT_ID
INNER JOIN `project.mimic3.ADMISSIONS` AS a ON i.HADM_ID = a.HADM_ID)
WHERE age >= 6570
) AS t1
LEFT JOIN
(
SELECT ITEMID, ICUSTAY_ID, CHARTTIME, VALUE, FROM `project.mimic3.CHARTEVENTS`
WHERE ITEMID = 551 OR ITEMID = 552 OR ITEMID = 553 OR ITEMID = 224631
OR ITEMID = 224965 OR ITEMID = 224966
) AS t2
ON t1.ICUSTAY_ID = t2.ICUSTAY_ID
)
)
WHERE ITEMID IN (552, 553, 224965, 224966) AND pi_recorded <= 1440
)
SELECT ICUSTAY_ID #### Query error: Column name ICUSTAY_ID is ambiguous
FROM t
GROUP BY ICUSTAY_ID;
Both t1 and t2 have a column called ICUSTAY_ID. When you join them together into a single dataset with SELECT *, you end up with 2 columns with the same name, which can't work because there is no way of uniquely identifying each column.
You need to alias these columns in your code, or leave one of them out if you don't need both.
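As a rough illustration of the aliasing fix, the sketch below lists the joined columns explicitly and renames the second ICUSTAY_ID, so that the outer query has exactly one column with that name. It runs on SQLite through Python's sqlite3 module; SQLite does not raise the same error as BigQuery, so this only shows the shape of the fix. The table names icustays and chartevents, the alias chart_icustay_id, and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE icustays (ICUSTAY_ID INTEGER, INTIME TEXT);
CREATE TABLE chartevents (ICUSTAY_ID INTEGER, ITEMID INTEGER);
INSERT INTO icustays VALUES (1, '2020-01-01 00:00:00'), (2, '2020-01-02 00:00:00');
INSERT INTO chartevents VALUES (1, 552), (1, 553);
""")

# Instead of SELECT * over the join, name every column and give the
# duplicate key from t2 its own alias (or simply leave it out).
stay_ids = conn.execute("""
WITH t AS (
  SELECT t1.ICUSTAY_ID,
         t1.INTIME,
         t2.ICUSTAY_ID AS chart_icustay_id,
         t2.ITEMID
  FROM icustays AS t1
  LEFT JOIN chartevents AS t2
    ON t1.ICUSTAY_ID = t2.ICUSTAY_ID
)
SELECT ICUSTAY_ID
FROM t
GROUP BY ICUSTAY_ID
ORDER BY ICUSTAY_ID
""").fetchall()

print(stay_ids)
```

In BigQuery standard SQL, writing `SELECT t1.*, t2.* EXCEPT (ICUSTAY_ID)` in the joining subquery should achieve the same thing without listing every column.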

How to get data from 2 rows which has same data in all columns except one in MSSQL

As in my title, I want to take data from 2 rows, but in my case every 2nd row has one value that differs from the first row.
I want to take all the common data along with the differing data as a single row.
Here you can see that each row has the same values as another row, except for the last column of the 2nd row.
Thanks.
Edits Result :
I suspect you have some kind of ordering column that specifies the actual order of your data. If so, you can use the row_number() function:
select * from (
select *,
row_number() over (partition by <common data cols> order by ? desc) Seq
from table t
) t
where seq = 1;
EDIT: I wouldn't rely on your inventort_item_id column for this, but you could use creation_date for ordering:
SELECT
EPI.ITEM_CODE, LMP.PROD_DESC, LLPC.COLOC_PROD_PRICE,
BASE_PATH + '' + EPI.IMAGE_FOLDER_NAME + '/' + EPI.IMAGE_DESCRIPTION AS POPULAR_PRODUCTS_IMAGE_PATHS
FROM (SELECT *,
ROW_NUMBER() OVER (PARTITION BY ITEM_CODE ORDER BY creation_date DESC) as Seq
FROM ECOM_PRODUCT_IMAGES EPI
) EPI
INNER JOIN ECOM_POPULAR_PRODUCTS_MAPPING EPPIM ON EPPIM.ITEM_CODE = EPI.ITEM_CODE
INNER JOIN LOM_MST_PRODUCT LMP ON LMP.PROD_CODE = EPI.ITEM_CODE
INNER JOIN LOM_LNK_PROD_COMP LLPC ON LLPC.COLOC_PROD_CODE = LMP.PROD_CODE
WHERE EPI.Seq = 1 AND
EPPIM.ITEM_STATUS = 'ACTIVE';
EDIT 2: In that case you need to use a GROUP BY clause with conditional aggregation:
SELECT
EPI.ITEM_CODE, LMP.PROD_DESC, LLPC.COLOC_PROD_PRICE,
MAX(CASE WHEN EPI.Seq = 2
THEN (BASE_PATH + '' + EPI.IMAGE_FOLDER_NAME + '/' + EPI.IMAGE_DESCRIPTION)
END) AS POPULAR_PRODUCTS_IMAGE_PATHS,
MAX(CASE WHEN EPI.Seq = 1
THEN (BASE_PATH + '' + EPI.IMAGE_FOLDER_NAME + '/' + EPI.IMAGE_DESCRIPTION)
END) AS PATH_NEW
FROM (SELECT *,
ROW_NUMBER() OVER (PARTITION BY ITEM_CODE ORDER BY creation_date DESC) as Seq
FROM ECOM_PRODUCT_IMAGES EPI
) EPI
INNER JOIN ECOM_POPULAR_PRODUCTS_MAPPING EPPIM ON EPPIM.ITEM_CODE = EPI.ITEM_CODE
INNER JOIN LOM_MST_PRODUCT LMP ON LMP.PROD_CODE = EPI.ITEM_CODE
INNER JOIN LOM_LNK_PROD_COMP LLPC ON LLPC.COLOC_PROD_CODE = LMP.PROD_CODE
WHERE EPPIM.ITEM_STATUS = 'ACTIVE'
GROUP BY EPI.ITEM_CODE, LMP.PROD_DESC, LLPC.COLOC_PROD_PRICE;
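To see the conditional aggregation in action without the original tables, here is a minimal sketch on SQLite through Python's sqlite3 module. The table name images, the column image_path, and the creation_date values are assumptions chosen for illustration.

```python
import sqlite3

# In-memory database; window functions need SQLite 3.25 or newer.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE images (ITEM_CODE TEXT, image_path TEXT, creation_date TEXT);
INSERT INTO images VALUES
  ('P0001', 'some_path_to_img1.jpg', '2020-01-01'),
  ('P0001', 'some_path_to_img2.jpg', '2020-01-02'),
  ('P0002', 'some_path_to_img1.jpg', '2020-01-01'),
  ('P0002', 'some_path_to_img2.jpg', '2020-01-02');
""")

# Seq = 1 is the newest image per item, Seq = 2 the one before it.
# MAX(CASE ...) picks the single non-NULL value in each group, which
# turns the two rows per item into two columns on one row.
pivoted = conn.execute("""
SELECT ITEM_CODE,
       MAX(CASE WHEN Seq = 2 THEN image_path END) AS POPULAR_PRODUCTS_IMAGE_PATHS,
       MAX(CASE WHEN Seq = 1 THEN image_path END) AS PATH_NEW
FROM (
  SELECT *,
         ROW_NUMBER() OVER (PARTITION BY ITEM_CODE
                            ORDER BY creation_date DESC) AS Seq
  FROM images
)
GROUP BY ITEM_CODE
ORDER BY ITEM_CODE
""").fetchall()

for row in pivoted:
    print(row)
```

Items with only one image would get NULL in the Seq = 2 column, which may or may not be what is wanted.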
Here is my approach, also using a window function.
sample data
if object_id('tempdb..#x') is not null drop table #x
CREATE TABLE #x (ITEM_CODE VARCHAR(10), PROD_DESC VARCHAR(20),
COLOR_PROD_PRICE DECIMAL, POPULAR_PRODUCTS_IMAGE_PATHS VARCHAR(200))
INSERT INTO #X(ITEM_CODE,PROD_DESC,COLOR_PROD_PRICE,POPULAR_PRODUCTS_IMAGE_PATHS) VALUES
('P0001', 'Axe Brand', 88.000, 'some_path_to_img1.jpg'),
('P0001', 'Axe Brand', 88.000, 'some_path_to_img2.jpg'),
('P0002', 'Almond Nuts', 499.000, 'some_path_to_img1.jpg'),
('P0002', 'Almond Nuts', 499.000, 'some_path_to_img2.jpg')
query - just change #x to your table and it should work
;WITH my_cte as
(
SELECT *,
ROW_NUMBER() OVER(PARTITION BY ITEM_CODE ORDER BY POPULAR_PRODUCTS_IMAGE_PATHS) AS 'track_row'
FROM #x
)
SELECT a.ITEM_CODE, a.PROD_DESC, a.COLOR_PROD_PRICE,
a.POPULAR_PRODUCTS_IMAGE_PATHS + ' ' + b.POPULAR_PRODUCTS_IMAGE_PATHS AS 'POPULAR_PRODUCTS_IMAGE_PATHS'
FROM my_cte AS a
INNER JOIN
my_cte AS b ON a.ITEM_CODE=b.ITEM_CODE
WHERE a.track_row=1 AND b.track_row=2
output
ITEM_CODE PROD_DESC COLOR_PROD_PRICE POPULAR_PRODUCTS_IMAGE_PATHS
P0001 Axe Brand 88 some_path_to_img1.jpg some_path_to_img2.jpg
P0002 Almond Nuts 499 some_path_to_img1.jpg some_path_to_img2.jpg

How can I create a view which shows monthly values on a daily basis

I have a view which is defined by the following code
CREATE VIEW [dbo].V_SOME_VIEW AS
WITH all_dates AS (SELECT DISTINCT(read_dtime) AS date FROM t_periodic_value),
theObjects AS (SELECT * FROM t_object)
SELECT
ad.date,
objs.id,
pv1.value as theValue
FROM all_dates ad
LEFT JOIN theObjects objs ON
objs.start_date <= ad.date AND (objs.end_date IS NULL OR (objs.end_date IS NOT NULL AND objs.end_date >= ad.date))
LEFT JOIN t_periodic_value pv1 ON pv1.data_point_id = (SELECT id FROM t_data_point WHERE object_id = objs.id AND measurement_id = 'MonthlyValue')
AND pv1.read_dtime = ad.date AND pv1.latest_ind = 1
GO
If I run a select for any given month, it gives me output along the lines of:
Date | ID | theValue
01/01/1990 | someFacility | 1000
02/01/1990 | someFacility | NULL
03/01/1990 | someFacility | NULL
...
and so on for the rest of the month. Nulls are returned for every date except the first as the value is calculated on a monthly basis. Is there a way I can define the view so that for every other day in the month, the value from the 1st is used?
Use a window function:
CREATE VIEW [dbo].V_SOME_VIEW AS
WITH all_dates AS (
SELECT DISTINCT(read_dtime) AS date
FROM t_periodic_value
),
theObjects AS ( -- no idea why you are doing this
SELECT *
FROM t_object
)
SELECT ad.date, objs.id,
-- Partition by object as well, so one object's monthly value is not added to another's.
SUM(pv1.value) OVER (PARTITION BY objs.id, YEAR(ad.date), MONTH(ad.date)) as theValue
FROM all_dates ad LEFT JOIN
theObjects objs
ON objs.start_date <= ad.date AND (objs.end_date IS NULL OR objs.end_date >= ad.date) LEFT JOIN
t_periodic_value pv1
ON pv1.data_point_id = (SELECT id FROM t_data_point WHERE object_id = objs.id AND measurement_id = 'MonthlyValue')
AND pv1.read_dtime = ad.date AND pv1.latest_ind = 1
GO
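The effect of the windowed SUM can be checked on a toy table. The sketch below uses SQLite through Python's sqlite3 module, so strftime('%Y-%m', ...) stands in for YEAR() and MONTH(). The table daily_values, its ISO-formatted dates, and the February rows are assumptions for illustration.

```python
import sqlite3

# In-memory database; window functions need SQLite 3.25 or newer.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE daily_values (read_date TEXT, object_id TEXT, value INTEGER);
INSERT INTO daily_values VALUES
  ('1990-01-01', 'someFacility', 1000),
  ('1990-01-02', 'someFacility', NULL),
  ('1990-01-03', 'someFacility', NULL),
  ('1990-02-01', 'someFacility', 2000),  -- hypothetical second month
  ('1990-02-02', 'someFacility', NULL);
""")

# SUM ignores NULLs, so within each (object, month) partition every row
# receives the one value that was recorded on the 1st.
filled = conn.execute("""
SELECT read_date,
       object_id,
       SUM(value) OVER (PARTITION BY object_id,
                                     strftime('%Y-%m', read_date)) AS theValue
FROM daily_values
ORDER BY read_date
""").fetchall()

for row in filled:
    print(row)
```

This relies on there being exactly one non-NULL value per object and month; if several were present they would be added together, in which case MAX or MIN might be a safer aggregate.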

How to add a count/sum and group by in a CTE

Just a question on displaying a single row per flight together with a count of the crew members on that flight.
I want to change the output so that it displays a single record per flight with two additional columns. One column (CabinCrew) is the count of crew members with CREWTYPE = 'F' and the other column (CockpitCrew) is the count of crew members with CREWTYPE = 'C'.
So the query result should look like:
Flight DepartureDate DepartureAirport CREWBASE CockpitCrew CabinCrew
LS361 2016-05-19 BFS BFS 0 3
Can I have a little help tweaking the below query please:
WITH CTE AS (
SELECT cd.*, c.*, l.Carrier, l.FlightNumber, l.Suffix, l.ScheduledDepartureDate, l.ScheduledDepartureAirport
FROM
(SELECT *, ROW_NUMBER() OVER(PARTITION BY LegKey ORDER BY UpdateID DESC) AS RowNumber FROM Data.Crew) c
INNER JOIN
Data.CrewDetail cd
ON c.UpdateID = cd.CrewUpdateID
AND cd.IsPassive = 0
AND RowNumber = 1
INNER JOIN
Data.Leg l
ON c.LegKey = l.LegKey
)
SELECT
sac.Airline + CAST(sac.FlightNumber AS VARCHAR) + sac.Suffix AS Flight
, sac.DepartureDate
, sac.DepartureAirport
, sac.CREWBASE
, sac.CREWTYPE
, sac.EMPNO
, sac.FIRSTNAME
, sac.LASTNAME
, sac.SEX
FROM
Staging.SabreAssignedCrew sac
LEFT JOIN CTE cte
ON sac.Airline + CAST(sac.FlightNumber AS VARCHAR) + sac.Suffix = cte.Carrier + CAST(cte.FlightNumber AS VARCHAR) + cte.Suffix
AND sac.DepartureDate = cte.ScheduledDepartureDate
Please try this:
SELECT Flight,
DepartureDate,
DepartureAirport,
CREWBASE,
SUM(CASE WHEN CREWTYPE = 'F' THEN 1 ELSE 0 END) AS CabinCrew ,
SUM(CASE WHEN CREWTYPE = 'C' THEN 1 ELSE 0 END) AS CockpitCrew
FROM #Table
GROUP BY Flight, DepartureDate, DepartureAirport, CREWBASE
Please Try This:
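select sac.Airline + CAST(sac.FlightNumber AS VARCHAR) + sac.Suffix as Flight,
sac.DepartureDate, sac.DepartureAirport, sac.CREWBASE,
count(case when sac.CREWTYPE='C' then 1 end) as CockpitCrew,
count(case when sac.CREWTYPE='F' then 1 end) as CabinCrew
from Staging.SabreAssignedCrew sac
-- Flight is a computed value, so group by the same expression.
group by sac.Airline + CAST(sac.FlightNumber AS VARCHAR) + sac.Suffix,
sac.DepartureDate, sac.DepartureAirport, sac.CREWBASE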
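Both answers rely on the same conditional-count idea. Here is a minimal sketch of it on SQLite through Python's sqlite3 module, with a made-up crew table in which Flight is already a single column. The LS361 rows match the expected output in the question; the LS999 flight is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE crew (Flight TEXT, DepartureDate TEXT, DepartureAirport TEXT,
                   CREWBASE TEXT, CREWTYPE TEXT);
INSERT INTO crew VALUES
  ('LS361', '2016-05-19', 'BFS', 'BFS', 'F'),
  ('LS361', '2016-05-19', 'BFS', 'BFS', 'F'),
  ('LS361', '2016-05-19', 'BFS', 'BFS', 'F'),
  ('LS999', '2016-05-19', 'LBA', 'LBA', 'C'),
  ('LS999', '2016-05-19', 'LBA', 'LBA', 'C'),
  ('LS999', '2016-05-19', 'LBA', 'LBA', 'F');
""")

# One output row per flight; each SUM counts only the rows whose
# CREWTYPE matches, because non-matching rows contribute 0.
counts = conn.execute("""
SELECT Flight, DepartureDate, DepartureAirport, CREWBASE,
       SUM(CASE WHEN CREWTYPE = 'C' THEN 1 ELSE 0 END) AS CockpitCrew,
       SUM(CASE WHEN CREWTYPE = 'F' THEN 1 ELSE 0 END) AS CabinCrew
FROM crew
GROUP BY Flight, DepartureDate, DepartureAirport, CREWBASE
ORDER BY Flight
""").fetchall()

for row in counts:
    print(row)
```

The SUM(... ELSE 0 END) form and the COUNT(CASE ... THEN 1 END) form give the same numbers here; COUNT simply skips the NULLs that the CASE produces for non-matching rows.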