SQL subquery with average of 3 top values in PostgreSQL - sql

I need a little help with a query to count patients (ID) whose average of the last 3 diastolic blood pressure readings (TAD) is < 90.
I've tried several types of nested subqueries, each failing with a different error.
This is my latest version:
SELECT CENTRO, COUNT ( DISTINCT ID )
FROM
(
SELECT PAC.CENTRO, PAC.ID, T.TAD
FROM IDDPAC PAC,
(
SELECT AVG(TA.TAD) TAD
FROM
(
SELECT
TEXT_TO_NUMBER ( PAG.TEXTO ) TAD
FROM IDDPAG PAG, DATE D
WHERE TRIM ( PAG.DGP )='AH'
AND PAG.ID=T.ID
AND PAG.FECHA=D.OMI
AND D.TIME_DATE::DATE BETWEEN DATE '2012-01-01'
AND DATE '2012-12-31'
ORDER BY PAG.FECHA DESC LIMIT 3
) TA
) T
WHERE PAC.CENTRO='10040110' AND T.ID = PAC.ID
GROUP BY PAC.CENTRO , PAC.ID
)
A
WHERE T.TAD < 90
GROUP BY CENTRO
And I get the following error:
ERROR: falta una entrada para la tabla «t» en la cláusula FROM
LINE 31: AND PAG.ID=T.ID
^
********** Error **********
Translation:
ERROR: missing an entry for the table «t» in the clause FROM
LINE 31: AND PAG.ID=T.ID
^
********** Error **********

To get the average of the last three values, use row_number() to enumerate the values. Then choose the last three and take the average. This gives you the patient level information:
SELECT TA.CENTRO, TA.ID, AVG(TA.TAD) AS TAD
FROM (SELECT PAC.CENTRO, PAG.ID, TEXT_TO_NUMBER ( PAG.TEXTO ) as TAD,
ROW_NUMBER() OVER (PARTITION BY PAG.ID ORDER BY D.TIME_DATE DESC) as seqnum
FROM IDDPAG PAG JOIN
DATE D
ON PAG.FECHA = D.OMI JOIN
IDDPAC PAC
ON PAC.ID = PAG.ID
WHERE TRIM ( PAG.DGP )='AH' AND
D.TIME_DATE::DATE BETWEEN DATE '2012-01-01' AND DATE '2012-12-31'
) TA
WHERE TA.SEQNUM <= 3
GROUP BY TA.CENTRO, TA.ID
HAVING AVG(TA.TAD) < 90;
The count by centro would just be:
SELECT CENTRO, COUNT(*)
FROM (SELECT TA.CENTRO, TA.ID, AVG(TA.TAD) AS TAD
FROM (SELECT PAC.CENTRO, PAG.ID, TEXT_TO_NUMBER ( PAG.TEXTO ) as TAD,
ROW_NUMBER() OVER (PARTITION BY PAG.ID ORDER BY D.TIME_DATE DESC) as seqnum
FROM IDDPAG PAG JOIN
DATE D
ON PAG.FECHA = D.OMI JOIN
IDDPAC PAC
ON PAC.ID = PAG.ID
WHERE TRIM ( PAG.DGP )='AH' AND
D.TIME_DATE::DATE BETWEEN DATE '2012-01-01' AND DATE '2012-12-31'
) TA
WHERE TA.SEQNUM <= 3
GROUP BY TA.CENTRO, TA.ID
HAVING AVG(TA.TAD) < 90
) P
GROUP BY CENTRO;
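The row_number() pattern can be tried on a small in-memory table. This is only a sketch: it uses Python's sqlite3 module rather than PostgreSQL, and the readings table, its columns, and its rows are made up for illustration (the original IDDPAG / DATE / IDDPAC join is collapsed into a single table).

```python
import sqlite3

# Hypothetical, simplified schema: one row per blood-pressure reading.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE readings (centro TEXT, id INTEGER, fecha TEXT, tad REAL);
INSERT INTO readings VALUES
  ('10040110', 1, '2012-01-10', 100),
  ('10040110', 1, '2012-02-10', 85),
  ('10040110', 1, '2012-03-10', 80),
  ('10040110', 1, '2012-04-10', 90),
  ('10040110', 2, '2012-01-15', 70),
  ('10040110', 2, '2012-02-15', 95),
  ('10040110', 2, '2012-03-15', 95),
  ('10040110', 2, '2012-04-15', 95),
  ('10040110', 3, '2012-05-01', 80);
""")

query = """
SELECT centro, COUNT(*) AS patients
FROM (
    -- one row per patient whose last-3 average is below 90
    SELECT centro, id
    FROM (
        -- number each patient's readings, newest first
        SELECT centro, id, tad,
               ROW_NUMBER() OVER (PARTITION BY id ORDER BY fecha DESC) AS seqnum
        FROM readings
    ) AS ranked
    WHERE seqnum <= 3
    GROUP BY centro, id
    HAVING AVG(tad) < 90
) AS per_patient
GROUP BY centro
"""
rows = conn.execute(query).fetchall()
print(rows)
```

With this data, patient 1 averages 85 over the last three readings, patient 2 averages 95, and patient 3 has a single reading of 80, so two patients are counted for the centro.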

The problem is, exactly as the error indicates, that 'T' is not defined in the place it is requested. Your error is in the innermost subquery:
SELECT
TEXT_TO_NUMBER ( PAG.TEXTO ) TAD
FROM IDDPAG PAG, DATE D
WHERE TRIM ( PAG.DGP )='AH'
AND PAG.ID=T.ID
AND PAG.FECHA=D.OMI
AND D.TIME_DATE::DATE BETWEEN DATE '2012-01-01'
AND DATE '2012-12-31'
ORDER BY PAG.FECHA DESC LIMIT 3
T is the alias of the derived table that contains this subquery, so it is not in scope inside it and cannot be used in the PAG.ID=T.ID portion of your WHERE clause. Did you mean to join on a table called T? Or did you mean to use D.ID instead?

Related

SQL Server LEAD function

-- FIRST LOGIN DATE
WITH CTE_FIRST_LOGIN AS
(
SELECT
PLAYER_ID, EVENT_DATE,
ROW_NUMBER() OVER (PARTITION BY PLAYER_ID ORDER BY EVENT_DATE ASC) AS RN
FROM
ACTIVITY
),
-- CONSECUTIVE LOGINS
CTE_CONSEC_PLAYERS AS
(
SELECT
PLAYER_ID,
LEAD(EVENT_DATE,1) OVER (PARTITION BY EVENT_DATE ORDER BY EVENT_DATE) NEXT_DATE
FROM
ACTIVITY A
JOIN
CTE_FIRST_LOGIN C ON A.PLAYER_ID = C.PLAYER_ID
WHERE
NEXT_DATE = DATEADD(DAY, 1, A.EVENT_DATE) AND C.RN = 1
GROUP BY
A.PLAYER_ID
)
-- FRACTION
SELECT
NULLIF(ROUND(1.00 * COUNT(CTE_CONSEC.PLAYER_ID) / COUNT(DISTINCT PLAYER_ID), 2), 0) AS FRACTION
FROM
ACTIVITY
JOIN
CTE_CONSEC_PLAYERS CTE_CONSEC ON CTE_CONSEC.PLAYER_ID = ACTIVITY.PLAYER_ID
I am getting the following error when I run this query.
[42S22] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Invalid column name 'NEXT_DATE'. (207) (SQLExecDirectW)
This is LeetCode medium question 550, Game Play Analysis IV. Why can't it identify the column NEXT_DATE here, and what am I missing? Thanks!
The problem is in this CTE:
-- CONSECUTIVE LOGINS prep
CTE_CONSEC_PLAYERS AS (
SELECT
PLAYER_ID,
LEAD(EVENT_DATE,1) OVER (PARTITION BY EVENT_DATE ORDER BY EVENT_DATE) NEXT_DATE
FROM ACTIVITY A
JOIN CTE_FIRST_LOGIN C ON A.PLAYER_ID = C.PLAYER_ID
WHERE NEXT_DATE = DATEADD(DAY, 1, A.EVENT_DATE) AND C.RN = 1
GROUP BY A.PLAYER_ID
)
Note that you are creating NEXT_DATE as a column alias in this CTE but also referring to it in the WHERE clause of the same query. This is invalid because, by SQL's logical clause ordering, WHERE is evaluated before the SELECT list, so the NEXT_DATE alias does not exist yet. Within the same query an alias can only be referenced in the ORDER BY clause, which is the last clause evaluated. You don't have an ORDER BY clause in this subquery, so the NEXT_DATE alias is only visible to [sub]queries that come after and reference your CTE_CONSEC_PLAYERS CTE.
To fix this you'd probably want two CTEs like this (untested):
-- CONSECUTIVE LOGINS prep
CTE_CONSEC_PLAYERS_pre AS (
SELECT
A.PLAYER_ID,
C.RN,
A.EVENT_DATE,
LEAD(A.EVENT_DATE,1) OVER (PARTITION BY A.PLAYER_ID ORDER BY A.EVENT_DATE) NEXT_DATE
FROM ACTIVITY A
-- match each activity row to its own numbered row (assumes one row per player per date)
JOIN CTE_FIRST_LOGIN C ON A.PLAYER_ID = C.PLAYER_ID AND A.EVENT_DATE = C.EVENT_DATE
),
-- CONSECUTIVE LOGINS
CTE_CONSEC_PLAYERS AS (
SELECT
PLAYER_ID,
MAX(NEXT_DATE) AS NEXT_DATE
FROM CTE_CONSEC_PLAYERS_pre
WHERE NEXT_DATE = DATEADD(DAY, 1, EVENT_DATE) AND RN = 1
GROUP BY PLAYER_ID
)
You gave every table an alias (for example JOIN CTE_FIRST_LOGIN C has the alias C), and every column access is via the alias. You need to add the correct alias from the correct table to NEXT_DATE.
Your primary issue is that NEXT_DATE is the result of a window function. Window functions are evaluated after the WHERE clause, so their results cannot be referred to in the WHERE of the same query.
But it seems this query is over-complicated.
The problem to be solved appears to be: how many players logged in the day after they first logged in, as a percentage of all players.
This can be done in a single pass (no joins), by using multiple window functions together:
WITH CTE_FIRST_LOGIN AS (
SELECT
PLAYER_ID,
EVENT_DATE,
ROW_NUMBER() OVER (PARTITION BY PLAYER_ID ORDER BY EVENT_DATE) AS RN,
-- if EVENT_DATE is a datetime and can have multiple per day then group by CAST(EVENT_DATE AS date) first
LEAD(EVENT_DATE, 1) OVER (PARTITION BY PLAYER_ID ORDER BY EVENT_DATE) AS NextDate
FROM ACTIVITY
),
BY_PLAYERS AS (
SELECT
c.PLAYER_ID,
SUM(CASE WHEN c.RN = 1 AND c.NextDate = DATEADD(DAY, 1, c.EVENT_DATE)
THEN 1 END) AS IsConsecutive
FROM CTE_FIRST_LOGIN AS c
GROUP BY c.PLAYER_ID
)
SELECT ROUND(
1.00 *
COUNT(c.IsConsecutive) /
NULLIF(COUNT(*), 0)
,2) AS FRACTION
FROM BY_PLAYERS AS c;
You could theoretically merge BY_PLAYERS into the outer query and use COUNT(DISTINCT ...), but splitting them feels cleaner.
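The single-pass idea can be checked on a few rows. The sketch below runs on SQLite through Python's sqlite3 module, not SQL Server, so DATEADD(DAY, 1, ...) is replaced by SQLite's date(..., '+1 day'); the sample rows are made up, and it assumes at most one row per player per day.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE activity (player_id INTEGER, event_date TEXT);
INSERT INTO activity VALUES
  (1, '2016-03-01'), (1, '2016-03-02'),   -- came back the next day
  (2, '2017-06-25'),                      -- logged in only once
  (3, '2016-03-02'), (3, '2018-07-03');   -- came back much later
""")

query = """
WITH logins AS (
    SELECT player_id, event_date,
           ROW_NUMBER() OVER (PARTITION BY player_id ORDER BY event_date) AS rn,
           LEAD(event_date) OVER (PARTITION BY player_id ORDER BY event_date) AS next_date
    FROM activity
)
-- the window results are filtered in the outer query, where the aliases exist
SELECT ROUND(
         1.0 * SUM(CASE WHEN rn = 1 AND next_date = date(event_date, '+1 day')
                        THEN 1 ELSE 0 END)
             / COUNT(DISTINCT player_id), 2) AS fraction
FROM logins
"""
fraction = conn.execute(query).fetchone()[0]
print(fraction)
```

Only player 1 logs in on the day after the first login, so one of three players qualifies.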

Select only observations with a date more recent than the 30/6/2021 (dd/mm/yyyy)

I have the following code:
Select Tbl.Fromdate, Tbl.Por, Tbl.Porname, Tbl.Bmref3
From(
Select
To_Char(P.Fromdate, 'dd-mm-yyyy') As Fromdate, P.Por, P.Porname, W.Bmref3
, RANK() OVER (PARTITION BY P.Por ORDER BY P.fromdate DESC) AS rank
From Tmsdat.Climandatecomps W
Inner Join Tmsdat.Portfolios P On (W.Porik = P.Porik)
Where 1=1
) Tbl
Where 1=1
And Tbl.Rank = 1
;
However, I wish to select only the observations that have a Fromdate more recent than June 30, 2021. I tried adding Tbl.Fromdate > '30-06-2021' to the WHERE clause, but I did not get the desired results.
Do you have any suggestions?
Thank you in advance.
Best regards,
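You would put the condition in the inner query, comparing the original date column rather than the To_Char string (the outer Tbl.Fromdate is text in dd-mm-yyyy format, so > compares it character by character, not chronologically):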
Select To_Char(P.Fromdate, 'dd-mm-yyyy') As Fromdate, P.Por, P.Porname, W.Bmref3,
RANK() OVER (PARTITION BY P.Por ORDER BY P.fromdate DESC) AS rank
From Tmsdat.Climandatecomps W inner join
Tmsdat.Portfolios P
On (W.Porik = P.Porik)
Where p.FromDate > date '2021-06-30'
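To see why the outer comparison misbehaves and the inner one works, here is a small sketch using Python's sqlite3 module instead of Oracle. The portfolios table and its rows are hypothetical, dates are stored as ISO text (which SQLite compares chronologically), and strftime stands in for To_Char.

```python
import sqlite3

# Comparing dd-mm-yyyy strings is a character-by-character comparison:
# 1 July 2021 is later than 30 June 2021, but '0' sorts before '3'.
text_compare = '01-07-2021' > '30-06-2021'

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE portfolios (por TEXT, fromdate TEXT);
INSERT INTO portfolios VALUES
  ('A', '2021-05-31'), ('A', '2021-07-01'), ('A', '2021-12-15'),
  ('B', '2021-01-01'), ('B', '2021-06-30');
""")

rows = conn.execute("""
SELECT por, strftime('%d-%m-%Y', fromdate) AS fromdate
FROM (
    SELECT por, fromdate,
           RANK() OVER (PARTITION BY por ORDER BY fromdate DESC) AS rnk
    FROM portfolios
    WHERE fromdate > '2021-06-30'      -- filter on the real date, before ranking
) AS t
WHERE rnk = 1
ORDER BY por
""").fetchall()
print(text_compare, rows)
```

Portfolio B has no row after June 30, 2021, so it drops out entirely; formatting the date for display happens only in the final SELECT.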

Filter rows using two conditions on the same column

I have the next query:
select CHANNEL , my_date
from table_1 d
where source_data = 'test_5'
and my_date < to_date('27/09/2020','DD/MM/YYYY')
and customer_ID = :param_customer_ID
order by d.my_date asc;
That will show the next result:
What I need is only the row with the latest my_date for each channel. My result for this example must look like this:
Just the two rows.
I tried with:
select CHANNEL , my_date
from table_1 d
where source_data = 'test_5'
and (my_date < to_date('27/09/2020','DD/MM/YYYY') and my_date = max(my_date))
and customer_ID = :param_customer_ID
group by CHANNEL, my_date
order by d.my_date asc;
but it doesn't work and gives me this error:
ORA-00934: función de grupo no permitida aquí
00934. 00000 - "group function is not allowed here"
*Cause:
*Action:
Error at line: 138, column: 30
What should I do?
Regards
In Oracle, you can use aggregation:
select channel, max(my_date)
from table_1 d
where source_data = 'test_5' and
my_date < date '2020-09-27' and
customer_ID = :param_customer_ID
group by channel;
If there are more columns that you want, then use row_number():
select channel, my_date
from (select d.*,
row_number() over (partition by channel order by my_date desc) as seqnum
from table_1 d
where source_data = 'test_5' and
my_date < date '2020-09-27' and
customer_ID = :param_customer_ID
) d
where seqnum = 1;
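Both variants can be compared on a handful of rows. This is a sketch on SQLite via Python's sqlite3 module, not Oracle; the rows and the customer id 7 are invented, and dates are stored as ISO text so that string comparison matches date order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_1 (channel TEXT, my_date TEXT, source_data TEXT, customer_id INTEGER);
INSERT INTO table_1 VALUES
  ('web',   '2020-09-01', 'test_5', 7),
  ('web',   '2020-09-20', 'test_5', 7),
  ('web',   '2020-09-28', 'test_5', 7),
  ('phone', '2020-08-15', 'test_5', 7),
  ('phone', '2020-09-10', 'test_5', 7);
""")
params = {"customer_id": 7}

# Variant 1: plain aggregation, enough when only channel and date are needed.
by_max = conn.execute("""
SELECT channel, MAX(my_date)
FROM table_1
WHERE source_data = 'test_5' AND my_date < '2020-09-27'
  AND customer_id = :customer_id
GROUP BY channel
ORDER BY channel
""", params).fetchall()

# Variant 2: row_number(), which keeps the whole row available.
by_rownum = conn.execute("""
SELECT channel, my_date
FROM (
    SELECT d.*,
           ROW_NUMBER() OVER (PARTITION BY channel ORDER BY my_date DESC) AS seqnum
    FROM table_1 d
    WHERE source_data = 'test_5' AND my_date < '2020-09-27'
      AND customer_id = :customer_id
) AS ranked
WHERE seqnum = 1
ORDER BY channel
""", params).fetchall()
print(by_max, by_rownum)
```

The 2020-09-28 row falls after the cutoff, so both queries return 2020-09-10 for phone and 2020-09-20 for web.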

SQL Query to Group Columns

I am running a query from which I want to show only distinct customers.
Currently I am receiving multiple records for some items, for example 3 records for Item A0003. I want to return only the last record in the sequence.
My code:
select OJCUNO AS Item,OJPRRF as code,OJFVDT as From Date, OJLVDT as To Date
from M3FDBPRD.OPRICH
WHERE
OJCUNO in ( Select max(OJCUNO) FROM OPRICH group by OJCUNO )
Data Sample:
Item Code From Date To Date
A0007 AD 20030301 20161231
A0008 AF 20030301 20161231
A0009 AL 20030301 20121229
A0009 AL 20030301 20121231
Expected Result:
Item Code From Date To Date
A0007 AD 20030301 20161231
A0008 AF 20030301 20161231
A0009 AL 20030301 20121231
Just use row_number():
select OJCUNO AS Item, OJPRRF as code ,OJFVDT as FromDate, OJLVDT as ToDate
from (select o.*,
row_number() over (partition by ojcuno order by OJPRRF desc, OJLVDT desc) as seqnum
from M3FDBPRD.OPRICH o
) o
where seqnum = 1;
Your approach, using a correlated subquery, would work if you compared the right column and correlated the subquery on OJCUNO:
select o.OJCUNO AS Item, o.OJPRRF as code, o.OJFVDT as FromDate, o.OJLVDT as ToDate
from M3FDBPRD.OPRICH o
where o.OJLVDT = ( Select max(o2.OJLVDT) from M3FDBPRD.OPRICH o2 where o2.OJCUNO = o.OJCUNO );
I'm just having a stab here because I don't have access to a DB2 database to test on, but:
SELECT *
FROM
(
select OJCUNO AS Item,
OJPRRF as code,
OJFVDT as FromDate,
OJLVDT as ToDate,
ROW_NUMBER() OVER (PARTITION BY OJCUNO, OJPRRF ORDER BY OJFVDT DESC, OJLVDT DESC) AS RNum
from M3FDBPRD.OPRICH
WHERE
OJCUNO in ( Select max(OJCUNO) FROM OPRICH group by OJCUNO )
) a
WHERE a.RNum = 1
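The row_number() approach can be checked against the sample data from the question. The sketch below uses SQLite through Python's sqlite3 module rather than DB2, with a local oprich table standing in for M3FDBPRD.OPRICH, and orders only by OJLVDT, which is enough to separate the two A0009 rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE oprich (ojcuno TEXT, ojprrf TEXT, ojfvdt INTEGER, ojlvdt INTEGER);
INSERT INTO oprich VALUES
  ('A0007', 'AD', 20030301, 20161231),
  ('A0008', 'AF', 20030301, 20161231),
  ('A0009', 'AL', 20030301, 20121229),
  ('A0009', 'AL', 20030301, 20121231);
""")

rows = conn.execute("""
SELECT ojcuno AS Item, ojprrf AS code, ojfvdt AS FromDate, ojlvdt AS ToDate
FROM (
    -- number the rows of each item, latest To Date first
    SELECT o.*,
           ROW_NUMBER() OVER (PARTITION BY ojcuno ORDER BY ojlvdt DESC) AS seqnum
    FROM oprich o
) AS ranked
WHERE seqnum = 1
ORDER BY Item
""").fetchall()
for row in rows:
    print(row)
```

The output matches the expected result in the question: one row per item, with A0009 keeping the 20121231 record.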

Return first result only for each unique result

I am having some trouble with duplicate results in SQL Server 2005. I have previously used the ROW_NUMBER function to limit my query results, but I cannot get the query below to display only rownum 1:
SELECT *
FROM (SELECT l.insbilleddate, l.pickupdate, l.patientname, l.inscompanyname AS Payor, l.tripid,
l.sales, l.cost, l.sales-l.cost AS Profit, l.profitpct AS 'Profit Pct', u.pUPFName + ' ' +
u.pUPLName AS Dispatcher, ROW_NUMBER() OVER (PARTITION BY l.tripid ORDER BY d.trDispatchDate
ASC) AS rownum
FROM pUsersPrinters u
INNER JOIN TranslationDispatch d
INNER JOIN v_OLAPdataTR l ON d.trTripid = l.tripid ON u.pUP_id = d.trDispatchedBy
GROUP BY l.insbilleddate, l.pickupdate, l.patientname, l.inscompanyname, l.tripid, l.sales,
l.cost, l.profitpct, u.pUPFName, u.pUPLName, d.trDispatchDate
HAVING l.insbilleddate >= '6/1/2014' And l.insbilleddate < '7/1/2014' AND l.cost > '0' AND
l.profitpct < '30') q1
WHERE rownum = 1
ORDER BY q1.profitpct
The TranslationDispatch table adds a line each time a user dispatches a trip. If a trip needs to be reassigned the database does not overwrite the original dispatcher, instead it adds another line with the userID, tripID, and dispatch date. The d.trTripid = l.tripid comparison causes the trip to show for each dispatcher that marks it.
As an example, results show as:
TripID trDispatchedBy trDispatchDate
1234 Carlos 6/25/2014 10:00
1234 Tim 6/25/2014 10:02
...but I only want to display Carlos, as he dispatched the trip first.
EDIT: I've adjusted the query above with the help of @Vulcronos to make it work, by adding a table alias (q1) and changing rownum = '1' to rownum = 1 so that it correctly displays my final result.
I would try:
ROW_NUMBER () OVER ( PARTITION BY l.insbilleddate, l.pickupdate, l.patientname,
l.inscompanyname, l.tripid, l.sales, l.cost, l.profitpct, u.pUPFName, u.pUPLName,
d.trDispatchDate ORDER BY trDispatchDate ASC)
This should give you a row number of one on every group's earliest dispatch date. Then you can wrap your whole query in a derived table (SQL Server requires an alias for it):
select *
from (my_query) q1
where rownum = 1
How about adding "TOP 1" to the outside query
SELECT TOP 1 *
FROM(SELECT L.Insbilleddate,
L.Pickupdate,
L.Patientname,
L.Inscompanyname AS Payor,
L.Tripid,
L.Sales,
L.Cost,
L.Sales - L.Cost AS Profit,
L.Profitpct AS 'Profit Pct',
U.Pupfname + ' ' + U.Puplname AS Dispatcher,
ROW_NUMBER()OVER(PARTITION BY L.Tripid ORDER BY D.Trdispatchdate ASC)AS Rownum
FROM Pusersprinters U
INNER JOIN Translationdispatch D
INNER JOIN V_Olapdatatr L ON D.Trtripid = L.Tripid ON U.Pup_Id = D.Trdispatchedby
GROUP BY L.Insbilleddate,
L.Pickupdate,
L.Patientname,
L.Inscompanyname,
L.Tripid,
L.Sales,
L.Cost,
L.Profitpct,
U.Pupfname,
U.Puplname,
D.Trdispatchdate
HAVING L.Insbilleddate >= '6/1/2014'
AND L.Insbilleddate < '7/1/2014'
AND L.Cost > '0'
AND L.Profitpct < '30') A
ORDER BY A.Profitpct;
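The difference between filtering on rownum = 1 and taking a single top row shows up as soon as there is more than one trip. This sketch uses SQLite via Python's sqlite3 module, so LIMIT 1 stands in for SQL Server's TOP 1; the dispatch table is a simplified stand-in for TranslationDispatch, and trip 5678 is a made-up extra row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dispatch (trip_id INTEGER, dispatched_by TEXT, dispatch_date TEXT);
INSERT INTO dispatch VALUES
  (1234, 'Carlos', '2014-06-25 10:00'),
  (1234, 'Tim',    '2014-06-25 10:02'),
  (5678, 'Tim',    '2014-06-26 09:30');
""")

# Number the dispatch rows of each trip, earliest first.
ranked = """
SELECT trip_id, dispatched_by,
       ROW_NUMBER() OVER (PARTITION BY trip_id ORDER BY dispatch_date) AS rownum
FROM dispatch
"""

# rownum = 1 keeps the first dispatcher of every trip.
first_per_trip = conn.execute(
    f"SELECT trip_id, dispatched_by FROM ({ranked}) AS q1 "
    "WHERE rownum = 1 ORDER BY trip_id"
).fetchall()

# LIMIT 1 (TOP 1 in SQL Server) keeps one row of the whole result.
single_row = conn.execute(
    f"SELECT trip_id, dispatched_by FROM ({ranked}) AS q1 "
    "ORDER BY trip_id, rownum LIMIT 1"
).fetchall()
print(first_per_trip, single_row)
```

Here rownum = 1 returns Carlos for trip 1234 and Tim for trip 5678, while the single-row variant returns only the first trip, so it fits only when exactly one row is wanted overall.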