Get the last record from each group in Postgres? - sql

I have the following database structure and need to get only the most recent row per id: for example, the most recent row for id 1, the most recent row for id 2, and so on. At the moment I have this query:
SELECT max(fecha), id_transaccion, id_movimiento
FROM transaccion_movimiento
group by (id_transaccion, id_movimiento, fecha);
To later adapt to this:
LEFT JOIN (SELECT max(fecha), id_transaccion, id_movimiento FROM transaccion_movimiento group by (id_transaccion, id_movimiento, fecha) ORDER BY ID_TRANSACCION asc LIMIT 1)nn ON T.id_transaccion = nn.id_transaccion LEFT JOIN ctl_tipo_movimiento IM ON IM.id_tipo_movimiento = nn.id_movimiento " +
Please, I need help.

I suggest you consider using row_number() over() instead. See Window Functions
SELECT
    *
FROM x T  -- x stands in for your main table; T is the alias the join condition refers to
LEFT JOIN (
    SELECT
        fecha
      , id_transaccion
      , id_movimiento
      , ROW_NUMBER() OVER (PARTITION BY id_transaccion, id_movimiento
                           ORDER BY fecha DESC) AS ultima_fecha_row
    FROM transaccion_movimiento
) nn ON T.id_transaccion = nn.id_transaccion AND nn.ultima_fecha_row = 1
LEFT JOIN ctl_tipo_movimiento im ON im.id_tipo_movimiento = nn.id_movimiento
Whilst I recognized fecha to mean date, not much else made sense to me, so you may have to adjust this suggestion. For example, I'm not sure whether id_movimiento is needed in the partition. If it is not, use:
, ROW_NUMBER() OVER (PARTITION BY id_transaccion
ORDER BY fecha DESC) AS ultima_fecha_row
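To see the pattern end to end, here is a minimal sketch that runs the ROW_NUMBER() approach against an in-memory SQLite database from Python. The sample rows are invented for illustration, only the table and column names come from the question, and it assumes a SQLite build with window-function support (3.25 or newer); PostgreSQL accepts the same SQL.

```python
import sqlite3

# Hypothetical sample data; only the table and column names come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE transaccion_movimiento (
        id_transaccion INTEGER,
        id_movimiento  INTEGER,
        fecha          TEXT
    );
    INSERT INTO transaccion_movimiento VALUES
        (1, 10, '2020-01-01'),
        (1, 11, '2020-01-05'),
        (2, 10, '2020-01-02'),
        (2, 12, '2020-01-03'),
        (2, 11, '2020-01-01');
""")

# Number the rows of each id_transaccion from newest to oldest,
# then keep only the row numbered 1.
latest_rows = conn.execute("""
    SELECT id_transaccion, id_movimiento, fecha
    FROM (
        SELECT t.*,
               ROW_NUMBER() OVER (PARTITION BY id_transaccion
                                  ORDER BY fecha DESC) AS ultima_fecha_row
        FROM transaccion_movimiento t
    ) AS nn
    WHERE ultima_fecha_row = 1
    ORDER BY id_transaccion
""").fetchall()

print(latest_rows)  # [(1, 11, '2020-01-05'), (2, 12, '2020-01-03')]
```

Partitioning by id_transaccion alone gives one row per transaction. Adding id_movimiento to the partition would instead give one row per (transaction, movement) pair.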

Related

SQL Server-How to avoid repetition of a column in the output

Output of my SQL Server Query is as below:
Following is my query:
SELECT
si.SupplyInvoiceID,
si.CompanyID,
si.TotalBill,
siph.BillPaidAmount,
si.TotalBill - SUM(siph.BillPaidAmount)
over( partition by si.SupplyInvoiceID order by siph.SupplyPaymentID asc) as RemainingBillAmount
from
SupplyInvoicePaymentHistory siph
left join
SupplyInvoice si
on siph.SupplyInvoiceID = si.SupplyInvoiceID
In the TotalBill output column, I want the bill amount to be shown only once for each SupplyInvoiceID, i.e.
Required Output
Your problem requires an ordering for the table. It appears to be by SupplyPaymentId (although any column can be used). To do what you want, you can use row_number() and an explicit order by in the query:
select si.SupplyInvoiceID, si.CompanyID,
(CASE WHEN ROW_NUMBER() OVER (PARTITION BY si.SupplyInvoiceID order by siph.SupplyPaymentID) = 1
THEN si.TotalBill
END) as TotalBill,
siph.BillPaidAmount,
(si.TotalBill -
SUM(siph.BillPaidAmount) over (partition by si.SupplyInvoiceID order by siph.SupplyPaymentID asc)
) as RemainingBillAmount
from SupplyInvoicePaymentHistory siph left join
SupplyInvoice si
on siph.SupplyInvoiceID = si.SupplyInvoiceID
order by si.SupplyInvoiceID, siph.SupplyPaymentID
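As a rough illustration of how the CASE plus ROW_NUMBER() trick behaves, the sketch below runs an equivalent query in SQLite from Python. The invoice and payment rows are made up, and SQLite stands in for SQL Server here, so treat it as a demonstration of the idea rather than a drop-in script.

```python
import sqlite3

# Hypothetical invoices and payments; the column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SupplyInvoice (
        SupplyInvoiceID INTEGER, CompanyID INTEGER, TotalBill INTEGER);
    CREATE TABLE SupplyInvoicePaymentHistory (
        SupplyPaymentID INTEGER, SupplyInvoiceID INTEGER, BillPaidAmount INTEGER);
    INSERT INTO SupplyInvoice VALUES (1, 7, 1000), (2, 7, 500);
    INSERT INTO SupplyInvoicePaymentHistory VALUES
        (1, 1, 300), (2, 1, 200), (3, 2, 500), (4, 1, 100);
""")

rows = conn.execute("""
    SELECT si.SupplyInvoiceID,
           -- TotalBill only on the first payment row of each invoice, NULL afterwards
           CASE WHEN ROW_NUMBER() OVER (PARTITION BY si.SupplyInvoiceID
                                        ORDER BY siph.SupplyPaymentID) = 1
                THEN si.TotalBill
           END AS TotalBill,
           siph.BillPaidAmount,
           -- running total of payments subtracted from the bill
           si.TotalBill - SUM(siph.BillPaidAmount)
                              OVER (PARTITION BY si.SupplyInvoiceID
                                    ORDER BY siph.SupplyPaymentID) AS RemainingBillAmount
    FROM SupplyInvoicePaymentHistory siph
    LEFT JOIN SupplyInvoice si
           ON siph.SupplyInvoiceID = si.SupplyInvoiceID
    ORDER BY si.SupplyInvoiceID, siph.SupplyPaymentID
""").fetchall()

for row in rows:
    print(row)
```

With this sample data, invoice 1 shows 1000 on its first payment row and NULL (None in Python) on the next two, while the remaining amount still counts down on every row.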

SQL ZOO Window LAG #8

Question: For each country that has had at least 1000 new cases in a single day, show the date of the peak number of new cases.
Here are a few sample rows from the covid table.
What I wrote:
SELECT name,date,MAX(confirmed-lag) AS PeakNew
FROM(
SELECT name, DATE_FORMAT(whn,'%Y-%m-%d') date, confirmed,
LAG(confirmed, 1) OVER (PARTITION BY name ORDER BY whn) lag
FROM covid
ORDER BY confirmed
) temp
GROUP BY name
HAVING PeakNew>=1000
ORDER BY PeakNew DESC;
The result I got is weird, PeakNew seems correct, but the related date is not.
My answer
The right answer
Can anyone help me get the right answer? Thank you!
The query below works fine for me. The dates and values are correct, but the site may still mark the output as wrong because the row order differs. Here the order is by date, then by name.
SELECT z1.name, DATE_FORMAT(c.dt,'%Y-%m-%d'), z1.mx
FROM
(
SELECT z.name, MAX(z.nc) AS 'mx'
FROM (
SELECT DATE(whn) AS 'dt', name, confirmed - LAG(confirmed,1) OVER(PARTITION BY name ORDER BY DATE(whn) ASC) AS 'nc'
FROM covid ) z
WHERE z.nc >= 1000
GROUP BY z.name
) z1
INNER JOIN
(
SELECT DATE(whn) AS 'dt', name, confirmed - LAG(confirmed,1) OVER(PARTITION BY name ORDER BY DATE(whn) ASC) AS 'nc'
FROM covid
) c
ON c.nc = z1.mx
AND c.name = z1.name
ORDER BY 2 ASC
The date value in the outer query doesn't correspond to the row where MAX(confirmed-lag) is found; it's just an arbitrary date value from within that group. Check out the section titled "The ONLY_FULL_GROUP_BY Issue" in this blog post for more information: https://www.percona.com/blog/2019/05/13/solve-query-failures-regarding-only_full_group_by-sql-mode/
I used the ROW_NUMBER() function to get the entire row corresponding to the maximum new cases. However, my final result wasn't ordered the way the answer was, and there's no specification to how it should be ordered, so I still didn't get that satisfying happy emoji.
You need to self join to obtain the date on which the max count occurred:
WITH CTE1 as
(SELECT name,DATE_FORMAT(whn, "%Y-%m-%d") as date,
confirmed - LAG(confirmed, 1) OVER (PARTITION BY name ORDER BY DATE(whn)) as increase
FROM covid
ORDER BY whn),
CTE2 AS
(SELECT name, MAX(increase) as max_increase
FROM CTE1
WHERE increase >999
GROUP BY name)
SELECT c1.name,c1.date,c2.max_increase as peakNewCases
FROM CTE1 as c1
JOIN CTE2 as c2
ON c1.name=c2.name AND c1.increase=c2.max_increase
WITH CTE1 as
(SELECT name, DATE_FORMAT(whn,'%Y-%m-%d') as date_form, confirmed - LAG(confirmed,1) OVER(PARTITION BY name ORDER BY whn) AS newcases
FROM covid
ORDER BY name,whn)
SELECT name, date_form, newcases FROM
(
SELECT name, date_form, newcases, ROW_NUMBER() OVER (PARTITION BY name ORDER BY newcases DESC) as rn -- rn rather than rank, which is a reserved word in MySQL 8
FROM CTE1
WHERE newcases > 999
) cte2
WHERE rn = 1
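For anyone who wants to try the LAG() plus ROW_NUMBER() idea away from the SQLZoo site, here is a small sketch in Python using SQLite. The covid rows are invented, and the dates are stored as plain text, so no DATE_FORMAT call is needed; the CTE names daily and ranked are just illustrative.

```python
import sqlite3

# Hypothetical cumulative case counts; the real SQLZoo table has many more rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE covid (name TEXT, whn TEXT, confirmed INTEGER);
    INSERT INTO covid VALUES
        ('A', '2020-03-01',  100),
        ('A', '2020-03-02', 1500),
        ('A', '2020-03-03', 2000),
        ('A', '2020-03-04', 4000),
        ('B', '2020-03-01',   10),
        ('B', '2020-03-02',   50),
        ('B', '2020-03-03',  100);
""")

peaks = conn.execute("""
    WITH daily AS (
        -- new cases per day = today's cumulative total minus yesterday's
        SELECT name, whn,
               confirmed - LAG(confirmed, 1) OVER (PARTITION BY name
                                                   ORDER BY whn) AS newcases
        FROM covid
    ),
    ranked AS (
        -- rank each country's qualifying days from highest to lowest increase
        SELECT name, whn, newcases,
               ROW_NUMBER() OVER (PARTITION BY name
                                  ORDER BY newcases DESC) AS rn
        FROM daily
        WHERE newcases >= 1000
    )
    SELECT name, whn, newcases
    FROM ranked
    WHERE rn = 1
    ORDER BY name
""").fetchall()

print(peaks)  # [('A', '2020-03-04', 2000)]
```

Because the date travels with the row that ROW_NUMBER() picks, it is always the date of the peak, which is exactly what the GROUP BY version cannot guarantee. Country B never reaches 1000 new cases in a day, so it drops out.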

Select most recent status for each ID and department code

I have the following table:
I want to get the most recent status for each dept_code that a CL_ID has. So the desired output would be this:
I have tried the following, but it gives me just the most recent status for each client and not for each of their dept_codes.
SELECT *
FROM [CIMSHR6_MERGED].[dbo].[C3CLSTAT] C
INNER JOIN
(SELECT CLIENT_NUMBER, MAX(STATUS_DATE) AS SDATE
FROM [CIMSHR6_MERGED].[dbo].[C3CLSTAT]
GROUP BY CLIENT_NUMBER) X
ON X.CLIENT_NUMBER = C.CLIENT_NUMBER
AND X.SDATE = C.STATUS_DATE
ORDER BY C.CLIENT_NUMBER
Any help would be much appreciated. Thanks.
A convenient method that works in SQL Server is:
select top (1) with ties cl.*
from [CIMSHR6_MERGED].[dbo].[C3CLSTAT] cl
order by row_number() over (partition by cl_id, dept_code order by status_date desc);
A method that is efficient with the right indexes in almost any database is:
select cl.*
from [CIMSHR6_MERGED].[dbo].[C3CLSTAT] cl
where cl.status_date = (select max(cl2.status_date)
from [CIMSHR6_MERGED].[dbo].[C3CLSTAT] cl2
where cl2.cl_id = cl.cl_id and cl2.dept_code = cl.dept_code
);
The right index is on (cl_id, dept_code, status_date).
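A quick way to check the correlated-subquery version is to run it on a toy table. The sketch below does that in Python with SQLite; the rows and the index name ix_c3clstat_latest are invented for illustration, and the index simply mirrors the (cl_id, dept_code, status_date) suggestion above.

```python
import sqlite3

# Hypothetical status history; the column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE c3clstat (
        cl_id INTEGER, dept_code TEXT, status TEXT, status_date TEXT);
    -- index name is made up; the column order matches the answer's suggestion
    CREATE INDEX ix_c3clstat_latest ON c3clstat (cl_id, dept_code, status_date);
    INSERT INTO c3clstat VALUES
        (1, 'A', 'open',   '2020-01-01'),
        (1, 'A', 'closed', '2020-02-01'),
        (1, 'B', 'open',   '2020-01-15'),
        (2, 'A', 'open',   '2020-03-01'),
        (2, 'A', 'hold',   '2020-01-10');
""")

# Keep a row only if its date equals the newest date for its (cl_id, dept_code) pair.
latest = conn.execute("""
    SELECT cl.cl_id, cl.dept_code, cl.status, cl.status_date
    FROM c3clstat cl
    WHERE cl.status_date = (SELECT MAX(cl2.status_date)
                            FROM c3clstat cl2
                            WHERE cl2.cl_id = cl.cl_id
                              AND cl2.dept_code = cl.dept_code)
    ORDER BY cl.cl_id, cl.dept_code
""").fetchall()

for row in latest:
    print(row)
```

Note that if two rows in the same group share the newest status_date, this form returns both of them.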
I would also use ROW_NUMBER, but with a subquery:
SELECT CL_ID, Status_date, Status, Dept_code
FROM
(
SELECT *,
ROW_NUMBER() OVER (PARTITION BY CL_ID, Dept_code ORDER BY Status_date DESC) rn
FROM [CIMSHR6_MERGED].[dbo].[C3CLSTAT]
) t
WHERE rn = 1;
1) First, partition the rows by Dept_Code and CL_ID and assign a rank to each row within its partition, ordered by Status_Date descending.
2) Then select the rows with rnk = 1, which gives your desired result.
SELECT Z.CL_ID,
Z.Status_Date,
Z.Status,
Z.Dept_Code
FROM
(
SELECT *,
RANK() OVER( PARTITION BY Dept_Code, CL_ID ORDER BY Status_Date DESC ) AS rnk
FROM [CIMSHR6_MERGED].[dbo].[C3CLSTAT]
) Z
WHERE Z.rnk = 1;
This would work for almost all databases
select * from c3clstat c
where exists
(select 1 from c3clstat c1
where c1.cl_id=c.cl_id
and c1.dept_code=c.dept_code
group by cl_id,dept_code
having c.status_date=max(c1.status_date)
)

(SQL Server) using row count to sort the list but dont need list out the row count number

All the columns in the SELECT need to be listed in the output except the ROW_NUMBER() values. Is there any way to leave the row-number columns out?
SELECT *
FROM
(SELECT Station,
ROW_NUMBER() over (
ORDER BY totalseq ASC) AS rownumber1
FROM [SFCKM].[dbo].[T_DB_Subline]
WHERE Track_Point_No = '3d1')a
LEFT JOIN
(SELECT group_no,
trim_line,
MSC,
lot_no,
color,
AON,
format(Commit_time,'MM/dd/yy h:mm:ss tt')AS time,
datediff(DAY,Commit_Time,SYSDATETIME()) AS aging,
ROW_NUMBER() over (
ORDER BY commit_time DESC) AS rownumber
FROM [SFCKM].[dbo].[T_Work_Actual]
WHERE Track_Point_No = '3d1') c ON a.rownumber1 = c.rownumber
ORDER BY a.rownumber1
You could just select the values you are looking for e.g. Station and aging.
select a.Station, c.aging from
(select Station, ROW_NUMBER() over (order by totalseq asc) AS rownumber1
from [SFCKM].[dbo].[T_DB_Subline] where Track_Point_No = '3d1') a
left join
(select datediff(day, Commit_Time, SYSDATETIME()) AS aging, ROW_NUMBER() over (order by commit_time desc) AS rownumber
FROM [SFCKM].[dbo].[T_Work_Actual] where Track_Point_No = '3d1') c
on a.rownumber1 = c.rownumber
order by a.rownumber1
Don't use SELECT *: specify only the columns you need. SELECT * is not best practice, just lazy
There is no way to exclude a column, as per my answer to "SQL exclude a column using SELECT * [except columnA] FROM tableA?"
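To make the point concrete, here is a small sketch (Python with SQLite, invented tables subline and work_actual standing in for the real ones) where the row numbers are still used to pair and sort the rows but never appear in the output, simply because the outer SELECT lists only the columns it wants.

```python
import sqlite3

# Hypothetical stand-ins for T_DB_Subline and T_Work_Actual.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE subline (station TEXT, totalseq INTEGER);
    CREATE TABLE work_actual (lot_no TEXT, commit_time TEXT);
    INSERT INTO subline VALUES ('S1', 1), ('S2', 2), ('S3', 3);
    INSERT INTO work_actual VALUES ('L8', '2021-01-02'), ('L9', '2021-01-03');
""")

cursor = conn.execute("""
    SELECT a.station AS station, c.lot_no AS lot_no   -- row numbers are not listed here
    FROM (SELECT station,
                 ROW_NUMBER() OVER (ORDER BY totalseq ASC) AS rownumber1
          FROM subline) a
    LEFT JOIN (SELECT lot_no,
                      ROW_NUMBER() OVER (ORDER BY commit_time DESC) AS rownumber
               FROM work_actual) c
           ON a.rownumber1 = c.rownumber
    ORDER BY a.rownumber1
""")

columns = [col[0] for col in cursor.description]
rows = cursor.fetchall()

print(columns)  # ['station', 'lot_no']
print(rows)     # [('S1', 'L9'), ('S2', 'L8'), ('S3', None)]
```

The join condition and the ORDER BY can still refer to rownumber1 and rownumber even though neither is selected in the outer query.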

Want the result without using a subquery?

SELECT
ROW_NUMBER()over(partition by tblProductTemplateHdr.product_ID
order by tblProductTemplateHdr.product_ID, tblProcessSequence.sl_No) AS rno,
tblProductTemplateHdr.product_ID
,tblProductProcessHdr.process_ID
,tblProcessSequence.sl_No
FROM
Production.tblProcessSequence
INNER JOIN
Production.tblProductProcessHdr ON tblProductProcessHdr.product_Process_ID = tblProcessSequence.product_Process_ID AND tblProductProcessHdr.isQC_Need = 1
INNER JOIN
Production.tblProductTemplateHdr ON tblProductTemplateHdr.product_Temp_ID = tblProductProcessHdr.product_Temp_ID
I want the row with the maximum sl_No for each product_ID without using a subquery. This is the result obtained by running the query; I want to apply the filtering in the same query.
You need to a) rewrite your query just a little, and b) I'd recommend using table aliases to make your query more readable.
Try this:
;WITH ProductData AS
(
SELECT
ROW_NUMBER() OVER (PARTITION BY tph.product_ID
ORDER BY ps.sl_No DESC) AS rno,
tph.product_ID,
pph.process_ID,
ps.sl_No
FROM
Production.tblProcessSequence ps
INNER JOIN
Production.tblProductProcessHdr pph ON pph.product_Process_ID = ps.product_Process_ID
AND pph.isQC_Need = 1
INNER JOIN
Production.tblProductTemplateHdr tph ON tph.product_Temp_ID = pph.product_Temp_ID
)
SELECT *
FROM
ProductData
WHERE
rno = 1
The ROW_NUMBER() function partitions your data by Product_ID and within each partition, it orders the rows by sl_No DESC - so the highest value of sl_No gets the rno = 1 value (all others get higher numbers, in each partition)
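The numbering described above can be checked on a toy table. This sketch uses Python with SQLite and a single invented table, process_steps, in place of the three joined tables; it first prints the row numbers and then applies the rno = 1 filter through a CTE.

```python
import sqlite3

# Hypothetical flattened data: one row per (product, process step).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE process_steps (product_id INTEGER, process_id INTEGER, sl_no INTEGER);
    INSERT INTO process_steps VALUES
        (1, 100, 1), (1, 101, 2), (1, 102, 3),
        (2, 200, 1), (2, 201, 2);
""")

# Step 1: within each product, the highest sl_no gets rno = 1.
numbered = conn.execute("""
    SELECT product_id, sl_no,
           ROW_NUMBER() OVER (PARTITION BY product_id
                              ORDER BY sl_no DESC) AS rno
    FROM process_steps
    ORDER BY product_id, rno
""").fetchall()
print(numbered)  # [(1, 3, 1), (1, 2, 2), (1, 1, 3), (2, 2, 1), (2, 1, 2)]

# Step 2: wrap the numbering in a CTE and keep only rno = 1.
top = conn.execute("""
    WITH product_data AS (
        SELECT product_id, process_id, sl_no,
               ROW_NUMBER() OVER (PARTITION BY product_id
                                  ORDER BY sl_no DESC) AS rno
        FROM process_steps
    )
    SELECT product_id, process_id, sl_no
    FROM product_data
    WHERE rno = 1
    ORDER BY product_id
""").fetchall()
print(top)  # [(1, 102, 3), (2, 201, 2)]
```

The filter has to sit in an outer query or CTE because a window function cannot be referenced in the WHERE clause of the same SELECT that computes it.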
You can use another windowed function:
MAX(tblProcessSequence.sl_No) OVER(PARTITION BY tblProductTemplateHdr.product_ID)
ADDENDUM
Just to give the full query in context in case the above was not clear:
SELECT ROW_NUMBER() OVER (PARTITION BY tempHdr.Product_ID ORDER BY Seq.sl_No DESC) AS rno,
tempHdr.product_ID,
procHdr.process_ID,
Seq.sl_No,
MAX(Seq.sl_No) OVER(PARTITION BY tempHdr.product_ID) AS Max_SL_No
FROM Production.tblProcessSequence Seq
INNER JOIN Production.tblProductProcessHdr procHdr
ON Seq.product_Process_ID = procHdr.product_Process_ID
AND procHdr.isQC_Need = 1
INNER JOIN Production.tblProductTemplateHdr tempHdr
ON tempHdr.product_Temp_ID = procHdr.product_Temp_ID