postgresql - How to get one row the min value - sql

I have table (t_image) with this column
datacd | imagecode | indexdate
----------------------------------
A | 1 | 20170213
A | 2 | 20170213
A | 3 | 20170214
B | 4 | 20170201
B | 5 | 20170202
desired result is this
datacd | imagecode | indexdate
----------------------------------
A | 1 | 20170213
B | 4 | 20170201
In the above table, I want to retrieve 1 row for each datacd who has the minimum index date
Here is my query, but the result returns 2 rows for datacd A
select *
from (
select datacd, min(indexdate) as indexdate
from t_image
group by datacd
) as t1 inner join t_image as t2 on t2.datacd = t1.datacd and t2.indexdate = t1.indexdate;

The Postgres proprietary distinct on () operator is typically the fastest solution for greatest-n-per-group queries:
select distinct on (datacd) *
from t_image
order by datacd, indexdate;

One option uses ROW_NUMBER():
SELECT t.datacd,
t.imagecode,
t.indexdate
FROM
(
SELECT datacd, imagecode, indexdate,
ROW_NUMBER() OVER (PARTITION BY datacd ORDER BY indexdate) rn
FROM t_image
) t
WHERE t.rn = 1

Related

Linking tables on multiple criteria

I've got myself in a bit of a mess on something I'm doing where I'm trying to get two tables linked together based on multiple bits of info.
I want to link one table to another based on the basic rules of(in this hierarchy)
where main linking is where orderid matches between the two tables
records from table 2 where valid=Y,
from those i want the valid records which has the highest seqn1 number and then from those the one that has the highest seqn2 value
table1
orderid | date | otherinfo
223344 | 22/10/2020 | okokkokokooeodijjf
table2
orderid | seqn1 | seqn2 | valid | additonaldata
223344 | 1 | 3 | y | sdfsfsf
223344 | 2 | 1 | y | sffferfr
223344 | 2 | 2 | y | sfrfrefr -- This row
223344 | 2 | 3 | n | rfrg66rr
223344 | 2 | 4 | n | adwere
223344 | 3 | 4 | n | adwere
so would want the final record to be
orderid | date | otherinfo | seqn1 | seqn2 | valid | additonaldata
223344 | 22/10/2020 | okokkokokooeodijjf | 2 | 2 | y | sfrfrefr
I started off with the code below but I'm not sure I'm doing it right and I can't seem to get it to pay attention to the valid flag when i try to add it in.
SELECT * FROM table1
left JOIN table2
ON table1.orderid = table2.orderid
AND table2.seqn1 = (SELECT MAX(table2.seqn1) FROM table2 WHERE table1.orderid = table2.orderid)
AND table2.seqn2 = (SELECT MAX(table2.seqn2) FROM table2 WHERE table1.orderid = table2.orderid
AND table2.seqn1 = (SELECT MAX(table2.seqn1) FROM table2 WHERE table1.orderid = table2.orderid))
Could someone help me amend the code please.
Use row_number analytic function with partition by orderid and order by SEQNRs in the order you need. No need for multiple subselects. To add more selections for the single row, use CASE to map your values to numbers and order by them also.
Fiddle here.
with l as (
select *,
rank() over(partition by orderid order by seqn1 desc, seqn2 desc) as rn
from line
where valid = 'y'
)
select *
from header as h
join l
on h.orderid = l.orderid
and l.rn = 1
How about something like this:
;
with cte_table2 as
(
SELECT ordered
,MAX(seqn1) as seqn1
,MAX(seqn2) as seqn2
FROM table2
where valid = 'y'
group by ordered --check if you need to add 'valid' to the group by but I don't think so.
)
SELECT
t1.*
,t3.otherinfo
--,t3.[OtherFields]
from table1 t1
inner join cte_table2 t2 on t1.orderid = t2.orderid -- first match on id
left join table2 t3 on t3.orderid = t2.orderid and t3.seqn1 = t2.seqn1 and t3.seqn2 = t2.seqn2

Select row with max value saving distinct column

I have values
- id|type|Status|Comment
- 1 | P | 1 | AAA
- 2 | P | 2 | BBB
- 3 | P | 3 | CCC
- 4 | S | 1 | DDD
- 5 | S | 2 | EEE
I wan to get values for each type with max status and with comment from the row with max status:
- id|type|Status|Comment
- 3 | P | 3 | CCC
- 5 | S | 2 | EEE
All the existing questions on SO do not care about the right correspondence of Max type and value.
This gives you one row per type, which have max status
select * from (
select your_table.*, row_number() over(partition by type order by Status desc) as rn from your_table
) tt
where rn = 1
Corrected: The below will use a subquery to figure out each type and what the max status is, then it joins that onto the original table and uses the where clause to only select those rows where the status equals the max status. Of note, if you have multiple records with the same max status, you will get both of them to come up.
WITH T1 AS (SELECT type, MAX(STATUS) AS max_status FROM table_name GROUP BY type)
SELECT t2.id, t2.type, t2.status, t2.comment
FROM T1 LEFT JOIN table_name t2 ON t2.type= T1.type
WHERE t2.status = T1.max_status

SQL Server - Select Distinct of two columns, where the distinct column selected has a maximum value based on two other columns

I have 2 tables - TC and T, with columns specified below. TC maps to T on column T_ID.
TC
----
T_ID,
TC_ID
T
-----
T_ID,
V_ID,
Datetime,
Count
My current result set is:
V_ID TC_ID Datetime Count
----|-----|------------|--------|
2 | 1 | 2013-09-26 | 450600 |
2 | 1 | 2013-12-09 | 14700 |
2 | 1 | 2014-01-22 | 15000 |
2 | 1 | 2014-01-22 | 15000 |
2 | 1 | 2014-01-22 | 7500 |
4 | 1 | 2014-01-22 | 1000 |
4 | 1 | 2013-12-05 | 0 |
4 | 2 | 2013-12-05 | 0 |
Using the following query:
select T.V_ID,
TC.TC_ID,
T.Datetime,
T.Count
from T
inner join TC
on TC.T_ID = T.T_ID
Result set I want:
V_ID TC_ID Datetime Count
----|-----|------------|--------|
2 | 1 | 2014-01-22 | 15000 |
4 | 1 | 2014-01-22 | 1000 |
4 | 2 | 2013-12-05 | 0 |
I want to write a query to select each distinct V_ID + TC_ID combination, but only with the maximum datetime, and for that datetime the maximum count. E.g. for the distinct combination of V_ID = 2 and TC_ID = 1, '2014-01-22' is the maximum datetime, and for that datetime, 15000 is the maximum count, so select this record for the new table. Any ideas? I don't know if this is too ambitious for a query and I should just handle the result set in code instead.
One method uses row_number():
select v_id, tc_id, datetime, count
from (select T.V_ID, TC.TC_ID, T.Datetime, T.Count,
row_number() over (partition by t.V_ID, tc.tc_id
order by datetime desc, count desc
) as seqnum
from t join
tc
on tc.t_id = t._id
) tt
where seqnum = 1;
The only issue is that some rows have the same maximum datetime value. SQL tables represent unordered sets, so there is no way to determine which is really the maximum -- unless the datetime really has a time component or another column specifies the ordering within a day.
It is possible to solve this using CTEs. First, extracting the data from your query. Second, get the maxdates. Third, get the highest count for each maxdate.:
;WITH Dataset AS
(
select T.V_ID,
TC.TC_ID,
T.[Datetime],
T.[Count]
from T
inner join TC
on TC.T_ID = T._ID
),
MaxDates AS
(
SELECT V_ID, TC_ID, MAX(t.[Datetime]) AS MaxDate
FROM Dataset t
GROUP BY t.V_ID, t.TC_ID
)
SELECT t.V_ID, t.TC_ID, t.[Datetime], MAX(t.[Count]) AS [Count]
FROM Dataset t
INNER JOIN MaxDates m ON t.V_ID = m.V_ID AND t.TC_ID = m.TC_ID AND m.MaxDate = t.[Datetime]
GROUP BY t.V_ID, t.TC_ID, t.[Datetime]
Just to keep it simple:
You need to group by T.V_ID,TC.TC_ID,
with selecting the max of date and then to get the maximum count, you must use a sub query as follows,
select T.V_ID,
TC.TC_ID,
max(T.Datetime) as Date_Time,
(select max(Count) from T as tb where v_ID = T.v_ID and DateTime = max(T.DateTime)) as Count
from T
inner join TC
on TC.T_ID = T._ID
group by T.V_ID,TC.TC_ID,

how to query range?

Raw Data
| ID | STATUS |
| 1 | A |
| 2 | A |
| 3 | B |
| 4 | B |
| 5 | B |
| 6 | A |
| 7 | A |
| 8 | A |
| 9 | C |
Result
| START | END |
| 1 | 2 |
| 6 | 8 |
Range of STATUS A
How to query ?
This should give you the correct ranges:
SELECT
STATUS,
MIN(ID),
max_id
FROM (
SELECT
t1.STATUS,
t1.ID,
COALESCE(MAX(t2.ID), t1.ID) max_id
FROM
yourtable t1 LEFT JOIN yourtable t2
ON t1.STATUS=t2.STATUS AND t1.ID<t2.ID
WHERE
NOT EXISTS (SELECT NULL
FROM yourtable t3
WHERE
t3.STATUS!=t1.STATUS
AND t3.ID>t1.ID AND t3.ID<t2.ID)
GROUP BY
t1.ID,
t1.STATUS
) s
WHERE
status = 'A'
GROUP BY
STATUS,
max_id
Please see fiddle here.
You are probably better off with a cursor-based solution or a client-side function.
However, if you were using Oracle - the following would work.
WITH LOWER_VALS AS
( -- All the Ids with no immediate predecessor
SELECT ROWNUM AS RN, STATUS, ID AS LOWER FROM
(
SELECT STATUS, ID
FROM RAWDATA RD1
WHERE RD1.ID -1 NOT IN
(SELECT ID FROM RAWDATA PRED_TABLE WHERE PRED_TABLE.STATUS = RD1.STATUS)
ORDER BY STATUS, ID
)
) ,
UPPER_VALS AS
( -- All the Ids with no immediate successor
SELECT ROWNUM AS RN, STATUS, ID AS UPPER FROM
(
SELECT STATUS, ID
FROM RAWDATA RD2
WHERE RD2.ID +1 NOT IN
(SELECT ID FROM RAWDATA SUCC_TABLE WHERE SUCC_TABLE.STATUS = RD2.STATUS)
ORDER BY STATUS, ID
)
)
SELECT
L.STATUS, L.LOWER, U.UPPER
FROM
LOWER_VALS L
JOIN UPPER_VALS U ON
U.RN = L.RN;
Results in the set
A 1 2
A 6 8
B 3 5
C 9 9
http://sqlfiddle.com/#!4/10184/2
There is not a lot to go on from what you put, but I think this might work. I am using T-SQL because I don't know what you are using?
SELECT
min(ID)
, max(ID)
FROM RawData
WHERE [Status] = 'A'

Grouping SQL Results based on order

I have table with data something like this:
ID | RowNumber | Data
------------------------------
1 | 1 | Data
2 | 2 | Data
3 | 3 | Data
4 | 1 | Data
5 | 2 | Data
6 | 1 | Data
7 | 2 | Data
8 | 3 | Data
9 | 4 | Data
I want to group each set of RowNumbers So that my result is something like this:
ID | RowNumber | Group | Data
--------------------------------------
1 | 1 | a | Data
2 | 2 | a | Data
3 | 3 | a | Data
4 | 1 | b | Data
5 | 2 | b | Data
6 | 1 | c | Data
7 | 2 | c | Data
8 | 3 | c | Data
9 | 4 | c | Data
The only way I know where each group starts and stops is when the RowNumber starts over. How can I accomplish this? It also needs to be fairly efficient since the table I need to do this on has 52 Million Rows.
Additional Info
ID is truly sequential, but RowNumber may not be. I think RowNumber will always begin with 1 but for example the RowNumbers for group1 could be "1,1,2,2,3,4" and for group2 they could be "1,2,4,6", etc.
For the clarified requirements in the comments
The rownumbers for group1 could be "1,1,2,2,3,4" and for group2 they
could be "1,2,4,6" ... a higher number followed by a lower would be a
new group.
A SQL Server 2012 solution could be as follows.
Use LAG to access the previous row and set a flag to 1 if that row is the start of a new group or 0 otherwise.
Calculate a running sum of these flags to use as the grouping value.
Code
WITH T1 AS
(
SELECT *,
LAG(RowNumber) OVER (ORDER BY ID) AS PrevRowNumber
FROM YourTable
), T2 AS
(
SELECT *,
IIF(PrevRowNumber IS NULL OR PrevRowNumber > RowNumber, 1, 0) AS NewGroup
FROM T1
)
SELECT ID,
RowNumber,
Data,
SUM(NewGroup) OVER (ORDER BY ID
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Grp
FROM T2
SQL Fiddle
Assuming ID is the clustered index the plan for this has one scan against YourTable and avoids any sort operations.
If the ids are truly sequential, you can do:
select t.*,
(id - rowNumber) as grp
from t
Also you can use recursive CTE
;WITH cte AS
(
SELECT ID, RowNumber, Data, 1 AS [Group]
FROM dbo.test1
WHERE ID = 1
UNION ALL
SELECT t.ID, t.RowNumber, t.Data,
CASE WHEN t.RowNumber != 1 THEN c.[Group] ELSE c.[Group] + 1 END
FROM dbo.test1 t JOIN cte c ON t.ID = c.ID + 1
)
SELECT *
FROM cte
Demo on SQLFiddle
How about:
select ID, RowNumber, Data, dense_rank() over (order by grp) as Grp
from (
select *, (select min(ID) from [Your Table] where ID > t.ID and RowNumber = 1) as grp
from [Your Table] t
) t
order by ID
This should work on SQL 2005. You could also use rank() instead if you don't care about consecutive numbers.