SELECT only latest record of an ID from given rows - sql

I have the table shown below. How do I select only the latest data for each id based on changeno?
+----+-------+----------+
| id | data  | changeno |
+----+-------+----------+
| 1  | Yes   | 1        |
| 2  | Yes   | 2        |
| 2  | Maybe | 3        |
| 3  | Yes   | 4        |
| 3  | Yes   | 5        |
| 3  | No    | 6        |
| 4  | No    | 7        |
| 5  | Maybe | 8        |
| 5  | Yes   | 9        |
+----+-------+----------+
I would want this result...
+----+-------+----------+
| id | data  | changeno |
+----+-------+----------+
| 1  | Yes   | 1        |
| 2  | Maybe | 3        |
| 3  | No    | 6        |
| 4  | No    | 7        |
| 5  | Yes   | 9        |
+----+-------+----------+
I currently have this SQL statement...
SELECT id, data, MAX(changeno) as changeno FROM Table1 GROUP BY id;
and clearly it doesn't return what I want. It returns an error because data is neither aggregated nor listed in the GROUP BY clause. If I add data to the GROUP BY clause it runs, but it still doesn't return what I want. This SQL statement is by far the closest I could think of. I'd appreciate it if anybody could help me with this. Thank you in advance :)

This is typically referred to as the "greatest-n-per-group" problem. One way to solve this in SQL Server 2005 and higher is to use a CTE with a calculated ROW_NUMBER() based on the grouping of the id column, and sorting those by largest changeno first:
;WITH cte AS
(
SELECT id, data, changeno,
rn = ROW_NUMBER() OVER (PARTITION BY id ORDER BY changeno DESC)
FROM dbo.Table1
)
SELECT id, data, changeno
FROM cte
WHERE rn = 1
ORDER BY id;
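As a rough way to try the CTE above outside SQL Server, here is a minimal sketch using Python's built-in sqlite3 module with the question's sample rows. It assumes an SQLite build with window-function support (3.25 or later); the T-SQL-only pieces (the dbo. schema prefix and the rn = ... alias form) are swapped for portable equivalents, so treat it as an illustration rather than the exact statement you would run on SQL Server.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (id INTEGER, data TEXT, changeno INTEGER)")
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?, ?)",
    [(1, "Yes", 1), (2, "Yes", 2), (2, "Maybe", 3),
     (3, "Yes", 4), (3, "Yes", 5), (3, "No", 6),
     (4, "No", 7), (5, "Maybe", 8), (5, "Yes", 9)],
)

# Number the rows of each id from the highest changeno down, then keep row 1.
latest = conn.execute(
    """
    WITH cte AS (
        SELECT id, data, changeno,
               ROW_NUMBER() OVER (PARTITION BY id ORDER BY changeno DESC) AS rn
        FROM Table1
    )
    SELECT id, data, changeno
    FROM cte
    WHERE rn = 1
    ORDER BY id
    """
).fetchall()

for row in latest:
    print(row)
```

With the sample data this prints one row per id, matching the result table in the question.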

You want to use row_number() for this:
select id, data, changeno
from (SELECT t.*,
row_number() over (partition by id order by changeno desc) as seqnum
FROM Table1 t
) t
where seqnum = 1;

Not a well formed or performance optimized query but for small tasks it works fine.
SELECT * FROM TEST
WHERE changeno IN (SELECT MAX(changeno)
FROM TEST
GROUP BY id)
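One caveat worth illustrating: the IN subquery above is not tied to the id of the outer row, so it only works when changeno values are unique across the whole table (as they are in the question's data). The sketch below uses hypothetical rows where changeno restarts per id, and compares the uncorrelated IN form with a correlated subquery. It runs on SQLite through Python's sqlite3 module purely for convenience.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TEST (id INTEGER, data TEXT, changeno INTEGER)")
# Hypothetical data: changeno is NOT unique across ids here.
conn.executemany(
    "INSERT INTO TEST VALUES (?, ?, ?)",
    [(1, "a", 1), (1, "b", 2), (2, "c", 1)],
)

# Uncorrelated form: any row whose changeno equals SOME id's maximum matches.
uncorrelated = conn.execute(
    """
    SELECT id, data, changeno
    FROM TEST
    WHERE changeno IN (SELECT MAX(changeno) FROM TEST GROUP BY id)
    ORDER BY id, changeno
    """
).fetchall()

# Correlated form: each row is compared with the maximum of its own id.
correlated = conn.execute(
    """
    SELECT t.id, t.data, t.changeno
    FROM TEST t
    WHERE t.changeno = (SELECT MAX(t2.changeno) FROM TEST t2 WHERE t2.id = t.id)
    ORDER BY t.id
    """
).fetchall()

print(uncorrelated)  # includes (1, 'a', 1), which is not the latest row for id 1
print(correlated)    # only the latest row per id
```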

For other alternatives:
DECLARE @Table1 TABLE
(
id INT, data VARCHAR(5), changeno INT
);
INSERT INTO @Table1
SELECT 1,'Yes',1
UNION ALL
SELECT 2,'Yes',2
UNION ALL
SELECT 2,'Maybe',3
UNION ALL
SELECT 3,'Yes',4
UNION ALL
SELECT 3,'Yes',5
UNION ALL
SELECT 3,'No',6
UNION ALL
SELECT 4,'No',7
UNION ALL
SELECT 5,'Maybe',8
UNION ALL
SELECT 5,'Yes',9;
-- Join each row to the highest changeno of its id
SELECT Y.id, Y.data, Y.changeno
FROM @Table1 Y
INNER JOIN (
SELECT id, changeno = MAX(changeno)
FROM @Table1
GROUP BY id
) X ON X.id = Y.id
WHERE X.changeno = Y.changeno
ORDER BY Y.id;

Related

Get some values from the table by selecting

I have a table:
| id | Number |Address
| -----| ------------|-----------
| 1 | 0 | NULL
| 1 | 1 | NULL
| 1 | 2 | 50
| 1 | 3 | NULL
| 2 | 0 | 10
| 3 | 1 | 30
| 3 | 2 | 20
| 3 | 3 | 20
| 4 | 0 | 75
| 4 | 1 | 22
| 4 | 2 | 30
| 5 | 0 | NULL
I need to get: the NUMBER of the last ADDRESS change for each ID.
I wrote this select:
select dh.id, dh.number from table dh where dh =
(select max(min(t.history)) from table t where t.id = dh.id group by t.address)
But this select does not correctly handle the case where the address changes and later changes back to a previous value. For example, for id=1 the group by returns:
| Number |
| -------- |
| NULL |
| 50 |
I have been thinking about this select for several days, and I will be happy to receive any help.
You can do this using row_number() -- twice:
select t.id, min(number)
from (select t.*,
row_number() over (partition by id order by number desc) as seqnum1,
row_number() over (partition by id, address order by number desc) as seqnum2
from t
) t
where seqnum1 = seqnum2
group by id;
What this does is enumerate the rows by number in descending order:
Once per id.
Once per id and address.
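These values are the same only for the rows at the end of each id's history that share the most recent address (the latest of them has the value 1 in both sequences). The aggregation then pulls back the earliest row in this group, which is the row where the address last changed.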
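To see the two sequences side by side, here is a small sketch that runs the query against the question's sample rows in SQLite via Python's sqlite3 module. The table name t and the lower-case column names are assumptions for the demo, and a window-function-capable SQLite (3.25 or later) is assumed. SQLite puts NULL addresses into one partition, which is the behaviour this approach relies on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, number INTEGER, address INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, 0, None), (1, 1, None), (1, 2, 50), (1, 3, None),
     (2, 0, 10),
     (3, 1, 30), (3, 2, 20), (3, 3, 20),
     (4, 0, 75), (4, 1, 22), (4, 2, 30),
     (5, 0, None)],
)

numbered = """
    SELECT t.*,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY number DESC) AS seqnum1,
           ROW_NUMBER() OVER (PARTITION BY id, address ORDER BY number DESC) AS seqnum2
    FROM t
"""

# The two sequences for id = 1, newest row first.
id1 = conn.execute(
    f"SELECT number, address, seqnum1, seqnum2 FROM ({numbered}) AS x "
    "WHERE id = 1 ORDER BY number DESC"
).fetchall()

# Rows where both sequences agree form the trailing run of the latest address;
# MIN(number) is the row where that address started.
last_change = conn.execute(
    f"SELECT id, MIN(number) FROM ({numbered}) AS x "
    "WHERE seqnum1 = seqnum2 GROUP BY id ORDER BY id"
).fetchall()

print(id1)
print(last_change)
```

For id=1 only the row with number 3 has matching sequence values, so the NULL that follows the 50 is correctly reported as the last change.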
I answered my question myself, if anyone needs it, my solution:
select * from table dh1 where dh1.number = (
select max(x.number)
from (
select
dh2.id, dh2.number, dh2.address, lag(dh2.address) over(order by dh2.number asc) as prev
from table dh2 where dh1.id=dh2.id
) x
where NVL(x.address, 0) <> NVL(x.prev, 0)
);

Select Last Record Based on few criteria

Before
+--------+--------+---------+-------+------+
| RowNum | Status | Remarks | SetNo | |
+--------+--------+---------+-------+------+
| 1 | Q | | Set 1 | Want |
| 2 | Q | | Set 1 | Want |
| 3 | Q | | Set 1 | Want |
| 4 | Q | | Set 1 | Want |
| 5 | W | | Set 1 | Want |
| 1 | W | abc | Set 2 | |
| 2 | W | abc | Set 2 | |
| 3 | W | abc | Set 2 | |
| 4 | W | abc | Set 2 | Want |
| 1 | Q | | Set 3 | Want |
| 2 | w | abc | Set 3 | |
| 3 | w | abc | Set 3 | Want |
+--------+--------+---------+-------+------+
How do I select the rows with Status=Q and Status=W based on RowNum being the last number in each SetNo? The expected result is the rows marked "Want"; the rows where the last column is empty should be removed.
Tried:
select *
from mytable
where (RowNum != (select max(RowNum) from mytable) and status = 'W')
I understand that for each setno, you want all "Q"s and the latest "W". If so, you can use window functions like this:
select *
from (
select t.*,
row_number() over(partition by setno, status order by rownum desc) rn
from mytable t
) t
where rn = 1 or status = 'Q'
You might want to look into Window Functions. I don't fully understand what you need to do but I would suggest something like:
with rowNumberedData as
(
SELECT *,
ROW_NUMBER() OVER (PARTITION BY SetNo ORDER BY RowNum DESC) as RowOrder
FROM mytable
)
SELECT *
FROM rowNumberedData
WHERE (Status = 'Q' OR Status = 'W') AND RowOrder = 1
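What this will do is add a RowOrder column to your data whose value is 1 for the max RowNum in every set. Note that the filter keeps only that last row of each set, so earlier Q rows are not returned. If you are unfamiliar with window functions or the WITH syntax, both are covered in your database's documentation.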
This query should return the correct rows
with t_cte as (
select t.*,
row_number() over(partition by setno order by rownum desc) rn
from testTable t)
select *
from t_cte
where [status] = 'Q'
or ([status] = 'W'
and rn = 1);
I understand the question as wanting the last row for each set plus all rows with status Q. One method uses row_number():
select t.*
from (select t.*,
row_number() over (partition by setno order by rownum desc) as seqnum
from mytable t
) t
where seqnum = 1 or status = 'Q';
There are other ways to express this:
select t.*
from mytable t
where t.status = 'Q' or
t.rownum = (select max(t2.rownum)
from mytable t2
where t2.setno = t.setno
);
This is similar to the approach you are trying.
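As a quick check that the two formulations above agree, here is a sketch that loads the question's rows into SQLite through Python's sqlite3 module and runs both. The column types and the empty-string remarks are assumptions, and SQLite compares strings case-sensitively, so the lower-case 'w' rows are stored exactly as they appear in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (rownum INTEGER, status TEXT, remarks TEXT, setno TEXT)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?)",
    [(1, "Q", "", "Set 1"), (2, "Q", "", "Set 1"), (3, "Q", "", "Set 1"),
     (4, "Q", "", "Set 1"), (5, "W", "", "Set 1"),
     (1, "W", "abc", "Set 2"), (2, "W", "abc", "Set 2"),
     (3, "W", "abc", "Set 2"), (4, "W", "abc", "Set 2"),
     (1, "Q", "", "Set 3"), (2, "w", "abc", "Set 3"), (3, "w", "abc", "Set 3")],
)

# Window-function form: last row of each set, plus every Q row.
windowed = conn.execute(
    """
    SELECT setno, rownum, status
    FROM (SELECT t.*,
                 ROW_NUMBER() OVER (PARTITION BY setno ORDER BY rownum DESC) AS seqnum
          FROM mytable t) AS t
    WHERE seqnum = 1 OR status = 'Q'
    ORDER BY setno, rownum
    """
).fetchall()

# Correlated-subquery form of the same condition.
correlated = conn.execute(
    """
    SELECT t.setno, t.rownum, t.status
    FROM mytable t
    WHERE t.status = 'Q'
       OR t.rownum = (SELECT MAX(t2.rownum) FROM mytable t2 WHERE t2.setno = t.setno)
    ORDER BY t.setno, t.rownum
    """
).fetchall()

for row in windowed:
    print(row)
```

Both queries return the eight rows marked "Want" in the question's table.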

SQL SERVER How to select the latest record in each group? [duplicate]

| ID | TimeStamp | Item |
|----|-----------|------|
| 1 | 0:00:20 | 0 |
| 1 | 0:00:40 | 1 |
| 1 | 0:01:00 | 1 |
| 2 | 0:01:20 | 1 |
| 2 | 0:01:40 | 0 |
| 2 | 0:02:00 | 1 |
| 3 | 0:02:20 | 1 |
| 3 | 0:02:40 | 1 |
| 3 | 0:03:00 | 0 |
I have this and I would like to turn it into
| ID | TimeStamp | Item |
|----|-----------|------|
| 1 | 0:01:00 | 1 |
| 2 | 0:02:00 | 1 |
| 3 | 0:03:00 | 0 |
Please advise, thank you!
A correlated subquery is often the fastest method:
select t.*
from t
where t.timestamp = (select max(t2.timestamp)
from t t2
where t2.id = t.id
);
For this, you want an index on (id, timestamp).
You can also use row_number():
select t.*
from (select t.*,
row_number() over (partition by id order by timestamp desc) as seqnum
from t
) t
where seqnum = 1;
This is typically a wee bit slower because it needs to assign the row number to every row, even those not being returned.
You need to partition by id and rank the rows by timestamp in descending order with an analytic function in a subquery, then keep the rows that rank first (value 1):
SELECT *
FROM
(
SELECT *,
DENSE_RANK() OVER (PARTITION BY ID ORDER BY TimeStamp DESC) AS dr
FROM t
) t
WHERE t.dr = 1
The DENSE_RANK() analytic function is used so that rows tied for the latest timestamp are all included.
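The difference between ROW_NUMBER() and DENSE_RANK() only shows up when two rows of the same ID share the latest timestamp. The sketch below adds one hypothetical tied row (not in the question's data) and runs both variants in SQLite via Python's sqlite3 module. The column is named ts here as an assumption for the demo, and the timestamps are stored as text in a fixed-width format so that string ordering matches time ordering.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, ts TEXT, item INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, "0:00:20", 0), (1, "0:00:40", 1), (1, "0:01:00", 1),
     (1, "0:01:00", 0),  # hypothetical tie on the latest timestamp of id 1
     (2, "0:01:20", 1), (2, "0:02:00", 1)],
)

def latest_rows(ranking_function):
    """Return the rows ranked first per id using the given window function."""
    return conn.execute(
        f"""
        SELECT id, ts, item
        FROM (SELECT t.*,
                     {ranking_function}() OVER (PARTITION BY id ORDER BY ts DESC) AS rnk
              FROM t) AS ranked
        WHERE rnk = 1
        ORDER BY id, item
        """
    ).fetchall()

with_row_number = latest_rows("ROW_NUMBER")  # exactly one row per id, tie broken arbitrarily
with_dense_rank = latest_rows("DENSE_RANK")  # every row tied for the latest timestamp

print(len(with_row_number), len(with_dense_rank))
```

Which variant is right depends on whether you want exactly one row per group or all rows that share the latest timestamp.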

How to delete the rows with three same data columns and one different data column

I have a table "MARK_TABLE" as below.
How can I delete the rows with same "STUDENT", "COURSE" and "SCORE" values?
| ID | STUDENT | COURSE | SCORE |
|----|---------|--------|-------|
| 1 | 1 | 1 | 60 |
| 3 | 1 | 2 | 81 |
| 4 | 1 | 3 | 81 |
| 9 | 2 | 1 | 80 |
| 10 | 1 | 1 | 60 |
| 11 | 2 | 1 | 80 |
Now I already filtered the data I want to KEEP, but without the "ID"...
SELECT student, course, score FROM mark_table
INTERSECT
SELECT student, course, score FROM mark_table
The output:
| STUDENT | COURSE | SCORE |
|---------|--------|-------|
| 1 | 1 | 60 |
| 1 | 2 | 81 |
| 1 | 3 | 81 |
| 2 | 1 | 80 |
Use the following query to delete the desired rows:
DELETE FROM MARK_TABLE M
WHERE
EXISTS (
SELECT
1
FROM
MARK_TABLE M_IN
WHERE
M.STUDENT = M_IN.STUDENT
AND M.COURSE = M_IN.COURSE
AND M.SCORE = M_IN.SCORE
AND M.ID < M_IN.ID
)
Cheers!!
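For anyone who wants to check which rows survive the EXISTS-based delete above, here is a sketch against the question's data in SQLite via Python's sqlite3 module. As an adaptation for SQLite, the outer table is referenced by its name instead of the alias M; the logic is otherwise the same. Because the condition is M.ID < M_IN.ID, the row with the highest ID in each duplicate group is the one that is kept.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MARK_TABLE (ID INTEGER, STUDENT INTEGER, COURSE INTEGER, SCORE INTEGER)"
)
conn.executemany(
    "INSERT INTO MARK_TABLE VALUES (?, ?, ?, ?)",
    [(1, 1, 1, 60), (3, 1, 2, 81), (4, 1, 3, 81),
     (9, 2, 1, 80), (10, 1, 1, 60), (11, 2, 1, 80)],
)

# Delete every row for which a duplicate with a higher ID exists.
conn.execute(
    """
    DELETE FROM MARK_TABLE
    WHERE EXISTS (
        SELECT 1
        FROM MARK_TABLE AS M_IN
        WHERE MARK_TABLE.STUDENT = M_IN.STUDENT
          AND MARK_TABLE.COURSE = M_IN.COURSE
          AND MARK_TABLE.SCORE = M_IN.SCORE
          AND MARK_TABLE.ID < M_IN.ID
    )
    """
)

remaining = conn.execute(
    "SELECT ID, STUDENT, COURSE, SCORE FROM MARK_TABLE ORDER BY ID"
).fetchall()
print(remaining)
```

Flipping the comparison to > would keep the lowest ID of each group instead.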
use distinct
SELECT distinct student, course, score FROM mark_table
Assuming you don't just want to select the unique data you want to keep (you mention you've already done this), you can proceed as follows:
Create a temporary table to hold the data you want to keep
Insert the data you want to keep into the temporary table
Empty the source table
Re-Insert the data you want to keep into the source table.
select * from
(
select row_number() over (partition by student,course,score order by score)
rn,student,course,score from mark_table
) t
where rn=1
Use CTE with RowNumber
create table #MARK_TABLE (ID int, STUDENT int, COURSE int, SCORE int)
insert into #MARK_TABLE
values
(1,1,1,60),
(3,1,2,81),
(4,1,3,81),
(9,2,1,80),
(10,1,1,60),
(11,2,1,80)
;with cteDeleteID as(
-- order by id so that the lowest id in each duplicate group is numbered 1 and kept
Select id, row_number() over (partition by student,course,score order by id) [row_number] from #MARK_TABLE
)
delete from #MARK_TABLE where id in
(
select id from cteDeleteID where [row_number] != 1
)
select * from #MARK_TABLE
drop table #MARK_TABLE

SQL - How to find which page is the first for users?

I have a table like this:
+----------+-------------------------------------+----------------------------------+
| user_id | time | url |
+----------+-------------------------------------+----------------------------------+
| 1 | 02.04.2017 8:56 | www.landingpage.com/ |
| 1 | 02.04.2017 8:57 | www.landingpage.com/about-us |
| 1 | 02.04.2017 8:58 | www.landingpage.com/faq |
| 2 | 02.04.2017 6:34 | www.landingpage.com/about-us |
| 2 | 02.04.2017 6:35 | www.landingpage.com/how-to-order |
| 3 | 03.04.2017 9:11 | www.landingpage.com/ |
| 3 | 03.04.2017 9:12 | www.landingpage.com/contact |
| 3 | 03.04.2017 9:13 | www.landingpage.com/about-us |
| 3 | 03.04.2017 9:14 | www.landingpage.com/our-legacy |
| 3 | 03.04.2017 9:15 | www.landingpage.com/ |
+----------+-------------------------------------+----------------------------------+
I want to figure out which page is the first for most users (the first page a user sees when they come to the site) and count the number of times it is viewed as the first page.
Is there a way to write a query to do this? I guess I need to use
MIN(time)
in conjunction with grouping but I don't know how.
So regarding the sample I provided it should be like:
url url_count
---------------------------------------------------
www.landingpage.com/ 2
www.landingpage.com/about-us 1
Thanks!
You're correct, you'll need to use the min() aggregate function within a subselect.
select
my_table.url
from
my_table
where
my_table.time = (
select
min(t.time)
from
my_table t
where
t.user_id = my_table.user_id
)
Replace my_table with whatever your table is actually named.
To include how many pages the user has seen, you'll need something like this:
select
my_table.url
, (
select
count(t.url)
from
my_table t
where
t.user_id = my_table.user_id
) as url_count
from
my_table
where
my_table.time = (
select
min(t.time)
from
my_table t
where
t.user_id = my_table.user_id
)
SELECT *
FROM my_table
WHERE time IN
(
-- earliest visit per user (grouping by url would give the earliest visit per page instead)
SELECT min(time)
FROM my_table
GROUP BY user_id
);
You can query as below:
Select top (1) with ties *
from yourtable
order by row_number() over(partition by user_id order by [time])
You can use outer query to get the same as below:
Select * from (
Select *, RowN = row_number() over(partition by user_id order by [time]) from yourtable) a
Where a.RowN = 1
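None of the queries above produces the url / url_count output shown in the question by itself; one more GROUP BY over the first rows gets there. Here is a sketch of that extra step, run in SQLite via Python's sqlite3 module. The table name my_table is taken from the first answer, and the timestamps are rewritten in ISO format as an assumption, since text dates like 02.04.2017 8:56 do not sort chronologically as strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (user_id INTEGER, time TEXT, url TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [(1, "2017-04-02 08:56", "www.landingpage.com/"),
     (1, "2017-04-02 08:57", "www.landingpage.com/about-us"),
     (1, "2017-04-02 08:58", "www.landingpage.com/faq"),
     (2, "2017-04-02 06:34", "www.landingpage.com/about-us"),
     (2, "2017-04-02 06:35", "www.landingpage.com/how-to-order"),
     (3, "2017-04-03 09:11", "www.landingpage.com/"),
     (3, "2017-04-03 09:12", "www.landingpage.com/contact"),
     (3, "2017-04-03 09:13", "www.landingpage.com/about-us"),
     (3, "2017-04-03 09:14", "www.landingpage.com/our-legacy"),
     (3, "2017-04-03 09:15", "www.landingpage.com/")],
)

# Step 1: number each user's visits by time. Step 2: count the first visits per url.
first_pages = conn.execute(
    """
    SELECT url, COUNT(*) AS url_count
    FROM (SELECT user_id, url,
                 ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY time) AS rn
          FROM my_table) AS firsts
    WHERE rn = 1
    GROUP BY url
    ORDER BY url_count DESC, url
    """
).fetchall()

for url, url_count in first_pages:
    print(url, url_count)
```

With the sample data this gives www.landingpage.com/ with a count of 2 and www.landingpage.com/about-us with a count of 1, as requested in the question.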