MySQL: fetching rows distinct by year

I have a MySQL table similar to this:
| id | name | create_date |
---------------------------
| 1 | foo | 2003-03-11 |
| 2 | goo | 2003-04-27 |
| 3 | woo | 2004-10-07 |
| 4 | too | 2004-12-01 |
| 5 | hoo | 2005-04-20 |
| 6 | koo | 2006-01-12 |
| 7 | boo | 2006-04-17 |
| 8 | moo | 2006-08-19 |
I want to fetch all the latest yearly rows - one per year. So in the example above I'll get 2, 4, 5 and 8.
What's the right syntax?

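Some of the other answers may work for you, but this simple query does not require any joins; it looks up the latest id for each year with a correlated subquery:
SELECT y.yr,
(SELECT id FROM mytable WHERE YEAR(create_date) = y.yr ORDER BY create_date DESC LIMIT 1) AS id
FROM (SELECT DISTINCT YEAR(create_date) AS yr FROM mytable) AS y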

you can do something like
select * from table_name
where create_date in (
select max(create_date)
from table_name
group by year(create_date))

SELECT id FROM foo JOIN
(SELECT YEAR(create_date),MAX(create_date) AS md
FROM foo
GROUP BY YEAR(create_date)) as maxes
ON (create_date=md);
If you put an index on create_date, this will be fairly fast.
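A minimal sketch of the join-on-yearly-maximum approach, run against the sample data from the question. It uses Python's built-in sqlite3 module as a stand-in for MySQL, so `strftime('%Y', ...)` replaces `YEAR(...)`; the function name `latest_row_per_year` and the index name are made up for illustration.

```python
import sqlite3

SAMPLE = [
    (1, "foo", "2003-03-11"), (2, "goo", "2003-04-27"),
    (3, "woo", "2004-10-07"), (4, "too", "2004-12-01"),
    (5, "hoo", "2005-04-20"), (6, "koo", "2006-01-12"),
    (7, "boo", "2006-04-17"), (8, "moo", "2006-08-19"),
]

def latest_row_per_year(rows):
    """Return the ids of the rows holding the latest create_date of each year."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mytable (id INTEGER PRIMARY KEY, name TEXT, create_date TEXT)"
    )
    conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", rows)
    # Index on create_date, as suggested above (makes no difference at this size).
    conn.execute("CREATE INDEX idx_create_date ON mytable (create_date)")
    query = """
        SELECT t.id
        FROM mytable AS t
        JOIN (SELECT MAX(create_date) AS md
              FROM mytable
              GROUP BY strftime('%Y', create_date)) AS maxes
          ON t.create_date = maxes.md
        ORDER BY t.id
    """
    ids = [row[0] for row in conn.execute(query)]
    conn.close()
    return ids

print(latest_row_per_year(SAMPLE))  # [2, 4, 5, 8]
```

If two rows share the latest date of a year, this approach returns both of them.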

SELECT mi.*
FROM (
SELECT DISTINCT YEAR(create_date) AS dyear
FROM mytable
) md
JOIN mytable mi
ON mi.id =
(
SELECT id
FROM mytable ml
WHERE ml.create_date < CAST(CONCAT_WS('.', dyear + 1, 1, 1) AS DATETIME)
ORDER BY
ml.create_date DESC
LIMIT 1
)

select id
from mytable
where not exists (
select * from mytable as T2
where year(T2.create_date) = year(mytable.create_date)
and T2.create_date > mytable.create_date
)

Related

SQL - SUM for the max ID

I have a table like this,
| id | name | subtask | maintask |
|----|------|---------|----------|
| 1 | t1 | 11 | 20 |
| 1 | t1 | 12 | 20 |
| 1 | t1 | 1 | 30 |
| 2 | t1 | 2 | 20 |
| 2 | t1 | 2 | 20 |
I want to prepare a result like this
| id | name | sum_of_subtask | sum_of_maintask | diff |
|----|------|----------------|-----------------|------|
| 2 | t1 | 4 | 40 | 36 |
I need to pick the max ID, then sum subtask and maintask for that ID; the last column is the difference between sum(subtask) and sum(maintask).
I tried the query below, but it calculates the sums over all the rows:
select max(id), name, sum(subtask),sum(maintask),sum(subtask-maintask) from tbl
group by name
Do you just want one row? If so, use order by and limit:
select id, name, sum(subtask), sum(maintask), sum(subtask-maintask)
from tbl
group by id, name
order by id desc
limit 1;
If your data is large, it might be more efficient to filter before aggregating:
select id, name, sum(subtask), sum(maintask), sum(subtask-maintask)
from tbl
where id = (select max(id) from tbl)
group by id, name;
If you want the maximum id per name, then the filtering logic is:
select id, name, sum(subtask), sum(maintask), sum(subtask-maintask)
from tbl t
where t.id = (select max(t2.id) from tbl t2 where t2.name = t.name)
group by id, name;
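A minimal sketch of the filter-before-aggregate variant, using Python's sqlite3 module as a stand-in database and the sample rows from the question. The function name `sums_for_max_id` is made up for illustration. Note that the expected output in the question shows diff = 36, which corresponds to SUM(maintask) - SUM(subtask); SUM(subtask - maintask) as written in the queries above gives -36, so the sketch assumes the first form is what is wanted.

```python
import sqlite3

ROWS = [
    (1, "t1", 11, 20),
    (1, "t1", 12, 20),
    (1, "t1", 1, 30),
    (2, "t1", 2, 20),
    (2, "t1", 2, 20),
]

def sums_for_max_id(rows):
    """Aggregate only the rows that carry the highest id."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tbl (id INTEGER, name TEXT, subtask INTEGER, maintask INTEGER)"
    )
    conn.executemany("INSERT INTO tbl VALUES (?, ?, ?, ?)", rows)
    query = """
        SELECT id, name,
               SUM(subtask)  AS sum_of_subtask,
               SUM(maintask) AS sum_of_maintask,
               SUM(maintask) - SUM(subtask) AS diff  -- assumed sign, see lead-in
        FROM tbl
        WHERE id = (SELECT MAX(id) FROM tbl)  -- filter first, then aggregate
        GROUP BY id, name
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

print(sums_for_max_id(ROWS))  # [(2, 't1', 4, 40, 36)]
```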
Please use the query below:
select id, name, sum(subtask), sum(maintask), sum(subtask)-sum(maintask)
from tbl
where id in
(select max(id) from tbl)
group by id, name;

Getting the last updated name

I have a table with records like this:
+------+------+
| ID | name |
+------+------+
| 1 | A |
| 2 | B |
| 3 | C |
| 4 | A |
| 5 | B |
| 6 | A |
| 7 | A |
| 8 | A |
+------+------+
I need to get the row where A was last set after holding a different value; in this example, that is the row at ID 6.
Try this query (MySQL syntax):
select min(ID)
from records
where name = 'A'
and ID >=
(
select max(ID)
from records
where name <> 'A'
);
Illustration:
select * from records;
+------+------+
| ID | name |
+------+------+
| 1 | A |
| 2 | B |
| 3 | C |
| 4 | A |
| 5 | B |
| 6 | A |
| 7 | A |
| 8 | A |
+------+------+
-- run query:
+---------+
| min(ID) |
+---------+
| 6 |
+---------+
Using the Lag function...
SELECT Max([ID])
FROM (SELECT [name], [ID],
Lag([name]) OVER (ORDER BY [ID]) AS PrvVal
FROM tablename) tbl
WHERE [name] = 'A'
AND prvval <> 'A'
Online Demo: http://www.sqlfiddle.com/#!18/a55eb/2/0
If you want to get the whole row, you can do this...
SELECT Top 1 *
FROM (SELECT [name], [ID],
Lag([name]) OVER (ORDER BY [ID]) AS PrvVal
FROM tablename) tbl
WHERE [name] = 'A' AND prvval <> 'A'
ORDER BY [ID] DESC
Online Demo: http://www.sqlfiddle.com/#!18/a55eb/22/0
The ANSI SQL below uses a self-join on the previous id.
And the where-clause gets those with a name that's different from the previous.
select max(t1.ID) as ID
from YourTable as t1
left join YourTable as t2 on t1.ID = t2.ID+1
where (t1.name <> t2.name or t2.name is null)
and t1.name = 'A';
It should work on most RDBMSs, including MS SQL Server.
Note that joining on ID+1 assumes there are no gaps between the IDs.
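When the IDs do have gaps, one option is to look up the nearest lower ID instead of ID-1. The sketch below shows that idea with Python's sqlite3 module as a stand-in database; the function name `last_change_to` and the gapped sample rows are made up for illustration, and this is only one possible way to drop the no-gaps assumption.

```python
import sqlite3

def last_change_to(rows, name):
    """Return the highest ID where `name` starts a new run, i.e. the
    previous row (nearest lower ID) is missing or holds another name."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE records (ID INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO records VALUES (?, ?)", rows)
    query = """
        SELECT MAX(t1.ID)
        FROM records AS t1
        WHERE t1.name = ?
          AND NOT EXISTS (
                SELECT 1
                FROM records AS t2
                -- previous row = the one with the largest ID below t1.ID
                WHERE t2.ID = (SELECT MAX(t3.ID) FROM records AS t3
                               WHERE t3.ID < t1.ID)
                  AND t2.name = t1.name)
    """
    result = conn.execute(query, (name,)).fetchone()[0]
    conn.close()
    return result

ORIGINAL = [(1, "A"), (2, "B"), (3, "C"), (4, "A"),
            (5, "B"), (6, "A"), (7, "A"), (8, "A")]
# Hypothetical variant of the data with gaps in the ID sequence.
GAPPED = [(1, "A"), (2, "B"), (3, "C"), (4, "A"),
          (5, "B"), (10, "A"), (11, "A"), (15, "A")]

print(last_change_to(ORIGINAL, "A"))  # 6
print(last_change_to(GAPPED, "A"))    # 10
```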

Select Except the duplicate Records from the table in SQL Server

I have a SQL Server table that has duplicate entries in one of the columns e.g.:
+----+-----------+------------+
| id | object_id | status_val |
+----+-----------+------------+
| 1 | 1 | 0 |
| 2 | 1 | 0 |
| 3 | 1 | 0 |
| 4 | 2 | 0 |
| 5 | 3 | 0 |
| 6 | 4 | 0 |
| 7 | 4 | 0 |
+----+-----------+------------+
I need the output to be like this:
+----+-----------+------------+
| id | object_id | status_val |
+----+-----------+------------+
| 4 | 2 | 0 |
| 5 | 3 | 0 |
+----+-----------+------------+
How to resolve this?
Is this what you are looking for?
SELECT * FROM <yourTable> t1
WHERE t1.object_id NOT IN
(
SELECT t2.object_id
FROM <yourTable> t2
GROUP BY t2.object_id
HAVING COUNT(object_id) > 1
)
Try this:
select min(id),
object_id,
min(status_val)
from table
group by object_id
having count(*) = 1
Use HAVING and GROUP BY
SELECT MIN(id) id, object_id, MIN(status_val) status_val
FROM yourtable
GROUP BY object_id
HAVING COUNT(object_id) = 1
Output
id object_id status_val
4 2 0
5 3 0
SQL Fiddle: http://sqlfiddle.com/#!6/7f643f/9/0
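For readers without SQL Server at hand, here is a small sketch that runs the same GROUP BY / HAVING idea against the sample rows using Python's sqlite3 module. SQLite is only a stand-in here, and the function name `non_duplicated_rows` is made up for illustration.

```python
import sqlite3

ROWS = [
    (1, 1, 0), (2, 1, 0), (3, 1, 0),
    (4, 2, 0), (5, 3, 0),
    (6, 4, 0), (7, 4, 0),
]

def non_duplicated_rows(rows):
    """Return the rows whose object_id occurs exactly once."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE yourtable (id INTEGER, object_id INTEGER, status_val INTEGER)"
    )
    conn.executemany("INSERT INTO yourtable VALUES (?, ?, ?)", rows)
    query = """
        SELECT MIN(id) AS id, object_id, MIN(status_val) AS status_val
        FROM yourtable
        GROUP BY object_id
        HAVING COUNT(*) = 1   -- a group of size 1 has no duplicates
        ORDER BY object_id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

print(non_duplicated_rows(ROWS))  # [(4, 2, 0), (5, 3, 0)]
```

Because each surviving group holds a single row, MIN(id) and MIN(status_val) simply return that row's own values.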
You can use GROUP BY to get each distinct value once, like below:
SELECT TABLE_COLUMN_NAME from TABLENAME
group by TABLE_COLUMN_NAME
This query gives you only the distinct values of that column from your table.
Assign a row number to each row, partitioned and ordered by the columns [object_id], [status_val]. Then, from the result set, select the groups whose maximum row number is 1.
Query
;with cte as(
select [rn] = row_number() over(
partition by [object_id], [status_val]
order by [object_id], [status_val]
), *
from [your_table_name]
)
select min([id]) as [id], [object_id], [status_val]
from cte
group by [object_id], [status_val]
having max([rn]) = 1;
SELECT id, object_id, status_val
FROM object_table t
WHERE (SELECT COUNT(*)
FROM object_table c
WHERE c.object_id = t.object_id) = 1;
I think this is what you are looking for.

SELECT only latest record of an ID from given rows

I have this table shown below...How do I select only the latest data of the id based on changeno?
+----+-------+----------+
| id | data | changeno |
+----+-------+----------+
| 1 | Yes | 1 |
| 2 | Yes | 2 |
| 2 | Maybe | 3 |
| 3 | Yes | 4 |
| 3 | Yes | 5 |
| 3 | No | 6 |
| 4 | No | 7 |
| 5 | Maybe | 8 |
| 5 | Yes | 9 |
+----+-------+----------+
I would want this result...
+----+-------+----------+
| id | data | changeno |
+----+-------+----------+
| 1 | Yes | 1 |
| 2 | Maybe | 3 |
| 3 | No | 6 |
| 4 | No | 7 |
| 5 | Yes | 9 |
+----+-------+----------+
I currently have this SQL statement...
SELECT id, data, MAX(changeno) as changeno FROM Table1 GROUP BY id;
and clearly it doesn't return what I want. It should return an error because data is neither aggregated nor listed in the GROUP BY clause. If I add fields to the GROUP BY clause it works, but it still doesn't return what I want. This SQL statement is the closest I could come up with. I'd appreciate it if anybody could help me with this. Thank you in advance :)
This is typically referred to as the "greatest-n-per-group" problem. One way to solve this in SQL Server 2005 and higher is to use a CTE with a calculated ROW_NUMBER() based on the grouping of the id column, and sorting those by largest changeno first:
;WITH cte AS
(
SELECT id, data, changeno,
rn = ROW_NUMBER() OVER (PARTITION BY id ORDER BY changeno DESC)
FROM dbo.Table1
)
SELECT id, data, changeno
FROM cte
WHERE rn = 1
ORDER BY id;
You want to use row_number() for this:
select id, data, changeno
from (SELECT t.*,
row_number() over (partition by id order by changeno desc) as seqnum
FROM Table1 t
) t
where seqnum = 1;
Not a well-formed or performance-optimized query, but for small tasks it works fine (it relies on changeno values not repeating across different ids):
SELECT * FROM TEST
WHERE changeno IN (SELECT MAX(changeno)
FROM TEST
GROUP BY id)
Another alternative, using a table variable and a join on the per-id maximum:
DECLARE @Table1 TABLE
(
id INT, data VARCHAR(5), changeno INT
);
INSERT INTO @Table1
SELECT 1,'Yes',1
UNION ALL
SELECT 2,'Yes',2
UNION ALL
SELECT 2,'Maybe',3
UNION ALL
SELECT 3,'Yes',4
UNION ALL
SELECT 3,'Yes',5
UNION ALL
SELECT 3,'No',6
UNION ALL
SELECT 4,'No',7
UNION ALL
SELECT 5,'Maybe',8
UNION ALL
SELECT 5,'Yes',9
-- keep only the rows whose changeno is the maximum for their id
SELECT Y.id, Y.data, Y.changeno
FROM @Table1 Y
INNER JOIN (
SELECT id, changeno = MAX(changeno)
FROM @Table1
GROUP BY id
) X ON X.id = Y.id
WHERE X.changeno = Y.changeno
ORDER BY Y.id
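The same greatest-n-per-group result can also be sketched without window functions or a derived table, by keeping a row only when no row with the same id has a higher changeno. The example below is a rough illustration using Python's sqlite3 module as a stand-in for SQL Server; the function name `latest_per_id` is made up, and NOT EXISTS is a technique swapped in here, not one taken from the answers above.

```python
import sqlite3

ROWS = [
    (1, "Yes", 1), (2, "Yes", 2), (2, "Maybe", 3),
    (3, "Yes", 4), (3, "Yes", 5), (3, "No", 6),
    (4, "No", 7), (5, "Maybe", 8), (5, "Yes", 9),
]

def latest_per_id(rows):
    """Return, for each id, the row with the highest changeno."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Table1 (id INTEGER, data TEXT, changeno INTEGER)")
    conn.executemany("INSERT INTO Table1 VALUES (?, ?, ?)", rows)
    query = """
        SELECT t.id, t.data, t.changeno
        FROM Table1 AS t
        WHERE NOT EXISTS (
                SELECT 1
                FROM Table1 AS newer
                WHERE newer.id = t.id
                  AND newer.changeno > t.changeno)  -- a later change exists
        ORDER BY t.id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

for row in latest_per_id(ROWS):
    print(row)
# (1, 'Yes', 1)
# (2, 'Maybe', 3)
# (3, 'No', 6)
# (4, 'No', 7)
# (5, 'Yes', 9)
```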

How to select row with the latest timestamp from duplicated rows in a database table?

I have a table with duplicate & triplicate rows - how do I select the rows that are duplicated but have the latest timestamp as well as the un-duped rows?
-------------------------------------
| pk_id | user_id | some_timestamp |
|-------------------------------------|
| 1 | 123 | 10-Jun-12 14.30 |
| 2 | 123 | 19-Jun-12 21.50 |
| 3 | 567 | 10-Jun-12 09.23 |
| 4 | 567 | 12-Jun-12 09.45 |
| 5 | 567 | 13-Jun-12 08.40 |
| 6 | 890 | 13-Jun-12 08.44 |
-------------------------------------
So that I end up with:
-------------------------------------
| pk_id | user_id | some_timestamp |
|-------------------------------------|
| 2 | 123 | 19-Jun-12 21.50 |
| 5 | 567 | 13-Jun-12 08.40 |
| 6 | 890 | 13-Jun-12 08.44 |
-------------------------------------
SELECT * FROM (
SELECT pk_id,
user_id,
some_timestamp,
ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY some_timestamp DESC) col
FROM table) x
WHERE x.col = 1
try this
select * from table
where some_timestamp
in (select max(some_timestamp)
from table group by user_id)
Try this, I made a SQLFIDDLE which returns the correct set of data
SELECT * FROM YourTable AS T1
INNER JOIN
( SELECT user_id , MAX(some_timestamp) AS some_timestamp FROM YourTable
GROUP BY user_id
) AS T2
ON T1.User_Id = T2.User_Id AND T1.some_timestamp = T2.some_timestamp
ORDER BY 1
http://sqlfiddle.com/#!6/f7bba/6
Try this:
select * from my_table
where (user_id, some_timestamp) IN (select user_id, max(some_timestamp) from my_table group by user_id);
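Matching on the (user_id, some_timestamp) pair rather than on the timestamp alone matters when one user's older timestamp happens to equal another user's latest one. The sketch below illustrates the difference with Python's sqlite3 module as a stand-in database. The rows are hypothetical (they are not the question's data: pk_id 4 was changed so that it collides with user 123's latest timestamp), the function name `latest_per_user` is made up, and pair matching is written in its join form.

```python
import sqlite3

# Hypothetical data: pk_id 4 (user 567) shares a timestamp with user 123's latest row.
ROWS = [
    (1, 123, "2012-06-10 14:30"),
    (2, 123, "2012-06-19 21:50"),
    (3, 567, "2012-06-10 09:23"),
    (4, 567, "2012-06-19 21:50"),
    (5, 567, "2012-06-20 08:40"),
    (6, 890, "2012-06-13 08:44"),
]

def latest_per_user(rows, match_on_pair):
    """Return pk_ids of the 'latest' rows, matching either on the
    (user_id, timestamp) pair or on the timestamp alone."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE my_table "
        "(pk_id INTEGER PRIMARY KEY, user_id INTEGER, some_timestamp TEXT)"
    )
    conn.executemany("INSERT INTO my_table VALUES (?, ?, ?)", rows)
    if match_on_pair:
        query = """
            SELECT t.pk_id
            FROM my_table AS t
            JOIN (SELECT user_id, MAX(some_timestamp) AS mx
                  FROM my_table
                  GROUP BY user_id) AS m
              ON t.user_id = m.user_id AND t.some_timestamp = m.mx
            ORDER BY t.pk_id
        """
    else:
        query = """
            SELECT pk_id
            FROM my_table
            WHERE some_timestamp IN (SELECT MAX(some_timestamp)
                                     FROM my_table
                                     GROUP BY user_id)
            ORDER BY pk_id
        """
    ids = [row[0] for row in conn.execute(query)]
    conn.close()
    return ids

print(latest_per_user(ROWS, match_on_pair=True))   # [2, 5, 6]
print(latest_per_user(ROWS, match_on_pair=False))  # [2, 4, 5, 6]
```

With the timestamp-only filter, pk_id 4 slips through even though it is not user 567's latest row.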
select YourTable.*
from
YourTable JOIN
(select User_Id, Max(Some_Timestamp) as Mx
from YourTable
group by User_Id) Mx
on YourTable.User_Id=Mx.User_Id
and YourTable.Some_Timestamp=Mx.Mx