Fill in missing timestamp values in SQL - sql

SQL newbie here, looking for a bit of help writing a query.
Some sample data
Time Value
9:00 1.2
9:01 2.3
9:05 2.4
9:06 2.5
I need to fill in those missing times with zero - so the query would return
Time Value
9:00 1.2
9:01 2.3
9:02 0
9:03 0
9:04 0
9:05 2.4
9:06 2.5
Is this possible in T-SQL?
Thanks for any help / advice ...

One method uses a recursive CTE to generate the list of times and then a left join to bring in the values:
with cte as (
select min(s.time) as time, max(s.time) as maxt
from sample s
union all
select dateadd(minute, 1, cte.time), cte.maxt
from cte
where cte.time < cte.maxt
)
select cte.time, coalesce(s.value, 0)
from cte left join
sample s
on cte.time = s.time
order by cte.time;
Note that if you have more than 100 minutes, you will need option (maxrecursion 0) at the end of the query.
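The recursive-CTE idea can be tried without a SQL Server instance. The following is a minimal sketch, assuming SQLite driven from Python's sqlite3 module rather than T-SQL: the column name ts is hypothetical, times are stored as 'HH:MM' text, and strftime(..., '+1 minute') stands in for DATEADD. The maxrecursion option is specific to SQL Server, so it does not appear here.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; `ts` holds 'HH:MM' text.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sample (ts TEXT PRIMARY KEY, value REAL);
    INSERT INTO sample VALUES ('09:00', 1.2), ('09:01', 2.3),
                              ('09:05', 2.4), ('09:06', 2.5);
""")

filled = conn.execute("""
    WITH RECURSIVE cte(ts, maxt) AS (
        -- anchor row: first and last timestamp present in the data
        SELECT MIN(ts), MAX(ts) FROM sample
        UNION ALL
        -- step: advance one minute until the last timestamp is reached
        SELECT strftime('%H:%M', ts, '+1 minute'), maxt
        FROM cte
        WHERE ts < maxt
    )
    SELECT cte.ts, COALESCE(s.value, 0)
    FROM cte
    LEFT JOIN sample s ON s.ts = cte.ts
    ORDER BY cte.ts
""").fetchall()

for row in filled:
    print(row)
```

The generated minutes drive the join, so every minute between the first and last reading appears once, and COALESCE supplies the zero where no reading exists.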

You can use a recursive CTE to build a calendar table and then OUTER JOIN against it.
CREATE TABLE T(
[Time] Time,
Value float
);
insert into T values ('9:00',1.2);
insert into T values ('9:01',2.3);
insert into T values ('9:05',2.4);
insert into T values ('9:06',2.5);
Query 1:
with cte as (
SELECT MIN([Time]) minDt,MAX([Time] ) maxDt
FROM T
UNION ALL
SELECT dateadd(minute, 1, minDt) ,maxDt
FROM CTE
WHERE dateadd(minute, 1, minDt) <= maxDt
)
SELECT t1.minDt 'Time',
ISNULL(t2.[Value],0) 'Value'
FROM CTE t1
LEFT JOIN T t2 on t2.[Time] = t1.minDt
Results:
| Time | Value |
|------------------|-------|
| 09:00:00.0000000 | 1.2 |
| 09:01:00.0000000 | 2.3 |
| 09:02:00.0000000 | 0 |
| 09:03:00.0000000 | 0 |
| 09:04:00.0000000 | 0 |
| 09:05:00.0000000 | 2.4 |
| 09:06:00.0000000 | 2.5 |

Related

Create all months list from a date column in ORACLE SQL

CREATE TABLE dates(
alldates date);
INSERT INTO dates (alldates) VALUES ('1-May-2017');
INSERT INTO dates (alldates) VALUES ('1-Mar-2018');
I want to generate all months beginning between these two dates. I am very new to Oracle SQL. My solution is below, but it is not working properly.
WITH t1(test) AS (
SELECT MIN(alldates) as test
FROM dates
UNION ALL
SELECT ADD_MONTHS(test,1) as test
FROM t1
WHERE t1.test<= (SELECT MAX(alldates) FROM date)
)
SELECT * FROM t1
The result I want should look like
Test
2017-02-01
2017-03-01
...
2017-12-01
2018-01-01
2018-02-01
2018-03-01
You made a typo and wrote date instead of dates, but you also need to make a second change: use ADD_MONTHS in the recursive query's WHERE clause, or you will generate one too many rows.
WITH t1(test) AS (
SELECT MIN(alldates)
FROM dates
UNION ALL
SELECT ADD_MONTHS(test,1)
FROM t1
WHERE ADD_MONTHS(test,1) <= (SELECT MAX(alldates) FROM dates)
)
SELECT * FROM t1
Which outputs:
| TEST |
| :-------- |
| 01-MAY-17 |
| 01-JUN-17 |
| 01-JUL-17 |
| 01-AUG-17 |
| 01-SEP-17 |
| 01-OCT-17 |
| 01-NOV-17 |
| 01-DEC-17 |
| 01-JAN-18 |
| 01-FEB-18 |
| 01-MAR-18 |
However, a more efficient query would be to get the minimum and maximum values in the same query and then iterate using these pre-found bounds:
WITH t1(min_date, max_date) AS (
SELECT MIN(alldates),
MAX(alldates)
FROM dates
UNION ALL
SELECT ADD_MONTHS(min_date,1),
max_date
FROM t1
WHERE ADD_MONTHS(min_date,1) <= max_date
)
SELECT min_date AS month
FROM t1
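The off-by-one described in this answer is easy to reproduce. The following is a rough sketch, assuming SQLite via Python's sqlite3 module instead of Oracle: dates are stored as ISO text, date(x, '+1 month') stands in for ADD_MONTHS, and the helper months() is a hypothetical name introduced only for this illustration.

```python
import sqlite3

# Assumption: SQLite stands in for Oracle; dates are ISO-8601 text.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dates (alldates TEXT);
    INSERT INTO dates VALUES ('2017-05-01'), ('2018-03-01');
""")

def months(where_clause):
    """Generate month starts; `where_clause` is the recursion stop test."""
    sql = f"""
        WITH RECURSIVE t1(min_date, max_date) AS (
            SELECT MIN(alldates), MAX(alldates) FROM dates
            UNION ALL
            SELECT date(min_date, '+1 month'), max_date
            FROM t1
            WHERE {where_clause}
        )
        SELECT min_date FROM t1 ORDER BY min_date
    """
    return [r[0] for r in conn.execute(sql)]

# Testing the current row lets one extra month slip through.
loose = months("min_date <= max_date")
# Testing the row about to be generated stops exactly at the maximum.
strict = months("date(min_date, '+1 month') <= max_date")

print(loose[-1], len(loose))
print(strict[-1], len(strict))
```

The only difference between the two calls is whether the stop condition looks at the current month or at the next one, which is the second change the answer asks for.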
Update
Oracle 11gR2 has bugs handling recursive date queries; this is fixed in later Oracle versions but if you want to use SQL Fiddle and Oracle 11gR2 then you need to iterate over a numeric value and not a date. Something like this:
SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE dates(
alldates date);
INSERT INTO dates (alldates) VALUES ('1-May-2017');
INSERT INTO dates (alldates) VALUES ('1-Mar-2018');
Query 1:
WITH t1(min_date, month, total_months) AS (
SELECT MIN(alldates),
0,
MONTHS_BETWEEN(MAX(alldates),MIN(alldates))
FROM dates
UNION ALL
SELECT min_date,
month+1,
total_months
FROM t1
WHERE month+1<=total_months
)
SELECT ADD_MONTHS(min_date,month) AS month
FROM t1
Results:
| MONTH |
|----------------------|
| 2017-05-01T00:00:00Z |
| 2017-06-01T00:00:00Z |
| 2017-07-01T00:00:00Z |
| 2017-08-01T00:00:00Z |
| 2017-09-01T00:00:00Z |
| 2017-10-01T00:00:00Z |
| 2017-11-01T00:00:00Z |
| 2017-12-01T00:00:00Z |
| 2018-01-01T00:00:00Z |
| 2018-02-01T00:00:00Z |
| 2018-03-01T00:00:00Z |
You seem to want a recursive CTE. That syntax would be:
WITH CTE(min_date, max_date) as (
SELECT MIN(alldates) as min_date, MAX(alldates) as max_date
FROM dates
UNION ALL
SELECT add_months(min_date, 1), max_date
FROM CTE
WHERE min_date < max_date
)
SELECT min_date
FROM CTE;
You just made a typo: date instead of dates:
WITH t1(test) AS (
SELECT MIN(alldates) as test
FROM dates
UNION ALL
SELECT ADD_MONTHS(test,1) as test
FROM t1
WHERE t1.test<= (SELECT MAX(alldates) FROM dateS) -- fixed here
)
SELECT * FROM t1

How to apply Excel operation into SQL server Query?

I am currently using SSMS 2008.
I would like to reproduce, in SQL Server, the operation described in the Excel screenshot.
I have two tables joined, one with a positive count for when employees start working and one with a negative count for when employees leave. I am looking to have a column showing the count of employees per hour.
I appreciate any help on this matter,
Thank you,
This is a running total and can be implemented with a windowed SUM:
SELECT *, SUM(Employee) OVER(ORDER BY [Date], [Time]) as Total_available
FROM tab
ORDER BY [Date], [Time];
An alternative to SUM OVER is a self-join with an aggregation over the rows with lower or equal values. It also works on SQL Server versions before 2012, where SUM OVER does not accept an ORDER BY.
Sample data:
CREATE TABLE TestEmployeeRegistration (
[Date] DATE,
[Time] TIME,
[Employees] INT NOT NULL DEFAULT 0,
PRIMARY KEY ([Date], [Time])
);
INSERT INTO TestEmployeeRegistration
([Date], [Time], [Employees]) VALUES
('2019-11-01', '08:00', 2),
('2019-11-01', '09:00', 5),
('2019-11-01', '10:00', 3),
('2019-11-01', '12:00',-5),
('2019-11-01', '13:00', 2),
('2019-11-01', '14:00',-5);
Query:
SELECT t.[Date], t.[Time], t.[Employees]
, SUM(t2.[Employees]) AS [Total available]
FROM [TestEmployeeRegistration] t
JOIN [TestEmployeeRegistration] t2
ON t2.[Date] = t.[Date]
AND t2.[Time] <= t.[Time]
GROUP BY t.[Date], t.[Time], t.[Employees]
ORDER BY t.[Date], t.[Time];
When using SUM as a window function, I advise partitioning by the "Date".
SELECT *
, SUM([Employees]) OVER (PARTITION BY [Date] ORDER BY [Time]) AS [Total available]
FROM [TestEmployeeRegistration]
ORDER BY [Date], [Time];
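To see that the self-join and the windowed SUM agree, here is a small sketch. It assumes SQLite (3.25 or later, for window functions) run through Python's sqlite3 module rather than SQL Server, and the table and column names reg, d and t are hypothetical stand-ins for the ones above.

```python
import sqlite3

# Assumption: SQLite >= 3.25 stands in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE reg (d TEXT, t TEXT, employees INTEGER, PRIMARY KEY (d, t));
    INSERT INTO reg VALUES
        ('2019-11-01', '08:00',  2), ('2019-11-01', '09:00',  5),
        ('2019-11-01', '10:00',  3), ('2019-11-01', '12:00', -5),
        ('2019-11-01', '13:00',  2), ('2019-11-01', '14:00', -5);
""")

# Running total via self-join: sum every row at or before the current time.
self_join = conn.execute("""
    SELECT a.t, SUM(b.employees)
    FROM reg a
    JOIN reg b ON b.d = a.d AND b.t <= a.t
    GROUP BY a.d, a.t
    ORDER BY a.d, a.t
""").fetchall()

# Running total via a window function, restarted for each date.
windowed = conn.execute("""
    SELECT t, SUM(employees) OVER (PARTITION BY d ORDER BY t)
    FROM reg
    ORDER BY d, t
""").fetchall()

print(self_join)
print(windowed)
```

Both queries return the same totals; the window form reads each row once, while the self-join pairs every row with all earlier rows of the same date.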
SQL Fiddle
MS SQL Server 2017 Schema Setup:
CREATE TABLE MyTable (Dates Date,Times Time, EmployeesAvailable int)
INSERT INTO MyTable (Dates,Times,EmployeesAvailable) VALUES('2019-11-01','08:00',2)
INSERT INTO MyTable (Dates,Times,EmployeesAvailable) VALUES('2019-11-01','09:00',5)
INSERT INTO MyTable (Dates,Times,EmployeesAvailable) VALUES('2019-11-01','10:00',3)
INSERT INTO MyTable (Dates,Times,EmployeesAvailable) VALUES('2019-11-01','12:00',-5)
INSERT INTO MyTable (Dates,Times,EmployeesAvailable) VALUES('2019-11-01','13:00',2)
INSERT INTO MyTable (Dates,Times,EmployeesAvailable) VALUES('2019-11-01','14:00',-5)
Query 1:
SELECT Dates,Times,EmployeesAvailable,
SUM(EmployeesAvailable) OVER(ORDER BY Dates,Times) AS 'Total Available'
FROM MyTable
Results:
| Dates | Times | EmployeesAvailable | Total Available |
|------------|------------------|--------------------|-----------------|
| 2019-11-01 | 08:00:00.0000000 | 2 | 2 |
| 2019-11-01 | 09:00:00.0000000 | 5 | 7 |
| 2019-11-01 | 10:00:00.0000000 | 3 | 10 |
| 2019-11-01 | 12:00:00.0000000 | -5 | 5 |
| 2019-11-01 | 13:00:00.0000000 | 2 | 7 |
| 2019-11-01 | 14:00:00.0000000 | -5 | 2 |

Mathematical comparison between rows

I have a table
+----+-----------+---------+
| ID | StartTime | EndTime |
+----+-----------+---------+
| 1 | 2:00pm | 3:00pm |
| 2 | 4:00pm | 5:00pm |
| 3 | 7:00pm | 9:00pm |
+----+-----------+---------+
I need to get the difference between the end time of one row and the start time of the NEXT row. i.e. End time of row 1 compared to start time of row 2, or end time of row 2 compared to start time of row 3.
Ideally I'd like a result that looks similar to
+----+----------------+
| ID | TimeDifference |
+----+----------------+
| 2 | 1.0 hours |
| 3 | 2.0 hours |
+----+----------------+
I have no clue whatsoever on how to do something like this. I'm thinking that I may need 2 temp tables, one to hold start times another for end times so that I can more easily do the comparisons, but honestly that's just a shot in the dark at the moment.
FYI, on server 2008 in case that makes a difference for some of the commands.
NOTE: The question was not tagged SQL Server 2008 when this answer was written.
You can use lag():
select t.*,
datediff(minute, lag(endtime) over (order by id), starttime) / 60.0 as hours_diff
from t;
This does not filter out any rows. The description of the problem ("next row") and the sample data (which is based on "previous row") are inconsistent.
Well, since it's 2008 version you can't use the Lead() or Lag() window functions, but you can use subqueries to mimic them:
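SELECT Id,
DATEDIFF(minute,
(
SELECT TOP 1 EndTime -- end time of the closest earlier row
FROM [Table] t1 -- placeholder name; TABLE is a reserved word, so it is bracketed
WHERE t1.Id < t0.Id
ORDER BY t1.Id DESC
), StartTime) / 60.0 As TimeDifference
FROM [Table] t0
WHERE EXISTS -- skip the first row, which has no previous row
(
SELECT 1
FROM [Table] t2
WHERE t2.Id < t0.Id
)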
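As a cross-check that the correlated subquery really mimics LAG, here is a short sketch. It assumes SQLite 3.25 or later through Python's sqlite3 module (not SQL Server 2008), a hypothetical table slots, and times stored as minutes since midnight so that plain subtraction replaces DATEDIFF.

```python
import sqlite3

# Assumption: SQLite >= 3.25 stands in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE slots (id INTEGER PRIMARY KEY, start_min INTEGER, end_min INTEGER);
    -- minutes since midnight: 2pm-3pm, 4pm-5pm, 7pm-9pm
    INSERT INTO slots VALUES (1, 840, 900), (2, 960, 1020), (3, 1140, 1260);
""")

# Subquery emulation: fetch the end time of the closest earlier row.
emulated = conn.execute("""
    SELECT cur.id,
           (cur.start_min - (SELECT prev.end_min
                             FROM slots prev
                             WHERE prev.id < cur.id
                             ORDER BY prev.id DESC
                             LIMIT 1)) / 60.0 AS hours_diff
    FROM slots cur
    WHERE EXISTS (SELECT 1 FROM slots p WHERE p.id < cur.id)
    ORDER BY cur.id
""").fetchall()

# Window-function version for comparison.
lagged = conn.execute("""
    SELECT id, hours_diff
    FROM (SELECT id,
                 (start_min - LAG(end_min) OVER (ORDER BY id)) / 60.0 AS hours_diff
          FROM slots)
    WHERE hours_diff IS NOT NULL
    ORDER BY id
""").fetchall()

print(emulated)
print(lagged)
```

Ordering the subquery by id descending and taking one row is what picks the immediately preceding row even when the ids have gaps.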
You can try this:
declare @t as table (ID int, StartTime time, EndTime time)
INSERT @t SELECT 1, '2:00pm', '3:00pm'
INSERT @t SELECT 2, '4:00pm', '5:00pm'
INSERT @t SELECT 3, '7:00pm', '9:00pm'
---- For sequential IDs
select
a.ID
,a.StartTime
,a.EndTime
,datediff(minute, (SELECT EndTime FROM @t b where b.ID = a.ID - 1), a.StartTime) / 60.0 as hours_diff
from @t a
---- For non-sequential IDs
;WITH cte_times as (
SELECT
ROW_NUMBER() OVER (ORDER BY Id) as new_ID
,ID
,StartTime
,EndTime
FROM
@t
)
select
a.ID
,a.StartTime
,a.EndTime
,datediff(minute, (SELECT EndTime FROM cte_times b where b.new_ID = a.new_ID - 1), a.StartTime) / 60.0 as hours_diff
from cte_times a

Without using DISTINCT, how to group data without altering value?

I have a feeling this is a dumb question with a simple answer, but here goes.
How can I group the following data without using DISTINCT? @table has 5 rows, which show data for hours 5-9. I just don't like DISTINCT.
Since I need to display all hours of the day up to hour 9 (including 0-4), I'm joining it with the table DimTime. DimTime has all hours, but in 15-minute intervals. So, DimTime looks like this:
Hour Minute
0 0
0 15
0 30
0 45
1 0
1 15
1 30
1 45
So here's my script:
declare @table table
(
Hour int,
Value int
)
insert into @table select 5, 25
insert into @table select 6, 34
insert into @table select 7, 54
insert into @table select 8, 65
insert into @table select 9, 11
select d.hour, t.hour, sum(value)
from @table t
left join dimtime d on d.hour = t.hour
group by d.hour, t.hour
If I use GROUP BY, then I need to have an aggregate function. So if I use SUM, it'll multiply all values by 4. If I remove the aggregate function, I'll get a syntax error.
Also, I cannot use a CTE since the contents of @table come from a CTE (I just didn't include it here).
Here's the result that I need to display:
Hour Value
0 null
1 null
2 null
3 null
4 null
5 25
6 34
7 54
8 65
9 11
Simply add a condition WHERE minute = 0 to return only one row per hour.
If you really wish to avoid the sort that a DISTINCT clause would perform on dimtime, see the explanation below.
Display all hours (0-9) from dimtime and sum the values given in @table for each hour:
SELECT
d.hour, SUM(t.value)
FROM
dimtime d
LEFT JOIN @table t
ON d.hour = t.hour
WHERE d.minute = 0 -- retrieves one row for every hour from dimtime
GROUP BY d.hour
ORDER BY d.hour -- not needed, but will give you resultset sorted by hour
Assuming that your dimtime table has a row with minute = 0 for every hour, you can simply limit the rows retrieved for the join. This works equally well with any single value from the list 0, 15, 30, 45.
SUM() sums all the values for a given hour in @table. If there are no rows for a particular hour, it returns NULL rather than 0, which matches the requested output; wrap it in COALESCE(SUM(t.value), 0) if zeros are wanted instead.
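The effect of the minute = 0 filter can be checked with a quick sketch. It assumes SQLite through Python's sqlite3 module rather than SQL Server, and a hypothetical regular table named readings in place of the table variable from the question. An hour with no matching rows comes back as NULL (None in Python) from SUM.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; `readings` replaces the
# table variable from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dimtime (hour INTEGER, minute INTEGER);
    CREATE TABLE readings (hour INTEGER, value INTEGER);
    INSERT INTO readings VALUES (5, 25), (6, 34), (7, 54), (8, 65), (9, 11);
""")
# Four quarter-hour rows per hour, for hours 0 through 9.
conn.executemany(
    "INSERT INTO dimtime VALUES (?, ?)",
    [(h, m) for h in range(10) for m in (0, 15, 30, 45)],
)

QUERY = """
    SELECT d.hour, SUM(r.value)
    FROM dimtime d
    LEFT JOIN readings r ON r.hour = d.hour
    {where}
    GROUP BY d.hour
    ORDER BY d.hour
"""

# Without the filter every reading is counted four times.
inflated = conn.execute(QUERY.format(where="")).fetchall()
# With the filter each hour contributes exactly one dimtime row.
per_hour = conn.execute(QUERY.format(where="WHERE d.minute = 0")).fetchall()

print(inflated[5], per_hour[5])
print(per_hour)
```

Starting from dimtime rather than from the readings is what keeps hours 0 to 4 in the result.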
You should have a better reason for not using a programming function than "I just don't like it"
You can have a CTE that uses another CTE
@dnoeth provided an excellent answer, but here's another option:
SELECT
d.hour,
t.value
FROM
@table t
INNER JOIN (SELECT DISTINCT hour FROM dimTime) d ON d.hour = t.hour
Try
SELECT *
FROM DimTime D
LEFT JOIN myTable T
ON D.Hour = T.Hour
WHERE D.Minute = 0
SQL Fiddle Demo
Output
| Hour | Minute | Hour | Value |
|------|--------|--------|--------|
| 0 | 0 | (null) | (null) |
| 1 | 0 | (null) | (null) |
| 2 | 0 | (null) | (null) |
| 3 | 0 | (null) | (null) |
| 4 | 0 | (null) | (null) |
| 5 | 0 | 5 | 25 |
| 6 | 0 | 6 | 34 |
| 7 | 0 | 7 | 54 |
| 8 | 0 | 8 | 65 |
| 9 | 0 | 9 | 11 |
If I use GROUP BY, then I need to have an aggregate function
Only if you include expressions in your SELECT that are not part of your group key. You could certainly do
select d.hour, t.hour, value
from @table t
inner join dimtime d
on d.hour = t.hour
group by d.hour, t.hour, value
or
select d.hour, t.hour, MIN(value)
from @table t
inner join dimtime d
on d.hour = t.hour
group by d.hour, t.hour
Note that the first query gives you the exact same results as DISTINCT (and may even be compiled to the same query plan) so I'm not sure what your aversion is to DISTINCT.

A very basic SQL issue I'm stuck with [duplicate]

I have a table of player performance:
CREATE TABLE TopTen (
id INT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
home INT UNSIGNED NOT NULL,
`datetime` DATETIME NOT NULL,
player VARCHAR(6) NOT NULL,
resource INT NOT NULL
);
What query will return the rows for each distinct home holding its maximum value of datetime? In other words, how can I filter by the maximum datetime (grouped by home) and still include other non-grouped, non-aggregate columns (such as player) in the result?
For this sample data:
INSERT INTO TopTen
(id, home, `datetime`, player, resource)
VALUES
(1, 10, '04/03/2009', 'john', 399),
(2, 11, '04/03/2009', 'juliet', 244),
(5, 12, '04/03/2009', 'borat', 555),
(3, 10, '03/03/2009', 'john', 300),
(4, 11, '03/03/2009', 'juliet', 200),
(6, 12, '03/03/2009', 'borat', 500),
(7, 13, '24/12/2008', 'borat', 600),
(8, 13, '01/01/2009', 'borat', 700)
;
the result should be:
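| id | home | datetime   | player | resource |
|----|------|------------|--------|----------|
| 1  | 10   | 04/03/2009 | john   | 399      |
| 2  | 11   | 04/03/2009 | juliet | 244      |
| 5  | 12   | 04/03/2009 | borat  | 555      |
| 8  | 13   | 01/01/2009 | borat  | 700      |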
I tried a subquery getting the maximum datetime for each home:
-- 1 ..by the MySQL manual:
SELECT DISTINCT
home,
id,
datetime AS dt,
player,
resource
FROM TopTen t1
WHERE `datetime` = (SELECT
MAX(t2.datetime)
FROM TopTen t2
GROUP BY home)
GROUP BY `datetime`
ORDER BY `datetime` DESC
The result-set has 130 rows although database holds 187, indicating the result includes some duplicates of home.
Then I tried joining to a subquery that gets the maximum datetime for each row id:
-- 2 ..join
SELECT
s1.id,
s1.home,
s1.datetime,
s1.player,
s1.resource
FROM TopTen s1
JOIN (SELECT
id,
MAX(`datetime`) AS dt
FROM TopTen
GROUP BY id) AS s2
ON s1.id = s2.id
ORDER BY `datetime`
Nope. Gives all the records.
I tried various exotic queries, each with various results, but nothing that got me any closer to solving this problem.
You are so close! All you need to do is select BOTH the home and its max date time, then join back to the topten table on BOTH fields:
SELECT tt.*
FROM topten tt
INNER JOIN
(SELECT home, MAX(datetime) AS MaxDateTime
FROM topten
GROUP BY home) groupedtt
ON tt.home = groupedtt.home
AND tt.datetime = groupedtt.MaxDateTime
The fastest MySQL solution, without inner queries and without GROUP BY:
SELECT m.* -- get the row that contains the max value
FROM topten m -- "m" from "max"
LEFT JOIN topten b -- "b" from "bigger"
ON m.home = b.home -- match "max" row with "bigger" row by `home`
AND m.datetime < b.datetime -- want "bigger" than "max"
WHERE b.datetime IS NULL -- keep only if there is no bigger than max
Explanation:
Join the table with itself using the home column. The use of LEFT JOIN ensures all the rows from table m appear in the result set. Those that don't have a match in table b will have NULLs for the columns of b.
The other condition on the JOIN asks to match only the rows from b that have bigger value on the datetime column than the row from m.
Using the data posted in the question, the LEFT JOIN will produce these pairs:
+------------------------------------------+--------------------------------+
| the row from `m` | the matching row from `b` |
|------------------------------------------|--------------------------------|
| id home datetime player resource | id home datetime ... |
|----|-----|------------|--------|---------|------|------|------------|-----|
| 1 | 10 | 04/03/2009 | john | 399 | NULL | NULL | NULL | ... | *
| 2 | 11 | 04/03/2009 | juliet | 244 | NULL | NULL | NULL | ... | *
| 5 | 12 | 04/03/2009 | borat | 555 | NULL | NULL | NULL | ... | *
| 3 | 10 | 03/03/2009 | john | 300 | 1 | 10 | 04/03/2009 | ... |
| 4 | 11 | 03/03/2009 | juliet | 200 | 2 | 11 | 04/03/2009 | ... |
| 6 | 12 | 03/03/2009 | borat | 500 | 5 | 12 | 04/03/2009 | ... |
| 7 | 13 | 24/12/2008 | borat | 600 | 8 | 13 | 01/01/2009 | ... |
| 8 | 13 | 01/01/2009 | borat | 700 | NULL | NULL | NULL | ... | *
+------------------------------------------+--------------------------------+
Finally, the WHERE clause keeps only the pairs that have NULLs in the columns of b (they are marked with * in the table above); this means, due to the second condition from the JOIN clause, the row selected from m has the biggest value in column datetime.
Read the SQL Antipatterns: Avoiding the Pitfalls of Database Programming book for other SQL tips.
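The anti-join explained in this answer can be run as a small sketch. It assumes SQLite through Python's sqlite3 module instead of MySQL, stores the dates as ISO text (reading the question's values as day/month/year), and uses the hypothetical column name dt in place of datetime.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; `dt` replaces `datetime`.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE topten (id INTEGER PRIMARY KEY, home INTEGER,
                         dt TEXT, player TEXT, resource INTEGER);
    INSERT INTO topten VALUES
        (1, 10, '2009-03-04', 'john',   399),
        (2, 11, '2009-03-04', 'juliet', 244),
        (5, 12, '2009-03-04', 'borat',  555),
        (3, 10, '2009-03-03', 'john',   300),
        (4, 11, '2009-03-03', 'juliet', 200),
        (6, 12, '2009-03-03', 'borat',  500),
        (7, 13, '2008-12-24', 'borat',  600),
        (8, 13, '2009-01-01', 'borat',  700);
""")

latest = conn.execute("""
    SELECT m.id, m.home, m.dt, m.player, m.resource
    FROM topten m
    LEFT JOIN topten b            -- "b" is any later row for the same home
           ON m.home = b.home
          AND m.dt < b.dt
    WHERE b.id IS NULL            -- keep rows for which no later row exists
    ORDER BY m.home
""").fetchall()

for row in latest:
    print(row)
```

Testing b.id rather than b.dt for NULL is a small variation on the answer; any column of b that cannot be NULL in a real row serves the same purpose.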
Here is a T-SQL version:
-- Test data
DECLARE @TestTable TABLE (id INT, home INT, date DATETIME,
player VARCHAR(20), resource INT)
INSERT INTO @TestTable
SELECT 1, 10, '2009-03-04', 'john', 399 UNION
SELECT 2, 11, '2009-03-04', 'juliet', 244 UNION
SELECT 5, 12, '2009-03-04', 'borat', 555 UNION
SELECT 3, 10, '2009-03-03', 'john', 300 UNION
SELECT 4, 11, '2009-03-03', 'juliet', 200 UNION
SELECT 6, 12, '2009-03-03', 'borat', 500 UNION
SELECT 7, 13, '2008-12-24', 'borat', 600 UNION
SELECT 8, 13, '2009-01-01', 'borat', 700
-- Answer
SELECT id, home, date, player, resource
FROM (SELECT id, home, date, player, resource,
RANK() OVER (PARTITION BY home ORDER BY date DESC) N
FROM @TestTable
)M WHERE N = 1
-- and if you really want only home with max date
SELECT T.id, T.home, T.date, T.player, T.resource
FROM @TestTable T
INNER JOIN
( SELECT TI.id, TI.home, TI.date,
RANK() OVER (PARTITION BY TI.home ORDER BY TI.date) N
FROM @TestTable TI
WHERE TI.date IN (SELECT MAX(TM.date) FROM @TestTable TM)
)TJ ON TJ.N = 1 AND T.id = TJ.id
EDIT
Unfortunately, MySQL has no RANK() OVER function before version 8.0.
But it can be emulated; see Emulating Analytic (AKA Ranking) Functions with MySQL.
So this is the MySQL version:
SELECT id, home, date, player, resource
FROM TestTable AS t1
WHERE
(SELECT COUNT(*)
FROM TestTable AS t2
WHERE t2.home = t1.home AND t2.date > t1.date
) = 0
This will work even if you have two or more rows for each home with equal DATETIME's:
SELECT id, home, datetime, player, resource
FROM (
SELECT (
SELECT id
FROM topten ti
WHERE ti.home = t1.home
ORDER BY
ti.datetime DESC
LIMIT 1
) lid
FROM (
SELECT DISTINCT home
FROM topten
) t1
) ro, topten t2
WHERE t2.id = ro.lid
I think this will give you the desired result:
SELECT home, MAX(datetime)
FROM my_table
GROUP BY home
BUT if you need other columns as well, just make a join with the original table (check Michael La Voie answer)
Best regards.
Since people seem to keep running into this thread (comment date ranges from 1.5 year) isn't this much simpler:
SELECT * FROM (SELECT * FROM topten ORDER BY datetime DESC) tmp GROUP BY home
No aggregation functions needed...
Cheers.
You can also try this one; for large tables, query performance will be better. It works only when there are no more than two records for each home and their dates are different. The better general MySQL query is the one from Michael La Voie above.
SELECT t1.id, t1.home, t1.date, t1.player, t1.resource
FROM t_scores_1 t1
INNER JOIN t_scores_1 t2
ON t1.home = t2.home
WHERE t1.date > t2.date
Or in case of Postgres or those dbs that provide analytic functions try
SELECT t.* FROM
(SELECT t1.id, t1.home, t1.date, t1.player, t1.resource
, row_number() over (partition by t1.home order by t1.date desc) rw
FROM topten t1
INNER JOIN topten t2
ON t1.home = t2.home
WHERE t1.date > t2.date
) t
WHERE t.rw = 1
SELECT tt.*
FROM TestTable tt
INNER JOIN
(
SELECT coord, MAX(datetime) AS MaxDateTime
FROM TestTable
GROUP BY
coord
) groupedtt
ON tt.coord = groupedtt.coord
AND tt.datetime = groupedtt.MaxDateTime
This works on Oracle:
with table_max as(
select id
, home
, datetime
, player
, resource
, max(datetime) over (partition by home) maxdatetime -- latest datetime per home
from topten
)
select id
, home
, datetime
, player
, resource
from table_max
where datetime = maxdatetime
Try this for SQL Server:
WITH cte AS (
SELECT home, MAX(year) AS year FROM Table1 GROUP BY home
)
SELECT * FROM Table1 a INNER JOIN cte ON a.home = cte.home AND a.year = cte.year
Here is a MySQL version that prints only one entry per group when several rows share the same MAX(datetime).
You could test here http://www.sqlfiddle.com/#!2/0a4ae/1
Sample Data
mysql> SELECT * from topten;
+------+------+---------------------+--------+----------+
| id | home | datetime | player | resource |
+------+------+---------------------+--------+----------+
| 1 | 10 | 2009-04-03 00:00:00 | john | 399 |
| 2 | 11 | 2009-04-03 00:00:00 | juliet | 244 |
| 3 | 10 | 2009-03-03 00:00:00 | john | 300 |
| 4 | 11 | 2009-03-03 00:00:00 | juliet | 200 |
| 5 | 12 | 2009-04-03 00:00:00 | borat | 555 |
| 6 | 12 | 2009-03-03 00:00:00 | borat | 500 |
| 7 | 13 | 2008-12-24 00:00:00 | borat | 600 |
| 8 | 13 | 2009-01-01 00:00:00 | borat | 700 |
| 9 | 10 | 2009-04-03 00:00:00 | borat | 700 |
| 10 | 11 | 2009-04-03 00:00:00 | borat | 700 |
| 12 | 12 | 2009-04-03 00:00:00 | borat | 700 |
+------+------+---------------------+--------+----------+
MySQL Version with User variable
SELECT *
FROM (
SELECT ord.*,
IF (@prev_home = ord.home, 0, 1) AS is_first_appear,
@prev_home := ord.home
FROM (
SELECT t1.id, t1.home, t1.player, t1.resource
FROM topten t1
INNER JOIN (
SELECT home, MAX(datetime) AS mx_dt
FROM topten
GROUP BY home
) x ON t1.home = x.home AND t1.datetime = x.mx_dt
ORDER BY home
) ord, (SELECT @prev_home := 0, @seq := 0) init
) y
WHERE is_first_appear = 1;
+------+------+--------+----------+-----------------+------------------------+
| id | home | player | resource | is_first_appear | @prev_home := ord.home |
+------+------+--------+----------+-----------------+------------------------+
| 9 | 10 | borat | 700 | 1 | 10 |
| 10 | 11 | borat | 700 | 1 | 11 |
| 12 | 12 | borat | 700 | 1 | 12 |
| 8 | 13 | borat | 700 | 1 | 13 |
+------+------+--------+----------+-----------------+------------------------+
4 rows in set (0.00 sec)
Accepted answer's output
SELECT tt.*
FROM topten tt
INNER JOIN
(
SELECT home, MAX(datetime) AS MaxDateTime
FROM topten
GROUP BY home
) groupedtt ON tt.home = groupedtt.home AND tt.datetime = groupedtt.MaxDateTime
+------+------+---------------------+--------+----------+
| id | home | datetime | player | resource |
+------+------+---------------------+--------+----------+
| 1 | 10 | 2009-04-03 00:00:00 | john | 399 |
| 2 | 11 | 2009-04-03 00:00:00 | juliet | 244 |
| 5 | 12 | 2009-04-03 00:00:00 | borat | 555 |
| 8 | 13 | 2009-01-01 00:00:00 | borat | 700 |
| 9 | 10 | 2009-04-03 00:00:00 | borat | 700 |
| 10 | 11 | 2009-04-03 00:00:00 | borat | 700 |
| 12 | 12 | 2009-04-03 00:00:00 | borat | 700 |
+------+------+---------------------+--------+----------+
7 rows in set (0.00 sec)
SELECT c1, c2, c3, c4, c5 FROM table1 WHERE c3 = (select max(c3) from table1)
SELECT * FROM table1 WHERE c3 = (select max(c3) from table1)
Another way to get the most recent row per group is a subquery that calculates a rank for each row within its group; you then keep only the rows with rank = 1:
select a.*
from topten a
where (
select count(*)
from topten b
where a.home = b.home
and a.`datetime` < b.`datetime`
) +1 = 1
Some comments ask: what if two rows have the same 'home' and 'datetime' values?
In that situation the query above fails and returns more than one row per home. To cover it, another criterion or column is needed to decide which of the tied rows should be taken. Judging by the sample data set, I assume there is a primary key column id that is set to auto increment. So we can use this column to pick the most recent row by tweaking the same query with a CASE expression, like this:
select a.*
from topten a
where (
select count(*)
from topten b
where a.home = b.home
and case
when a.`datetime` = b.`datetime`
then a.id < b.id
else a.`datetime` < b.`datetime`
end
) + 1 = 1
The query above picks the row with the highest id among rows that share the same datetime value.
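The tie-breaking behaviour can be verified with a small sketch. It assumes SQLite through Python's sqlite3 module rather than MySQL, uses the hypothetical column name dt in place of datetime, and loads the sample rows that contain tied dates for homes 10, 11 and 12.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; `dt` replaces `datetime`.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE topten (id INTEGER PRIMARY KEY, home INTEGER,
                         dt TEXT, player TEXT, resource INTEGER);
    INSERT INTO topten VALUES
        (1, 10, '2009-04-03', 'john', 399), (2, 11, '2009-04-03', 'juliet', 244),
        (3, 10, '2009-03-03', 'john', 300), (4, 11, '2009-03-03', 'juliet', 200),
        (5, 12, '2009-04-03', 'borat', 555), (6, 12, '2009-03-03', 'borat', 500),
        (7, 13, '2008-12-24', 'borat', 600), (8, 13, '2009-01-01', 'borat', 700),
        (9, 10, '2009-04-03', 'borat', 700), (10, 11, '2009-04-03', 'borat', 700),
        (12, 12, '2009-04-03', 'borat', 700);
""")

# Plain rank by dt only: tied rows all get rank 1, so homes repeat.
with_ties = conn.execute("""
    SELECT a.id, a.home
    FROM topten a
    WHERE (SELECT COUNT(*) FROM topten b
           WHERE a.home = b.home AND a.dt < b.dt) = 0
    ORDER BY a.home, a.id
""").fetchall()

# Rank with id as tie-breaker: on equal dt, a higher id counts as "later".
one_per_home = conn.execute("""
    SELECT a.id, a.home
    FROM topten a
    WHERE (SELECT COUNT(*) FROM topten b
           WHERE a.home = b.home
             AND CASE WHEN a.dt = b.dt THEN a.id < b.id
                      ELSE a.dt < b.dt END) = 0
    ORDER BY a.home
""").fetchall()

print(len(with_ties), with_ties)
print(one_per_home)
```

Counting the rows that outrank the current one and requiring zero is the same test as the "+ 1 = 1" form in the answer.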
Why not use:
SELECT home, MAX(datetime) AS MaxDateTime,player,resource FROM topten GROUP BY home
Did I miss something?
In MySQL 8.0 this can be achieved efficiently by using the row_number() window function with a common table expression.
(Here row_number() generates a sequence number for each row within every home, starting at 1, in descending order of date. So, for every home, the row with sequence number 1 is the one with the latest date. All that remains is to select the row with sequence number 1 for each home. That could be done by writing an outer query around this query, but a common table expression is used instead since it's more readable.)
Schema:
create TABLE TestTable(id INT, home INT, date DATETIME,
player VARCHAR(20), resource INT);
INSERT INTO TestTable
SELECT 1, 10, '2009-03-04', 'john', 399 UNION
SELECT 2, 11, '2009-03-04', 'juliet', 244 UNION
SELECT 5, 12, '2009-03-04', 'borat', 555 UNION
SELECT 3, 10, '2009-03-03', 'john', 300 UNION
SELECT 4, 11, '2009-03-03', 'juliet', 200 UNION
SELECT 6, 12, '2009-03-03', 'borat', 500 UNION
SELECT 7, 13, '2008-12-24', 'borat', 600 UNION
SELECT 8, 13, '2009-01-01', 'borat', 700
Query:
with cte as
(
select id, home, date , player, resource,
Row_Number()Over(Partition by home order by date desc) rownumber from TestTable
)
select id, home, date , player, resource from cte where rownumber=1
Output:
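| id | home | date                | player | resource |
|----|------|---------------------|--------|----------|
| 1  | 10   | 2009-03-04 00:00:00 | john   | 399      |
| 2  | 11   | 2009-03-04 00:00:00 | juliet | 244      |
| 5  | 12   | 2009-03-04 00:00:00 | borat  | 555      |
| 8  | 13   | 2009-01-01 00:00:00 | borat  | 700      |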
This works in SQL Server, and is the only solution I've seen that doesn't require subqueries or CTEs. I think this is the most elegant way to solve this kind of problem.
SELECT TOP 1 WITH TIES *
FROM TopTen
ORDER BY ROW_NUMBER() OVER (PARTITION BY home
ORDER BY [datetime] DESC)
In the ORDER BY clause, it uses a window function to generate & sort by a ROW_NUMBER - assigning a 1 value to the highest [datetime] for each [home].
SELECT TOP 1 WITH TIES will then select one record with the lowest ROW_NUMBER (which will be 1), as well as all records with a tying ROW_NUMBER (also 1)
As a consequence, you retrieve all data for each of the 1st ranked records - that is, all data for records with the highest [datetime] value with their given [home] value.
Try this
select * from mytable a join
(select home, max(datetime) datetime
from mytable
group by home) b
on a.home = b.home and a.datetime = b.datetime
Regards
K
@Michael The accepted answer works fine in most cases, but it fails in the case below.
If two rows have the same home and datetime, the query returns both rows rather than one row per home as required. To fix that, add DISTINCT to the query as below.
SELECT DISTINCT tt.home, groupedtt.MaxDateTime
FROM topten tt
INNER JOIN
(SELECT home, MAX(datetime) AS MaxDateTime
FROM topten
GROUP BY home) groupedtt
ON tt.home = groupedtt.home
AND tt.datetime = groupedtt.MaxDateTime
this is the query you need:
SELECT b.id, a.home,b.[datetime],b.player,a.resource FROM
(SELECT home,MAX(resource) AS resource FROM tbl_1 GROUP BY home) AS a
LEFT JOIN
(SELECT id,home,[datetime],player,resource FROM tbl_1) AS b
ON a.resource = b.resource WHERE a.home =b.home;
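Hopefully the query below gives the desired output:
-- The row number must be computed in a derived table; its alias cannot be filtered in the same SELECT
Select id, home, datetime, player, resource
from (Select id, home, datetime, player, resource,
row_number() over (Partition by home ORDER by datetime desc) as rn
from tablename) t
where rn = 1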
(NOTE: The answer of Michael is perfect for a situation where the target column datetime cannot have duplicate values for each distinct home.)
If your table has duplicate rows for the same home and datetime and you need to select only one row for each distinct home, here is my solution:
Your table needs one unique column (like id). If it doesn't have one, create a view and add a random column to it.
Use this query to select a single row for each unique home value. It selects the lowest id in case of duplicate datetime.
SELECT tt.*
FROM topten tt
INNER JOIN
(
SELECT MIN(tt2.id) AS min_id, tt2.home
FROM topten tt2
INNER JOIN
(
SELECT home, MAX(datetime) AS MaxDateTime
FROM topten
GROUP BY home) groupedtt2
ON tt2.home = groupedtt2.home
AND tt2.datetime = groupedtt2.MaxDateTime -- keep only rows at the latest datetime
GROUP BY tt2.home
) AS groupedtt
ON tt.id = groupedtt.min_id
The accepted answer doesn't work for me if there are 2 records with the same date and home: it returns 2 records after the join, while I need to select any one (random) of them. This query is used as a joined subquery, so simply adding LIMIT 1 is not possible there.
Here is how I reached the desired result. I don't know about its performance, however.
select SUBSTRING_INDEX(GROUP_CONCAT(id order by datetime desc separator ','),',',1) as id, home, MAX(datetime) as 'datetime'
from topten
group by (home)