SQL Find First Occurrence


I've been at this for about an hour now and am making little to no progress - thought I'd come here for some help/advice.
So, given a sample of my table:
+-----------+-----------------------------+--------------+
| MachineID | DateTime | AlertType |
+-----------+-----------------------------+--------------+
| 56 | 2015-10-05 00:00:23.0000000 | 2000 |
| 42 | 2015-10-05 00:01:26.0000000 | 1006 |
| 50 | 2015-10-05 00:08:33.0000000 | 1018 |
| 56 | 2015-10-05 00:08:48.0000000 | 2003 |
| 56 | 2015-10-05 00:10:15.0000000 | 2000 |
| 67 | 2015-10-05 00:11:59.0000000 | 3001 |
| 60 | 2015-10-05 00:13:02.0000000 | 1006 |
| 67 | 2015-10-05 00:13:08.0000000 | 3000 |
| 56 | 2015-10-05 00:13:09.0000000 | 2003 |
| 67 | 2015-10-05 00:14:50.0000000 | 1018 |
| 67 | 2015-10-05 00:15:00.0000000 | 1018 |
| 47 | 2015-10-05 00:16:55.0000000 | 1006 |
+-----------+-----------------------------+--------------+
How would I get the first occurrence of MachineID w/ an AlertType of 2000
and the last occurrence of the same MachineID w/ an AlertType of 2003?
Here is what I have tried - but it is not outputting what I expect.
SELECT *
FROM [Alerts] a
where
DateTime >= '2015-10-05 00:00:00'
AND DateTime <= '2015-10-06 00:00:00'
and not exists(
select b.MachineID
from [Alerts] b
where b.AlertType=a.AlertType and
b.MachineID<a.MachineID
)
order by a.DateTime ASC
EDIT: The above code doesn't get me what I want because I am not specifically telling it to search for AlertType = 2000 or AlertType = 2003, but even when I try that, I am still unable to gather my desired results.
Here is what I would like my output to display:
+-----------+-----------------------------+--------------+
| MachineID | DateTime | AlertType |
+-----------+-----------------------------+--------------+
| 56 | 2015-10-05 00:00:23.0000000 | 2000 |
| 56 | 2015-10-05 00:13:09.0000000 | 2003 |
+-----------+-----------------------------+--------------+
Any help with this would be greatly appreciated!

Not sure, but:
select * from [Table]
WHERE [DateTime] IN (
SELECT MIN([DateTime]) as [DateTime]
FROM [Table]
WHERE AlertType = 2000
GROUP BY MachineId
UNION ALL
SELECT MAX([DateTime]) as [DateTime]
FROM [Table]
WHERE AlertType = 2003
GROUP BY MachineId)
ORDER BY MachineId, AlertType
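One caveat with matching on [DateTime] alone is that a timestamp picked for one machine could also match a row belonging to another machine. Below is a minimal sketch of a variant that joins on machine, alert type and timestamp together. It runs against SQLite through Python's sqlite3 module purely as a test harness (the question itself targets SQL Server); the table name alerts, the column name dt and the helper first_and_last are assumptions made for illustration.

```python
import sqlite3

# Sample rows copied from the question (fractional seconds dropped).
ROWS = [
    (56, "2015-10-05 00:00:23", 2000), (42, "2015-10-05 00:01:26", 1006),
    (50, "2015-10-05 00:08:33", 1018), (56, "2015-10-05 00:08:48", 2003),
    (56, "2015-10-05 00:10:15", 2000), (67, "2015-10-05 00:11:59", 3001),
    (60, "2015-10-05 00:13:02", 1006), (67, "2015-10-05 00:13:08", 3000),
    (56, "2015-10-05 00:13:09", 2003), (67, "2015-10-05 00:14:50", 1018),
    (67, "2015-10-05 00:15:00", 1018), (47, "2015-10-05 00:16:55", 1006),
]


def first_and_last(conn):
    """First 2000 alert and last 2003 alert per machine."""
    sql = """
        SELECT a.MachineID, a.dt, a.AlertType
        FROM alerts a
        JOIN (
            SELECT MachineID, AlertType, MIN(dt) AS dt
            FROM alerts WHERE AlertType = 2000
            GROUP BY MachineID, AlertType
            UNION ALL
            SELECT MachineID, AlertType, MAX(dt) AS dt
            FROM alerts WHERE AlertType = 2003
            GROUP BY MachineID, AlertType
        ) pick
          ON pick.MachineID = a.MachineID   -- same machine,
         AND pick.AlertType = a.AlertType   -- same alert type,
         AND pick.dt = a.dt                 -- and the chosen timestamp
        ORDER BY a.MachineID, a.dt
    """
    return conn.execute(sql).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE alerts (MachineID INTEGER, dt TEXT, AlertType INTEGER)")
conn.executemany("INSERT INTO alerts VALUES (?, ?, ?)", ROWS)
result = first_and_last(conn)
for row in result:
    print(row)
```

With the sample data this yields the two rows for machine 56 shown in the desired output of the question.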

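It looks like your outer query takes all records between 2015-10-05 and 2015-10-06, sorted by date. The NOT EXISTS subquery then keeps only the rows for which no other row with the same AlertType has a smaller MachineID, so it never compares dates and never looks specifically at alert types 2000 and 2003.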
Looks like GSazheniuk has it right, but I am not sure if you just want the 2 records or everything that matches the MachineID and the two alerts?

Not sure what your attempt has to do with your question, but to answer this:
How would I get the first occurrence of MachineID w/ an AlertType of
2000 and the last occurrence of the same MachineID w/ an AlertType of
2003.
Simple:
SELECT * FROM (
SELECT TOP 1 * FROM Alerts WHERE AlertType = 2000 ORDER BY [DateTime] ASC
) first_alert
UNION ALL
SELECT * FROM (
SELECT TOP 1 * FROM Alerts WHERE AlertType = 2003 ORDER BY [DateTime] DESC
) last_alert

I think everyone misses that your alert type is NOT a deciding factor, but a supplemental.
This should give you what you are looking for. I walked through the whole process.
IF OBJECT_ID('tempdb..#alerts') IS NOT NULL DROP table #alerts
CREATE TABLE #alerts
(
MachineID int,
dte DATETIME,
alerttype int
)
INSERT INTO #alerts VALUES ('56','20151005 00:00:23','2000')
INSERT INTO #alerts VALUES ('42','20151005 00:01:26','1006')
INSERT INTO #alerts VALUES ('50','20151005 00:08:33','1018')
INSERT INTO #alerts VALUES ('56','20151005 00:08:48','2003')
INSERT INTO #alerts VALUES ('56','20151005 00:10:15','2000')
INSERT INTO #alerts VALUES ('67','20151005 00:11:59','3001')
INSERT INTO #alerts VALUES ('60','20151005 00:13:02','1006')
INSERT INTO #alerts VALUES ('67','20151005 00:13:08','3000')
INSERT INTO #alerts VALUES ('56','20151005 00:13:09','2003')
INSERT INTO #alerts VALUES ('67','20151005 00:14:50','1018')
INSERT INTO #alerts VALUES ('67','20151005 00:15:00','1018')
INSERT INTO #alerts VALUES ('47','20151005 00:16:55','1006')
GO
WITH rnk as ( --identifies the order of the records.
Select
MachineID,
dte = dte,
rnk = RANK() OVER (PARTITION BY MachineID ORDER BY dte ASC) --ranks each machine's rows by date (first to last)
FROM #alerts
),
agg as( --Pulls your first and last record
SELECT
MachineID,
frst = MIN(rnk),
lst = MAX(rnk)
FROM rnk
GROUP BY MachineID
)
SELECT
pop.MachineID,
pop.dte,
pop.alerttype
FROM #alerts pop
JOIN rnk r ON pop.MachineID = r.MachineID AND pop.dte = r.dte --the date join allows you to hook into your ranks
JOIN agg ON pop.MachineID = agg.MachineID
WHERE agg.frst = r.rnk OR agg.lst = r.rnk -- or clause can be replaced by two queries with a union all
ORDER BY 1,2 --viewability... machineID, date
I personally use CROSS APPLY to perform tasks like this, but CTEs are much more visually friendly for this exercise.

Related

Average of Days between ordered dates per group

+-------+-------+-----------+
| EmpID | PerID | VisitDate |
+-------+-------+-----------+
| 1 | 22 | 2/24/2017 |
| 1 | 22 | 3/25/2017 |
| 1 | 22 | 4/5/2017 |
| 2 | 33 | 5/6/2017 |
| 2 | 33 | 8/9/2017 |
| 2 | 33 | 6/7/2017 |
+-------+-------+-----------+
I am trying to find the latest visit date and average days between visits per EmpID. For Avg, I'll first have to order the days consecutively and then find the average.
E.g.: Avg. days for EmpID=1 and PerID=22 would be [29 (days between 2/24 and 3/25) + 11 (days between 3/25 and 4/5)] / 2 = 20 days.
Desired Output:
+-------+-------+----------+----------+
| EmpID | PerID | MaxVDate | AvgVDays |
+-------+-------+----------+----------+
| 1 | 22 | 4/5/2017 | 20 |
| 2 | 33 | 8/9/2017 | 47.5 |
+-------+-------+----------+----------+
Attempt:
SELECT
EmpID
,PerID
,MAX(VisitDate) AS MaxVDate
,--Dunno how to find average AS AvgVDays
FROM
T1
GROUP BY
EmpID
,PerID

You can use lag to get the previous date and compute the date difference. Then use avg window function to get the average days.
Select distinct empid,perid,maxVdate,avg(diff_with_prev) OVER(Partition by empid) as avgVDays
from (
SELECT EmpID,PerID
,MAX(VisitDate) OVER(Partition BY EmpID) AS MaxVDate
,DATEDIFF(DAY,LAG(VisitDate) OVER(Partition BY EmpID order by VisitDate), VisitDate) as diff_with_prev
FROM T1
) t

Here's an option...
IF OBJECT_ID('tempdb..#TestData', 'U') IS NOT NULL
DROP TABLE #TestData;
CREATE TABLE #TestData (
EmpID INT NOT NULL,
PerID INT NOT NULL,
VisitDate DATE NOT NULL
);
INSERT #TestData (EmpID, PerID, VisitDate) VALUES
(1, 22, '2/24/2017'),
(1, 22, '3/25/2017'),
(1, 22, '4/5/2017'),
(2, 33, '5/6/2017'),
(2, 33, '8/9/2017'),
(2, 33, '6/7/2017');
-- SELECT * FROM #TestData td;
SELECT
db.EmpID,
db.PerID,
AvgDays = AVG(db.DaysBetween * 1.0)
FROM (
SELECT
*,
DaysBetween = DATEDIFF(dd, LAG(td.VisitDate, 1) OVER (PARTITION BY td.EmpID, td.PerID ORDER BY td.VisitDate), td.VisitDate)
FROM
#TestData td
) db
GROUP BY
db.EmpID,
db.PerID;
Results...
EmpID PerID AvgDays
----------- ----------- ---------------------------------------
1 22 20.000000
2 33 47.500000

The task is much easier than you think. You get the average with (last visit - first visit) / (count visits - 1).
select
empid,
perid,
max(VisitDate) as MaxVDate,
datediff(day, min(VisitDate), max(VisitDate)) * 1.0 / (count(*) - 1) as avgvdays
from mytable
group by empid, perid
having count(*) > 1
order by empid, perid;
The multiplication with 1.0 is necessary in order to avoid integer division. (You could also cast to decimal instead.)
As the calculation only makes sense for empid/perid pairs with more than one entry (and in order to avoid division by zero), I have applied a corresponding HAVING clause.
Here is a test: http://rextester.com/AIFPA62612
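The shortcut works because consecutive gaps telescope: (d2 - d1) + (d3 - d2) + ... = d_last - d_first, and n visits have n - 1 gaps. A rough sketch that checks this on the sample data follows. It uses SQLite via Python's sqlite3 as a stand-in for SQL Server, so julianday() replaces DATEDIFF; the table name visits and both helper functions are assumptions made for illustration.

```python
import sqlite3
from datetime import date

# Sample visits from the question, rewritten as ISO dates.
VISITS = [
    (1, 22, "2017-02-24"), (1, 22, "2017-03-25"), (1, 22, "2017-04-05"),
    (2, 33, "2017-05-06"), (2, 33, "2017-08-09"), (2, 33, "2017-06-07"),
]


def avg_gap_sql(conn):
    """(last visit - first visit) / (number of visits - 1), per group."""
    sql = """
        SELECT EmpID, PerID, MAX(VisitDate) AS MaxVDate,
               (julianday(MAX(VisitDate)) - julianday(MIN(VisitDate)))
                   / (COUNT(*) - 1) AS AvgVDays
        FROM visits
        GROUP BY EmpID, PerID
        HAVING COUNT(*) > 1
        ORDER BY EmpID, PerID
    """
    return conn.execute(sql).fetchall()


def avg_gap_direct(rows):
    """Average of the consecutive gaps, computed the long way in Python."""
    groups = {}
    for emp, per, d in rows:
        groups.setdefault((emp, per), []).append(date.fromisoformat(d))
    averages = {}
    for key, dates in groups.items():
        dates.sort()
        gaps = [(b - a).days for a, b in zip(dates, dates[1:])]
        averages[key] = sum(gaps) / len(gaps)
    return averages


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (EmpID INTEGER, PerID INTEGER, VisitDate TEXT)")
conn.executemany("INSERT INTO visits VALUES (?, ?, ?)", VISITS)
sql_result = avg_gap_sql(conn)
direct_result = avg_gap_direct(VISITS)
print(sql_result)
print(direct_result)
```

Both approaches should agree with the desired output in the question (20 and 47.5 days).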

A very basic SQL issue I'm stuck with [duplicate]

I have a table of player performance:
CREATE TABLE TopTen (
id INT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
home INT UNSIGNED NOT NULL,
`datetime` DATETIME NOT NULL,
player VARCHAR(6) NOT NULL,
resource INT NOT NULL
);
What query will return the rows for each distinct home holding its maximum value of datetime? In other words, how can I filter by the maximum datetime (grouped by home) and still include other non-grouped, non-aggregate columns (such as player) in the result?
For this sample data:
INSERT INTO TopTen
(id, home, `datetime`, player, resource)
VALUES
(1, 10, '04/03/2009', 'john', 399),
(2, 11, '04/03/2009', 'juliet', 244),
(5, 12, '04/03/2009', 'borat', 555),
(3, 10, '03/03/2009', 'john', 300),
(4, 11, '03/03/2009', 'juliet', 200),
(6, 12, '03/03/2009', 'borat', 500),
(7, 13, '24/12/2008', 'borat', 600),
(8, 13, '01/01/2009', 'borat', 700)
;
the result should be:
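+----+------+------------+--------+----------+
| id | home | datetime   | player | resource |
+----+------+------------+--------+----------+
| 1  | 10   | 04/03/2009 | john   | 399      |
| 2  | 11   | 04/03/2009 | juliet | 244      |
| 5  | 12   | 04/03/2009 | borat  | 555      |
| 8  | 13   | 01/01/2009 | borat  | 700      |
+----+------+------------+--------+----------+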
I tried a subquery getting the maximum datetime for each home:
-- 1 ..by the MySQL manual:
SELECT DISTINCT
home,
id,
datetime AS dt,
player,
resource
FROM TopTen t1
WHERE `datetime` = (SELECT
MAX(t2.datetime)
FROM TopTen t2
GROUP BY home)
GROUP BY `datetime`
ORDER BY `datetime` DESC
The result-set has 130 rows although database holds 187, indicating the result includes some duplicates of home.
Then I tried joining to a subquery that gets the maximum datetime for each row id:
-- 2 ..join
SELECT
s1.id,
s1.home,
s1.datetime,
s1.player,
s1.resource
FROM TopTen s1
JOIN (SELECT
id,
MAX(`datetime`) AS dt
FROM TopTen
GROUP BY id) AS s2
ON s1.id = s2.id
ORDER BY `datetime`
Nope. Gives all the records.
I tried various exotic queries, each with various results, but nothing that got me any closer to solving this problem.

You are so close! All you need to do is select BOTH the home and its max date time, then join back to the topten table on BOTH fields:
SELECT tt.*
FROM topten tt
INNER JOIN
(SELECT home, MAX(datetime) AS MaxDateTime
FROM topten
GROUP BY home) groupedtt
ON tt.home = groupedtt.home
AND tt.datetime = groupedtt.MaxDateTime

The fastest MySQL solution, without inner queries and without GROUP BY:
SELECT m.* -- get the row that contains the max value
FROM topten m -- "m" from "max"
LEFT JOIN topten b -- "b" from "bigger"
ON m.home = b.home -- match "max" row with "bigger" row by `home`
AND m.datetime < b.datetime -- want "bigger" than "max"
WHERE b.datetime IS NULL -- keep only if there is no bigger than max
Explanation:
Join the table with itself using the home column. The use of LEFT JOIN ensures all the rows from table m appear in the result set. Those that don't have a match in table b will have NULLs for the columns of b.
The other condition on the JOIN asks to match only the rows from b that have bigger value on the datetime column than the row from m.
Using the data posted in the question, the LEFT JOIN will produce these pairs:
+------------------------------------------+--------------------------------+
| the row from `m` | the matching row from `b` |
|------------------------------------------|--------------------------------|
| id home datetime player resource | id home datetime ... |
|----|-----|------------|--------|---------|------|------|------------|-----|
| 1 | 10 | 04/03/2009 | john | 399 | NULL | NULL | NULL | ... | *
| 2 | 11 | 04/03/2009 | juliet | 244 | NULL | NULL | NULL | ... | *
| 5 | 12 | 04/03/2009 | borat | 555 | NULL | NULL | NULL | ... | *
| 3 | 10 | 03/03/2009 | john | 300 | 1 | 10 | 04/03/2009 | ... |
| 4 | 11 | 03/03/2009 | juliet | 200 | 2 | 11 | 04/03/2009 | ... |
| 6 | 12 | 03/03/2009 | borat | 500 | 5 | 12 | 04/03/2009 | ... |
| 7 | 13 | 24/12/2008 | borat | 600 | 8 | 13 | 01/01/2009 | ... |
| 8 | 13 | 01/01/2009 | borat | 700 | NULL | NULL | NULL | ... | *
+------------------------------------------+--------------------------------+
Finally, the WHERE clause keeps only the pairs that have NULLs in the columns of b (they are marked with * in the table above); this means, due to the second condition from the JOIN clause, the row selected from m has the biggest value in column datetime.
Read the SQL Antipatterns: Avoiding the Pitfalls of Database Programming book for other SQL tips.
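The LEFT JOIN explanation above can be tried out with a small sketch. It runs the same anti-join against SQLite through Python's sqlite3 module rather than MySQL, which is an assumption of this example, as are the column name dt (standing in for datetime), the ISO-formatted dates and the helper latest_per_home.

```python
import sqlite3

# Sample data from the question, with dates written as ISO strings so
# that plain string comparison orders them chronologically.
ROWS = [
    (1, 10, "2009-03-04", "john", 399),
    (2, 11, "2009-03-04", "juliet", 244),
    (5, 12, "2009-03-04", "borat", 555),
    (3, 10, "2009-03-03", "john", 300),
    (4, 11, "2009-03-03", "juliet", 200),
    (6, 12, "2009-03-03", "borat", 500),
    (7, 13, "2008-12-24", "borat", 600),
    (8, 13, "2009-01-01", "borat", 700),
]


def latest_per_home(conn):
    """Keep rows of m for which no row b of the same home is newer."""
    sql = """
        SELECT m.id, m.home, m.dt, m.player, m.resource
        FROM topten m
        LEFT JOIN topten b
               ON m.home = b.home   -- pair rows of the same home
              AND m.dt < b.dt       -- where b is strictly newer than m
        WHERE b.id IS NULL          -- no newer row found: m is the latest
        ORDER BY m.id
    """
    return conn.execute(sql).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE topten (id INTEGER PRIMARY KEY, home INTEGER, "
    "dt TEXT, player TEXT, resource INTEGER)"
)
conn.executemany("INSERT INTO topten VALUES (?, ?, ?, ?, ?)", ROWS)
latest = latest_per_home(conn)
for row in latest:
    print(row)
```

The surviving rows are the ones marked with * in the table above (ids 1, 2, 5 and 8).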

Here goes T-SQL version:
-- Test data
DECLARE @TestTable TABLE (id INT, home INT, date DATETIME,
player VARCHAR(20), resource INT)
INSERT INTO @TestTable
SELECT 1, 10, '2009-03-04', 'john', 399 UNION
SELECT 2, 11, '2009-03-04', 'juliet', 244 UNION
SELECT 5, 12, '2009-03-04', 'borat', 555 UNION
SELECT 3, 10, '2009-03-03', 'john', 300 UNION
SELECT 4, 11, '2009-03-03', 'juliet', 200 UNION
SELECT 6, 12, '2009-03-03', 'borat', 500 UNION
SELECT 7, 13, '2008-12-24', 'borat', 600 UNION
SELECT 8, 13, '2009-01-01', 'borat', 700
-- Answer
SELECT id, home, date, player, resource
FROM (SELECT id, home, date, player, resource,
RANK() OVER (PARTITION BY home ORDER BY date DESC) N
FROM @TestTable
)M WHERE N = 1
-- and if you really want only home with max date
SELECT T.id, T.home, T.date, T.player, T.resource
FROM @TestTable T
INNER JOIN
( SELECT TI.id, TI.home, TI.date,
RANK() OVER (PARTITION BY TI.home ORDER BY TI.date) N
FROM @TestTable TI
WHERE TI.date IN (SELECT MAX(TM.date) FROM @TestTable TM)
)TJ ON TJ.N = 1 AND T.id = TJ.id
EDIT
Unfortunately, there is no RANK() OVER function in MySQL before version 8.0.
But it can be emulated, see Emulating Analytic (AKA Ranking) Functions with MySQL.
So this is MySQL version:
SELECT id, home, date, player, resource
FROM TestTable AS t1
WHERE
(SELECT COUNT(*)
FROM TestTable AS t2
WHERE t2.home = t1.home AND t2.date > t1.date
) = 0

This will work even if you have two or more rows for each home with equal DATETIME's:
SELECT id, home, datetime, player, resource
FROM (
SELECT (
SELECT id
FROM topten ti
WHERE ti.home = t1.home
ORDER BY
ti.datetime DESC
LIMIT 1
) lid
FROM (
SELECT DISTINCT home
FROM topten
) t1
) ro, topten t2
WHERE t2.id = ro.lid

I think this will give you the desired result:
SELECT home, MAX(datetime)
FROM my_table
GROUP BY home
BUT if you need other columns as well, just make a join with the original table (check Michael La Voie answer)
Best regards.

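Since people seem to keep running into this thread (the comments span about 1.5 years), isn't this much simpler:
SELECT * FROM (SELECT * FROM topten ORDER BY datetime DESC) tmp GROUP BY home
No aggregate functions needed, although this relies on MySQL returning the first row of each group, which is not guaranteed, and the query is rejected when ONLY_FULL_GROUP_BY is enabled.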
Cheers.

You can also try this one; for large tables query performance will be better. It works only when there are no more than two records for each home and their dates are different. The better general MySQL query is the one from Michael La Voie above.
SELECT t1.id, t1.home, t1.date, t1.player, t1.resource
FROM t_scores_1 t1
INNER JOIN t_scores_1 t2
ON t1.home = t2.home
WHERE t1.date > t2.date
Or in case of Postgres or those dbs that provide analytic functions try
SELECT t.* FROM
(SELECT t1.id, t1.home, t1.date, t1.player, t1.resource
, row_number() over (partition by t1.home order by t1.date desc) rw
FROM topten t1
INNER JOIN topten t2
ON t1.home = t2.home
WHERE t1.date > t2.date
) t
WHERE t.rw = 1

SELECT tt.*
FROM rapsa tt
INNER JOIN
(
SELECT coord, MAX(datetime) AS MaxDateTime
FROM rapsa
GROUP BY
coord
) groupedtt
ON tt.coord = groupedtt.coord
AND tt.datetime = groupedtt.MaxDateTime

This works on Oracle:
with table_max as(
select id
, home
, datetime
, player
, resource
, max(datetime) over (partition by home) maxdatetime
from table
)
select id
, home
, datetime
, player
, resource
from table_max
where datetime = maxdatetime

Try this for SQL Server:
WITH cte AS (
SELECT home, MAX(year) AS year FROM Table1 GROUP BY home
)
SELECT * FROM Table1 a INNER JOIN cte ON a.home = cte.home AND a.year = cte.year

Here is a MySQL version which returns only one entry when a group contains duplicate MAX(datetime) values.
You could test here http://www.sqlfiddle.com/#!2/0a4ae/1
Sample Data
mysql> SELECT * from topten;
+------+------+---------------------+--------+----------+
| id | home | datetime | player | resource |
+------+------+---------------------+--------+----------+
| 1 | 10 | 2009-04-03 00:00:00 | john | 399 |
| 2 | 11 | 2009-04-03 00:00:00 | juliet | 244 |
| 3 | 10 | 2009-03-03 00:00:00 | john | 300 |
| 4 | 11 | 2009-03-03 00:00:00 | juliet | 200 |
| 5 | 12 | 2009-04-03 00:00:00 | borat | 555 |
| 6 | 12 | 2009-03-03 00:00:00 | borat | 500 |
| 7 | 13 | 2008-12-24 00:00:00 | borat | 600 |
| 8 | 13 | 2009-01-01 00:00:00 | borat | 700 |
| 9 | 10 | 2009-04-03 00:00:00 | borat | 700 |
| 10 | 11 | 2009-04-03 00:00:00 | borat | 700 |
| 12 | 12 | 2009-04-03 00:00:00 | borat | 700 |
+------+------+---------------------+--------+----------+
MySQL Version with User variable
SELECT *
FROM (
SELECT ord.*,
IF (@prev_home = ord.home, 0, 1) AS is_first_appear,
@prev_home := ord.home
FROM (
SELECT t1.id, t1.home, t1.player, t1.resource
FROM topten t1
INNER JOIN (
SELECT home, MAX(datetime) AS mx_dt
FROM topten
GROUP BY home
) x ON t1.home = x.home AND t1.datetime = x.mx_dt
ORDER BY home
) ord, (SELECT @prev_home := 0, @seq := 0) init
) y
WHERE is_first_appear = 1;
+------+------+--------+----------+-----------------+------------------------+
| id | home | player | resource | is_first_appear | @prev_home := ord.home |
+------+------+--------+----------+-----------------+------------------------+
| 9 | 10 | borat | 700 | 1 | 10 |
| 10 | 11 | borat | 700 | 1 | 11 |
| 12 | 12 | borat | 700 | 1 | 12 |
| 8 | 13 | borat | 700 | 1 | 13 |
+------+------+--------+----------+-----------------+------------------------+
4 rows in set (0.00 sec)
Accepted answer's output
SELECT tt.*
FROM topten tt
INNER JOIN
(
SELECT home, MAX(datetime) AS MaxDateTime
FROM topten
GROUP BY home
) groupedtt ON tt.home = groupedtt.home AND tt.datetime = groupedtt.MaxDateTime
+------+------+---------------------+--------+----------+
| id | home | datetime | player | resource |
+------+------+---------------------+--------+----------+
| 1 | 10 | 2009-04-03 00:00:00 | john | 399 |
| 2 | 11 | 2009-04-03 00:00:00 | juliet | 244 |
| 5 | 12 | 2009-04-03 00:00:00 | borat | 555 |
| 8 | 13 | 2009-01-01 00:00:00 | borat | 700 |
| 9 | 10 | 2009-04-03 00:00:00 | borat | 700 |
| 10 | 11 | 2009-04-03 00:00:00 | borat | 700 |
| 12 | 12 | 2009-04-03 00:00:00 | borat | 700 |
+------+------+---------------------+--------+----------+
7 rows in set (0.00 sec)

SELECT c1, c2, c3, c4, c5 FROM table1 WHERE c3 = (select max(c3) from table1)
SELECT * FROM table1 WHERE c3 = (select max(c3) from table1)

Another way to get the most recent row per group is a correlated subquery that calculates a rank for each row within its group; you then keep only the rows with rank = 1:
select a.*
from topten a
where (
select count(*)
from topten b
where a.home = b.home
and a.`datetime` < b.`datetime`
) +1 = 1
Some comments ask: what if two rows have the same 'home' and 'datetime' values?
The query above then returns more than one row for that home. To handle this case, another criterion/column is needed to decide which of the tied rows to keep. From the sample data set I assume there is an auto-increment primary key column id, so we can use it to pick the most recent row by adding a CASE expression to the same query:
select a.*
from topten a
where (
select count(*)
from topten b
where a.home = b.home
and case
when a.`datetime` = b.`datetime`
then a.id < b.id
else a.`datetime` < b.`datetime`
end
) + 1 = 1
Above query will pick the row with highest id among the same datetime values
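To see the effect of the tie-breaker, here is a small sketch that runs both variants of the counting query on the sample data with duplicate timestamps posted in an earlier answer. It uses SQLite through Python's sqlite3 module instead of MySQL, and the column name dt plus the two helper functions are assumptions made for this illustration.

```python
import sqlite3

# (id, home, dt): homes 10, 11 and 12 each have two rows sharing the
# latest timestamp.
ROWS = [
    (1, 10, "2009-04-03"), (2, 11, "2009-04-03"), (3, 10, "2009-03-03"),
    (4, 11, "2009-03-03"), (5, 12, "2009-04-03"), (6, 12, "2009-03-03"),
    (7, 13, "2008-12-24"), (8, 13, "2009-01-01"), (9, 10, "2009-04-03"),
    (10, 11, "2009-04-03"), (12, 12, "2009-04-03"),
]


def latest_without_tiebreak(conn):
    """Rows with no strictly newer row in the same home (ties survive)."""
    sql = """
        SELECT a.id FROM topten a
        WHERE (SELECT COUNT(*) FROM topten b
               WHERE a.home = b.home AND a.dt < b.dt) = 0
        ORDER BY a.id
    """
    return [r[0] for r in conn.execute(sql)]


def latest_with_tiebreak(conn):
    """Same idea, but on equal timestamps the higher id counts as newer."""
    sql = """
        SELECT a.id FROM topten a
        WHERE (SELECT COUNT(*) FROM topten b
               WHERE a.home = b.home
                 AND CASE WHEN a.dt = b.dt THEN a.id < b.id
                          ELSE a.dt < b.dt END) = 0
        ORDER BY a.home
    """
    return [r[0] for r in conn.execute(sql)]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE topten (id INTEGER PRIMARY KEY, home INTEGER, dt TEXT)")
conn.executemany("INSERT INTO topten VALUES (?, ?, ?)", ROWS)
tied = latest_without_tiebreak(conn)
unique = latest_with_tiebreak(conn)
print(tied)
print(unique)
```

Without the tie-breaker seven rows come back; with it, exactly one row per home remains.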

Why not use:
SELECT home, MAX(datetime) AS MaxDateTime,player,resource FROM topten GROUP BY home
Did I miss something?

In MySQL 8.0 this can be achieved efficiently by using the row_number() window function with a common table expression.
(Here row_number() generates a sequence number for each row within every home, starting with 1, in descending order of date. So, for every home, the row with sequence number 1 is the one with the latest date. All we need to do is select the row with sequence number 1 for each home. This could be done by writing an outer query around this query, but a common table expression is more readable.)
Schema:
create TABLE TestTable(id INT, home INT, date DATETIME,
player VARCHAR(20), resource INT);
INSERT INTO TestTable
SELECT 1, 10, '2009-03-04', 'john', 399 UNION
SELECT 2, 11, '2009-03-04', 'juliet', 244 UNION
SELECT 5, 12, '2009-03-04', 'borat', 555 UNION
SELECT 3, 10, '2009-03-03', 'john', 300 UNION
SELECT 4, 11, '2009-03-03', 'juliet', 200 UNION
SELECT 6, 12, '2009-03-03', 'borat', 500 UNION
SELECT 7, 13, '2008-12-24', 'borat', 600 UNION
SELECT 8, 13, '2009-01-01', 'borat', 700
Query:
with cte as
(
select id, home, date , player, resource,
Row_Number()Over(Partition by home order by date desc) rownumber from TestTable
)
select id, home, date , player, resource from cte where rownumber=1
Output:
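+----+------+---------------------+--------+----------+
| id | home | date                | player | resource |
+----+------+---------------------+--------+----------+
| 1  | 10   | 2009-03-04 00:00:00 | john   | 399      |
| 2  | 11   | 2009-03-04 00:00:00 | juliet | 244      |
| 5  | 12   | 2009-03-04 00:00:00 | borat  | 555      |
| 8  | 13   | 2009-01-01 00:00:00 | borat  | 700      |
+----+------+---------------------+--------+----------+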

This works in SQLServer, and is the only solution I've seen that doesn't require subqueries or CTEs - I think this is the most elegant way to solve this kind of problem.
SELECT TOP 1 WITH TIES *
FROM TopTen
ORDER BY ROW_NUMBER() OVER (PARTITION BY home
ORDER BY [datetime] DESC)
In the ORDER BY clause, it uses a window function to generate & sort by a ROW_NUMBER - assigning a 1 value to the highest [datetime] for each [home].
SELECT TOP 1 WITH TIES will then select one record with the lowest ROW_NUMBER (which will be 1), as well as all records with a tying ROW_NUMBER (also 1)
As a consequence, you retrieve all data for each of the 1st ranked records - that is, all data for records with the highest [datetime] value with their given [home] value.

Try this
select * from mytable a join
(select home, max(datetime) datetime
from mytable
group by home) b
on a.home = b.home and a.datetime = b.datetime
Regards
K

@Michael The accepted answer works fine in most cases, but it fails in the following one.
If two rows have the same home and datetime, the query returns both rows rather than one row per home as required. To fix that, add DISTINCT to the query as below.
SELECT DISTINCT tt.home, groupedtt.MaxDateTime
FROM topten tt
INNER JOIN
(SELECT home, MAX(datetime) AS MaxDateTime
FROM topten
GROUP BY home) groupedtt
ON tt.home = groupedtt.home
AND tt.datetime = groupedtt.MaxDateTime

this is the query you need:
SELECT b.id, a.home,b.[datetime],b.player,a.resource FROM
(SELECT home,MAX(resource) AS resource FROM tbl_1 GROUP BY home) AS a
LEFT JOIN
(SELECT id,home,[datetime],player,resource FROM tbl_1) AS b
ON a.resource = b.resource WHERE a.home =b.home;

Hope below query will give the desired output:
Select id, home, datetime, player, resource from (
Select id, home, datetime, player, resource, row_number() over (Partition by home ORDER by datetime desc) as rownum from tablename
) t where rownum = 1

(NOTE: The answer of Michael is perfect for a situation where the target column datetime cannot have duplicate values for each distinct home.)
If your table has several rows with the same home and datetime and you need only one row for each distinct home, here is my solution:
Your table needs one unique column (like id). If it doesn't, create a view and add a random column to it.
Use this query to select a single row for each unique home value. Selects the lowest id in case of duplicate datetime.
SELECT tt.*
FROM topten tt
INNER JOIN
(
SELECT min(tt2.id) as min_id, tt2.home from topten tt2
INNER JOIN
(
SELECT home, MAX(datetime) AS MaxDateTime
FROM topten
GROUP BY home) groupedtt2
ON tt2.home = groupedtt2.home
AND tt2.datetime = groupedtt2.MaxDateTime
GROUP BY tt2.home
) as groupedtt
ON tt.id = groupedtt.min_id

Accepted answer doesn't work for me if there are 2 records with same date and home. It will return 2 records after join. While I need to select any (random) of them. This query is used as joined subquery so just limit 1 is not possible there.
Here is how I reached desired result. Don't know about performance however.
select SUBSTRING_INDEX(GROUP_CONCAT(id order by datetime desc separator ','),',',1) as id, home, MAX(datetime) as 'datetime'
from topten
group by (home)

How to transform rows into column? [duplicate]

I have a table like this, and there are only two features for every user in this table
+-------+---------+-----------+----------+
| User | Feature | StartDate | EndDate |
+-------+---------+-----------+----------+
| Peter | F1 | 2015/1/1 | 2015/2/1 |
| Peter | F2 | 2015/3/1 | 2015/4/1 |
| John | F1 | 2015/5/1 | 2015/6/1 |
| John | F2 | 2015/7/1 | 2015/8/1 |
+-------+---------+-----------+----------+
I want to transform to
+-------+--------------+------------+--------------+------------+
| User | F1_StartDate | F1_EndDate | F2_StartDate | F2_EndDate |
+-------+--------------+------------+--------------+------------+
| Peter | 2015/1/1 | 2015/2/1 | 2015/3/1 | 2015/4/1 |
| John | 2015/5/1 | 2015/6/1 | 2015/7/1 | 2015/8/1 |
+-------+--------------+------------+--------------+------------+

If you are using SQL Server 2005 or up by any chance, PIVOT is what you are looking for.

The best general way to perform this sort of operation is a simple GROUP BY statement. This should work across all major RDBMS:
select user,
max(case when feature='F1' then StartDate else null end) F1_StartDate,
max(case when feature='F1' then EndDate else null end) F1_EndDate,
max(case when feature='F2' then StartDate else null end) F2_StartDate,
max(case when feature='F2' then EndDate else null end) F2_EndDate
from table
group by user
Note: as mentioned in the comments, this is often bad practice, as depending on your needs, it can make the data harder to work with. However, there are cases where it makes sense, when you have a small, limited number of values.
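As a quick check of the conditional-aggregation approach, the sketch below runs it on the sample data from the question. It uses SQLite via Python's sqlite3 module as a convenient harness; the table name features, the column name UserName (chosen to avoid the reserved word user) and the helper pivot_features are assumptions made for illustration.

```python
import sqlite3

ROWS = [
    ("Peter", "F1", "2015/1/1", "2015/2/1"),
    ("Peter", "F2", "2015/3/1", "2015/4/1"),
    ("John", "F1", "2015/5/1", "2015/6/1"),
    ("John", "F2", "2015/7/1", "2015/8/1"),
]


def pivot_features(conn):
    """One row per user; each CASE picks the value for one target column."""
    sql = """
        SELECT UserName,
               MAX(CASE WHEN Feature = 'F1' THEN StartDate END) AS F1_StartDate,
               MAX(CASE WHEN Feature = 'F1' THEN EndDate END)   AS F1_EndDate,
               MAX(CASE WHEN Feature = 'F2' THEN StartDate END) AS F2_StartDate,
               MAX(CASE WHEN Feature = 'F2' THEN EndDate END)   AS F2_EndDate
        FROM features
        GROUP BY UserName
        ORDER BY UserName
    """
    return conn.execute(sql).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE features (UserName TEXT, Feature TEXT, "
    "StartDate TEXT, EndDate TEXT)"
)
conn.executemany("INSERT INTO features VALUES (?, ?, ?, ?)", ROWS)
pivoted = pivot_features(conn)
for row in pivoted:
    print(row)
```

Because each user has exactly one row per feature, MAX simply returns the single non-NULL value that the CASE expression lets through.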

This is a bit of a hack with a CTE:
;WITH CTE AS (
SELECT [User], [Feature] + '_StartDate' AS [Type], StartDate AS [Date]
FROM Table1
UNION ALL
SELECT [User], [Feature] + '_EndDate' AS [Type], EndDate AS [Date]
FROM Table1)
SELECT * FROM CTE
PIVOT(MAX([Date]) FOR [Type] IN ([F1_StartDate],[F2_StartDate], [F1_EndDate], [F2_EndDate])) PIV

Use UNPIVOT & PIVOT like this:
Test data:
DECLARE @t table
(User1 varchar(20),Feature char(2),StartDate date,EndDate date)
INSERT @t values
('Pete','F1','2015/1/1 ','2015/2/1'),
('Pete','F2','2015/3/1 ','2015/4/1'),
('John','F1','2015/5/1 ','2015/6/1'),
('John','F2','2015/7/1 ','2015/8/1')
Query:
;WITH CTE AS
(
SELECT User1, date1, Feature + '_' + Seq cat
FROM @t as p
UNPIVOT
(date1 FOR Seq IN
([StartDate], [EndDate]) ) AS unpvt
)
SELECT * FROM CTE
PIVOT
(MIN(date1)
FOR cat
IN ([F1_StartDate],[F1_EndDate],[F2_StartDate],[F2_EndDate])
) as p
Result:
User1 F1_StartDate F1_EndDate F2_StartDate F2_EndDate
John 2015-05-01 2015-06-01 2015-07-01 2015-08-01
Pete 2015-01-01 2015-02-01 2015-03-01 2015-04-01

How to optimise a query that uses multiple sub select statements

was hoping some-one could help me with this:
My table is:
id Version datetime name resource
---|--------|------------|--------|---------
1 | 1 | 03/03/2009 | con1 | 399
2 | 2 | 03/03/2009 | con1 | 244
3 | 3 | 01/03/2009 | con1 | 555
4 | 1 | 03/03/2009 | con2 | 200
5 | 2 | 03/03/2009 | con2 | 500
6 | 3 | 04/03/2009 | con2 | 600
7 | 4 | 31/03/2009 | con2 | 700
I need to select each distinct "name" that has greatest value of "datetime" that less than or equal to a given date; and where the version is the maximum version if there are multiple records that satisfy the first condition.
The result if the given date were '04/03/2009' would be:
id Version datetime name resource
---|--------|------------|--------|---------
2 | 2 | 03/03/2009 | con1 | 244
6 | 3 | 04/03/2009 | con2 | 600
Currently I've created the following query, which works, but I suspect it's not the best when it comes to performance when run on a large table:
SELECT [id], [Version], [datetime], [name], [resource]
FROM theTable
WHERE [Version] =
(
SELECT MAX(Version) FROM theTable AS theTable2 WHERE theTable.[name] = theTable2.[name]
AND theTable2.[datetime] =
(
SELECT MAX(theTable3.[datetime]) FROM theTable AS theTable3
WHERE theTable2.[name] = theTable3.[name] AND theTable3.[datetime] <= '04/03/2009'
)
)
I'd appreciate if some-one could suggest a more efficient way to do this; and if possible, provide an example:-).
Thanks in advance.

You can use PARTITION BY. This lets you basically rank the results. In your instance, you then want to select only the result with ranking 1. First, filter out the results with invalid date times (using WHERE), then in the partition, order by the columns descending (thus, the first result would be the one with the maximum datetime, and, in case of datetime tie, the maximum version as well.)
SELECT [id], [Version], [datetime], [name], [resource]
FROM
(
SELECT [id], [Version], [datetime], [name], [resource], row_number()
OVER (PARTITION BY [name] ORDER BY [datetime] DESC, [Version] DESC) as groupIndex
FROM theTable
WHERE [datetime] <= '04/03/2009'
) AS t
WHERE groupIndex = 1
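A small sketch of the same filter-then-rank idea on the sample data follows, run against SQLite through Python's sqlite3 module. It assumes the bundled SQLite is 3.25 or newer (needed for window functions), rewrites the dates as ISO strings, and uses the column name dt and the helper latest_version_as_of, all of which are illustrative choices rather than part of the original question.

```python
import sqlite3

# (id, Version, dt, name, resource) with dd/mm/yyyy dates converted to ISO.
ROWS = [
    (1, 1, "2009-03-03", "con1", 399),
    (2, 2, "2009-03-03", "con1", 244),
    (3, 3, "2009-03-01", "con1", 555),
    (4, 1, "2009-03-03", "con2", 200),
    (5, 2, "2009-03-03", "con2", 500),
    (6, 3, "2009-03-04", "con2", 600),
    (7, 4, "2009-03-31", "con2", 700),
]


def latest_version_as_of(conn, cutoff):
    """Per name: newest row on or before cutoff, highest Version on ties."""
    sql = """
        SELECT id, Version, dt, name, resource
        FROM (
            SELECT id, Version, dt, name, resource,
                   ROW_NUMBER() OVER (
                       PARTITION BY name
                       ORDER BY dt DESC, Version DESC
                   ) AS groupIndex
            FROM theTable
            WHERE dt <= ?          -- drop rows after the cutoff first
        ) AS t
        WHERE groupIndex = 1       -- then keep the top-ranked row per name
        ORDER BY name
    """
    return conn.execute(sql, (cutoff,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE theTable (id INTEGER PRIMARY KEY, Version INTEGER, "
    "dt TEXT, name TEXT, resource INTEGER)"
)
conn.executemany("INSERT INTO theTable VALUES (?, ?, ?, ?, ?)", ROWS)
picked = latest_version_as_of(conn, "2009-03-04")
for row in picked:
    print(row)
```

For the cutoff 04/03/2009 this returns ids 2 and 6, matching the expected result in the question.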

SQL Order By and "Not-So-Much Group"

Lets say I have a table:
--------------------------------------
| ID | DATE | GROUP | RESULT |
--------------------------------------
| 1 | 01/06 | Group1 | 12345 |
| 2 | 01/05 | Group2 | 54321 |
| 3 | 01/04 | Group1 | 11111 |
--------------------------------------
I want to order the result by the most recent date at the top but group the "group" column together, but still have distinct entries. The result that I want would be:
1 | 01/06 | Group1 | 12345
3 | 01/04 | Group1 | 11111
2 | 01/05 | Group2 | 54321
What would be a query to get that result?
thank you!
EDIT:
I'm using MSSQL. I'll look into translating the oracle query into MS SQL and report my results.
EDIT
SQL Server 2000, so OVER/PARTITION is not supported =[
Thank you!

You should specify what RDBMS you are using. This answer is for Oracle, may not work in other systems.
SELECT * FROM table
ORDER BY MAX(date) OVER (PARTITION BY group) DESC, group, date DESC

declare @table table (
ID int not null,
[DATE] smalldatetime not null,
[GROUP] varchar(10) not null,
[RESULT] varchar(10) not null
)
insert @table values (1, '2009-01-06', 'Group1', '12345')
insert @table values (2, '2009-01-05', 'Group2', '12345')
insert @table values (3, '2009-01-04', 'Group1', '12345')
select t.*
from @table t
inner join (
select
max([date]) as [order-date],
[GROUP]
from @table orderer
group by
[GROUP]
) x
on t.[GROUP] = x.[GROUP]
order by
x.[order-date] desc,
t.[GROUP],
t.[DATE] desc
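The join above avoids OVER/PARTITION, so it should also suit older servers. A minimal sketch of the same ordering on the question's three rows is shown below, using SQLite through Python's sqlite3 module as a stand-in for SQL Server 2000; the table name results, the column names d and grp (chosen to avoid reserved words) and the helper ordered_by_group_recency are assumptions made for illustration.

```python
import sqlite3

ROWS = [
    (1, "2009-01-06", "Group1", "12345"),
    (2, "2009-01-05", "Group2", "54321"),
    (3, "2009-01-04", "Group1", "11111"),
]


def ordered_by_group_recency(conn):
    """Groups sorted by their newest date; rows inside a group newest first."""
    sql = """
        SELECT t.id, t.d, t.grp, t.result
        FROM results t
        JOIN (SELECT grp, MAX(d) AS newest
              FROM results
              GROUP BY grp) x
          ON x.grp = t.grp
        ORDER BY x.newest DESC,   -- group with the most recent date first
                 t.grp,           -- keep each group's rows together
                 t.d DESC         -- newest row first inside the group
    """
    return conn.execute(sql).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (id INTEGER, d TEXT, grp TEXT, result TEXT)")
conn.executemany("INSERT INTO results VALUES (?, ?, ?, ?)", ROWS)
ordered = ordered_by_group_recency(conn)
for row in ordered:
    print(row)
```

The rows come back in the order 1, 3, 2, which is the ordering requested in the question.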

use an order by clause with two params:
...order by group, date desc
this assumes that your date column does hold dates and not varchars

SELECT table2.myID,
table2.mydate,
table2.mygroup,
table2.myresult
FROM (SELECT DISTINCT mygroup FROM testtable as table1) as grouptable
JOIN testtable as table2
ON grouptable.mygroup = table2.mygroup
ORDER BY grouptable.mygroup,table2.mydate
SORRY, could NOT bring myself to use columns that were reserved names, rename the columns to make it work :)
this is MUCH simpler than the accepted answer btw.
