Correct sql/hql query (aggregate in where clause) - sql

I want to do query as below. Query is wrong but describes my intentions.
SELECT name, dateTime, data
FROM Record
WHERE dateTime = MAX(dateTime)
Update: Ok. The query describes intentions not quite good. My bad.
I want to select latest record for each person.

Try This:
SELECT name, dateTime, data
FROM Record
WHERE dateTime = SELECT MAX(dateTime) FROM Record
You could also write it using an inner join:
SELECT R.name, R.dateTime, R.data
FROM Record R
INNER JOIN (SELECT MAX(dateTime) FROM Record) RMax ON R.dateTime = RMax.dateTime
Which is the same but written from a different perspective
SELECT R.name, R.dateTime, R.data
FROM Record R,
(SELECT MAX(dateTime) FROM Record) RMax
WHERE R.dateTime = RMax.dateTime

I like Miky's answer and the from Quassnoi (and upvoted Miky's) but, if your needs are similar to mine, you should keep in mind some limitations. First and most importantly, it only works if you are looking for the latest record overall or the latest record for a single name. If you want the latest record for each person in a set (one record per person but the latest record for each) then the above solutions fall short. Second, and less importantly, if you'll be working with large datasets, might prove a bit slow over the long run. So, what is the work-around?
What I do is to add a bit field to the table marked "newest." Then, when I store a record (which is done in a stored procedure in SQL Server) I follow this pattern:
Update Table Set Newest=0 Where Name=#Name
Insert into Table (Name, dateTimeVal, Data, Newest) Values (#Name, GetDate(), #Data, 1);
Also, there is an index on Name and Newest to make Selects very fast.
Then the Select is just:
Select dateTimeVal, Data From Table Where (Name=#Name) and (Newest=1);
A select for a group will be something like:
Select Name, dateTimeVal, Data from Table Where (Newest=1); -- Gets multiple records
If the records may not be entered in date order, then your logic is a little bit different:
Update Table Set Newest=0 Where Name=#Name
Insert into Table (Name, dateTimeVal, Data, Newest) Values (#Name, GetDate(), #Data, 0); -- NOTE ZERO
Update Table Set Newest=1 Where dateTimeVal=(Select Max(dateTimeVal) From Table Where Name=#Name);
The rest stays the same.

In MySQL and PostgreSQL:
SELECT name, dateTime, data
FROM Record
ORDER BY
dateTime DESC
LIMIT 1
In SQL Server:
SELECT TOP 1 name, dateTime, data
FROM Record
ORDER BY
dateTime DESC
In Oracle
SELECT *
FROM (
SELECT name, dateTime, data
FROM Record
ORDER BY
dateTime DESC
)
WHERE rownum = 1
Update:
To select one person for each record, in SQL Server, use this:
WITH q AS
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY person ORDER BY dateTime DESC)
FROM Record
)
SELECT *
FROM q
WHERE rn = 1
or this:
SELECT ro.*
FROM (
SELECT DISTINCT person
FROM Record
) d
CROSS APPLY
(
SELECT TOP 1 *
FROM Record r
WHERE r.person = d.person
ORDER BY
dateTime DESC
) ro
See this article in my blog:
SQL Server: Selecting records holding group-wise maximum
for benefits and drawbacks of both solutions.

I tried Milky's advice but all three ways of constructing subquery resulted in HQL parser errors.
What does work though, is a slight change to the first method (added extra parentheses).
SELECT name, dateTime, data
FROM Record
WHERE dateTime = (SELECT MAX(dateTime) FROM Record)
PS: This is just for pointing out the obvious to HQL newbies and the like. Thought it would help.

Related

Minus values from the same column

I have a table with 2 columns. one contains the user's ID and the other contains the datetime whenever that user logged into the application.
I want to create a custom computation where the datetime value from the same column subtracts each other, as the latest datetime will minus the previous datetime of that latest datetime, and keeps going like that til the first-time login datetime of that user.
Right now I have no clue on how to code this computation so I really need your help and would very appreciate it.
Sample date:
I fiddled around in the AdventureWorks db and might have found a way to get what you are looking for(although I am new to this, so someone with more experience and knowledge might have a better idea). But hopefully this provides some starting point. I created 2 CTEs with Row_Number, second CTE had Row_Number-1 as it's count. Then joined those two CTEs and did a datediff on the event_datetimes.
WITH Example_CTE (Row, UserID, Event_datetime)
AS
(
SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS Row
,UserID
,Event_datetime
FROM table
ORDER BY UserID,Event_datetime DESC
)
,
Example_CTE2 (Row, UserID, Event_datetime)
AS
(
SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL))-1 AS Row
,UserID
,Event_datetime
FROM table
ORDER BY UserID, Event_datetime DESC
)
SELECT Example_CTE.UserID
,DATEDIFF(day, Example_CTE.Event_datetime, Example_CTE2.Event_datetime) as TimeDiff
FROM Example_CTE
left outer join Example_CTE2 on Example_CTE.row = Example_CTE2.Row

select rows in sql with latest date from 3 tables in each group

I'm creating PREDICATE system for my application.
Please see image that I already
I have a question how can I select rows in SQL with latest date "Taken On" column tables for each "QuizESId" columns, before that I am understand how to select it but it only using one table, I learn from this
select rows in sql with latest date for each ID repeated multiple times
Here is what I have already tried
SELECT tt.*
FROM myTable tt
INNER JOIN
(SELECT ID, MAX(Date) AS MaxDateTime
FROM myTable
GROUP BY ID) groupedtt ON tt.ID = groupedtt.ID
AND tt.Date = groupedtt.MaxDateTime
What I am confused about here is how can I select from 3 tables, I hope you can guide me, of course I need a solution with good query and efficient performance.
Thanks
This is for SQL Server (you didn't specify exactly what RDBMS you're using):
if you want to get the "latest row for each QuizId" - this sounds like you need a CTE (Common Table Expression) with a ROW_NUMBER() value - something like this (updated: you obviously want to "partition" not just by QuizId, but also by UserName):
WITH BaseData AS
(
SELECT
mAttempt.Id AS Id,
mAttempt.QuizModelId AS QuizId,
mAttempt.StartedAt AS StartsOn,
mUser.UserName,
mDetail.Score AS Score,
RowNum = ROW_NUMBER() OVER (PARTITION BY mAttempt.QuizModelId, mUser.UserName
ORDER BY mAttempt.TakenOn DESC)
FROM
UserQuizAttemptModels mAttempt
INNER JOIN
AspNetUsers mUser ON mAttempt.UserId = muser.Id
INNER JOIN
QuizAttemptDetailModels mDetail ON mDetail.UserQuizAttemptModelId = mAttempt.Id
)
SELECT *
FROM BaseData
WHERE QuizId = 10053
AND RowNum = 1
The BaseData CTE basically selects the data (as you did) - but it also adds a ROW_NUMBER() column. This will "partition" your data into groups of data - based on the QuizModelId - and it will number all the rows inside each data group, starting at 1, and ordered by the second condition - the ORDER BY clause. You said you want to order by "Taken On" date - but there's no such date visible in your query - so I just guessed it might be on the UserQuizAttemptModels table - change and adapt as needed.
Now you can select from that CTE with your original WHERE condition - and you specify, that you want only the first row for each data group (for each "QuizId") - the one with the most recent "Taken On" date value.

SQL Server: Select all but the first result in query

We have an issue with two tables, let's call them Item and ItemStatuses. We track each change to the status, so there is a start date and end date in the ItemStatuses table. We track the current status by looking for the one with an end date that is null.
Through an error in the system the newest status was added multiple times to a number of items. I need to select all but the first status for each item. I have the following query which gives me all the open statuses. I was trying this route because I figured I could use the row number to skip the first one, but there are multiple Items in these sets, so I need to skip the first status for each item. I think I'm pretty close with my query, but I'm not sure what I need to do.
SELECT ID, rn = ROW_NUMBER() OVER (ORDER BY ItemID)
FROM ItemStatuses WHERE ID IN
(
SELECT
s.ID
FROM Items as i
INNER JOIN ItemStatuses AS s ON
i.ID = s.ItemID AND
s.EndDate IS NULL
GROUP BY i.ID
HAVING COUNT(i.ID) > 1
)
To illustrate how to update all but the first status of your table:
declare #itemstatuses table (id int, Enddate datetime, theStatus int)
insert into #itemstatuses
values (1,getdate()-3,1),(1,getdate()-2,2),(1,getdate()-1,3),
(2,getdate()-3,2),(2,getdate()-2,2),(2,getdate()-1,99),
(3,getdate(),1)
select 'before',* from #itemStatuses
;with sorted
as (
select [r] = row_number()over(partition by id order by Enddate), *
from #ItemStatuses
)
update sorted
set theStatus = 100
where r>1
select 'after',* from #itemStatuses
I would simplify your SQL query since this can become overly complicated and expensive.
Then use your server sided language to perform any filtering or condition.

sql query to get earliest date

If I have a table with columns id, name, score, date
and I wanted to run a sql query to get the record where id = 2 with the earliest date in the data set.
Can you do this within the query or do you need to loop after the fact?
I want to get all of the fields of that record..
If you just want the date:
SELECT MIN(date) as EarliestDate
FROM YourTable
WHERE id = 2
If you want all of the information:
SELECT TOP 1 id, name, score, date
FROM YourTable
WHERE id = 2
ORDER BY Date
Prevent loops when you can. Loops often lead to cursors, and cursors are almost never necessary and very often really inefficient.
SELECT TOP 1 ID, Name, Score, [Date]
FROM myTable
WHERE ID = 2
Order BY [Date]
While using TOP or a sub-query both work, I would break the problem into steps:
Find target record
SELECT MIN( date ) AS date, id
FROM myTable
WHERE id = 2
GROUP BY id
Join to get other fields
SELECT mt.id, mt.name, mt.score, mt.date
FROM myTable mt
INNER JOIN
(
SELECT MIN( date ) AS date, id
FROM myTable
WHERE id = 2
GROUP BY id
) x ON x.date = mt.date AND x.id = mt.id
While this solution, using derived tables, is longer, it is:
Easier to test
Self documenting
Extendable
It is easier to test as parts of the query can be run standalone.
It is self documenting as the query directly reflects the requirement
ie the derived table lists the row where id = 2 with the earliest date.
It is extendable as if another condition is required, this can be easily added to the derived table.
Try
select * from dataset
where id = 2
order by date limit 1
Been a while since I did sql, so this might need some tweaking.
Using "limit" and "top" will not work with all SQL servers (for example with Oracle).
You can try a more complex query in pure sql:
select mt1.id, mt1."name", mt1.score, mt1."date" from mytable mt1
where mt1.id=2
and mt1."date"= (select min(mt2."date") from mytable mt2 where mt2.id=2)

Get list of records with multiple entries on the same date

I need to return a list of record id's from a table that may/may not have multiple entries with that record id on the same date. The same date criteria is key - if a record has three entries on 09/10/2008, then I need all three returned. If the record only has one entry on 09/12/2008, then I don't need it.
SELECT id, datefield, count(*) FROM tablename GROUP BY datefield
HAVING count(*) > 1
Since you mentioned needing all three records, I am assuming you want the data as well. If you just need the id's, you can just use the group by query. To return the data, just join to that as a subquery
select * from table
inner join (
select id, date
from table
group by id, date
having count(*) > 1) grouped
on table.id = grouped.id and table.date = grouped.date
The top post (Leigh Caldwell) will not return duplicate records and needs to be down modded. It will identify the duplicate keys. Furthermore, it will not work if your database doesn't allows the group by to not include all select fields (many do not).
If your date field includes a time stamp then you'll need to truncate that out using one of the methods documented above ( I prefer: dateadd(dd,0, datediff(dd,0,#DateTime)) ).
I think Scott Nichols gave the correct answer and here's a script to prove it:
declare #duplicates table (
id int,
datestamp datetime,
ipsum varchar(200))
insert into #duplicates (id,datestamp,ipsum) values (1,'9/12/2008','ipsum primis in faucibus')
insert into #duplicates (id,datestamp,ipsum) values (1,'9/12/2008','Vivamus consectetuer. ')
insert into #duplicates (id,datestamp,ipsum) values (2,'9/12/2008','condimentum posuere, quam.')
insert into #duplicates (id,datestamp,ipsum) values (2,'9/13/2008','Donec eu sapien vel dui')
insert into #duplicates (id,datestamp,ipsum) values (3,'9/12/2008','In velit nulla, faucibus sed')
select a.* from #duplicates a
inner join (select id,datestamp, count(1) as number
from #duplicates
group by id,datestamp
having count(1) > 1) b
on (a.id = b.id and a.datestamp = b.datestamp)
SELECT RecordID
FROM aTable
WHERE SameDate IN
(SELECT SameDate
FROM aTable
GROUP BY SameDate
HAVING COUNT(SameDate) > 1)
If I understand your question correctly you could do something similar to:
select
recordID
from
tablewithrecords as a
left join (
select
count(recordID) as recordcount
from
tblwithrecords
where
recorddate='9/10/08'
) as b on a.recordID=b.recordID
where
b.recordcount>1
http://www.sql-server-performance.com/articles/dba/delete_duplicates_p1.aspx will get you going. Also, http://en.allexperts.com/q/MS-SQL-1450/2008/8/SQL-query-fetch-duplicate.htm
I found these by searching Google for 'sql duplicate data'. You'll see this isn't an unusual problem.
SELECT * FROM the_table WHERE ROW(record_id,date) IN
( SELECT record_id, date FROM the_table
GROUP BY record_id, date WHERE COUNT(*) > 1 )
For matching on just the date part of a Datetime:
select * from Table
where id in (
select alias1.id from Table alias1, Table alias2
where alias1.id != alias2.id
and datediff(day, alias1.date, alias2.date) = 0
)
I think. This is based on my assumption that you need them on the same day month and year, but not the same time of day, so I did not use a Group by clause. From the other posts it looks like I could have more cleverly used a Having clause. Can you use a having or group by on a datediff expression?
I'm not sure I understood your question, but maybe you want something like this:
SELECT id, COUNT(*) AS same_date FROM foo GROUP BY id, date HAVING same_date = 3;
This is just written from my mind and not tested in any way. Read the GROUP BY and HAVING section here. If this is not what you meant, please ignore this answer.
Note that there's some extra processing necessary if you're using a SQL DateTime field. If you've got that extra time data in there, then you can't just use that column as-is. You've got to normalize the DateTime to a single value for all records contained within the day.
In SQL Server here's a little trick to do that:
SELECT CAST(FLOOR(CAST(CURRENT_TIMESTAMP AS float)) AS DATETIME)
You cast the DateTime into a float, which represents the Date as the integer portion and the Time as the fraction of a day that's passed. Chop off that decimal portion, then cast that back to a DateTime, and you've got midnight at the beginning of that day.
Without knowing the exact structure of your tables or what type of database you're using it's hard to answer. However if you're using MS SQL and if you have a true date/time field that has different times that the records were entered on the same date then something like this should work:
select record_id,
convert(varchar, date_created, 101) as log date,
count(distinct date_created) as num_of_entries
from record_log_table
group by convert(varchar, date_created, 101), record_id
having count(distinct date_created) > 1
Hope this helps.
SELECT id, count(*)
INTO #tmp
FROM tablename
WHERE date = #date
GROUP BY id
HAVING count(*) > 1
SELECT *
FROM tablename t
WHERE EXISTS (SELECT 1 FROM #tmp WHERE id = t.id)
DROP TABLE tablename
TrickyNixon writes;
The top post (Leigh Caldwell) will not return duplicate records and needs to be down modded.
Yet the question doesn't ask about duplicate records. It asks about duplicate record-ids on the same date...
GROUP-BY,HAVING seems good to me. I've used it in production before.
.
Something to watch out for:
SELECT ... FROM ... GROUP BY ... HAVING count(*)>1
Will, on most database systems, run in O(NlogN) time. It's a good solution. (Select is O(N), sort is O(NlogN), group by is O(N), having is O(N) -- Worse case. Best case, date is indexed and the sort operation is more efficient.)
.
Select ... from ..., .... where a.data = b.date
Granted only idiots do a Cartesian join. But you're looking at O(N^2) time. For some databases, this also creates a "temporary" table. It's all insignificant when your table has only 10 rows. But it's gonna hurt when that table grows!
Ob link: http://en.wikipedia.org/wiki/Join_(SQL)
select id from tbl where date in
(select date from tbl group by date having count(*)>1)
GROUP BY with HAVING is your friend:
select id, count(*) from records group by date having count(*) > 1