SQL querying for set of distinctive values - sql

I have a DB table with columns ID, time, and text. ID is non-unique. I'm trying to construct a query that says "For each distinct IDs, give me the row whose time is the greatest value before T". The second part is easy enough with a ORDER BY DESC LIMIT 1 WHERE time < T, but I don't know how to ensure I get coverage of all the IDs.
As an example:
ID,time,text
1,10,"hello"
1,20,"world"
2,10,"foo"
3,10,"bar"
4,50,"blah"
If I searched with time 25, I'd want:
1,20,"world"
2,10,"foo"
3,10,"bar"
I could do this is multiple queries by searching for DISTINCT IDs and then doing a search for each, but I was hoping there was a more efficient compound query form.

SELECT t.ID, t.time, t.text
FROM YourTable t
INNER JOIN (SELECT ID, MAX(time) AS MaxTime
FROM YourTable
WHERE time < T
GROUP BY ID) q
ON t.ID = q.ID
AND t.time = q.MaxTime

You can use a sub-query:
select t1.id, t1.time, t1.test
from yourtable t1
inner join
(
select id, max(time) MaxTime
from yourtable
where time < T
group by id
) t2
on t1.id = t2.id
and t1.time = t2.time

Related

How to get latest date that is not greater then record date?

I have table with names with eomonth history.
I need to join another table that has information about "tests" performed on names:
The result table should show the latest ID and date of performed test but the date of test cannot be greater than the data month:
Any ideas how to perform this?
I am guessing that you want the most recent EOMONTH from the second table for each row in the first table. If that is the correct interpretation, then you can simply use apply:
select t1.*, t2.*
from table1 t1 outer apply
(select top (1) t2.*
from table2 t2
where t2.test_id = t1.test_id and t2.eomonth <= t1.test_date
order by t2.eomonth desc
) t2;
This join works perfectly:
from table1 a
left join table2 b on a.name = b.name and a.eomonth >= b.id_date and a.eomonth < LAG(b.id_date,1,'3000-01-01') OVER (PARTITION BY name ORDER BY b.id_date desc)

SQL Server query showing most recent distinct data

I am trying to build a SQL query to recover only the most young record of a table (it has a Timestamp column already) where the item by which I want to filter appears several times, as shown in my table example:
.
Basically, I have a table1 with Id, Millis, fkName and Price, and a table2 with Id and Name.
In table1, items can appear several times with the same fkName.
What I need to achieve is building up a single query where I can list the last record for every fkName, so that I can get the most actual price for every item.
What I have tried so far is a query with
SELECT DISTINCT [table1].[Millis], [table2].[Name], [table1].[Price]
FROM [table1]
JOIN [table2] ON [table2].[Id] = [table1].[fkName]
ORDER BY [table2].[Name]
But I don't get the correct listing.
Any advice on this? Thanks in advance,
A simple and portable approach to this greatest-n-per-group problem is to filter with a subquery:
select t1.millis, t2.name, t1.price
from table1 t1
inner join table2 t2 on t2.id = t1.fkName
where t1.millis = (select max(t11.millis) from table1 t11 where t11.fkName = t1.fkName)
order by t1.millis desc
using Common Table Expression:
;with [LastPrice] as (
select [Millis], [Price], ROW_NUMBER() over (Partition by [fkName] order by [Millis] desc) rn
from [table1]
)
SELECT DISTINCT [LastPrice].[Millis],[table2].[Name],[LastPrice].[Price]
FROM [LastPrice]
JOIN [table2] ON [table2].[Id] = [LastPrice].[fkName]
WHERE [LastPrice].rn = 1
ORDER BY [table2].[Name]

Using MAX when selecting a high number of fields in a query

I understand some varieties of this question have been asked, but I could not find an answer to my specific scenario.
My query has over 50 fields being selected, and only one of them is an aggregate, using MAX(). On the GROUP BY clause, I would only like to pass two specific fields, name and UserID, not all 50 to make the query run. See small subset below.
SELECT
t1.name,
MAX(t2.id) as UserID,
t3.age,
t3.height,
t3.dob,
FROM table1 t1
LEFT JOIN table2 t2 ON t1.id = t2.id
LEFT JOIN table3 t3 ON t1.id = t3.id
GROUP BY t1.name, UserID
Is there any workaround or better approach to accomplish my goal?
The database is SQL Server and any help would be greatly appreciated.
Hmmm . . . What values do you want for the other fields? If you want the max() of one column for each id and code, you can do:
select t.*
from (select t.*, max(col) over (partition by id, code) as maxcol
from t
) t
where col = maxcol;
Given that id might be unique, you might want the maximum id as well as the other columns for each code:
select t.*
from (select t.*, max(id) over (partition by code) as maxid
from t
) t
where id = maxid;

SQL - Query to return result

There is a table with Columns as below:
Id : long autoincrement;
timestamp:long;
price:long
Timestamp is given as a unix_time in ms.
Question: what is the average time difference between the records ?
First thought is a sub-query grabbing the record immediately previous:
SELECT timestamp -
(select top 1 timestamp from Table T1 where T1.Id < Table.Id order by Id desc)
FROM Table
Then you can take the average of that:
SELECT AVG(delta)
from (SELECT timestamp -
(select top 1 timestamp from Table T1 where T1.Id < Table.Id order by Id desc) as delta
FROM Table) T
There will probably need to be some handling of the null that results for the first row, but I haven't tested to be sure.
In SQL Server, you could write something like that to get that information:
SELECT
t1.ID, t2.ID,
DATEDIFF(MILLISECOND, t2.PriceTime, test2.PriceTime)
FROM table t1
INNER JOIN table t2 ON t2.ID = t1.ID-1
WHERE t1.ID > (SELECT MIN(ID) FROM table)
and if you're only interested in the AVG across all entries, you could use:
SELECT
AVG(DATEDIFF(MILLISECOND, t2.PriceTime, test2.PriceTime))
FROM table t1
INNER JOIN table t2 ON t2.ID = t1.ID-1
WHERE t1.ID > (SELECT MIN(ID) FROM table)
Basically, you need to join the table with itself, and use "t1.ID = t2.ID-1" to associate item no. 2 in one table with item no. 1 in the other table and then calculate the time difference between the two. In order to avoid accessing item no. 0 which doesn't exist, use the "T1.ID > (SELECT MIN(ID) FROM table)" clause to start from the second item.
Marc
At a guess:
SELECT AVG(timestamp)
I think you need to provide more information in your question for us to help.
If you mean difference between each-other row:
select AVG(x) from (
select a.timestamp - b.timestamp as x
from table a, table b -- this multiplies a*b ) sub
SELECT AVG(T2.Timestamp - T1.TimeStamp)
FROM Table T1
JOIN Table T2 ON T2.ID = T1.ID + 1
try this
Select Avg(E.Timestamp - B.Timestamp)
From Table B Join Table E
On E.Timestamp =
(Select Max(Timestamp)
From Table
Where Timestamp < R.Timestamp)

SQL query for Top 5 of every category

I have a table that has three columns: Category, Timestamp and Value.
What I want is a SQL select that will give me the 5 most recent values of each category. How would I go about and do that?
I tried this:
select
a."Category",
b."Timestamp",
b."Value"
from
(select "Category" from "Table" group by "Category" order by "Category") a,
(select a."Category", c."Timestamp", c."Value" from "Table" c
where c."Category" = a."Category" limit 5) b
Unfortunately, it won't allow it because "subquery in FROM cannot refer to other relations of same query level".
I'm using PostGreSQL 8.3, by the way.
Any help will be appreciated.
SELECT t1.category, t1.timestamp, t1.value, COUNT(*) as latest
FROM foo t1
JOIN foo t2 ON t1.id = t2.id AND t1.timestamp <= t2.timestamp
GROUP BY t1.category, t1.timestamp
HAVING latest <= 5;
Note: Try this out and see if it performs suitably for your needs. It will not scale well for large groups.