How to use a select command to find all the records that have the maximum date value for a specific item? - sql

Say I have a table like this, we call it tbl_test
ID  thedate     actionid  songid
1   2014-10-01  100       10
2   2014-09-30  100       10
3   2014-10-01  80        10
4   2014-09-30  80        10
5   2014-10-01  80        21
6   2014-09-30  100       21
Now I want to find all the records in tbl_test where actionid = 100 that have the latest [thedate] value for each songid. In this case, I want the final select result to be
(this is the result I want, not an existing table)
ID  thedate     actionid  songid
1   2014-10-01  100       10
6   2014-09-30  100       21
Question: how can I do that using nothing but a single select command in MS SQL Server?

Use a join to a query that returns the latest date for each song:
select tbl_test.*
from tbl_test
join (select songid, max(theDate) maxDate
      from tbl_test
      where actionId = 100
      group by songid) t on t.songId = tbl_test.songId and theDate = maxDate
where actionid = 100
This should perform reasonably well, since it makes only two passes over the table: one for the inner query that determines the latest date per song, and another to output the matching rows.

A general SQL way to get this is using not exists:
select t.*
from tbl_test t
where actionid = 100 and
      not exists (select 1
                  from tbl_test t2
                  where t2.songid = t.songid and t2.actionid = 100 and t2.thedate > t.thedate
                 );
For performance, you want an index on songid, actionid, thedate.
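To check both answers against the sample data, here is a minimal sketch that runs them in an in-memory SQLite database through Python's sqlite3 module. SQLite stands in for SQL Server purely as an assumption for the demo; the queries use only standard SQL, and the variable names are illustrative.

```python
import sqlite3

# In-memory SQLite is a stand-in for SQL Server (assumption for this demo).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl_test (ID INTEGER, thedate TEXT, actionid INTEGER, songid INTEGER)"
)
conn.executemany(
    "INSERT INTO tbl_test VALUES (?, ?, ?, ?)",
    [
        (1, "2014-10-01", 100, 10),
        (2, "2014-09-30", 100, 10),
        (3, "2014-10-01", 80, 10),
        (4, "2014-09-30", 80, 10),
        (5, "2014-10-01", 80, 21),
        (6, "2014-09-30", 100, 21),
    ],
)

# Answer 1: join to a derived table holding the latest date per song.
join_sql = """
SELECT tbl_test.*
FROM tbl_test
JOIN (SELECT songid, MAX(thedate) AS maxDate
      FROM tbl_test
      WHERE actionid = 100
      GROUP BY songid) t
  ON t.songid = tbl_test.songid AND tbl_test.thedate = t.maxDate
WHERE tbl_test.actionid = 100
ORDER BY tbl_test.ID
"""

# Answer 2: keep a row only if no later row exists for the same song and action.
not_exists_sql = """
SELECT t.*
FROM tbl_test t
WHERE t.actionid = 100
  AND NOT EXISTS (SELECT 1
                  FROM tbl_test t2
                  WHERE t2.songid = t.songid
                    AND t2.actionid = 100
                    AND t2.thedate > t.thedate)
ORDER BY t.ID
"""

join_rows = conn.execute(join_sql).fetchall()
not_exists_rows = conn.execute(not_exists_sql).fetchall()
print(join_rows)
print(not_exists_rows)
```

Both queries return IDs 1 and 6 on this data. The dates are stored as ISO-8601 text here, which sorts correctly as strings; a real SQL Server table would normally use a date type instead.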

Related

Time consuming query to Skip First inserted record of Id list

In PostgreSQL I have a table containing multiple rows per articleId. For a specified list of articleIds, my query should skip the first inserted record of a particular userId for each articleId.
select * from (
select * , row_number() over (partition by articleId order by date) rn
from table where articleId in (1200) and userId = 1
) t
where t.rn > 1
It returns the expected records, skipping the first inserted record of each articleId for the given userId.
However, the query takes a long time to execute when the table is large.
table:
id  name  articleId  date                 userId
1   abc   1200       2021-05-01 06:09:35  1
2   bcd   1400       2021-05-02 06:08:35  1
3   xyz   1200       2021-05-03 09:09:35  2
4   pqr   1200       2021-05-04 08:09:35  1
5   xyz   1200       2021-05-05 09:09:35  3
Expected query Output:
id  name  articleId  date                 userId
4   pqr   1200       2021-05-04 08:09:35  1
Try adding the following index, which should cover the call to ROW_NUMBER as well as the WHERE clause:
CREATE INDEX idx ON yourTable (articleId, date, userId);
This should speed up your current query. As always, check the execution plan before and after using EXPLAIN.
I would suggest using a correlated subquery with the right indexing:
select *
from t
where t.articleid = 1200 and t.userId = 1 and
t.date > (select min(t2.date)
from t t2
where t2.articleId = t.articleId
);
Then for this query, you want two indexes: (articleid, userId) and (articleId, date).
Note: I'm a bit surprised that userId is not in the partition by clause.
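As a rough check, the sketch below runs the original ROW_NUMBER query and the correlated-subquery rewrite on the sample rows. It uses in-memory SQLite through Python's sqlite3 module instead of PostgreSQL, and the table name articles is hypothetical; both are assumptions for the demo.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL; "articles" is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE articles (id INTEGER, name TEXT, articleId INTEGER, date TEXT, userId INTEGER)"
)
conn.executemany(
    "INSERT INTO articles VALUES (?, ?, ?, ?, ?)",
    [
        (1, "abc", 1200, "2021-05-01 06:09:35", 1),
        (2, "bcd", 1400, "2021-05-02 06:08:35", 1),
        (3, "xyz", 1200, "2021-05-03 09:09:35", 2),
        (4, "pqr", 1200, "2021-05-04 08:09:35", 1),
        (5, "xyz", 1200, "2021-05-05 09:09:35", 3),
    ],
)

# Original approach: number the user's rows per article and drop the first one.
window_sql = """
SELECT id, name, articleId, date, userId
FROM (SELECT *, ROW_NUMBER() OVER (PARTITION BY articleId ORDER BY date) AS rn
      FROM articles
      WHERE articleId IN (1200) AND userId = 1) t
WHERE t.rn > 1
"""

# Rewrite from the answer: keep rows later than the article's earliest date.
subquery_sql = """
SELECT t.id, t.name, t.articleId, t.date, t.userId
FROM articles t
WHERE t.articleId = 1200 AND t.userId = 1
  AND t.date > (SELECT MIN(t2.date)
                FROM articles t2
                WHERE t2.articleId = t.articleId)
"""

window_rows = conn.execute(window_sql).fetchall()
subquery_rows = conn.execute(subquery_sql).fetchall()
print(window_rows)
print(subquery_rows)
```

Both queries return only the row with id 4 here. They are not equivalent in general: the subquery takes the earliest date across all users of the article, while the window query only sees rows for userId = 1. Adding t2.userId = t.userId to the subquery would make the two match.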

Get time stamp of change in column value

I have a table that tracks a certain status using a bit column. I want to get the first timestamp of the latest status change. I have got the desired output using a temp table, but is there a better way to do this?
I get the max time stamp for status 1, then I get the min timestamp for status 0 and if the min timestamp for status 0 is greater than max timestamp for status 1 then I include it in the result set.
Sample data
123 0 2016-12-21 20:04:56.217
123 0 2016-12-21 19:00:28.980
123 0 2016-12-21 17:00:10.207 <-- Get this record because this is the latest status change from 1 to 0
123 1 2016-12-20 16:15:58.787
123 1 2016-12-20 16:11:36.523
123 1 2016-12-20 14:20:02.467
123 1 2016-12-20 13:57:57.623
123 0 2016-12-20 13:55:31.421 <-- This should not be included in the result: it is a status change, but it is not the latest one
123 1 2016-12-20 13:54:57.307
123 0 2016-12-19 12:23:46.103
123 0 2016-12-18 11:47:21.267
SQL
CREATE TABLE #temp_status_changed
(
id VARCHAR(22) NOT NULL,
enabled BIT NOT NULL,
dt_create DATETIME NOT null
)
INSERT INTO #temp_status_changed
SELECT id,enabled,MAX(dt_create) FROM mytable WHERE enabled=1
GROUP BY id,enabled
SELECT a.id,a.enabled,MIN(a.dt_create) FROM mytable a
JOIN #temp_status_changed b ON a.id=b.id
WHERE a.enabled=0
GROUP BY a.id,a.enabled
HAVING MIN(a.dt_create) > (SELECT dt_create FROM #temp_status_changed WHERE id=a.id)
DROP TABLE #temp_status_changed
There are several ways to achieve that.
For example, using LAG() function you can always get the previous value and compare it:
SELECT * FROM
(
SELECT *, LAG(Enabled) OVER (PARTITION BY id ORDER BY dt_create) PrevEnabled
FROM YourTable
) x
WHERE Enabled = 0 AND PrevEnabled = 1
Another approach without window functions would be:
SELECT
sc.id,
sc.enabled,
dt_create = MIN(sc.dt_create)
FROM
YourTable AS sc
JOIN (
SELECT
id,
max_dt_create = MAX(dt_create)
FROM
YourTable
WHERE
enabled = 1
GROUP BY
id
) as MaxStatusChanges
ON sc.id = MaxStatusChanges.id AND
sc.dt_create > MaxStatusChanges.max_dt_create
GROUP BY
sc.id,
sc.enabled
The query returns no rows for an id if there are no rows with status 1 for that id, or if the most recent status for the id is 1. A nonclustered index on the enabled column that includes the id and dt_create columns could improve query performance.
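On the sample data, the LAG query as written returns every change from 1 to 0, including the earlier one the question wants to exclude. The sketch below shows that, and one possible way to keep only the latest change per id by adding ROW_NUMBER. It runs on in-memory SQLite via Python's sqlite3 module as a stand-in for SQL Server, and the ROW_NUMBER wrapper is an illustrative addition, not part of either answer.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server (assumption for this demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id TEXT, enabled INTEGER, dt_create TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES ('123', ?, ?)",
    [
        (0, "2016-12-21 20:04:56.217"),
        (0, "2016-12-21 19:00:28.980"),
        (0, "2016-12-21 17:00:10.207"),
        (1, "2016-12-20 16:15:58.787"),
        (1, "2016-12-20 16:11:36.523"),
        (1, "2016-12-20 14:20:02.467"),
        (1, "2016-12-20 13:57:57.623"),
        (0, "2016-12-20 13:55:31.421"),
        (1, "2016-12-20 13:54:57.307"),
        (0, "2016-12-19 12:23:46.103"),
        (0, "2016-12-18 11:47:21.267"),
    ],
)

# The LAG query from the answer: every row where the status flips from 1 to 0.
lag_sql = """
SELECT id, enabled, dt_create
FROM (SELECT id, enabled, dt_create,
             LAG(enabled) OVER (PARTITION BY id ORDER BY dt_create) AS prev_enabled
      FROM mytable) x
WHERE enabled = 0 AND prev_enabled = 1
ORDER BY dt_create
"""

# Illustrative extension: rank the flips per id, newest first, and keep the top one.
latest_sql = """
SELECT id, enabled, dt_create
FROM (SELECT id, enabled, dt_create,
             ROW_NUMBER() OVER (PARTITION BY id ORDER BY dt_create DESC) AS rn
      FROM (SELECT id, enabled, dt_create,
                   LAG(enabled) OVER (PARTITION BY id ORDER BY dt_create) AS prev_enabled
            FROM mytable) x
      WHERE enabled = 0 AND prev_enabled = 1) y
WHERE rn = 1
"""

all_changes = conn.execute(lag_sql).fetchall()
latest_change = conn.execute(latest_sql).fetchall()
print(all_changes)
print(latest_change)
```

Unlike the join-based query, this variant still returns the last 1 to 0 change when the id has since switched back to 1, so an extra condition would be needed if such ids must be excluded.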

trying to get Statistics for data based on another parameter

Struggling again on statistics on data based on other sets of data.
I have a list of customer values, like the following:
CustomerID Value Date
1 3 01/01/2017
2 2 01/02/2017
3 1 01/02/2017
1 5 01/04/2017
1 6 01/04/2017
2 1 01/04/2017
2 2 01/04/2017
I want to get an average for a date range for Customer 1 on the days where customer 2 also has values. Does anyone have any thoughts?
example
Select avg(value)
from Table where customerid=1
and (customer 2 values are not blank)
and date between '01/01/2017' and '01/31/2017'
I am using SQL Server Express 2012.
Another Option
Select AvgValue = Avg(Value+0.0) -- Remove +0.0 if you want an INT
From YourTable
Where CustomerID = 1
and Date in (Select Distinct Date from YourTable Where CustomerID=2)
Returns
AvgValue
5.500000
You can select the dates using exists or in and then calculate the average:
select avg(value)
from datatbl t
where customerid = 1 and
exists (select 1 from datatbl t2 where t2.customerId = 2 and t2.date = t.date);
If you want the average per date, then include group by date.
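Here is a small sketch of the EXISTS version, including the date range from the question and the per-date variant. It uses in-memory SQLite through Python's sqlite3 module rather than SQL Server Express, and it assumes the sample dates are in MM/DD/YYYY format, stored here as ISO text.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server Express (assumption for this demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE datatbl (customerid INTEGER, value INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO datatbl VALUES (?, ?, ?)",
    [
        (1, 3, "2017-01-01"),
        (2, 2, "2017-01-02"),
        (3, 1, "2017-01-02"),
        (1, 5, "2017-01-04"),
        (1, 6, "2017-01-04"),
        (2, 1, "2017-01-04"),
        (2, 2, "2017-01-04"),
    ],
)

# Average of customer 1's values on days where customer 2 also has a row.
avg_sql = """
SELECT AVG(t.value)
FROM datatbl t
WHERE t.customerid = 1
  AND t.date BETWEEN '2017-01-01' AND '2017-01-31'
  AND EXISTS (SELECT 1 FROM datatbl t2
              WHERE t2.customerid = 2 AND t2.date = t.date)
"""

# Same filter, but one average per date.
per_date_sql = """
SELECT t.date, AVG(t.value)
FROM datatbl t
WHERE t.customerid = 1
  AND EXISTS (SELECT 1 FROM datatbl t2
              WHERE t2.customerid = 2 AND t2.date = t.date)
GROUP BY t.date
ORDER BY t.date
"""

avg_value = conn.execute(avg_sql).fetchone()[0]
per_date = conn.execute(per_date_sql).fetchall()
print(avg_value)
print(per_date)
```

SQLite's AVG always returns a floating-point value. In SQL Server, AVG over an INT column returns an INT, which is why the first answer adds 0.0 to the value.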

Select Previous Record in SQL Server 2008

Here's the case: I have one table myTable which contains 3 columns:
ID int, identity
Group varchar(2), not null
value decimal(18,0), not null
Table looks like this:
ID  GROUP  VALUE  Prev_Value  Result
------------------------------------------
1   A      20     0           20
2   A      30     20          10
3   A      35     30          5
4   B      100    0           100
5   B      150    100         50
6   B      300    200         100
7   C      40     0           40
8   C      60     40          20
9   A      50     35          15
10  A      70     50          20
Prev_Value and Result should be computed columns, and I need to expose them in a view. Can anyone help, please? Thank you so much.
The gist of what you need to do here is to join the table to itself, where part of the join condition is that the value column of the joined copy of the table is less than value column of the original. Then you can group by the columns from the original table and select the max value from the joined table to get your results:
SELECT t1.id, t1.[Group], t1.Value
, coalesce(MAX(t2.Value),0) As Prev_Value
, t1.Value - coalesce(MAX(t2.Value),0) As Result
FROM MyTable t1
LEFT JOIN MyTable t2 ON t2.[Group] = t1.[Group] and t2.Value < t1.Value
GROUP BY t1.id, t1.[Group], t1.Value
Once you can upgrade to SQL Server 2012, you'll also be able to take advantage of the LAG window function introduced in that version.
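For comparison, the sketch below runs the self-join from this answer next to a LAG-based version on the sample rows. It uses in-memory SQLite through Python's sqlite3 module as a stand-in for SQL Server, which is an assumption for the demo; the LAG(value, 1, 0) form with an offset and a default is also accepted by SQL Server 2012 and later.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server (assumption for this demo).
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE myTable (ID INTEGER, "Group" TEXT, value INTEGER)')
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?, ?)",
    [
        (1, "A", 20), (2, "A", 30), (3, "A", 35),
        (4, "B", 100), (5, "B", 150), (6, "B", 300),
        (7, "C", 40), (8, "C", 60),
        (9, "A", 50), (10, "A", 70),
    ],
)

# Self-join from the answer: the previous value is the largest smaller value in the group.
self_join_sql = """
SELECT t1.ID, t1."Group", t1.value,
       COALESCE(MAX(t2.value), 0) AS Prev_Value,
       t1.value - COALESCE(MAX(t2.value), 0) AS Result
FROM myTable t1
LEFT JOIN myTable t2 ON t2."Group" = t1."Group" AND t2.value < t1.value
GROUP BY t1.ID, t1."Group", t1.value
ORDER BY t1.ID
"""

# LAG version: the previous value is the preceding row in ID order within the group.
lag_sql = """
SELECT ID, "Group", value,
       LAG(value, 1, 0) OVER (PARTITION BY "Group" ORDER BY ID) AS Prev_Value,
       value - LAG(value, 1, 0) OVER (PARTITION BY "Group" ORDER BY ID) AS Result
FROM myTable
ORDER BY ID
"""

self_join_rows = conn.execute(self_join_sql).fetchall()
lag_rows = conn.execute(lag_sql).fetchall()
for row in lag_rows:
    print(row)
```

The two queries agree on this data because values only increase with ID inside each group; the self-join relies on that, while LAG follows the ID order directly. Both give a Prev_Value of 150 and a Result of 150 for ID 6, which differs from the 200 and 100 shown in the question's table.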

TOP 1 Query from each ID with multiple instances

This query returns only the single top row across the whole table in MS Access.
SELECT TOP 1 * FROM [table]
ORDER BY table.[Date] DESC;
I need to return the top date for each id that can have multiple dates.
ID DATE
1 01/01/2001
1 01/12/2011
3 01/01/2001
3 01/12/2011
Should return only the top dates like this.
1 01/12/2011
3 01/12/2011
You'll want to use the MAX function, along with a GROUP BY.
SELECT ID, MAX(DATE)
FROM [table]
GROUP BY ID
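A quick sketch of the same GROUP BY on the sample rows, run on in-memory SQLite through Python's sqlite3 module instead of MS Access. The table name dated_rows is hypothetical, and the dates are assumed to be DD/MM/YYYY and are stored as ISO text so that MAX compares them correctly.

```python
import sqlite3

# In-memory SQLite stands in for MS Access; "dated_rows" is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dated_rows (ID INTEGER, DATE TEXT)")
conn.executemany(
    "INSERT INTO dated_rows VALUES (?, ?)",
    [
        (1, "2001-01-01"),
        (1, "2011-12-01"),
        (3, "2001-01-01"),
        (3, "2011-12-01"),
    ],
)

# One row per ID, carrying that ID's latest date.
max_sql = """
SELECT ID, MAX(DATE) AS max_date
FROM dated_rows
GROUP BY ID
ORDER BY ID
"""

max_rows = conn.execute(max_sql).fetchall()
print(max_rows)
```

If the table has more columns than ID and DATE and the full row is needed, this result can be joined back to the table on both columns.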