Get First Record of Each Group - sql

First, I apologize if this is a basic question.
I have monitoring data that is stored every 5 seconds.
I want to create a query that returns the first record in every 10-minute interval, for example:
| Data                | Voltage (V) |
| 2020-08-14 14:00:00 | 10          |
| 2020-08-14 14:00:05 | 15          |
| 2020-08-14 14:00:10 | 12          |
| ....                |             |
| 2020-08-14 14:10:10 | 25          |
| 2020-08-14 14:10:15 | 30          |
| 2020-08-14 14:10:20 | 23          |
The desired result is:
| Data                | Voltage (V) |
| 2020-08-14 14:00:00 | 10          |
| 2020-08-14 14:10:10 | 25          |
I'm using a SQL Server database.
I read about similar solutions, such as the post "Select first row in each GROUP BY group?",
but I can't resolve my issue.
I started with:
SELECT Data, Voltage
GROUP BY DATEADD(MINUTE,(DATEDIFF(MINUTE, 0 , Data)/10)*10,0)
ORDER BY DATA DESC
But I can't use FIRST() or TOP 1 in this query.
Does anyone have any ideas?
Thanks a lot!

If I understand correctly:
select t.*
from (select t.*,
             row_number() over (partition by DATEADD(MINUTE, (DATEDIFF(MINUTE, 0, Data) / 10) * 10, 0)
                                order by Data) as seqnum
      from t
     ) t
where seqnum = 1;
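For anyone who wants to try the idea without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. It is not the SQL Server query itself: SQLite has no DATEADD/DATEDIFF, so the 10-minute bucket is computed, as an assumption for this sketch, by integer-dividing the Unix timestamp by 600 seconds. The table name t and the sample rows are taken from the question (with the third timestamp assumed to be 14:00:10).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (Data TEXT, Voltage INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [
        ("2020-08-14 14:00:00", 10),
        ("2020-08-14 14:00:05", 15),
        ("2020-08-14 14:00:10", 12),
        ("2020-08-14 14:10:10", 25),
        ("2020-08-14 14:10:15", 30),
        ("2020-08-14 14:10:20", 23),
    ],
)

# Number the rows inside each 10-minute bucket, earliest first,
# then keep only row 1 of every bucket.
first_rows = conn.execute(
    """
    SELECT Data, Voltage
    FROM (
        SELECT t.*,
               ROW_NUMBER() OVER (
                   PARTITION BY CAST(strftime('%s', Data) AS INTEGER) / 600
                   ORDER BY Data
               ) AS seqnum
        FROM t
    ) AS numbered
    WHERE seqnum = 1
    ORDER BY Data
    """
).fetchall()

print(first_rows)
# [('2020-08-14 14:00:00', 10), ('2020-08-14 14:10:10', 25)]
```

The partition expression is the only dialect-specific part; the row_number() / seqnum = 1 pattern is the same as in the answer above.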

Related

Calculate time span over a number of records

I have a table that has the following schema:
ID | FirstName | Surname | TransmissionID | CaptureDateTime
1 | Billy | Goat | ABCDEF | 2018-09-20 13:45:01.098
2 | Jonny | Cash | ABCDEF | 2018-09-20 13:45:01.108
3 | Sally | Sue | ABCDEF | 2018-09-20 13:45:01.298
4 | Jermaine | Cole | PQRSTU | 2018-09-20 13:45:01.398
5 | Mike | Smith | PQRSTU | 2018-09-20 13:45:01.498
There are well over 70,000 records, and they store logs of transmissions to a web service. How would I go about writing a script that selects the distinct TransmissionID values and also shows the timespan between the earliest and the latest CaptureDateTime record for each one? Essentially, I'd like to see the rate at which the web service is reading and writing records.
Is it even possible to do so in a single SELECT statement or should I just create a stored procedure or report in code? I don't know where to start aside from SELECT DISTINCT TransmissionID for this sort of query.
Here's what I have so far (I'm stuck on the time calculation)
SELECT DISTINCT [TransmissionID],
COUNT(*) as 'Number of records'
FROM [log_table]
GROUP BY [TransmissionID]
HAVING COUNT(*) > 1
I'm not sure how to get the difference between the first and last record with the same TransmissionID. I would like to get a result set like:
TransmissionID | TimeToCompletion | Number of records |
ABCDEF | 2.001 | 5000 |
Simply GROUP BY and use MIN / MAX function to find min/max date in each group and subtract them:
SELECT
TransmissionID,
COUNT(*),
DATEDIFF(second, MIN(CaptureDateTime), MAX(CaptureDateTime))
FROM yourdata
GROUP BY TransmissionID
HAVING COUNT(*) > 1
Use MIN and MAX to calculate the timespan:
SELECT [TransmissionID],
COUNT(*) as 'Number of records',datediff(s,min(CaptureDateTime),max(CaptureDateTime)) as timespan
FROM [log_table]
GROUP BY [TransmissionID]
HAVING COUNT(*) > 1
A method that returns the average time for all transmissionids, even those with only 1 record:
SELECT TransmissionID,
COUNT(*),
DATEDIFF(second, MIN(CaptureDateTime), MAX(CaptureDateTime)) * 1.0 / NULLIF(COUNT(*) - 1, 0)
FROM yourdata
GROUP BY TransmissionID;
Note that you may not actually want the maximum of the capture date for a given transmissionId. You might want the overall maximum in the table -- so you can consider the final period after the most recent record.
If so, this looks like:
SELECT TransmissionID,
COUNT(*),
DATEDIFF(second,
MIN(CaptureDateTime),
MAX(MAX(CaptureDateTime)) OVER ()
) * 1.0 / COUNT(*)
FROM yourdata
GROUP BY TransmissionID;
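One caveat with the queries above: the sample timestamps differ only by milliseconds, so a difference in whole seconds would come out as 0. In SQL Server the millisecond datepart of DATEDIFF covers that case. The sketch below shows the same MIN/MAX grouping in Python's built-in sqlite3 module, purely as a runnable illustration; SQLite has no DATEDIFF, so the elapsed time is derived from julianday(), which is an assumption of this sketch and not part of the original answers. The rows are the five sample rows from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE log_table (ID INTEGER, TransmissionID TEXT, CaptureDateTime TEXT)"
)
conn.executemany(
    "INSERT INTO log_table VALUES (?, ?, ?)",
    [
        (1, "ABCDEF", "2018-09-20 13:45:01.098"),
        (2, "ABCDEF", "2018-09-20 13:45:01.108"),
        (3, "ABCDEF", "2018-09-20 13:45:01.298"),
        (4, "PQRSTU", "2018-09-20 13:45:01.398"),
        (5, "PQRSTU", "2018-09-20 13:45:01.498"),
    ],
)

# julianday() returns fractional days, so multiplying the difference by
# 86,400,000 converts it to milliseconds; ROUND removes floating-point noise.
spans = conn.execute(
    """
    SELECT TransmissionID,
           COUNT(*) AS record_count,
           CAST(ROUND((julianday(MAX(CaptureDateTime))
                     - julianday(MIN(CaptureDateTime))) * 86400000) AS INTEGER)
               AS elapsed_ms
    FROM log_table
    GROUP BY TransmissionID
    HAVING COUNT(*) > 1
    ORDER BY TransmissionID
    """
).fetchall()

print(spans)
# [('ABCDEF', 3, 200), ('PQRSTU', 2, 100)]
```

Dividing elapsed_ms by 1000.0 would give fractional seconds in the style of the TimeToCompletion column the question asks for.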

Count first occurrences in time (SQL)

I have a table like this
+----+---------------------+
| Id | Date application |
+----+---------------------+
| 1 | 2016-08-22 03:05:06 |
| 2 | 2016-08-22 03:05:06 |
| 1 | 2016-08-23 03:05:06 |
| 2 | 2016-08-23 03:05:06 |
+----+---------------------+
I would like to find out when the first application was made for each user (ID),
and then count how many of those first applications occurred in the past 7 days.
So far, here is what I have:
SELECT id,
min(date_of_application)
FROM mytable
GROUP BY id
ORDER BY date_of_application ASC
Will min() work on dates?
From there, how do I count how many first applications there are in the past 7 days?
Please tag your database. min() will work on dates.
Assuming your database is MySQL, here is what you can do to get the application usage count for the past 7 days:
select
id, count(*) as 'appUsageCount'
from
mytable
where
date_of_application >= DATE(DATE_SUB(NOW(),INTERVAL 7 DAY))
and date_of_application <= DATE(NOW())
group by id
@Neeraj: Using your query with a small modification.
Try this:
select
id, count(id) as 'appUsageCount', min(date_of_application)
from
mytable
where
date_of_application >= DATE(DATE_SUB(NOW(),INTERVAL 7 DAY))
and date_of_application <= DATE(NOW())
group by id
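Note that the queries above filter the rows first and then group, so they count all applications inside the window. If the goal is to count only first applications that fall in the past 7 days, one possible approach is to compute each id's first date in a subquery and filter on that afterwards. Below is a minimal sketch with Python's built-in sqlite3 module. The reference time as_of is a fixed, made-up value so the result is reproducible, id 3 is a hypothetical extra user added to show a non-matching case, and SQLite's datetime() modifier stands in for MySQL's DATE_SUB.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, date_of_application TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [
        (1, "2016-08-22 03:05:06"),
        (2, "2016-08-22 03:05:06"),
        (1, "2016-08-23 03:05:06"),
        (2, "2016-08-23 03:05:06"),
        # Hypothetical user whose FIRST application is older than 7 days,
        # even though a later one falls inside the window.
        (3, "2016-08-01 09:00:00"),
        (3, "2016-08-24 09:00:00"),
    ],
)

as_of = "2016-08-25 00:00:00"  # assumed "now" for this sketch

# Step 1 (subquery): first application per id.
# Step 2 (outer query): count the ids whose first application is recent.
recent_first_count = conn.execute(
    """
    SELECT COUNT(*)
    FROM (
        SELECT id, MIN(date_of_application) AS first_application
        FROM mytable
        GROUP BY id
    ) AS firsts
    WHERE first_application >= datetime(?, '-7 days')
      AND first_application <= ?
    """,
    (as_of, as_of),
).fetchone()[0]

print(recent_first_count)  # 2
```

Filtering before grouping would have counted id 3 as well, because its second application is inside the window.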

Select last record from data table for each device in devices table [duplicate]

I have a problem with the execution speed of my SQL query against a PostgreSQL database.
I have 2 tables:
table 1: DEVICES
ID | NAME
------------------
1 | first device
2 | second device
table 2: DATA
ID | DEVICE_ID | TIME | DATA
--------------------------------------------
1 | 1 | 2016-07-14 2:00:00 | data1
2 | 1 | 2016-07-14 1:00:00 | data2
3 | 2 | 2016-07-14 4:00:00 | data3
4 | 1 | 2016-07-14 3:00:00 | data4
5 | 2 | 2016-07-14 6:00:00 | data5
6 | 2 | 2016-07-14 5:00:00 | data6
I need the query to return this result table:
ID | DEVICE_ID | TIME | DATA
-------------------------------------------
4 | 1 | 2016-07-14 3:00:00 | data4
5 | 2 | 2016-07-14 6:00:00 | data5
That is, for each device in the devices table, I need only the one data record with the latest TIME value.
This is my sql query:
SELECT * FROM db.data d
WHERE d.time = (
SELECT MAX(d2.time) FROM db.data d2
WHERE d2.device_id = d.device_id);
This is the equivalent HQL query:
SELECT d FROM Data d
WHERE d.time = (
SELECT MAX(d2.time) FROM Data d2
WHERE d2.device.id = d.device.id)
Yes, I use Hibernate ORM in my project; maybe this information will be useful to someone.
My queries return the correct result, BUT they take too long: about 5-10 seconds with 10k records in the data table and only 2 devices in the devices table. It's terrible.
At first I thought the problem was in Hibernate, but the native SQL query run from psql in a Linux terminal takes the same time as it does through Hibernate.
How can I optimize my query? Its complexity is too high:
O(device_count * data_count^2)
Since you're using Postgres, you could use window functions to achieve this, like so:
select
sq.id,
sq.device_id,
sq.time,
sq.data
from (
select
data.*,
row_number() over (partition by data.device_id order by data.time desc) as rnk
from
data
) sq
where
sq.rnk = 1
The row_number() window function numbers the rows within each device_id partition in descending time order, so the outer query only has to keep the rows where rnk = 1, which are the latest record for each device.
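To see the pattern run end to end, here is a small sketch using Python's built-in sqlite3 module with the sample rows from the question (times zero-padded so they sort correctly as text). The composite index on (device_id, time) and its name idx_data_device_time are assumptions added for illustration; an index like that is a common way to help this kind of per-group lookup, but whether it helps in a given PostgreSQL setup should be checked with the query plan.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE data (id INTEGER PRIMARY KEY, device_id INTEGER, time TEXT, data TEXT)"
)
# Hypothetical index name; not part of the original question or answer.
conn.execute("CREATE INDEX idx_data_device_time ON data (device_id, time DESC)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?, ?)",
    [
        (1, 1, "2016-07-14 02:00:00", "data1"),
        (2, 1, "2016-07-14 01:00:00", "data2"),
        (3, 2, "2016-07-14 04:00:00", "data3"),
        (4, 1, "2016-07-14 03:00:00", "data4"),
        (5, 2, "2016-07-14 06:00:00", "data5"),
        (6, 2, "2016-07-14 05:00:00", "data6"),
    ],
)

# One pass over the table: number rows per device, newest first, keep row 1.
latest = conn.execute(
    """
    SELECT id, device_id, time, data
    FROM (
        SELECT d.*,
               ROW_NUMBER() OVER (
                   PARTITION BY d.device_id
                   ORDER BY d.time DESC
               ) AS rnk
        FROM data AS d
    ) AS sq
    WHERE rnk = 1
    ORDER BY device_id
    """
).fetchall()

for row in latest:
    print(row)
# (4, 1, '2016-07-14 03:00:00', 'data4')
# (5, 2, '2016-07-14 06:00:00', 'data5')
```

Unlike the correlated subquery in the question, this form does not re-scan the table once per row to find the maximum.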

Show last update date

I am new to this forum and also new to SQL. My question is:
I have an Excel sheet linked to a database with "From Microsoft Query". Three tables are linked together: pd_ln, pdcflbrt, pdlbr.
Using the following query, I get this data:
SELECT pdcflbrt.lbrcod, pdcflbrt.lbrrat, pd_ln.prdnum, pdcflbrt.begeffdat
FROM velocity.dbo.pd_ln pd_ln, velocity.dbo.pdcflbrt pdcflbrt, velocity.dbo.pdlbr pdlbr
WHERE pdlbr.lbrrattky = pdcflbrt.lbrrattky AND pd_ln.pd_ln_tky = pdlbr.pd_ln_tky
+--------------+--------------+-----------+------------------+
| lbrcod | lbrrat | prdnum | begeffdat |
+--------------+--------------+-----------+------------------+
| FC Braselton | 0.11 | 00236 | 7/15/2012 0:00 |
| FC Braselton | 0.11 | 00236 | 7/15/2012 0:00 |
| FC Braselton | 0.1 | 00236 | 12/10/2012 0:00 |
| Sizing | 0.21 | 03103 | 8/28/2015 0:00 |
| Sizing | 0.2 | 03103 | 10/13/2011 0:00 |
+--------------+--------------+-----------+------------------+
How do I query to get the last begeffdat of each prdnum?
Magood's answer may work in this situation. However, if there was a unique identifier for each edit that you were selecting, it wouldn't work. As far as I know, you would have to get involved with row_number() like so:
SELECT s2.lbrcod, s2.lbrrat, s2.prdnum, s2.begeffdat from
(SELECT pdcflbrt.lbrcod
, pdcflbrt.lbrrat
, pd_ln.prdnum
, pdcflbrt.begeffdat
, row_number() over (partition by pd_ln.prdnum order by pdcflbrt.begeffdat desc) as RN
FROM velocity.dbo.pd_ln pd_ln, velocity.dbo.pdcflbrt pdcflbrt, velocity.dbo.pdlbr pdlbr
WHERE pdlbr.lbrrattky = pdcflbrt.lbrrattky AND pd_ln.pd_ln_tky = pdlbr.pd_ln_tky) s2
where s2.rn = 1
This returns only the newest date. The inner portion is the same query with the row_number() function added; the numbering starts over for each distinct prdnum and orders the rows by date, newest first. The outer portion selects only row 1 (that's the last where), which is the newest date.
EDIT: Alternatively, if you only want the OLDEST update, you could change the desc in the main query's select statement to say asc.
-- Only for name and latest date
select lbrcod, max(begeffdat) begeffdat from #table
group by lbrcod
-- For all columns
select * from (
select *, row_number() over (partition by prdnum order by begeffdat desc) rowNum from #table
) data
where rowNum = 1

MS-SQL get difference value

I have this query that calculates current gallons value from all fuel tanks in my database.
SELECT DISTINCT y.TankNumber as TankNumber
, y.Gallons as Gallons
, y.timeUpdated
, y.FuelType as FuelType
FROM (
SELECT TankNumber, max(timeUpdated) as maxdate
FROM someTable
GROUP BY TankNumber) as x
JOIN someTable y
ON x.TankNumber = y.TankNumber
AND x.maxdate = y.timeUpdated
ORDER BY y.TankNumber
Data gets dumped into my database automatically at any time, depending on fuel usage. The query above gives me only the current gallons value for each fuel tank:
TankNumber | Gallons | timeUpdated | FuelType
1 | 14 | 2012-10-22 04:16 | 89
2 | 8 | 2012-10-22 04:14 | 93
and so on.
My problem is that I am trying to add another output value to my page that shows how much fuel was used since the last update. It would look something like this:
TankNumber | Gallons | timeUpdated | FuelType | GallonsUsed
1 | 14 | 2012-10-22 04:16 | 89 | 5
2 | 8 | 2012-10-22 04:14 | 93 | -11
Unfortunately my SQL experience is not as solid for this type of problem and I have spent about two days trying to figure out or google something close. So, any help will be greatly appreciated.
Assuming you're using MS SQL 2005 or later, you can use the ROW_NUMBER function:
WITH cteOrderedUpdates As
(
SELECT
TankNumber,
Gallons,
TimeUpdated,
FuelType,
ROW_NUMBER() OVER
(
PARTITION BY
TankNumber
ORDER BY
TimeUpdated DESC
) As RowNumber
FROM
someTable
)
SELECT
x.TankNumber,
x.Gallons,
x.TimeUpdated,
x.FuelType,
x.Gallons - IsNull(y.Gallons, 0) As GallonsUsed
FROM
cteOrderedUpdates As x
LEFT JOIN cteOrderedUpdates As y
ON x.TankNumber = y.TankNumber
And x.RowNumber = y.RowNumber - 1
WHERE
x.RowNumber = 1
ORDER BY
x.TankNumber
;
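An alternative to the self-join, for engines that have the LAG window function (SQL Server 2012 and later, among others), is to read the previous reading directly in the same pass. The sketch below runs that variant in Python's built-in sqlite3 module. The two latest readings come from the question; the earlier readings (9 and 19 gallons) are made-up values chosen so that the differences match the 5 and -11 in the desired output. One behavioural difference from the query above: a tank with only one reading gets NULL for GallonsUsed here, whereas the IsNull(..., 0) version reports the full Gallons value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE someTable "
    "(TankNumber INTEGER, Gallons INTEGER, timeUpdated TEXT, FuelType INTEGER)"
)
conn.executemany(
    "INSERT INTO someTable VALUES (?, ?, ?, ?)",
    [
        (1, 9, "2012-10-22 04:10", 89),   # hypothetical earlier reading
        (1, 14, "2012-10-22 04:16", 89),
        (2, 19, "2012-10-22 04:08", 93),  # hypothetical earlier reading
        (2, 8, "2012-10-22 04:14", 93),
    ],
)

# LAG fetches the previous reading for the same tank (ordered by time);
# ROW_NUMBER with DESC ordering marks the latest reading as row 1.
usage = conn.execute(
    """
    WITH ordered AS (
        SELECT TankNumber, Gallons, timeUpdated, FuelType,
               Gallons - LAG(Gallons) OVER (
                   PARTITION BY TankNumber ORDER BY timeUpdated
               ) AS GallonsUsed,
               ROW_NUMBER() OVER (
                   PARTITION BY TankNumber ORDER BY timeUpdated DESC
               ) AS RowNumber
        FROM someTable
    )
    SELECT TankNumber, Gallons, timeUpdated, FuelType, GallonsUsed
    FROM ordered
    WHERE RowNumber = 1
    ORDER BY TankNumber
    """
).fetchall()

for row in usage:
    print(row)
# (1, 14, '2012-10-22 04:16', 89, 5)
# (2, 8, '2012-10-22 04:14', 93, -11)
```

As in the original answer, the value is the latest reading minus the previous one, so a refill shows up as a positive number and consumption as a negative one.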