SQL Find difference between rows with unique IDs

I have a table where each id appears in two rows, one with a start time and one with an end time.
For example:
ID | Time
12345 | 12-12-18 12:00
12345 | 12-12-18 12:12
54321 | 12-12-18 11:30
54321 | 12-12-18 11:35
How would I go about getting the following output (the difference in minutes)?
ID | Minutes
12345 | 12
54321 | 5
I'm guessing LAG or OVER?

You seem to want an aggregation:
select id,
datediff(minute, min(time), max(time)) as diff_minutes
from t
group by id;

You can also self-join, as Ezlo wrote:
select a.ID, datediff(minute, min(a.time1), max(x.time1)) as maxtime
from #test a
inner join #test x on a.id = x.id
group by a.id
But it seems like Gordon's is much simpler :)
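For completeness, here is a window-function sketch along the lines of the LAG/OVER guess (assuming SQL Server syntax, and that the earlier row for each id is the start time):
-- lag() pairs each row with the previous time for the same id
select id,
       datediff(minute, lag(time) over (partition by id order by time), time) as diff_minutes
from t;
-- the first row of each id gets NULL, so you would still need to filter or aggregate,
-- which is why the plain GROUP BY above is the simpler choice here.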

Related

How to pull data based on current and last update?

Our data table looks like this:
Machine Name | Lot Number | Qty | Load TxnDate | Unload TxnDate
M123         | ABC        | 500 | 10/1/2020    | 10/2/2020
M741         | DEF        | 325 | 10/1/2020    |
M123         | ZZZ        | 100 | 10/5/2020    | 10/7/2020
M951         | AAA        | 550 | 10/5/2020    | 10/9/2020
M123         | BBB        | 550 | 10/7/2020    |
I need to create an SQL query that shows the currently loaded Lot number - Machines with no Unload TxnDate - and the last loaded Lot number based on the unload TxnDate.
So in the example, when I run a query for M123, the result will show:
Machine Name | Lot Number | Qty | Load TxnDate | Unload TxnDate
M123         | ZZZ        | 100 | 10/5/2020    | 10/7/2020
M123         | BBB        | 550 | 10/7/2020    |
As you can see, although the machine has 3 records, the results only show the currently loaded and the last loaded lot. Is there any way to replicate this? The Machine Name is dynamic, so my user can enter a Machine Name and see the results for that machine based on the missing Unload TxnDate and the last Unload TxnDate.
You seem to want the last two rows. That would be something like this:
select t.*
from t
where machine_name = 'M123'
order by load_txn_date desc
fetch first 2 rows only;
Note: not all databases support the fetch first clause. Some spell it limit, or select top, or even something else.
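For example, roughly equivalent forms look like this (a sketch; the exact syntax depends on your database and version):
-- MySQL / Postgres / SQLite
select *
from t
where machine_name = 'M123'
order by load_txn_date desc
limit 2;

-- SQL Server
select top (2) *
from t
where machine_name = 'M123'
order by load_txn_date desc;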
If you want two rows per machine, one option uses window functions:
select *
from (
select t.*,
row_number() over(
partition by machine_name, (case when unload_txn_date is null then 0 else 1 end)
order by coalesce(unload_txn_date, load_txn_date) desc
) rn
from mytable t
) t
where rn = 1
The idea is to separate the rows into those that have an unload date and those that do not. We can then bring back the top record per group.
For your sample data, this returns:
Machine_Name | Lot_Number | Qty | Load_Txn_Date | Unload_Txn_Date | rn
:----------- | :--------- | --: | :------------ | :-------------- | -:
M123 | BBB | 550 | 2020-10-07 | null | 1
M123 | ZZZ | 100 | 2020-10-05 | 2020-10-07 | 1
M741 | DEF | 325 | 2020-10-01 | null | 1
M951 | AAA | 550 | 2020-10-05 | 2020-10-09 | 1
You might use the following query, presuming that you're on a database that supports window (or analytic) functions:
WITH t AS
(
SELECT COALESCE(Unload_Txn_Date -
LAG(Load_Txn_Date) OVER
(PARTITION BY Machine_Name ORDER BY Load_Txn_Date DESC),0) AS lg,
MAX(CASE WHEN Unload_Txn_Date IS NULL THEN Load_Txn_Date END) OVER
(PARTITION BY Machine_Name) AS mx,
t.*
FROM tab t
), t2 AS
(
SELECT DENSE_RANK() OVER (ORDER BY mx DESC NULLS LAST) AS dr, t.*
FROM t
WHERE mx IS NOT NULL
)
SELECT Machine_Name,Lot_Number,Qty,Load_Txn_Date,Unload_Txn_Date
FROM t2
WHERE dr = 1 AND lg = 0
ORDER BY Load_Txn_Date
Here, if a row's Unload_Txn_Date is equal to the following row's Load_Txn_Date, it is taken to mean the job continued without interruption (lg = 0), while mx determines, per machine, the load date of the row that has no unload date (the currently loaded lot). The result set is then returned by filtering on the values derived from the window functions in the preceding CTEs.

SQL: Calculate number of days since last success

The following table represents the results of a given test.
Every result for the same test is either a pass (error_id = 0) or a fail (error_id <> 0).
I need help writing a query that returns the number of runs since the last good run (error_id = 0) and the date of that run.
| Date | test_id | error_id |
-----------------------------------
| 2019-12-20 | 123 | 23
| 2019-12-19 | 123 | 23
| 2019-12-17 | 123 | 22
| 2019-12-18 | 123 | 0
| 2019-12-16 | 123 | 11
| 2019-12-15 | 123 | 11
| 2019-12-13 | 123 | 11
| 2019-12-12 | 123 | 0
So the result for this example should be:
| 2019-12-18 | 123 | 4
as test 123 passed on 2019-12-18 and that was 4 runs ago.
I have a query to determine whether a given run is an error or not, but I have trouble applying the appropriate window function to it to get the wanted result:
select test_id, Date, error_id, (CASE WHEN error_id <> 0 THEN 1 ELSE 0 END) as is_error
from testresults
You can generate a row number, in reverse order from the sorting of the query itself:
SELECT test_date, test_id, error_code,
(row_number() OVER (ORDER BY test_date asc) - 1) as runs_since_last_pass
FROM tests
WHERE test_date >= (SELECT MAX(test_date) FROM tests WHERE error_code=0)
ORDER BY test_date DESC
LIMIT 1;
Note that this will run into issues if test_date is not unique. Better use a timestamp (precise to the millisecond) instead of a date.
Here's a DBFiddle: https://www.db-fiddle.com/f/8gSHVcXMztuRiFcL8zLeEx/0
If there's more than one test_id, you'll want to add a PARTITION BY clause to the row number function, and the subquery would become a bit more complex. It may be more efficient to come up with a way to do this by a JOIN instead of a subquery, but it would be more cognitively complex.
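For instance, a sketch of that multi-test_id variant (untested; it assumes the same tests(test_id, test_date, error_code) columns used in the fiddle above):
-- partition the row numbering by test_id, and correlate the subquery per test
SELECT test_id, test_date, runs_since_last_pass
FROM (
  SELECT t.test_id, t.test_date,
         row_number() OVER (PARTITION BY t.test_id ORDER BY t.test_date ASC) - 1 AS runs_since_last_pass,
         row_number() OVER (PARTITION BY t.test_id ORDER BY t.test_date DESC) AS rn_desc
  FROM tests t
  WHERE t.test_date >= (SELECT MAX(t2.test_date)
                        FROM tests t2
                        WHERE t2.error_code = 0
                          AND t2.test_id = t.test_id)
) x
WHERE rn_desc = 1;  -- keep the most recent run per test, mirroring ORDER BY ... DESC LIMIT 1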
I think you just want aggregation and some filtering:
select test_id, count(*),
max(date) filter (where error_id = 0) as last_success_date
from t
where date >= (select max(t2.date) from t t2 where t2.error_id = 0)
group by test_id;
You have to use the maximum date of the good runs for every test_id in your query. You can try this query:
select tr2.date_error, tr.test_id, count(tr.error_id) from
testresults tr inner join (select max(date_error) as date_error, test_id
from testresults where error_id=0 group by test_id) tr2 on
tr.test_id=tr2.test_id and tr.date_error >= tr2.date_error
group by tr2.date_error, tr.test_id
This should do the trick:
select count(*) from table t,
(select max(date) date from table where error_id = 0) good
where t.date >= good.date
Basically you are counting the rows that have a date >= the date of the last success.
Please note: if you need the number of days, it is a completely different query:
select now()::date - max(test_date) last_valid from tests
where error_code = 0;

Calculate time span over a number of records

I have a table that has the following schema:
ID | FirstName | Surname | TransmissionID | CaptureDateTime
1 | Billy | Goat | ABCDEF | 2018-09-20 13:45:01.098
2 | Jonny | Cash | ABCDEF | 2018-09-20 13:45:01.108
3 | Sally | Sue | ABCDEF | 2018-09-20 13:45:01.298
4 | Jermaine | Cole | PQRSTU | 2018-09-20 13:45:01.398
5 | Mike | Smith | PQRSTU | 2018-09-20 13:45:01.498
There are well over 70,000 records and they store logs of transmissions to a web-service. What I'd like to know is how I would go about writing a script that would select the distinct TransmissionID values and also show the timespan between the earliest CaptureDateTime record and the latest record. Essentially I'd like to see the rate at which the web-service is reading & writing records.
Is it even possible to do so in a single SELECT statement or should I just create a stored procedure or report in code? I don't know where to start aside from SELECT DISTINCT TransmissionID for this sort of query.
Here's what I have so far (I'm stuck on the time calculation)
SELECT DISTINCT [TransmissionID],
COUNT(*) as 'Number of records'
FROM [log_table]
GROUP BY [TransmissionID]
HAVING COUNT(*) > 1
Not sure how to get the difference between the first and last record with the same TransmissionID. I would like to get a result set like:
TransmissionID | TimeToCompletion | Number of records |
ABCDEF | 2.001 | 5000 |
Simply GROUP BY and use the MIN / MAX functions to find the min/max date in each group and subtract them:
SELECT
TransmissionID,
COUNT(*),
DATEDIFF(second, MIN(CaptureDateTime), MAX(CaptureDateTime))
FROM yourdata
GROUP BY TransmissionID
HAVING COUNT(*) > 1
Use min and max to calculate timespan
SELECT [TransmissionID],
COUNT(*) as 'Number of records',datediff(s,min(CaptureDateTime),max(CaptureDateTime)) as timespan
FROM [log_table]
GROUP BY [TransmissionID]
HAVING COUNT(*) > 1
A method that returns the average time for all transmissionids, even those with only 1 record:
SELECT TransmissionID,
COUNT(*),
DATEDIFF(second, MIN(CaptureDateTime), MAX(CaptureDateTime)) * 1.0 / NULLIF(COUNT(*) - 1, 0)
FROM yourdata
GROUP BY TransmissionID;
Note that you may not actually want the maximum of the capture date for a given transmissionId. You might want the overall maximum in the table -- so you can consider the final period after the most recent record.
If so, this looks like:
SELECT TransmissionID,
COUNT(*),
DATEDIFF(second,
MIN(CaptureDateTime),
MAX(MAX(CaptureDateTime)) OVER ()
) * 1.0 / COUNT(*)
FROM yourdata
GROUP BY TransmissionID;

Query Max/Min value shows original values

I'm taking the max value and min value of a table composed of Date, Time, and Temp. For example:
Date | Time | Temp
-------------------------------------
1/1/2014 | 09:00:00 AM | 100
1/1/2014 | 09:01:00 AM | 110
1/1/2014 | 09:02:00 AM | 120
1/1/2014 | 09:03:00 AM | 111
... and so on
I've tried to just use the Min() and Max() functions, but the output is the same data as the original table. See the SQL code:
SELECT Table1.Date, Table1.Time, Min(Table1.Temp) AS MinLoad
FROM Table1
GROUP BY Table1.Date, Table1.Time;
I tried using the DMin() and DMax() functions, but instead of getting a value I got Null. I tried the syntax
DMin("[Temp]", "[Table1]", [Time] Between #09:00# And #15:00#)
I'm fairly new to Access so any help would be appreciated.
Thanks!
Figured it out:
SELECT Date.DateLog, Min(Table1.Data) AS MinOfData
FROM [Date] INNER JOIN Table1 ON Date.DateLog = Table1.Date
GROUP BY Date.DateLog;
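For reference, if the goal is just the daily minimum and maximum straight from Table1, a minimal sketch (assuming the Date and Temp columns from the question) is to leave Time out of the GROUP BY so each date forms a single group:
SELECT Table1.Date, Min(Table1.Temp) AS MinOfTemp, Max(Table1.Temp) AS MaxOfTemp
FROM Table1
GROUP BY Table1.Date;
As for DMin()/DMax(), the criteria argument is a string, so the attempt above would normally be quoted, e.g. DMin("[Temp]", "Table1", "[Time] Between #09:00:00# And #15:00:00#").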

SQL Where Query to Return Distinct Values

I have an app with a built-in initial SELECT that only allows me to enter the WHERE section. I have rows with duplicate values. I'm trying to get a list with just one record for each distinct value but am unsure how to get the statement to work. I've found one that almost does the trick, but it doesn't give me any rows that had a dup, I assume due to the =, so I just need a way to get one row for each value that matches my WHERE criteria. Examples below.
Initial Data Set
Date | Name | ANI | CallIndex | Duration
---------------------------------------------------------
2/2/2015 | John | 5555051000 | 00000.0001 | 60
2/2/2015 | John | | 00000.0001 | 70
3/1/2015 | Jim | 5555051001 | 00000.0012 | 80
3/4/2015 | Susan | | 00000.0022 | 90
3/4/2015 | Susan | 5555051002 | 00000.0022 | 30
4/10/2015 | April | 5555051003 | 00000.0030 | 35
4/11/2015 | Leon | 5555051004 | 00000.0035 | 10
4/15/2015 | Jane | 5555051005 | 00000.0050 | 20
4/15/2015 | Jane | 5555051005 | 00000.0050 | 60
4/15/2015 | Kevin | 5555051006 | 00000.0061 | 35
What I Want the Query to Return
Date | Name | ANI | CallIndex | Duration
---------------------------------------------------------
2/2/2015 | John | 5555051000 | 00000.0001 | 60
3/1/2015 | Jim | 5555051001 | 00000.0012 | 80
3/4/2015 | Susan | 5555051002 | 00000.0022 | 30
4/10/2015 | April | 5555051003 | 00000.0030 | 35
4/11/2015 | Leon | 5555051004 | 00000.0035 | 10
4/15/2015 | Jane | 5555051005 | 00000.0050 | 20
4/15/2015 | Kevin | 5555051006 | 00000.0061 | 35
Here is what I was able to get, but when I run it I don't get the rows that did have duplicate callindex values. Duration doesn't matter and the values never match up, so if it helps to use that as a filter that would be fine. I've added mock data to assist.
use Database
SELECT * FROM table
WHERE Date between '4/15/15 00:00' and '4/15/15 23:59'
and callindex in
(SELECT callindex
FROM table
GROUP BY callindex
HAVING COUNT(callindex) = 1)
Any help would be greatly appreciated.
OK, with the assistance of everyone here I was able to get the query to work perfectly within SQL. That said, apparently the app I'm trying this on has a built-in character limit and the query below is too long. This is the query I have to use given the restrictions, and I have to be able to search both IDs at the same time because some records get stamped with one or the other, rarely both. I'm hoping someone might be able to help me shorten it?
use Database
select * from tblCall
WHERE
flddate between '4/15/15 00:00' and '4/15/15 23:59'
and fldAgentLoginID='1234'
and fldcalldir='incoming'
and fldcalltype='external'
and EXISTS (SELECT * FROM (SELECT MAX(fldCallName) AS fldCallName, fldCallID FROM tblCall GROUP BY fldCallID) derv WHERE tblCall.fldCallName = derv.fldCallName AND tblCall.fldCallID = derv.fldCallID)
or
flddate between '4/15/15 00:00' and '4/15/15 23:59'
and fldPhoneLoginID='56789'
and fldcalldir='incoming'
and fldcalltype='external'
and EXISTS (SELECT * FROM (SELECT MAX(fldCallName) AS fldCallName, fldCallID FROM tblCall GROUP BY fldCallID) derv WHERE tblCall.fldCallName = derv.fldCallName AND tblCall.fldCallID = derv.fldCallID)
If the constraint is that we can only add to the WHERE clause, I don't think it's possible, due to there being 2 absolutely identical rows:
4/15/2015 | Jane | 5555051005 | 00000.0050
4/15/2015 | Jane | 5555051005 | 00000.0050
Is it possible that you can add HAVING or GROUP BY to the WHERE? Or possibly UNION the SELECT to another SELECT statement? That may open up some additional possibilities.
Maybe with a union:
SELECT Date, Name, ANI, CallIndex, MIN(Duration) AS Duration
FROM table
GROUP BY Date, Name, ANI, CallIndex
HAVING ( COUNT(*) > 1 )
UNION
SELECT Date, Name, ANI, CallIndex, Duration
FROM table
WHERE Name not in (SELECT Name from table
GROUP BY Date, Name, ANI, CallIndex
HAVING ( COUNT(*) > 1 ))
From your sample, it seems like you could just exclude rows in which there was no value in the ANI column. If that is the case you could simply do:
use Database
SELECT * FROM table
WHERE Date between '4/15/15 00:00' and '4/15/15 23:59'
and ANI is not null
If this doesn't work for you, let me know and I can see what else I can do.
Edit:
You've made it sound like the CallIndex combined with the Duration is a unique value. That seems somewhat doubtful to me, but if that is the case you could do something like this:
use Database
SELECT * FROM table
WHERE Date between '4/15/15 00:00' and '4/15/15 23:59'
and cast(callindex as varchar(80))+'-'+cast(duration as varchar(80)) in
(SELECT cast(callindex as varchar(80))+'-'+cast(min(duration) as varchar(80))
FROM table
GROUP BY callindex)
There are two keywords you can use to get non-duplicated data, either DISTINCT or GROUP BY. In this case, I would use a GROUP BY, but you should read up on both.
This query groups all of the records by CallIndex and takes the MAX value for each of the other columns and should give you the results you want:
SELECT MAX(Date) AS Date, MAX(Name) AS Name, MAX(ANI) AS ANI, CallIndex
FROM table
GROUP BY CallIndex
EDIT
Since you can't use GROUP BY directly, but you can have any SQL in the WHERE clause, you can do:
SELECT *
FROM table
WHERE EXISTS
(
SELECT *
FROM
(
SELECT MAX(Date) AS Date, MAX(Name) AS Name, MAX(ANI) AS ANI, CallIndex
FROM table
GROUP BY CallIndex
) derv
WHERE table.Date = derv.Date
AND table.Name = derv.Name
AND table.ANI = derv.ANI
AND table.CallIndex = derv.CallIndex
)
This selects all rows from the table where there exists a matching row from the GROUP BY.
It won't be perfect, if any two rows match exactly, you'll still have duplicates, but that's the best you'll get with your restriction.
In your data, why not just do this?
SELECT *
FROM table
WHERE Date >= '2015-04-15' and Date < '2015-04-16'
and ani is not null;
If the blank values are only a coincidence, then you have a problem just using a where clause. If the results are full duplicates (no column has a different value), then you probably cannot do what you want with just a where clause -- unless you are using SQLite, Oracle, or Postgres.
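Those databases expose a physical row identifier that can be referenced entirely from the WHERE clause. A rough sketch for Postgres (ctid; Oracle and SQLite spell it rowid), using mytable as a stand-in table name, that keeps one arbitrary row per CallIndex:
SELECT *
FROM mytable
WHERE Date >= '2015-04-15' and Date < '2015-04-16'
-- keep only the row with the lowest ctid among rows sharing a CallIndex
and not exists (SELECT 1
                FROM mytable t2
                WHERE t2.CallIndex = mytable.CallIndex
                and t2.ctid < mytable.ctid);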