SQL Query data issues - sql

I have the following data:
ID Date interval interval_date tot_activity non-activity
22190 2011-09-27 00:00:00 1000 2011-09-27 10:00:00.000 265 15
I have another table with this data:
Date ID Start END sched_non_activity non_activity
10/3/2011 12:00:00 AM HBLV-22267 10/3/2011 2:02:00 PM 10/3/2011 2:11:00 PM 540
Now, in the second table's non_activity field, I would like this to be the value from the first table. However, I need to capture the tot_activity - non_activity where the intervals(in 15 min increments) from the first table, fall in the same time frame as the start and end of the second table.
I have the following so far:
SELECT 1.ID, 1.Date, 1.interval, 1.interval_date, 1.tot_activity, 1.non_activity,
1.tot_activity - 1.non_activity AS non_activity
FROM table1 AS 1 INNER JOIN
LIST AS L ON 1.ID = L.ID INNER JOIN
table2 AS 2 ON 1.Date = 2.Date AND L.ID = Right(2.ID,5)
Where 1.interval_date >= 2.Start AND 1.interval_date < 2.End
ORDER BY 1.ID, 1.interval_date
With this, I can already see that I will miss cases where a start from table 2 is at, say, 15:50, which means I need to capture the 15:45 interval.
Is there any way of doing this through queries, or should I be using variables and doing the check per interval? Any help at all would be greatly appreciated.

I think you are asking too much from a query here.
What I would do is treat the two tables as lists ordered by timestamp and solve the problem programmatically (i.e. not with a single query).
For example, create a function that traverses the first table in 15-minute increments and finds the best match in the second table (I am guessing this is what you are trying to do). Implement your function to return the same result set as your query above, or store it in a temporary table, then select from the result set. T-SQL is your friend :)
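A rough sketch of that matching logic, written in Python rather than T-SQL just to show the idea. The helper names `floor_to_interval` and `overlapping_intervals` are made up for illustration, and the 15-minute width is taken from the question. The point is that an interval should match when its whole window overlaps the start/end range, not only when its start time falls inside it:

```python
from datetime import datetime, timedelta

INTERVAL = timedelta(minutes=15)  # assumed interval width, per the question


def floor_to_interval(ts):
    """Round a timestamp down to the start of its 15-minute interval."""
    return ts.replace(minute=ts.minute - ts.minute % 15, second=0, microsecond=0)


def overlapping_intervals(interval_starts, start, end):
    """Return interval starts whose [s, s + 15 min) window overlaps [start, end)."""
    return [s for s in interval_starts if s < end and s + INTERVAL > start]


# Hypothetical data: three intervals and a range starting at 15:50.
starts = [
    datetime(2011, 10, 3, 15, 30),
    datetime(2011, 10, 3, 15, 45),
    datetime(2011, 10, 3, 16, 0),
]
hits = overlapping_intervals(
    starts, datetime(2011, 10, 3, 15, 50), datetime(2011, 10, 3, 16, 5)
)
```

With a start of 15:50, the 15:45 interval is picked up because its window runs until 16:00. The same overlap test can be written as a join condition if you would rather stay in a query.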

I'm having a tough time understanding your issue, but you might have better luck with the DATEDIFF function:
DATEDIFF(SECOND, 1.interval_date, 2.Start) >= 0 AND DATEDIFF(SECOND, 1.interval_date, 2.End) <= 0
I apologize if I'm not catching your drift. If I'm missing something, could you try to clarify a little bit?

Related

Oracle DB, for each row delete any similar timestamps in the table

I have a program recording timestamps when users connect to a program. However sometimes it likes to record the same connections multiple times, so I'll have duplicate entries that are half a second apart or less. I need an oracle query that can essentially look at the timestamp in each row and delete rows with a timestamp that is within 5 seconds of it. I'm not sure what the best way to approach this problem is, but due to technical limitations I'm trying to avoid scripting it. Any advice would be greatly appreciated.
Sample data looks something like this.
User, Timestamp
user1, 20-NOV-20 05.09.09.650146000 PM
user1, 20-NOV-20 05.09.11.764345432 PM
user2, 23-NOV-20 02.51.31.765432432 PM
user2, 23-NOV-20 02.51.32.355684235 PM
and I would want the query to trim it down to this.
user1, 20-NOV-20 05.09.09.650146000 PM
user2, 23-NOV-20 02.51.31.765432432 PM
If there were even more rows attributed to the same user close together, it would get rid of them all. I imagine it would look at each row and run a query to delete from UserSessions where timestamp <= (timestamp of this row) + 5 seconds and timestamp >= (timestamp of this row) - 5 seconds, but not where timestamp = (timestamp of this row).
I have no idea what the syntax for this is or how to do the query per row.
Sample data and expected results would be helpful, particularly for various corner cases (several rows each within a few seconds of each other for example). Assuming you want to keep the latest row, something like this would work
delete from your_table earlier
where exists( select 1
from your_table later
where earlier.user_id = later.user_id
and later.login_date > earlier.login_date
and later.login_date <= earlier.login_date + interval '5' second )
Here's a liveSQL link using your sample data that shows the query producing the results you expect.
Hmmm . . . You seem to want to keep only rows where there is not a timestamp within 5 seconds before. I think that would be:
delete from t
where exists (select 1
from t t2
where t2.user_id = t.user_id and
t2.timestamp < t.timestamp and
t2.timestamp >= t.timestamp - interval '5' second
);
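To see which rows a delete like this leaves behind, here is a small Python sketch of the same rule (a row survives only if the same user has no earlier row within 5 seconds). The function name `keep_first_of_each_burst` is hypothetical, and the sample rows are the ones from the question, truncated to microseconds:

```python
from datetime import datetime, timedelta


def keep_first_of_each_burst(rows, window=timedelta(seconds=5)):
    """Keep rows that have no earlier row for the same user within `window`."""
    kept = []
    for user, ts in rows:
        # Mirrors the EXISTS subquery: same user, strictly earlier, within 5 s.
        has_recent_earlier = any(
            u == user and ts - window <= t < ts for u, t in rows
        )
        if not has_recent_earlier:
            kept.append((user, ts))
    return kept


rows = [
    ("user1", datetime(2020, 11, 20, 17, 9, 9, 650146)),
    ("user1", datetime(2020, 11, 20, 17, 9, 11, 764345)),
    ("user2", datetime(2020, 11, 23, 14, 51, 31, 765432)),
    ("user2", datetime(2020, 11, 23, 14, 51, 32, 355684)),
]
survivors = keep_first_of_each_burst(rows)
```

Note that every row is checked against the original data, so in a chain of rows spaced a few seconds apart only the first one survives.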

Giving the wrong records when used datetime parameter in MS Access Query

I am working with an MS Access 2007 database.
I am trying to write a query on a DateTime column. I want to get the records between 14 December and 16 December, so I wrote the query below.
SELECT * FROM Expense WHERE CreatedDate > #14-Dec-15# and CreatedDate < #16-Dec-15#
(I have to use these two dates in the query.)
But it is returning records whose CreatedDate is 14 December...
What's wrong with the query?
As @vkp mentions in the comments, a date also has a time part. If the time is not specified, it defaults to midnight (00:00:00). Since 14-dec-2015 6:46:56 is after 14-dec-2015 00:00:00, it is included in the result set. You can use >= 15-dec-15 to get around this, as it will also include records from 15-dec-2015. The same goes for the end date.
It seems you want only records from Dec 15th regardless of the time of day stored in CreatedDate. If so, this query should give you what you want with excellent performance assuming an index on CreatedDate ...
SELECT *
FROM Expense
WHERE CreatedDate >= #2015-12-15# and CreatedDate < #2015-12-16#;
Beware of applying functions to your target field in the WHERE criterion ... such as CDATE(INT(CreatedDate)). Although logically correct, it would force a full table scan. That might not be a problem if your Expense table contains only a few rows. But for a huge table, you really should try to avoid a full table scan.
You must include the time in your thinking:
EDIT: I wrote this with the misunderstanding, that you wanted to
include data rows from 14th to 16th of Dec (three full days).
If you'd write <#17-Dec-15# it would be the full 16th. Or you'd have to write <=#16-Dec-15 23:59:59#.
A DateTime on the 16th of December with a TimePart of let's say 12:30 is bigger than #16-Dec-15#...
Just some background: In MS Access a DateTime is stored as a day number plus a fractional part for the time. 0.5 is midday, 0.25 is 6 in the morning...
Comparing DateTime values means to compare Double-values in reality.
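To make the day-number-plus-fraction idea concrete, here is a small Python sketch that computes the serial value Access would store. It assumes the usual Access/OLE Automation day zero of 30 December 1899; `to_serial` is a made-up helper name:

```python
from datetime import datetime

ACCESS_EPOCH = datetime(1899, 12, 30)  # day 0 of the Access/OLE date serial


def to_serial(dt):
    """Return whole days since the epoch plus the time of day as a fraction."""
    delta = dt - ACCESS_EPOCH
    return delta.days + delta.seconds / 86400


midnight = to_serial(datetime(2015, 12, 16))         # what #16-Dec-15# means
half_past_noon = to_serial(datetime(2015, 12, 16, 12, 30))

# A record stamped 12:30 on the 16th is larger than the bare date literal,
# so "CreatedDate <= #16-Dec-15#" would not include it.
is_after_literal = half_past_noon > midnight
```

This is why a bare date literal as an upper bound cuts off everything after midnight on that day.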
Just add one day to your end date and exclude this:
SELECT * FROM Expense WHERE CreatedDate >= #2015/12/14# AND CreatedDate < #2015/12/17#
Thanks A Lot guys for your help...
I finally ended with the solution given by Darren Bartrup-Cook and Gustav ....
My previous query was....
SELECT * FROM Expense WHERE CreatedDate > #14-Dec-15# and CreatedDate < #16-Dec-15#
And the New working query is...
SELECT * FROM Expense WHERE CDATE(INT(CreatedDate)) > #14-Dec-15# and CDATE(INT(CreatedDate)) < #16-Dec-15#

Possible to calculate iterated count of timestamps relative to one another?

This question is a bit complicated but to make it as simple as possible:
I have a list of timestamps (it is in the millions but let's say for simplicity sake it is much smaller):
order_times
-----------
2014-10-11 15:00:00
2014-10-11 15:02:00
2014-10-11 15:03:31
2014-10-11 15:07:00
2014-10-11 16:00:00
2014-10-11 16:04:00
I am trying to build a query (in PostgreSQL) that would let me determine how many times an order_time occurs within 10 minutes of the 2 order_times prior to it (and no more).
In the sample data above:
first timestamp is considered 0 as there were no orders before it
second timestamp is considered 0: it was within 10 minutes of the one prior, but there was only 1 order before it
third timestamp is considered 1 because there were at least 2 orders within the 10 minutes before it
(and so on)...
I hope this is clear!
You don't need to look at the first previous, just the one 2 prior to each. If that is within 10 minutes, then the one after it will be also.
Best way is to get the data that is important to you into a single row, so you can do set operations on it. For that, use the windowing function ROW_NUMBER() and a self join. This is the MS SQL way of doing what you want.
WITH T1 AS (
    SELECT ID, Order_Time,
           ROW_NUMBER() OVER (ORDER BY Order_Time) AS RowNumber
    FROM myTest
)
SELECT T1.ID, T1.Order_Time, T2.ID AS CompareID, T2.Order_Time AS CompareTime
FROM T1
LEFT OUTER JOIN T1 AS T2 ON T1.RowNumber - 2 = T2.RowNumber
WHERE DATEDIFF(n, T2.Order_Time, T1.Order_Time) <= 10
First we create a query that has the row numbers, then use it as an inline table to do a self join to build a row that contains each order, and the one that happened 2 orders prior to it. Then just do a simple date comparison to select out the rows you want.
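The same "compare each row with the one two rows before it" logic can be checked outside the database. This is only a Python sketch of the idea, using the sample timestamps from the question; the function name `flag_third_in_window` is made up:

```python
from datetime import datetime, timedelta


def flag_third_in_window(order_times, window=timedelta(minutes=10)):
    """Flag each order with 1 if the order two places before it is within `window`."""
    times = sorted(order_times)
    flags = []
    for i, t in enumerate(times):
        # The first two orders can never have two predecessors.
        flags.append(1 if i >= 2 and t - times[i - 2] <= window else 0)
    return flags


orders = [
    datetime(2014, 10, 11, 15, 0, 0),
    datetime(2014, 10, 11, 15, 2, 0),
    datetime(2014, 10, 11, 15, 3, 31),
    datetime(2014, 10, 11, 15, 7, 0),
    datetime(2014, 10, 11, 16, 0, 0),
    datetime(2014, 10, 11, 16, 4, 0),
]
flags = flag_third_in_window(orders)
total = sum(flags)
```

In the sample only the third and fourth orders have their second-previous order within 10 minutes.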

How do I count data from 2 different tables by date

I have 2 tables with no relations, both tables have different number of columns, but there are a few columns that are the same but hold different data. I was able to create a function or view of only the data I wanted, but when I try to count the data by filtering the date, I always get the wrong count in return. Let me explain by showing the 2 functions and what I try to do:
Function 1
ID - number from 1 to 8
data sent - YES or NO
Date - date value
Function 2
ID - number from 1 to 8
data sent - yes or no
date - date value
Upon running both separately, I get all the rows from the tables and everything looks good.
Then I try to add the following to each function:
select
count([data sent]), ID
from function1
Where (date between @date1 and @date2)
group by ID
The above statement works great and gives me the right result for each function.
Now I thought what if I want to add those 2 functions into one and get the count from both functions on 1 page.
So I created the following function:
Function 3
select
count(Function1.[data sent]) as Expr1,
Function1.id,
count(Function2.[data sent]) as Expr2,
Function1.date
from
Function1
LEFT OUTER JOIN
Function2 on Function1.id = Function2.id
Where
(Function1.date between @date1 and @date2)
group by
Function1.id
Upon running the above, I get the following table:
ID Expr1 Expr2
On both Expr1 and Expr2, I get results which I am not sure where they come from. I guess something is being multiplied by 100000 since one table holds almost 15000 rows and the other around 5000 rows.
What I would like to know first is whether it is possible at all to filter by date and count records from both tables at the same time. If anyone needs more information, please let me know and I will be glad to share and explain more.
Thank you
The LEFT OUTER JOIN is taking each row of the left table, finding ALL of the rows in the right table with the same id field, and creating that many rows in the result table. Since id isn't what we usually think of as an identity field (it looks more like a "deviceId" or something), you'll get lots of matches for each one. Repeat 15000 times and you get your combinatorial explosion.
Tip: To debug things like this, you can create sample tables with a tiny subset of the real data, say 10 rows from each, and run your query on them. You'll see the issue immediately.
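Here is what such a tiny experiment might look like, using Python's built-in sqlite3 module as a stand-in for your real database. The table names `f1` and `f2` and the rows are invented for the demo, and SQLite is only an assumption to keep the example self-contained:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE f1 (id INTEGER, data_sent TEXT);
    CREATE TABLE f2 (id INTEGER, data_sent TEXT);
    -- Three rows on the left and two on the right, all sharing id = 1.
    INSERT INTO f1 VALUES (1, 'YES'), (1, 'NO'), (1, 'YES');
    INSERT INTO f2 VALUES (1, 'YES'), (1, 'NO');
""")

# Counting after the join: every left row is paired with every right row.
joined = conn.execute("""
    SELECT COUNT(f1.data_sent), COUNT(f2.data_sent)
    FROM f1 LEFT OUTER JOIN f2 ON f1.id = f2.id
    GROUP BY f1.id
""").fetchone()

# Counting each table on its own gives the numbers you actually expect.
separate = (
    conn.execute("SELECT COUNT(*) FROM f1").fetchone()[0],
    conn.execute("SELECT COUNT(*) FROM f2").fetchone()[0],
)
```

Here `joined` comes back as (6, 6) while `separate` is (3, 2): both counts after the join are 3 x 2, which is the same multiplication you are seeing on the full tables.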
It's possible to filter by date. It's hard to recommend an actual solution without better understanding your phrase "I want to add those 2 functions into one and get the count from both functions on 1 page".
Why can't you create a temporary table for each function then join them together?
Maybe subqueries can help you to achieve what you want:
SELECT
ID = COALESCE(f1.ID, f2.ID),
Date = COALESCE(f1.Date, f2.Date),
f1.Expr1,
f2.Expr2
FROM (
SELECT
ID,
Date,
Expr1 = COUNT([data sent])
FROM Function1
WHERE Date BETWEEN @date1 AND @date2
GROUP BY
ID,
Date
) f1
FULL JOIN (
SELECT
ID,
Date,
Expr2 = COUNT([data sent])
FROM Function2
WHERE Date BETWEEN @date1 AND @date2
GROUP BY
ID,
Date
) f2
ON f1.ID = f2.ID AND f1.Date = f2.Date
This query also uses full (outer) join instead of left join, in case the right side of the join contains rows that have no match in the left side (and you want those rows).
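If it helps to see the shape of the result, the same "count each side first, then combine on (ID, Date)" idea looks roughly like this in Python. The helper names are hypothetical, the date filter is left out for brevity, and a missing side shows up as None, much like NULL in a full outer join:

```python
from collections import Counter


def count_by_key(rows):
    """Count rows per (id, date) pair, like GROUP BY ID, Date."""
    return Counter((row_id, day) for row_id, day in rows)


def full_outer_counts(rows1, rows2):
    """Combine the two per-key counts, keeping keys that appear on either side."""
    c1, c2 = count_by_key(rows1), count_by_key(rows2)
    return {key: (c1.get(key), c2.get(key)) for key in sorted(c1.keys() | c2.keys())}


# Invented sample rows: (id, date) per record.
function1_rows = [(1, "2015-01-01"), (1, "2015-01-01"), (2, "2015-01-01")]
function2_rows = [(1, "2015-01-01"), (3, "2015-01-02")]
combined = full_outer_counts(function1_rows, function2_rows)
```

Because each side is reduced to one row per key before combining, the counts can no longer multiply each other.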

T-SQL looping procedure

I have the following data:
ID Date interval interval_date tot_activity non-activity
22190 2011-09-27 00:00:00 1000 2011-09-27 10:00:00.000 265 15
I have another table with this data:
Date ID Start END sched_non_activity non_activity
10/3/2011 12:00:00 AM HBLV-22267 10/3/2011 2:02:00 PM 10/3/2011 2:11:00 PM 540
Now, in the second table's non_activity field, I would like this to be the value from the first table. However, I need to capture the tot_activity - non_activity where the intervals(in 15 min increments) from the first table, fall in the same time frame as the start and end of the second table.
I have tried setting variables and setting a loop where for each row it verifies the starttime by interval, but I have no idea how to return a variable with only one record, as I keep getting errors that my variable is getting too many results.
I have tried looking everywhere for tutorials and I can't find anything to help me out. Anyone have any pointers or tutorials on looping they could share?
You need to generate the interval end dates somehow; I'm assuming that there is always a record in the first table with a 15 minute interval record. In this case, an example would look like this:
;WITH Intervals
AS
(SELECT
Interval_date
,DATEADD(ms,899997,Interval_date) AS interval_end -- 15 minutes minus 3 ms
,nonactivity
FROM A)
--Select query for Validation
--SELECT
-- b.[Date]
-- ,b.ID
-- ,b.Start
-- ,b.sched_non_activity
-- ,i.nonactivity
--FROM B
--JOIN Intervals AS i
--ON b.Start >= i.Interval_date
--AND b.[END] <= i.interval_end
UPDATE B
SET non_activity = i.nonactivity
FROM B
JOIN Intervals AS i
ON b.Start >= i.Interval_date
AND b.[END] <= i.interval_end
Obviously, you might need to tweak this depending on the exact circumstances.