I read that the best (and most efficient) way to store a time is as a SMALLINT (in reality an INTEGER). Is that true? I'm building an application that uses an SQLite3 database. The database must stay small, and calculations with dates and times (times especially) must be as efficient as possible.
I have this table:
CREATE TABLE events (
id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
time_start SMALLINT NOT NULL ,
time_end SMALLINT NOT NULL,
message_code VARCHAR(64) NOT NULL,
FOREIGN KEY (message_code) REFERENCES messages (message_code),
CHECK ((time_start BETWEEN 0 AND 1440) AND (time_end between 0 and 1440) AND (message_code <> ''))
);
But in practice I want to insert values as clock times such as 08:20, because nobody wants to work out that 8 hours is 480 minutes, plus 20, makes 500 minutes.
Is it possible to convert '08:20' into the number 500? I don't need to calculate with seconds.
Or do you think that using DATETIME is a better way?
PS: It's a scheduler that lists events like a calendar in Outlook, for example.
Thank you very much for any advice.
Best regards
SQLite has no DateTime data type.
You can store values as char(5) strings - this will only take up five bytes. You can compare these values directly.
sqlite> select '08:20' > '08:30';
0
sqlite> select '08:20' = '08:30';
0
sqlite> select '08:20' < '08:30';
1
0 is false, 1 is true.
And also with the built-in time function:
sqlite> select time('08:20') = time('now');
sqlite> select time('08:20') = time('2018-01-01 05:30:59');
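If you do want to keep the integer column from the question, the conversion asked about can be done with SQLite's strftime(). The following is a minimal sketch, not a definitive implementation: the helper names to_minutes and to_hhmm are made up for illustration, and it assumes the input is always a valid 'HH:MM' string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def to_minutes(hhmm):
    """Convert 'HH:MM' text to minutes since midnight using strftime()."""
    row = conn.execute(
        "SELECT CAST(strftime('%H', ?) AS INTEGER) * 60"
        "     + CAST(strftime('%M', ?) AS INTEGER)",
        (hhmm, hhmm),
    ).fetchone()
    return row[0]


def to_hhmm(minutes):
    """Convert minutes since midnight back to 'HH:MM' text."""
    # Treat the value as seconds since the epoch and print only the time part.
    row = conn.execute(
        "SELECT strftime('%H:%M', ? * 60, 'unixepoch')", (minutes,)
    ).fetchone()
    return row[0]


print(to_minutes("08:20"))  # minutes since midnight
print(to_hhmm(500))         # back to clock notation
```

The same two expressions can be used directly inside INSERT and SELECT statements, so the application can keep working with '08:20' while the table stores 500.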
If I have two tables:
items
Id VARCHAR(26)
CreateAt bigint(20)
Type VARCHAR(26)
expiry
Id VARCHAR(26)
Expiry bigint(20)
The items table contains when the item was created, and what type it is. Then another table, expiry, is a lookup table to say how long certain types should last for. A query is run every day to make sure that items that have expired are removed.
At the moment this query is written in our app, as programming code:
for item in items {
expiry = expiry.get(item.Type)
if (currentDate() - expiry.Expiry > item.CreateAt) {
item.delete()
}
}
This was fine when we only had a few thousand items, but now we have tens of millions it takes a significant amount of time to run. Is there a way to put this into just an SQL statement?
Assuming all date values are actually UNIX timestamps, you could write a query such as:
SELECT * -- DELETE
FROM items
WHERE EXISTS (
SELECT 1
FROM expiry
WHERE expiry.id = items.type
AND items.CreateAt + expiry.Expiry < UNIX_TIMESTAMP()
)
Replace SELECT with DELETE once you're sure that the query selects the correct rows.
If the dates stored are in seconds since the UNIX epoch, you could use this PostgreSQL query:
DELETE FROM items
USING expiry
WHERE items.type = expiry.id
AND items.createat < EXTRACT(epoch FROM current_timestamp) - expiry.expiry;
A more portable solution that avoids the PostgreSQL-specific USING clause (though EXTRACT(epoch FROM ...) may need adapting on other databases) would be
DELETE FROM items
WHERE items.createat < EXTRACT(epoch FROM current_timestamp)
- (SELECT expiry.expiry FROM expiry
WHERE expiry.id = items.type);
That can be less efficient in PostgreSQL.
Your code is slow because you do the join between the tables outside the database.
The second thing slowing it down is that you delete the items one by one.
So using one of the compact DELETE statements already provided is the correct solution.
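To see the set-based delete in action, here is a small sketch that runs the EXISTS form against an in-memory SQLite database. The rows, the type names 'short' and 'long', and the fixed value of now are invented for illustration; a real job would pass the current UNIX timestamp instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items  (Id TEXT PRIMARY KEY, CreateAt INTEGER, Type TEXT);
    CREATE TABLE expiry (Id TEXT PRIMARY KEY, Expiry INTEGER);
    INSERT INTO expiry VALUES ('short', 2000), ('long', 10000);
    INSERT INTO items  VALUES ('a', 1000, 'short'),
                              ('b', 5000, 'short'),
                              ('c', 1000, 'long');
""")

now = 4000  # made-up "current time"; use int(time.time()) in a real job

# One statement: the database does the join and the delete in a single pass.
deleted = conn.execute(
    """
    DELETE FROM items
    WHERE EXISTS (
        SELECT 1
        FROM expiry
        WHERE expiry.Id = items.Type
          AND items.CreateAt + expiry.Expiry < ?
    )
    """,
    (now,),
).rowcount

remaining = [row[0] for row in conn.execute("SELECT Id FROM items ORDER BY Id")]
print(deleted, remaining)
```

Only item 'a' has outlived its expiry here (1000 + 2000 < 4000), so it is the only row removed.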
It seems that you are using something like python-sqlalchemy. There the code would be something like:
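expiry_for_type = select([expiry.c.Expiry]).\
    where(expiry.c.Id == items.c.Type).\
    as_scalar()
# now_ts: the current UNIX timestamp, computed in the application
stmt = items.delete().\
    where(items.c.CreateAt < now_ts - expiry_for_type)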
Given a PostgreSQL table that is supposed to contain rows with continuous, non-overlapping valid_range ranges such as:
CREATE TABLE tracking (
id INT PRIMARY KEY,
valid_range TSTZRANGE NOT NULL,
EXCLUDE USING gist (valid_range WITH &&)
);
INSERT INTO tracking (id, valid_range) VALUES
(1, '["2017-03-01 13:00", "2017-03-31 14:00")'),
(2, '["2017-03-31 14:00", "2017-04-01 00:00")'),
(3, '["2017-04-01 00:00",)');
That creates a table that contains:
id | valid_range
----+-----------------------------------------------------
1 | ["2017-03-01 13:00:00-07","2017-03-31 14:00:00-06")
2 | ["2017-03-31 14:00:00-06","2017-04-01 00:00:00-06")
3 | ["2017-04-01 00:00:00-06",)
I need to query for the row that was the valid row at the end of a given quarter, where I'm defining "at the end of a quarter" as "the instant in time right before the date changed to be the first day of the new quarter." In the above example, querying for the end of Q1 2017 (Q1 ends at the end of 2017-03-31, and Q2 begins 2017-04-01), I want my query to return only the row with ID 2.
What is the best way to express this condition in PostgreSQL?
SELECT * FROM tracking WHERE valid_range @> TIMESTAMPTZ '2017-03-31' is wrong because it returns the row that contains midnight on 2017-03-31, which is ID 1.
valid_range @> TIMESTAMPTZ '2017-04-01' is also wrong because it skips over the row that was actually valid right at the end of the quarter (ID 2) and instead returns the row with ID 3, which is the row that starts the new quarter.
I'm trying to avoid using something like ...ORDER BY valid_range DESC LIMIT 1 in the query.
Note that the end of the ranges must always be exclusive, I cannot change that.
The best answer I've come up with so far is
SELECT
*
FROM
tracking
WHERE
lower(valid_range) < '2017-04-01'
AND upper(valid_range) >= '2017-04-01'
This seems like the moral equivalent of saying "I want to reverse the inclusivity/exclusivity of the bounds on this TSTZRANGE column for this query" which makes me think I'm missing a better way of doing this. I wouldn't be surprised if it also negates the benefits of typical indexes on a range column.
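To make the "reversed bounds" idea concrete, here is a small sketch that models the three rows outside the database and compares the normal half-open containment test with the flipped one. It is only an illustration under simplifying assumptions: time zones are ignored (naive datetimes), an unbounded upper end is represented by None and treated as infinity, and the function names are invented.

```python
from datetime import datetime

# (id, lower, upper) with upper=None meaning "unbounded"
ranges = [
    (1, datetime(2017, 3, 1, 13), datetime(2017, 3, 31, 14)),
    (2, datetime(2017, 3, 31, 14), datetime(2017, 4, 1)),
    (3, datetime(2017, 4, 1), None),
]


def contains(lower, upper, t):
    """Normal [lower, upper) containment: lower <= t < upper."""
    return lower <= t and (upper is None or t < upper)


def valid_just_before(lower, upper, t):
    """Flipped bounds (lower, upper]: the row valid at the instant before t."""
    return lower < t and (upper is None or upper >= t)


quarter_start = datetime(2017, 4, 1)
at_boundary = [i for i, lo, hi in ranges if contains(lo, hi, quarter_start)]
just_before = [i for i, lo, hi in ranges if valid_just_before(lo, hi, quarter_start)]
print(at_boundary, just_before)
```

Plain containment at the quarter boundary picks the row that starts the new quarter, while the flipped comparison picks the row that was still valid right up to the boundary.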
You can use the <@ operator to check whether a value is within a range:
SELECT *
FROM tracking
WHERE to_timestamp('2017-04-01','YYYY-MM-DD')::TIMESTAMP WITH TIME ZONE <@ valid_range;
I have a sqlite3 table that records the state of my heating system and furnace every 30 seconds. The table looks like this:
CREATE TABLE CLIMATESYSTEM (
id INTEGER PRIMARY KEY AUTOINCREMENT,
Timestamp INT,
FAN INT,
SYSTEM INT
);
Timestamp is the number of seconds since the epoch (int(time.time()) in Python).
A few rows of the table look like this:
5577|1452049280|1|1
5578|1452049339|1|1
5579|1452049399|1|1
5580|1452049459|1|1
5581|1452049520|0|0
5582|1452049580|0|0
5583|1452049644|1|1
5584|1452049700|1|1
5585|1452049760|1|1
5586|1452049820|0|0
What I am trying to do is count the seconds between the state transition from on (1) to off (0) and the next transition from off to on.
Example: count the seconds between #5577 and #5581 -> add to TIME_SYS_ON
Example: count the seconds between #5581 and #5583 -> add to TIME_SYS_OFF
What I intend to do is measure the total time in a 24-hour period that my heating system is running.
Any ideas on a starting point?
Thanks
First, look up the next timestamp for each row:
SELECT Timestamp,
(SELECT MIN(Timestamp)
FROM ClimateSystem AS CS2
WHERE CS2.Timestamp > ClimateSystem.Timestamp
) AS NextTimestamp,
System
FROM ClimateSystem;
Then use that to compute the length of each interval:
SELECT NextTimestamp - Timestamp,
System
FROM (SELECT Timestamp,
(SELECT MIN(Timestamp)
FROM ClimateSystem AS CS2
WHERE CS2.Timestamp > ClimateSystem.Timestamp
) AS NextTimestamp,
System
FROM ClimateSystem);
Then add filters as needed:
SELECT SUM(NextTimestamp - Timestamp),
System
FROM (SELECT Timestamp,
(SELECT MIN(Timestamp)
FROM ClimateSystem AS CS2
WHERE CS2.Timestamp > ClimateSystem.Timestamp
) AS NextTimestamp,
System
FROM ClimateSystem)
WHERE Timestamp BETWEEN :StartOfDay AND :EndOfDay
GROUP BY System;
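As a quick check, the final query can be run against the ten sample rows from the question using Python's sqlite3 module. This is just a sketch: the start and end parameters are set to the first and last sample timestamps rather than to real day boundaries, and the column order is swapped so the result can be turned into a dictionary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ClimateSystem ("
    " id INTEGER PRIMARY KEY, Timestamp INT, Fan INT, System INT)"
)
rows = [
    (5577, 1452049280, 1, 1), (5578, 1452049339, 1, 1),
    (5579, 1452049399, 1, 1), (5580, 1452049459, 1, 1),
    (5581, 1452049520, 0, 0), (5582, 1452049580, 0, 0),
    (5583, 1452049644, 1, 1), (5584, 1452049700, 1, 1),
    (5585, 1452049760, 1, 1), (5586, 1452049820, 0, 0),
]
conn.executemany("INSERT INTO ClimateSystem VALUES (?, ?, ?, ?)", rows)

query = """
    SELECT System,
           SUM(NextTimestamp - Timestamp)
    FROM (SELECT Timestamp,
                 (SELECT MIN(Timestamp)
                  FROM ClimateSystem AS CS2
                  WHERE CS2.Timestamp > ClimateSystem.Timestamp
                 ) AS NextTimestamp,
                 System
          FROM ClimateSystem)
    WHERE Timestamp BETWEEN :StartOfDay AND :EndOfDay
    GROUP BY System
"""
params = {"StartOfDay": rows[0][1], "EndOfDay": rows[-1][1]}

# Maps each System state (0 = off, 1 = on) to total seconds spent in it.
totals = dict(conn.execute(query, params))
print(totals)
```

The last row has no successor, so its interval is NULL and SUM() simply ignores it. The two totals add up to the 540 seconds between the first and last sample.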
The root problem: I have an application which has been running for several months now. Users have been reporting that it's been slowing down over time (so in May it was quicker than it is now). I need to get some evidence to support or refute this claim. I'm not interested in precise numbers (so I don't need to know that a login took 10 seconds), I'm interested in trends - that something which used to take x seconds now takes of the order of y seconds.
The data I have is an audit table which stores a single row each time the user carries out any activity - it includes a primary key, the user id, a date time stamp and an activity code:
create table AuditData (
AuditRecordID int identity(1,1) not null,
DateTimeStamp datetime not null,
DateOnly datetime null,
UserID nvarchar(10) not null,
ActivityCode int not null)
(Notes: DateOnly (datetime) is the DateTimeStamp with the time stripped off to make group by for daily analysis easier - it's effectively duplicate data to make querying faster).
(Also, for the sake of ease, you can assume that the ID is assigned in date-time order, that is, 1 will always be before 2, which will always be before 3. If this isn't true I can make it so.)
ActivityCode is an integer identifying the activity which took place, for instance 1 might be user logged in, 2 might be user data returned, 3 might be search results returned and so on.
Sample data for those who like that sort of thing...:
1, 01/01/2009 12:39, 01/01/2009, P123, 1
2, 01/01/2009 12:40, 01/01/2009, P123, 2
3, 01/01/2009 12:47, 01/01/2009, P123, 3
4, 01/01/2009 13:01, 01/01/2009, P123, 3
User data is returned (Activity Code 2) immediately after login (Activity Code 1), so this can be used as a rough benchmark of how long the login takes (as I said, I'm interested in trends, so as long as I'm measuring the same thing for May as for July it doesn't matter so much if this isn't the whole login process; it takes in enough of it to give a rough idea).
(Note: User data can also be returned under other circumstances so it's not a one to one mapping).
So what I'm looking to do is select the average time between login (say ActivityCode 1) and the first instance after that, for that user on that day, of user data being returned (say ActivityCode 2).
I can do this by going through the table with a cursor, getting each login instance and then for that doing a select to say get the minimum user data return following it for that user on that day but that's obviously not optimal and is slow as hell.
My question is (finally): is there a "proper" SQL way of doing this using self joins or similar, without using cursors or some similar procedural approach? I can create views and whatever to my heart's content; it doesn't have to be a single select.
I can hack something together but I'd like to make the analysis I'm doing a standard product function so would like it to be right.
SELECT TheDay, AVG(TimeTaken) AvgTimeTaken
FROM (
SELECT
CONVERT(DATE, logins.DateTimeStamp) TheDay
, DATEDIFF(SS, logins.DateTimeStamp,
(SELECT TOP 1 DateTimeStamp
FROM AuditData userinfo
WHERE UserID=logins.UserID
and userinfo.ActivityCode=2
and userinfo.DateTimeStamp > logins.DateTimeStamp
ORDER BY userinfo.DateTimeStamp )
)TimeTaken
FROM AuditData logins
WHERE
logins.ActivityCode = 1
) LogInTimes
GROUP BY TheDay
This might be dead slow in the real world, though.
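The same correlated-subquery idea can be tried out on the sample data with SQLite from Python. This is a sketch under stated assumptions, not the SQL Server query itself: timestamps are stored as ISO text, MIN() stands in for TOP 1, an extra same-day condition is added to match the question's "on that day" wording, and rows 5 and 6 are invented to give a second day to compare against.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE AuditData (
        AuditRecordID INTEGER PRIMARY KEY,
        DateTimeStamp TEXT NOT NULL,
        UserID        TEXT NOT NULL,
        ActivityCode  INTEGER NOT NULL)
""")
conn.executemany("INSERT INTO AuditData VALUES (?, ?, ?, ?)", [
    (1, "2009-01-01 12:39:00", "P123", 1),
    (2, "2009-01-01 12:40:00", "P123", 2),
    (3, "2009-01-01 12:47:00", "P123", 3),
    (4, "2009-01-01 13:01:00", "P123", 3),
    # Hypothetical second day, not part of the original sample data.
    (5, "2009-01-02 09:00:00", "P123", 1),
    (6, "2009-01-02 09:00:30", "P123", 2),
])

# For each login, find the first "user data returned" row for the same user
# on the same day, then average the gap in seconds per day.
daily = conn.execute("""
    SELECT TheDay, AVG(TimeTaken)
    FROM (SELECT date(l.DateTimeStamp) AS TheDay,
                 strftime('%s', (SELECT MIN(u.DateTimeStamp)
                                 FROM AuditData AS u
                                 WHERE u.UserID = l.UserID
                                   AND u.ActivityCode = 2
                                   AND u.DateTimeStamp > l.DateTimeStamp
                                   AND date(u.DateTimeStamp) = date(l.DateTimeStamp)))
                 - strftime('%s', l.DateTimeStamp) AS TimeTaken
          FROM AuditData AS l
          WHERE l.ActivityCode = 1)
    GROUP BY TheDay
    ORDER BY TheDay
""").fetchall()
print(daily)
```

Logins with no matching data-returned row produce a NULL gap, which AVG() skips, so they do not distort the daily figure.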
In Oracle this would be a cinch, because of analytic functions. In this case, LAG() makes it easy to find the matching pairs of activity codes 1 and 2 and also to calculate the trend. As you can see, things got worse on 2nd JAN and improved quite a bit on the 3rd (I'm working in seconds rather than minutes).
select DateOnly
     , elapsed_time
     , elapsed_time - lag (elapsed_time) over (order by DateOnly) as trend
from
  (
    select DateOnly
         , avg(databack_time - prior_login_time) as elapsed_time
    from
      ( select DateOnly
             , databack_time
             , ActivityCode
             , lag(login_time) over (order by DateOnly, UserID, AuditRecordID, ActivityCode) as prior_login_time
        from
          (
            select a1.AuditRecordID
                 , a1.DateOnly
                 , a1.UserID
                 , a1.ActivityCode
                 , to_number(to_char(a1.DateTimeStamp, 'SSSSS')) as login_time
                 , 0 as databack_time
            from AuditData a1
            where a1.ActivityCode = 1
            union all
            select a2.AuditRecordID
                 , a2.DateOnly
                 , a2.UserID
                 , a2.ActivityCode
                 , 0 as login_time
                 , to_number(to_char(a2.DateTimeStamp, 'SSSSS')) as databack_time
            from AuditData a2
            where a2.ActivityCode = 2
          )
      )
    where ActivityCode = 2
    group by DateOnly
  )
/
DATEONLY  ELAPSED_TIME      TREND
--------- ------------ ----------
01-JAN-09          120
02-JAN-09          600        480
03-JAN-09          150       -450
Like I said in my comment I guess you're working in MSSQL. I don't know whether that product has any equivalent of LAG().
If the assumptions are that:
Users will perform various tasks in no mandated order, and
That the difference between any two activities reflects the time it takes for the first of those two activities to execute,
Then why not create a table with two timestamps, the first column containing the activity start time and the second containing the next activity's start time? The difference between the two will then always be the total time of the first activity. For the logout activity, you would just have NULL in the second column.
It would be kind of weird and interesting: for each activity (other than logging in and logging out), the timestamp would be recorded in two different rows, once for the previous activity (as the time "completed") and again in a new row (as the time started). You would end up with a Jacob's ladder of sorts, but finding the data you are after would be much simpler.
In fact, to get really wacky, you could have each row have the time that the user started activity A and the activity code, and the time started activity B and the time stamp (which, as mentioned above, gets put down again for the following row). This way each row will tell you the exact difference in time for any two activities.
Otherwise, you're stuck with a query that says something like
SELECT TIME_IN_SEC(row2-timestamp) - TIME_IN_SEC(row1-timestamp)
which would be pretty slow, as you have already suggested. By swallowing the redundancy, you end up just querying the difference between the two columns. You probably would have less need of knowing the user info as well, since you'd know that any row shows both activity codes, thus you can just query the average for all users on any given day and compare it to the next day (unless you are trying to find out which users are having the problem as well).
This is a faster way to find out: each row will hold both its own datetime value and the previous row's, after which you can use DATEDIFF ( datepart , startdate , enddate ). I use @DammyVariable and DammyField because, as I remember, there is some problem if the first assignment in the UPDATE statement is not of the form @variable = Field.
SELECT *, Cast(NULL AS DateTime) LastRowDateTime, Cast(NULL As INT) DammyField INTO #T FROM AuditData
GO
CREATE CLUSTERED INDEX IX_T ON #T (AuditRecordID)
GO
DECLARE @LastRowDateTime DateTime
DECLARE @DammyVariable INT
SET @LastRowDateTime = NULL
SET @DammyVariable = 1
UPDATE #T SET
@DammyVariable = DammyField = @DammyVariable
, LastRowDateTime = @LastRowDateTime
, @LastRowDateTime = DateTimeStamp
option (maxdop 1)
How can I make an average between dates in MySQL?
I am more interested in the time values, hours and minutes.
On a table with:
| date_one | datetime |
| date_two | datetime |
Doing a query like:
SELECT AVG(date_one-date_two) FROM some_table WHERE some-restriction-applies;
Edit:
The AVG(date1-date2) works but I have no clue what data it is returning.
This seems a bit hackish, but will work for dates between ~1970 and 2038 (on 32-bit architectures). You are essentially converting the datetime values to integers, averaging them, and converting the average back to a datetime value.
SELECT
from_unixtime(
avg(
unix_timestamp(date_one)-unix_timestamp(date_two)
)
)
FROM
some_table
WHERE
some-restriction-applies
There is likely a better solution out there, but this will get you by in a pinch.
select avg(datediff(date1,date2))
select avg(timediff(datetime,datetime))
SELECT date_one + (date_two - date_one) / 2 AS average_date
FROM thetable
WHERE whatever
You can't sum dates, but you can subtract them and get a time interval that you can halve and add back to the first date.
SELECT TIMESTAMPADD(MINUTE, TIMESTAMPDIFF(MINUTE, '2011-02-12 10:00:00', '2011-02-12 12:00:00')/2, '2011-02-12 10:00:00')
The result is
'2011-02-12 11:00:00'
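The same "halve the interval and add it back" arithmetic can be sketched outside the database with Python's datetime module. The function name midpoint is made up for illustration.

```python
from datetime import datetime


def midpoint(first, second):
    """Return the instant halfway between two datetimes."""
    # Subtracting datetimes gives a timedelta, which can be halved and added back.
    return first + (second - first) / 2


mid = midpoint(datetime(2011, 2, 12, 10, 0, 0), datetime(2011, 2, 12, 12, 0, 0))
print(mid)
```

As in the SQL version, the dates themselves are never summed; only the interval between them is divided.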
CREATE TABLE `some_table`
(
`some_table_key` INT(11) NOT NULL AUTO_INCREMENT,
`group_name` VARCHAR(128) NOT NULL,
`start` TIMESTAMP NULL DEFAULT CURRENT_TIMESTAMP,
`finish` TIMESTAMP NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`some_table_key`)
);
SELECT
group_name,
COUNT(*) AS entries,
SEC_TO_TIME( AVG( TIME_TO_SEC( TIMEDIFF(finish, start) ) ) ) AS average_time
FROM some_table
GROUP BY
some_table.group_name
;
You should always specify the group you want when using group functions. You can end up in some nasty messes if you later extend queries with JOINs etc. and assume MySQL will choose the right group for you.
Thinking out loud: you could do a datediff in minutes from a set time, average that, and then add those minutes back to the set time...
AVG is a grouping function, which means it will sum all the rows in the table and divide by the row count. But it sounds like you want the average of two different columns, reported individually for each row. In that case, you should just compute it yourself: (date1+date2)/2. (MySQL may want some extra syntax to add those columns properly.)
The code you've written will give you the table's average elapsed time between date1 and date2.
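To illustrate the second reading (one average elapsed time for the whole table), here is a small Python sketch. The sample rows are invented and the function name average_elapsed is hypothetical; it mirrors what AVG over a per-row difference computes, without involving MySQL.

```python
from datetime import datetime, timedelta

# Hypothetical (date_one, date_two) pairs standing in for table rows.
rows = [
    (datetime(2011, 2, 12, 10, 0), datetime(2011, 2, 12, 10, 30)),  # 30 minutes
    (datetime(2011, 2, 12, 11, 0), datetime(2011, 2, 12, 12, 30)),  # 90 minutes
]


def average_elapsed(pairs):
    """Average of (date_two - date_one) over all rows, as a timedelta."""
    total = sum((end - start for start, end in pairs), timedelta())
    return total / len(pairs)


avg = average_elapsed(rows)
print(avg)
```

This gives a duration, not a point in time, which is the distinction between averaging across rows and taking the midpoint of two columns within a single row.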