Indexing a growing SQL table by datetime - sql

I have a large table that gets anywhere from 1-3 new entries per minute. I need to be able to find records at specific times, which I can do with a SELECT statement, but it's incredibly slow. Let's say the table looks like this:
Device | Date-Time        | Data
-------|------------------|------
1      | 2020-01-01 08:00 | 325
2      | 2020-01-01 08:01 | 384
1      | 2020-01-01 08:01 | 175
3      | 2020-01-01 08:01 | 8435
7      | 2020-01-01 08:02 | 784
...    | ...              | ...
I'm trying to get data like this:
SELECT *
FROM table
WHERE Date-Time = '2020-01-01 08:00' AND Device = '1'
I also need to get data like this:
SELECT *
FROM table
WHERE Date-Time > '2020-01-01 08:00' AND Date-Time < '2020-01-10 08:00' AND Device = '1'
But I don't know what the Date-Time will be until requested. In this case, I will have to search the entire table for these times. Can I index the start of the day so I know where dates are?
Is there a way to index this table in order to dramatically decrease the query time? Or is there a better way to achieve this?
I have tried indexing the Date-Time column, but it did not decrease the query time at all.

For this query:
SELECT *
FROM mytable
WHERE date_time = '2020-01-01 08:00' AND device = 1
You want an index on mytable(date_time, device). This matches the columns that come into play in the WHERE clause, so the database should be able to look up the matching rows efficiently.
Note that I removed the single quotes around the literal value given to device: if this is an integer, as it looks like, then it should be treated as such.
The ordering of the columns in the index matters; generally, you want the most restrictive column first - from the description of your question, this would probably be date_time, hence the above suggestion. You might want to try the other way around as well (so: mytable(device, date_time)).
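For reference, a minimal sketch of creating the two candidate indexes (the index names here are hypothetical):
-- Composite index with date_time leading (matches the equality query above)
CREATE INDEX idx_mytable_datetime_device ON mytable (date_time, device);
-- Alternative with device leading, worth benchmarking as well
CREATE INDEX idx_mytable_device_datetime ON mytable (device, date_time);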
Another thing to keep in mind from a performance perspective: you should probably enumerate the columns you want in the SELECT clause. If you just need a few additional columns, it can be useful to add them to the index as well; this gives you a covering index, which the database can use to execute the whole query without even looking back at the table data.
Say:
SELECT date_time, device, col1, col2
FROM mytable
WHERE date_time = '2020-01-01 08:00' AND device = 1
Then consider:
mytable(date_time, device, col1, col2)
Or:
mytable(device, date_time, col1, col2)
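As a sketch, the covering variants could be created like this (index names are hypothetical; col1 and col2 are the example columns from above):
-- Covering index with date_time leading
CREATE INDEX idx_mytable_dt_dev_cover ON mytable (date_time, device, col1, col2);
-- Covering index with device leading
CREATE INDEX idx_mytable_dev_dt_cover ON mytable (device, date_time, col1, col2);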

You can add a new column, TimeInMilliseconds, populate it with the number of milliseconds since 1970, and create an index on that column. TimeInMilliseconds will almost always be a unique number, which helps the index locate matching rows faster.
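A minimal sketch of that idea, assuming MySQL-style syntax and hypothetical column/index names (adapt to your RDBMS):
-- Add an epoch-milliseconds column, backfill it from the existing datetime, and index it
ALTER TABLE mytable ADD COLUMN time_in_ms BIGINT;
UPDATE mytable SET time_in_ms = UNIX_TIMESTAMP(date_time) * 1000;
CREATE INDEX idx_mytable_time_in_ms ON mytable (time_in_ms);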

Related

SQL Efficiencies

Say I had the following table
-----------------------------
-- ID | DATE --
-- 01 | 1577836799998 --
-- 02 | 1577836799999 --
-- 03 | 1577836800000 --
-- 04 | 1577836800001 --
-----------------------------
I wish to select all data IDs relative to a timestamp. Is it more efficient to convert the timestamp (1) before or (2) after the operator? How does this impact efficiency?
(1)
SELECT ID
FROM TABLE
WHERE DATEADD(MS, DATE, '1970-01-01') > '2020-01-01'
(2)
SELECT ID
FROM TABLE
WHERE DATE > DATE_PART('EPOCH_MILLISECOND', TO_TIMESTAMP('2020-01-01'))
Would it be the latter (2), because it only has to convert the comparison timestamp once, instead of converting every single date in the table for the comparison?
You are asking which of these is more efficient:
WHERE DATEADD(MS, DATE, '1970-01-01') > '2020-01-01'
WHERE DATE > DATE_PART('EPOCH_MILLISECOND', TO_TIMESTAMP('2020-01-01'))
First, they may result in the same execution plan, so there might be no difference.
That said, the second is much, much preferable. Why? The database has information about columns, such as:
Available indexes
Available partitions
Statistics
These can be used to choose the best query plan. So, you have more options with the second method, because the "bare" column is better understood than the column that is the argument to a function.
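As a sketch of the difference in practice (events and date_ms are hypothetical stand-ins for the real table and column, the constant expression is taken from the question, and the CREATE INDEX assumes your database supports secondary indexes):
-- Hypothetical index on the raw epoch-milliseconds column
CREATE INDEX idx_events_date_ms ON events (date_ms);
-- Sargable form: the bare column is compared to a constant that is computed once,
-- so the optimizer can use the index instead of applying a function to every row
SELECT id
FROM events
WHERE date_ms > DATE_PART('EPOCH_MILLISECOND', TO_TIMESTAMP('2020-01-01'));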

Azure Log Analytics > Same query but different results when pinned on a Dashboard

The query below returns a table with the corresponding values:
union (traces), (customEvents)
| where timestamp <= now()
| summarize Users=dcount(user_AuthenticatedId) by Country=client_CountryOrRegion
| sort by Users desc
Results:
When pinning the query to the dashboard, I see different results:
The only difference that I can see is the time range set directly on the dashboard. I set this one to custom: 2016-07-06 to now, to simulate the same range as in the query. I have checked, and I only have logs from 2019 anyway.
Has anyone a clue?
Whenever I have seen this it is due to time slicing. You could add min and max timestamp values to the query in order to understand the exact ranges:
union (traces), (customEvents)
| where timestamp <= now()
| summarize Users=dcount(user_AuthenticatedId), FirstRecord=min(timestamp), LastRecord=max(timestamp) by Country=client_CountryOrRegion
| sort by Users desc

postgreSQL 1 minute average of values

I need to query the 1-minute average of the values from my database (PostgreSQL).
The data is recorded with millisecond timestamps and looks like this:
timestamp | value
------------------
1528029265001 | 123
1528029265020 | 232
1528029265025 | 332
1528029265029 | 511
... | ...
1528029265176 | 231
I tried:
SELECT
avg(value),
extract(minutes FROM to_timestamp(timestamp/1000)) as one_min
FROM table GROUP BY one_min
But it seems to be stuck in querying.
I'm sure there is a super easy way to do it.
Any suggestions?
I am guessing that you want:
SELECT floor(timestamp / (60 * 1000)) as timestamp_minute,
       avg(value)
FROM table
GROUP BY timestamp_minute;
However, if your problem is performance, this will have the same performance issues. For that, you would want a where clause that limits the amount of data being processed.
Because the data is not being collected at even intervals, you might want the simple average of the first and last values, or something like that.
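For instance, a sketch of the grouped average with a limiting WHERE clause (readings is a stand-in for your real table name, and the bounds are placeholder epoch-millisecond values):
SELECT floor(timestamp / (60 * 1000)) AS timestamp_minute,
       avg(value) AS avg_value
FROM readings
WHERE timestamp >= 1528029240000  -- placeholder lower bound in epoch milliseconds
  AND timestamp <  1528029300000  -- placeholder upper bound in epoch milliseconds
GROUP BY timestamp_minute
ORDER BY timestamp_minute;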

SQL query to get results in a very specific order by date descending AND with multiple 'like' clause

I am trying to create a SQL query but I am really struggling and unsure of the best way to write it. I currently have 2 columns, one containing timestamps, the other containing random information. I want the timestamps to be in descending order (easy enough), but for the secondary column, I want to output the results in the following order:
NOTE: the % characters are intentional, as they actually appear in the REMAINING column.
'%[EVENT=agentStateEvent]%' - first
'%[EVENT=TerminalConnectionCreated]%' - second
whatever else third, fourth, etc. for that timestamp.
For example:
Both columns are strings (varchar(max)).
TIMESTAMP | REMAINING
TIMESTAMP 10:30 | %[EVENT=agentStateEvent][Agentid=424][Queue=45235]%
TIMESTAMP 10:30 | %[EVENT=TerminalConnectionCreated][Agentid=424][Queue=45235]%
TIMESTAMP:10.31 | %[EVENT=agentStateEvent][Agentid=425][Queue=453635]%
TIMESTAMP 10.31 | %[EVENT=TerminalConnectionCreated][Agentid=425][Queue=45235]%
TIMESTAMP 10.31 | %[EVENT=CallDropped][Agentid=425][Queue=45235]%
TIMESTAMP 10.32 | %[EVENT=TerminalConnectionCreated][Agentid=426][Queue=44235]%
TIMESTAMP 10.32 | %[EVENT=CallDropped][Agentid=426][Queue=45235]%
It would need to be wrapped in a LIKE clause, as the REMAINING column contains a lot more information.
The query I have so far is:
select * from TimestampsStorage
order by timestamp desc, remaining desc
You are looking for CASE ... WHEN:
order by
timestamp desc,
case when remaining like '%[EVENT=agentStateEvent]%' then 1
when remaining like '%[EVENT=TerminalConnectionCreated]%' then 2
else 3
end;
If '%[EVENT=...' always comes first in remaining, you can look for the leading '%' explicitly, using an escape character. This may speed up the query.
order by
timestamp desc,
case when remaining like '#%[EVENT=agentStateEvent]%' escape '#' then 1
when remaining like '#%[EVENT=TerminalConnectionCreated]%' escape '#' then 2
else 3
end;
You can use a case expression in the ORDER BY clause
select ...
ORDER BY
[TIMESTAMP]
, CASE WHEN [REMAINING] = '%[EVENT=agentStateEvent]%' then 1
WHEN [REMAINING] = '%[EVENT=TerminalConnectionCreated]%' THEN 2
ELSE 3
END
, [REMAINING]
NB: I'm not sure which DBMS you are using; the [] delimiters are T-SQL syntax.
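Putting the pieces together for SQL Server, as a sketch: LIKE is used rather than = because REMAINING contains more text than the event tag, and since [ starts a character class in T-SQL LIKE, the literal bracket is escaped as [[]:
SELECT [TIMESTAMP], [REMAINING]
FROM TimestampsStorage
ORDER BY
  [TIMESTAMP] DESC,
  CASE WHEN [REMAINING] LIKE '%[[]EVENT=agentStateEvent]%' THEN 1
       WHEN [REMAINING] LIKE '%[[]EVENT=TerminalConnectionCreated]%' THEN 2
       ELSE 3
  END;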

Select query with date condition

I would like to retrieve the records with dates after d/mm/yyyy, or after d/mm/yyyy and before d/mm/yyyy. How can I do it?
SELECT date
FROM table
WHERE date > 1/09/2008;
and
SELECT date
FROM table
WHERE date > 1/09/2008;
AND date < 1/09/2010
It doesn't work.
Be careful, you're unwittingly asking "where the date is greater than one divided by nine, divided by two thousand and eight".
Put # signs around the date, like this #1/09/2008#
The semicolon character is used to terminate the SQL statement.
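For example, the two queries from the question become (a sketch; [date] and [table] stand in for your real names, with square brackets escaping reserved words in Access):
SELECT [date]
FROM [table]
WHERE [date] > #1/09/2008#;

SELECT [date]
FROM [table]
WHERE [date] > #1/09/2008#
AND [date] < #1/09/2010#;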
You can either use # signs around a date value or use Access's (ACE, Jet, whatever) cast to DATETIME function CDATE(). As its name suggests, DATETIME always includes a time element so your literal values should reflect this fact. The ISO date format is understood perfectly by the SQL engine.
Best not to use BETWEEN for DATETIME in Access: it's modelled using a floating point type and anyhow time is a continuum ;)
DATE and TABLE are reserved words in the SQL Standards, ODBC and Jet 4.0 (and probably beyond), so they are best avoided as data element names.
Your predicates suggest open-open representation of periods (where neither the start date nor the end date is included in the period), which is arguably the least popular choice. It makes me wonder if you meant to use closed-open representation (where the start date is included but the period ends immediately prior to the end date):
SELECT my_date
FROM MyTable
WHERE my_date >= #2008-09-01 00:00:00#
AND my_date < #2010-09-01 00:00:00#;
Alternatively:
SELECT my_date
FROM MyTable
WHERE my_date >= CDate('2008-09-01 00:00:00')
AND my_date < CDate('2010-09-01 00:00:00');
SELECT Qty, vajan, Rate, Amt, nhamali, ncommission, ntolai
FROM SalesDtl, SalesMST
WHERE SalesDtl.PurEntryNo = 1
AND SalesMST.SaleDate = #22/03/2014#
AND SalesMST.SaleNo = SalesDtl.SaleNo;
That should work.
I think what you are looking for is this, using a SELECT command.
With this you can specify a range using greater than (>) or less than (<) in MySQL:
SELECT * FROM <table name> WHERE YEAR(<column name>) > <year> OR YEAR(<column name>) < <year>;
For example:
SELECT name, BIRTH FROM pet1 WHERE YEAR(birth) > 1996 OR YEAR(birth) < 1989;
+----------+------------+
| name | BIRTH |
+----------+------------+
| bowser | 1979-09-11 |
| chirpy | 1998-09-11 |
| whistler | 1999-09-09 |
+----------+------------+
For a simple range, just use greater than / less than:
mysql>
SELECT <column name> FROM <table name> WHERE YEAR(<column name>) > 1996;
For example:
mysql>
SELECT name FROM pet1 WHERE YEAR(birth) > 1996 OR YEAR(birth) < 1989;
+----------+
| name |
+----------+
| bowser |
| chirpy |
| whistler |
+----------+
3 rows in set (0.00 sec)