In BigQuery legacy SQL it's possible to get the rows inserted
during a time range by using a range decorator:
#legacySQL
SELECT COUNT(*) FROM [PROJECT_ID:DATASET.TABLE@time1-time2]
With standard SQL you can run a point-in-time query like this:
SELECT *
FROM t
FOR SYSTEM_TIME AS OF '2017-01-01 10:00:00-07:00';
But I can't find a way to query for a time range as in legacy SQL.
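As far as I know, FOR SYSTEM_TIME AS OF accepts only a single point in time, so one possible workaround is to diff two snapshots. This is only a sketch, and it assumes the table saw no updates or deletes in the interval, so the set difference is exactly the rows inserted between the two timestamps:

```sql
-- Sketch: rows visible at the later snapshot but not at the earlier one.
-- Assumes append-only data in the interval, so the difference = inserts.
SELECT * FROM t
  FOR SYSTEM_TIME AS OF TIMESTAMP '2017-01-01 11:00:00-07:00'
EXCEPT DISTINCT
SELECT * FROM t
  FOR SYSTEM_TIME AS OF TIMESTAMP '2017-01-01 10:00:00-07:00';
```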
Related
I want to run an SQL query involving multiple tables, but only for the entries created before a specific date. Instead of filtering the tables by date one by one, is there a way to filter the whole database? I want something like a time machine for the database.
You can do it like this:
SELECT * FROM table1, table2 WHERE table1.created_at < now() OR table2.created_at < now();
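If the tables are related, another way to approximate a per-date "time machine" is to filter each table on its own created_at column before joining. The following is only a sketch: the join key (table1_id) and the :cutoff parameter are hypothetical placeholders, not from the original question.

```sql
-- Sketch: restrict each table to rows created before the cutoff,
-- then join. The join key below is a hypothetical placeholder.
SELECT *
FROM table1 t1
JOIN table2 t2
  ON t2.table1_id = t1.id        -- hypothetical foreign key
WHERE t1.created_at < :cutoff    -- :cutoff = the point in time of interest
  AND t2.created_at < :cutoff;
```

Note that with OR, a row pair qualifies when either side predates the cutoff; with AND, both sides must.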
I have a partitioned table and am trying to limit my search to a few partitions. To do this I am running a query (using legacy SQL) that looks like the following:
SELECT
*
FROM
[project:dataset.table]
WHERE
_PARTITIONTIME >= "2018-07-10 00:00:00"
AND _PARTITIONTIME < "2018-07-11 00:00:00"
AND col IN (
SELECT
col
FROM
[project:dataset.table]
WHERE
_PARTITIONTIME >= "2018-07-10 00:00:00"
AND _PARTITIONTIME < "2018-07-11 00:00:00"
AND col2 > 0)
I limit both the main query and the subquery using _PARTITIONTIME, so BigQuery should only need to scan those partitions. When I run this query, though, I get billed as if I had queried the entire table without using _PARTITIONTIME. Why does this happen?
UPDATE
The equivalent query using standard SQL does not have this problem, so use that as a workaround. I'd still like to know why this happens, though: whether it's just a bug, or whether legacy SQL actually does scan all the data in the table for a query like this.
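For reference, a sketch of what the standard SQL equivalent might look like (the backtick-quoted table name and TIMESTAMP literals are one way to write it; adjust to your project and dataset):

```sql
#standardSQL
SELECT *
FROM `project.dataset.table`
WHERE _PARTITIONTIME >= TIMESTAMP '2018-07-10'
  AND _PARTITIONTIME < TIMESTAMP '2018-07-11'
  AND col IN (
    SELECT col
    FROM `project.dataset.table`
    WHERE _PARTITIONTIME >= TIMESTAMP '2018-07-10'
      AND _PARTITIONTIME < TIMESTAMP '2018-07-11'
      AND col2 > 0);
```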
As noted in the question, switching to #standardSQL is the right solution. You shouldn't expect any big updates to the legacy SQL dialect, while #standardSQL will keep getting substantial ones.
Also note that there are 2 types of partitioned tables today:
Tables partitioned by ingestion time
Tables that are partitioned based on a TIMESTAMP or DATE column
If you try to query the second type with legacy SQL:
SELECT COUNT(*)
FROM [fh-bigquery:wikipedia_v2.pageviews_2018]
WHERE datehour BETWEEN "2018-01-01 00:00:00" AND "2018-01-02 00:00:00"
you get the error "Querying tables partitioned on a field is not supported in Legacy SQL".
Meanwhile this works:
#standardSQL
SELECT COUNT(*)
FROM `fh-bigquery.wikipedia_v2.pageviews_2018`
WHERE datehour BETWEEN "2018-01-01 00:00:00" AND "2018-01-02 00:00:00"
I'm adding these points to enhance the message "it's time to switch to #standardSQL to get the best out of BigQuery".
I think this is a BigQuery legacy SQL specific issue.
There is a list of cases in which pseudo column queries scan all partitions, and it mentions legacy SQL explicitly: "In legacy SQL, the _PARTITIONTIME filter works only when ..."
I don't see exactly your case in that list, but the best option here is simply to use standard SQL.
Is there a way to get all modification dates of a particular table in Oracle?
Query below:
select * from sys.dba_tab_modifications
shows the number of inserts and updates, but not the dates on which each occurred.
You may be able to use Oracle Flashback Query if you have sufficient privileges and sufficient undo data exists for the time period in question:
select versions_operation
, versions_starttime
, versions_endtime
, <TableAlias>.*
from <TableName> versions between timestamp systimestamp - numtodsinterval(1,'day')
and systimestamp <TableAlias>
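For example, filling in the placeholders with a hypothetical table named orders aliased as o (the table name and ORDER BY are illustrative, not from the original answer):

```sql
-- Hypothetical usage: every row version in orders over the last day,
-- with the operation type (I/U/D) and the interval it was valid for.
select versions_operation
     , versions_starttime
     , versions_endtime
     , o.*
from orders versions between timestamp
       systimestamp - numtodsinterval(1, 'day')
       and systimestamp o
order by versions_starttime;
```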
When doing queries on a partitioned table in SQL Server, does one have to do anything special?
The reason I am asking is that we have a fairly large SQL Server table that is partitioned on a datetime2(2) column by day.
Each day is mapped to its own file group with a file in that file group named appropriately such as Logs_2014-09-15.ndf.
If I run a query on this table that, say, only spans 2 days, I see in Resource Monitor that SQL Server is accessing more than 2 of the daily .ndf files. (Edit: in fact I have noticed that it searches through every single one, even if I select from a day that falls in partition 1.)
From my understanding of partitioned tables, it should only search the partitions that hold the relevant data.
So my questions:
Is this the case?
Does how I compare the datetime2 column affect the query?
For example, I could query like so:
select * from LogsTable
where [date] like '2014-09-15'
or I could do:
select * from LogsTable
where [date] = CAST('2014-09-15' AS DATETIME2)
Does the partition function automatically look at the time element if it is present in the query and route the work to the correct partition?
Have you tried with this:
select * from LogsTable
where Dateadd(D, 0, Datediff(D, 0, [date])) = CAST('2014-09-15' AS DATETIME2)
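A hedged alternative sketch: wrapping the partitioning column in functions (DATEADD/DATEDIFF, CAST) can prevent partition elimination, whereas a half-open range on the bare column keeps the predicate sargable, so SQL Server can touch only the matching partitions:

```sql
-- Sketch: half-open range on the bare datetime2 column, so the
-- optimizer can eliminate partitions outside 2014-09-15.
SELECT *
FROM LogsTable
WHERE [date] >= '2014-09-15'
  AND [date] <  '2014-09-16';
```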
I'm trying to run the following SQL statement that is taking too long to finish in Oracle.
Here is my query:
SELECT timestamp FROM data
WHERE timestamp IN
  (SELECT MIN(timestamp)
   FROM data
   WHERE timestamp BETWEEN :t1 AND :t2)
If anyone can help with optimising this query I would be very grateful.
All you need to speed up your query is an index on timestamp:
create index data_timestamp on data(timestamp);
If you are expecting only one result, you can also do:
SELECT MIN(timestamp)
FROM data
WHERE TIMESTAMP BETWEEN :t1 AND :t2
I'm not sure why you would want to output the timestamp multiple times.