Is the result set of the following query:
SELECT * FROM Table
WHERE Date >= '20130101'
equal to the result set of this query?
SELECT * FROM Table
WHERE Date = '20130101'
UNION ALL
SELECT * FROM Table
WHERE Date > '20130101'
Date is a DATETIME field.
In terms of results, yes; in terms of performance, no.
The second query can be slower. The first one scans the table once, while the second runs two SELECTs against it and concatenates them with UNION ALL, so it typically reads the table twice. (A single SELECT is generally faster than two SELECTs combined.)
So I'd go with the first one.
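To check the "same results" part on a small scale, here is a minimal sketch in Python using the built-in sqlite3 module. SQLite is only a stand-in (the question looks like SQL Server), and the table name events, its columns, and the sample rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in; table and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, event_date TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [
        (1, "2012-12-31 23:59:59"),  # before the cutoff
        (2, "2013-01-01 00:00:00"),  # exactly the cutoff
        (3, "2013-01-01 08:30:00"),  # later the same day
        (4, "2013-06-15 12:00:00"),  # well after the cutoff
    ],
)

cutoff = "2013-01-01 00:00:00"

# One pass with >=
single = conn.execute(
    "SELECT id FROM events WHERE event_date >= ? ORDER BY id", (cutoff,)
).fetchall()

# Two branches (= and >) glued together with UNION ALL
combined = conn.execute(
    "SELECT id FROM events WHERE event_date = ? "
    "UNION ALL "
    "SELECT id FROM events WHERE event_date > ? "
    "ORDER BY id",
    (cutoff, cutoff),
).fetchall()

print(single == combined)
```

The two branches cannot overlap (a value is never both equal to and greater than the cutoff), which is why UNION ALL does not introduce duplicates here.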
I have a date-partitioned table, but cost and speed do not improve when the date condition comes from a subquery. The subquery returns a single value of type DATE, yet it is not used for partition pruning; the whole table is scanned instead. If I enter the date as a string literal, it works perfectly, just not with the subquery.
(
SELECT
*
FROM
`mydataset.mydataset.mytable`
WHERE
`datetime` > (
SELECT
DISTINCT updated_at_datetime
FROM
`mydataset.mydataset.my_other_table`
LIMIT
1)
AND `date` >= DATE(DATETIME_TRUNC((
SELECT
DISTINCT updated_at_datetime
FROM
`mydataset.mydataset.my_other_table`
LIMIT
1), DAY)))
From the docs:
To limit the partitions that are scanned in a query, use a constant expression in your filter. If you use dynamic expressions in your query filter, BigQuery must scan all of the partitions.
If you can run your query as a script, one approach is to split it into two statements:
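-- Statement 1: resolve the cutoff once and keep it in a script variable.
DECLARE LAST_PARTITION DEFAULT (SELECT MAX(updated_at_datetime) FROM `mydataset.mydataset.my_other_table`);
-- Statement 2: filter on the variable instead of a subquery.
SELECT
*
FROM
`mydataset.mydataset.mytable`
WHERE
`datetime` > LAST_PARTITION
AND `date` >= DATE(DATETIME_TRUNC(LAST_PARTITION, DAY));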
All tables in a DB have the fields creationdate and revisiondate, which are date fields, just as you'd expect. I'm looking for a SQL query to find all rows where creationdate > '2017-02-01'. I can't find an example that loops through each table and returns all records created after a given date. The DB has 1000 tables, so I need to be able to search dynamically. The single-table version of the query is (select * from tableA where creationdate > '2017-02-01'); I just need to run that against all tables. Thanks!
-- Each branch of a UNION needs its own WHERE clause.
SELECT column_1, column_2
FROM table_name_1
WHERE creationdate > '2017-02-01'
UNION
SELECT column_same_datatype, column2_same_datatype
FROM table_name_2
WHERE creationdate > '2017-02-01';
NOTE: Be careful with the date format. How a date literal is interpreted depends on the database and its session settings; the ISO form YYYY-MM-DD used in the question is the least ambiguous.
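Writing a UNION by hand does not scale to 1000 tables, so the loop usually ends up in client code or dynamic SQL. Below is a rough sketch of that loop in Python with the built-in sqlite3 module. SQLite is an assumption made only so the example runs anywhere: the table list comes from sqlite_master here, whereas many other engines expose it through INFORMATION_SCHEMA.TABLES. The helper name rows_created_after and the sample tables are hypothetical.

```python
import sqlite3


def rows_created_after(conn, cutoff, column="creationdate"):
    """Return {table_name: rows} for every table holding rows newer than cutoff."""
    # Materialize the table list first so the catalog cursor is not reused mid-loop.
    tables = [
        name
        for (name,) in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    results = {}
    for table in tables:
        # Identifiers cannot be bound as parameters, so the table name is quoted
        # by hand; the cutoff value is still passed as a bound parameter.
        quoted = '"' + table.replace('"', '""') + '"'
        rows = conn.execute(
            f"SELECT * FROM {quoted} WHERE {column} > ?", (cutoff,)
        ).fetchall()
        if rows:
            results[table] = rows
    return results


# Hypothetical sample data: two tables that share the same date columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableA (id INTEGER, creationdate TEXT, revisiondate TEXT)")
conn.execute("CREATE TABLE tableB (id INTEGER, creationdate TEXT, revisiondate TEXT)")
conn.execute("INSERT INTO tableA VALUES (1, '2017-01-15', '2017-01-20')")
conn.execute("INSERT INTO tableA VALUES (2, '2017-02-10', '2017-02-11')")
conn.execute("INSERT INTO tableB VALUES (7, '2016-12-01', '2017-03-01')")

new_rows = rows_created_after(conn, "2017-02-01")
print(new_rows)
```

Keeping the results per table avoids the UNION requirement that every branch return the same number and types of columns.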
I have a table with 3 columns: cost, from_date and to_date. I have to select all the rows that do not run from the beginning of the month to the end of the month. That is, select rows that do not have from_date as '1-NOV-2011' and to_date as '30-NOV-2011'. I've written 2 queries.
SELECT * FROM TABLE1 WHERE FROM_DATE <> '1-NOV-2011' OR TO_DATE <> '30-NOV-2011';
and
SELECT * FROM TABLE1 MINUS SELECT * FROM TABLE1 WHERE FROM_DATE = '1-NOV-2011' AND TO_DATE = '30-NOV-2011';
Which one will give better performance?
Clarification
First off, the two queries are not equivalent. The following sets would produce the same results:
Set 1
Query 1
SELECT * FROM TABLE1
WHERE NOT (FROM_DATE = '1-NOV-2011' AND TO_DATE = '30-NOV-2011');
Query 2
SELECT * FROM TABLE1
MINUS SELECT * FROM TABLE1
WHERE FROM_DATE = '1-NOV-2011' AND TO_DATE = '30-NOV-2011';
Set 2
Query 1
SELECT * FROM TABLE1
WHERE FROM_DATE <> '1-NOV-2011' AND TO_DATE <> '30-NOV-2011';
Query 2
SELECT * FROM TABLE1
MINUS SELECT * FROM TABLE1
WHERE FROM_DATE = '1-NOV-2011' OR TO_DATE = '30-NOV-2011';
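If you want to see which filter pairs with which set difference, a tiny test table settles it quickly. This is a sketch using Python's built-in sqlite3 module, not Oracle: SQLite spells MINUS as EXCEPT, the dates are stored as ISO text, and the four sample rows are invented so that each combination of matching and non-matching dates appears once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (cost INTEGER, from_date TEXT, to_date TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [
        (100, "2011-11-01", "2011-11-30"),  # both dates match
        (200, "2011-11-01", "2011-11-15"),  # only from_date matches
        (300, "2011-11-10", "2011-11-30"),  # only to_date matches
        (400, "2011-11-05", "2011-11-20"),  # neither date matches
    ],
)
params = {"start": "2011-11-01", "end": "2011-11-30"}


def costs(sql):
    """Run a query and return the sorted cost values of the rows it selects."""
    return sorted(cost for (cost, _, _) in conn.execute(sql, params))


# NOT (a AND b) pairs with a set difference that removes "a AND b".
not_and = costs(
    "SELECT * FROM table1 WHERE NOT (from_date = :start AND to_date = :end)"
)
except_and = costs(
    "SELECT * FROM table1 EXCEPT "
    "SELECT * FROM table1 WHERE from_date = :start AND to_date = :end"
)

# "<> AND <>" pairs with a set difference that removes "a OR b".
ne_and = costs(
    "SELECT * FROM table1 WHERE from_date <> :start AND to_date <> :end"
)
except_or = costs(
    "SELECT * FROM table1 EXCEPT "
    "SELECT * FROM table1 WHERE from_date = :start OR to_date = :end"
)

print(not_and, except_and)
print(ne_and, except_or)
```

Two caveats the sample data deliberately avoids: MINUS/EXCEPT returns distinct rows, and a row with a NULL date fails the WHERE comparison but survives the set difference, so the pairs only match exactly on tables without duplicates or NULL dates.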
Answer
Now to the actual answer. The prima facie answer is that the first query (for either set) will be faster, because it involves only one table access, rather than two. However, that may not be true.
It's possible that the second query will be faster. In the first, the database will need to do a full table scan, then check each row for the disqualifying values. In the second case, it can do a full table scan without a filter to fulfill the first half of the query. For the second half, if there is an index on FROM_DATE and TO_DATE, it can use an index scan to get the disqualifying rows, then perform a set operation to remove those results from the first set.
Whether this is actually faster or not will likely depend a lot on your data. As always, the best way to determine which will be faster for your application is to perform your own benchmarks.
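For what it's worth, a benchmark harness does not have to be elaborate. The sketch below times both forms with Python's timeit against an in-memory SQLite table; it only shows the shape of such a test. SQLite, the EXCEPT spelling of MINUS, the generated 20,000 rows, and the index name idx_dates are all assumptions, and the timings say nothing about Oracle or about your data.

```python
import sqlite3
import timeit

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (cost INTEGER, from_date TEXT, to_date TEXT)")
# Synthetic rows: from_date cycles through 1..28 November, to_date is fixed.
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [(i, "2011-11-%02d" % (i % 28 + 1), "2011-11-30") for i in range(20000)],
)
# Index on the two date columns, so the set-difference branch can use it.
conn.execute("CREATE INDEX idx_dates ON table1 (from_date, to_date)")

FILTER_SQL = (
    "SELECT * FROM table1 "
    "WHERE NOT (from_date = '2011-11-01' AND to_date = '2011-11-30')"
)
EXCEPT_SQL = (
    "SELECT * FROM table1 EXCEPT "
    "SELECT * FROM table1 "
    "WHERE from_date = '2011-11-01' AND to_date = '2011-11-30'"
)


def fetch(sql):
    """Execute the query and pull every row, so the full cost is measured."""
    return conn.execute(sql).fetchall()


filter_seconds = timeit.timeit(lambda: fetch(FILTER_SQL), number=5)
except_seconds = timeit.timeit(lambda: fetch(EXCEPT_SQL), number=5)

print(f"single filter : {filter_seconds:.4f}s for 5 runs")
print(f"set difference: {except_seconds:.4f}s for 5 runs")
```

Checking that both queries return the same rows before comparing timings is worth the extra line; on the real database, the execution plan is the other thing to look at.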
The 1st one is better, since it involves only a single scan and does not contain any INs or NOT INs. Go for the first one.
I guess the 1st version will have better performance than the 2nd version.
The SELECT happens twice in the 2nd query.
The second one will definitely be slower. You're basically pulling two sets in the second one and doing a set difference. Only the smaller set can be pulled with an index (assuming you have indexes, and assuming Oracle doesn't do some magical optimization). The first query builds just one set, and it is based on indexes.
Disclaimer: That's a simplified explanation, and I know nothing of the inner workings of Oracle, just how I would expect it to work.
Right now I have a SQL query that allows me to select entries in a table that have been inserted over the past day.
The query is:
Select account from mytable where create_date > current_timestamp - 1
But suppose that I wanted to select entries that had been inserted over the past two hours?
How would I write that query?
You can use DATEADD, like so:
SELECT account
FROM MyTable
WHERE create_date > DATEADD(hh, -2, GETDATE())
Keep in mind that timestamp and datetime are very different things in SQL Server. A timestamp is a unique binary number, whereas a datetime is a date and time combination. I think you'll need datetime.
You could try something along the lines of the following, where 0.083333333 is 2/24 of a day:
Select account from mytable where create_date > current_timestamp - 0.083333333
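If the cutoff is computed in client code instead, the "two hours ago" arithmetic can be done there and bound as a parameter. Here is a small sketch in Python with the built-in sqlite3 module; SQLite is a stand-in rather than SQL Server, the sample accounts are invented, and the "current time" is pinned to a fixed value so the example is repeatable.

```python
import sqlite3
from datetime import datetime, timedelta

# Fixed "current time" (an assumption for repeatability; use datetime.now() in practice).
now = datetime(2024, 1, 1, 12, 0, 0)
cutoff = now - timedelta(hours=2)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (account TEXT, create_date TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [
        ("recent", (now - timedelta(hours=1)).isoformat(sep=" ")),  # inside the window
        ("old", (now - timedelta(hours=3)).isoformat(sep=" ")),     # outside the window
    ],
)

# Dates are stored as ISO text, so string comparison matches chronological order.
accounts = [
    account
    for (account,) in conn.execute(
        "SELECT account FROM mytable WHERE create_date > ?",
        (cutoff.isoformat(sep=" "),),
    )
]

# Two hours expressed as a fraction of a day, i.e. the 2/24 used in day arithmetic.
fraction_of_day = timedelta(hours=2) / timedelta(days=1)

print(accounts, fraction_of_day)
```

The fraction is only there to show where a constant like 0.083333333 comes from; an explicit interval (DATEADD in SQL Server, timedelta here) is easier to read than a decimal.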
I can easily get a random record with this:
SELECT * FROM MyTable ORDER BY NewId()
I can easily get a record with "today's date" with this:
SELECT * FROM MyTable WHERE MyDate = '2010-24-08' -- db doesn't store times
But how would I combine the two?
Get 1 random record... anything with today's date.
If none are found... get 1 random record from yesterday (today-1).
If none are found... get 1 random record from etc, etc, today-2
... until 1 record is found.
Just make the day date the primary order by condition:
select top(1) *
from Table
order by Date desc, newid();
If you store the dates with a full date and time, you need to truncate them to the day part only: cast(Date as DATE) in SQL 2008, or cast(floor(cast(Date as FLOAT)) as DATETIME) in pre-2008.
Use the TOP operator:
SELECT TOP 1 *
FROM MyTable
WHERE MyDate = '2010-24-08'
ORDER BY NEWID()
...combined with ORDER BY NEWID(). Without the ORDER BY, you would typically get the first inserted row among those that pass the filter, but the only way to guarantee an order is an ORDER BY clause.
SQL Server 2005+ supports parentheses around the TOP value, so you can put a variable inside them without needing dynamic SQL.
Does this give you what you want?
SELECT TOP 1 *
FROM MyTable
ORDER BY MyDate desc, NewId()
This assumes there are no dates later than today.
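If future dates can exist, adding an upper bound on the date covers that case. The sketch below tries the idea in Python with the built-in sqlite3 module. SQLite is a stand-in for SQL Server, so RANDOM() and LIMIT 1 replace NEWID() and TOP 1, and the rows and the value of today are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (id INTEGER, MyDate TEXT)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?)",
    [
        (1, "2010-08-20"),
        (2, "2010-08-22"),  # most recent date on or before "today"
        (3, "2010-08-22"),
        (4, "2010-08-30"),  # future row that must be skipped
    ],
)

today = "2010-08-24"  # hypothetical current date; nothing is stored for it

# Newest date first, random order within that date, keep a single row.
row = conn.execute(
    "SELECT id, MyDate FROM MyTable "
    "WHERE MyDate <= ? "
    "ORDER BY MyDate DESC, RANDOM() "
    "LIMIT 1",
    (today,),
).fetchone()

print(row)
```

Because the date is the primary sort key, the fallback to yesterday, the day before, and so on happens without any loop: the query simply lands on the latest date that has rows.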