I have a simple question, but I couldn't find an answer to it. In an SQL WHERE clause, which is better?
TO_CHAR (DAT, 'YYYYMMDD') BETWEEN '20080101' AND '20131231'
or
DAT BETWEEN TO_DATE('20080101','YYYYMMDD') AND TO_DATE('20131231','YYYYMMDD')
Are the condition values evaluated only once and then tested for every row in the table, or does the SQL engine recalculate it every time?
Any argument that involves constants and literals will only be evaluated once. The second, however, is much better - it allows you to index the dat column and then use this index to improve performance, while the first query will not allow the index to be used.
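To see the index point in practice, here is a small sketch using Python's built-in sqlite3 module. SQLite is only a stand-in here (the question is about Oracle-style TO_CHAR/TO_DATE), and the table t, the index idx_t_dat and the payload column are made-up names for illustration. It asks the engine for its query plan with and without a function wrapped around the column.

```python
import sqlite3

def plan(conn, sql):
    """Return the 'detail' text of each EXPLAIN QUERY PLAN row."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

# Hypothetical table and index, used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, dat TEXT, payload TEXT)")
conn.execute("CREATE INDEX idx_t_dat ON t (dat)")

# Function applied to the column: the plain index on dat does not help,
# so the engine has to visit every row.
wrapped = plan(
    conn,
    "SELECT * FROM t "
    "WHERE strftime('%Y%m%d', dat) BETWEEN '20080101' AND '20131231'",
)

# Bare column compared with constants: the index can be used for a range search.
bare = plan(
    conn,
    "SELECT * FROM t WHERE dat BETWEEN '2008-01-01' AND '2013-12-31'",
)

print("function on column:", wrapped)
print("bare column:       ", bare)
```

The exact plan wording differs between SQLite versions, and other engines have their own plan formats, but the pattern (full scan versus index search) is the thing to look for.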
And here's the BEST :
WHERE DAT BETWEEN '2008-01-01' AND '2013-12-31'
SQL has literals for date/time types too so there just isn't any need to invoke any of these scalar functions.
BTW you tagged the question SQL. That means your question relates to standard SQL, not to any particular engine and/or its implemented dialect (hover over the tag and read what it says). The standard mandates the date format used in this example. Specific engines might support additional formats for date literals, e.g. '12/31/2013' or '31.12.2013'.
Related
I have a date in one field (outcome_date) that is really a date with a zeroed time portion due to an ETL process, so it comes out as 01JUN2019:00:00:00.
The other two fields (admdt and disdt) have timestamps.
If I don't use the trunc function, I will potentially miss some entries where the admdt is the same date as the outcome_date.
According to another developer here, trunc is expensive in terms of processing, but I read that I should never to_date a date either.
Any tips on efficiencies appreciated.
i.e.
where trunc(outcome_date) between trunc(admdt) and trunc(disdt)
It is not that TRUNC() is expensive. It is that, if column "X" is indexed, any function (TRUNC() or otherwise) on that column will prevent Oracle from using the index.
So, unless you want to create a function-based index on the expression TRUNC(outcome_date), you are better off not using TRUNC().
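As a rough analog of a function-based index, here is a sketch using Python's sqlite3 module, where the same idea is called an index on an expression. This is SQLite, not Oracle, and the table outcomes, the column outcome_ts and the index idx_outcome_day are hypothetical names. It uses substr() to cut an ISO-formatted timestamp string down to its date part, as a stand-in for TRUNC().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: timestamps stored as ISO-8601 text.
conn.execute(
    "CREATE TABLE outcomes (id INTEGER PRIMARY KEY, outcome_ts TEXT, note TEXT)"
)
# Index on the expression itself rather than on the raw column.
conn.execute(
    "CREATE INDEX idx_outcome_day ON outcomes (substr(outcome_ts, 1, 10))"
)

# The WHERE clause repeats the indexed expression exactly as written above,
# which is what lets the planner match it to the index.
sql = "SELECT * FROM outcomes WHERE substr(outcome_ts, 1, 10) = '2019-06-01'"
detail = [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]
print(detail)
```

The trade-off is the usual one: the extra index costs storage and write time, and it only helps queries that use that exact expression.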
An alternative:
WHERE outcome_date BETWEEN trunc(admdt) AND trunc(disdt) + 1 - INTERVAL '1' SECOND
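To check that leaving the outcome column bare gives the same matches, here is a small Python sketch of the two predicates. It assumes values with whole-second resolution (as with an Oracle DATE); the helper names and the sample values are made up for illustration.

```python
from datetime import datetime, timedelta

def trunc(ts):
    """Drop the time portion, similar to TRUNC() on a date."""
    return ts.replace(hour=0, minute=0, second=0, microsecond=0)

def matches_with_trunc(outcome, admdt, disdt):
    # Original predicate: TRUNC() applied to all three values.
    return trunc(admdt) <= trunc(outcome) <= trunc(disdt)

def matches_bare_column(outcome, admdt, disdt):
    # Alternative: outcome is left untouched; the upper bound is pushed
    # to the last second of the discharge day instead.
    upper = trunc(disdt) + timedelta(days=1) - timedelta(seconds=1)
    return trunc(admdt) <= outcome <= upper

# Made-up admission/discharge timestamps and candidate outcome dates.
admdt = datetime(2019, 6, 1, 14, 30, 0)
disdt = datetime(2019, 6, 3, 9, 0, 0)
samples = [
    datetime(2019, 5, 31),            # day before admission
    datetime(2019, 6, 1),             # same day as admission, at midnight
    datetime(2019, 6, 3),             # discharge day, at midnight
    datetime(2019, 6, 3, 23, 59, 59), # last second of the discharge day
    datetime(2019, 6, 4),             # day after discharge
]

for outcome in samples:
    print(outcome,
          matches_with_trunc(outcome, admdt, disdt),
          matches_bare_column(outcome, admdt, disdt))
```

Both functions agree on every sample, which is why the bare-column form is a safe replacement when the column only carries whole seconds.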
I am practising and experimenting with different syntaxes of the SQL BETWEEN operator with regard to dates, following "https://www.w3schools.com/sql/sql_between.asp"
This is the Order table in my database:
LINK: https://www.w3schools.com/sql/sql_between.asp
The query fetches the orders whose OrderDate falls between two given dates.
These are the two main syntax versions (according to w3schools):
SELECT *
FROM Orders
WHERE OrderDate BETWEEN #01/07/1996# AND #31/07/1996#;
and:
SELECT *
FROM Orders
WHERE OrderDate BETWEEN '1996-07-01' AND '1996-07-31';
The output of both queries above against the Orders table:
Number of Records: 22 (out of 196 records). Yes, this is correct.
Now I am experimenting with variations of this syntax.
CASE #1:
SELECT *
FROM Orders
WHERE OrderDate BETWEEN #1996/07/01# AND #1996/07/31#;
Result of case #1: 22 (same as the above syntax)
In the SQL Tryit editor (https://www.w3schools.com/sql/trysql.asp?filename=trysql_select_between_date&ss=-1) they state that this SQL statement is not supported in the WebSQL database, and that the example still works because it uses a modified version of SQL.
WHY SO?
If you're using the W3Schools Tryit editor in Chrome, you're using WebSQL, which is basically SQLite.
SQLite doesn't have a dedicated date/time data type, so it is probably storing the date values as strings in ISO-8601 format (see this answer for more information).
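You can check this yourself with Python's built-in sqlite3 module. The sketch below uses a cut-down, made-up version of the Orders table (not the real W3Schools data) and asks SQLite what storage class the values actually have. Because ISO-8601 strings sort in date order, a plain string comparison still behaves like a date comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The column is declared DATE, but SQLite has no separate date storage class.
conn.execute("CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, OrderDate DATE)")
conn.executemany(
    "INSERT INTO Orders (OrderDate) VALUES (?)",
    [("1996-07-04",), ("1996-07-31",), ("1996-08-01",)],
)

# typeof() reports how each value is really stored.
storage = [row[0] for row in conn.execute(
    "SELECT DISTINCT typeof(OrderDate) FROM Orders"
)]

# The range test is a text comparison, which works because
# YYYY-MM-DD strings sort chronologically.
in_july = conn.execute(
    "SELECT COUNT(*) FROM Orders "
    "WHERE OrderDate BETWEEN '1996-07-01' AND '1996-07-31'"
).fetchone()[0]

print(storage, in_july)
```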
Other database systems (e.g. Oracle, Microsoft SQL Server, Postgres, MySQL) have built-in date types, and you generally write date values as strings (enclosed in single quotes). For example: '1997-07-01' (depending on the specific RDBMS, there might be more specific considerations).
The format that uses pound signs (e.g. #7/1/1997#) is unique to Microsoft Access (see this answer for more information).
Bottom line: Dates are generally enclosed in single quotes. You're best off sticking to the ISO-8601 standard (e.g. 1997-07-01).
If you're learning SQL, there are other resources out there besides W3Schools. I would recommend downloading an open-source RDBMS like Postgres or MySQL, setting up a sample database, and working on some queries. Challenge sites like codewars might also be helpful.
One more thing: Don't use BETWEEN for dates. Use >= and <, to make sure you're not excluding dates with a time portion. For more information, read this blog.
I have a TSQL view that processes multiple gigabytes of data in a SQL Server 2016 environment. In this view, there are multiple times where I am comparing if a DateTime value is before/after a static date, traditionally represented as a string literal like '2018-07-11'.
An example comparison would be:
SELECT MyId, MyValue FROM MyTable WHERE MyDate = '2018-07-11'
While looking for a way to use a DateTime literal instead of a string, I came across examples using ODBC DateTime strings like so:
SELECT MyId, MyValue FROM MyTable WHERE MyDate = {d '2018-07-11'}
When I compare the query plan I get the same result, even when I make up more advanced queries.
I started using this format in an attempt to prevent the auto-conversion of string to DateTime in queries, but I haven't been able to find any good documentation explaining any side effects of using ODBC functions. I'm not sure if this acts the same way as a string literal or if it is interpreted as a date.
If this was a UDF or Stored Procedure, I'd have the ability to declare a DateTime variable for use in the query, but in a VIEW this is not possible, nor would it be feasible because there are a lot of DateTime literals in the actual version of the query.
So in conclusion, does someone have any concrete reasons for or against using this {d '2018-07-11'} format (besides it potentially not being valid in a non SQL Server environment)?
I want to ensure that I'm not shooting myself in the foot here on a code review.
PS: I apologize for the vague examples and semi-open-ended question, I am not allowed to disclose any actual source code.
Thanks!
EDIT: I forgot to mention that I could also use DATEFROMPARTS(2018, 07, 11), but I wasn't sure if this would be looked at weirdly by the query optimizer.
The ODBC literal has the slight advantage that it can never be interpreted as YYYY-DD-MM, which is possible with one internationalization setting.
You can avoid ambiguity by using 'YYYYMMDD' format. This format is not affected by settings.
I prefer not using the ODBC, just because it seems to involve more clutter in the query. I admit to also preferring the hyphenated form (consistent with the ISO standard and other databases). But you have three alternatives. Possibly the safest for general purpose, SQL-Server-only code is the unhyphenated form.
A literal is a literal. It is transformed into a value during parsing. The value is used later.
Here is the list of DateTime literals that SQL Server supports. ODBC is a supported format.
So, if only using SQL Server then there is no difference. Different SQL flavors may reject the ODBC syntax. I do not believe it is ANSI SQL, so "less standard"?
I wanted to find out what is the "best practices" approach to a query against a record set of datetime with a date (no time).
I use several queries that return records based on a date range, from a recordset that uses a datetime data type, which means each record needs to be checked using a between range.
Example of a query would be:
Select *
FROM Usages
where CreationDateTime between '1/1/2012' AND '1/2/2012 11:59:59'
I know using BETWEEN is a resource hog, and that checking a datetime data type of a date is always going to be very resource intense, but I would like to hear what others use (or would use) in this situation.
Would I get any type of performance increase converting the datetime record to a Date like:
Select *
FROM Usages
where CONVERT(DATE,CreationDateTime) between '1/1/2012' AND '1/2/2012'
Or possibly doing a check of less than / greater than?
Select *
FROM Usages
where (CreationDateTime > '1/1/2012')
AND (CreationDateTime < '1/2/2012 11:59:59')
What you think you know is not correct.
Neither BETWEEN nor the DATETIME data type is a resource hog.
Provided that you index the column, that the column really is a DATETIME and not a VARCHAR(), and that you don't wrap the field in a function, everything will be nice and quick.
That said, I would use >= and < instead. Not for performance, but logical correctness.
WHERE
myField >= '20120101'
AND myField < '20120102'
This will work no matter whether the field contains hours, minutes, or even (with a mythical data type) picoseconds.
With an index on the field it will also give a range scan.
You won't get any faster. No tricks or functions needed.
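Here is a small sketch of the correctness point, using Python's sqlite3 module as a stand-in engine (the question is about SQL Server, so treat this as an illustration of the logic only). The Usages rows are made up, and the timestamps are stored as ISO-8601 text. A BETWEEN with a hand-written end-of-day bound misses a row with fractional seconds, while the half-open range catches it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Usages (id INTEGER PRIMARY KEY, CreationDateTime TEXT)"
)
conn.executemany(
    "INSERT INTO Usages (CreationDateTime) VALUES (?)",
    [
        ("2012-01-01 00:00:00",),      # start of the day
        ("2012-01-01 23:59:59.997",),  # just before midnight
        ("2012-01-02 00:00:00",),      # first instant of the next day
    ],
)

# Closed range with a guessed "last moment of the day" as upper bound.
between_count = conn.execute(
    "SELECT COUNT(*) FROM Usages "
    "WHERE CreationDateTime BETWEEN '2012-01-01' AND '2012-01-01 23:59:59'"
).fetchone()[0]

# Half-open range: from the start of the day up to, but excluding, the next day.
half_open_count = conn.execute(
    "SELECT COUNT(*) FROM Usages "
    "WHERE CreationDateTime >= '2012-01-01' AND CreationDateTime < '2012-01-02'"
).fetchone()[0]

print(between_count, half_open_count)
```

The half-open form does not need to know the precision of the column, which is the whole reason to prefer it.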
There are several considerations regarding dates.
First, you want to be sure that relevant indexes get used. In general, this means avoiding functions on the column. This applies to data types other than dates too, but functions are especially prevalent when working with dates. So, CONVERT() is a bad idea from a performance perspective, assuming that the column is indexed.
Second, you want to avoid unnecessary conversions between formats. A function call on the column must happen for every row, whereas converting a constant string to a date/time happens once, at compile time. The former is less efficient, which is another reason to avoid CONVERT(). However, in many queries, other processing (such as joins) is far more time-consuming than conversions, so this may not be important.
As for the choice between BETWEEN and the comparison operators: the better practice is to use "<", ">", ">=" and "<=". It makes the logic clearer for dates and avoids issues such as the time portion being accurate only to about 3 ms.
As far as I know, between on dates works as efficiently using indexes as other types of fields. However, for accuracy and portability it is best to do the individual comparisons.
So, the third version would be preferred.
I need a way to determine the number of days between two dates in SQL.
Answer must be in ANSI SQL.
ANSI SQL-92 defines DATE - DATE as returning an INTERVAL type. You are supposed to be able to extract scalars from INTERVALS using the same method as extracting them from DATEs using – appropriately enough – the EXTRACT function (4.5.3).
"<extract expression> operates on a datetime or interval and returns an exact numeric value representing the value of one component of the datetime or interval."
However, this is very poorly implemented in most databases. You're probably stuck using something database-specific. DATEDIFF is pretty well implemented across different platforms.
Here's the "real" way of doing it.
SELECT EXTRACT(DAY FROM DATE '2009-01-01' - DATE '2009-05-05') FROM DUAL;
Good luck!
I can't remember using an RDBMS that didn't support DATE1-DATE2, and SQL 92 seems to agree.
I believe the SQL-92 standard supports subtracting two dates with the '-' operator.
SQL 92 supports the following syntax:
t.date_1 - t.date_2
The EXTRACT function is also ANSI, but it isn't supported on SQL Server. Example:
ABS(EXTRACT(DAY FROM t.date_1) - EXTRACT(DAY FROM t.date_2))
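Wrapping the calculation in an absolute value function ensures the value comes out positive, even if the earlier date is listed first. Note that EXTRACT(DAY FROM ...) applied to a date returns the day-of-month component, so this expression only equals the number of days between the two dates when both fall in the same month and year.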
EXTRACT is supported on:
Oracle 9i+
MySQL
Postgres
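To make the difference between subtracting whole dates and subtracting their DAY components concrete, here is a short Python sketch using the two dates from the EXTRACT example above. It is not SQL, and the function names are made up; it only mirrors the arithmetic.

```python
from datetime import date

def days_between(d1, d2):
    """Subtract the dates themselves, then take the absolute number of days."""
    return abs((d1 - d2).days)

def day_of_month_difference(d1, d2):
    """Subtract only the day-of-month components of the two dates."""
    return abs(d1.day - d2.day)

start = date(2009, 1, 1)
end = date(2009, 5, 5)

print(days_between(start, end))             # difference across the whole range
print(day_of_month_difference(start, end))  # ignores months and years
```

The two results only coincide when both dates are in the same month of the same year, so date subtraction (or a database-specific DATEDIFF) is the safer route for "number of days".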