Problems with BETWEEN dates operator - sql

I am practising and experimenting with different syntax of SQL BETWEEN operator in regards to dates from the "https://www.w3schools.com/sql/sql_between.asp"
This is the Order table in my database:
LINK: https://www.w3schools.com/sql/sql_between.asp
The query is fetching the orderdates between a given condition of 2 dates.
These are the two main syntax versions (according to w3schools):
SELECT *
FROM Orders
WHERE OrderDate BETWEEN #01/07/1996# AND #31/07/1996#;
and:
SELECT *
FROM Orders
WHERE OrderDate BETWEEN '1996-07-01' AND '1996-07-31';
The output that we get on typing the above two queries from the Orders table
Number of Records: 22 (out of 196 records). Yes this is correct.
Now I am experimenting with this syntax versions.
CASE #1:
SELECT *
FROM Orders
WHERE OrderDate BETWEEN #1996/07/01# AND #1996/07/31#;
Result of case #1: 22 (same as the above syntax)
In the SQL try it out editor(https://www.w3schools.com/sql/trysql.asp?filename=trysql_select_between_date&ss=-1) they are stating that this SQL statement is not supported in the WebSQL database.The example still works, because it uses a modified version of SQL.
WHY SO?

If you're using the W3Schools Tryit editor in Chrome, you're using WebSQL, which is basically SQLite.
SQLite doesn't have a date/time format, so is probably storing the date values as strings formatted in the ISO-8601 format (see this answer for more information).
Other database systems (e.g. Oracle, Microsoft SQL Server, Postgres, MySQL) have built-in date formats, and you generally represent them as strings (enclosed in single quotes). For example: '1997-07-01' (depending on the specific RDBMS, there might be more specific considerations).
The format that uses pound signs (e.g. #7/1/1997#) is unique to Microsoft Access (see this answer for more information).
Bottom line: Dates are generally enclosed in single quotes. You're best off sticking to the ISO-8601 standard (e.g. 1997-07-01).
If you're learning SQL, there are other resources out there besides W3Schools. I would recommend downloading an open-source RDBMS like Postgres or MySQL, setting up a sample database, and working on some queries. Challenge sites like codewars might also be helpful
One more thing: Don't use BETWEEN for dates. Use >= and <, to make sure you're not excluding dates with a time portion. For more information, read this blog.

Related

Difference between % vs * in string comparison in Hive

When trying to list down all tables names in a database having a specific name format, the following query works fine :
show tables like '*case*';
while the following does not
show tables like '%case%';
On the other hand, when comparing the actual data inside string columns its the vice-versa case
Working query :
select column from database.table where column like '%ABC%' limit 5;
Not working query :
select column from database.table where column like '*ABC*' limit 5;
What's the difference between the 2 operators * and % ?
This is the difference between regular expressions and like patterns.
LIKE is built into the SQL language. It has two wildcards:
% represents any number of characters including zero.
_ represents exactly one character.
Regular expressions are much more flexible for matching almost any pattern in a string.
When SQL was invented, I don't think regular expressions were in common use in computer systems -- at the very least, the folks at IBM who worked on relational databases may not have been familiar with the folks at ATT who were inventing Unix.
Regular expressions are much more powerful than LIKE patterns, of course. And Hive supports them via the RLIKE operator (and some other functions).
The SHOW functionality is not standard SQL. So, the developers of Hive chose the more flexible method for pattern matching.
HiveQL attempts to mimic the SQL, but it does not strictly follow its standards.
The usage of the wildcards are not pertinent to the LIKE clause, but to the statement itself. SHOW statements validate the wildcards based on the Java regular expression whereas when it comes to SELECT statements, Hive tries to stick with the SQL's wildcard validation.

Are there any performance downsides to using ODBC date string-literals in SQL Server? Is it better than a regular string date literal?

I have a TSQL view that processes multiple gigabytes of data in a SQL Server 2016 environment. In this view, there are multiple times where I am comparing if a DateTime value is before/after a static date, traditionally represented as a string literal like '2018-07-11'.
An example comparison would be:
SELECT MyId, MyValue FROM MyTable WHERE MyDate = '2018-07-11'
While looking for a way to use a DateTime literal instead of a string, I came across examples using ODBC DateTime strings like so:
SELECT MyId, MyValue FROM MyTable WHERE MyDate = {d '2018-07-11'}
When I compare the query plan I get the same result, even when I make up more advanced queries.
I started using this format in an attempt to prevent the auto-conversion of string to DateTime in queries, but I haven't been able to find any good documentation explaining any side effects of using ODBC functions. I'm not sure if this acts the same way as a string literal or if it is interpreted as a date.
If this was a UDF or Stored Procedure, I'd have the ability to declare a DateTime variable for use in the query, but in a VIEW this is not possible, nor would it be feasible because there are a lot of DateTime literals in the actual version of the query.
So in conclusion, does someone have any concrete reasons for or against using this {d '2018-07-11'} format (besides it potentially not being valid in a non SQL Server environment)?
I want to ensure that I'm not shooting myself in the foot here on a code review.
PS: I apologize for the vague examples and semi-open-ended question, I am not allowed to disclose any actual source code.
Thanks!
EDIT: I forgot to mention that I could also use DATEFROMPARTS(2018, 07, 11), but I wasn't sure if this would be looked at weirdly by the query optimizer.
The ODBC literal has the slight advantage that it can never be interpreted as YYYY-DD-MM, which is possible with one internationalization setting.
You can avoid ambiguity by using 'YYYYMMDD' format. This format is not affected by settings.
I prefer not using the ODBC, just because it seems to involve more clutter in the query. I admit to also preferring the hyphenated form (consistent with the ISO standard and other databases). But you have three alternatives. Possibly the safest for general purpose, SQL-Server-only code is the unhyphenated form.
A literal is a literal. It is transformed into a value during parsing. The value is used later.
Here is the list of DateTime literals that SQL Server supports. ODBC is a supported format.
So, if only using SQL Server then there is no difference. Different SQL flavors may reject the ODBC syntax. I do not believe it is ANSI SQL, so "less standard"?

Check date value in Netezza

IN Netezza , I am trying to check if date value is valid or not ; something like ISDATE function in SQL server.
I am getting dates like 11/31/2013 which is not valid, how in Netezza I can check if this date is valid so I exclude them from my process.
Thanks
I don't believe there is a built-in Netezza function to check if a date is valid. You may be able to write a LUA function to do this, or you could try joining to a "Date" lookup table, like so:
Create a table with two columns:
DATE_VALUE date
DATE_STRING varchar(10)
Load data into this table for valid dates (generate a file in your favorite tool, excel, unix, whatever). There can even be more than one row per DATE_VALUE (different "valid" formats) if all you use this for is this check. If you fill in from, say, 1900 to 2100, as long as your data is within that range, you'll be fine. And it's a small table, too, for ~200 years only ~7300 rows. Add more if needed. Heck, since the NZ date datatype goes from AD1 to AD 9999, you could fill it completely with only 3.4 million rows (small for NZ).
Then, to isolate rows that have invalid dates, just use a JOIN or an EXISTS / NOT EXISTS to this table, on DATE_STRING. Since the table is so small, netezza will likely broadcast it to all SPUs, making the performance impact trivial.
Netezza Analytics Package 3.0 (free download) comes with a couple LUA functions that verify date values: isdate() and todate(). Very simple to install / compile.

How to create portable inserts from SQL Server?

Now it generates inserts like
INSERT [Bla] ([id], [description], [name], [version])
VALUES (CAST(1 AS Numeric(19, 0)), convert(t...
It's very SQL Server specific. I would like to create a script that everybody can use, database agnostic. I have very simple data types - varchars, numbers, dates, bits(boolean).
I think
insert into bla values (1, 'die', '2001-01-01 11:11:11')
should work in all DBMSs, right?
Some basic rules:
Get rid of the square brackets. In your case they are not needed - not even in SQL Server. (At the same time make sure you never use reserved words or special characters in column or table names).
If you do need to use special characters or reserved words (which is not something I would recommend), then use the standard double quotes (e.g. "GROUP").
But remember that names are case sensitive then: my_table is the same as MY_TABLE but "my_table" is different to "MY_TABLE" according to the standard. Again this might vary between DBMS and their configuration.
The CAST operator is standard and works on most DBMS (although not all support casting in all possible combinations).
convert() is SQL Server specific and should be replaced with an approriate CAST expression.
Try to specify values in the correct data type, never rely on implicit data conversion (so do not use '1' for a number). Although I don't think casting a 1 to a numeric() should be needed.
Usually I also recommend to use ANSI literals (e.g. DATE '2011-03-14') for DATE/TIMESTAMP literals, but SQL Server does not support that. So it won't help you very much.
A quick glance at the Wikipedia article on SQL, will tell you a bit about standardisation of SQL across different implementations, such as MS SQL, PostgreSQL, Oracle etc.
In short, there is a number of ANSI standards but there is varying support for it throught each product.
The general way to support multiple database servers from your software product is to accept there are differences, code for them at the database level, and make your application able to call the same database access code irrespective of database server.
There are a number of problems with number formats which will not port between dbmses however this pales when you look at the problems with dates and date formats. For instance the default DATE format used in an ORACLE DB depends on the whims of whoever installed the software, you can use date conversion functions to get ORACLE to accept the common date formats - but these functions are ORACLE specific.
Besides how do you know the table and column names will be the same on the target DB?
If you are serious about this, really need to port data between hydrogenous DBMSes, and know a bit of perl thn try using SqlFairy which is available from CPAN. The sheer size of this download should be enough to convince you how complex this problem can be.

Number of days between two dates - ANSI SQL

I need a way to determine the number of days between two dates in SQL.
Answer must be in ANSI SQL.
ANSI SQL-92 defines DATE - DATE as returning an INTERVAL type. You are supposed to be able to extract scalars from INTERVALS using the same method as extracting them from DATEs using – appropriately enough – the EXTRACT function (4.5.3).
<extract expression> operates on
a datetime or interval and returns an
exact numeric value representing the
value of one component of the datetime
or interval.
However, this is very poorly implemented in most databases. You're probably stuck using something database-specific. DATEDIFF is pretty well implemented across different platforms.
Here's the "real" way of doing it.
SELECT EXTRACT(DAY FROM DATE '2009-01-01' - DATE '2009-05-05') FROM DUAL;
Good luck!
I can't remember using a RDBMS that didn't support DATE1-DATE2 and SQL 92 seems to agree.
I believe the SQL-92 standard supports subtracting two dates with the '-' operator.
SQL 92 supports the following syntax:
t.date_1 - t.date_2
The EXTRACT function is also ANSI, but it isn't supported on SQL Server. Example:
ABS(EXTRACT(DAY FROM t.date_1) - EXTRACT(DAY FROM t.date_2)
Wrapping the calculation in an absolute value function ensures the value will come out as positive, even if a smaller date is the first date.
EXTRACT is supported on:
Oracle 9i+
MySQL
Postgres