We have a PostgreSQL DB running on AWS (Engine Version 11.8).
For each item, we are storing dates as strings [varchar type] (in the following format - '2020-12-16')
Our app has requirements to do date range queries. Based on that, what is the best and most efficient way to do string date comparisons in PostgreSQL?
I looked at several questions here on SO, but none of them discusses whether it makes a difference to store the dates as type "varchar" or type "date".
Also, for each of the 2 storage types above, what would be the most efficient way to do queries? In particular we are looking at querying for ranges (for example from '2020-12-10' to '2020-12-16')
Thanks a lot for your feedback
none of them discusses whether it makes a difference to store the dates as type "varchar" or type "date".
Fix your data model! You should be using the proper datatype to store your data: dates should be stored as date. Using the wrong datatype is wrong for many reasons:
whenever you need to do date arithmetic, you need to convert your strings to a date (e.g. how do you add one month to the string '2020-12-16'?); this is highly inefficient
data integrity cannot be enforced at the time when your data is stored; even using check constraints is not enough. E.g. how can you tell whether '2021-02-29' is a valid date or not?
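As a rough illustration of the integrity point, here is a minimal Python sketch (Python rather than SQL so it runs without a database). The helper name is_valid_iso_date is made up for this example. It shows that only real date parsing can tell a leap-day string that exists from one that does not, which is the check a date column performs on every insert.

```python
from datetime import datetime

def is_valid_iso_date(text):
    """Return True if text parses as a real calendar date (year-month-day)."""
    try:
        datetime.strptime(text, "%Y-%m-%d")
        return True
    except ValueError:
        return False

print(is_valid_iso_date("2020-02-29"))  # 2020 is a leap year
print(is_valid_iso_date("2021-02-29"))  # 2021 is not a leap year
```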
what would be the most efficient way to do queries? In particular we are looking at querying for ranges.
That said, the format that you are using makes it possible to do direct string comparison. So for a simple range comparison, I would suggest string operations:
where mystring >= '2020-12-10' and mystring < '2020-12-16'
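The reason this works is that zero-padded YYYY-MM-DD strings sort in the same order as the dates they represent. A small Python sketch with made-up sample values, mirroring the WHERE clause above:

```python
from datetime import date

# Hypothetical stored values, as they would sit in the varchar column.
stored = ["2020-12-09", "2020-12-10", "2020-12-15", "2020-12-16", "2021-01-02"]

# Half-open range on the raw strings, same shape as the WHERE clause.
in_range = [s for s in stored if "2020-12-10" <= s < "2020-12-16"]

# The same filter on real date values selects the same rows.
as_dates = [s for s in stored
            if date(2020, 12, 10) <= date.fromisoformat(s) < date(2020, 12, 16)]

print(in_range)
print(in_range == as_dates)
```

Note that the upper bound is exclusive here, so '2020-12-16' itself is not returned; use <= on the upper bound if the end date should be included.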
Before even thinking about performance: use the correct data types.
Try, for instance, to find every Friday the thirteenth in 2020 with your string representation. With dates it is very easy:
CREATE table thisyear(
thedate DATE NOT NULL PRIMARY KEY
);
INSERT INTO thisyear(thedate)
SELECT generate_series('2020-01-01'::date, '2021-01-01'::date -1, '1 day'::interval)
;
SELECT * FROM thisyear
WHERE date_part('dow', thedate) = 5 -- friday
AND date_part('day', thedate) = 13 -- the thirteenth
;
Result:
CREATE TABLE
INSERT 0 366
thedate
------------
2020-03-13
2020-11-13
(2 rows)
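The same check can be reproduced outside the database. This is only a cross-check sketch in Python (the function name friday_thirteenths is invented here), relying on the fact that date.weekday() numbers Monday as 0 and Friday as 4:

```python
from datetime import date

def friday_thirteenths(year):
    """Return every 13th of a month in `year` that falls on a Friday."""
    return [date(year, month, 13)
            for month in range(1, 13)
            if date(year, month, 13).weekday() == 4]  # Monday is 0, Friday is 4

for d in friday_thirteenths(2020):
    print(d.isoformat())
```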
I am not getting data when the range spans two years. Below is my BETWEEN condition:
to_char(Wfc.APPLYDTM,'MM/DD/YYYY') between '12/11/2019' and '01/10/2020'
I do get data when both dates fall in the same year (for example between '12/11/2019' and '12/31/2019', or between '01/01/2020' and '01/11/2020'), but not when the range spans two different years.
Please help
Try using TO_DATE instead of TO_CHAR, and then compare against valid Oracle date literals:
SELECT *
FROM Wfc
WHERE TO_DATE(APPLYDTM, 'MM/DD/YYYY') BETWEEN date '2019-12-11' AND date '2020-01-10';
Note that if APPLYDTM is already a date, then you don't need to call TO_DATE on it. It doesn't make sense to convert your data to character if you intend to work with it as a date.
You should convert your data to Date to be able to compare correctly.
The main idea is you should compare date value instead of string value.
to_date(Wfc.APPLYDTM,'MM/dd/yyyy') between to_date('12/11/2019','MM/dd/yyyy') and to_date('01/10/2020','MM/dd/yyyy')
Read here for more details.
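To see why the original string comparison returns nothing across a year boundary, here is a short Python sketch using the bounds from the question and one hypothetical row value. Strings in MM/DD/YYYY form compare character by character, so the month decides the order and the year is looked at last.

```python
from datetime import datetime

low, high = "12/11/2019", "01/10/2020"
value = "12/20/2019"  # hypothetical row that should be inside the range

# As strings, low starts with "1" and high starts with "0", so low sorts
# after high and nothing can lie between them.
string_match = low <= value <= high

def parse(text):
    """Convert an MM/DD/YYYY string to a date."""
    return datetime.strptime(text, "%m/%d/%Y").date()

# As dates, the comparison behaves as intended.
date_match = parse(low) <= parse(value) <= parse(high)

print(string_match, date_match)
```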
Do not convert date/time values to strings! Use the built in functionality.
Your logic is most simply expressed as:
Wfc.APPLYDTM >= DATE '2019-12-11' AND
Wfc.APPLYDTM < DATE '2020-01-11'
Note that the date constants are provided using the DATE keyword. This supports ISO 8601 standard date formats (happily!).
Also note the use of >= and < rather than BETWEEN. The date data type in Oracle can include a time component -- even if you don't see it when you query the table. This ensures that all date/times are included in the range.
As an added benefit, this can use an index on (APPLYDTM). Using a function usually precludes using an index, unless you have defined a function-based index.
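The effect of the time component can be sketched in a few lines of Python. The row value is hypothetical; the bounds are the ones from this answer.

```python
from datetime import datetime

start = datetime(2019, 12, 11)
last_day = datetime(2020, 1, 10)     # midnight at the start of the last day
next_day = datetime(2020, 1, 11)

row = datetime(2020, 1, 10, 15, 30)  # hypothetical row: afternoon of the last day

closed_range = start <= row <= last_day  # BETWEEN-style bounds miss this row
half_open = start <= row < next_day      # >= and < include it

print(closed_range, half_open)
```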
I'm having issues with what I assumed would be a simple problem, but googling isn't helping a great deal. Possibly I'm just bad at choosing what to search for. Nevertheless:
SELECT ORDER_NUMB, CUSTOMER_NUMB, ORDER_DATE
FROM ORDERS
WHERE FORMAT(ORDER_DATE, 'DD-MMM-YYYY') = '07-JUN-2000';
It tells me I am using an invalid identifier. I have tried using MON instead of MMM, but that doesn't help either.
Unsure if it makes any difference but I am using Oracle SQL Developer.
There are multiple issues related to your DATE usage:
WHERE FORMAT(ORDER_DATE, 'DD-MMM-YYYY') = '07-JUN-2000';
FORMAT is not an Oracle supported built-in function.
Never ever compare a STRING with DATE. You might just be lucky, however, you force Oracle to do an implicit data type conversion based on your locale-specific NLS settings. You must avoid it. Always use TO_DATE to explicitly convert string to date.
WHERE ORDER_DATE = TO_DATE('07-JUN-2000','DD-MON-YYYY','NLS_DATE_LANGUAGE=ENGLISH');
When you are dealing only with date without the time portion, then better use the ANSI DATE Literal.
WHERE ORDER_DATE = DATE '2000-06-07';
Read more about DateTime literals in documentation.
Update
I think it would be helpful to add some more information about DATE.
Oracle does not store dates in the format you see. It stores it
internally in a proprietary format in 7 bytes with each byte storing
different components of the datetime value.
BYTE  Meaning
----  -------
1     Century  -- stored in excess-100 notation
2     Year     -- " "
3     Month    -- stored in 0 base notation
4     Day      -- " "
5     Hour     -- stored in excess-1 notation
6     Minute   -- " "
7     Second   -- " "
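As a sketch of how such a 7-byte value could be decoded, assuming the layout in the table (century and year offset by 100, month and day stored as-is, time fields offset by 1) and ignoring BC dates. The function name decode_oracle_date and the sample bytes are constructed by hand for this illustration, not taken from a real database.

```python
from datetime import datetime

def decode_oracle_date(raw):
    """Decode a 7-byte value laid out as in the table above (AD dates only)."""
    if len(raw) != 7:
        raise ValueError("expected exactly 7 bytes")
    century, year, month, day, hour, minute, second = raw
    return datetime(
        (century - 100) * 100 + (year - 100),  # excess-100 century and year
        month, day,                            # stored as-is
        hour - 1, minute - 1, second - 1,      # excess-1 time fields
    )

sample = bytes([120, 100, 6, 7, 1, 1, 1])  # hand-built bytes for 2000-06-07 00:00:00
print(decode_oracle_date(sample))
```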
Remember,
To display : Use TO_CHAR
Any date arithmetic/comparison : Use TO_DATE
Performance Bottleneck:
Let's say you have a regular B-Tree index on a date column. Now, the following filter predicate will never use the index due to the TO_CHAR function:
WHERE TO_CHAR(ORDER_DATE, 'DD-MM-YYYY') = '07-06-2000';
So, the use of TO_CHAR in the above query is completely meaningless, as it neither compares dates nor delivers good performance.
Correct method:
The correct way to do the date comparison is:
WHERE ORDER_DATE = TO_DATE('07-JUN-2000','DD-MON-YYYY','NLS_DATE_LANGUAGE=ENGLISH');
It will use the index on the ORDER_DATE column, so it will perform much better. Also, it is comparing dates and not strings.
As I already said, when you do not have the time element in your date, then you could use ANSI date literal which is NLS independent and also less to code.
WHERE ORDER_DATE = DATE '2000-06-07';
It uses a fixed format 'YYYY-MM-DD'.
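The index point can be tried out without an Oracle instance. The sketch below uses SQLite through Python's sqlite3 module purely as a stand-in, so the planner, the strftime function and the plan wording are SQLite's and not Oracle's; the table and index names are invented. It asks the planner how it would run each filter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_numb INTEGER, order_date TEXT)")
conn.execute("CREATE INDEX idx_orders_date ON orders (order_date)")

def plan(where_clause):
    """Return the plan text SQLite reports for a query with this filter."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT order_numb FROM orders WHERE " + where_clause
    ).fetchall()
    return " ".join(row[-1] for row in rows)  # last column holds the description

# Column wrapped in a function: the plain index on order_date cannot be used.
wrapped = plan("strftime('%d-%m-%Y', order_date) = '07-06-2000'")
# Bare column compared to a constant: the index is eligible.
direct = plan("order_date = '2000-06-07'")

print(wrapped)
print(direct)
```

The exact plan text varies between SQLite versions, but the second plan should mention idx_orders_date while the first should not.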
Try this:
SELECT ORDER_NUMB, CUSTOMER_NUMB, ORDER_DATE
FROM ORDERS
WHERE trunc(ORDER_DATE) = to_date('07-JUN-2000', 'DD-MON-YYYY');
I do not recognize FORMAT as an oracle function.
I think you meant TO_CHAR.
SELECT ORDER_NUMB, CUSTOMER_NUMB, ORDER_DATE
FROM ORDERS
WHERE TO_CHAR(ORDER_DATE, 'DD-MON-YYYY') = '07-JUN-2000';
try to_char(order_date, 'DD-MON-YYYY')
I am importing records from a DB2 data source into a MS SQL Server destination.
The records are coming in with the date format of 20150302/YYYYMMDD, but I only want the last 14 days based on current server date.
Can someone advise on how to select based on this date format against DATEADD(d, - 1, { fn CURDATE() }) please?
Thanks!
It would be better to do this on the DB2 side, to reduce the number of records brought over.
Additionally, it's better from a performance standpoint to convert the static date into a numeric date and compare that to the column in your table, rather than converting the numeric date in your table to an actual date type for comparison.
where numdate >= int(replace(char(current_date - 14 days,iso),'-',''))
Doing it this way will allow you to take advantage of an index over numdate. In addition, DB2 will only need to perform this conversion once.
Depending on your platform & version, you may have an easier way to convert from a date data type to a numeric date. But the above works on DB2 for i and should work on most (all?) DB2 versions and platforms.
You may find it worthwhile to create a UDF to do this conversion for you.
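The idea of converting the cutoff once and comparing it to the numeric column can be sketched in Python as follows. The function name numeric_cutoff, the fixed "today" and the sample column values are all assumptions made for the example.

```python
from datetime import date, timedelta

def numeric_cutoff(today, days_back=14):
    """Return the date `days_back` days before `today` as a YYYYMMDD integer."""
    return int((today - timedelta(days=days_back)).strftime("%Y%m%d"))

# The cutoff is computed once, then compared against the stored integers.
cutoff = numeric_cutoff(date(2015, 3, 16))
numdates = [20150228, 20150302, 20150310]  # hypothetical column values
recent = [n for n in numdates if n >= cutoff]

print(cutoff, recent)
```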
If you want logic in SQL Server, then you are in luck, because you can just convert the YYYYMMDD format to a date:
where cast(datecol as date) >= cast(getdate() - 14 as date)
(This assumes no future dates.)
If you want to do this on the DB2 side, you can use to_date():
where to_date(datecol, 'YYYYMMDD') >= current date - 14 days
I've been tasked to take a calendar date range value from a form front-end and use it to, among other things, feed a query in a Teradata table that does not have a datetime column. Instead the date is aggregated from two varchar columns: one for year (CY = current year, LY = last year, LY-1, etc), and one for the date with format MonDD (like Jan13, Dec08, etc).
I'm using ColdFusion for the form and result page, so I have the ability to dynamically create the query, but I can't think of a good way to do it for all possible cases. Any ideas? Even year differences aside, I can't think of anything outside of a direct comparison on each day in the range with a potential ton of separate OR statements in the query. I'm light on SQL knowledge - maybe there's a better way to script it in the SQL itself using some sort of conversion on the two varchar columns to form an actual date range where date comparisons could then be made?
Here is some SQL that will take the VARCHAR date value and perform some basic manipulations on it to get you started:
SELECT CAST(CAST('Jan18'||TRIM(EXTRACT(YEAR FROM CURRENT_DATE)) AS CHAR(9)) AS DATE FORMAT 'MMMDDYYYY') AS BaseDate_
, CASE WHEN Col1 = 'CY'
THEN BaseDate_
WHEN Col1 = 'LY'
THEN ADD_MONTHS(BaseDate_, -12)
WHEN Col1 = 'LY-1'
THEN ADD_MONTHS(BaseDate_, -24)
ELSE BaseDate_
END AS DateModified_
FROM {MyDB}.{MyTable};
The EXTRACT() function allows you to take apart a DATE, TIME, or TIMESTAMP value.
You have to use TRIM() around the EXTRACT to get rid of the whitespace that is added when converting the DATEPART to a CHAR data type. Teradata is funny with dates and often requires a double CAST() to get things sorted out.
The CASE statement simply takes the encoded values you suggested will be used and uses the ADD_MONTHS() function to manipulate the date. Dates are INTEGER in Teradata so you can also add INTEGER values to them to move the date by a whole day. Unlike Oracle, you can't add fractional values to manipulate the TIME portion of a TIMESTAMP. DATE != TIMESTAMP in Teradata.
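For comparison, the same decoding can be sketched outside the database. This Python version assumes, like the CASE expression above, that CY is the current year, LY is one year back and LY-n is n+1 years back; the function name decode and the fixed "today" are made up for the example, and month abbreviations are assumed to be English.

```python
from datetime import date, datetime

def decode(year_code, mon_dd, today):
    """Turn column values such as ('LY-1', 'Jan13') into a real date."""
    if year_code == "CY":
        years_back = 0
    elif year_code == "LY":
        years_back = 1
    else:
        # 'LY-1' is two years back, 'LY-2' three, and so on (assumption).
        years_back = int(year_code[len("LY-"):]) + 1
    year = today.year - years_back
    # A space separates the day from the year so the parse is unambiguous.
    return datetime.strptime(f"{mon_dd} {year}", "%b%d %Y").date()

today = date(2010, 4, 12)
print(decode("CY", "Jan13", today))
print(decode("LY-1", "Dec08", today))
```

A value such as Feb29 raises ValueError when the target year is not a leap year, so real data would need a rule for that case.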
Rob gave you an sql approach. Alternatively you can use ColdFusion to generate values for the columns you have. Something like this might work.
sampleDate = CreateDate(2010,4,12); // this simulates user input
yearsBack = year(now()) - year(sampleDate); // difference in calendar years
if (yearsBack is 0)
col1Value = 'CY';
else if (yearsBack is 1)
col1Value = 'LY';
else
col1Value = 'LY-' & (yearsBack - 1); // LY-1 is two years back, matching the SQL answer
col2Value = DateFormat(sampleDate, 'mmmdd');
Then you send col1Value and col2Value to your query as parameters.
I have a value in field called "postingdate" as string in 2009-11-25, 12:42AM IST format, in a table named "Post".
I need the query to fetch the details based on date range. I tried the following query, but it throws an error. Please guide me to fix this issue. Thanks in advance.
select postingdate
from post
where TO_DATE(postingDate,'YYYY-MM-DD')>61689
and TO_DATE(postingDate,'YYYY-MM-DD')<61691
As you've now seen, trying to perform any sort of query against a string column which represents a date is a problem. You've got a few options:
Convert the postingdate column to some sort of DATE or TIMESTAMP datatype. I think this is your best choice as it will make querying the table using this field faster, more flexible, and less error prone.
Leave postingdate as a string and use functions to convert it back to a date when doing comparisons. This will be a performance problem as most queries will turn into full table scans unless your database supports function-based indexes.
Leave postingdate as a string and compare it against other strings. Not a good choice as it's tough to come up with a way to do ranged queries this way, as I think you've found.
If it was me I'd convert the data. Good luck.
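If the data is converted, the string first has to be parsed. A minimal Python sketch of that step, assuming every value looks like the sample in the question and that the trailing zone label can simply be dropped (the function name parse_posting is invented here):

```python
from datetime import datetime

def parse_posting(text):
    """Parse '2009-11-25, 12:42AM IST' into a naive datetime, ignoring the zone label."""
    stamp, _, _zone = text.rpartition(" ")  # split off the trailing 'IST'
    return datetime.strptime(stamp, "%Y-%m-%d, %I:%M%p")

posted = parse_posting("2009-11-25, 12:42AM IST")
# Once it is a real datetime, a range check is an ordinary comparison.
in_range = datetime(2009, 11, 25) <= posted < datetime(2009, 11, 26)

print(posted, in_range)
```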
In SQL Server you can say
Select postingdate from post
where postingdate between '6/16/1969' and '6/16/1991'
If it's really a string, you're lucky that it's in YYYY-MM-DD format. You can sort and compare that format as a string, because the most significant numbers are on the left side. For example:
select *
from Posts
where StringDateCol between '2010-01-01' and '2010-01-02'
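There's no need to convert the string to a date; the comparison still works with the , 12:42AM IST appendage. One caveat: a row dated 2010-01-02 sorts after the bare bound '2010-01-02', so use the following day as the upper bound if that day should be included. Unless, of course, your table contains dates from a different time zone :)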
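How the appended time interacts with the bounds can be checked with plain string comparison. The rows below are made-up values in the question's format.

```python
rows = [
    "2009-12-31, 11:59PM IST",
    "2010-01-01, 09:15AM IST",
    "2010-01-02, 12:42AM IST",
]

# BETWEEN-style bounds: the 2010-01-02 row is longer than the bound
# '2010-01-02' and therefore sorts after it.
between = [r for r in rows if "2010-01-01" <= r <= "2010-01-02"]

# Half-open bounds with the next day as the exclusive end keep both days.
half_open = [r for r in rows if "2010-01-01" <= r < "2010-01-03"]

print(between)
print(half_open)
```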
You will need to convert your string into a date before you run date range queries on it. You may get away with just using the string if you're not interested in the time portion.
The actual functions will depend on your RDBMS
for strings only
select * from posts
where LEFT(postingDate,10) > '2010-01-21'
or
for datetime (Sybase example)
select * from posts
where convert(DateTime,postingDate) between '2010-01-21' and '2010-01-31'