I am trying to select only the performances from August in my database and then count how many there were, but I can't figure out how to do it.
Here is the code I have so far:
SELECT f.FILM_NAME, COUNT(p.PERFORMANCE_DATE), SUM(p.TAKINGS), p.PERFORMANCE_DATE
FROM A2_PERFORMANCE p, A2_FILM f
WHERE p.PERFORMANCE_DATE LIKE TO_DATE('08-2021', 'MM-YY')
GROUP BY f.FILM_NAME, p.PERFORMANCE_DATE
ORDER BY f.FILM_NAME
I am currently trying to achieve this:
-- FILM_NAME Performances Total Takings
-- --------------------------- ------------ ----------------------
-- It Happened One Night 39 £63,571
-- Modern Times 38 £58,332
-- Parasite 23 £37,195
-- Knives Out 22 £34,362
-- Citizen Kane 25 £32,711
-- The Wizard of Oz 18 £21,716
-- Avengers: Endgame 18 £17,081
You can convert your dates to string and compare them to '08-2021' (not in the way you did - you must apply formatting to the dates themselves first), but that is inefficient.
You can also truncate the dates to the beginning of the month and compare to date '2021-08-01', but that is also inefficient.
"Inefficiency" comes from two sources, one smaller and one really big. The smaller one is having to apply functions to data from your table. The really big one has to do with indexing: if your queries often filter based on dates, then you would benefit from indexing the date column - but the index can't be used if your filters apply a function to the date column first.
So, how to do it? Best (especially if your "dates" may have time-of-day component other than midnight) is something like this:
...
where p.performance_date >= date '2021-08-01'
and p.performance_date < date '2021-09-01'
...
Note that it's always two inequalities, the first one is non-strict and the second is strict.
If instead you use performance_date between [Aug 1] and [Sep 1] (in pseudo-code), this will give the wrong answer - this makes the second inequality also non-strict, so you will include performances with a date of Sept. 1, if they are saved in the db with a time-of-day of midnight. And between [Aug 1] and [Aug 31] will miss everything on August 31 with a time-of-day OTHER THAN midnight.
Best not to use BETWEEN and instead to use two explicit inequalities, as I have shown.
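To make the boundary behaviour concrete, here is a small sketch that counts the same three rows with each kind of filter. The inline rows are made up for illustration and stand in for A2_PERFORMANCE; only the date logic matters.

```sql
-- Illustrative rows only: midnight on Aug 1, an evening show on Aug 31,
-- and midnight on Sep 1.
WITH perf AS (
  SELECT TO_DATE('2021-08-01 00:00:00', 'YYYY-MM-DD HH24:MI:SS') AS performance_date FROM dual
  UNION ALL
  SELECT TO_DATE('2021-08-31 19:30:00', 'YYYY-MM-DD HH24:MI:SS') FROM dual
  UNION ALL
  SELECT TO_DATE('2021-09-01 00:00:00', 'YYYY-MM-DD HH24:MI:SS') FROM dual
)
SELECT
  -- two explicit inequalities: counts the two August rows
  COUNT(CASE WHEN performance_date >= DATE '2021-08-01'
              AND performance_date <  DATE '2021-09-01' THEN 1 END) AS half_open,
  -- BETWEEN ... Sep 1: also counts the Sep 1 midnight row
  COUNT(CASE WHEN performance_date BETWEEN DATE '2021-08-01'
                                       AND DATE '2021-09-01' THEN 1 END) AS between_sep_1,
  -- BETWEEN ... Aug 31: misses the Aug 31 evening row
  COUNT(CASE WHEN performance_date BETWEEN DATE '2021-08-01'
                                       AND DATE '2021-08-31' THEN 1 END) AS between_aug_31
FROM perf;
```

With these three rows the counts come out as 2, 3 and 1 respectively, which is the over-count and under-count described above.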
You are currently doing:
WHERE p.PERFORMANCE_DATE LIKE TO_DATE('08-2021', 'MM-YY')
You are providing a 4-digit year value (2021) but have only given YY in the format model, not YYYY. Now, Oracle by default is lenient about this sort of thing - unhelpfully so, some might say - so this will work in this case. But it's not good practice, and TO_DATE('08-2021', 'MM-YYYY') would be better. Either way, that will give you midnight on the first day of that month. You can provide that a bit more simply with a date literal: DATE '2021-08-01'.
LIKE is a pattern-matching condition, and compares strings, not dates. You are forcing an implicit conversion of both your column value and the fixed date you provided to strings, using your session's NLS_DATE_FORMAT setting. You also aren't including any wildcard characters, making it equivalent to =. So with a common NLS setting, you are really doing:
WHERE TO_CHAR(p.PERFORMANCE_DATE, 'DD-MON-RR') = TO_CHAR(DATE '2021-08-01', 'DD-MON-RR')
With that NLS setting you would match any values on August 1st - though it would match data from 1921 etc. as well as 2021, if you had any. With a more precise NLS setting you might be doing:
WHERE TO_CHAR(p.PERFORMANCE_DATE, 'YYYY-MM-DD HH24:MI:SS') = TO_CHAR(DATE '2021-08-01', 'YYYY-MM-DD HH24:MI:SS')
which would only match values at exactly midnight on August 1st.
To match the whole month you could use an explicit mask, instead of the implicit conversion; and you only need to convert the column value, as you can supply the fixed value in the format you want anyway:
WHERE TO_CHAR(p.PERFORMANCE_DATE, 'YYYY-MM') = '2021-08'
or, if you prefer (I don't, but it's not my code):
WHERE TO_CHAR(p.PERFORMANCE_DATE, 'MM-YYYY') = '08-2021'
But that conversion will prevent a normal index on that column being used. I would suggest providing a date range to avoid any conversion:
WHERE p.PERFORMANCE_DATE >= DATE '2021-08-01'
AND p.PERFORMANCE_DATE < DATE '2021-09-01'
which will find all rows where that column is on or after midnight on August 1st, and before midnight on September 1st.
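Putting the date range into a complete query might look roughly like the sketch below. Note that the join condition is an assumption: the question never shows how A2_PERFORMANCE and A2_FILM are related, so FILM_ID is a hypothetical column name that you would replace with your real key. PERFORMANCE_DATE is also dropped from the SELECT and GROUP BY, since grouping by it would give one row per film per date rather than one row per film.

```sql
SELECT f.film_name,
       COUNT(*)       AS performances,
       SUM(p.takings) AS total_takings
FROM   a2_performance p
JOIN   a2_film f
       ON f.film_id = p.film_id            -- assumed join column, not from the question
WHERE  p.performance_date >= DATE '2021-08-01'
AND    p.performance_date <  DATE '2021-09-01'
GROUP  BY f.film_name
ORDER  BY total_takings DESC;              -- the target output appears sorted by takings
```

Without a join condition, the comma-style FROM in the original query pairs every performance with every film, which would inflate both the counts and the sums.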
If you are looking for all activities in August regardless of the year, you can convert your column to a string with the 'MON' mask and match it against 'AUG' (this relies on the session's date language being English):
SELECT f.FILM_NAME, COUNT(p.PERFORMANCE_DATE), SUM(p.TAKINGS),p.PERFORMANCE_DATE
FROM A2_PERFORMANCE p, A2_FILM f
WHERE TO_CHAR(p.PERFORMANCE_DATE,'MON') ='AUG'
GROUP BY f.FILM_NAME, p.PERFORMANCE_DATE
ORDER BY f.FILM_NAME
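As a hedged variation on the month-only filter: the 'MON' abbreviation depends on the session's NLS_DATE_LANGUAGE, so two ways to avoid that dependency are sketched here. Both use only the PERFORMANCE_DATE column from the question; like any function applied to the column, neither can use a plain index on it.

```sql
-- Option 1: compare the month number, which is language-independent.
SELECT COUNT(*) AS august_performances
FROM   a2_performance p
WHERE  EXTRACT(MONTH FROM p.performance_date) = 8;

-- Option 2: keep 'MON' but pin the language for this one conversion.
SELECT COUNT(*) AS august_performances
FROM   a2_performance p
WHERE  TO_CHAR(p.performance_date, 'MON', 'NLS_DATE_LANGUAGE=ENGLISH') = 'AUG';
```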
Related
I am casting mm/dd/yy strings to dates in Redshift using CAST(birth_str AS DATE) AS birth_date. The conversion handles the components correctly, but the year lands in the future whenever it falls below 1970. For example:
birth_str birth_date
07/19/84 1984-07-19
02/07/66 2066-02-07
06/24/84 1984-06-24
01/31/64 2064-01-31
12/08/62 2062-12-08
02/21/36 2036-02-21
02/19/37 2037-02-19
07/01/74 1974-07-01
08/25/50 2050-08-25
08/31/39 2039-08-31
Is there a best practice for getting dates to not fall into the future?
Is there not an argument for this in the cast? (I looked everywhere but I am finding nothing.) Otherwise, I am envisioning the best path forward is testing for the cast date being in the future and then just doing string surgery on the miscreants before recasting them into reasonable dates.
Basically:
if not future date: great.
if future date:
peel out all the date components
slap a 19 onto the yy
glue everything back together
cast into date.
Is this as good as it gets? (I was a bit surprised I could find no one has come up with a better way around this issue already.)
Is there a best practice? Absolutely! Don't store dates as strings. Store dates as date. That is why SQL has native types.
In your case, you could use conditional logic:
select (case when cast(birth_str AS DATE) < current_date
then cast(birth_str AS DATE)
else cast(birth_str AS DATE) - interval '100 year'
end) as birth_date
Or since Redshift can't handle intervals you can go with this:
SELECT (CASE
WHEN birth_str::DATE < CURRENT_DATE
THEN birth_str::DATE
ELSE ADD_MONTHS(birth_str::DATE, -1200)
END) AS birth_date
You can use a CASE expression to check whether the converted date is later than today. If it is, subtract 100 years from the result, as below.
One question: is there any chance of having dates like 02/21/14, which could belong to either the 1900s or the 2000s?
SELECT
CASE
WHEN CAST('02/21/36' AS DATE) > GETDATE() THEN DATEADD(year, -100, CAST('02/21/36' AS DATE))
ELSE CAST('02/21/36' AS DATE)
END
I understand that querying a date will fail as it's comparing a string to a date and that can cause an issue.
Oracle 11.2 G
Unicode DB
NLS_DATE_FORMAT DD-MON-RR
select * from table where Q_date='16-Mar-09';
It can be solved by
select * from table where trunc(Q_date) = TO_DATE('16-MAR-09', 'DD-MON-YY');
What I don't get is why this works.
select * from table where Q_date='07-JAN-08';
If anyone can please elaborate or correct my mindset.
Thanks
Oracle does allow a string to be used where a date is expected, but how it is interpreted depends on the installation (particularly the value of NLS_DATE_FORMAT). Hence, there is no universal format for interpreting a single string as a date (unless you use the DATE keyword).
The default format is DD-MON-RR, which is the NLS_DATE_FORMAT you listed for your server. So, your statement:
where Q_date = '07-JAN-08'
is interpreted using this format.
I prefer to use the DATE keyword with the ISO standard YYYY-MM-DD format:
where Q_Date = DATE '2008-01-07'
If this gets no rows returned:
select * from table where Q_date='16-Mar-09';
but this does see data:
select * from table where trunc(Q_date) = TO_DATE('16-MAR-09', 'DD-MON-YY');
then you have rows which have a time other than midnight. At this point in the century DD-MON-RR and DD-MON-YY are equivalent, and both will see 09 as 2009, so the date part is right. But the first will only find rows where the time is midnight, while the second is stripping the time off via the trunc, meaning the dates on both sides are at midnight, and therefore equal.
And since this also finds data:
select * from table where Q_date='07-JAN-08';
... then you have rows at midnight on that date. You might also have rows with other times, so checking the count with the trunc version might be useful.
You can check the times you actually have with:
select to_char(q_date, 'YYYY-MM-DD HH24:MI:SS') from table;
If you do want to make sure you catch all times within the day you can use a range:
select * from table where
q_date >= date '2009-03-16'
and q_date < date '2009-03-17';
Quick SQL Fiddle demo.
Although it sounds like you're expecting all the times to be midnight, which might indicate a data problem.
First, I am aware that this question has been posted generally Equals(=) vs. LIKE.
Here I am asking specifically about DATE data in an Oracle database. When I write the SELECT statement this way:
SELECT ACCOUNT.ACCOUNT_ID, ACCOUNT.LAST_TRANSACTION_DATE
FROM ACCOUNT
WHERE ACCOUNT.LAST_TRANSACTION_DATE LIKE '30-JUL-07';
I get all the rows I'm looking for. But when I use the equals sign (=) instead:
SELECT ACCOUNT.ACCOUNT_ID, ACCOUNT.LAST_TRANSACTION_DATE
FROM ACCOUNT
WHERE ACCOUNT.LAST_TRANSACTION_DATE = '30-JUL-07';
I get nothing, even though nothing is different except the equals sign. Is there an explanation for this?
Assuming LAST_TRANSACTION_DATE is a DATE column (or TIMESTAMP) then both versions are very bad practice.
In both cases an implicit conversion based on the current NLS settings takes place: with LIKE the DATE column is converted to a string, and with = the string is converted to a DATE. That means with different clients you will get different results.
When using date literals always use to_date() with(!) a format mask or use an ANSI date literal. That way you compare dates with dates not strings with strings. So for the equal comparison you should use:
LAST_TRANSACTION_DATE = to_date('30-JUL-07', 'dd-mon-yy')
Note that using 'MON' can still lead to errors with different NLS settings ('DEC' vs. 'DEZ' or 'MAR' vs. 'MRZ'). It is much less error prone using month numbers (and four digit years):
LAST_TRANSACTION_DATE = to_date('30-07-2007', 'dd-mm-yyyy')
or using an ANSI date literal
LAST_TRANSACTION_DATE = DATE '2007-07-30'
Now the reason why the above query is very likely to return nothing is that in Oracle DATE columns include the time as well. The above date literals implicitly contain the time 00:00. If the time in the table is different (e.g. 19:54) then of course the dates are not equal.
To work around this problem you have different options:
use trunc() on the table column to "normalize" the time to 00:00
trunc(LAST_TRANSACTION_DATE) = DATE '2007-07-30'
this will however prevent the usage of an index defined on LAST_TRANSACTION_DATE
use between
LAST_TRANSACTION_DATE between to_date('2007-07-30 00:00:00', 'yyyy-mm-dd hh24:mi:ss') and to_date('2007-07-30 23:59:59', 'yyyy-mm-dd hh24:mi:ss')
The performance problem of the first solution could be worked around by creating an index on trunc(LAST_TRANSACTION_DATE) which could be used by that expression. But the expression LAST_TRANSACTION_DATE LIKE '30-JUL-07' prevents index usage as well, because internally it is processed as to_char(LAST_TRANSACTION_DATE) LIKE '30-JUL-07'.
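A minimal sketch of the function-based index mentioned above, assuming the ACCOUNT table from the question. The index name is made up for illustration.

```sql
-- Hypothetical index name; indexes the truncated (midnight) value of the column.
CREATE INDEX account_last_txn_day_ix
  ON account (TRUNC(last_transaction_date));

-- A filter written with exactly the same expression can then use that index.
SELECT account_id, last_transaction_date
FROM   account
WHERE  TRUNC(last_transaction_date) = DATE '2007-07-30';
```

The trade-off is an extra index to maintain. A range filter on the bare column (>= the day, < the next day) gets the same rows using an ordinary index on LAST_TRANSACTION_DATE.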
The important things to remember:
Never, ever rely on implicit data type conversion. It will give you problems at some point. Always compare the correct data types
Oracle DATE columns always contain a time which is part of the comparison rules.
You should not compare a date to a string directly. You rely on implicit conversions, the rules of which are difficult to remember.
Furthermore, your choice of date format is not optimal: years have four digits (Y2K bug?), and not all languages have the seventh month of the year named JUL. You should use something like YYYY/MM/DD.
Finally, dates in Oracle are points in time precise to the second. All dates have a time component, even if it is 00:00:00. When you use the = operator, Oracle will compare the date and time for dates.
Here's a test case reproducing the behaviour you described:
SQL> create table test_date (d date);
Table created
SQL> alter session set nls_date_format = 'DD-MON-RR';
Session altered
SQL> insert into test_date values
2 (to_date ('2007/07/30 11:50:00', 'yyyy/mm/dd hh24:mi:ss'));
1 row inserted
SQL> select * from test_date where d = '30-JUL-07';
D
-----------
SQL> select * from test_date where d like '30-JUL-07';
D
-----------
30/07/2007
When you use the = operator, Oracle will convert the constant string 30-JUL-07 to a date and compare the value with the column, like this:
SQL> select * from test_date where d = to_date('30-JUL-07', 'DD-MON-RR');
D
-----------
When you use the LIKE operator, Oracle will convert the column to a string and compare it to the right-hand side, which is equivalent to:
SQL> select * from test_date where to_char(d, 'DD-MON-RR') like '30-JUL-07';
D
-----------
30/07/2007
Always compare dates to dates and strings to strings. Related question:
How to correctly handle dates in queries constraints
The date field is not a string. Internally an implicit conversion is made to a string when you use =, which does not match anything because your string does not have the required amount of precision.
I'd have a guess that the LIKE statement behaves somewhat differently with a date field, causing implicit wildcards to be used in the comparison that eliminates the requirement for any precision. Essentially, your LIKE works like this:
SELECT ACCOUNT.ACCOUNT_ID, ACCOUNT.LAST_TRANSACTION_DATE
FROM ACCOUNT
WHERE ACCOUNT.LAST_TRANSACTION_DATE BETWEEN DATE('30-JUL-07 00:00:00.00000+00:00') AND DATE('30-JUL-07 23:59:59.99999+00:00');
I'm writing an SQL statement that is supposed to do a count based on a date range. But for some reason no data is being returned. Before I try to filter the count with my date range, everything works fine. Here is that code.
SELECT
CR.GCR_RFP_ID
,S.RFP_RECEIVED_DT
,CR.GCR_RECEIVED_DT
,CT.GCT_LOB_IND
FROM ADM.GROUP_CHANGE_TASK_FACT CT
JOIN ADM.B_GROUP_CHANGE_REQUEST_DIM CR
ON CR.GROUP_CHANGE_REQUEST_KEY = CT.GROUP_CHANGE_REQUEST_KEY
JOIN ADM.B_RFP_WC_COVERAGE_DIM S
ON S.RFP_ID = CR.GCR_RFP_ID
WHERE CT.GCT_LOB_IND = 'WC'
AND CR.GCR_CHANGE_TYPE_ID IN ('10','20','30','50','60','70','80','90','100','110',
'120','130','140', '150','160','170','180','190','200',
'210','220','230','240','260','270','280','300','310',
'320','330','340','350','360','370','371','372')
AND S.RFP_AUDIT_IND = 'N'
AND S.RFP_TYPE_IND = 'A'
The date field I'm using is called CR.GCR_RECEIVED_DT. This is a new field in the db and all the records are 01-JAN-00. But I'm still doing the count just to make sure I can grab the data. Now, I added this line:
AND CR.GCR_RECEIVED_DT LIKE '01-JAN-00'
just as a random test thing. I know all the dates are the same. And it works fine, no issues. So I remove that line and replace it with this:
AND CR.GCR_RECEIVED_DT BETWEEN '31-DEC-99' AND '02-JAN-00'
I used this small range to keep it simple. But even though 01-JAN-00 definitely falls between those two dates, no data is returned. I have no idea why this is happening. I even tried this line too:
AND CR.GCR_RECEIVED_DT = '01-JAN-00'
and I still don't get data returned. It only seems to work with LIKE. I have checked and the field is a date type. Any help would be much appreciated.
If your NLS_DATE_FORMAT is set to DD-MON-YY then the apparent discrepancy between the first two results can be explained.
When you use LIKE it implicitly converts the date value on the left-hand side to a string for the comparison, using the default format model, and then compares that to the fixed string; and '01-JAN-00' is like '01-JAN-00'. You're effectively doing:
AND TO_CHAR(CR.GCR_RECEIVED_DT, 'DD-MON-YY') LIKE '01-JAN-00'
Using LIKE to compare dates doesn't really make any sense though. When you use BETWEEN, though, the left-hand side is being left as a date, so you're effectively doing:
AND CR.GCR_RECEIVED_DT BETWEEN TO_DATE('31-DEC-99', 'DD-MON-YY')
AND TO_DATE('02-JAN-00', 'DD-MON-YY')
... and TO_DATE('31-DEC-99', 'DD-MON-YY') is December 31st 2099, not 1999. BETWEEN only works when the first value is lower than the second (from the docs, 'If expr3 < expr2, then the interval is empty'). So you're looking for values between 2099 and 2000, and that will always be empty. If your date model was DD-MON-RR, from the NLS parameter or explicitly via TO_DATE, then it would be looking for values between 1999 and 2000, and would find your records.
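The YY versus RR difference is easy to check for yourself. This sketch converts the same string with both models and prints the resulting four-digit year. The NLS_DATE_LANGUAGE argument is only there so that 'DEC' parses regardless of session language.

```sql
SELECT TO_CHAR(TO_DATE('31-DEC-99', 'DD-MON-YY', 'NLS_DATE_LANGUAGE=ENGLISH'),
               'YYYY-MM-DD') AS yy_result,   -- YY uses the current century
       TO_CHAR(TO_DATE('31-DEC-99', 'DD-MON-RR', 'NLS_DATE_LANGUAGE=ENGLISH'),
               'YYYY-MM-DD') AS rr_result    -- RR rolls 50-99 back a century
FROM   dual;
```

Run in any year from 2000 to 2049, this returns 2099-12-31 for the YY column and 1999-12-31 for the RR column.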
Your third result is a little more speculative but suggests that the values in your GCR_RECEIVED_DT field have a time component, or are not in the century you think. This is similar to the LIKE version, except this time the fixed string is being converted to a date, rather than the date being converted to a string; effectively:
AND CR.GCR_RECEIVED_DT = TO_DATE('01-JAN-00', 'DD-MON-YY')
If they were at midnight on 2000-01-01 this would work. Because it doesn't, that suggests they are either some time after midnight, or maybe more likely - since you're using a 'magic' date in your existing records - they are another date entirely, quite possibly 1900-01-01.
Here are SQL Fiddles for just past midnight and 1900.
If the field will eventually have a time component for new records you might want to structure the condition like this, and use date literals to be a bit clearer (IMO):
AND CR.GCR_RECEIVED_DT >= DATE '2000-01-01'
AND CR.GCR_RECEIVED_DT < DATE '2000-01-02'
That will find any records at any time on 2000-01-01, and can use an index on that column if one is available. BETWEEN is inclusive, so using BETWEEN DATE '2000-01-01' AND DATE '2000-01-02' would include any records that are exactly at midnight on the later date, which you probably don't want.
Whatever you end up doing, avoid relying on implicit conversions using NLS_DATE_FORMAT as one day it might not be set to what you expect, causing potentially data-corrupting or hard to find bugs; and specify the full four-digit year in the model if you can to avoid ambiguity.
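If you want to see what your session would currently use for implicit conversions, a quick check along these lines may help. It reads the standard NLS_SESSION_PARAMETERS view; the second statement is just an example of spelling the format out in full instead of relying on it.

```sql
-- What the current session will use for implicit date/string conversions.
SELECT parameter, value
FROM   nls_session_parameters
WHERE  parameter IN ('NLS_DATE_FORMAT', 'NLS_DATE_LANGUAGE');

-- Explicit alternative: four-digit year, explicit model, pinned language.
SELECT TO_CHAR(TO_DATE('01-JAN-2000', 'DD-MON-YYYY', 'NLS_DATE_LANGUAGE=ENGLISH'),
               'YYYY-MM-DD') AS parsed
FROM   dual;
```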
Try something like this:
WHERE TRUNC(CR.GCR_RECEIVED_DT) = TO_DATE('01-JAN-00','DD-Mon-YY')
TRUNC without parameter removes hours, minutes and seconds from a DATE.
I'm writing some SQL queries in PL/SQL that require me to filter the records based on date. The field is a date/time type, but since I don't really care about the time I figured I'll just omit it from my where clause.
So I'm writing something like
WHERE
f.logdate between to_date('2011/01/01', 'yyyy/mm/dd') and
to_date('2011/01/31', 'yyyy/mm/dd')
To get all the records for january. I read that this is supposed to be equivalent to
WHERE
f.logdate >= to_date('2011/01/01', 'yyyy/mm/dd') and
f.logdate <= to_date('2011/01/31', 'yyyy/mm/dd')
But my final results are not what I expected: there are fewer records when I use the BETWEEN keyword than when I explicitly state the bounds. Is it because my assumption of what BETWEEN does is wrong?
EDIT: ah nvm, it appears that the date is not the issue. There was a subquery that I was using that was filtering its result set by date as well and was specifying date/time while I'm not.
Could you show the type of the "logdate" field (the SQL CREATE statement would help)?
In some databases the date type is actually a datetime field, so if you are looking for dates after "Jan 01 2011", you are really looking for dates after "Jan 01 2011 12:00:00 a.m." (midnight).
It may be your case.
If the bound's time defaults to 0:00 like that, rows later on the same day won't match, so the comparison won't work as you expect.
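For the January case in the question, a range that is safe whether or not logdate carries a time could be sketched like this. The inline rows are invented for illustration and stand in for the real table. ADD_MONTHS is used only to derive the first day of the next month.

```sql
-- Illustrative rows only: start of January, mid-afternoon on Jan 31,
-- and midnight on Feb 1.
WITH logs AS (
  SELECT TO_DATE('2011-01-01 00:00:00', 'YYYY-MM-DD HH24:MI:SS') AS logdate FROM dual
  UNION ALL
  SELECT TO_DATE('2011-01-31 14:15:00', 'YYYY-MM-DD HH24:MI:SS') FROM dual
  UNION ALL
  SELECT TO_DATE('2011-02-01 00:00:00', 'YYYY-MM-DD HH24:MI:SS') FROM dual
)
SELECT TO_CHAR(logdate, 'YYYY-MM-DD HH24:MI:SS') AS logged_at
FROM   logs
WHERE  logdate >= DATE '2011-01-01'
AND    logdate <  ADD_MONTHS(DATE '2011-01-01', 1)   -- i.e. before 2011-02-01
ORDER  BY 1;
```

This returns the first two rows. A BETWEEN ending at 2011/01/31, as in the question, would drop the 14:15 row on the last day of the month.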
The query retrieves the expected rows because the date values in the query and the datetime values stored in the RateChangeDate column have been specified without the time part of the date. When the time part is unspecified, it defaults to 12:00 A.M. Note that a row that contains a time part that is after 12:00 A.M. on 1998-01-05 would not be returned by this query because it falls outside the range.
http://msdn.microsoft.com/en-us/library/ms187922.aspx