The speed of 'extract(year from sysdate)' - sql

I have two queries:
with tmp as (
select asy.aim_student_id, ast.aim_test, asq.response
from aim_student_test ast
join aim_student_qst asq on (asq.aps_yr = ast.aps_yr and asq.aim_test = ast.aim_test and asq.aim_id = ast.aim_id)
join aim_student_yr asy on (asy.aps_yr = ast.aps_yr and asy.aim_student_yr_id = ast.aim_student_yr_id)
where asq.aps_yr = '2012'
and asq.qst_num = 1)
select aim_student_id, aim_test, response
from tmp
where response is null
-- execution-time: 0.032 seconds
define this_year = extract(year from sysdate)
with tmp as (
select asy.aim_student_id, ast.aim_test, asq.response
from aim_student_test ast
join aim_student_qst asq on (asq.aps_yr = ast.aps_yr and asq.aim_test = ast.aim_test and asq.aim_id = ast.aim_id)
join aim_student_yr asy on (asy.aps_yr = ast.aps_yr and asy.aim_student_yr_id = ast.aim_student_yr_id)
where asq.aps_yr = &this_year
and asq.qst_num = 1)
select aim_student_id, aim_test, response
from tmp
where response is null
-- execution-time: 82.202 seconds
The only difference is that in one I use '2012' and in the other I use extract(year from sysdate).
I can only imagine that Oracle is computing extract(year from sysdate) for EVERY record it checks, and that I just can't figure out how to make it compute this once and use it as a variable. Searching has not returned the answers I seek... so I come to the magicians of SO.com. HOW do I properly use
extract(year from sysdate)
as a variable?

Using &this_year in the query causes a substitution of the string extract(year from sysdate), so the second query actually has:
where asq.aps_yr = extract(year from sysdate)
which you can see from the second explain plan. That in itself probably isn't the problem; what may be slowing it down is that it changes the plan from an index range scan to an index skip scan against aim_student_qstp1. The real difference is that in the fast version you're comparing asq.aps_yr to a string ('2012'), while in the second it's a number (2012), and - as also shown in the explain plan - this forces a to_number(asq.aps_yr) conversion, which stops the index being used as you expect.
You could fix this in your code by making it:
where asq.aps_yr = to_char(&this_year)
If you want to calculate it once before the query runs and then use it as a variable, there are at least two ways (in SQL*Plus/SQL Developer). Sticking with substitution variables you can use the column commands instead of the define:
column tmp_this_year new_value this_year
set termout off
select extract(year from sysdate) as tmp_this_year from dual;
set termout on
set verify off
... which makes &this_year=2012 (the termout changes just make the actual retrieval invisible, and verify stops it telling you when it uses the substitution; both are there so you don't get extra output in your script), and change your query to have:
where asq.aps_yr = '&this_year'
... so the value is treated as a string, making a to_char() unnecessary.
Or you can use bind variables:
var this_year varchar2(4);
set feedback off;
exec :this_year := extract(year from sysdate);
... and then your query has:
where asq.aps_yr = :this_year
Note that in this case you don't need the quotes because the bind variable is defined as a string already - there's an implicit conversion in the exec that sets it.
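If you want to confirm what the bind variable holds before running the query, SQL*Plus can display it (a small optional check, not part of the original script):
print this_year
... which should show the current year as a four-character string.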

I doubt the difference is due to extracting the year from the date. I'm pretty sure Oracle would only be extracting the year once, since it is using a variable in the second case.
The difference is due to the execution path used by the query. You would need to post the execution plans to really see the difference. Using an explicit constant gives the optimizer more information for choosing an optimal query plan.
For instance, if the data is partitioned by year, then with a constant year, Oracle can determine which partition has the data. In the second case, Oracle might not recognize the value as a constant, and require reading all data partitions. This is just an example of what might happen -- I'm not sure what Oracle does in this case.
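If you want to check the execution path yourself, one way to get the plan (a sketch reusing the tables from the question; DBMS_XPLAN is standard Oracle) is:
explain plan for
with tmp as (
select asy.aim_student_id, ast.aim_test, asq.response
from aim_student_test ast
join aim_student_qst asq on (asq.aps_yr = ast.aps_yr and asq.aim_test = ast.aim_test and asq.aim_id = ast.aim_id)
join aim_student_yr asy on (asy.aps_yr = ast.aps_yr and asy.aim_student_yr_id = ast.aim_student_yr_id)
where asq.aps_yr = extract(year from sysdate)
and asq.qst_num = 1)
select aim_student_id, aim_test, response
from tmp
where response is null;
select * from table(dbms_xplan.display);
Running it once with extract(year from sysdate) and once with '2012' should show the to_number(asq.aps_yr) filter and the change of access path described in the other answer.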

Related

How to write an Open SQL statement with substring in the JOIN ON condition? [duplicate]

I have the following select statement in ABAP:
SELECT munic~mandt VREFER BIS AB ZZELECDATE ZZCERTDATE CONSYEAR ZDIMO ZZONE_M ZZONE_T USAGE_M USAGE_T M2MC M2MT M2RET EXEMPTMCMT EXEMPRET CHARGEMCMT
INTO corresponding fields of table GT_INSTMUNIC_F
FROM ZCI00_INSTMUNIC AS MUNIC
INNER JOIN EVER AS EV on
MUNIC~POD = EV~VREFER(9).
"where EV~BSTATUS = '14' or EV~BSTATUS = '32'.
My problem with the above statement is that it does not recognize the substring/offset operation in the ON clause. If I remove the (9) then
it recognizes the field; otherwise it gives the error:
Field ev~vrefer is unknown. It is neither in one of the specified tables
nor defined by a "DATA" statement. I have also tried doing something similar in the WHERE clause, receiving a similar error:
LOOP AT gt_instmunic.
clear wa_gt_instmunic_f.
wa_gt_instmunic_f-mandt = gt_instmunic-mandt.
wa_gt_instmunic_f-bis = gt_instmunic-bis.
wa_gt_instmunic_f-ab = gt_instmunic-ab.
wa_gt_instmunic_f-zzelecdate = gt_instmunic-zzelecdate.
wa_gt_instmunic_f-ZZCERTDATE = gt_instmunic-ZZCERTDATE.
wa_gt_instmunic_f-CONSYEAR = gt_instmunic-CONSYEAR.
wa_gt_instmunic_f-ZDIMO = gt_instmunic-ZDIMO.
wa_gt_instmunic_f-ZZONE_M = gt_instmunic-ZZONE_M.
wa_gt_instmunic_f-ZZONE_T = gt_instmunic-ZZONE_T.
wa_gt_instmunic_f-USAGE_M = gt_instmunic-USAGE_M.
wa_gt_instmunic_f-USAGE_T = gt_instmunic-USAGE_T.
temp_pod = gt_instmunic-pod.
SELECT vrefer
FROM ever
INTO wa_gt_instmunic_f-vrefer
WHERE ( vrefer(9) LIKE temp_pod ). " PROBLEM WITH SUBSTRING
"AND ( BSTATUS = '14' OR BSTATUS = '32' ).
ENDSELECT.
WRITE: / sy-dbcnt.
WRITE: / 'wa is: ', wa_gt_instmunic_f.
WRITE: / 'wa-ever is: ', wa_gt_instmunic_f-vrefer.
APPEND wa_gt_instmunic_f TO gt_instmunic_f.
WRITE: / wa_gt_instmunic_f-vrefer.
ENDLOOP.
itab_size = lines( gt_instmunic_f ).
WRITE: / 'Internal table populated with', itab_size, ' lines'.
The basic task I want to implement is to modify a specific field in one table,
pulling values from another. They have a common field ( pod = vrefer(9) ). Thanks in advance for your time.
If you are on a late enough NetWeaver version (this works on 7.51), you can use the Open SQL functions LEFT or SUBSTRING. Your query would look something like:
SELECT munic~mandt VREFER BIS AB ZZELECDATE ZZCERTDATE CONSYEAR ZDIMO ZZONE_M ZZONE_T USAGE_M USAGE_T M2MC M2MT M2RET EXEMPTMCMT EXEMPRET CHARGEMCMT
FROM ZCI00_INSTMUNIC AS MUNIC
INNER JOIN ever AS ev
ON MUNIC~POD EQ LEFT( EV~VREFER, 9 )
INTO corresponding fields of table GT_INSTMUNIC_F.
Note that the INTO clause needs to move to the end of the command as well.
field(9) is an offset/length (substring) operation that is processed in the ABAP environment and cannot be translated into a database-level SQL statement (at least not at the moment, but I'd be surprised if it ever will be). Your best bet is either to select the datasets separately and merge them manually (if both are approximately equally large) or to pre-select one and use a FOR ALL ENTRIES / IN clause.
They have a common field ( pod = vrefer(9) )
This is a wrong assumption: they are not both fields; pod is a field, while vrefer(9) is a substring expression on a field.
If you really need to do that task through SQL, I suggest you look at native SQL functions like SUBSTRING and check whether you can use them within EXEC SQL or (better) the CL_SQL* classes.

Snowflake ignores statement in where clause where I'm comparing timestamps

So I'm building an SCD type 2 in Snowflake, but it ignores the part of the where clause that compares "to_timestamp" and "expiry_date". Expiry_date is a variable that is set to '9999-08-17 07:31:29.901000000' (as infinity) and to_timestamp is a column in the table. I want to query only the rows that have to_timestamp set to infinity (they are still active), but Snowflake seems to ignore this part of the where clause. Below is some of the code (it should update the rows that are expired - that means change their "to_timestamp" to the current time - and it does, but it does it to rows with timestamps of all kinds - it ignores the last line).
SET EXPIRY_DATE_NTZ = '9999-08-17 07:31:29.901000000';
SET CURRENT_DATE_NTZ = TO_TIMESTAMP_NTZ(CURRENT_TIMESTAMP());
UPDATE CUSTOMER_TARGET CT
SET CT.TO_TIMESTAMP = $CURRENT_DATE_NTZ
FROM POC.SNOWFLAKE_POC.CUSTOMER_STAGE CS
WHERE CT.C_CUSTOMER_ID = CS.C_CUSTOMER_ID
AND (CT.C_FIRST_NAME <> CS.C_FIRST_NAME OR CT.C_LAST_NAME <> CS.C_LAST_NAME OR CT.C_BIRTH_YEAR
<> CS.C_BIRTH_YEAR OR CT.C_BIRTH_COUNTRY <> CS.C_BIRTH_COUNTRY OR CT.C_LAST_REVIEW_DATE<>CS.C_LAST_REVIEW_DATE)
AND CT.TO_TIMESTAMP = $EXPIRY_DATE_NTZ;
I have two of these update statements (one for updates and one for deletes) and a merge statement for inserts. It ignores the comparison in every single one, updating rows that have "to_timestamp" set to something like "2021-08-24 07:11:53.510000000". I've tried every combination possible (between ... and ..., >= ... <=, <=, >=, comparing in a "case" statement of the update, ...) - nothing. What could be the cause/solution?
As we do not know the structure of CUSTOMER_TARGET, I would suggest explicitly setting the data type of the EXPIRY_DATE_NTZ variable to match the column's data type, i.e. changing:
SET EXPIRY_DATE_NTZ = '9999-08-17 07:31:29.901000000';
SELECT $EXPIRY_DATE_NTZ;
DESCRIBE RESULT LAST_QUERY_ID();
to:
-- TIMESTAMP_NTZ as an example
SET EXPIRY_DATE_NTZ = '9999-08-17 07:31:29.901000000'::TIMESTAMP_NTZ;
SELECT $EXPIRY_DATE_NTZ;
DESCRIBE RESULT LAST_QUERY_ID();
That way, there are no "implicit conversions" involved in the process.
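If you want to see what the column itself is declared as, so the variable's type can be matched exactly, Snowflake can describe the table (an optional check, assuming the table name from the question):
DESCRIBE TABLE CUSTOMER_TARGET;
-- or just the one column
SHOW COLUMNS LIKE 'TO_TIMESTAMP' IN TABLE CUSTOMER_TARGET;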
Another suggestion is to use IS DISTINCT FROM instead of <>. IS DISTINCT FROM is NULL-safe, which is important if the columns are defined as nullable.
UPDATE CUSTOMER_TARGET CT
SET CT.TO_TIMESTAMP = $CURRENT_DATE_NTZ
FROM POC.SNOWFLAKE_POC.CUSTOMER_STAGE CS
WHERE CT.C_CUSTOMER_ID = CS.C_CUSTOMER_ID
AND (CT.C_FIRST_NAME IS DISTINCT FROM CS.C_FIRST_NAME
OR CT.C_LAST_NAME IS DISTINCT FROM CS.C_LAST_NAME
OR CT.C_BIRTH_YEAR IS DISTINCT FROM CS.C_BIRTH_YEAR
OR CT.C_BIRTH_COUNTRY IS DISTINCT FROM CS.C_BIRTH_COUNTRY
OR CT.C_LAST_REVIEW_DATE IS DISTINCT FROM CS.C_LAST_REVIEW_DATE)
AND CT.TO_TIMESTAMP = $EXPIRY_DATE_NTZ;
Your SQL does not have any issues with the filters (the ORs are surrounded by brackets, etc.). I assume that you have checked the execution profile and did not see your filter (CT.TO_TIMESTAMP = '9999-08-17 07:31:29.901000000'). In that case, all rows in the target table should have this value in the column TO_TIMESTAMP.
I highly recommend you check the data first. If you are running multiple UPDATE/MERGE commands, you may be missing that the data has already been updated with this value.
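One way to do that check (a hypothetical sanity query, reusing the table and variable names from the question) is to count how many rows still carry the 'infinity' timestamp before running the updates:
SELECT COUNT(*)
FROM CUSTOMER_TARGET
WHERE TO_TIMESTAMP = $EXPIRY_DATE_NTZ;
If this returns 0, an earlier statement has already rewritten TO_TIMESTAMP, which would explain why the filter appears to be ignored.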

Performance tuning a DECODE() statement in a WHERE clause

I re-wrote a query to reduce the time it takes to pick the records. But I still see that some tuning is needed on the DECODE line, as its cost is high. Can someone please let me know if the same query can be re-written without the DECODE functionality? The purpose of removing the DECODE function would be to use the index on the v_id column.
What I have tried so far:
Tried creating a function-based index (knowing that bind variables cannot be used) and it failed.
Tried using an OR condition, but it would not pick the index. So any suggestion would be of great help.
The query is given below:
SELECT SUM(NVL(dd.amt,0))
FROM db,dd
WHERE db.id = dd.dsba_id
AND dd.nd_id = xxxxxxx
AND dd.a_id = 'xxxxx-xx'
AND DECODE (db.v_id , xxxxxxxxx, 'COMPLETE' , db.code ) = 'COMPLETE'
AND db.datet BETWEEN TRUNC ( SYSDATE , 'YEAR' ) AND SYSDATE;
I would suggest writing the code as:
SELECT SUM(dd.amt)
FROM db JOIN
dd
ON db.id = dd.dsba_id
WHERE dd.nd_id = xxxxxxx AND
dd.a_id = 'xxxxx-xx' AND
(db.v_id = xxxxxxxxx OR db.code = 'COMPLETE') AND
db.datet >= trunc(sysdate, 'YEAR');
For this query, I would recommend indexes on:
dd(nd_id, a_id, dsba_id)
db(id, datet, v_id, code)
The changes to the above query:
Never use commas in the FROM clause. Always use proper, explicit, standard, readable JOIN syntax. (This does not affect performance, however.)
decode() is rather hard to follow. A simple boolean or is equivalent.
BETWEEN is unnecessary assuming that datet is not in the future.
SUM(NVL()) is not needed, because NULL values are ignored. If you are concerned about a NULL result, I would suggest COALESCE(SUM(dd.amt), 0), as sketched below.
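If the NULL-safe total matters, the rewritten query would look something like this (same tables and placeholder values as above):
SELECT COALESCE(SUM(dd.amt), 0)
FROM db JOIN
     dd
     ON db.id = dd.dsba_id
WHERE dd.nd_id = xxxxxxx AND
      dd.a_id = 'xxxxx-xx' AND
      (db.v_id = xxxxxxxxx OR db.code = 'COMPLETE') AND
      db.datet >= trunc(sysdate, 'YEAR');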

MS Access Query date range when last date not found

I have the following update query in MS Access 2013
UPDATE WXObs SET WXObs.SnowFlag = 1
WHERE (((WXObs.StationID) ="451409") And(
(WxObs.ObsDate) Between #1/3/2003# AND #3/29/2003# OR
(WxObs.ObsDate) Between #11/16/2003# AND #5/7/2004# OR
(WxObs.ObsDate) Between #10/30/2004# AND #4/30/2005#));
This works until the end date in the range is not found. For instance, if 5/7/2004 is not in the data set, then the update continues to the next end date, in this case 4/30/2005.
I would prefer it ended on the last date found in the range. For instance, if the data ended on 4/21/2004, that would be the last row updated for the 11/16/2003 to 5/7/2004 range. The query would then continue to update again beginning on 10/30/2004.
I have tried < and <=
Thanks
You're missing some parentheses that are affecting the evaluation order, causing the behavior that you're reporting.
What you want is for each of the BETWEEN portions to be evaluated completely before the ORs are evaluated, and you can guarantee that order by surrounding each BETWEEN expression with parentheses.
This should correct it (untested, as you've not provided the test data necessary to create a test case).
UPDATE WXObs SET WXObs.SnowFlag = 1
WHERE
(WXObs.StationID ="451409")
And
(
(WxObs.ObsDate Between #1/3/2003# AND #3/29/2003#) OR
(WxObs.ObsDate Between #11/16/2003# AND #5/7/2004#) OR
(WxObs.ObsDate Between #10/30/2004# AND #4/30/2005#)
);

LINQ to SQL selecting records and converting dates

I'm trying to select records from a table based on a date using Linq to SQL. Unfortunately the date is split across two tables - the Hours table has the day and the related JobTime table has the month and year in two columns.
I have the following query:
Dim qry = From h As Hour In ctx.Hours Where Convert.ToDateTime(h.day & "/" & h.JobTime.month & "/" & h.JobTime.year & " 00:00:00") > Convert.ToDateTime("01/01/2012 00:00:00")
This gives me the error "Arithmetic overflow error converting expression to data type datetime."
Looking at the SQL query in SQL server profiler, I see:
exec sp_executesql N'SELECT [t0].[JobTimeID], [t0].[day], [t0].[hours]
FROM [dbo].[tbl_pm_hours] AS [t0]
INNER JOIN [dbo].[tbl_pm_jobtimes] AS [t1] ON [t1].[JobTimeID] = [t0].[JobTimeID]
WHERE (CONVERT(DateTime,(((((CONVERT(NVarChar,[t0].[day])) + #p0) + (CONVERT(NVarChar,COALESCE([t1].[month],NULL)))) + #p1) + (CONVERT(NVarChar,COALESCE([t1].[year],NULL)))) + #p2)) > #p3',N'#p0 nvarchar(4000),#p1 nvarchar(4000),#p2 nvarchar(4000),#p3 datetime',#p0=N'/',#p1=N'/',#p2=N' 00:00:00',#p3='2012-01-31 00:00:00'
I can see that it's not passing in the date to search for correctly but I'm not sure how to correct it.
Can anyone please help?
Thanks,
Emma
The direct cause of the error may have to do with this issue.
As said there, the conversions you use are a very inefficient way to build a query. On top of that, it is inefficient because the expressions are not sargable, i.e. you are using a value computed from database columns in a comparison, which prevents the query analyzer from using indexes to jump to individual column values. So you could try to fix the error by doctoring the direct cause, but I think it's better to rewrite the query so that only individual column values are used in comparisons.
I've worked this out in C#:
var cfg = new DateTime(12,6,12);
int year = 12, month = 6, day = 13; // Try some more values here.
// Date from components > datetime value?
bool gt = (
year > cfg.Year || (
(year == cfg.Year && month > cfg.Month) || (
year == cfg.Year && month == cfg.Month && day > cfg.Day)
)
);
You see that it's not as straightforward as it may look at first, but it works. There are many more comparisons to work out, but I'm sure that the ability to use indexes will easily outweigh this.
A more straightforward, but not sargable, way is to use sortable dates, like 20120101, and compare those (as integers).
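As a rough illustration of that last idea in the generated SQL's terms (column names taken from the profiler output above; this is a sketch, not the exact translation LINQ to SQL would produce), the filter would become something like:
WHERE [t1].[year] * 10000 + [t1].[month] * 100 + [t0].[day] >= 20120101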