SQL: Difference between "BETWEEN" and "current_date - number"

I am wondering which of the following is the better way to implement this, and why.
select * from table1 where request_time between '01/18/2012' and '02/17/2012'
and
select * from table1 where request_time > current_date - 30

I ran the two queries against some of the date tables in my database and, using EXPLAIN ANALYZE, I got these results:
explain analyze
select * from capone.dim_date where date between '01/18/2012' and '02/17/2012'
Total runtime: 22.716 ms
explain analyze
select * from capone.dim_date where date > current_date - 30
Total runtime: 65.044 ms
So it looks like the first option performs better. Of course this is biased towards my DBMS, but these are still the results I got.
The table has dates ranging from 1900 to 2099 so it is rather large, and not just some dinky little table.
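For a rolling 30-day window, the two styles can also be combined: keep the closed BETWEEN range but let the database compute the bounds. A minimal sketch, assuming PostgreSQL (which EXPLAIN ANALYZE and current_date suggest) and that request_time is a date or timestamp column:
-- rolling 30-day window written as a closed range
select *
from table1
where request_time between current_date - 30 and current_date;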

BETWEEN is inclusive of both endpoints, i.e. when you issue a query like id between 2 and 10, the rows with the values 2 and 10 will also be fetched. If you want to exclude those values, use > and <.
Also, when indexes are applied, say on a date column, > and < make better use of the index than BETWEEN.
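For example, a quick sketch of the difference (table1 and id are just illustrative names):
-- inclusive: rows with id = 2 and id = 10 are also returned
select * from table1 where id between 2 and 10;
-- exclusive: the endpoint values 2 and 10 are left out
select * from table1 where id > 2 and id < 10;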

Related

Optimization on large tables

I have the following query that joins two large tables. I am trying to join on patient_id, keeping only records that are not older than 30 days.
select * from
chairs c
join data id
on c.patient_id = id.patient_id
and to_date(c.from_date, 'YYYYMMDD') - to_date(id.from_date, 'YYYYMMDD') >= 0
and to_date(c.from_date, 'YYYYMMDD') - to_date(id.from_date, 'YYYYMMDD') < 30
Currently, this query takes 2 hours to run. What indexes can I create on these tables for this query to run faster?
I will take a shot in the dark, because as others said, it depends on the table structure, the indexes, and what the planner does.
The most obvious thing here is that, as long as it is possible, you want to represent dates as an actual date datatype instead of strings. That is the first and most important change you should make here. No index can save you if you have to transform strings first, because very likely the problem is not the patient_id, it's your date calculation.
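If the schema can be changed, a minimal sketch of that conversion, assuming PostgreSQL and that every from_date value is a valid YYYYMMDD string (back up or validate the data first):
-- convert the string columns to real date columns
ALTER TABLE chairs ALTER COLUMN from_date TYPE date USING to_date(from_date, 'YYYYMMDD');
ALTER TABLE data ALTER COLUMN from_date TYPE date USING to_date(from_date, 'YYYYMMDD');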
Other than that, forcing hash joins on the patient_id and then doing the filtering could help if for some reason the planner decided to do nested loops for that condition. But that is for after you fixed your date representation AND you still have a problem AND you see that the planner does nested loops on that attribute.
Some observations if you are stuck with string fields for the dates:
YYYYMMDD date strings are ordered and can be used with <, > and =.
Building the comparison strings from the data in chairs and using them in the JOIN against data will make good use of an index on data (patient_id, from_date).
So my suggestion would be to write expressions that build the date strings you want to use in the JOIN. Or to put it another way: do not transform the child table's data from a string into something else.
Example expression that takes 30 days off a string date and returns a string date:
select to_char(to_date('20200112', 'YYYYMMDD') - INTERVAL '30 DAYS','YYYYMMDD')
Untested:
select * from
chairs c
join data id
on c.patient_id = id.patient_id
and id.from_date between to_char(to_date(c.from_date, 'YYYYMMDD') - INTERVAL '30 DAYS','YYYYMMDD')
and c.from_date
For this query:
select *
from chairs c join data id
     on c.patient_id = id.patient_id and
        to_date(c.from_date, 'YYYYMMDD') - to_date(id.from_date, 'YYYYMMDD') >= 0 and
        to_date(c.from_date, 'YYYYMMDD') - to_date(id.from_date, 'YYYYMMDD') < 30;
You should start with indexes on (patient_id, from_date) -- you can put them in both tables.
The date comparisons are problematic. Storing the values as actual dates can help. But it is not a 100% solution because comparison operations are still needed.
Depending on what you are actually trying to accomplish there might be other ways of writing the query. I might encourage you to ask a new question, providing sample data, desired results, and a clear explanation of what you really want. For instance, this query is likely to return a lot of rows. And that just takes time as well.
Your query has a non-SARGable predicate because it uses functions that are evaluated row by row. You need to get rid of such functions and replace them with direct access to the columns. As an example:
SELECT *
FROM chairs AS c
JOIN data AS id
ON c.patient_id = id.patient_id
AND c.from_date BETWEEN id.from_date AND id.from_date + INTERVAL '30 days'
will run faster with these two indexes:
CREATE INDEX X_SQLpro_001 ON chairs (patient_id, from_date);
CREATE INDEX X_SQLpro_002 ON data (patient_id, from_date);
Also try to avoid
SELECT *
and list only the necessary columns.

redshift sql current_date get_date() performance issue

I would like to know whether using the current_date or getdate() function in Redshift SQL would lower the query performance compared to using '2016-05-05' directly, for example.
Example 1:
select
*
from
table a
where time >= current_date - 1
and time < current_date
Example 2:
select
*
from table a
where time >='2016-05-08'
and time < '2016-05-09'
Would example 1 or example 2 have better performance? Or would both have the same?
Hope someone could shed some light on it
I just ran up against this, so I shall share my anecdotal experience:
select * from table where ts > current_date limit 20 was running for 10+mins before I got impatient and killed it.
select * from table where ts > '2017-06-08' limit 20 completed in 16s.
The table has a sort key on ts and was freshly analyzed.

SQLite query to get the closest datetime

I am trying to write an SQLite statement to get the closest datetime to a user input (from a WPF datepicker). I have a table IRquote(rateId, quoteDateAndTime, quoteValue).
For example, if the user enters 10/01/2000 and the database only has fixings stored for 08/01/2000, 07/01/2000 and 14/01/2000, it would return 08/01/2000, that being the closest date to 10/01/2000.
Of course, I'd like it to work not only with dates but also with time.
I tried with this query, but it returns the row with the furthest date, and not the closest one:
SELECT quoteValue FROM IRquote
WHERE rateId = '" + pRefIndexTicker + "'
ORDER BY abs(datetime(quoteDateAndTime) - datetime('" + DateTimeSQLite(pFixingDate) + "')) ASC
LIMIT 1;
Note that I have a function DateTimeSQLite to transform user input to the right format.
I don't get why this does not work.
How could I do it? Thanks for your help
To get the closest date, you will need to use the strftime('%s', datetime) SQLite function.
With this example/demo, you will get the closest date to your given date.
Note that the date 2015-06-25 10:00:00 is the input datetime that the user selected.
select t.ID, t.Price, t.PriceDate,
abs(strftime('%s','2015-06-25 10:00:00') - strftime('%s', t.PriceDate)) as 'ClosestDate'
from Test t
order by abs(strftime('%s','2015-06-25 10:00:00') - strftime('%s', PriceDate))
limit 1;
SQL explanation:
We use strftime('%s') - strftime('%s') to calculate the difference, in seconds, between the two dates (note: it has to be '%s', not '%S'). Since this can be either positive or negative, we also need to use the abs function to make it positive, to ensure that our order by and subsequent limit 1 clauses work correctly.
If the table is big, and there is an index on the datetime column, this will use the index to get the 2 closest rows (above and below the supplied value) and will be more efficient:
select *
from
( select *
from
( select t.ID, t.Price, t.PriceDate
from Test t
where t.PriceDate <= datetime('2015-06-23 10:00:00')
order by t.PriceDate desc
limit 1
) d
union all
select * from
( select t.ID, t.Price, t.PriceDate
from Test t
where t.PriceDate > datetime('2015-06-23 10:00:00')
order by t.PriceDate asc
limit 1
) a
) x
order by abs(julianday('2015-06-23 10:00:00') - julianday(PriceDate))
limit 1 ;
Tested in SQLfiddle.
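The index that query relies on would be something like the following (a hedged sketch; the Test table and PriceDate column come from the demo above):
-- index on the datetime column, so each of the two one-row subqueries can seek into it
create index idx_test_pricedate on Test (PriceDate);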
Another useful solution is using the BETWEEN operator, if you can determine upper and lower bounds for your time/date query. I encountered this solution just recently, here in this link. This is what I've used for my application on a time column named t (changing the code for a date column and date function is not difficult):
select *
from myTable
where t BETWEEN '09:35:00' and '09:45:00'
order by ABS(strftime('%s',t) - strftime('%s','09:40:00')) asc
limit 1
Also, I must correct my comment on the above post. I tried a simple speed comparison of these 3 approaches proposed by @BerndLinde, @ypercubeᵀᴹ and me. I have around 500 tables with 150 rows each, and medium hardware in my PC. The result is:
Solution 1 (using strftime) takes around 12 seconds.
Adding an index on column t to Solution 1 improves speed by around 30% and takes around 8 seconds. I didn't see any improvement from an index on time(t).
Solution 2 also has around a 30% speed improvement over Solution 1 and takes around 8 seconds.
Finally, Solution 3 has around a 50% improvement and takes around 5.5 seconds. Adding an index on column t gives a little more improvement and takes around 4.8 seconds. An index on time(t) has no effect in this solution.
Note: I'm a simple programmer and this is a simple test in .NET code. A real performance test must consider more professional aspects, which I'm not aware of. There were also some computations in my code after querying and reading from the database. Also, as @ypercubeᵀᴹ states, this result may not hold for large amounts of data.

SQL to filter for records more than 30 days old

Suppose I have the following query:
select customer_name, origination_date
where origination_date < '01-DEC-2013';
I would like to select all customers that have an origination date older than 30 days. Is there a way in SQL (Oracle, if specifics are needed) to specify this more dynamically than manually entering the date, so that I don't need to update the query every time I run it?
Thanks!
Sure, try something like this:
select customer_name, origination_date where
origination_date < DATEADD(day, -30, GETUTCDATE());
This basically says: where the origination_date is more than 30 days before now. This works in Microsoft SQL Server; not sure, but there is probably a similar way to do it in Oracle.
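For what it's worth, a hedged sketch of what that might look like in Oracle (the customers table name is made up, since the question did not show one, and origination_date is assumed to be a DATE column):
-- rows whose origination date is more than 30 days in the past
select customer_name, origination_date
from customers
where origination_date < TRUNC(SYSDATE) - 30;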
In Oracle, when you subtract dates, you get the difference in days by default, e.g.
select * from my_table where (date_1 - date_2) > 30
should return the records whose date difference is greater than 30 days.
To make your query dynamic, you parameterize it, so instead of using hard-coded date values, you use:
select * from my_table where (:date_1 - :date_2) > :threshold
If you are using Oracle SQL Developer to run such a query, it will pop up a window for you to specify the values for your parameters; the ones preceded by a colon.

How to get data which expires within 45 days?

Hi all,
I have one SQL table, and the fields for that table are:
id
name
expireydate
Now I want only those records which expire within 45 days or 30 days.
How can I do this with an SQL query?
I don't have much experience with SQL.
Thanks in advance,
If you are using MySQL, then try DATEDIFF.
for 45 days
select * from `table` where DATEDIFF(now(),expireydate)<=45;
for 30 days
select * from `table` where DATEDIFF(now(),expireydate)<=30;
In Oracle, the minus operator (-) will do the trick instead of DATEDIFF, and SYSDATE instead of NOW(). [not sure]
In SQL Server, DATEDIFF is quite different: you have to provide the unit in which the difference between the two dates should be measured.
DATEDIFF(datepart, startdate, enddate)
To get the current date, try one of these: CURRENT_TIMESTAMP, GETDATE() or {fn NOW()}.
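Putting that together, a hedged sketch of the SQL Server equivalent of the 45-day MySQL query above (mytable is a placeholder, since the question did not give a table name):
-- DATEDIFF(day, expireydate, GETDATE()) counts days from expireydate up to now
SELECT * FROM mytable WHERE DATEDIFF(day, expireydate, GETDATE()) <= 45;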
You can use a simple SELECT * FROM yourtable WHERE expireydate < "some formula calculating today + 30 or 45 days".
A simple comparison will work there; the tricky part is writing the last bit concerning the date you want to compare to. It'll depend on your environment and how you stored expireydate in the database.
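For instance, if it is MySQL and expireydate is a DATE column, the formula part could be written with DATE_ADD (a hedged sketch, reusing the yourtable placeholder from the sentence above):
-- everything that expires on or before today + 45 days (use 30 for the other case)
SELECT * FROM yourtable WHERE expireydate < DATE_ADD(CURDATE(), INTERVAL 45 DAY);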
Try the below:
SELECT * FROM MYTABLE WHERE (expireydate in days) < ((CURRENTDATE in days) + 45)
Do not execute this directly! Depending on your database, the way of obtaining a date in days will be different. Check your database manual, or please specify which database you are using.