Measuring the time of each operation in a SQL query

I have a SQL query in the form of a tree, (A⨝B)⨝(C⨝D)⨝(E⨝F)⨝(G⨝H)⨝(I⨝J), containing different joins. Is there any way to find the time taken by each join operation separately, for example how much time the sub-expression (A⨝B) or (C⨝D) takes, instead of the time for the whole expression? Or how can I find the time for only the sub-expression (A⨝B)⨝(C⨝D)? I have converted my SQL query into a tree using Java. I am using SQL Server to run my queries. Thanks in advance.

I'm not sure if this is what you need but you could try with DATEDIFF if you can split each operation.
DECLARE @timer1 DATETIME
DECLARE @timer2 DATETIME
SET @timer1 = GETDATE()
--stuff to measure here
SET @timer2 = GETDATE()
SELECT DATEDIFF(millisecond, @timer1, @timer2) AS time_spent
You can then time the different queries and see which one performs best.
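The same "split each operation and time it" idea can be sketched outside the database as well. The example below is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for SQL Server, and the tables A, B, C, D with an `id` join key are hypothetical. Each sub-expression is run as its own statement and timed on the client side.

```python
import sqlite3
import time

# In-memory SQLite database used as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical tables A, B, C, D, each with a join key `id`.
for name in ("A", "B", "C", "D"):
    cur.execute(f"CREATE TABLE {name} (id INTEGER PRIMARY KEY, val TEXT)")
    cur.executemany(
        f"INSERT INTO {name} (id, val) VALUES (?, ?)",
        [(i, f"{name}{i}") for i in range(1000)],
    )


def time_query(sql):
    """Run one query to completion and return (row_count, elapsed_ms)."""
    start = time.perf_counter()
    rows = cur.execute(sql).fetchall()
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return len(rows), elapsed_ms


# Each sub-expression of the join tree is timed as a separate statement.
timings = {
    "A join B": time_query("SELECT * FROM A JOIN B ON A.id = B.id"),
    "C join D": time_query("SELECT * FROM C JOIN D ON C.id = D.id"),
    "(A join B) join (C join D)": time_query(
        "SELECT * FROM (SELECT A.id AS id FROM A JOIN B ON A.id = B.id) ab "
        "JOIN (SELECT C.id AS id FROM C JOIN D ON C.id = D.id) cd "
        "ON ab.id = cd.id"
    ),
}

for label, (row_count, elapsed_ms) in timings.items():
    print(f"{label}: {row_count} rows in {elapsed_ms:.2f} ms")
```

Keep in mind that a sub-expression timed in isolation may be executed differently from the same join inside the full query, because the optimizer is free to reorder joins, so these numbers are a rough guide rather than an exact breakdown.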

Related

SQL Query variable vs hardcoded value

I need an explanation of something, as I couldn't find it on my own (probably because I don't know how to search for it).
I have a SQL Server query with some Common Table Expressions in it, one of those CTEs is selecting data based on date and user being not null e.g.
WHERE
"dummy"."UsageEnd" >= '20161001'
AND "dummy"."UsageEnd" < '20161101'
AND "Users"."Login" IS NOT NULL
In that form this query executes in ~2 seconds, but I need to change the dates to parameters, as this query will be executed very frequently. But if I change it to:
WHERE
"dummy"."UsageEnd" >= @start
AND "dummy"."UsageEnd" < @end
AND "Users"."Login" IS NOT NULL
Where @start and @end are declared as either datetime or varchar:
declare @datestart datetime
set @datestart = '20161001';
declare @dateend datetime
set @dateend = '20161101';
This subquery executes in 23-24 seconds, and the whole query (as a reminder, this subquery is in a CTE) takes 7-8 minutes, when previously it took 12-15 seconds.
Can someone explain why comparing dates to variables increased the execution time so dramatically? Also, is it possible that the whole query takes so long because, when there is a variable in a CTE, it is re-evaluated every time instead of just once?
The problem may well be caused by a combination of everyone's comments:
If your UsageEnd column is not a datetime datatype, but a varchar, the optimizer will first need to convert all values to a datetime type in order to make the comparison with the variables.
The first query with the "hardcoded constants" is already in varchar, so the optimizer is able to perform the comparison much faster.
Both plans will look different and give a clear indication of where to find the problem.
When an operator combines two expressions of different data types, the rules for data type precedence specify that the data type with the lower precedence is converted to the data type with the higher precedence. If the conversion is not a supported implicit conversion, an error is returned. MSDN

How to translate this SQL query to tableau?

I have a SQL query which shows time activity of each account. Database is Microsoft SQL Server on Windows Server 2008.
Please help me translate this query to Tableau, using the parameters Parameters.Date1 and Parameters.Date2 instead of @time.
The result of the query:
USER,Date,Total time
USER1,2016-09-22,07:00:00.0000000
USER2,2016-09-22,08:00:00.0000000
USER3,2016-09-22,05:00:00.0000000
SQL query:
DECLARE @time datetime
set @time = '08.09.2016'
SELECT
[User],
CAST(DATEADD(SECOND, sum(datediff(DAY, @time, [Start])), @time) AS date) 'Date',
CAST(DATEADD(SECOND, sum(datediff(SECOND, '00:00:00',[Period])), '00:00:00') AS time) 'Total time'
FROM
[User].[dbo].[UserAction]
WHERE
[Start] >= @time+'00:00:00' and [Start] <= @time+'23:59:59'
GROUP BY
[USER]
input data to build the query:
USER, Start,End,Period
USER1,2016-09-22 09:00:00.000,2016-09-22 12:00:00.000,03:00:00
USER1,2016-09-22 12:00:00.000,2016-09-22 13:00:00.000,01:00:00
USER1,2016-09-22 13:00:00.000,2016-09-22 16:00:00.000,03:00:00
USER2,2016-09-22 09:00:00.000,2016-09-22 13:00:00.000,04:00:00
USER2,2016-09-22 13:00:00.000,2016-09-22 17:00:00.000,04:00:00
USER3,2016-09-22 09:00:00.000,2016-09-22 10:00:00.000,01:00:00
USER3,2016-09-22 10:00:00.000,2016-09-22 12:00:00.000,02:00:00
USER3,2016-09-22 12:00:00.000,2016-09-22 14:00:00.000,02:00:00
I don't have enough imaginary stack overflow points yet to make a comment instead of an answer, but I would agree with Gordon Linoff.
A table valued function in sql can be used directly in a Tableau data source, and it's treated just like a table.
Note I did not test the below, but here is what the equivalent function might look like:
CREATE FUNCTION dbo.MyFuntion (@time datetime)
RETURNS TABLE
AS
RETURN
(
SELECT
[User]
,cast(DATEADD(SECOND, sum(datediff(DAY, @time,[Start])),@time) as date)'Date'
,cast(DATEADD(SECOND, sum(datediff(SECOND, '00:00:00',[Period])),'00:00:00') as time)'Total time'
FROM
[User].[dbo].[UserAction]
WHERE
[Start] >= @time+'00:00:00' and [Start] <= @time+'23:59:59'
GROUP BY [USER]
);
Tableau 9 (haven't tried 10) seems to discourage custom SQL (it warns anyone that opens your workbook) and stored procedures (slow vs. same sql in a function).
Alternatively, adding the pure dbo.UserAction table to a data source and making calculated fields for the second two columns might work: Tableau Documentation. It seems to have all the functions needed to manipulate dates. However, there may be some crazy limitation associated with parameters that might limit it, honestly can't remember off the top of my head.
You don't need custom SQL for this. Keep it simple. Connect Tableau directly to your UserAction table.
You can either:
Put Day(Start) on the filter shelf. Make sure it is a continuous date truncated to the day. Show the filter and set it to let you pick a single value at a time; I would choose a slider UI.
Or write a calculated field to put on the filters shelf that references a parameter such as day(Start) = day(Date1)
Put User on one shelf, such as rows, and Sum(Period) on another such as columns. That should do it unless Tableau has trouble interpreting your Period field datatype. If so, try changing the datatype to Number inside Tableau to see if it converts durations to numbers automatically, if not you may need to write a calculated field for the conversion.
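As a sanity check on what Sum(Period) per user should produce, the aggregation can be worked through by hand. The sketch below is plain Python rather than Tableau or SQL; the (User, Period) pairs are copied from the question's input data, and the helper names are made up for the illustration.

```python
from collections import defaultdict
from datetime import timedelta

# (User, Period) pairs copied from the question's sample input.
rows = [
    ("USER1", "03:00:00"), ("USER1", "01:00:00"), ("USER1", "03:00:00"),
    ("USER2", "04:00:00"), ("USER2", "04:00:00"),
    ("USER3", "01:00:00"), ("USER3", "02:00:00"), ("USER3", "02:00:00"),
]


def parse_period(text):
    """Turn an 'HH:MM:SS' string into a timedelta."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def format_period(delta):
    """Format a timedelta back into 'HH:MM:SS'."""
    total_seconds = int(delta.total_seconds())
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


# Group by user and sum the durations, like Sum(Period) with User on rows.
totals = defaultdict(timedelta)
for user, period in rows:
    totals[user] += parse_period(period)

result = {user: format_period(total) for user, total in sorted(totals.items())}
print(result)
# {'USER1': '07:00:00', 'USER2': '08:00:00', 'USER3': '05:00:00'}
```

These totals match the "Total time" column in the question's expected result, so a Tableau view that sums Period per user should show the same figures once the Period datatype is interpreted as a duration.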

Strange behaviour of Sql query with between operator

There is this strange error in sql query.
The query is something like this.
select * from student where dob between '20150820' and '20150828'
But in the database the dob column is varchar(14) and holds values in yyyyMMddhhmmss format, say 20150827142545. If I run the above query it should not retrieve any rows, as I have used the yyyyMMdd format in the query. But it retrieves the row with yesterday's date (i.e. 20150827112535), and it does not get the records with today's date (i.e. 20150828144532).
Why is this happening?
Thanks in advance for the help.
You can try like this:
select * from student
where convert(date, LEFT(dob, 8)) between
convert(date, '20150820') and convert(date, '20150828')
Also as others have commented you need to store your date as Date instead of varchar to avoid such problems in future.
As already mentioned you would need to use the correct date type to have between behave properly.
select *
from student
where convert(date,LEFT(dob,8)) between '20150820' and '20150828'
Sidenote: You don't have to explicitly convert your two dates from text as this will be done implicitly as long as you use an unambiguous date representation, i.e. the ISO standard 'YYYYMMDD' or 'YYYY-MM-DD'. Of course if you're holding the values in variables then use date | datetime datatype
declare @startdate date
declare @enddate date
select *
from student
where convert(date,LEFT(dob,8)) between @startdate and @enddate
Sidenote 2: Applying functions to the dob column prevents any indexes on that column from being used to their full potential in your execution plan and may result in slower execution. If you can, define the correct data type for the dob column, or use a persisted computed column or a materialised view if performance is a real issue.
Sidenote 3: If you need to maintain the time portion in your data i.e. date and time of birth, use the following to ensure all records are captured;
select *
from student
where
convert(date,LEFT(dob,8)) >= '20150820'
and convert(date,LEFT(dob,8)) < dateadd(d,1,'20150828')
All you have to do is to convert first the string to date.
select *
from student
where dob between convert(date, '20150820') and convert(date, '20150828')
Why is this happening?
The comparison is executed from left to right and the order of characters is determined by the codepage in use.
Sort Order
Sort order specifies the way that data values are sorted, affecting
the results of data comparison. The sorting of data is accomplished
through collations, and it can be optimized using indexes.
https://msdn.microsoft.com/en-us/library/ms143726.aspx
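The left-to-right comparison can be reproduced with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption; SQL Server collations can differ, although the digits compare the same way), and the two dob values are the ones from the question.

```python
import sqlite3

# SQLite stands in for SQL Server here (assumption); dob is stored as text.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE student (name TEXT, dob TEXT)")
cur.executemany(
    "INSERT INTO student VALUES (?, ?)",
    [("yesterday", "20150827112535"), ("today", "20150828144532")],
)

# Plain string BETWEEN: values are compared character by character.
as_text = [row[0] for row in cur.execute(
    "SELECT name FROM student "
    "WHERE dob BETWEEN '20150820' AND '20150828' ORDER BY dob"
)]

# Comparing only the first 8 characters (the yyyyMMdd part) instead.
date_part = [row[0] for row in cur.execute(
    "SELECT name FROM student "
    "WHERE substr(dob, 1, 8) BETWEEN '20150820' AND '20150828' ORDER BY dob"
)]

print(as_text)    # ['yesterday']
print(date_part)  # ['yesterday', 'today']
```

'20150827112535' differs from the upper bound '20150828' at the eighth character ('7' sorts before '8'), so it falls inside the range. '20150828144532' starts with the upper bound but is longer, so it sorts after it and is excluded.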
There are problems with BETWEEN in T-SQL.
But if you want a quick answer, convert to date first and use >= and <=, or even DATEDIFF, to compare. You could also write a between function yourself if you want the convenience of BETWEEN without having to care about start and end times.
What do BETWEEN and the devil have in common?

Select data in date format

I have a query in which I want to select data from a column where the data is a date. The problem is that the data is a mix of text and dates.
This bit of SQL only returns the longest text field:
SELECT MAX(field_value)
Where the date does occur, it is always in the format xx/xx/xxxx
I'm trying to select the most recent date.
I'm using MS SQL.
Can anyone help?
Try this using ISDATE and CONVERT:
SELECT MAX(CONVERT(DateTime, MaybeDate))
FROM (
SELECT MaybeDate
FROM MyTable
WHERE ISDATE(MaybeDate) = 1) T
You could also use MAX(CAST(MaybeDate AS DateTime)). I got in the (maybe bad?) habit of using CONVERT years ago and have stuck with it.
To do this without a conversion error:
select max(case when isdate(col) = 1 then cast(col as date) end) -- or use convert()
from . . .
The SQL statement does not specify the order of operations. So, even including a where clause in a subquery will not guarantee that only dates get converted. In fact, the SQL Server optimizer is "smart" enough to do the conversion when the data is brought in and then do the filtering afterwards.
The only operation that guarantees sequencing of operations is the case statement, and there are even exceptions to that.
Another solution would be using PATINDEX in WHERE clause.
SELECT PATINDEX('[0-9][0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9]', field_value)
The problem with this approach is that you cannot be sure the value is really a date (e.g. 99/99/9999 is not a date).
And the problem with ISDATE is that it depends on configuration (e.g. DATEFORMAT).
So, use whichever option is appropriate.
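The gap between "looks like a date" and "is a date" can be shown with a short sketch. It is written in Python rather than T-SQL, the sample values are hypothetical, and it assumes day-first order (dd/mm/yyyy), which the question does not actually specify.

```python
import re
from datetime import datetime

# Hypothetical mixed column values; day-first order is an assumption.
values = ["some text", "25/11/2009", "99/99/9999", "03/01/2010", "n/a"]

# Same shape as the PATINDEX pattern: two digits, two digits, four digits.
pattern = re.compile(r"[0-9]{2}/[0-9]{2}/[0-9]{4}")


def parse_date(text):
    """Return a datetime for a real dd/mm/yyyy date, otherwise None."""
    if not pattern.fullmatch(text):
        return None
    try:
        return datetime.strptime(text, "%d/%m/%Y")
    except ValueError:
        # Right shape, but not a valid calendar date (e.g. 99/99/9999).
        return None


looks_like_date = [v for v in values if pattern.fullmatch(v)]
real_dates = [d for d in (parse_date(v) for v in values) if d is not None]
most_recent = max(real_dates)

print(looks_like_date)  # ['25/11/2009', '99/99/9999', '03/01/2010']
print(most_recent.date())  # 2010-01-03
```

The pattern alone lets 99/99/9999 through, while the parsing step rejects it. The parsing step, in turn, depends on the chosen day/month order, which mirrors the DATEFORMAT dependency mentioned above.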

split string in sql query

I have a value in field called "postingdate" as string in 2009-11-25, 12:42AM IST format, in a table named "Post".
I need the query to fetch the details based on date range. I tried the following query, but it throws an error. Please guide me to fix this issue. Thanks in advance.
select postingdate
from post
where TO_DATE(postingDate,'YYYY-MM-DD')>61689
and TO_DATE(postingDate,'YYYY-MM-DD')<61691
As you've now seen, trying to perform any sort of query against a string column which represents a date is a problem. You've got a few options:
Convert the postingdate column to some sort of DATE or TIMESTAMP datatype. I think this is your best choice as it will make querying the table using this field faster, more flexible, and less error prone.
Leave postingdate as a string and use functions to convert it back to a date when doing comparisons. This will be a performance problem as most queries will turn into full table scans unless your database supports function-based indexes.
Leave postingdate as a string and compare it against other strings. Not a good choice as it's tough to come up with a way to do ranged queries this way, as I think you've found.
If it was me I'd convert the data. Good luck.
In SQL Server you can say
Select postingdate from post
where postingdate between '6/16/1969' and '6/16/1991'
If it's really a string, you're lucky that it's in YYYY-MM-DD format. You can sort and compare that format as a string, because the most significant numbers are on the left side. For example:
select *
from Posts
where StringDateCol between '2010-01-01' and '2010-01-02'
There's no need to convert the string to a date. Comparing in this way is not affected by the ", 12:42AM IST" appendage. Unless, of course, your table contains dates from a different time zone :)
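One boundary case is worth checking when comparing the full string: a value on the upper-bound day still carries the time appendage, so it is longer than the bound and sorts after it. The sketch below uses Python's built-in sqlite3 module as a stand-in for the real database (an assumption), with made-up rows in the question's format.

```python
import sqlite3

# SQLite stands in for the real database (assumption); rows are made up.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE post (postingdate TEXT)")
cur.executemany(
    "INSERT INTO post VALUES (?)",
    [
        ("2009-12-31, 11:59PM IST",),
        ("2010-01-01, 12:42AM IST",),
        ("2010-01-02, 09:15AM IST",),
    ],
)

# Comparing the whole string, appendage included.
full_string = [row[0] for row in cur.execute(
    "SELECT postingdate FROM post "
    "WHERE postingdate BETWEEN '2010-01-01' AND '2010-01-02' "
    "ORDER BY postingdate"
)]

# Comparing only the first 10 characters (the YYYY-MM-DD part).
first_ten = [row[0] for row in cur.execute(
    "SELECT postingdate FROM post "
    "WHERE substr(postingdate, 1, 10) BETWEEN '2010-01-01' AND '2010-01-02' "
    "ORDER BY postingdate"
)]

print(full_string)  # ['2010-01-01, 12:42AM IST']
print(first_ten)    # ['2010-01-01, 12:42AM IST', '2010-01-02, 09:15AM IST']
```

If rows from the last day of the range should be included, either compare only the date prefix or use an exclusive upper bound on the following day (here, < '2010-01-03').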
You will need to convert your string into a date before you run date range queries on it. You may get away with just using the string if you're not interested in the time portion.
The actual functions will depend on your RDBMS
for strings only
select * from posts
where LEFT(postingDate,10) > '2010-01-21'
or
for datetime (Sybase example)
select * from posts
where convert(DateTime,postingDate) between '2010-01-21' and '2010-01-31'