How to include date range in SQL query for Data Studio BigQuery community connector - sql

I'm trying to make a community connector using the advanced services from Google's Data Studio to connect to my BigQuery data table. The connector is all set up and my getData function returns a query which looks like:
var sqlString = "SELECT * FROM `PROJECT.DATASET.TABLE` WHERE " +
"DATE(timestamp) >= @startDate AND DATE(timestamp) <= @endDate;"
where PROJECT, DATASET, and TABLE are filled in with their respective IDs. The 'timestamp' field is a BigQuery field in my data table of type TIMESTAMP.
In my getConfig function, I'm setting the configuration to add a dateRange object to the request passed into getData:
function getConfig() {
...
config.setDateRangeRequired(true);
...
}
I'm then returning the community connector object (defined as 'cc' variable in code below) in my getData function, setting the sql string, query parameters for startDate and endDate, and some other necessary info:
function getData(request) {
...
return cc
.newBigQueryConfig()
.setAccessToken(accessToken) // defined earlier
.setBillingProjectId(billingProjectId) // defined earlier
.setUseStandardSql(true)
.setQuery(sqlString)
.addQueryParameter('startDate', bqTypes.STRING,
request.dateRange.startDate)
.addQueryParameter('endDate', bqTypes.STRING,
request.dateRange.endDate)
}
When I run this connector in a report, it connects to BigQuery and even queries the table, but it does not return any data. When I replace @startDate and @endDate with string literals in 'yyyy-mm-dd' format, it works as expected, so my only problem seems to be that I can't figure out how to set the date range parameters in the query (which I assume is what enables the date range control in Data Studio reports). How do I configure this date range object so that people can use date range controls in Data Studio reports?
Edit: For clarification, I know how to add the date range control on a report. The problem is that the query does not return any data even when the date range query parameters are passed in.

I ended up fixing my SQL query. I changed my WHERE condition to
WHERE DATE(requestTimestamp) BETWEEN @startDate AND @endDate
and it returned data correctly. I hadn't mentioned another parameter in my query because I thought it was irrelevant, but I had quotes around another parameterized condition, which may have been breaking the query. The condition before was more like:
WHERE id = '@id' AND DATE(requestTimestamp) BETWEEN @startDate AND @endDate
I think putting quotes around @id was the problem, because changing the query to:
WHERE id = @id AND DATE(requestTimestamp) BETWEEN @startDate AND @endDate
worked perfectly.
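The quoting pitfall generalizes to any parameterized SQL client: a placeholder wrapped in quotes becomes a string literal and is never bound, so the condition silently matches nothing. A minimal sketch using Python's sqlite3 (which uses :name placeholders where BigQuery uses @name; the table and values are made up for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id TEXT, day TEXT)")
conn.execute("INSERT INTO events VALUES ('abc', '2020-01-15')")

params = {"id": "abc", "start": "2020-01-01", "end": "2020-01-31"}

# Correct: bare placeholder -- the driver binds the value.
ok = conn.execute(
    "SELECT * FROM events WHERE id = :id AND day BETWEEN :start AND :end",
    params).fetchall()

# Broken: quoting the placeholder turns it into the literal string ':id',
# so the comparison never matches and no rows come back.
broken = conn.execute(
    "SELECT * FROM events WHERE id = ':id' AND day BETWEEN :start AND :end",
    params).fetchall()

print(len(ok), len(broken))  # 1 0
```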

You can use a Date range control and configure the timestamp field for it. It should automatically pick up the timestamp-type field.
Go to Insert and select Date range control to add it to your report.
You can select the date range in view mode.

Related

SSRS Report Builder not detecting all parameters

I have a report with StartDate and EndDate date parameters, but when I pass it through Report Builder, it doesn't seem to map the EndDate parameter. I know this is because of the IF block at the beginning; however, I've used this in a previous report without issue. As you can see from this image, although this is not the exact query, it has the same issue.
Is there a way around this, or is there a way to update the parameter within an expression to get around the IF block?
The problem is that you have SET the end date.
If you want the report to use both parameters, you need to unset it.
But if all you are looking for is to check whether they are the same and adjust the end date parameter, then you need to do that in the WHERE clause:
SELECT *
FROM your_table
WHERE your_column BETWEEN @start_date AND CASE WHEN @end_date = @start_date THEN DATEADD(day, 1, @start_date) ELSE @end_date END
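The CASE logic above (widening a same-day range so BETWEEN still matches rows on that day) can be sketched outside SQL; a hypothetical Python equivalent of the same rule:

```python
from datetime import date, timedelta

def effective_range(start_date: date, end_date: date):
    """Mirror of the T-SQL CASE: when start and end are equal,
    push the end out one day so BETWEEN still matches rows."""
    if end_date == start_date:
        end_date = start_date + timedelta(days=1)
    return start_date, end_date

s, e = effective_range(date(2023, 5, 1), date(2023, 5, 1))
print(e - s)  # 1 day, 0:00:00
```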

Date calculation with parameter in SSIS is not giving the correct result

I want to load data from the last n days from a data source. To do this, I have a project parameter, "number_of_days", which I use in an OLE DB data source with a SQL command containing the clause
WHERE StartDate >= CAST(GETDATE() - ? AS date)
This parameter is mapped to a project parameter of type Int32. But if I want to load the last 10 days, it only gives me the last 8 days.
Version info:
SQL Server Data Tools 15.1.61710.120
Server is SQL Server 2017 standard edition.
I set up a test package with as little data as possible (screenshots of the data source, the parameter, and the parameter mapping omitted).
The T-SQL expression (wrong result):
CAST(GETDATE() -? as date)
The SSIS expression for date_calc (correct result):
(DT_DBTIMESTAMP) (DT_DBDATE) DATEADD("DD", - @[$Project::number_of_days], GETDATE())
I would think that the T-SQL expression and the SSIS expression give the same result (today minus 10 days), but that is not the case when I run the package and store the results in a table: the date_diff column shows 8 days instead of 10.
If I replace the parameter by the actual value, I do get the correct result.
A data viewer also shows the incorrect date. When I deploy the package, I get the same result as from the debugger.
Is this a bug, or am I missing something here?
I think the main problem is how the OLE DB source detects the parameter data type. I didn't find official documentation that mentions this, but you can run a small experiment to see it:
Try writing the following query in the SQL command of the OLE DB source:
SELECT ? AS Column1
Then try to parse the query; you will get the following error:
The parameter type for '@P1' cannot be uniquely deduced; two possibilities are 'sql_variant' and 'xml'.
This means that the query parser tries to figure out the data type of the parameter on its own; it is not related to the data type of the variable you have mapped to it.
Then try to write the following query:
SELECT CAST(? AS INT) AS Column1
And then try to parse the query, you will get:
The SQL Statement was successfully parsed.
Now, let's apply this experiment to your query:
Try SELECT CAST(GETDATE() - ? AS DATE) AS Column1 and you will get a wrong value; then try SELECT CAST(GETDATE() - CAST(? AS INT) AS DATE) AS Column1 and you will get the correct value.
Update 1 - Info from official documentation
From the following OLEDB Source - Documentation:
The parameters are mapped to variables that provide the parameter values at run time. The variables are typically user-defined variables, although you can also use the system variables that Integration Services provides. If you use user-defined variables, make sure that you set the data type to a type that is compatible with the data type of the column that the mapped parameter references.
Which implies that the parameter datatype is not related to the variable data type.
Update 2 - Experiments using SQL Profiler
As an experiment, I created an SSIS package that exports data from an OLE DB source to a Recordset destination. The data source is the result of the following query:
SELECT *
FROM dbo.DatabaseLog
WHERE PostTime < CAST(GETDATE() - ? as date)
And the parameter ? is mapped to a variable of type Int32 with the value 10.
Before executing the package, I started a SQL Profiler trace on the SQL Server instance. After executing the package, the following queries were recorded in the trace:
exec [sys].sp_describe_undeclared_parameters N'SELECT *
FROM dbo.DatabaseLog
WHERE PostTime < CAST(GETDATE() -@P1 as date)'
declare @p1 int
set @p1=1
exec sp_prepare @p1 output,N'@P1 datetime',N'SELECT *
FROM dbo.DatabaseLog
WHERE PostTime < CAST(GETDATE() -@P1 as date)',1
select @p1
exec sp_execute 1,'1900-01-09 00:00:00'
exec sp_unprepare 1
The first command, exec [sys].sp_describe_undeclared_parameters, describes the parameter type; running it separately shows that the parameter data type is deduced as datetime.
The other commands show a weird sequence of statements:
First, the value of @p1 is set to 1
The final query is executed with the value 1900-01-09 00:00:00
Discussion
In the SQL Server database engine, the base datetime value is 1900-01-01 00:00:00, which can be retrieved by executing the following query:
declare @dt datetime
set @dt = 0
Select @dt
On the other hand, in SSIS:
A date structure that consists of year, month, day, hour, minute, seconds, and fractional seconds. The fractional seconds have a fixed scale of 7 digits.
The DT_DATE data type is implemented using an 8-byte floating-point number. Days are represented by whole number increments, starting with 30 December 1899, and midnight as time zero. Hour values are expressed as the absolute value of the fractional part of the number. However, a floating point value cannot represent all real values; therefore, there are limits on the range of dates that can be presented in DT_DATE.
On the other hand, DT_DBTIMESTAMP is represented by a structure that internally has individual fields for year, month, day, hours, minutes, seconds, and milliseconds. This data type has larger limits on ranges of the dates it can present.
Based on that, I think there is a difference between the base datetime value of the SSIS date data type (1899-12-30) and the SQL Server datetime (1900-01-01), which leads to a difference of two days when an implicit conversion is performed to evaluate the parameter value.
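The two-day gap between the two epochs can be checked directly. A short Python sketch reproducing the arithmetic (the GETDATE() value is a made-up example date; the epochs are the documented ones):

```python
from datetime import datetime, timedelta

SSIS_EPOCH = datetime(1899, 12, 30)  # DT_DATE day zero per the SSIS docs quoted above
SQL_EPOCH = datetime(1900, 1, 1)     # T-SQL: SELECT CAST(0 AS datetime)

# The two epochs differ by exactly two days.
print((SQL_EPOCH - SSIS_EPOCH).days)  # 2

# SSIS renders the Int32 value 10 as a date using its own epoch,
# matching the '1900-01-09' value seen in the Profiler trace:
param_as_sent = SSIS_EPOCH + timedelta(days=10)
print(param_as_sent.date())  # 1900-01-09

# SQL Server then interprets that datetime relative to ITS epoch,
# so GETDATE() - @P1 subtracts only 8 days instead of 10:
days_subtracted = (param_as_sent - SQL_EPOCH).days
print(days_subtracted)  # 8
```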
References
Integration Services Data Types
Parsing Data
Data type conversion (Database Engine)

Defining an EXTRACT range from a SELECT statement

I intend to process a dataset from EventHub stored in ADLA, in batches. It seems logical to me to process intervals, where my date is between my last execution datetime and the current execution datetime.
I thought about saving the execution timestamps in a table so I can keep track of it, and do the following:
DECLARE @my_file string = @"/data/raw/my-ns/my-eh/{date:yyyy}/{date:MM}/{date:dd}/{date:HH}/{date:mm}/{date:ss}/{*}.avro";
DECLARE @max_datetime DateTime = DateTime.Now;
@min_datetime =
SELECT (DateTime) MAX(execution_datetime) AS min_datetime
FROM my_adldb.dbo.watermark;
@my_json_bytes =
EXTRACT Body byte[],
date DateTime
FROM @my_file
USING new Microsoft.Analytics.Samples.Formats.ApacheAvro.AvroExtractor(@"{""type"":""record"",""name"":""EventData"",""namespace"":""Microsoft.ServiceBus.Messaging"",""fields"":[{""name"":""SequenceNumber"",""type"":""long""},{""name"":""Offset"",""type"":""string""},{""name"":""EnqueuedTimeUtc"",""type"":""string""},{""name"":""SystemProperties"",""type"":{""type"":""map"",""values"":[""long"",""double"",""string"",""bytes""]}},{""name"":""Properties"",""type"":{""type"":""map"",""values"":[""long"",""double"",""string"",""bytes"",""null""]}},{""name"":""Body"",""type"":[""null"",""bytes""]}]}");
How do I properly add this interval to my EXTRACT query? I tested it using a plain WHERE clause with an interval defined by hand and it worked, but when I attempt to use @min_datetime it doesn't work, since its result is a rowset.
I thought about applying the filter in a subsequent query, but I am afraid this means @my_json_bytes will extract my whole dataset and filter it afterwards, resulting in a suboptimal query.
Thanks in advance.
You should be able to apply the filter as part of a later SELECT. U-SQL can push predicates up into the EXTRACT under certain conditions, but I haven't been able to test this yet. Try something like this:
@min_datetime =
SELECT (DateTime) MAX(execution_datetime) AS min_datetime
FROM my_adldb.dbo.watermark;
@my_json_bytes =
EXTRACT Body byte[],
date DateTime
FROM @my_file
USING new Microsoft.Analytics.Samples.Formats.ApacheAvro.AvroExtractor(@"{""type"":""record"",""name"":""EventData"",""namespace"":""Microsoft.ServiceBus.Messaging"",""fields"":[{""name"":""SequenceNumber"",""type"":""long""},{""name"":""Offset"",""type"":""string""},{""name"":""EnqueuedTimeUtc"",""type"":""string""},{""name"":""SystemProperties"",""type"":{""type"":""map"",""values"":[""long"",""double"",""string"",""bytes""]}},{""name"":""Properties"",""type"":{""type"":""map"",""values"":[""long"",""double"",""string"",""bytes"",""null""]}},{""name"":""Body"",""type"":[""null"",""bytes""]}]}");
@working =
SELECT *
FROM @my_json_bytes AS j
CROSS JOIN
@min_datetime AS t
WHERE j.date > t.min_datetime;
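The watermark pattern behind this query (take the max saved execution time, keep only rows newer than it) can be sketched in plain Python; all names and values below are hypothetical stand-ins for the watermark table and the extracted rowset:

```python
from datetime import datetime

# Hypothetical stand-ins for the watermark table and the extracted rows.
watermark = [datetime(2021, 3, 1), datetime(2021, 3, 5)]
rows = [
    {"date": datetime(2021, 3, 4), "body": b"already processed"},
    {"date": datetime(2021, 3, 6), "body": b"new since last run"},
]

# Equivalent of: SELECT MAX(execution_datetime) FROM watermark
min_datetime = max(watermark)

# Equivalent of the CROSS JOIN plus WHERE j.date > t.min_datetime
working = [r for r in rows if r["date"] > min_datetime]
print(len(working))  # 1
```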

Invalid Date/Time TimeStamp For Query On Dataset using Query Builder

I'm using Query Builder in Visual Studio and coding in VB.NET
When creating a query in Query Builder my date gets reformatted to a date that my driver cannot process (Relativity ODBC Driver).
SELECT *
FROM Consignments
WHERE Consignments.DATE_ACTUAL >= '2017-03-27'
AND Consignments.DATE_ACTUAL <= '2017-03-27'
If I then preview the query, it is reformatted and run as:
SELECT *
FROM Consignments
WHERE Consignments.DATE_ACTUAL >= '27/03/2017'
AND Consignments.DATE_ACTUAL <= '27/03/2017'
Then I get an invalid date/time reply from the driver.
If I manually enter the date in the CommandText in designer view, it works, so I know the query is correct.
But if I try to pass the dates through as query parameters, e.g. @FROMDATE and @TODATE, I get the same error. So how do I change the default date format for the DataSet?
I checked parameter properties and have tried selecting date.
Thanks
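One reason the ISO literal works where the rewritten one fails: 'yyyy-MM-dd' strings compare and sort the same way the underlying dates do, while 'dd/MM/yyyy' strings do not, so only the ISO form survives being handled as text along the way. A quick Python check (purely illustrative; it does not touch the Relativity driver):

```python
# ISO dates ('yyyy-MM-dd') sort and compare correctly as plain text.
iso = ["2017-03-26", "2017-03-27", "2017-03-28"]

assert sorted(iso) == iso               # chronological order preserved
assert "2017-03-27" >= "2017-03-26"     # text comparison == date comparison

# The locale form breaks as soon as the comparison crosses a month:
assert "01/04/2017" < "28/03/2017"      # 1 April sorts before 28 March as text
print("ok")
```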

How to translate this SQL query to tableau?

I have a SQL query which shows time activity of each account. Database is Microsoft SQL Server on Windows Server 2008.
Please help me translate this query to Tableau, using the parameters Parameters.Date1 and Parameters.Date2 instead of @time.
The result of the query:
USER,Date,Total time
USER1,2016-09-22,07:00:00.0000000
USER2,2016-09-22,08:00:00.0000000
USER3,2016-09-22,05:00:00.0000000
SQL query:
DECLARE @time datetime
set @time = '08.09.2016'
SELECT
[User],
CAST(DATEADD(SECOND, sum(datediff(DAY, @time, [Start])), @time) AS date) 'Date',
CAST(DATEADD(SECOND, sum(datediff(SECOND, '00:00:00',[Period])), '00:00:00') AS time) 'Total time'
FROM
[User].[dbo].[UserAction]
WHERE
[Start] >= @time+'00:00:00' and [Start] <= @time+'23:59:59'
GROUP BY
[USER]
input data to build the query:
USER, Start,End,Period
USER1,2016-09-22 09:00:00.000,2016-09-22 12:00:00.000,03:00:00
USER1,2016-09-22 12:00:00.000,2016-09-22 13:00:00.000,01:00:00
USER1,2016-09-22 13:00:00.000,2016-09-22 16:00:00.000,03:00:00
USER2,2016-09-22 09:00:00.000,2016-09-22 13:00:00.000,04:00:00
USER2,2016-09-22 13:00:00.000,2016-09-22 17:00:00.000,04:00:00
USER3,2016-09-22 09:00:00.000,2016-09-22 10:00:00.000,01:00:00
USER3,2016-09-22 10:00:00.000,2016-09-22 12:00:00.000,02:00:00
USER3,2016-09-22 12:00:00.000,2016-09-22 14:00:00.000,02:00:00
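The expected totals can be reproduced from the input rows with a short Python sketch of the same grouping (using only the Period column, as the query effectively does):

```python
from datetime import timedelta

# (user, period) pairs taken from the input data above; periods are HH:MM:SS.
rows = [
    ("USER1", "03:00:00"), ("USER1", "01:00:00"), ("USER1", "03:00:00"),
    ("USER2", "04:00:00"), ("USER2", "04:00:00"),
    ("USER3", "01:00:00"), ("USER3", "02:00:00"), ("USER3", "02:00:00"),
]

def to_delta(hms: str) -> timedelta:
    h, m, s = map(int, hms.split(":"))
    return timedelta(hours=h, minutes=m, seconds=s)

# Equivalent of GROUP BY [USER] with SUM over Period.
totals = {}
for user, period in rows:
    totals[user] = totals.get(user, timedelta()) + to_delta(period)

for user, total in sorted(totals.items()):
    print(user, total)
# USER1 7:00:00
# USER2 8:00:00
# USER3 5:00:00
```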
I don't have enough imaginary stack overflow points yet to make a comment instead of an answer, but I would agree with Gordon Linoff.
A table valued function in sql can be used directly in a Tableau data source, and it's treated just like a table.
Note I did not test the below, but here is what the equivalent function might look like:
CREATE FUNCTION dbo.MyFunction (@time datetime)
RETURNS TABLE
AS
RETURN
(
SELECT
[User]
,cast(DATEADD(SECOND, sum(datediff(DAY, @time, [Start])), @time) as date) 'Date'
,cast(DATEADD(SECOND, sum(datediff(SECOND, '00:00:00', [Period])), '00:00:00') as time) 'Total time'
FROM
[User].[dbo].[UserAction]
WHERE
[Start] >= @time+'00:00:00' and [Start] <= @time+'23:59:59'
GROUP BY [USER]
);
Tableau 9 (I haven't tried 10) seems to discourage custom SQL (it warns anyone who opens your workbook) and stored procedures (slow compared with the same SQL in a function).
Alternatively, adding the pure dbo.UserAction table to a data source and making calculated fields for the second two columns might work: Tableau Documentation. It seems to have all the functions needed to manipulate dates. However, there may be some crazy limitation associated with parameters that might limit it, honestly can't remember off the top of my head.
You don't need custom SQL for this. Keep it simple. Connect Tableau directly to your UserAction table.
You can either:
Put Day(Start) on the filter shelf, make sure it is a continuous date truncated to the day, then show the filter and set it to let you pick a single value at a time - I would choose a slider UI.
Or write a calculated field to put on the filter shelf that references a parameter, such as day(Start) = day(Date1)
Put User on one shelf, such as rows, and Sum(Period) on another such as columns. That should do it unless Tableau has trouble interpreting your Period field datatype. If so, try changing the datatype to Number inside Tableau to see if it converts durations to numbers automatically, if not you may need to write a calculated field for the conversion.