Azure Stream Analytics query - how to set event timestamp based on a JavaScript UDF? - azure-stream-analytics

I am timestamping data stream input events by a property "TS" in the message. However, before I timestamp by TS, I want to ensure that TS is ISO8601-compliant. If TS is not ISO8601-compliant, I want to use EventEnqueuedUtcTime, which is the arrival time of the message, as the timestamp.
My query looks something like this
SELECT
T.*
FROM
input TIMESTAMP BY PARTITION BY PartitionId TIMESTAMP BY udf.getEventTimestamp(T)
Here udf.getEventTimestamp(T) returns the TS property of the message (T) if it is ISO8601-compliant; otherwise it returns EventEnqueuedUtcTime (the arrival time of the message in IoT Hub).
Running this script locally gives me the exception -
Error : Unexpected hosted function call
I also tried to use CASE construct to accomplish this
SELECT
T.*
FROM
input TIMESTAMP BY PARTITION BY PartitionId TIMESTAMP BY
CASE
WHEN udf.isValid(T.TS) THEN T.TS
ELSE T.EventEnqueuedUtcTime
END
where udf.isValid(T.TS) returns true if the property TS is a valid ISO8601 compliant timestamp.
Again running this locally returns - Error : Unexpected hosted function call
The Microsoft Azure docs say: "After you add a JavaScript user-defined function to a job, you can use the function anywhere in the query, like a built-in scalar function."
Does this mean that we cannot use UDFs in TIMESTAMP BY and CASE constructs?
Can you suggest any workaround?

At this time, a UDF cannot be used within the TIMESTAMP BY clause.
However, you can use TRY_CAST to meet your requirement.
Here's the query with the workaround:
SELECT
T.*
FROM
input PARTITION BY PartitionId TIMESTAMP BY
CASE
WHEN TRY_CAST(T.TS AS DateTime) is not null THEN T.TS
ELSE T.EventEnqueuedUtcTime
END
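The fallback rule that the CASE expression encodes can be sketched outside Stream Analytics. This is a minimal Python illustration, not the job's actual behavior: the function name pick_event_time is hypothetical, and Python's ISO 8601 parsing is only assumed to approximate what TRY_CAST(... AS DateTime) accepts.

```python
from datetime import datetime, timezone


def pick_event_time(ts, enqueued_utc):
    """Return ts as a datetime when it parses as ISO 8601, else enqueued_utc."""
    try:
        # Older Python versions reject a trailing "Z", so map it to "+00:00".
        return datetime.fromisoformat(ts.replace("Z", "+00:00"))
    except (AttributeError, TypeError, ValueError):
        # Missing, non-string, or unparseable TS: fall back to the arrival time.
        return enqueued_utc


arrival = datetime(2020, 1, 2, 3, 4, 5, tzinfo=timezone.utc)
print(pick_event_time("2020-01-01T00:00:00Z", arrival))  # the parsed TS
print(pick_event_time("not-a-timestamp", arrival))       # the arrival time
```

As in the query, an invalid TS never raises an error; it simply yields the arrival time.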
Let me know if you have any further questions.
Thanks,
JS

Related

BigQuery @run_date used as different types

I have a scheduled query using the @run_date parameter in BigQuery.
SELECT
@run_date AS run_date,
timestamp,
event
FROM
`ops-data.usage.full_user_dataset`
WHERE
DATE(timestamp) < @run_date
timestamp is of type TIMESTAMP.
I am unable to schedule it: the schedule option is greyed out in the new UI and unavailable in the classic UI (it says it requires valid SQL). If I try to run the query, I receive the error message Undeclared parameter 'run_date' is used assuming different types (DATE vs INT64) at [2:3]
After trying various things, I was able to schedule the query below. The idea was to force BigQuery to treat @run_date as a date without changing it.
SELECT
DATE_SUB(@run_date, INTERVAL 0 DAY) AS run_date,
timestamp,
event
FROM
`ops-data.usage.full_user_dataset`
WHERE
DATE(timestamp) < @run_date
Why does this error occur and why does the fix work?
I think it is a bug around @run_date. The workaround below should work for you until it is fixed.
DECLARE run_date DATE DEFAULT @run_date;
SELECT
run_date,
timestamp,
event
FROM
`ops-data.usage.full_user_dataset`
WHERE
DATE(timestamp) < run_date
BTW, the workaround uses Scripting, which does not let you set a destination table. If you do need a destination table, it has to be written as:
CREATE OR REPLACE TABLE <yourDestinationTable>
AS SELECT ... -- your query

No matching signature for function TIMESTAMP_DIFF.... error on BigQuery

I'm trying to get the timestamp difference between two date fields in my table. I know both dates are in TIMESTAMP format, but when I tried using the TIMESTAMP_DIFF function I got an error saying "No matching signature for function TIMESTAMP_DIFF for argument types: STRING, STRING, DATE_TIME_PART. Supported signature: TIMESTAMP_DIFF(TIMESTAMP, TIMESTAMP, DATE_TIME_PART) at [27:8]"
I also tried formatting them again in the query (as done in the example for FIRST_VALUE(): https://cloud.google.com/bigquery/docs/reference/standard-sql/functions-and-operators#datetime_diff ), and then it showed the same error but for FORMAT_TIMESTAMP.
Any ideas what I could do to fix this or to get the time difference between two fields for each row?
The error shows that both columns are actually STRING, so parse them first. Use the query below (BigQuery Standard SQL):
#standardSQL
SELECT TIMESTAMP_DIFF(PARSE_TIMESTAMP('%Y-%m-%d %H:%M:%S', prev_time), PARSE_TIMESTAMP('%Y-%m-%d %H:%M:%S', event_datetime), MINUTE)
FROM `project.dataset.messages`
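The parse-then-subtract step can be checked outside BigQuery. This is a minimal Python sketch under assumptions: the helper name minutes_between and the sample values are hypothetical, and the format string is the one from the query above. As with TIMESTAMP_DIFF, the result is the first argument minus the second, so passing the earlier time first gives a negative number.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"  # same pattern as the PARSE_TIMESTAMP calls


def minutes_between(later, earlier):
    """Parse both strings, then return whole minutes of (later - earlier)."""
    delta = datetime.strptime(later, FMT) - datetime.strptime(earlier, FMT)
    return int(delta.total_seconds() // 60)


print(minutes_between("2020-01-01 10:45:30", "2020-01-01 10:00:00"))
```

Subtracting the raw strings directly would fail in Python too, which mirrors the "No matching signature" error: the values must be converted to a time type before taking a difference.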

Bigquery - select timestamp as human readable datetime

How to select timestamp(stored as seconds) as human readable datetime in Google Bigquery?
schema
id(STRING) | signup_date(TIMESTAMP)
I wrote a query using the DATE function, but I am getting an error:
SELECT DATE(create_date) FROM [accounts]
Error: Invalid function name: DATE; did you mean CASE?
Thanks!
I think I found a working solution from Bigquery reference page. Basically BigQuery stores TIMESTAMP data internally as a UNIX timestamp with microsecond precision.
SELECT SEC_TO_TIMESTAMP(date) FROM ...
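For comparison, the same seconds-to-readable conversion can be sketched in Python. The helper name sec_to_utc_string and the output format are assumptions for illustration; only the idea (seconds since the Unix epoch rendered as a UTC date and time) comes from the answer.

```python
from datetime import datetime, timezone


def sec_to_utc_string(seconds):
    """Render seconds since the Unix epoch as a readable UTC date and time."""
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S UTC")


print(sec_to_utc_string(0))  # 1970-01-01 00:00:00 UTC
```

If the stored value were in milliseconds or microseconds instead, it would need to be divided down to seconds first.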

Cannot use calculated offset in BigQuery's DATE_ADD function

I'm trying to create a custom query in Tableau to use on Google's BigQuery. The goal is to have an offset parameter in Tableau that changes the offsets used in a date based WHERE clause.
In Tableau it would look like this:
SELECT
DATE_ADD(UTC_USEC_TO_MONTH(CURRENT_DATE()),<Parameters.Offset>-1,"MONTH") as month_index,
COUNT(DISTINCT user_id, 1000000) as distinct_count
FROM
[Orders]
WHERE
order_date >= DATE_ADD(UTC_USEC_TO_MONTH(CURRENT_DATE()),<Parameters.Offset>-12,"MONTH")
AND
order_date < DATE_ADD(UTC_USEC_TO_MONTH(CURRENT_DATE()),<Parameters.Offset>-1,"MONTH")
However, BigQuery always returns an error:
Error: DATE_ADD 2nd argument must have INT32 type.
When I try the same query in the BigQuery editor using simple arithmetic it fails with the same error.
SELECT
DATE_ADD(UTC_USEC_TO_MONTH(CURRENT_DATE()),5-3,"MONTH") as month_index,
FROM [Orders]
Any workaround for this? My only option so far is to make multiple offsets in Tableau, it seems.
Thanks for the help!
I acknowledge that this is a hole in the functionality of DATE_ADD. It can be fixed, but it will take some time until the fix is rolled into production.
Here is a possible workaround. It seems to work if the first argument to DATE_ADD is a string. Then you can truncate the result to a month boundary and convert it from a timestamp to a string.
SELECT
FORMAT_UTC_USEC(UTC_USEC_TO_MONTH(DATE_ADD(CURRENT_DATE(),5-3,"MONTH"))) as month_index;

SQL query using 'LIKE' gives results but using '=' does not return any result in Oracle

The query using LIKE (this query gives the desired result when run):
select * from catissue_audit_event where event_timestamp like '16-DEC-14'
But the query using '=' returns an empty result set:
select * from catissue_audit_event where event_timestamp='16-DEC-14'
Here event_timestamp is of type Date
Strange thing is that the query runs for other dates such as:
select * from catissue_audit_event where event_timestamp='15-DEC-14'
What can be the issue? I already checked for leading and trailing spaces in the data
Output after running the first query:
In Oracle a DATE (and of course a TIMESTAMP) column contains a time part as well.
Just because your SQL client is hiding the time, doesn't mean it isn't there.
If you want all rows from a specific day (ignoring the time) you need to use trunc()
select *
from catissue_audit_event
where trunc(event_timestamp) = DATE '2014-12-16';
Be aware that this query will not use an index on the event_timestamp column.
You should also not rely on implicit data type conversion as you do with the expression event_timestamp = '16-DEC-14'. That statement is going to fail if I run it from my computer because of different NLS settings. Always use a proper DATE literal (as I have done in my statement). If you don't like the unambiguous ISO date, then use to_date():
where trunc(event_timestamp) = to_date('16-12-2014', 'dd-mm-yyyy');
You should avoid using month names unless you know that all environments (which includes computers and SQL clients) where your SQL statement is executed are using the same NLS settings. If you are sure, you can use e.g. to_date('16-DEC-14', 'dd-mon-yy')
The reason the two queries behave differently is a separate matter from the solution to your issue.
The solution to your issue is to stop performing date comparisons by implicit conversion to a string. Convert your string to a date to perform a date comparison:
select * from catissue_audit_event where event_timestamp = date '2014-12-16'
I cannot stress this enough; when performing a date comparison only compare dates.
In the LIKE query, your column EVENT_TIMESTAMP is being implicitly (this is bad) converted to a string in accordance with your NLS_DATE_FORMAT, which you can find as follows:
select * from nls_session_parameters
This governs how date data is displayed and implicitly converted. LIKE works and = doesn't because your NLS_DATE_FORMAT is masking additional data. In other words, your date has a time component.
If you run the following and then re-select the data from your table you'll see the additional time component
alter session set nls_date_format = 'yyyy-mm-dd hh24:mi:ss'
Thus, if you want all the data for a specific date without constraint on time you'll need to remove the time component:
select * from catissue_audit_event where trunc(event_timestamp) = date '2014-12-16'
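The effect of the hidden time component can be reproduced with a small Python sketch. The sample rows are invented for illustration (they are not the asker's data); they assume the 15-DEC row was stored at exactly midnight, which would explain why '=' happened to work for that date.

```python
from datetime import date, datetime, timedelta

# Hypothetical event_timestamp values; only the first sits exactly at midnight.
rows = [
    datetime(2014, 12, 15, 0, 0, 0),
    datetime(2014, 12, 16, 9, 30, 0),
    datetime(2014, 12, 16, 17, 5, 0),
]

day = date(2014, 12, 16)
midnight = datetime.combine(day, datetime.min.time())

# Like "= DATE '2014-12-16'": only a value at exactly 00:00:00 would match.
exact = [r for r in rows if r == midnight]

# Like "trunc(event_timestamp) = DATE '2014-12-16'": time of day is dropped.
truncated = [r for r in rows if r.date() == day]

# Half-open range on the untouched value, selecting the same rows.
in_range = [r for r in rows if midnight <= r < midnight + timedelta(days=1)]

print(len(exact), len(truncated), len(in_range))  # 0 2 2
```

The half-open range compares the stored value directly, so it is one common way to select a whole day without wrapping the column in a function.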
Have you tried matching the event_timestamp format (for example, DD-MON-YY) with the date that you are passing?