Apache Drill Timestampdiff on Oracle DB - sql

Hey everyone, I'm relatively new to Apache Drill and I'm having trouble converting my Oracle-specific SQL scripts (PL/SQL) to Drill-based queries.
For example, I have a script that checks for processed data in the last X days.
In this script I'm using the sysdate function.
Here is my old script:
SELECT i.id,i.status,status_text,i.kunnr,i.bukrs,i.belnr,i.gjahr,event,i.sndprn,i.createdate,executedate,tstamp,v.typ_text,i.docnum,i.description, i.*
FROM in_job i JOIN vstatus_injob v ON i.id= v.id
WHERE 1=1
AND i.createdate > sysdate - 30.5
order by i.createdate desc;
When I looked for Drill-specific datetime diff functions, I found "TIMESTAMPDIFF".
So here is my "drillified" script:
SELECT i.id, i.status, status_text, i.kunnr, i.bukrs, i.belnr, i.gjahr, i.event, i.sndprn, i.createdate, i.executedate, i.tstamp,v.typ_text,i.docnum,i.description,i.*
FROM SchemaNAME.IN_JOB i JOIN SchemaNAME.VSTATUS_INJOB v ON i.id=v.id
WHERE TIMESTAMPDIFF(DAY, CURRENT_TIMESTAMP, i.createdate) >=30
And the error that is returned reads like this:
DATA_READ ERROR: The JDBC storage plugin failed while trying setup the SQL query.
On further inspection I can see the Oracle-specific error, which reads:
Caused by: java.sql.SQLSyntaxErrorException: ORA-00904: "TIMESTAMPDIFF": invalid ID
So now my question:
I thought Apache Drill replaces the function "TIMESTAMPDIFF" at runtime. But from what I can see in the logs, it looks more like Drill hands the function call "TIMESTAMPDIFF" over to the Oracle database.
If that's true, how could I change my script to calculate the time difference (in days) and compare it to an int (i.e. 30 in the script)?
If I use sysdate like above, Apache Drill jumps in and says it doesn't know "sysdate".
How would you guys handle that?
Thanks in advance and so long
:)

I have found a solution.
I'm posting it in case someone (or even me in the future) runs into a similar problem.
{
"queryType": "SQL",
"query": "select to_char(SELECT CURRENT_TIMESTAMP - INTERVAL XX MONTH FROM (VALUES(1)),'dd.MM.yy')"
}
With to_char and CURRENT_TIMESTAMP minus an INTERVAL I can get everything I need.
I took the query above, packed it into a Grafana variable, named it "timeStmpDiff", and then queried everything with a JSON API call to my Drill instance.
Basically:
"query" : "SELECT i.id, i.status, status_text, i.kunnr, i.bukrs, i.belnr, i.gjahr, i.event, i.sndprn, i.createdate, i.executedate, i.tstamp,v.typ_text,i.docnum,i.description,i.* FROM ${Schema}.IN_JOB i JOIN ${Schema}.VSTATUS_INJOB v ON i.id=v.id WHERE i.createdate >= '${timeStmpDiff}' order by i.createdate desc"
You can, of course, query it in one go with a subselect.
But because I use Grafana, it made sense to me to bundle that in a variable.
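For anyone who would rather compute the cutoff outside of SQL entirely, here is a minimal Python sketch of the same idea: the date is calculated client-side and pasted into the query as a literal, so no date function has to be pushed down to Oracle. The function name build_drill_payload is made up for illustration, the column list is shortened, and the 'dd.MM.yy' style literal only works if the Oracle session's date format matches it (an assumption carried over from the query above).

```python
import json
from datetime import datetime, timedelta


def build_drill_payload(schema, days, now=None):
    """Build the JSON body for a Drill REST query with a precomputed cutoff date.

    `schema` and `days` are illustrative parameters, not part of the original post.
    """
    now = now or datetime.now()
    # Same 'dd.MM.yy' layout as the to_char() call above (assumed to match Oracle's date format).
    cutoff = (now - timedelta(days=days)).strftime("%d.%m.%y")
    query = (
        "SELECT i.id, i.status, i.createdate "
        f"FROM {schema}.IN_JOB i JOIN {schema}.VSTATUS_INJOB v ON i.id = v.id "
        f"WHERE i.createdate >= '{cutoff}' ORDER BY i.createdate DESC"
    )
    return json.dumps({"queryType": "SQL", "query": query})


if __name__ == "__main__":
    print(build_drill_payload("SchemaNAME", 30))
```

The resulting string is the body you would POST to Drill's REST endpoint (/query.json) with Content-Type application/json.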

Related

Bigquery job failed with error: Encountered " "FROM" "FROM ""

I'm calling a SQL query with a BigQuery API with Airflow. This query works perfectly fine in the BigQuery workspace but says I'm writing FROM FROM even though I'm not...
The logs say line 4, character 20 is where the error occurs which corresponds to:
, EXTRACT(DATE FROM event_time) AS session_date.
My overall query structure looks something like:
SELECT * FROM
((SELECT
fields_here
FROM table_name
LEFT JOIN UNNEST(sub_table) AS s
WHERE 1=1
UNION ALL
(SELECT
fields_here
FROM table_name
LEFT JOIN UNNEST(sub_table) AS s
WHERE 1=1
ORDER BY 1, 2))
ORDER BY 1, 2
I'm also using the LEAD() window function and COALESCE() but not sure if that matters. Really confused why this error is occurring...
The issue was that I had not passed the use_legacy_sql=False argument in Airflow.
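A minimal sketch of what that setting looks like when the query goes through the BigQuery jobs API, where the flag is spelled useLegacySql in the job configuration. It is an assumption that you submit the job this way (for example via Airflow's BigQueryInsertJobOperator, which takes such a dict as its configuration argument); the helper name build_query_job_config is hypothetical.

```python
def build_query_job_config(sql):
    """Return a BigQuery query-job configuration that forces Standard SQL.

    Without useLegacySql=False the query may be parsed as Legacy SQL,
    which does not understand EXTRACT(DATE FROM ...).
    """
    return {
        "query": {
            "query": sql,
            "useLegacySql": False,
        }
    }


if __name__ == "__main__":
    config = build_query_job_config(
        "SELECT EXTRACT(DATE FROM event_time) AS session_date FROM table_name"
    )
    print(config)
```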

TrinoUserError (type=USER_ERROR, name=SYNTAX_ERROR, message="line 7:26: mismatched input 'COUNT'. Expecting: '*', <expression>")

I am using dbt-trino and for some reason, it doesn't understand the MySQL query that works fine by executing it directly on MySQL. In this query, I want to select and group records that have been created during the previous month.
The Query:
SELECT order_location, COUNT(*) as order_count
FROM {{ ref('x_stg_order_fields') }}
WHERE
created_at >= DATE_FORMAT( CURRENT_DATE - INTERVAL 1 MONTH, '%Y/%m/01' )
AND
created_at < DATE_FORMAT( CURRENT_DATE, '%Y/%m/01' )
GROUP BY order_location
While this query works fast and successfully directly on MySQL, it returns this error when executing with dbt run:
TrinoUserError(type=USER_ERROR, name=SYNTAX_ERROR, message="line 7:53: mismatched input 'COUNT'. Expecting: '*', <expression>")
Does this mean that dbt-trino doesn't support all MySQL functions?
That error is coming from your database, not from dbt itself. dbt does not parse your SQL commands; it just passes them through to your connected database.
My guess is that {{ ref('x_stg_order_fields') }} may be referring to an ephemeral model that contains a syntax error, or possibly a field named count that isn't quoted?
You can confirm or disprove that by looking at the SQL that dbt tried to run in your database, by inspecting the target directory in your project. Specifically, target/run/path/to/your_model.sql will show you the actual command being executed. You should be able to check line 7, col 53 in that file, and you will see the code that Trino is complaining about.
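If counting columns by hand is tedious, a small Python sketch like the following can point at the reported position in the compiled SQL. The function name show_error_location is made up for illustration, and it assumes the line and column in the error message are both 1-based.

```python
def show_error_location(sql_text, line, col, context=1):
    """Return the lines around (line, col) with a caret under the reported column.

    `line` and `col` are assumed to be 1-based, as they appear in the error message.
    """
    lines = sql_text.splitlines()
    start = max(line - 1 - context, 0)
    end = min(line + context, len(lines))
    out = []
    for idx in range(start, end):
        # The prefix "NNNN | " is always 7 characters wide.
        out.append(f"{idx + 1:>4} | {lines[idx]}")
        if idx == line - 1:
            out.append(" " * 7 + " " * (col - 1) + "^")
    return "\n".join(out)


if __name__ == "__main__":
    compiled = "SELECT order_location,\n  COUNT(*) AS order_count\nFROM some_table"
    print(show_error_location(compiled, line=2, col=3))
```

To use it on a real run, read the compiled file from the target directory into a string and pass in the line and column from the Trino error.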

Extracting hour by using coalesce in SQL on a timestamp

I am trying to update a query to extract the hour from a timestamp and I keep getting an error. The error I get is due to the FROM clause I was using.
SELECT
analytics_platform_data_type
, activity_date_pt
, activity_timestamp_pt
, analytics_platform_timestamp_utc
, analytics_platform_timestamp_utc_iso
--This is the clause that is causing the problem (Begin)
, extract(hour from coalesce(activity_timestamp_pt)) as latd_hour_pt
--Clause above is the issue; Line above is line 9 (End)
, analytics_platform_ platform
, ad_channel_name
, publisher_name
, ip_address
, analytics_platform_unique_activity_id
, click_id
, latd_custom_fields
FROM table_date_range([AllData_AnalyticsMobileData_], timestamp('2018-09-25'), timestamp('2018-09-27'))
where 1=1
and analytics_platform_data_type = 'CLICK'
and partner_name = 'ABC123'
If I remove the extract hour piece the query works fine. When I add it I get the error: Encountered " "FROM" "from "" at line 9, column 16. Was expecting: ")" ...
I have seen the clause I am trying to use in the above query used before, but it was in a much more complex query that used subqueries. I'm really not sure what the issue is. (Using Google BigQuery Legacy SQL)
Your query is mixing Legacy SQL syntax (table_date_range) with Standard SQL syntax (EXTRACT).
If for some reason you need to stick with Legacy SQL, use HOUR() instead of EXTRACT().
But it is highly recommended to migrate to Standard SQL, where you should use wildcard tables instead of table_date_range.
Something like:
FROM `project.dataset.AllData_AnalyticsMobileData_*`
WHERE _TABLE_SUFFIX BETWEEN '20180925' AND '20180927'
See more in the Migrating to Standard SQL doc: https://cloud.google.com/bigquery/docs/reference/standard-sql/migrating-from-legacy-sql#table_decorators_and_wildcard_functions
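One detail worth spelling out: table_date_range matches tables whose names end in a YYYYMMDD day suffix, so the _TABLE_SUFFIX bounds have to be strings in that layout rather than dashed dates. A minimal Python sketch for building the clause from real dates follows; the helper name wildcard_filter is hypothetical, and the project and dataset names are the placeholders from the snippet above.

```python
from datetime import date


def wildcard_filter(table_prefix, start, end):
    """Build a Standard SQL FROM/WHERE pair for date-sharded tables.

    Assumes the shards are named <table_prefix>YYYYMMDD, which is the
    naming scheme TABLE_DATE_RANGE expects in Legacy SQL.
    """
    lo = start.strftime("%Y%m%d")
    hi = end.strftime("%Y%m%d")
    return (
        f"FROM `{table_prefix}*`\n"
        f"WHERE _TABLE_SUFFIX BETWEEN '{lo}' AND '{hi}'"
    )


if __name__ == "__main__":
    print(wildcard_filter(
        "project.dataset.AllData_AnalyticsMobileData_",
        date(2018, 9, 25),
        date(2018, 9, 27),
    ))
```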

U-sql error: Expected one of: AS EXCEPT FROM GROUP HAVING INTERSECT OPTION ORDER OUTER UNION UNION WHERE ';' ')' ','

I have the following table:
EstimatedCurrentRevenue -- Revenue column value of yesterday
EstimatedPreviousRevenue --- Revenue column value of current day
crmId
OwnerId
PercentageChange.
I am querying two snapshots of similarly structured data in Azure Data Lake and trying to compute the percentage change in Revenue.
The following is my query; I am trying to join on OpportunityId to get the difference between the revenue values:
@opportunityRevenueData = SELECT (((opty.EstimatedCurrentRevenue - optyPrevious.EstimatedPreviousRevenue)*100)/opty.EstimatedCurrentRevenue) AS PercentageRevenueChange, optyPrevious.EstimatedPreviousRevenue,
opty.EstimatedCurrentRevenue, opty.crmId, opty.OwnerId From @opportunityCurrentData AS opty JOIN @opportunityPreviousData AS optyPrevious on opty.OpportunityId == optyPrevious.OpportunityId;
But I get the following error:
E_CSC_USER_SYNTAXERROR: syntax error. Expected one of: AS EXCEPT FROM
GROUP HAVING INTERSECT OPTION ORDER OUTER UNION UNION WHERE ';' ')'
','
at token 'From', line 40
near the ###:
I know this expression is where the problem is, but I'm not sure how to fix it:
(((opty.EstimatedCurrentRevenue - optyPrevious.EstimatedPreviousRevenue)*100)/opty.EstimatedCurrentRevenue)
Please help, I am completely new to U-SQL.
U-SQL is case-sensitive, and all SQL reserved words must be written in UPPER CASE. So you should capitalise the FROM and ON keywords in your statement, like this:
@opportunityRevenueData =
SELECT (((opty.EstimatedCurrentRevenue - optyPrevious.EstimatedPreviousRevenue) * 100) / opty.EstimatedCurrentRevenue) AS PercentageRevenueChange,
optyPrevious.EstimatedPreviousRevenue,
opty.EstimatedCurrentRevenue,
opty.crmId,
opty.OwnerId
FROM @opportunityCurrentData AS opty
JOIN
@opportunityPreviousData AS optyPrevious
ON opty.OpportunityId == optyPrevious.OpportunityId;
Also, if you are completely new to U-SQL, you should consider working through some tutorials to establish the basics of the language, including case-sensitivity. Start at http://usql.io/.
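If you want to catch this kind of slip before submitting a job, a rough Python sketch like the one below can flag keywords that are not fully upper case. The keyword list is deliberately short and the function name find_miscased_keywords is made up for illustration; it is a plain text scan, so it will also flag identifiers or string literals that happen to spell a keyword.

```python
import re

# A small, incomplete sample of reserved words; extend as needed.
KEYWORDS = ("SELECT", "FROM", "JOIN", "ON", "AS", "WHERE",
            "GROUP", "BY", "ORDER", "HAVING", "UNION")
_PATTERN = re.compile(r"\b(" + "|".join(KEYWORDS) + r")\b", re.IGNORECASE)


def find_miscased_keywords(script):
    """Return (line_number, word) pairs for keywords not written in upper case."""
    hits = []
    for lineno, line in enumerate(script.splitlines(), start=1):
        for match in _PATTERN.finditer(line):
            word = match.group(0)
            if word != word.upper():
                hits.append((lineno, word))
    return hits


if __name__ == "__main__":
    sample = "@r = SELECT a.x From @a AS a\nJOIN @b AS b on a.id == b.id;"
    for lineno, word in find_miscased_keywords(sample):
        print(f"line {lineno}: '{word}' should be '{word.upper()}'")
```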
This same crazy-sounding error message can occur for (almost?) any U-SQL syntax error. The answer above was clearly correct for the provided code.
However, since many folks will probably get to this page from a search for 'AS EXCEPT FROM GROUP HAVING INTERSECT OPTION ORDER OUTER UNION UNION WHERE', I'd say the best advice for handling these is to look closely at the snippet of your code that the error message has marked with '###'.
For example I got to this page upon getting a syntax error for a long query and it turned out I didn't have a casing issue, but just a malformed query with parens around the wrong thing. Once I looked more closely at where in the snippet the ### symbol was, the error became clear.

PostgreSQL : Operator does not exist: Integer < Interval

I have written a query so I can view people who are overdue for an order based on their average order dates. The query is to be run on a PostgreSQL database, and will be executed from a Java process.
However, in the line:
CASE WHEN
(max(date_trunc('day', dateordered))-
min(date_trunc('day', dateordered)) ) /
count(distinct dateordered) + 5 <
date_trunc('day',now()) -
max(date_trunc('day', dateordered)) THEN 'ORDEROVERDUE' ELSE
null
END
I receive the error message :
Operator does not exist : integer < interval
I have read a lot of questions which have a similar issue, but none which seem to fix my particular issue.
If I alter my query to this :
CASE WHEN
(max(dateordered::date) - min(dateordered::date) )/
count(distinct dateordered) + 5 <
now()::date - max(dateordered::date) THEN
'ORDEROVERDUE' ELSE null
END
Then it runs on the database; however, I can't get this syntax to work in my Java process in Eclipse.
My understanding of SQL is letting me down. I understand the general reason behind the error, but I am unable to come up with a solution.
Is there a way of altering this line so that the error goes away and I still get the desired result?