Impala query cannot retrieve result with NullPointerException - hive

I have the following query that I run over hive/impala:
select count(p.id) as tweet_count, p.author as author,p.profile_image_url as profile_image_url,p.screen_name as screen_name,
concat_ws('/',min(p.postday),min(p.postmonth),min(p.postyear) ) as creation_date,p.message message,af.followerid as follower
from post p
inner join author_follower af on af.id like if(p.author= null, '', concat(p.author,'%'))
where p.hashtaglist like 'hashtagtobeused'
group by author,profile_image_url,screen_name,message,follower
ORDER BY cast(min(postyear) as int),cast(min(postmonth) as int),cast(min(postday) as int),cast(min(posthour) as int) ASC;
but for some reason I get the following error result
Your query has the following error(s):
Bad status for request 3304: TGetOperationStatusResp(status=TStatus(errorCode=None, errorMessage=None, sqlState=None, infoMessages=None, statusCode=0), operationState=5, errorMessage=None, sqlState=None, errorCode=None)
I checked the query and I cannot find a problem with it, can anyone please help and guide to where the problem is?why do I have this error instead of result set

Consider carefully reformatting the query, as in some cases, Impala crashes with SEGV when the SQL parse itself fails due to simple issues like spaces. If you're running Cloudera, you'll find logs in /run/cloudera-scm-agent/process on the node that ran the query.
We have resolved these issues by being careful about SQL formatting (which is also just good practice as it make query errors easier to spot), e.g.
SELECT
COUNT(p.id) AS tweet_count,
p.author AS author,
p.profile_image_url AS profile_image_url,
p.screen_name AS screen_name,
concat_ws('/', MIN(p.postday), MIN(p.postmonth), MIN(p.postyear) ) AS creation_date,
p.message AS MESSAGE,
af.followerid AS follower
FROM
post p
INNER JOIN
author_follower af
ON
af.id LIKE IF(p.author = NULL, '', concat(p.author, '%'))
WHERE
p.hashtaglist LIKE 'hashtagtobeused'
GROUP BY
author,
profile_image_url,
screen_name,
MESSAGE,
follower
ORDER BY
CAST(MIN(postyear) AS INT),
CAST(MIN(postmonth) AS INT),
CAST(MIN(postday) AS INT),
CAST(MIN(posthour) AS INT) ASC;
(By the way, I used dbVisualizer to validate and reformat the query syntax -- great tool to consider)

Related

Bigquery job failed with error: Encountered " "FROM" "FROM ""

I'm calling a SQL query with a BigQuery API with Airflow. This query works perfectly fine in the BigQuery workspace but says I'm writing FROM FROM even though I'm not...
The logs say line 4, character 20 is where the error occurs which corresponds to:
, EXTRACT(DATE FROM event_time) AS session_date.
My overall query structure looks something like:
SELECT * FROM
((SELECT
fields_here
FROM table_name
LEFT JOIN UNNEST(sub_table) AS s
WHERE 1=1
UNION ALL
(SELECT
fields_here
FROM table_name
LEFT JOIN UNNEST(sub_table) AS s
WHERE 1=1
ORDER BY 1, 2))
ORDER BY 1, 2
I'm also using the LEAD() window function and COALESCE() but not sure if that matters. Really confused why this error is occurring...
Issue was not adding use_legacy_sql=False argument in Airflow

Can't join FREETEXTTABLE with typeORM functions

I'm working on a graphQL backend with TypeORM
I found FREETEXT, CONTAiNS, FREETEXTTABLE and CONTAINSTABLE options for fulltext searching in my SQL database.
As FREETEXTTABLE has a "RANK column, it is more useful and I’m using this option.
I added required settings to the database and by below query it’s working correctly when applying the query directly into the database:
SELECT * FROM content
INNER JOIN freetexttable( content , *, 'test text') newtable ON newtable.[KEY] = "id"
ORDER BY newtable.RANK desc
OFFSET 0 ROWS FETCH NEXT 12 ROWS ONLY
But when I try to use it with TypeORM like below I have some errors.
import { getRepository } from "typeorm";
let repo = getRepository(Content)
return repo.createQueryBuilder("qb")
.innerJoin(FREETEXTTABLE( qb , *, 'light'), newtable,newtable.[KEY] = qb.id)
.skip(0)
.take(12)
.getManyAndCount();
And it returns this error:
"Error: Invalid object name 'FREETEXTTABLE( qb , *, 'light')'."
Could you please let me know if there is a problem with my code or maybe you know a better option for fulltext search in SQL database
I was searching for the same issue. I was unable to find a solution to use FREETEXTTABLE with typeorm, so I had to implement it using raw SQL. using below code
repo.query(" RAW SQL ")

Apache Drill Timestampdiff on Oracle DB

Hey everyone im relativly new to Apache Drill and im having troubles converting my Oracle specific sql scripts (pl/sql) to Drill based querys.
For example i have a Scripts who checks for processed data in the last X Days.
In this script im using the the sysdate function.
Here is my old script:
SELECT i.id,i.status,status_text,i.kunnr,i.bukrs,i.belnr,i.gjahr,event,i.sndprn,i.createdate,executedate,tstamp,v.typ_text,i.docnum,i.description, i.*
FROM in_job i JOIN vstatus_injob v ON i.id= v.id
WHERE 1=1
AND i.createdate > sysdate - 30.5
order by i.createdate desc;
When i looked up in terms of drill specific Datetime Diff functions i found "TIMESTAMPDIFF".
So here is my "drillified" script:
SELECT i.id, i.status, status_text, i.kunnr, i.bukrs, i.belnr, i.gjahr, i.event, i.sndprn, i.createdate, i.executedate, i.tstamp,v.typ_text,i.docnum,i.description,i.*
FROM SchemaNAME.IN_JOB i JOIN SchemaNAME.VSTATUS_INJOB v ON i.id=v.id
WHERE TIMESTAMPDIFF(DAY, CURRENT_TIMESTAMP, i.createdate) >=30
And the Error that is returned reads like this:
DATA_READ ERROR: The JDBC storage plugin failed while trying setup the SQL query.
By further inspection i can see the Oracle specific error that reads:
Caused by: java.sql.SQLSyntaxErrorException: ORA-00904: "TIMESTAMPDIFF": invalid ID
So now my question:
I thought apache drill replaces the function "TIMSTAMPDIFF" at runtime. But from what i can see in the logs its more like that Drill Hands over the Function Call "TIMESTAMPDIFF" to the Oracle database.
If thats true, how could i change my script to calculate the time difference (in days) and compare it to an int (ie 30 in the script).
If i use sysdate like above Apache Drill jumps in and says it doesnt know "sysdate".
How would you guyes handle that?
Thanks in advance and so long
:)
I have found a solution...
Just in Case someone (or even me in the future) is having a similar problem.
{
"queryType": "SQL",
"query": "select to_char(SELECT CURRENT_TIMESTAMP - INTERVAL XX MONTH FROM (VALUES(1)),'dd.MM.yy')"
}
With some to_char and the use of the CURRENT_TIMESTAMP - Interval Function Calls i can get everything i needed.
I took the query above packed it into an Grafana Variable, named it "timeStmpDiff" and then queried everything with an json Api Call to my Drill instance.
Basically:
"query" : "SELECT i.id, i.status, status_text, i.kunnr, i.bukrs, i.belnr, i.gjahr, i.event, i.sndprn, i.createdate, i.executedate, i.tstamp,v.typ_text,i.docnum,i.description,i.* FROM ${Schema}.IN_JOB i JOIN ${Schema}.VSTATUS_INJOB v ON i.id=v.id WHERE i.createdate >= '${timeStmpDiff}' order by i.createdate desc"
You can, of course query it in on go with an subselect.
But because i use grafana it made sense to me to bundle that in a Variable.

MS Access query produces invalid procedure call

My query in access works fine if i only use the below query with a select statement. As soon as it becomes an append query it produces an "invalid procedure call" error.
I have narrowed down the offending columns as being "Publ" and "PublLong". Both are long text strings. If I remove these two columns the query updates without an error.
Here is a sample data point found in the [Bezeichung] Field:
publications.bank.com/publ-dl-ch/pdf/WhatsUp_20181113_en.pdf
I checked the table that it is being inserted to and the data types are the same nor did i see any other setting that would block the insertion.
How can i get it to work?
INSERT INTO tbl_MatomoRaw ( DownloadDate, IntExt, Publ, PublLong,
PublDate, [Language], Download_Visits, PublMonth )
SELECT
Date() AS DownloadDate,
Left([Bezeichnung],InStr([Bezeichnung],".")-1) AS IntExt,
Nz(Mid([Bezeichnung],InStrRev([Bezeichnung],"/")+1,InStr([Bezeichnung],"_")-
InStrRev([Bezeichnung],"/")-1),"") AS Publ,
Mid([Bezeichnung],InStrRev([Bezeichnung],"/")+1,InStrRev([Bezeichnung],"_")-
InStrRev([Bezeichnung],"/")-1) AS PublLong,
Mid([Bezeichnung],InStr([Bezeichnung],"_")+1,8) AS PublDate,
Mid([Bezeichnung],Len([Bezeichnung])-5,2) AS [Language],
xlsx_Output.[Eindeutige Downloads] AS Download_Visits,
Mid([Bezeichnung],InStr([Bezeichnung],"_")+1,6) AS PublMonth
FROM xlsx_Output
WHERE
(((Nz(Mid([Bezeichnung],InStrRev([Bezeichnung],"/")+1,InStr([Bezeichnung],"_")-
InStrRev([Bezeichnung],"/")-1),"")) Not Like "#Func!"));
#Func! indicates one of your functions is causing an error.
Your query uses multiple functions that run into trouble when your input doesn't meet that format, pre-filter instead of filtering on an error, since you can't filter on an error when appending:
INSERT INTO tbl_MatomoRaw ( DownloadDate, IntExt, Publ, PublLong,
PublDate, [Language], Download_Visits, PublMonth )
SELECT
Date() AS DownloadDate,
Left([Bezeichnung],InStr([Bezeichnung],".")-1) AS IntExt,
Nz(Mid([Bezeichnung],InStrRev([Bezeichnung],"/")+1,InStr([Bezeichnung],"_")-
InStrRev([Bezeichnung],"/")-1),"") AS Publ,
Mid([Bezeichnung],InStrRev([Bezeichnung],"/")+1,InStrRev([Bezeichnung],"_")-
InStrRev([Bezeichnung],"/")-1) AS PublLong,
Mid([Bezeichnung],InStr([Bezeichnung],"_")+1,8) AS PublDate,
Mid([Bezeichnung],Len([Bezeichnung])-5,2) AS [Language],
[Eindeutige Downloads] AS Download_Visits,
Mid([Bezeichnung],InStr([Bezeichnung],"_")+1,6) AS PublMonth
FROM (SELECT * FROM xlsx_Output WHERE Len(Bezeichnung) > 5 AND Bezeichnung LIKE "*?.?*" AND Bezeichnung LIKE "*_????????*" AND Bezeichnung LIKE "*?\?*")
WHERE
(((Nz(Mid([Bezeichnung],InStrRev([Bezeichnung],"/")+1,InStr([Bezeichnung],"_")-
InStrRev([Bezeichnung],"/")-1),"")) Not Like "#Func!"));
Since I don't know exactly where the errors occur, I can't write up a proper filter to identify them, but judging by your query they should include a slash and a symbol after that slash, an underscore and at least 8 symbols after that underscore, and a dot with at least one symbol before and after the dot.

U-sql error: Expected one of: AS EXCEPT FROM GROUP HAVING INTERSECT OPTION ORDER OUTER UNION UNION WHERE ';' ')' ','

I have a following table:
EstimatedCurrentRevenue -- Revenue column value of yesterday
EstimatedPreviousRevenue --- Revenue column value of current day
crmId
OwnerId
PercentageChange.
I am querying two snapshots of the similarly structured data in Azure data lake and trying to query the percentage change in Revenue.
Following is my query i am trying to join on OpportunityId to get the difference between the revenue values:
#opportunityRevenueData = SELECT (((opty.EstimatedCurrentRevenue - optyPrevious.EstimatedPreviousRevenue)*100)/opty.EstimatedCurrentRevenue) AS PercentageRevenueChange, optyPrevious.EstimatedPreviousRevenue,
opty.EstimatedCurrentRevenue, opty.crmId, opty.OwnerId From #opportunityCurrentData AS opty JOIN #opportunityPreviousData AS optyPrevious on opty.OpportunityId == optyPrevious.OpportunityId;
But i get the following error:
E_CSC_USER_SYNTAXERROR: syntax error. Expected one of: AS EXCEPT FROM
GROUP HAVING INTERSECT OPTION ORDER OUTER UNION UNION WHERE ';' ')'
','
at token 'From', line 40
near the ###:
This expression is having the problem i know but not sure how to fix it.
(((opty.EstimatedCurrentRevenue - optyPrevious.EstimatedPreviousRevenue)*100)/opty.EstimatedCurrentRevenue)
Please help, i am completely new to U-sql
U-SQL is case-sensitive (as per here) with all SQL reserved words in UPPER CASE. So you should capitalise the FROM and ON keywords in your statement, like this:
#opportunityRevenueData =
SELECT (((opty.EstimatedCurrentRevenue - optyPrevious.EstimatedPreviousRevenue) * 100) / opty.EstimatedCurrentRevenue) AS PercentageRevenueChange,
optyPrevious.EstimatedPreviousRevenue,
opty.EstimatedCurrentRevenue,
opty.crmId,
opty.OwnerId
FROM #opportunityCurrentData AS opty
JOIN
#opportunityPreviousData AS optyPrevious
ON opty.OpportunityId == optyPrevious.OpportunityId;
Also, if you are completely new to U-SQL, you should consider working through some tutorials to establish the basics of the language, including case-sensitivity. Start at http://usql.io/.
This same crazy sounding error message can occur for (almost?) any USQL syntax error. The answer above was clearly correct for the provided code.
However since many folks will probably get to this page from a search for 'AS EXCEPT FROM GROUP HAVING INTERSECT OPTION ORDER OUTER UNION UNION WHERE', I'd say the best advice to handle these is look closely at the snippet of your code that the error message has marked with '###'.
For example I got to this page upon getting a syntax error for a long query and it turned out I didn't have a casing issue, but just a malformed query with parens around the wrong thing. Once I looked more closely at where in the snippet the ### symbol was, the error became clear.