Running hive script in oozie - hive

I am trying to create a table from the union of two tables. It works in the Hive console but fails when I run it through Oozie.
The query is:
CREATE TABLE reconcile_table AS
SELECT t1.* FROM
(SELECT * FROM base_table
UNION ALL
SELECT * FROM incremental_table) t1
JOIN
(SELECT EMPNO, max(modified_date) max_modified FROM
(SELECT * FROM base_table
UNION ALL
SELECT * FROM incremental_table) t2
GROUP BY EMPNO) s
ON t1.EMPNO = s.EMPNO AND t1.modified_date = s.max_modified;
The error is:
ERROR org.apache.hadoop.hive.ql.Driver - FAILED: IllegalArgumentException java.net.URISyntaxException: Relative path in absolute URI: file:./tmp/yarn/1c052ddc-1fff-48d0-8eec-a0581fe1f5e4/hive_2016-04-20_10-25-36_116_4089571356350445156-1
java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: file:./tmp/yarn/1c052ddc-1fff-48d0-8eec-a0581fe1f5e4/hive_2016-04-20_10-25-36_116_4089571356350445156-1
at org.apache.hadoop.fs.Path.initialize(Path.java:206)
Please tell me if you have an idea.

Related

ParseException line cannot recognize input near '(' 'select' 'from' in joinSource

I am trying to execute a query in Hive and I get the error above. I have checked it over and over, but I cannot see any problem.
select
a.phone_no,
a.app_name
from
(select * from (select app_name,phone_no from lc_app_flag) ) a
inner join
(select * from (select phone_no,city_id_day,city_id_night,lat_day,lng_day,lat_night,lat_night from TW_FEATS_FIN_LCEXT01 where month_id='202205' ) b where city_id_day='440100' or city_id_night='440100') c
on a.phone_no=c.phone_no
You did not alias the innermost subquery of the first derived table. Hive requires every subquery in a FROM clause to have an alias. Corrected SQL:
select
a.phone_no,
a.app_name
from
(select * from (select app_name,phone_no from lc_app_flag) subq) a --inner subquery as subq
inner join
(select * from (select phone_no,city_id_day,city_id_night,lat_day,lng_day,lat_night,lat_night from TW_FEATS_FIN_LCEXT01 where month_id='202205' ) b where city_id_day='440100' or city_id_night='440100') c
on a.phone_no=c.phone_no

Oracle SQL COUNT in an EXISTS SELECT

I want to write a query that returns all rows from my table INV where the INNUM occurs more than 2 times in table INVS.
This is my query right now, but it fails with the typical "missing right parenthesis" error.
When I run the EXISTS subquery on its own, it works.
SELECT WO.WONUM, WO.DESCRIPTION, INV.INNUM, INV.STATUSDATE
FROM INV LEFT OUTER JOIN WO ON INV.WOID = WO.WOID
WHERE EXISTS (
SELECT COUNT(*) FROM INVS WHERE INVS.INNUM = INV.INNUM and INVS.SITEID='ARZ' GROUP BY INVS.INNUM
HAVING COUNT(*) > 2 ORDER BY INVS.INNUM
);
I don't really know why.
Use a scalar subquery to calculate the count and compare it to 2 in the outer query:
SELECT WO.WONUM, WO.DESCRIPTION, INV.INNUM, INV.STATUSDATE
FROM INV LEFT OUTER JOIN
WO
ON INV.WOID = WO.WOID
WHERE (SELECT COUNT(*)
FROM INVS
WHERE INVS.INNUM = INV.INNUM AND
INVS.SITEID = 'ARZ'
) > 2;
Your query is relying on a doubly nested correlation clause which Oracle does not support.
You could also move the subquery to the FROM clause, but this version is more in the spirit of how you have written the query.
You get ORA-00907: missing right parenthesis because of the ORDER BY clause in the subquery.
Remove it and the syntax is valid.
Example:
with tab as (select rownum id from dual connect by level <= 5
union all
select 3 from dual union all
select 5 from dual)
select * from tab t
where exists
(select count(*) from tab
where id = t.id
group by id
having count(*) > 1)
;
ID
----------
3
5
3
5
This is not valid syntax and raises ORA-00907: missing right parenthesis:
select * from tab t
where exists
(select count(*) from tab
where id = t.id
group by id
having count(*) > 1 order by id)
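The two forms suggested above (a scalar COUNT subquery, and EXISTS with GROUP BY ... HAVING but without ORDER BY) can be compared on a small data set. This is only a sketch: it uses Python's sqlite3 module as a stand-in engine, not Oracle, so it cannot reproduce ORA-00907, and the sample rows are invented for illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in (assumption: not Oracle).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE INV  (INNUM INTEGER, STATUSDATE TEXT);
CREATE TABLE INVS (INNUM INTEGER, SITEID TEXT);
INSERT INTO INV VALUES (1, '2020-01-01'), (2, '2020-01-02'), (3, '2020-01-03');
INSERT INTO INVS VALUES
  (1, 'ARZ'), (1, 'ARZ'), (1, 'ARZ'),   -- three rows at site ARZ
  (2, 'ARZ'), (2, 'ARZ'),               -- only two rows
  (3, 'ARZ'), (3, 'XYZ'), (3, 'XYZ');   -- three rows, but only one at ARZ
""")

# Form 1: scalar subquery compared to 2 in the outer WHERE.
scalar_form = """
SELECT INV.INNUM FROM INV
WHERE (SELECT COUNT(*) FROM INVS
       WHERE INVS.INNUM = INV.INNUM AND INVS.SITEID = 'ARZ') > 2
ORDER BY INV.INNUM
"""

# Form 2: EXISTS with GROUP BY / HAVING and no ORDER BY inside the subquery.
exists_form = """
SELECT INV.INNUM FROM INV
WHERE EXISTS (SELECT COUNT(*) FROM INVS
              WHERE INVS.INNUM = INV.INNUM AND INVS.SITEID = 'ARZ'
              GROUP BY INVS.INNUM
              HAVING COUNT(*) > 2)
ORDER BY INV.INNUM
"""

scalar_rows = conn.execute(scalar_form).fetchall()
exists_rows = conn.execute(exists_form).fetchall()
print(scalar_rows, exists_rows)
```

On this sample data both queries return only INNUM 1, the one value with more than two ARZ rows in INVS.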

How to solve Invalid object name when naming a select?

I named my query and when I tried to use it I got: Invalid object name 'AssetsTenDays'. How do I solve that issue?
WITH AssetsTenDays AS(
SELECT DISTINCT top 2
a.name,
ir.number
FROM install i
INNER JOIN asset a ON a.pws_assetId = i.pws_AssetId
WHERE a.days = 10
)
When I try to use it, I get the Invalid object name error:
SELECT distinct *
from AssetsTenDays
Try using them together in a single SQL statement, like this:
WITH AssetsTenDays AS(
SELECT DISTINCT top 2
a.name,
ir.number
FROM install i
INNER JOIN asset a ON a.pws_assetId = i.pws_AssetId
WHERE a.days = 10
)
SELECT distinct *
from AssetsTenDays
A common table expression must be used immediately after its declaration:
WITH AssetsTenDays AS(
SELECT DISTINCT top 2 a.name, ir.number
FROM install i INNER JOIN
asset a
ON a.pws_assetId = i.pws_AssetId
WHERE a.days = 10
)
SELECT ad.*
FROM AssetsTenDays ad;
You can't put another statement between the declaration of a common table expression and the query that uses it. The CTE is only visible to the single statement that directly follows its WITH clause.
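The scoping rule described above can be seen in a small sketch. It assumes Python's sqlite3 as a stand-in engine (not SQL Server), so the error text differs from "Invalid object name", and the `asset` rows are invented for illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in (assumption: not SQL Server).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE asset (pws_assetId INTEGER, name TEXT, days INTEGER);
INSERT INTO asset VALUES (1, 'pump', 10), (2, 'valve', 10), (3, 'fan', 5);
""")

# The CTE and the SELECT that reads it form one statement: this works.
together = conn.execute("""
WITH AssetsTenDays AS (
    SELECT name FROM asset WHERE days = 10
)
SELECT name FROM AssetsTenDays ORDER BY name
""").fetchall()

# A later, separate statement no longer sees the CTE name.
try:
    conn.execute("SELECT name FROM AssetsTenDays")
    separate_error = None
except sqlite3.OperationalError as exc:
    separate_error = str(exc)

print(together)
print(separate_error)
```

The first query succeeds because the WITH clause belongs to it. The second fails because the name AssetsTenDays never existed as a table or view outside that statement.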

Inner joining the table to itself gives "No column name was specified for column 2"

SELECT *
FROM
construction AS T2
INNER JOIN
(
SELECT project,MAX(report_date)
FROM construction
GROUP BY project
) AS R
ON T2.project=R.project AND T2.report_date=R.report_date
I am getting this error, please help:
No column name was specified for column 2 of 'R'
You need to add an alias for MAX(report_date):
SELECT *
FROM construction AS T2
INNER JOIN
(
SELECT project,MAX(report_date) AS report_date
FROM construction
GROUP BY project
) AS R
ON T2.project = R.project
AND T2.report_date = R.report_date;
In SQL Server you can also list the column names after the derived table alias:
SELECT *
FROM construction AS T2
INNER JOIN
(
SELECT project,MAX(report_date)
FROM construction
GROUP BY project
) AS R(project, report_date)
ON T2.project = R.project
AND T2.report_date = R.report_date;
You should give MAX(report_date) an alias such as report_date, because your derived table R has two columns, project and MAX(report_date), and the second one has no name.
You are getting this error because you have not specified a column name for the aggregate in the inner query.
You have to write your query as:
SELECT *
FROM construction
INNER JOIN
(
SELECT project,MAX(report_date) AS Max_ReportDate
FROM construction
GROUP BY project
) Max_construction
ON construction.project = Max_construction.project
AND construction.report_date = Max_construction.Max_ReportDate
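As a quick check of the aliased form, here is a sketch that runs the corrected join on invented data. It uses Python's sqlite3 as a stand-in engine (an assumption: SQLite does not raise the SQL Server error for an unnamed column, so only the working query is shown), and the `progress` column is hypothetical.

```python
import sqlite3

# In-memory SQLite database used as a stand-in (assumption: not SQL Server).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE construction (project TEXT, report_date TEXT, progress INTEGER);
INSERT INTO construction VALUES
  ('A', '2021-01-01', 10), ('A', '2021-02-01', 40),
  ('B', '2021-01-15', 5),  ('B', '2021-03-01', 70);
""")

# The aggregate gets an explicit alias, so the derived table R has two
# named columns that the ON clause can reference.
latest = conn.execute("""
SELECT T2.project, T2.report_date, T2.progress
FROM construction AS T2
INNER JOIN (
    SELECT project, MAX(report_date) AS report_date
    FROM construction
    GROUP BY project
) AS R
  ON T2.project = R.project AND T2.report_date = R.report_date
ORDER BY T2.project
""").fetchall()

print(latest)
```

Each project keeps only the row with its most recent report_date, which is what the join against R is meant to achieve.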

DB2 Alternate to EXISTS Function

I have the below query in my application which was running on DB2:
SELECT COD.POST_CD,CLS.CLASS,COD2.STATUS_CD
FROM DC01.POSTAL_CODES COD
INNER JOIN DC02.STATUS_CODES COD2
ON COD.ORDER=COD2.ORDER
INNER JOIN DC02.VALID_ORDERS ORD
ON ORD.ORDER=COD.ORDER
WHERE
(
( EXISTS (SELECT 1 FROM DC00.PROCESS_ORDER PRD
WHERE PRD.ORDER=COD.ORDER
AND PRD.IDNUM=COD.IDNUM
)
) OR
( EXISTS (SELECT 1 FROM DC00.PENDING_ORDER PND
WHERE PND.ORDER=COD.ORDER
AND PND.IDNUM=COD.IDNUM
)
)
)
AND EXISTS (SELECT 1 FROM DC00.CUSTOM_ORDER CRD
WHERE CRD.ORDER=COD.ORDER
)
;
When we moved to UDB (LUW v9.5) we started getting the warning below:
IWAQ0003W SQL warnings were found
SQLState=01602 Performance of this complex query might be sub-optimal.
Reason code: "3".. SQLCODE=437, SQLSTATE=01602, DRIVER=4.13.111
I know this warning is caused by the EXISTS (...) OR EXISTS (...) condition, but I am not sure how else to write this query. If the condition were AND, I could have used an INNER JOIN, but because it is OR I cannot. Can anyone suggest a better way to replace these EXISTS predicates?
One way to remove the OR is to combine the two tables with UNION and test the result with a single EXISTS:
SELECT COD.POST_CD,CLS.CLASS,COD2.STATUS_CD
FROM DC01.POSTAL_CODES COD
INNER JOIN DC02.STATUS_CODES COD2
ON COD.ORDER=COD2.ORDER
INNER JOIN DC02.VALID_ORDERS ORD
ON ORD.ORDER=COD.ORDER
WHERE
EXISTS (SELECT 1 FROM
(SELECT ORDER,IDNUM FROM DC00.PROCESS_ORDER PRD UNION
SELECT ORDER,IDNUM FROM DC00.PENDING_ORDER PND) PD
WHERE PD.ORDER=COD.ORDER
AND PD.IDNUM=COD.IDNUM
)
AND EXISTS (SELECT 1 FROM DC00.CUSTOM_ORDER CRD
WHERE CRD.ORDER=COD.ORDER
)
;
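To check that a single EXISTS over a UNION selects the same rows as EXISTS ... OR EXISTS ..., here is a reduced sketch. It uses Python's sqlite3 as a stand-in engine (an assumption: it says nothing about how DB2 optimizes either form or whether the warning disappears). The schemas are dropped, the column is renamed to the hypothetical ORDER_NO to avoid quoting the reserved word ORDER, and the rows are invented.

```python
import sqlite3

# In-memory SQLite database used as a stand-in (assumption: not DB2).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE POSTAL_CODES  (ORDER_NO INTEGER, IDNUM INTEGER, POST_CD TEXT);
CREATE TABLE PROCESS_ORDER (ORDER_NO INTEGER, IDNUM INTEGER);
CREATE TABLE PENDING_ORDER (ORDER_NO INTEGER, IDNUM INTEGER);
INSERT INTO POSTAL_CODES VALUES
  (1, 10, 'P1'), (2, 20, 'P2'), (3, 30, 'P3'), (4, 40, 'P4');
INSERT INTO PROCESS_ORDER VALUES (1, 10), (3, 30);
INSERT INTO PENDING_ORDER VALUES (2, 20), (3, 30), (4, 99);
""")

# Original shape: two EXISTS predicates joined by OR.
or_form = """
SELECT COD.POST_CD FROM POSTAL_CODES COD
WHERE EXISTS (SELECT 1 FROM PROCESS_ORDER PRD
              WHERE PRD.ORDER_NO = COD.ORDER_NO AND PRD.IDNUM = COD.IDNUM)
   OR EXISTS (SELECT 1 FROM PENDING_ORDER PND
              WHERE PND.ORDER_NO = COD.ORDER_NO AND PND.IDNUM = COD.IDNUM)
ORDER BY COD.POST_CD
"""

# Rewritten shape: one EXISTS over the UNION of both tables.
union_form = """
SELECT COD.POST_CD FROM POSTAL_CODES COD
WHERE EXISTS (SELECT 1 FROM
                (SELECT ORDER_NO, IDNUM FROM PROCESS_ORDER
                 UNION
                 SELECT ORDER_NO, IDNUM FROM PENDING_ORDER) PD
              WHERE PD.ORDER_NO = COD.ORDER_NO AND PD.IDNUM = COD.IDNUM)
ORDER BY COD.POST_CD
"""

or_rows = conn.execute(or_form).fetchall()
union_rows = conn.execute(union_form).fetchall()
print(or_rows, union_rows)
```

Both queries return P1, P2 and P3 on this data. P4 is excluded because its IDNUM does not match in either table.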