I have a query that needs to be converted into a Hive query. The query is shown below:
select (some columns)
LEFT JOIN Acercross.Acercross_db_client_master dp
ON
TRIM(UPPER(dp.cm_blsavingcd)) = TRIM(UPPER(CM.Cl_Code))
OR
TRIM(UPPER(dp.cm_cd))=TRIM(UPPER(CM.Cl_Code ))
OR
TRIM(UPPER(dp.cm_cd)) = TRIM(UPPER(CM.CltDpId1))
OR
TRIM(UPPER(dp.cm_cd)) = TRIM(UPPER(CM.CltDpId2))
OR
TRIM(UPPER(dp.cm_cd)) = TRIM(UPPER(CM.CltDpId3))
When I run this query, I get:
Line 35:54 OR not supported in JOIN currently 'CltDpId3'
How can I write the above query in Hive?
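One workaround often suggested for Hive versions that reject OR in a join condition is to write one equality join per OR branch, combine the matches with UNION, and then left join the base table back to that set. The sketch below checks the idea in Python with the standard-library sqlite3 module as a stand-in engine (SQLite accepts OR in ON, so both forms can be compared). The table names cm and dp, the surrogate key id, and the sample rows are hypothetical; only the column names come from the question, and Hive syntax may differ in details.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cm (id INTEGER PRIMARY KEY, Cl_Code TEXT, CltDpId1 TEXT);
CREATE TABLE dp (cm_cd TEXT, cm_blsavingcd TEXT);
INSERT INTO cm VALUES (1, 'a1 ', 'x'), (2, 'zz', 'B2'), (3, 'none', 'none');
INSERT INTO dp VALUES ('A1', 'q'), ('b2', 'q'), ('k', ' A1');
""")

# Original shape: a LEFT JOIN whose ON clause contains OR.
or_join = """
SELECT cm.id, dp.cm_cd
FROM cm
LEFT JOIN dp
  ON TRIM(UPPER(dp.cm_blsavingcd)) = TRIM(UPPER(cm.Cl_Code))
  OR TRIM(UPPER(dp.cm_cd)) = TRIM(UPPER(cm.Cl_Code))
  OR TRIM(UPPER(dp.cm_cd)) = TRIM(UPPER(cm.CltDpId1))
"""

# Rewrite: one equality join per OR branch, combined with UNION,
# then a LEFT JOIN on the (assumed) unique key so unmatched cm rows are kept.
union_join = """
SELECT cm.id, m.cm_cd
FROM cm
LEFT JOIN (
  SELECT c.id, d.cm_cd FROM cm c JOIN dp d
    ON TRIM(UPPER(d.cm_blsavingcd)) = TRIM(UPPER(c.Cl_Code))
  UNION
  SELECT c.id, d.cm_cd FROM cm c JOIN dp d
    ON TRIM(UPPER(d.cm_cd)) = TRIM(UPPER(c.Cl_Code))
  UNION
  SELECT c.id, d.cm_cd FROM cm c JOIN dp d
    ON TRIM(UPPER(d.cm_cd)) = TRIM(UPPER(c.CltDpId1))
) m ON m.id = cm.id
"""

rows_or = sorted(conn.execute(or_join).fetchall(), key=repr)
rows_union = sorted(conn.execute(union_join).fetchall(), key=repr)
print(rows_or == rows_union)
```

UNION removes duplicate rows, so if two different dp rows can produce identical selected values, the subquery would need a unique dp key in its select list to keep the row count of the original join.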
I'm using multiple joins for a specific piece of logic, but I ran into a problem. Some of the records have a one-to-two relation in one of the tables, which messes up my output. I want to concatenate these strings so that they appear in one record, but I don't know how to do it in SQL Server. In Oracle and MySQL it's easy, but I tried playing with online examples and failed miserably.
My query:
SELECT c.customerName,c.Guid,p.campaignTitle ,
(SELECT k.campaignTitle FROM [DEV_TEST2].[dbo].campaigns l JOIN [DEV_TEST2].[dbo].campaignstitle k on k.campaignname = l.campaignname where l.campaignid = t.referrerurl) as Referrertitle,
t.activitydate,t.type
FROM [DEV_TEST2].[dbo].campaignknowncustomers c
join [DEV_TEST2].[dbo].[CampaignCustomerMatch] t ON(c.guid = t.visitorexternalid)
join [DEV_TEST2].[dbo].campaigns s ON(t.url = s.campaignid)
join [DEV_TEST2].[dbo].campaignstitle p on(s.campaignname = p.campaignname)
order by customername,activitydate
My problem is with the campaignTitle column and the Referrertitle correlated subquery. Both come from the same table. I need to concatenate them and show only one row per 'customername, guid, activitydate'.
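In SQL Server 2017 and later, the STRING_AGG aggregate collapses several strings into one value per group; older versions usually rely on the FOR XML PATH idiom instead. The sketch below shows only the grouping idea, in Python with the standard-library sqlite3 module, whose group_concat plays a similar role. The table visits and its rows are hypothetical stand-ins for the joined result in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE visits (customerName TEXT, guid TEXT, activitydate TEXT, title TEXT);
INSERT INTO visits VALUES
  ('Ann', 'g1', '2020-01-01', 'Spring Sale'),
  ('Ann', 'g1', '2020-01-01', 'Newsletter'),
  ('Bob', 'g2', '2020-01-02', 'Spring Sale');
""")

# One output row per (customerName, guid, activitydate); titles are joined.
rows = conn.execute("""
SELECT customerName, guid, activitydate,
       group_concat(title, ', ') AS titles
FROM visits
GROUP BY customerName, guid, activitydate
ORDER BY customerName, activitydate
""").fetchall()

# Concatenation order inside a group is not guaranteed, so compare as sets.
titles_by_customer = {r[0]: set(r[3].split(', ')) for r in rows}
print(len(rows), titles_by_customer)
```

On SQL Server 2017 or later, the same GROUP BY with STRING_AGG(title, ', ') in place of group_concat would be the closest equivalent.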
I am building a query in Hive. The query is given below.
Select * from CSS407
LEFT OUTER JOIN PROD_CORE.SERV_ACCT_ISVC_LINK SASP
ON CSS407.TABLE_ABBRV_CODE = 'SACT'
AND CSS407.EVENT_ITEM_REF_NUM = SASP.Serv_Acct_Id
AND to_date(CSS407.EVENT_RTS_VAL) >= SASP.Acct_Serv_Pnt_Strt_Dt
AND to_date(CSS407.EVENT_RTS_VAL) < SASP.Acct_Serv_Pnt_End_Dt
LEFT OUTER JOIN PROD_CORE.CUST_ACCT_SA_LINK ASA
ON CSS407.TABLE_ABBRV_CODE = 'SACT'
AND CSS407.EVENT_ITEM_REF_NUM = ASA.Serv_Acct_Id
AND CSS407.EVENT_RTS_VAL_UTC_DTTM >= ASA.Acct_Relt_Strt_Dttm
AND CSS407.EVENT_RTS_VAL_UTC_DTTM < ASA.Acct_Relt_End_Dttm
LEFT OUTER JOIN PROD_CORE.CUST_SA_LINK ASAT
ON CSS407.TABLE_ABBRV_CODE = 'TACT'
AND CSS407.EVENT_ITEM_REF_NUM = ASAT.Serv_Acct_Id
AND CSS407.EVENT_RTS_VAL_UTC_DTTM >= ASAT.Acct_Relt_Strt_Dttm
AND CSS407.EVENT_RTS_VAL_UTC_DTTM < ASAT.Acct_Relt_End_Dttm
When I execute the above query in Hive, I get the following error:
"Both left and right aliases encountered in JOIN 'SASP'"
On further investigation I found that Hive does not allow a date range filter in the JOIN ... ON condition. Every post I have read suggests moving that filter to the WHERE clause.
But in our case, if we move the date range filter to the WHERE clause, we get no data, because the filter then removes the rows that the left outer join should keep.
I get this issue only in Hive; the query works fine in Teradata and Oracle.
Please help.
Only equality (=) comparisons work in a join condition in Hive. Move the range comparisons (>= and <) to the WHERE clause.
I had a similar issue earlier. Please check the thread below.
Hive Select MAX() in Join Condition
Hope this helps.
There might be some common column between CSS407 and SERV_ACCT_ISVC_LINK which might be creating this error.
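Moving a range test on the right-hand table into the outer WHERE clause discards the rows that the LEFT OUTER JOIN left unmatched. One workaround sometimes used on Hive versions limited to equality joins is to apply the range test inside an inner-join subquery, then left join the base table to that subquery on a unique key. The sketch below checks the idea in Python with the standard-library sqlite3 module as a stand-in engine; the tables ev and link, the key ev_id, and the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ev (ev_id INTEGER PRIMARY KEY, acct INTEGER, ts TEXT);
CREATE TABLE link (acct INTEGER, start_dt TEXT, end_dt TEXT, pnt TEXT);
INSERT INTO ev VALUES (1, 10, '2020-03-01'), (2, 10, '2021-01-01'), (3, 99, '2020-03-01');
INSERT INTO link VALUES (10, '2020-01-01', '2020-06-01', 'P1'),
                        (10, '2020-06-01', '2020-12-31', 'P2');
""")

# Reference: range test inside ON (the form older Hive versions reject).
range_in_on = """
SELECT ev.ev_id, l.pnt
FROM ev
LEFT JOIN link l
  ON ev.acct = l.acct AND ev.ts >= l.start_dt AND ev.ts < l.end_dt
"""

# Rewrite: equality in ON, range in the WHERE of an inner-join subquery,
# then LEFT JOIN on the unique key so every ev row survives.
range_in_subquery = """
SELECT ev.ev_id, m.pnt
FROM ev
LEFT JOIN (
  SELECT e.ev_id, l.pnt
  FROM ev e JOIN link l ON e.acct = l.acct
  WHERE e.ts >= l.start_dt AND e.ts < l.end_dt
) m ON ev.ev_id = m.ev_id
"""

# Naive version: range test in the outer WHERE drops unmatched rows.
range_in_where = """
SELECT ev.ev_id, l.pnt
FROM ev
LEFT JOIN link l ON ev.acct = l.acct
WHERE ev.ts >= l.start_dt AND ev.ts < l.end_dt
"""

expected = sorted(conn.execute(range_in_on).fetchall(), key=repr)
rewritten = sorted(conn.execute(range_in_subquery).fetchall(), key=repr)
naive = sorted(conn.execute(range_in_where).fetchall(), key=repr)
print(expected == rewritten, naive)
```

The rewrite relies on the left table having a unique key. The Hive language manual lists complex (non-equality) expressions in ON as supported from Hive 2.2.0, so on newer clusters the original form may work unchanged.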
I currently have an R program that pulls data from a warehouse and creates a set of reports. The data querying uses the RODBC library in R, and at the moment I have just the following query code:
manifest_query <- "
select
bkg.syn_sail_key,
bkg.sail_ship_cd,
bkg.sail_sail_dte,
bkg.sail_dura_days,
bkg.book_num,
bkg.pax_seq_num,
bkg.rdate,
bkg.cabin,
bkg.cabin_book_num,
bkg.meta_category_code,
bkg.rate_code,
bkg.person_num,
person.past_guest_num,
person.first_name,
person.last_name,
person.gender,
person.date_of_birth,
person.tier_code
from
dw.bkg_contact_fact bkg
inner join dw_crm.person_dim person
on person.person_num = bkg.person_num
where
bkg.cancelled_pax_ind = 'N' and
bkg.sail_sail_dte >= to_date('%start_date%', 'YYYY-MM-DD') and
bkg.sail_sail_dte <= to_date('%end_date%', 'YYYY-MM-DD')
"
I need to add 3 more columns to this output table from 2 additional queries that I currently run in the SAS code below. The first one only adds a computed MIN column to the result of the above query:
PROC SQL;
CREATE TABLE WORK."NEXT_SAIL"n AS
SELECT t1.SYN_SAIL_KEY,
t1.SAIL_DURA_DAYS,
t1.SAIL_SAIL_DTE,
t1.SAIL_SHIP_CD,
t1.RDATE,
t1.CABIN_BOOK_NUM,
t1.BOOK_NUM,
t1.PAX_SEQ_NUM,
t1.CABIN,
t1.RATE_CODE,
t1.PERSON_NUM,
t1.FIRST_NAME,
t1.LAST_NAME,
t1.PAST_GUEST_NUM,
t1.GENDER,
t1.DATE_OF_BIRTH,
t1.TIER_CODE,
/* MIN_of_SAIL_SAIL_DTE */
(MIN(t2.SAIL_SAIL_DTE)) FORMAT=DATETIME20. AS MIN_of_SAIL_SAIL_DTE
FROM WORK.QUERY_FOR_BKG_CONTACT_FACT t1
LEFT JOIN DW.BKG_CONTACT_FACT t2 ON (t1.PERSON_NUM = t2.PERSON_NUM) AND (t1.SAIL_SAIL_DTE < t2.SAIL_SAIL_DTE)
GROUP BY t1.SYN_SAIL_KEY,
t1.SAIL_DURA_DAYS,
t1.SAIL_SAIL_DTE,
t1.SAIL_SHIP_CD,
t1.RDATE,
t1.CABIN_BOOK_NUM,
t1.BOOK_NUM,
t1.PAX_SEQ_NUM,
t1.CABIN,
t1.RATE_CODE,
t1.PERSON_NUM,
t1.FIRST_NAME,
t1.LAST_NAME,
t1.PAST_GUEST_NUM,
t1.GENDER,
t1.DATE_OF_BIRTH,
t1.TIER_CODE
ORDER BY t1.SAIL_SAIL_DTE;
QUIT;
The final one adds the last two columns, but the 'PAX_CANCEL...' condition must be included as a join criterion:
PROC SQL;
CREATE TABLE WORK.QUERY_FOR_NEXT_SAIL_0002 AS
SELECT t1.SYN_SAIL_KEY,
t1.SAIL_DURA_DAYS,
t1.SAIL_SAIL_DTE,
t1.SAIL_SHIP_CD,
t1.RDATE,
t1.CABIN_BOOK_NUM,
t1.BOOK_NUM,
t1.PAX_SEQ_NUM,
t1.CABIN,
t1.RATE_CODE,
t1.PERSON_NUM,
t1.FIRST_NAME,
t1.LAST_NAME,
t1.PAST_GUEST_NUM,
t1.GENDER,
t1.DATE_OF_BIRTH,
t1.MIN_of_SAIL_SAIL_DTE,
t2.SAIL_SHIP_CD AS SAIL_SHIP_CD1,
t2.RATE_CODE AS RATE_CODE1
FROM WORK.NEXT_SAIL t1
LEFT JOIN DW.BKG_CONTACT_FACT t2 ON (t1.PERSON_NUM = t2.PERSON_NUM) AND (t1.MIN_of_SAIL_SAIL_DTE = t2.SAIL_SAIL_DTE) AND (t2.PAX_CANCEL_DTE IS MISSING);
QUIT;
It's easy to do this in SAS, as the process flow lets you use 'work' tables easily, but my questions are:
How do I convert the SAS queries to correct SQL syntax that R will read?
In that code, can I nest the queries into one block statement so I only need to run the query command once?
Or, how do I execute several queries to get the same table result that I am getting in SAS?
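One way to run the three steps as a single statement is to chain them as common table expressions (WITH ...), with each CTE playing the role of a SAS work table, and to replace the SAS-only IS MISSING with the standard IS NULL. The sketch below tries that shape in Python with the standard-library sqlite3 module as a stand-in for the warehouse. The trimmed-down table bkg, its rows, and the date window are hypothetical, and whether the real warehouse accepts WITH depends on its SQL dialect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE bkg (person_num INTEGER, sail_sail_dte TEXT, sail_ship_cd TEXT,
                  rate_code TEXT, pax_cancel_dte TEXT);
INSERT INTO bkg VALUES
  (1, '2020-01-10', 'AA', 'R1', NULL),
  (1, '2020-05-01', 'BB', 'R2', NULL),
  (1, '2020-09-01', 'CC', 'R3', NULL),
  (2, '2020-01-10', 'AA', 'R1', NULL),
  (2, '2020-04-01', 'DD', 'R4', '2020-02-01');
""")

query = """
WITH base AS (            -- first query: bookings in the report window
  SELECT person_num, sail_sail_dte, sail_ship_cd, rate_code
  FROM bkg
  WHERE sail_sail_dte BETWEEN '2020-01-01' AND '2020-01-31'
),
next_sail AS (            -- second query: earliest later sailing per booking
  SELECT b.person_num, b.sail_sail_dte, b.sail_ship_cd, b.rate_code,
         MIN(n.sail_sail_dte) AS min_of_sail_sail_dte
  FROM base b
  LEFT JOIN bkg n
    ON b.person_num = n.person_num AND b.sail_sail_dte < n.sail_sail_dte
  GROUP BY b.person_num, b.sail_sail_dte, b.sail_ship_cd, b.rate_code
)
SELECT s.person_num, s.min_of_sail_sail_dte,   -- third query: ship and rate
       t.sail_ship_cd AS sail_ship_cd1, t.rate_code AS rate_code1
FROM next_sail s
LEFT JOIN bkg t
  ON s.person_num = t.person_num
 AND s.min_of_sail_sail_dte = t.sail_sail_dte
 AND t.pax_cancel_dte IS NULL
ORDER BY s.person_num
"""
rows = conn.execute(query).fetchall()
print(rows)
```

In R, a query string of this shape could be passed once to RODBC's sqlQuery, in the same way as manifest_query. If the warehouse does not accept WITH, the same nesting can be written with derived tables in the FROM clause.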
I've got a query running that pulls out the records I need.
I want to run another query that pulls out all the other records (excluding the ones in the first query).
I've read up on NOT IN and NOT LIKE but can't seem to get them to work.
The first query is named: qryHunnersPatients
Here's the code for the second query that I have so far:
Right now this is just pulling all the records - but I want to exclude those records in the qryHunnersPatients query
SELECT
tblPatientHistoryBaseline.ID,
tblPatientHistoryBaseline.Age,
[tblPatientHistoryBaseline].[Age]-[tblPatientHistoryBaseline].[UrinarySxBegan] AS Duration,
tblPatientHistoryBaseline.IBS,
tblQuestionnaires.UPOINTTotal,
tblQuestionnaires.U,
tblQuestionnaires.P,
tblQuestionnaires.O,
tblQuestionnaires.I,
tblQuestionnaires.N,
tblQuestionnaires.T,
tblQuestionnaires.ICSITotal,
tblQuestionnaires.ICPITotal
FROM
tblPatientHistoryBaseline
INNER JOIN
tblQuestionnaires
ON
(tblPatientHistoryBaseline.Visit = tblQuestionnaires.Visit)
AND
(tblPatientHistoryBaseline.ID = tblQuestionnaires.ID);
UPDATE:
I just tried the WHERE NOT EXISTS using the code below:
SELECT
tblPatientHistoryBaseline.ID,
tblPatientHistoryBaseline.Age,
[tblPatientHistoryBaseline].[Age]-[tblPatientHistoryBaseline].[UrinarySxBegan] AS Duration,
tblPatientHistoryBaseline.IBS,
tblQuestionnaires.UPOINTTotal,
tblQuestionnaires.U,
tblQuestionnaires.P,
tblQuestionnaires.O,
tblQuestionnaires.I,
tblQuestionnaires.N,
tblQuestionnaires.T,
tblQuestionnaires.ICSITotal,
tblQuestionnaires.ICPITotal
FROM
tblPatientHistoryBaseline
INNER JOIN
tblQuestionnaires
ON
(tblPatientHistoryBaseline.Visit = tblQuestionnaires.Visit)
AND
(tblPatientHistoryBaseline.ID = tblQuestionnaires.ID)
WHERE NOT EXISTS
(SELECT ID
FROM qryHunnersPatients AS hunners
WHERE hunners.ID = tblPatientHistoryBaseline.ID);
You need a subquery. As I understand it, your query qryHunnersPatients gives you the list of records that you do not want to see, so you need to include it in the NOT IN part of the query.
SELECT
tblPatientHistoryBaseline.ID,
tblPatientHistoryBaseline.Age,
[tblPatientHistoryBaseline].[Age]-[tblPatientHistoryBaseline].[UrinarySxBegan] AS Duration,
tblPatientHistoryBaseline.IBS,
tblQuestionnaires.UPOINTTotal,
tblQuestionnaires.U,
tblQuestionnaires.P,
tblQuestionnaires.O,
tblQuestionnaires.I,
tblQuestionnaires.N,
tblQuestionnaires.T,
tblQuestionnaires.ICSITotal,
tblQuestionnaires.ICPITotal
FROM
tblPatientHistoryBaseline
INNER JOIN
tblQuestionnaires
ON
(tblPatientHistoryBaseline.Visit = tblQuestionnaires.Visit)
AND
(tblPatientHistoryBaseline.ID = tblQuestionnaires.ID)
WHERE
tblPatientHistoryBaseline.ID
NOT IN
(SELECT qryHunnersPatients.ID FROM qryHunnersPatients);
Assuming ID is unique, you can use WHERE NOT EXISTS:
SELECT {FieldList}
FROM tblPatientHistoryBaseline AS baseline
INNER JOIN tblQuestionnaires AS quest
ON (baseline.Visit = quest.Visit)
AND (baseline.ID = quest.ID)
WHERE NOT EXISTS (
SELECT ID
FROM qryHunnersPatients AS hunners
WHERE hunners.ID = baseline.ID
);
You don't need to use the aliases I've added; they are only there for readability.
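Both answers express the same exclusion. One difference worth knowing in standard SQL is that NOT IN returns no rows at all once the subquery yields a NULL, while NOT EXISTS is unaffected. The sketch below shows this in Python with the standard-library sqlite3 module; the tables patients and hunners are hypothetical stand-ins for tblPatientHistoryBaseline and qryHunnersPatients, and the behaviour in Access should be checked separately.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE patients (ID INTEGER, Age INTEGER);
CREATE TABLE hunners (ID INTEGER);
INSERT INTO patients VALUES (1, 40), (2, 55), (3, 61);
INSERT INTO hunners VALUES (2);
""")

not_in = "SELECT ID FROM patients WHERE ID NOT IN (SELECT ID FROM hunners) ORDER BY ID"
not_exists = """
SELECT p.ID FROM patients AS p
WHERE NOT EXISTS (SELECT 1 FROM hunners AS h WHERE h.ID = p.ID)
ORDER BY p.ID
"""

def ids(sql):
    """Run a query and return the first column as a list."""
    return [row[0] for row in conn.execute(sql)]

before = (ids(not_in), ids(not_exists))

# A NULL in the excluded list makes every NOT IN comparison unknown.
conn.execute("INSERT INTO hunners VALUES (NULL)")
after = (ids(not_in), ids(not_exists))
print(before, after)
```

If the excluded ID column can never be NULL, the two forms return the same rows and the choice is mostly one of readability.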
I'm working on a DB2 stored procedure and am having a little trouble getting the results I want. The problem with the following query is that it does not return rows from table A that fail the final condition in the WHERE clause. I would like to receive all rows from table A that meet the first condition (WHERE A.GENRC_CD_TYPE = 'MDAA'), and then add an email column from table B for each of those rows (WHERE A.DESC_30 = B.MATL_PLNR_ID).
SELECT A.GENRC_CD,
A.DESC_30,
A.DOL,
A.DLU,
A.LU_LID,
B.EMAIL_ID_50
FROM GENRCCD A,
MPPLNR B
WHERE A.GENRC_CD_TYPE = 'MDAA'
AND (A.DESC_30) = B.MATL_PLNR_ID;
Any help is much appreciated, thanks!
Then what you need is a LEFT JOIN:
SELECT A.GENRC_CD,
A.DESC_30,
A.DOL,
A.DLU,
A.LU_LID,
B.EMAIL_ID_50
FROM GENRCCD A
LEFT JOIN MPPLNR B
ON A.DESC_30 = B.MATL_PLNR_ID
WHERE A.GENRC_CD_TYPE = 'MDAA';
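The difference between the comma join in the question and the LEFT JOIN can be checked on toy data. The sketch below uses Python with the standard-library sqlite3 module as a stand-in for DB2; the rows are hypothetical, and only a few of the columns from the question are kept.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE GENRCCD (GENRC_CD TEXT, GENRC_CD_TYPE TEXT, DESC_30 TEXT);
CREATE TABLE MPPLNR (MATL_PLNR_ID TEXT, EMAIL_ID_50 TEXT);
INSERT INTO GENRCCD VALUES ('C1', 'MDAA', 'P1'), ('C2', 'MDAA', 'P9'), ('C3', 'OTHER', 'P1');
INSERT INTO MPPLNR VALUES ('P1', 'p1@example.com');
""")

# Comma join: the match on B sits in WHERE, so unmatched A rows disappear.
implicit_join = conn.execute("""
SELECT A.GENRC_CD, B.EMAIL_ID_50
FROM GENRCCD A, MPPLNR B
WHERE A.GENRC_CD_TYPE = 'MDAA' AND A.DESC_30 = B.MATL_PLNR_ID
ORDER BY A.GENRC_CD
""").fetchall()

# LEFT JOIN: the match on B sits in ON; WHERE only filters table A.
left_join = conn.execute("""
SELECT A.GENRC_CD, B.EMAIL_ID_50
FROM GENRCCD A
LEFT JOIN MPPLNR B ON A.DESC_30 = B.MATL_PLNR_ID
WHERE A.GENRC_CD_TYPE = 'MDAA'
ORDER BY A.GENRC_CD
""").fetchall()
print(implicit_join, left_join)
```

The row for C2 has no planner in MPPLNR, so it appears only in the LEFT JOIN result, with a NULL email.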