How do I write an IF/ELSE query in Hive SQL?

In my Hive SQL script I want to do:
IF {$user_choice = 1}
SELECT a,b,c FROM Table1;
ELSE
SELECT d,e,f FROM Table1;
Where user_choice is a parameter that Hive prompts for when the query is run.
What's the right syntax for this please? Or is there another way to achieve the same thing if I'm thinking about it wrong?
I'm trying to do this via the Hue editor, if that makes a difference.
Thanks

If the column types are the same, you can use CASE expressions or UNION ALL in Hive:
SELECT
case when ${user_choice} = 1 then a else d end col1,
case when ${user_choice} = 1 then b else e end col2,
case when ${user_choice} = 1 then c else f end col3
FROM table1;
OR
SELECT a, b, c FROM table1 WHERE ${user_choice} = 1
UNION ALL
SELECT d, e, f FROM table1 WHERE ${user_choice} = 2
If the data types of the columns differ, or the two scripts are completely different, call Hive from a shell script instead:
if [[ ${user_choice} == "1" ]] ; then
hive -e "query one"
else
hive -e "query two"
fi
Column names can also be parameterized: https://stackoverflow.com/a/55805764/2700344
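To make the difference between the two single-query options concrete, here is a minimal sketch that runs both against an in-memory SQLite table from Python. SQLite stands in for Hive purely for illustration, the table contents are made up, and the bound parameter `:choice` plays the role of `${user_choice}` (Hive substitutes the variable as text rather than binding it).

```python
import sqlite3

# Hypothetical stand-in for Table1; SQLite replaces Hive for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (a TEXT, b TEXT, c TEXT, d TEXT, e TEXT, f TEXT)")
conn.execute("INSERT INTO table1 VALUES ('a1', 'b1', 'c1', 'd1', 'e1', 'f1')")

# Option 1: pick each output column with a CASE expression.
CASE_QUERY = """
SELECT CASE WHEN :choice = 1 THEN a ELSE d END AS col1,
       CASE WHEN :choice = 1 THEN b ELSE e END AS col2,
       CASE WHEN :choice = 1 THEN c ELSE f END AS col3
FROM table1
"""

# Option 2: two branches, each switched on or off by its WHERE clause.
UNION_QUERY = """
SELECT a, b, c FROM table1 WHERE :choice = 1
UNION ALL
SELECT d, e, f FROM table1 WHERE :choice = 2
"""


def run(query, user_choice):
    """Run a query with user_choice bound as the :choice parameter."""
    return conn.execute(query, {"choice": user_choice}).fetchall()


print(run(CASE_QUERY, 1))   # columns a, b, c
print(run(UNION_QUERY, 2))  # columns d, e, f
```

One behavioural difference worth knowing: for any value other than 1, the CASE version falls through to the ELSE branch, whereas the UNION ALL version returns no rows at all unless the value is exactly 1 or 2.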

Related

SQL Update using Case results

I need to update a value in file1 with the contents of a field in another file2 with a matching key, but only if a row is found in file2 that matches. Otherwise, update file1 field with a 'Q' literal.
This works, but seems redundant, and takes too long? Suggestions?
update ZXU
set XUATTN = case when (select count(*) from ZXK
where XKUSER = 'TOMTEST') > 0
then (select XKAUTH from ZXK
where XKUSER = 'TOMTEST')
else 'Q'
end
where XUUSER='TOMTEST'
You can use COALESCE():
update ZXU
set XUATTN = COALESCE( (select k.XKAUTH from ZXK k where k.XKUSER = ZXU.XUUSER), 'Q')
where XUUSER = 'TOMTEST';
I will note that this (and your code) will generate an error if the subquery returns more than one row.
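As a quick check of the COALESCE approach, here is a small sketch using Python's built-in sqlite3 module. SQLite is only a stand-in for the asker's database, and the rows (including the 'NOMATCH' user) are invented for the example.

```python
import sqlite3

# In-memory stand-ins for ZXU and ZXK with made-up rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ZXU (XUUSER TEXT PRIMARY KEY, XUATTN TEXT);
CREATE TABLE ZXK (XKUSER TEXT PRIMARY KEY, XKAUTH TEXT);
INSERT INTO ZXU VALUES ('TOMTEST', NULL), ('NOMATCH', NULL);
INSERT INTO ZXK VALUES ('TOMTEST', 'A');
""")

# The scalar subquery yields NULL when no ZXK row matches,
# and COALESCE then substitutes the 'Q' literal.
UPDATE_SQL = """
UPDATE ZXU
SET XUATTN = COALESCE(
    (SELECT k.XKAUTH FROM ZXK k WHERE k.XKUSER = ZXU.XUUSER), 'Q')
WHERE XUUSER = ?
"""


def update_attn(user):
    """Apply the update for one user and return the resulting XUATTN."""
    conn.execute(UPDATE_SQL, (user,))
    row = conn.execute("SELECT XUATTN FROM ZXU WHERE XUUSER = ?", (user,)).fetchone()
    return row[0]


print(update_attn("TOMTEST"))  # value copied from ZXK
print(update_attn("NOMATCH"))  # falls back to 'Q'
```

Note one subtle difference from the original CASE version: if a matching ZXK row exists but its XKAUTH is NULL, COALESCE also yields 'Q', while the count-based CASE would have written NULL.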

Nested case conditionals in PostgreSQL, syntax error?

I use PostgreSQL and I would like to combine these two case conditions, but when I uncomment the code, I get a syntax error. This is not a difficult instruction. I want to divide or multiply the obtained number depending on the condition. How can I enter it so that the code is compiled?
SELECT
SUM(
CASE
WHEN transactions.recipient_account_bill_id = recipientBill.id AND recipientBill.user_id = 2
THEN 1
ELSE -1
END * transactions.amount_money
/* CASE
WHEN senderBillCurrency.id = recipientBillCurrency.id
THEN NULL
ELSE
CASE
WHEN recipientBillCurrency.base = true
THEN /
ELSE *
END senderBillCurrency.current_exchange_rate
END */
) as TOTAL
FROM transactions
LEFT JOIN bills AS senderBill ON senderBill.id = transactions.sender_account_bill_id
LEFT JOIN bills AS recipientBill ON recipientBill.id = transactions.recipient_account_bill_id
LEFT JOIN currency as senderBillCurrency ON senderBillCurrency.id = senderBill.currency_id
LEFT JOIN currency as recipientBillCurrency ON recipientBillCurrency.id = recipientBill.currency_id
WHERE 2 IN (senderBill.id, recipientBill.id)
You cannot create dynamic SQL expressions like that. A CASE expression cannot return an operator as if SQL was some sort of macro language. You can only return an expression.
You already used the following approach using 1 and -1 with your multiplication. Why not also use it with N and 1/N:
<some expression> * CASE WHEN condition THEN 1 / N ELSE N END
Or in your case:
<some expression> * CASE
WHEN senderBillCurrency.id = recipientBillCurrency.id THEN 1
WHEN recipientBillCurrency.base THEN 1 / senderBillCurrency.current_exchange_rate
ELSE senderBillCurrency.current_exchange_rate
END
Note that you can put several WHEN clauses in a single CASE expression, so there is no need to nest them.
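The "multiply by a factor chosen by CASE" idea can be tried in isolation. The sketch below uses Python's sqlite3 module rather than PostgreSQL, and the function name `convert` and its parameters are hypothetical. It writes `1.0 / rate` instead of `1 / rate` so that an integer rate cannot trigger integer division.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# CASE returns a numeric factor (1, 1/rate, or rate), never an operator.
FACTOR_SQL = """
SELECT :amount * CASE
    WHEN :sender_currency = :recipient_currency THEN 1.0
    WHEN :recipient_is_base THEN 1.0 / :rate
    ELSE :rate
END
"""


def convert(amount, sender_currency, recipient_currency, recipient_is_base, rate):
    """Return amount scaled by the factor the CASE expression selects."""
    params = {
        "amount": amount,
        "sender_currency": sender_currency,
        "recipient_currency": recipient_currency,
        "recipient_is_base": int(recipient_is_base),  # SQLite has no boolean type
        "rate": rate,
    }
    return conn.execute(FACTOR_SQL, params).fetchone()[0]


print(convert(100, 1, 1, False, 4.0))  # same currency: factor 1
print(convert(100, 1, 2, True, 4.0))   # recipient is base: divide by rate
print(convert(100, 1, 2, False, 4.0))  # otherwise: multiply by rate
```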

Difference between querying from Impala and querying from Hive?

I have a Hive source table which contains:
select count(*) from dev_lkr_send.pz_send_param_ano;
--25283 lines
I am trying to get all of the table lines and put them into a dataframe using Spark2-Scala. I did the following:
val dfMet = spark.sql(s"""SELECT
CD_ANOMALIE,
CD_FAMILLE,
libelle AS LIB_ANOMALIE,
to_date(substr(MAJ_DATE, 1, 19), 'YYYY-MM-DD HH24:MI:SS') AS DT_MAJ,
CLASSIFICATION,
NB_REJEUX,
case when indic_cd_erreur = 'O' then 1 else 0 end AS TOP_INDIC_CD_ERREUR,
case when invalidation_coordonnee = 'O' then 1 else 0 end AS TOP_COORDONNEE_INVALIDE,
case when typ_mvt = 'S' then 1 else 0 end AS TOP_SUPP,
case when typ_mvt = 'S' then to_date(substr(dt_capt, 1, 19), 'YYYY-MM-DD HH24:MI:SS') else null end AS DT_SUPP
FROM ${use_database}.pz_send_param_ano""")
When I execute dfMet.count() it returns: 46314
Any ideas about the source of the difference?
EDIT1:
Trying the same query from Hive returns the same value as in the dataframe (I was querying from Impala UI before).
Can someone explain the difference, please? I am working on Hue 4.
One potential source of the difference is that your Hive query returns the count from the metastore statistics, which may be out of date, rather than running a fresh count against the table.
If hive.compute.query.using.stats is set to true and the table has statistics computed, the count is answered from the metastore. In that case your statistics may be stale and you need to recompute them.

CASE with IN clause

I have a condition (in Business Objects) as below:
=If([Actual ]=0;Sum([Applied] In ([Project];[ Name];[Number];[Sub ])))
which I need to convert to a CASE statement for OBIEE.
Below is the query I had tried but it doesn't work:
SELECT
CASE
WHEN F.ACTUAL_EQP_COST = 0
THEN SUM((F.HRS_APPLIED) IN(F.PROJECT,F.NAME,F.NUMBER,F.SUB))
ELSE 0
END
FROM F
Probably
select SUM(F.HRS_APPLIED)
from DTS_OSC_WIP_REP_CST_ANALYSIS_A F
where F.ACTUAL_EQP_COST = 0
group by F.PROJECT,F.WIP_ENTITY_NAME,F.OPERATION_NUMBER,F.OPERATION_SUB_SEQ
You can't use IN inside a THEN clause like that.
I'm not sure I fully understand the requirement, but something along these lines might help:
select
CASE WHEN F.ACTUAL_EQP_COST = 0
THEN (SELECT SUM(F1.HRS_APPLIED)
FROM DTS_OSC_WIP_REP_CST_ANALYSIS_A F1
WHERE F1.column_name IN (F.PROJECT,F.WIP_ENTITY_NAME,F.OPERATION_NUMBER,F.OPERATION_SUB_SEQ))
ELSE 0 END
from DTS_OSC_WIP_REP_CST_ANALYSIS_A F

Query returns an empty set when GROUP BY is used (Oracle SQL)

I am creating a script that takes three parameters as input from a user,
and I would like to check whether the parameters are given and whether table_name exists in the database.
The problem is that, because I am using GROUP BY,
the result is empty when the given table has no columns.
My code is:
SELECT COUNT(1),
Case
WHEN COUNT(1) > 0 THEN
NVL2(:a,
NVL2(:b,
NVL2(:name,
TO_CLOB('code1')
,'Error : name is required')
,'Error : b is required')
,'Error : a is required')
ELSE
TO_CLOB('Error : table name does not exist')
END
FROM USER_TAB_COLUMNS
WHERE TABLE_NAME=UPPER(:name)
GROUP BY TABLE_NAME;
Could you help me please ?
Thanks in advance
You'd need to create a dummy table that contains one row that outputs the parameter passed in, and then left join the above query to that. E.g.:
select count(utc.table_name),
case when count(utc.table_name) > 0 then
nvl2(:a, nvl2(:b, nvl2(:name, to_clob('code1'),
'Error : name is required'),
'Error : b is required'),
'Error : a is required')
else to_clob('Error : table name does not exist')
end
from (select upper(:name) table_name from dual) d
left outer join user_tab_columns utc on (d.table_name = utc.table_name)
group by d.table_name;
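To see why the outer join guarantees a result row, here is a reduced sketch run through Python's sqlite3 module. SQLite stands in for Oracle, `tab_columns` is a hypothetical substitute for USER_TAB_COLUMNS, and the nested NVL2 checks are collapsed to a plain 'code1' to keep the focus on the join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical stand-in for Oracle's USER_TAB_COLUMNS view.
CREATE TABLE tab_columns (table_name TEXT, column_name TEXT);
INSERT INTO tab_columns VALUES ('EMP', 'ID'), ('EMP', 'NAME');
""")

# The one-row derived table d always survives the LEFT OUTER JOIN,
# so GROUP BY has a group to report even when no column rows match.
CHECK_SQL = """
SELECT COUNT(utc.table_name),
       CASE WHEN COUNT(utc.table_name) > 0 THEN 'code1'
            ELSE 'Error : table name does not exist'
       END
FROM (SELECT UPPER(:name) AS table_name) d
LEFT OUTER JOIN tab_columns utc ON d.table_name = utc.table_name
GROUP BY d.table_name
"""


def check(name):
    """Return (column_count, message) for the given table name."""
    return conn.execute(CHECK_SQL, {"name": name}).fetchone()


print(check("emp"))      # existing table: count of its columns and 'code1'
print(check("missing"))  # unknown table: still one row, with the error text
```

Counting `utc.table_name` rather than `COUNT(1)` matters here: the unmatched outer-join row carries NULL in that column, so it contributes 0 to the count instead of 1.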