Selecting a Value Between a Range of Values in Hive

I want to select the rows of a rate table whose rate lies between the rates of two given currencies.
I used subqueries in the WHERE clause, but Hive does not support them:
select r1.currency_code, r1.rate
from rate r1
where r1.rate >= (select r2.rate from rate r2 where r2.currency_code = 'GBP')
and r1.rate <= (select r3.rate from rate r3 where r3.currency_code = 'ILS')
Every rate between the GBP rate and the ILS rate should be selected, but Hive rejects the query.

Use conditional aggregation, here applied as window functions, to get the GBP and ILS rates into columns before filtering:
select base_currency,currency_code,rate
from (select r.*
,max(case when currency_code='GBP' then rate end) over(partition by base_currency) as gbp_rate
,max(case when currency_code='ILS' then rate end) over(partition by base_currency) as ils_rate
from rate r
where base_currency = 'USD'
) r
where rate >= gbp_rate and rate <= ils_rate
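The windowed conditional aggregation above can be tried out locally. The following is a minimal sketch that runs the same query through Python's built-in sqlite3 module rather than Hive, assuming SQLite 3.25 or later for window function support. The sample rows and the base_currency values are made up for illustration.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for Hive here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE rate (base_currency TEXT, currency_code TEXT, rate REAL)"
)
conn.executemany(
    "INSERT INTO rate VALUES (?, ?, ?)",
    [
        ("USD", "GBP", 0.8),
        ("USD", "EUR", 0.9),
        ("USD", "ILS", 3.5),
        ("USD", "JPY", 150.0),
    ],
)

# Every row carries the GBP and ILS rates as extra columns,
# so the outer query can filter without any subquery in WHERE.
query = """
SELECT base_currency, currency_code, rate
FROM (
    SELECT r.*,
           MAX(CASE WHEN currency_code = 'GBP' THEN rate END)
               OVER (PARTITION BY base_currency) AS gbp_rate,
           MAX(CASE WHEN currency_code = 'ILS' THEN rate END)
               OVER (PARTITION BY base_currency) AS ils_rate
    FROM rate r
    WHERE base_currency = 'USD'
) r
WHERE rate >= gbp_rate AND rate <= ils_rate
ORDER BY rate
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With this sample data the JPY row falls outside the GBP..ILS range and is filtered out.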

Related

SQL: LEFT JOIN based on effective date period of another table

I have two tables: [transaction_table] (t) and [rate_table] (r).
I want to LEFT JOIN [rate_table] to [transaction_table] on the product, picking the rate that was in effect on t.transaction_date according to r.effective_date.
Does anyone know how? Thanks in advance.
Here's my code, but it returns an undesired outcome:
SELECT t.*, r.rate
FROM [transaction_table] t
LEFT JOIN [rate_table] r on (t.product = r.product and t.transaction_date >= r.effective_date)
Desired Outcome: Transaction Table LEFT JOIN Rate Table, with rate according to the effective_date
transaction_date  product  amt  rate
2020-01-01        A        200  0.2
2020-04-01        A        200  0.3
2020-04-01        B        100  0.1
2021-01-01        A        200  0.5
[Transaction_Table]: contains all transactions of different products
transaction_date  product  amt
2020-01-01        A        200
2020-04-01        A        200
2020-04-01        B        100
2021-01-01        A        200
[Rate_Table]: contains rate adjustments of different products with an "effective_date"
effective_date  product  rate
2019-01-01      A        0.2
2019-01-01      B        0.1
2020-04-01      A        0.3
2020-09-01      A        0.5
You are joining all rates that took effect on or before the transaction date, while you only want the newest of these. You can achieve this with a TOP(1) query in an OUTER APPLY:
select t.*, r.rate
from transaction_table t
outer apply
(
select top(1) *
from rate_table r
where r.product = t.product
and r.effective_date <= t.transaction_date
order by r.effective_date desc
) r;
or in a subquery in the SELECT clause:
select
t.*,
(
select top(1) r.rate
from rate_table r
where r.product = t.product
and r.effective_date <= t.transaction_date
order by r.effective_date desc
) as rate
from transaction_table t;
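To check the "latest rate on or before the transaction date" logic against the sample data from the question, here is a small sketch using Python's sqlite3 module. SQLite has neither TOP nor APPLY, so this assumes the equivalent ORDER BY ... LIMIT 1 form of the correlated subquery. It illustrates the idea and is not SQL Server syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transaction_table (transaction_date TEXT, product TEXT, amt INTEGER)"
)
conn.execute(
    "CREATE TABLE rate_table (effective_date TEXT, product TEXT, rate REAL)"
)
# Sample rows taken from the question.
conn.executemany(
    "INSERT INTO transaction_table VALUES (?, ?, ?)",
    [
        ("2020-01-01", "A", 200),
        ("2020-04-01", "A", 200),
        ("2020-04-01", "B", 100),
        ("2021-01-01", "A", 200),
    ],
)
conn.executemany(
    "INSERT INTO rate_table VALUES (?, ?, ?)",
    [
        ("2019-01-01", "A", 0.2),
        ("2019-01-01", "B", 0.1),
        ("2020-04-01", "A", 0.3),
        ("2020-09-01", "A", 0.5),
    ],
)

# For each transaction, pick the newest rate whose effective_date
# is not later than the transaction date (LIMIT 1 replaces TOP(1)).
query = """
SELECT t.transaction_date, t.product, t.amt,
       (SELECT r.rate
        FROM rate_table r
        WHERE r.product = t.product
          AND r.effective_date <= t.transaction_date
        ORDER BY r.effective_date DESC
        LIMIT 1) AS rate
FROM transaction_table t
ORDER BY t.transaction_date, t.product
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The rows printed match the desired outcome table in the question. The ISO date strings compare correctly as text, which is why plain TEXT columns suffice in this sketch.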
You can use the APPLY operator to get, for each transaction, the rate of the same product with the latest effective_date on or before the transaction date:
SELECT t.*, r.rate
FROM [transaction_table] t
CROSS APPLY
(
SELECT TOP (1) r.rate
FROM [rate_table] r
WHERE t.product = r.product
AND t.transaction_date >= r.effective_date
ORDER BY r.effective_date DESC
) r
You may want to use OUTER APPLY instead of CROSS APPLY if a transaction might have no matching rate in rate_table: CROSS APPLY drops such transactions, while OUTER APPLY keeps them with a NULL rate.
You can either use a derived table in which you define an end date for the rate by getting the next effective date using LEAD(), i.e.
SELECT tt.transaction_date,
tt.product,
tt.amt,
rt.rate
FROM Transaction_Table AS tt
LEFT JOIN
( SELECT rt.effective_date,
rt.product,
rt.rate,
end_date = LEAD(rt.effective_date)
OVER(PARTITION BY rt.product ORDER BY rt.effective_date)
FROM rate_table AS rt
) AS rt
ON rt.product = tt.product
AND rt.effective_date <= tt.transaction_date
AND (rt.end_date > tt.transaction_date OR rt.end_date IS NULL);
Or you can use OUTER APPLY with TOP 1 and then order by effective_date to get the latest rate prior to the transaction date:
SELECT tt.transaction_date,
tt.product,
tt.amt,
rt.rate
FROM Transaction_Table AS tt
OUTER APPLY
( SELECT TOP (1) rt.rate
FROM rate_table AS rt
WHERE rt.product = tt.product
AND rt.effective_date <= tt.transaction_date
ORDER BY rt.effective_date DESC
) AS rt;
I would typically approach this using the first method, as it is more likely that your rate table is significantly smaller than your transaction table, but depending on your overall data and indexes you may find that OUTER APPLY performs better.
If you are dealing with very high volumes of data and performance is an issue then materialising your rate table will probably help, e.g.
IF OBJECT_ID(N'tempdb..#rate', 'U') IS NOT NULL
DROP TABLE #rate;
CREATE TABLE #rate
(
Product CHAR(1) NOT NULL, --Change type as necessary
FromDate DATE NOT NULL,
ToDate DATE NULL,
Rate DECIMAL(10, 2) NOT NULL, -- Change type as necessary
PRIMARY KEY (Product, FromDate)
);
INSERT #rate(Product, FromDate, ToDate, Rate)
SELECT rt.product,
rt.effective_date,
end_date = LEAD(rt.effective_date)
OVER(PARTITION BY rt.product ORDER BY rt.effective_date),
rt.rate
FROM rate_table AS rt;
SELECT tt.transaction_date,
tt.product,
tt.amt,
rt.rate
FROM Transaction_Table AS tt
LEFT JOIN #rate AS rt
ON rt.product = tt.product
AND rt.FromDate <= tt.transaction_date
AND (rt.ToDate > tt.transaction_date OR rt.ToDate IS NULL);
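The LEAD()-based interval join can be verified on the question's sample data. The sketch below runs it through Python's sqlite3 module, assuming SQLite 3.25 or later for window functions, and uses the standard `LEAD(...) AS end_date` form instead of SQL Server's `end_date = ...` alias syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transaction_table (transaction_date TEXT, product TEXT, amt INTEGER)"
)
conn.execute(
    "CREATE TABLE rate_table (effective_date TEXT, product TEXT, rate REAL)"
)
# Sample rows taken from the question.
conn.executemany(
    "INSERT INTO transaction_table VALUES (?, ?, ?)",
    [
        ("2020-01-01", "A", 200),
        ("2020-04-01", "A", 200),
        ("2020-04-01", "B", 100),
        ("2021-01-01", "A", 200),
    ],
)
conn.executemany(
    "INSERT INTO rate_table VALUES (?, ?, ?)",
    [
        ("2019-01-01", "A", 0.2),
        ("2019-01-01", "B", 0.1),
        ("2020-04-01", "A", 0.3),
        ("2020-09-01", "A", 0.5),
    ],
)

# Each rate row becomes a half-open interval [effective_date, end_date);
# the newest rate of a product has end_date NULL, meaning "still valid".
query = """
SELECT tt.transaction_date, tt.product, tt.amt, rt.rate
FROM transaction_table AS tt
LEFT JOIN (
    SELECT effective_date, product, rate,
           LEAD(effective_date)
               OVER (PARTITION BY product ORDER BY effective_date) AS end_date
    FROM rate_table
) AS rt
  ON rt.product = tt.product
 AND rt.effective_date <= tt.transaction_date
 AND (rt.end_date > tt.transaction_date OR rt.end_date IS NULL)
ORDER BY tt.transaction_date, tt.product
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Because the intervals of one product never overlap, each transaction matches at most one rate row, so the join cannot duplicate transactions.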

SQL case statement, filter by date, group by id

I have two tables: transactions and currency. Transactions table contains transactions:
id date client_id currency amount
2 '2017-07-18' 29 'EURO' 340
3 '2018-08-09' 34 'RUB' 5000
The currency table contains exchange rates from EURO or USD to RUB; the 1st row, for example, means that 1 EURO = 70 RUB. For weekends there are no values, as banks are closed, and for calculations I need to use the Friday exchange rates:
date currency value
'2017-08-07' 'EURO' 70
'2018-08-07' 'USD' 60
'2018-09-09' 'USD' NULL
So I need to calculate the amount spent by every client in RUB, if possible without window functions.
I tried to use CASE WHEN and GROUP BY client_id, but then I need to apply the exchange rate that was valid at the time of each transaction, and I don't know how to do that.
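select t.*, amount * coalesce((select value
from currency c
where c.currency = t.currency
and c.value is not null
and c.date <= t.date order by c.date desc limit 1),
1) as rub
from transactions t
This assumes that a transaction with no matching currency row is already in RUB, so it uses 1 as the exchange rate. The value is not null filter skips the empty weekend rows, so the most recent available rate (Friday's) is used.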
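The correlated-subquery approach extends naturally to the per-client total that the question asks for. Below is a minimal sketch using Python's sqlite3 module. The rows are hypothetical (2018-08-10 is a Friday, 2018-08-11 a Saturday with a NULL rate), and treating a currency without any rate row as RUB is the same assumption the answer makes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions "
    "(id INTEGER, date TEXT, client_id INTEGER, currency TEXT, amount REAL)"
)
conn.execute("CREATE TABLE currency (date TEXT, currency TEXT, value REAL)")
# Hypothetical sample data for illustration only.
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?, ?, ?)",
    [
        (1, "2018-08-11", 29, "EURO", 100),  # Saturday: falls back to Friday
        (2, "2018-08-09", 34, "RUB", 5000),  # no rate row: factor 1
        (3, "2018-08-10", 29, "USD", 10),
    ],
)
conn.executemany(
    "INSERT INTO currency VALUES (?, ?, ?)",
    [
        ("2018-08-10", "EURO", 70),
        ("2018-08-11", "EURO", None),  # weekend, banks closed
        ("2018-08-10", "USD", 60),
    ],
)

# No window functions: the subquery picks the latest non-NULL rate
# on or before the transaction date, and SUM aggregates per client.
query = """
SELECT t.client_id,
       SUM(t.amount * COALESCE(
           (SELECT c.value
            FROM currency c
            WHERE c.currency = t.currency
              AND c.value IS NOT NULL
              AND c.date <= t.date
            ORDER BY c.date DESC
            LIMIT 1), 1)) AS rub_total
FROM transactions t
GROUP BY t.client_id
ORDER BY t.client_id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Client 29 ends up with 100 * 70 + 10 * 60 = 7600 RUB, and client 34 keeps the 5000 RUB unchanged.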
You can express this with a lateral join:
select t.*,
t.amount * c.value as rub
from transactions t left join lateral
(select c.*
from currency c
where c.currency = t.currency and
c.value is not null and
c.date <= t.date
order by c.date desc
fetch first 1 row only
) c on 1 = 1;

Q) Write a query to return Territory and corresponding Sales Growth (compare growth between periods Q4-2019 vs Q3-2019)

Q) Write a query to return Territory and corresponding Sales Growth (compare growth between periods Q4-2019 vs Q3-2019).
Tables given-
Cust_Sales: -Cust_id,product_sku,order_date,order_value,order_id,month
Cust_Territory: cust_id,territory_id,customer_city,customer_pincode
Use tables FCT_CUSTOMER_SALES (which has sales for each Customer) and MAP_CUSTOMER_TERRITORY (which provides Territory-to-Customer mapping) for this question.
Output format-
TERRITORY_ID | SALES_GROWTH
My solution-
Select ((q2.claims - q1.claims)/q1.claims * 100) AS SALES_GROWTH , c.territory_id
From
(select sum(s.order_value) from FCT_CUSTOMER_SALES s inner join MAP_CUSTOMER_TERRITORY c on s.customer_id=c.customer_id where s.order_datetime between 1/07/2019 and 30/09/2019 group by c.territory_id) as q1.claims,
(select sum(s.order_value) from FCT_CUSTOMER_SALES s inner join MAP_CUSTOMER_TERRITORY c on s.customer_id=c.customer_id where s.order_datetime between 1/10/2019 and 31/12/2019 group by c.territory_id) as q2.claims
Group by c.territory_id
My solution is marked as incorrect. Could anyone help me with the solution and point out where my mistake is?
One option uses conditional aggregation. The idea is to filter the table on the two quarters at once, then use case expressions within the sum() aggregate function to compute the sales of each of them:
select
c.territory_id,
100.0 * ( sum(case when s.order_date >= date '2019-10-01' then s.order_value end)
- sum(case when s.order_date < date '2019-10-01' then s.order_value end)
) / sum(case when s.order_date < date '2019-10-01' then s.order_value end) as sales_growth
from fct_customer_sales s
inner join map_customer_territory c on s.customer_id = c.customer_id
where s.order_date >= date '2019-07-01' and s.order_date < date '2020-01-01'
group by c.territory_id
You did not say which database you are using, and date features are highly vendor-dependent. This uses the standard DATE syntax to declare the literal dates; you might need to adapt that if your database does not support it.
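As a quick check of the conditional-aggregation idea for Q4-2019 versus Q3-2019, here is a sketch that runs through Python's sqlite3 module. The table contents are invented and the column names (customer_id, order_date, order_value, territory_id) are assumptions based on the question. Plain ISO date strings replace DATE literals because SQLite compares them as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fct_customer_sales "
    "(customer_id INTEGER, order_date TEXT, order_value REAL)"
)
conn.execute(
    "CREATE TABLE map_customer_territory (customer_id INTEGER, territory_id TEXT)"
)
# Hypothetical sample data for illustration only.
conn.executemany(
    "INSERT INTO map_customer_territory VALUES (?, ?)",
    [(1, "T1"), (2, "T1"), (3, "T2")],
)
conn.executemany(
    "INSERT INTO fct_customer_sales VALUES (?, ?, ?)",
    [
        (1, "2019-08-15", 100),  # Q3, T1
        (2, "2019-09-01", 100),  # Q3, T1
        (1, "2019-11-10", 300),  # Q4, T1
        (3, "2019-07-20", 200),  # Q3, T2
        (3, "2019-12-05", 150),  # Q4, T2
        (3, "2019-03-01", 999),  # outside both quarters, filtered out
    ],
)

# One pass over both quarters; CASE routes each order to its quarter.
query = """
SELECT c.territory_id,
       100.0 * (SUM(CASE WHEN s.order_date >= '2019-10-01' THEN s.order_value END)
              - SUM(CASE WHEN s.order_date <  '2019-10-01' THEN s.order_value END))
             / SUM(CASE WHEN s.order_date <  '2019-10-01' THEN s.order_value END)
           AS sales_growth
FROM fct_customer_sales s
INNER JOIN map_customer_territory c ON s.customer_id = c.customer_id
WHERE s.order_date >= '2019-07-01' AND s.order_date < '2020-01-01'
GROUP BY c.territory_id
ORDER BY c.territory_id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Here T1 grows from 200 to 300 (+50%) and T2 shrinks from 200 to 150 (-25%). A territory with no Q3 sales would get a NULL growth, since the denominator is NULL.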

Adding a currency conversion to a SQL query

I have a database with a list of user purchases.
I'm trying to extract a list of users whose last successful purchase had a value of £100 or greater, which I have done:
SELECT
t.purchase_id,
t.user_id,
t.purchase_date,
t.amount,
t.currency
FROM
transactions t
INNER JOIN
(SELECT user_id, MAX(purchase_date) AS first_transaction
FROM transactions
GROUP BY user_id) frst ON t.user_id = frst.user_id
AND t.purchase_date = frst.first_transaction
WHERE
amount >= 100
ORDER BY
user_id;
The problem is that some of my purchases are in USD and some are in CAD. I would like to ensure that the value of the latest purchase is at least £100 regardless of the purchase currency.
Luckily I have another table with exchange rates:
base_currency currency exchange_rate
-----------------------------------------------
GBP USD 1.220185624
GBP CAD 1.602048721
So technically I just need to convert the amount using the exchange rate. I've hit a roadblock on how I can incorporate that into my current query. I'm thinking I need to create an extra column for amount_in_gbp but am not sure how to incorporate the case logic into my query?
You can avoid joining to the exchange table by converting the amount with a correlated subquery in the WHERE clause:
SELECT t.purchase_id
,t.user_id
,t.purchase_date
,t.amount
,t.currency
FROM transactions t
INNER JOIN (
SELECT user_id
,MAX(purchase_date) AS first_transaction
FROM transactions
GROUP BY user_id
) frst ON t.user_id = frst.user_id
AND t.purchase_date = frst.first_transaction
WHERE (
SELECT t.amount / e.exchange_rate
FROM exchange AS e
WHERE t.currency = e.currency
) >= 100
ORDER BY user_id;
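This way the amount is converted to GBP before it is compared with 100. Note that a purchase already made in GBP has no row in the exchange table shown above, so the subquery returns NULL and that purchase is filtered out unless you add a GBP-to-GBP rate of 1.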
You can join to the exchange table:
SELECT t.*,
(t.amount / e.exchange_rate) as amount_gbp
FROM transactions t LEFT JOIN
exchange e
ON t.currency = e.currency AND e.base_currency = 'GBP'
If you want to put this in a where clause, you need to repeat the expression:
where (t.amount / e.exchange_rate) >= 100
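Putting the pieces together, the sketch below combines the "latest purchase per user" join with the exchange-rate join, run through Python's sqlite3 module. The sample rows and the round exchange rates are made up, and the COALESCE(..., 1.0) fallback for purchases already in GBP is an assumption that neither answer includes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions "
    "(purchase_id INTEGER, user_id INTEGER, purchase_date TEXT, "
    "amount REAL, currency TEXT)"
)
conn.execute(
    "CREATE TABLE exchange (base_currency TEXT, currency TEXT, exchange_rate REAL)"
)
# Hypothetical sample data; the rates are rounded for readability.
conn.executemany(
    "INSERT INTO exchange VALUES (?, ?, ?)",
    [("GBP", "USD", 1.25), ("GBP", "CAD", 1.5)],
)
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?, ?, ?)",
    [
        (1, 10, "2023-01-01", 200, "USD"),
        (2, 10, "2023-02-01", 100, "USD"),  # latest for user 10: 80 GBP
        (3, 11, "2023-01-05", 150, "CAD"),  # latest for user 11: 100 GBP
        (4, 12, "2023-01-07", 120, "GBP"),  # latest for user 12: already GBP
    ],
)

# LEFT JOIN keeps GBP purchases (no exchange row); COALESCE treats
# their missing rate as 1.0, which is an assumption of this sketch.
query = """
SELECT t.user_id,
       t.amount / COALESCE(e.exchange_rate, 1.0) AS amount_gbp
FROM transactions t
INNER JOIN (
    SELECT user_id, MAX(purchase_date) AS last_purchase
    FROM transactions
    GROUP BY user_id
) lst ON t.user_id = lst.user_id
     AND t.purchase_date = lst.last_purchase
LEFT JOIN exchange e
       ON t.currency = e.currency AND e.base_currency = 'GBP'
WHERE t.amount / COALESCE(e.exchange_rate, 1.0) >= 100
ORDER BY t.user_id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

User 10 is excluded because the latest purchase converts to 80 GBP, even though an earlier purchase was worth more.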

how to use columns from outer query in subquery to get result from another table?

I am trying to get an aggregate result from one table for each row of another table in Hive, using a subquery. I understand that Hive does not support subqueries in the SELECT clause, so I'm trying to use the subquery in the FROM clause, but it seems that Hive does not support correlated subqueries there either.
Here's the example: table A contains account transactions with two date columns (d1 and d2) and a currency column, along with other columns. Table B contains the currency rates for each day of the year. For each account, I want the sum of the exchange rate values in table B between dates d1 and d2. I'm trying something like this:
SELECT
account_no, currn, balance,
trans_date as d2, last_trans_date as d1, exchng_rt
FROM
acc AS A,
(SELECT sum(rate) exchng_rt
FROM currency
WHERE curr_type = A.currn
AND banking_date BETWEEN A.d1 AND A.d2) AS B
Here is a sample. Table A has account transactions and dates like:
account balance trans_date last_trans_date currency
abc 100 20-12-2016 20-11-2016 USD
abc 200 25-12-2016 20-12-2016 USD
def 500 15-11-2015 10-11-2015 AUD
def 600 20-11-2015 15-11-2015 AUD
and the table B is something like:
curr_type rate banking_date
USD 50.9 01-01-2016
USD 50.2 02-01-2016
USD 50.5 03-01-2016
AUD 50.9 01-01-2016
AUD 50.2 02-01-2016
AUD 50.5 03-01-2016
and so on, so table B contains the daily rate for each type of currency.
I think you can do what you want using JOIN and GROUP BY:
SELECT a.account_no, a.currn, a.balance, a.trans_date as d2, a.last_trans_date as d1,
SUM(rate) as exchng_rt
FROM acc a LEFT JOIN
currency c
ON c.curr_type = a.currn and c.banking_date between a.last_trans_date and a.trans_date
GROUP BY a.account_no, a.currn, a.balance, a.trans_date, a.last_trans_date;
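The JOIN plus GROUP BY rewrite can be exercised on a small data set. This is a sketch using Python's sqlite3 module in place of Hive. The rows are hypothetical, and the dates are assumed to be stored in ISO format (YYYY-MM-DD) so that BETWEEN compares them correctly as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE acc "
    "(account_no TEXT, currn TEXT, balance REAL, "
    "trans_date TEXT, last_trans_date TEXT)"
)
conn.execute(
    "CREATE TABLE currency (curr_type TEXT, rate REAL, banking_date TEXT)"
)
# Hypothetical sample data for illustration only.
conn.executemany(
    "INSERT INTO acc VALUES (?, ?, ?, ?, ?)",
    [
        ("abc", "USD", 100, "2016-01-03", "2016-01-01"),
        ("def", "AUD", 500, "2016-01-02", "2016-01-02"),
        ("ghi", "EUR", 50, "2016-01-03", "2016-01-01"),  # no rates at all
    ],
)
conn.executemany(
    "INSERT INTO currency VALUES (?, ?, ?)",
    [
        ("USD", 50.5, "2016-01-01"),
        ("USD", 50.25, "2016-01-02"),
        ("USD", 50.75, "2016-01-03"),
        ("AUD", 50.5, "2016-01-01"),
        ("AUD", 50.25, "2016-01-02"),
        ("AUD", 50.75, "2016-01-03"),
    ],
)

# The join condition plays the role of the correlated subquery:
# each account row is paired with the daily rates in its date range.
query = """
SELECT a.account_no, SUM(c.rate) AS exchng_rt
FROM acc a
LEFT JOIN currency c
       ON c.curr_type = a.currn
      AND c.banking_date BETWEEN a.last_trans_date AND a.trans_date
GROUP BY a.account_no, a.currn, a.balance, a.trans_date, a.last_trans_date
ORDER BY a.account_no
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The LEFT JOIN keeps account ghi in the output with a NULL sum. An inner join would drop it.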
You should specify the filter after joining the two tables, something like the following:
SELECT A.account_no,
A.currn,
A.balance,
A.trans_date as d2,
A.last_trans_date as d1,
B.exchng_rt
FROM acc as A
JOIN (SELECT sum(rate) as exchng_rt,
curr_type,
banking_date
FROM currency group by curr_type,
banking_date ) as B
ON A.currn = B.curr_type
WHERE B.banking_date between A.last_trans_date and A.trans_date