SAP HANA: Create a join with a max on date

I am busy with a SAP HANA development but ran into an issue with currency conversion.
On the left side of a join, I have a projection with the Sales Order Number, the Customer Requested Delivery Date and the order value in document currency (from VBAK/VBAP). On the right side, I have a projection on the TCURR table (from SAP), filtered on MAER (monthly average exchange rate), with the "from currency" joined to the document currency of the Sales Order. I need to convert the value from document currency to, say, EUR, but must use the latest exchange rate available in TCURR. How do I do the join? Effectively, I need to join the Sales Order date to max(exchange rate date), where the exchange rate date must be less than or equal to the Sales Order date.

Could you please check the following HANA SQLScript?
I used several CTEs (common table expressions) to get the most recent entry per month for each currency conversion to EUR,
and then joined the last CTE (cte3) to the VBAK table.
I did not actually convert the amount using the exchange rate; you can handle that with a multiplication or division in the SELECT list.
with cte as (
select
to_date( to_nvarchar(99999999 - gdatu) ) gdate,
*
from "SAPS4S".TCURR
where tcurr = 'EUR'
), cte2 as (
select
row_number() over (partition by fcurr, YEAR(gdate), MONTH(gdate) order by gdate desc) as rn,
YEAR(gdate) as gdate_year,
MONTH(gdate) as gdate_month,
*
from cte
), cte3 as (
select * from cte2 where rn = 1
)
select
vbeln,
erdat,
netwr,
waerk,
cte3.*
from "SAPS4S".VBAK as vbak
left join cte3
on
vbak.waerk = cte3.fcurr and
YEAR(vbak.erdat) = cte3.gdate_year and
MONTH(vbak.erdat) = cte3.gdate_month;
Hello Ernie,
Following your second comment, I changed the SQLScript query as follows:
with cte as (
select
to_date( to_nvarchar(99999999 - gdatu) ) gdate,
*
from "SAPABAP1".TCURR
where tcurr = 'EUR'
), cte2 as (
select
vbeln,
erdat,
netwr,
waerk,
sum(1) over (partition by vbeln order by gdate desc rows unbounded preceding) as rownum,
cte.*
from "SAPABAP1".VBAK as vbak
left join cte
on
vbak.waerk = cte.fcurr and
vbak.erdat >= cte.gdate
)
select *
from cte2
where ifnull(rownum,1) = 1
I'd be glad to hear whether it works on your database.
NULL values come back from the TCURR side when there is no exchange rate entry for the order, or when the document currency is already EUR (in which case the rate should effectively be 1).
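As a rough, self-contained illustration of the second query's idea (join each order to every rate dated on or before it, then keep the newest one), here is a sketch that runs on Python's built-in sqlite3 module rather than on HANA. The table layouts are cut down, the orders and rates are made up, and treating a UKURS column as a plain multiplier is an assumption; real TCURR data also involves rate types and conversion factors that this ignores.

```python
import sqlite3

def gdatu_to_iso(gdatu: int) -> str:
    """Decode an inverted date: 99999999 - GDATU gives YYYYMMDD."""
    ymd = f"{99999999 - gdatu:08d}"
    return f"{ymd[:4]}-{ymd[4:6]}-{ymd[6:]}"

con = sqlite3.connect(":memory:")  # window functions need SQLite 3.25+
con.executescript("""
    CREATE TABLE vbak  (vbeln TEXT, erdat TEXT, netwr REAL, waerk TEXT);
    CREATE TABLE rates (fcurr TEXT, gdate TEXT, ukurs REAL);
""")

# Hypothetical USD -> EUR rates, keyed by inverted date as in TCURR.GDATU.
raw_rates = [
    ("USD", 79799898, 0.5),    # 2020-01-01
    ("USD", 79799798, 0.75),   # 2020-02-01
    ("USD", 79799698, 0.875),  # 2020-03-01
]
con.executemany(
    "INSERT INTO rates VALUES (?, ?, ?)",
    [(fcurr, gdatu_to_iso(gdatu), rate) for fcurr, gdatu, rate in raw_rates],
)

# Hypothetical sales orders.
con.executemany(
    "INSERT INTO vbak VALUES (?, ?, ?, ?)",
    [
        ("0001", "2020-02-15", 100.0, "USD"),
        ("0002", "2020-03-10", 200.0, "USD"),
        ("0003", "2019-12-31", 50.0, "USD"),  # older than every rate
        ("0004", "2020-02-15", 70.0, "EUR"),  # already in EUR
    ],
)

QUERY = """
WITH ranked AS (
    SELECT v.vbeln, v.netwr, v.waerk, r.gdate, r.ukurs,
           ROW_NUMBER() OVER (PARTITION BY v.vbeln ORDER BY r.gdate DESC) AS rn
    FROM vbak AS v
    LEFT JOIN rates AS r
           ON r.fcurr = v.waerk AND r.gdate <= v.erdat
)
SELECT vbeln,
       gdate,
       -- EUR documents have no rate row, so fall back to a rate of 1
       netwr * CASE WHEN waerk = 'EUR' THEN 1.0 ELSE ukurs END AS netwr_eur
FROM ranked
WHERE rn = 1
ORDER BY vbeln
"""
result = con.execute(QUERY).fetchall()
for row in result:
    print(row)
```

Order 0003 keeps a NULL converted value because no rate exists on or before its date, which matches the NULL behaviour described above.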

Related

Select only the forecast data from Table A that has a more recent date than the actuals from Table B

I have a Forecast and an Actuals table with Table structures as such:
YearNb, WeekNb, Country, Product, Volume
Now I am working on a third Table with the same structure that combines the two.
I already have a query that is simply importing all the actuals. Now I need to import all the Forecasts that are relevant. This leads to my problem. I only need the Forecasts that have a more recent Date than the actuals. The Forecasts table includes all historic forecasts, most of which are not relevant. I need to make this check on a country level, since we receive this data on a country level and different countries can have more or less recent actuals.
What I already did:
WITH cte AS
(
SELECT Country, YearNb, WeekNb, (YearNb*100 + WeekNb) AS Date,
ROW_NUMBER() OVER (PARTITION BY Country ORDER BY (YearNb*100 + WeekNb) DESC) AS rn
FROM Actuals
)
SELECT *
FROM cte
WHERE rn = 1
This gives me a grouped list per country with the latest date of actual data.
But now I am kind of stuck how I could use this to select the data from the forecast table that has a more recent date.
Country YearNb WeekNb Date
A 2018 29 201829
B 2019 5 201905
C 2018 34 201834
One important thing, I need this data on the product level, so to be in the same structure as the original two tables.
So as final output I need all the Forecast per product for country A after the date 201829, all the data from Country B after the Date 201905 etc.
Try to JOIN by year field and add a condition to get earlier dates:
SELECT
*
FROM Actuals act
INNER JOIN
(
SELECT *
FROM
(
SELECT
Country, YearNb, WeekNb, (YearNb*100 + WeekNb) AS Date,
ROW_NUMBER() OVER (PARTITION BY Country ORDER BY (YearNb*100 + WeekNb) DESC) AS rn
FROM Actuals
) ranked
WHERE rn = 1
)q ON act.YearNb = q.YearNb and (act.YearNb*100 + act.WeekNb) < q.Date
I would use a dependent query with NOT EXISTS
select YearNb, WeekNb, Country, Product, Volume
from Forecast f
where not exists (
select 1
from Actuals a
where a.country = f.country and
a.YearNb * 100 + a.WeekNb >= f.YearNb * 100 + f.WeekNb
)
This selects the relevant data from your Forecast table. If you are concerned about performance, the EXISTS check can benefit from an index on the country column.
EDIT
If you want to omit forecasts for countries that have no actuals at all, add a semi-join (EXISTS):
select f.*
from Forecast f
where not exists (
select 1
from Actuals a
where a.country = f.country and
a.YearNb * 100 + a.WeekNb >= f.YearNb * 100 + f.WeekNb
) and
exists(
select 1
from Actuals a
where a.country = f.country
)
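To make the difference between the two variants concrete, here is a small sketch that runs on Python's built-in sqlite3 module. The rows are invented for illustration; country C deliberately has forecasts but no actuals, so it shows up only in the first variant.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE Actuals  (YearNb INTEGER, WeekNb INTEGER, Country TEXT, Product TEXT, Volume INTEGER);
    CREATE TABLE Forecast (YearNb INTEGER, WeekNb INTEGER, Country TEXT, Product TEXT, Volume INTEGER);

    INSERT INTO Actuals VALUES
        (2018, 28, 'A', 'P1', 10),
        (2018, 29, 'A', 'P1', 11),   -- latest actual for A: 201829
        (2019,  5, 'B', 'P1', 20);   -- latest actual for B: 201905

    INSERT INTO Forecast VALUES
        (2018, 29, 'A', 'P1', 12),   -- same week as latest actual: dropped
        (2018, 30, 'A', 'P1', 13),   -- newer: kept
        (2019,  4, 'B', 'P2', 21),   -- older: dropped
        (2019,  6, 'B', 'P2', 22),   -- newer: kept
        (2019,  1, 'C', 'P1', 30);   -- country without any actuals
""")

NEWER_THAN_ACTUALS = """
    SELECT f.YearNb, f.WeekNb, f.Country, f.Product, f.Volume
    FROM Forecast f
    WHERE NOT EXISTS (
        SELECT 1
        FROM Actuals a
        WHERE a.Country = f.Country
          AND a.YearNb * 100 + a.WeekNb >= f.YearNb * 100 + f.WeekNb
    )
"""
HAS_ACTUALS = " AND EXISTS (SELECT 1 FROM Actuals a WHERE a.Country = f.Country)"
ORDERING = " ORDER BY f.Country, f.YearNb, f.WeekNb"

# Variant 1: NOT EXISTS only (countries without actuals pass the filter).
newer = con.execute(NEWER_THAN_ACTUALS + ORDERING).fetchall()

# Variant 2: NOT EXISTS plus the semi-join on countries that have actuals.
newer_known = con.execute(NEWER_THAN_ACTUALS + HAS_ACTUALS + ORDERING).fetchall()

print(newer)
print(newer_known)
```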
You can get this using your own CTE:
WITH cte AS
(
SELECT Country, YearNb, WeekNb, (YearNb*100 + WeekNb) AS Date,
ROW_NUMBER() OVER (PARTITION BY Country ORDER BY (YearNb*100 + WeekNb) DESC) AS rn
FROM Actuals
)
SELECT f.*
FROM forecast f
JOIN cte ON f.Country = cte.Country AND cte.date < (f.YearNb*100 + f.WeekNb)
WHERE cte.rn = 1
I would use cross apply:
select f.*, a.*
from (select a.*,
row_number() over (partition by country order by yearnb desc, weeknb desc) as seqnum
from actuals a
) a cross apply
(select f.*
from forecast f
where f.country = a.country and
(f.yearnb > a.yearnb or
f.yearnb = a.yearnb and f.weeknb > a.weeknb
)
) f
where a.seqnum = 1;
This makes it easy to choose columns from both tables.

Selecting Date and ID from a table with max value of Amount

I am trying to fetch the Date and VisitorID from a payment table having the maximum value of amount. I know how to find max value by each date but I am unable to fetch the Date and VisitorID having the maximum value of amount.
I tried to use the attached code below but I just get one value having max value. I am trying to get date and visitor ID from each day with maximum value of amount.
SELECT Date, visitorID
FROM payment
WHERE Amount =
(
SELECT MAX(Amount)
FROM payment
)
You can try using a correlated subquery:
SELECT Date, visitorID,amount
FROM payment a
WHERE exists
(
SELECT 1
FROM payment b where a.date=b.date group by b.date having max(b.amount)=a.amount
)
One option uses a subquery:
SELECT p1.Date, p1.visitorID
FROM payment p1
INNER JOIN
(
SELECT Date, MAX(Amount) AS max_amount
FROM payment
GROUP BY Date
) p2
ON p1.Date = p2.Date AND p1.Amount = p2.max_amount;
The subquery finds, for each date, the maximum amount. Then, we join this to your original table to effectively filter out any record for a given date which does not have the maximum amount.
If your version of SQL supports analytic functions, then we can use them instead:
SELECT Date, visitorID
FROM
(
SELECT p.*, ROW_NUMBER() OVER (PARTITION BY Date ORDER BY Amount DESC) rn
FROM payment p
) t
WHERE rn = 1;
If you also wanted to have all records for a given date which are tied for the highest amount, then replace ROW_NUMBER above with either RANK or DENSE_RANK.
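The tie behaviour is easy to check on a toy table. The sketch below uses Python's built-in sqlite3 module with invented payments, where visitors 2 and 3 share the highest amount on the first day; the helper name top_per_day is made up for this example.

```python
import sqlite3

con = sqlite3.connect(":memory:")  # window functions need SQLite 3.25+
con.execute("CREATE TABLE payment (Date TEXT, visitorID INTEGER, Amount REAL)")
con.executemany(
    "INSERT INTO payment VALUES (?, ?, ?)",
    [
        ("2024-01-01", 1, 50.0),
        ("2024-01-01", 2, 80.0),  # tied for the daily maximum
        ("2024-01-01", 3, 80.0),  # tied for the daily maximum
        ("2024-01-02", 4, 30.0),
    ],
)

def top_per_day(ranking_function):
    """Return (Date, visitorID) rows that rank first within each day."""
    sql = f"""
        SELECT Date, visitorID
        FROM (
            SELECT p.*,
                   {ranking_function}() OVER (PARTITION BY Date ORDER BY Amount DESC) AS rn
            FROM payment p
        ) t
        WHERE rn = 1
        ORDER BY Date, visitorID
    """
    return con.execute(sql).fetchall()

with_row_number = top_per_day("ROW_NUMBER")  # exactly one row per day
with_rank = top_per_day("RANK")              # every row tied for first place

print(with_row_number)
print(with_rank)
```

With ROW_NUMBER, which of the two tied visitors is returned for the first day is arbitrary unless the ORDER BY includes a tie-breaker column.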
After seeing your comments it seems you need row_number()
with cte as
(
SELECT *,row_number() over(partition by date order by amount desc)rn
from payment t1
) select * from cte where cte.rn=1
You were close; you need a reference to the outer query to make it a correlated subquery:
SELECT p.Date, p.visitorID
FROM payment p
WHERE p.Amount = (SELECT MAX(p1.Amount) FROM payment p1 WHERE p1.Date = p.Date);

Should I put a row number filter in join condition or in a prior CTE?

I have a subscription table and a payments table that I need to join.
I am trying to decide between 2 options and performance is a key consideration.
Which of the two OPTIONS below will perform better?
I am using Impala, and these tables are large (multiple millions of rows). I only need one row for every id and date grouping (hence the row_number() analytic function).
I have shortened the queries to illustrate my question:
OPTION 1:
WITH cte
AS (
SELECT *
, SUM(payment_amount) OVER (PARTITION BY id, date)
AS sameday_total
, ROW_NUMBER() OVER (PARTITION BY id, date ORDER BY purchase_number DESC)
AS sameday_rownum
FROM payments
),
payment
AS (
SELECT *
FROM cte
WHERE sameday_rownum = 1
)
SELECT s.*
, p.sameday_total
FROM subscription s
INNER JOIN payment p ON s.id = p.id
OPTION 2:
WITH payment
AS (
SELECT *
, SUM(payment_amount) OVER (PARTITION BY id, date)
AS sameday_total
, ROW_NUMBER() OVER (PARTITION BY id, date ORDER BY purchase_number DESC)
AS sameday_rownum
FROM payments
)
SELECT s.*
, p.sameday_total
FROM subscription s
INNER JOIN payment p ON s.id = p.id
AND p.sameday_rownum = 1
An "Option 0" also exists. A far more traditional "derived table" which simply does not require use of any CTE.
SELECT s.*
, p.sameday_total
FROM subscription s
INNER JOIN (
SELECT *
, SUM(payment_amount) OVER (PARTITION BY id, date)
AS sameday_total
, ROW_NUMBER() OVER (PARTITION BY id, date ORDER BY purchase_number DESC)
AS sameday_rownum
FROM payments
) p ON s.id = p.id
AND p.sameday_rownum = 1
All options 0,1 and 2 are likely to produce identical or very similar explain plans (although I'm more confident about that statement for SQL Server than Impala).
Adopting a CTE does - in itself - not make a query more efficient or better performing, so the syntax alteration between option 1 and 2 isn't major. I prefer option 0 myself as I prefer to use CTEs for specific tasks (e.g. recursion).
What you should do is use explain plans to study what each option produces.
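Before comparing plans, it can be reassuring to confirm that the three forms really return the same rows. The sketch below does that on Python's built-in sqlite3 module with a few invented rows; it assumes the amount column is called payment_amount and that the tables are aliased s and p. It says nothing about Impala performance, which is where the EXPLAIN statement comes in.

```python
import sqlite3

con = sqlite3.connect(":memory:")  # window functions need SQLite 3.25+
con.executescript("""
    CREATE TABLE subscription (id INTEGER, plan TEXT);
    CREATE TABLE payments (id INTEGER, date TEXT, purchase_number INTEGER, payment_amount REAL);
    INSERT INTO subscription VALUES (1, 'basic'), (2, 'pro');
    INSERT INTO payments VALUES
        (1, '2024-01-01', 1, 5.0),
        (1, '2024-01-01', 2, 7.0),
        (1, '2024-01-02', 1, 3.0),
        (2, '2024-01-01', 1, 9.0);
""")

# Shared projection: per-day total plus a row number within each id/date group.
RANKED = """
    SELECT *,
           SUM(payment_amount) OVER (PARTITION BY id, date) AS sameday_total,
           ROW_NUMBER() OVER (PARTITION BY id, date ORDER BY purchase_number DESC) AS sameday_rownum
    FROM payments
"""

# Option 1: filter in a second CTE.
OPTION_1 = f"""
    WITH cte AS ({RANKED}),
         payment AS (SELECT * FROM cte WHERE sameday_rownum = 1)
    SELECT s.*, p.sameday_total
    FROM subscription s
    INNER JOIN payment p ON s.id = p.id
"""

# Option 2: filter in the join condition.
OPTION_2 = f"""
    WITH payment AS ({RANKED})
    SELECT s.*, p.sameday_total
    FROM subscription s
    INNER JOIN payment p ON s.id = p.id AND p.sameday_rownum = 1
"""

# Option 0: derived table, no CTE.
OPTION_0 = f"""
    SELECT s.*, p.sameday_total
    FROM subscription s
    INNER JOIN ({RANKED}) p ON s.id = p.id AND p.sameday_rownum = 1
"""

results = {
    name: sorted(con.execute(sql).fetchall())
    for name, sql in (("0", OPTION_0), ("1", OPTION_1), ("2", OPTION_2))
}
print(results["1"])
```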

How to query specific values for some columns and sum of values in others SQL

I'm trying to query some data from SQL such that it sums some columns, gets the max of another column and the corresponding row for a third column. For example,
dataset:
shares  date      price
100     05/13/16  20.4
200     05/15/16  21.2
300     06/12/16  19.3
400     02/22/16  20.0
I want my output to be:
shares  date      price
1000    06/12/16  19.3
The shares have been summed up, the date is max(date), and the price is the price at max(date).
So far, I have:
select sum(shares), max(date), max(price)
but that gives me an incorrect price.
EDIT:
I realize I was unclear in my OP, all the other relevant data is in one table, and the price is in other. My full code is:
select id, stock, side, exchange, max(startdate), max(enddate),
sum(shares), sum(execution_price*shares)/sum(shares), max(limitprice), max(price)
from table1 t1
INNER JOIN table2 t2 on t2.id = t1.id
where location = 'CHICAGO' and startdate > '1/1/2016' and order_type = 'limit'
group by id, stock, side, exchange
You can do this with window functions and aggregation. Here is an example:
select sum(shares), max(date), max(case when seqnum = 1 then price end) as price
from (select t.*, row_number() over (order by date desc) as seqnum
from t
) t;
EDIT:
If the results that you are looking at are in fact the result of a query, you can do:
with t as (<your query here>)
select sum(shares), max(date), max(case when seqnum = 1 then price end) as price
from (select t.*, row_number() over (order by date desc) as seqnum
from t
) t;
Here's one way to do it. The join would obviously include the ticker symbol for the share as well:
select
a.sum_share,
a.max_date,
b.price
FROM
(
select ticker , sum(shares) sum_share, max(date) max_date from table where ticker = 'MSFT' group by ticker
) a
inner join table b on a.max_date = b.date and a.ticker = b.ticker
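For a quick check of the window-function approach against the sample data from the question, here is a sketch on Python's built-in sqlite3 module. The dates are rewritten in ISO format (an assumption made here so that they sort correctly as text), and the table name t and the alias ranked are placeholders.

```python
import sqlite3

con = sqlite3.connect(":memory:")  # window functions need SQLite 3.25+
con.execute("CREATE TABLE t (shares INTEGER, date TEXT, price REAL)")
con.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (100, "2016-05-13", 20.4),
        (200, "2016-05-15", 21.2),
        (300, "2016-06-12", 19.3),  # latest date, so its price should win
        (400, "2016-02-22", 20.0),
    ],
)

QUERY = """
    SELECT SUM(shares),
           MAX(date),
           -- only the row numbered 1 (latest date) contributes a price
           MAX(CASE WHEN seqnum = 1 THEN price END) AS price
    FROM (
        SELECT t.*, ROW_NUMBER() OVER (ORDER BY date DESC) AS seqnum
        FROM t
    ) ranked
"""
summary = con.execute(QUERY).fetchone()
print(summary)
```

The CASE expression is what keeps the price tied to the latest date; a plain max(price) would return 21.2 here instead.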

sql, return only the max version of a record from a table

I have the following table, which has a few key fields, the most important being the version and the dates.
I need a query that displays the active prices in the system for each company and product.
Showing all rows where the current date falls between start and end is easy enough.
Showing only the maximum version among those results is where I am stuck.
I have created a fiddle to show my example:
http://sqlfiddle.com/#!6/e0d4f/3
How can I return only the record for each company and product that has the highest version within the date ranges?
This is what I have so far, but it is incomplete:
select * from
prices
where getdate() between [start] and [end]
--and max(version)
;WITH PricesCTE AS
(
SELECT *,
ROW_NUMBER()OVER(PARTITION BY companyid,product ORDER BY version DESC) AS rn
FROM prices
WHERE GETDATE() BETWEEN [start] AND [end]
)
SELECT *
FROM PricesCTE
WHERE rn = 1
First find the highest version per product for the desired date. Then join with your table to get that record.
select *
from
(
select companyid, product, max(version) as max_version
from prices
where getdate() between [start] and [end]
group by companyid, product
) this_date
inner join prices
on prices.companyid = this_date.companyid
and prices.product = this_date.product
and prices.version = this_date.max_version;
Here is the SQL fiddle: http://sqlfiddle.com/#!6/e0d4f/32.
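Both answers can be tried side by side on a small invented table. The sketch below uses Python's built-in sqlite3 module, so GETDATE() is replaced by a :today parameter and the price column is an assumption about the fiddle's schema; the bracket quoting of [start] and [end] is kept because SQLite accepts it.

```python
import sqlite3

con = sqlite3.connect(":memory:")  # window functions need SQLite 3.25+
con.execute(
    "CREATE TABLE prices (companyid INTEGER, product TEXT, version INTEGER,"
    " [start] TEXT, [end] TEXT, price REAL)"
)
con.executemany(
    "INSERT INTO prices VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "widget", 1, "2024-01-01", "2024-12-31", 10.0),
        (1, "widget", 2, "2024-03-01", "2024-12-31", 11.0),  # highest active version
        (1, "widget", 3, "2025-01-01", "2025-12-31", 12.0),  # not active yet
        (2, "widget", 1, "2024-01-01", "2024-12-31", 9.0),
        (2, "gadget", 1, "2023-01-01", "2023-12-31", 5.0),   # expired
    ],
)
params = {"today": "2024-06-15"}

# Approach 1: number the active rows per company/product, keep the first.
ROW_NUMBER_QUERY = """
    WITH PricesCTE AS (
        SELECT *,
               ROW_NUMBER() OVER (PARTITION BY companyid, product ORDER BY version DESC) AS rn
        FROM prices
        WHERE :today BETWEEN [start] AND [end]
    )
    SELECT companyid, product, version, price
    FROM PricesCTE
    WHERE rn = 1
    ORDER BY companyid, product
"""

# Approach 2: find the max active version, then join back to the table.
GROUP_JOIN_QUERY = """
    SELECT p.companyid, p.product, p.version, p.price
    FROM (
        SELECT companyid, product, MAX(version) AS max_version
        FROM prices
        WHERE :today BETWEEN [start] AND [end]
        GROUP BY companyid, product
    ) this_date
    INNER JOIN prices p
            ON p.companyid = this_date.companyid
           AND p.product = this_date.product
           AND p.version = this_date.max_version
    ORDER BY p.companyid, p.product
"""

by_row_number = con.execute(ROW_NUMBER_QUERY, params).fetchall()
by_group_join = con.execute(GROUP_JOIN_QUERY, params).fetchall()
print(by_row_number)
```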