Get the price of purchase at the moment of the sale - sql

I'm trying to get the purchase price at the moment of the sale. There are different purchase prices for the same product in my table.
The purchase price is bounded by two dates, one for the start of validity and one for the end: DATE_DEB_VALID and DATE_FIN_VALID.
To know how much I earned at the time of sale (FLIGHT_DATE), I need the purchase price that was valid in the same period.
My flight date must be between DATE_DEB_VALID and DATE_FIN_VALID.
The two tables:
TB_DW_VAB_SALES
- ID_TEC_SALES
- TRANSACTION_NUMBER
- CARRIER_CODE
- MASTER_FLIGHT_NUMBER
- FLIGHT_NUMBER
- FLIGHT_DATE
- FLIGHT_CITY_DEP
- FLIGHT_CITY_ARR
- PRODUCT_CODE
- QUANTITY
- UNIT_SALES_PRICE
- PROMOTION_CODE
- CREW_PRICE
- COMPLEMENTARY
- SALES_TYPE
- DATE_CHGT
TB_DW_VAB_PURCHASE
- PRODUCT_CODE
- PRODUCT_CODE_GUEST
- LIB_PRODUCT
- PRICE
- DATE_DEB_VALID
- DATE_FIN_VALID
- DATE_CHGT
My query:
SELECT (TB_DW_VAB_PURCHASE.PRODUCT_CODE),PRICE
FROM TB_DW_VAB_PURCHASE,TB_DW_VAB_SALES
WHERE FLIGHT_DATE BETWEEN DATE_DEB_VALID AND DATE_FIN_VALID AND to_char(FLIGHT_DATE,'YYYY')= '2016' OR to_char(FLIGHT_DATE,'YYYY')= '2015'
GROUP BY TB_DW_VAB_PURCHASE.PRODUCT_CODE, PRICE
Here is the whole query (for information):
SELECT to_char(DATE_VENTE,'MM/YYYY'),sum(MARGE_TOTALE) FROM (
SELECT
CTE1.CA AS CHIFFRE_AFFAIRE_TOTAL,
CTE2.PRICE AS COUT_UNITAIRE,
CTE1.FLIGHT_DATE as DATE_VENTE,
CTE1.QTE*CTE2.PRICE AS COUT_ACHAT,
(CTE1.CA-CTE1.QTE*CTE2.PRICE) AS MARGE,
sum((CTE1.CA-CTE1.QTE*CTE2.PRICE)) as MARGE_TOTALE
FROM (
SELECT PRODUCT_CODE,
sum(QUANTITY*UNIT_SALES_PRICE) AS CA,
FLIGHT_DATE,
sum(QUANTITY) as QTE
FROM TB_DW_VAB_SALES
where SALES_TYPE = 'SALES' and to_char(FLIGHT_DATE,'YYYY')= '2015' OR to_char(FLIGHT_DATE,'YYYY')= '2016'
group by to_char(FLIGHT_DATE,'MM'),FLIGHT_DATE,PRODUCT_CODE
ORDER BY to_char(FLIGHT_DATE,'MM') ASC
)CTE1
inner join
(
SELECT (TB_DW_VAB_PURCHASE.PRODUCT_CODE),PRICE
FROM TB_DW_VAB_PURCHASE,TB_DW_VAB_SALES
WHERE FLIGHT_DATE BETWEEN DATE_DEB_VALID AND DATE_FIN_VALID AND to_char(FLIGHT_DATE,'YYYY')= '2016' OR to_char(FLIGHT_DATE,'YYYY')= '2015'
GROUP BY TB_DW_VAB_PURCHASE.PRODUCT_CODE, PRICE
) CTE2
on CTE1.PRODUCT_CODE=CTE2.PRODUCT_CODE
group by to_char(FLIGHT_DATE,'MM'), FLIGHT_DATE, 'MM', CTE1.FLIGHT_DATE, (CTE1.CA-CTE1.QTE*CTE2.PRICE),
CTE1.CA, CTE1.QTE, CTE2.PRICE, CTE1.QTE*CTE2.PRICE
) group by to_char(DATE_VENTE,'MM/YYYY') ORDER BY to_char(DATE_VENTE,'MM/YYYY') ASC;
Thank you !

As Nemeros already pointed out, you have a cross-join in your CTE2 subquery, as you aren't linking sales and purchases together - other than by the flight/valid dates, but across all products, which can't be what you intended.
It looks like you're calculating things in your inline views (naming them CTE1/2 is slightly confusing as that usually refers to common table expressions or subquery factoring) which you then use for further calculations, but I don't think the intermediate steps or values are needed.
It looks like your query can be simplified to something like:
select to_char(trunc(s.flight_date, 'MM'),'MM/YYYY') as mois_du_sales,
sum(s.quantity*(s.unit_sales_price - p.price)) as marge_totale
from tb_dw_vab_sales s
join tb_dw_vab_purchase p
on p.product_code = s.product_code
and s.flight_date between p.date_deb_valid and p.date_fin_valid
where s.sales_type = 'SALES'
and s.flight_date >= date '2015-01-01'
and s.flight_date < date '2017-01-01'
group by trunc(s.flight_date, 'MM')
order by trunc(s.flight_date, 'MM') asc;
Of course without sample data or expected results I can't verify I've interpreted it correctly. It may also not solve all of your performance problems - you still need to look at the tables and indexes and the execution plan to see what it's actually doing, and you might want to consider partitioning if you have a large volume of data and a convenient partition key (like sales year).
Using a filter like to_char(FLIGHT_DATE,'YYYY')= '2015' forces every value in that column (or at least those that match sales_type, if that is indexed and selective enough) to be converted to a string and then compared to the fixed value (twice, as you're checking both years separately), and that stops any index on flight_date from being used. Using a date range allows the index to be used, though unless you have data spanning many years it may still not be selective enough, and the optimiser may still choose full table scans. Even then there would be only one scan of each table, not two as you were potentially doing.
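As a rough illustration of the validity-period join and the date-range filter, here is a minimal sketch that runs the same shape of query against SQLite through Python's sqlite3 module. SQLite is only a stand-in for Oracle here: dates are stored as ISO text, strftime replaces trunc/to_char, and every row is hypothetical test data, not taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tb_dw_vab_sales (
    product_code TEXT, flight_date TEXT, quantity INTEGER,
    unit_sales_price REAL, sales_type TEXT
);
CREATE TABLE tb_dw_vab_purchase (
    product_code TEXT, price REAL,
    date_deb_valid TEXT, date_fin_valid TEXT
);
-- Hypothetical rows: product A changes purchase price on 2016-01-01.
INSERT INTO tb_dw_vab_purchase VALUES
    ('A', 2.0, '2015-01-01', '2015-12-31'),
    ('A', 3.0, '2016-01-01', '2016-12-31'),
    ('B', 1.0, '2015-01-01', '2016-12-31');
INSERT INTO tb_dw_vab_sales VALUES
    ('A', '2015-12-15', 10, 5.0, 'SALES'),
    ('A', '2016-01-10', 10, 5.0, 'SALES'),
    ('B', '2016-01-20',  4, 2.5, 'SALES'),
    ('A', '2016-01-25',  1, 5.0, 'CREW');
""")

# Join each sale to the purchase price valid on its flight date,
# filter with a plain date range, and total the margin per month.
monthly_margin = conn.execute("""
    SELECT strftime('%Y-%m', s.flight_date) AS sales_month,
           SUM(s.quantity * (s.unit_sales_price - p.price)) AS marge_totale
    FROM tb_dw_vab_sales s
    JOIN tb_dw_vab_purchase p
      ON p.product_code = s.product_code
     AND s.flight_date BETWEEN p.date_deb_valid AND p.date_fin_valid
    WHERE s.sales_type = 'SALES'
      AND s.flight_date >= '2015-01-01'
      AND s.flight_date <  '2017-01-01'
    GROUP BY sales_month
    ORDER BY sales_month
""").fetchall()

for sales_month, marge_totale in monthly_margin:
    print(sales_month, marge_totale)
```

The two sales of product A are costed at different purchase prices because each one only matches the price row whose validity period contains its flight date.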

I will just fix your initial query, since you forgot your basic boolean algebra (AND has higher precedence than OR):
SELECT (TB_DW_VAB_PURCHASE.PRODUCT_CODE),PRICE
FROM TB_DW_VAB_PURCHASE,TB_DW_VAB_SALES
WHERE FLIGHT_DATE BETWEEN DATE_DEB_VALID AND DATE_FIN_VALID AND
(to_char(FLIGHT_DATE,'YYYY')= '2016' OR to_char(FLIGHT_DATE,'YYYY')= '2015')
GROUP BY TB_DW_VAB_PURCHASE.PRODUCT_CODE, PRICE
Now, if you had used explicit join syntax, you would also have seen that you had forgotten the join condition on product_code, so finally:
SELECT PUR.PRODUCT_CODE, PRICE
FROM TB_DW_VAB_PURCHASE PUR
inner join TB_DW_VAB_SALES SAL ON PUR.PRODUCT_CODE = SAL.PRODUCT_CODE
AND FLIGHT_DATE BETWEEN DATE_DEB_VALID AND DATE_FIN_VALID
WHERE to_char(FLIGHT_DATE,'YYYY')= '2016' OR to_char(FLIGHT_DATE,'YYYY')= '2015'
GROUP BY PUR.PRODUCT_CODE, PRICE
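To see the effect of the missing parentheses in isolation, here is a small sketch using SQLite through Python's sqlite3 module. The sales table and its rows are hypothetical, and strftime stands in for Oracle's to_char; the precedence rule itself (AND binds tighter than OR) is the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (id INTEGER, sales_type TEXT, flight_date TEXT);
-- Hypothetical rows: two real sales and two crew transactions.
INSERT INTO sales VALUES
    (1, 'SALES', '2015-03-01'),
    (2, 'SALES', '2016-07-01'),
    (3, 'CREW',  '2016-08-01'),
    (4, 'CREW',  '2015-09-01');
""")

# Parsed as (sales_type = 'SALES' AND year = 2015) OR year = 2016,
# so the 2016 CREW row is returned as well.
without_parens = [row[0] for row in conn.execute("""
    SELECT id FROM sales
    WHERE sales_type = 'SALES' AND strftime('%Y', flight_date) = '2015'
       OR strftime('%Y', flight_date) = '2016'
    ORDER BY id
""")]

# Parentheses make the year test a single condition ANDed with the type.
with_parens = [row[0] for row in conn.execute("""
    SELECT id FROM sales
    WHERE sales_type = 'SALES'
      AND (strftime('%Y', flight_date) = '2015'
           OR strftime('%Y', flight_date) = '2016')
    ORDER BY id
""")]

print(without_parens, with_parens)
```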

Related

Return data where the running total of amounts 30 days from another row is greater than or equal to the amount for that row

Let's say I have a table that contains the date, account, product, purchase type, and amount like below:
Looking at this table, you can see that for any particular account/product combination, there are buys and sells. Essentially, what I'd like to write is a SQL query that flags the following: Are there accounts that bought at a certain amount and then sold the same aggregate amount or more 30 days from that buy?
So for example, we can see account 1 bought product A for 20k on 8/1. If we look at the running sum of sells by account 1 for product A over the next 30 days, we see they sold a total of 20k - the same as the initial buy:
Ideally, the query would return results that flag all of these instances: for each individual buy, find all sells for that product/account 30 days from that buy, and only return rows where the running total of sells is greater than or equal to that initial buy.
EDIT: Using the sample data provided, the desired output should look more or less like the following:
You'll see that the buy on 8/2 for product B/account 2 is not returned because the running sum of sells for that product/account/buy combination over the next 30 days does not equal or exceed the buy amount of 35k but it does return rows for the buy on 8/3 for product B/ account 2 because the sells do exceed the buy amount of 10k.
I know I need to self join the sells against the buys, where the accounts/products equal and the datediff is less than or equal 30 and I basically have that part structured. What I can't seem to get working is the running total part and only returning data when that total is greater than or equal to that buy. I know I likely need to use the over/partition by clauses for the running sum but I'm struggling to produce the right results/optimize properly. Any help on this would be greatly appreciated - just looking for some general direction on how to approach this.
Bonus: Would be even more powerful to stop returning the sells once the running total passes the buy, so for example, the last two rows in the desired output I provided aren't technically needed - since the first two sells following the buy had already eclipsed the buy amount.
In SQL Server, one option uses a lateral join:
select
t.*,
case when t.amount = x.amount then 1 else 0 end as is_returned
from mytable t
cross apply (
select sum(amount) amount
from mytable t1
where
t1.purchase_type = 'Sell'
and t1.account = t.account
and t1.product = t.product
and t1.date >= t.date
and t1.date <= dateadd(day, 30, t.date)
) x
where t.purchase_type = 'Buy'
The lateral join sums the amount of "sells" of the same account and product within the following 30 days, which you can then compare with the amount of the buy. The query gives you one row per buy, with a boolean flag that indicates if the amounts match.
In databases that support the range specification to window functions, this would be more efficiently expressed with a window sum:
select *
from (
select
t.*,
case when amount = sum(case when purchase_type = 'Sell' then amount end) over(
partition by account, product
order by date
range between current row and interval '30' day following
) then 1 else 0 end as is_returned
from mytable t
) t
where purchase_type = 'Buy'
Edit: this would generate a resultset similar to the third table in your question:
select t.*, x.*
from mytable t
cross apply (
select
t1.date sale_date,
t1.amount sell_amount,
sum(t1.amount) over(order by t1.date) running_sell_amount,
sum(t1.amount) over() total_sell_amount
from mytable t1
where
t1.purchase_type = 'Sell'
and t1.account = t.account
and t1.product = t.product
and t1.date >= t.date
and t1.date <= dateadd(day, 30, t.date)
) x
where t.purchase_type = 'Buy' and t.amount = x.total_sell_amount
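The question asks for buys whose following 30 days of sells reach or exceed the buy amount, so a comparison with >= may be closer to the intent than strict equality. The sketch below is one way to express that with a self join and HAVING, run on SQLite through Python's sqlite3 module. The trades table, the trade_date column name, and all rows are hypothetical; SQLite's date(x, '+30 days') stands in for SQL Server's dateadd.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE trades (
    trade_date TEXT, account INTEGER, product TEXT,
    purchase_type TEXT, amount INTEGER
);
-- Hypothetical rows, loosely modelled on the scenario in the question.
INSERT INTO trades VALUES
    ('2020-08-01', 1, 'A', 'Buy',  20000),
    ('2020-08-05', 1, 'A', 'Sell',  8000),
    ('2020-08-20', 1, 'A', 'Sell', 12000),
    ('2020-08-02', 2, 'B', 'Buy',  35000),
    ('2020-08-03', 2, 'B', 'Buy',  10000),
    ('2020-08-10', 2, 'B', 'Sell',  6000),
    ('2020-08-15', 2, 'B', 'Sell',  7000),
    ('2020-10-01', 2, 'B', 'Sell', 50000);
""")

# For each buy, total the sells of the same account and product in the
# 30 days that follow, and keep the buy only if that total covers it.
flagged_buys = conn.execute("""
    SELECT b.trade_date, b.account, b.product, b.amount,
           COALESCE(SUM(s.amount), 0) AS sold_30d
    FROM trades b
    LEFT JOIN trades s
      ON s.purchase_type = 'Sell'
     AND s.account = b.account
     AND s.product = b.product
     AND s.trade_date >= b.trade_date
     AND s.trade_date <= date(b.trade_date, '+30 days')
    WHERE b.purchase_type = 'Buy'
    GROUP BY b.trade_date, b.account, b.product, b.amount
    HAVING COALESCE(SUM(s.amount), 0) >= b.amount
    ORDER BY b.trade_date
""").fetchall()

for row in flagged_buys:
    print(row)
```

Grouping by the buy's columns assumes no two buys share the same date, account, product, and amount; with a real key on the table, grouping by that key would be safer.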

Q) Write a query to return Territory and corresponding Sales Growth (compare growth between periods Q4-2019 vs Q3-2019)

Q) Write a query to return Territory and corresponding Sales Growth (compare growth between periods Q4-2019 vs Q3-2019).
Tables given-
Cust_Sales: -Cust_id,product_sku,order_date,order_value,order_id,month
Cust_Territory: cust_id,territory_id,customer_city,customer_pincode
Use tables FCT_CUSTOMER_SALES (which has sales for each Customer) and MAP_CUSTOMER_TERRITORY (which provides Territory-to-Customer mapping) for this question.
Output format-
TERRITORY_ID | SALES_GROWTH
My solution-
Select ((q2.claims - q1.claims)/q1.claims * 100) AS SALES_GROWTH , c.territory_id
From
(select sum(s.order_value) from FCT_CUSTOMER_SALES s inner join MAP_CUSTOMER_TERRITORY c on s.customer_id=c.customer_id where s.order_datetime between 1/07/2019 and 30/09/2019 group by c.territory_id) as q1.claims,
(select sum(s.order_value) from FCT_CUSTOMER_SALES s inner join MAP_CUSTOMER_TERRITORY c on s.customer_id=c.customer_id where s.order_datetime between 1/10/2019 and 31/12/2019 group by c.territory_id) as q2.claims
Group by c.territory_id
My solution is showing up as incorrect. I would appreciate it if anyone could help me with the solution and let me know where my mistake is.
One option uses conditional aggregation. The idea is to filter the table on the two quarters at once, then use case expressions within the sum() aggregate function to compute the sales of each of them:
select
c.territory_id,
( sum(case when s.order_datetime >= date '2019-10-01' then s.order_value end)
- sum(case when s.order_datetime < date '2019-10-01' then s.order_value end)
) / (sum(case when s.order_datetime < date '2019-10-01' then s.order_value end)) * 100.0 as sales_growth
from fct_customer_sales s
inner join map_customer_territory c on s.customer_id = c.customer_id
where s.order_datetime >= date '2019-07-01' and s.order_datetime < date '2020-01-01'
group by c.territory_id
You did not say which database you are using, and date features are highly vendor-dependent. This uses the standard DATE syntax to declare the literal dates; you might need to adapt that if your database does not support it.
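A minimal sketch of conditional aggregation over Q3 and Q4 of 2019, run on SQLite through Python's sqlite3 module. The table and column names follow the query in the question, the rows are hypothetical, and ISO date strings are compared as text because SQLite has no DATE literal.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fct_customer_sales (
    customer_id INTEGER, order_datetime TEXT, order_value REAL
);
CREATE TABLE map_customer_territory (
    customer_id INTEGER, territory_id TEXT
);
-- Hypothetical mapping and orders.
INSERT INTO map_customer_territory VALUES (1, 'T1'), (2, 'T1'), (3, 'T2');
INSERT INTO fct_customer_sales VALUES
    (1, '2019-07-15', 100.0),
    (2, '2019-09-30', 100.0),
    (1, '2019-10-01', 150.0),
    (2, '2019-12-31', 150.0),
    (3, '2019-08-01', 400.0),
    (3, '2019-11-11', 300.0),
    (3, '2020-01-05', 999.0);
""")

# The WHERE clause keeps only Q3 and Q4 of 2019; each CASE then routes
# an order into the Q4 sum or the Q3 sum by comparing with 2019-10-01.
growth = conn.execute("""
    SELECT c.territory_id,
           (SUM(CASE WHEN s.order_datetime >= '2019-10-01' THEN s.order_value END)
          - SUM(CASE WHEN s.order_datetime <  '2019-10-01' THEN s.order_value END))
          / SUM(CASE WHEN s.order_datetime <  '2019-10-01' THEN s.order_value END)
          * 100.0 AS sales_growth
    FROM fct_customer_sales s
    INNER JOIN map_customer_territory c ON s.customer_id = c.customer_id
    WHERE s.order_datetime >= '2019-07-01'
      AND s.order_datetime <  '2020-01-01'
    GROUP BY c.territory_id
    ORDER BY c.territory_id
""").fetchall()

for territory_id, sales_growth in growth:
    print(territory_id, sales_growth)
```

A territory with no Q3 sales gets a NULL growth here, since the Q3 sum is NULL; whether that is the desired output is left open by the question.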

how to query a table date against a series of dates on another table

I have two tables, INVOICES and INV_PRICES. I am trying to look up each invoice's part price in INV_PRICES based on the INVOICE_DT in INVOICES: if the INVOICE_DT falls between two consecutive EFF_DT values, or is later than the latest EFF_DT in INV_PRICES, return the price in effect for that part.
I have tried variations on the following code, but no luck. I either do not get all the parts or I get multiple records.
SELECT DISTINCT A.INVOICE_NBR, A.INVOICE_DT, A.PART_NO,
CASE WHEN TRUNC(A.INVOICE_DT) >= TRUNC(B.EFF_DT) THEN B.DLR_NET_PRC_AM
WHEN (TRUNC(A.INVOICE_DT)||ROWNUM >= TRUNC(B.EFF_DT)||ROWNUM) AND (TRUNC(B.EFF_DT)||ROWNUM <= TRUNC(A.INVOICE_DT)||ROWNUM) THEN B.DLR_NET_PRC_AM
/*MAX(B.EFF_DT) THEN B.DLR_NET_PRC_AM*/
ELSE 0
END AS PRICE
FROM INVOICES A,
INV_PRICES B
WHERE A.PART_NO = B.PART_NO
ORDER BY A.INVOICE_NBR
Can someone assist? I have a sample of each table if needed.
Doesn't it work to put the condition in the JOIN conditions? You can calculate the period when a price is valid using LEAD():
SELECT i.INVOICE_NBR, i.INVOICE_DT, i.PART_NO,
COALESCE(ip.DLR_NET_PRC_AM, 0) as price
FROM INVOICES i LEFT JOIN
(SELECT ip.*, LEAD(eff_dt) OVER (PARTITION BY PART_NO ORDER BY eff_dt) as next_eff_dt
FROM INV_PRICES ip
) ip
ON i.PART_NO = ip.PART_NO AND
i.invoice_dt >= ip.eff_dt AND
(i.invoice_dt < ip.next_eff_dt or ip.next_eff_dt is null)
ORDER BY i.INVOICE_NBR
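The LEAD() approach can be tried out on a toy data set. This is a sketch on SQLite through Python's sqlite3 module, assuming a bundled SQLite of 3.25 or later (needed for window functions); the rows are hypothetical and dates are stored as ISO text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE invoices (invoice_nbr INTEGER, invoice_dt TEXT, part_no TEXT);
CREATE TABLE inv_prices (part_no TEXT, eff_dt TEXT, dlr_net_prc_am REAL);
-- Hypothetical price history for one part.
INSERT INTO inv_prices VALUES
    ('P1', '2020-01-01', 10.0),
    ('P1', '2020-06-01', 12.0);
INSERT INTO invoices VALUES
    (1, '2019-12-15', 'P1'),  -- before any price exists
    (2, '2020-03-10', 'P1'),  -- inside the first price period
    (3, '2020-06-01', 'P1'),  -- exactly on the second effective date
    (4, '2021-01-01', 'P1');  -- after the latest effective date
""")

# LEAD() turns each effective date into a half-open period
# [eff_dt, next_eff_dt); the latest price has no end (NULL).
prices = conn.execute("""
    SELECT i.invoice_nbr, COALESCE(p.dlr_net_prc_am, 0) AS price
    FROM invoices i
    LEFT JOIN (
        SELECT ip.*,
               LEAD(eff_dt) OVER (PARTITION BY part_no ORDER BY eff_dt)
                   AS next_eff_dt
        FROM inv_prices ip
    ) p
      ON i.part_no = p.part_no
     AND i.invoice_dt >= p.eff_dt
     AND (i.invoice_dt < p.next_eff_dt OR p.next_eff_dt IS NULL)
    ORDER BY i.invoice_nbr
""").fetchall()

for invoice_nbr, price in prices:
    print(invoice_nbr, price)
```

Because the periods do not overlap, each invoice matches at most one price row, which avoids both the missing parts and the duplicate records described in the question.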

Grouping matching names with totals Firebird 2.5

I did a basic Firebird Report to call on all debtors and transactions
The report looks as follows
SELECT
POSPAY.TXNO,
DEBTORS.COMPANY,
POSPAY.AMOUNT,
POSINVTRANS.TXDATE
FROM
POSPAY
INNER JOIN DEBTORS ON (POSPAY.ACCTNUMBER = DEBTORS.ACCOUNT)
INNER JOIN POSINVTRANS ON (POSPAY.TXNO = POSINVTRANS.TXNO)
WHERE
PAYMNTTYPID = '7'
and
weekly = :weekly and
txdate >= :fromdate and
txdate <= :todate
This works correctly and gives me output on Debtor Name, TXNO, TXDATE, AMOUNT
I now want to write a similar report, but I need to group the debtors and give totals on the transactions, i.e. I need the output: Debtor name (if JOHN appears twice, list him once), Total amount (sum of John's transactions).
I still need the inner join on debtors but no longer on posinvtrans. I was thinking it should look something like
SELECT
POSPAY.TXNO,
DEBTORS.COMPANY,
POSPAY.AMOUNT
FROM
POSPAY
INNER JOIN DEBTORS ON (POSPAY.ACCTNUMBER = DEBTORS.ACCOUNT)
WHERE
PAYMNTTYPID = '7'
and
weekly = :weekly and
txdate >= :fromdate and
txdate <= :todate
Group by DEBTORS.COMPANY
but no luck, get errors on Group by
'invalid expression in the select list (not containing in either an aggregate function or the GROUP BY CLAUSE)'
any suggestions?
Every field in the select list has to be either listed in the group by clause or wrapped in an aggregate function such as count(*), max(amount), etc.
The problem is that you have not told Firebird what to do with POSPAY.TXNO and POSPAY.AMOUNT; it is not enough to know yourself what you want to happen to them.
I suggest you remove those 2 fields from the query and have a select list of DEBTORS.COMPANY, sum(POSPAY.AMOUNT) as a starting point.
If you use GROUP BY you either need to include a column in the GROUP BY, or apply an aggregate function on the column. In your example you need to leave out POSPAY.TXNO, as that is transaction specific (or you could use the aggregate function LIST), and you need to apply the aggregate function SUM to AMOUNT to get the total:
SELECT
DEBTORS.COMPANY,
SUM(POSPAY.AMOUNT)
FROM
POSPAY
INNER JOIN DEBTORS ON (POSPAY.ACCTNUMBER = DEBTORS.ACCOUNT)
WHERE
PAYMNTTYPID = '7'
and
weekly = :weekly and
txdate >= :fromdate and
txdate <= :todate
Group by DEBTORS.COMPANY
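The GROUP BY rule is easy to check on a toy data set. The sketch below uses SQLite through Python's sqlite3 module as a stand-in for Firebird, with hypothetical rows. The weekly and txdate filters are left out because the question does not show which table those columns belong to.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE debtors (account TEXT, company TEXT);
CREATE TABLE pospay (
    txno INTEGER, acctnumber TEXT, amount REAL, paymnttypid TEXT
);
-- Hypothetical debtors and payments; one payment has another type.
INSERT INTO debtors VALUES ('D1', 'JOHN'), ('D2', 'MARY');
INSERT INTO pospay VALUES
    (1, 'D1', 100.0, '7'),
    (2, 'D1',  50.0, '7'),
    (3, 'D2',  75.0, '7'),
    (4, 'D2',  20.0, '1');
""")

# COMPANY is the only grouped column; every other selected value is an
# aggregate, so each debtor appears exactly once.
totals = conn.execute("""
    SELECT d.company,
           SUM(p.amount) AS total_amount,
           COUNT(p.txno) AS tx_count
    FROM pospay p
    INNER JOIN debtors d ON p.acctnumber = d.account
    WHERE p.paymnttypid = '7'
    GROUP BY d.company
    ORDER BY d.company
""").fetchall()

for company, total_amount, tx_count in totals:
    print(company, total_amount, tx_count)
```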

TERADATA: Aggregate across multiple tables

Consider the following query where aggregation happens across two tables: Sales and Promo and the aggregate values are again used in a calculation.
SELECT
sales.article_id,
avg((sales.euro_value - ZEROIFNULL(promo.euro_value)) / NULLIFZERO(sales.qty - ZEROIFNULL(promo.qty)))
FROM
( SELECT
sales.article_id,
sum(sales.euro_value),
sum(sales.qty)
from SALES_TABLE sales
where year >= 2011
group by article_id
) sales
LEFT OUTER JOIN
( SELECT
promo.article_id,
sum(promo.euro_value),
sum(promo.qty)
from PROMOTION_TABLE promo
where year >= 2011
group by article_id
) promo
ON sales.article_id = promo.article_id
GROUP BY sales.article_id;
Some notes on the query:
Both inner queries return a huge number of rows due to the large number of articles. Running explain on Teradata, the inner queries themselves take very little time, but the join takes a long time.
Assume primary key on article_id is present and both the tables are partitioned by year.
Left Outer Join because second table contains optional data.
So, can you suggest a better way of writing this query. Thanks for reading this far :)
Not really sure how the avg function got into the mix, so I'm removing it.
SELECT article_id,
(SUM(sales_value) - SUM(promo_value)) /
(SUM(sales_qty) - SUM(promo_qty))
FROM (
SELECT
article_id,
sum(euro_value) AS sales_value,
sum(qty) AS sales_qty,
0 AS promo_value,
0 AS promo_qty
from SALES_TABLE sales
where year >= 2011
group by article_id
UNION ALL
SELECT
article_id,
0 AS sales_value,
0 AS sales_qty,
sum(euro_value) AS promo_value,
sum(qty) AS promo_qty
from PROMOTION_TABLE promo
where year >= 2011
group by article_id
) AS comb
GROUP BY article_id;
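A small sketch of the UNION ALL rewrite, run on SQLite through Python's sqlite3 module as a stand-in for Teradata. The rows are hypothetical, and NULLIF(..., 0) is used here as a portable substitute for NULLIFZERO so that a zero quantity difference yields NULL rather than a division error; that guard is an addition, not part of the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales_table (
    article_id INTEGER, year INTEGER, euro_value REAL, qty INTEGER
);
CREATE TABLE promotion_table (
    article_id INTEGER, year INTEGER, euro_value REAL, qty INTEGER
);
-- Hypothetical rows; article 2 has no promotion data at all.
INSERT INTO sales_table VALUES
    (1, 2011, 100.0, 10),
    (1, 2012,  50.0,  5),
    (2, 2011,  80.0,  8),
    (1, 2010, 999.0, 99);
INSERT INTO promotion_table VALUES (1, 2011, 30.0, 5);
""")

# Each branch aggregates one table and pads the other table's columns
# with zeros; the outer GROUP BY then combines them without a join.
net_price = conn.execute("""
    SELECT article_id,
           (SUM(sales_value) - SUM(promo_value))
           / NULLIF(SUM(sales_qty) - SUM(promo_qty), 0) AS net_unit_value
    FROM (
        SELECT article_id,
               SUM(euro_value) AS sales_value, SUM(qty) AS sales_qty,
               0 AS promo_value, 0 AS promo_qty
        FROM sales_table
        WHERE year >= 2011
        GROUP BY article_id
        UNION ALL
        SELECT article_id,
               0 AS sales_value, 0 AS sales_qty,
               SUM(euro_value) AS promo_value, SUM(qty) AS promo_qty
        FROM promotion_table
        WHERE year >= 2011
        GROUP BY article_id
    ) AS comb
    GROUP BY article_id
    ORDER BY article_id
""").fetchall()

for article_id, net_unit_value in net_price:
    print(article_id, net_unit_value)
```

Unlike the original left outer join, this form would also return an article that appears only in the promotion table; whether that matters depends on the data.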