Netezza - update one table with max data from another table - sql

I have a table in netezza that I need to update. The columns I am working with are
TABLE A: ID_NO, ENTRY_DATE, PRICE
TABLE B: ID_NO, START_DATE, END_DATE, PRICE
So an example of the data would look like this:
TABLE A
ID_NO  ENTRY_DATE  PRICE
123    2020-05-01
123    2020-08-15
TABLE B
ID_NO  START_DATE  END_DATE    PRICE
123    2019-01-01  2019-11-01  $7.64
123    2020-04-30  2020-05-02  $6.19
123    2020-04-15  2020-08-30  $2.19
I need to update the PRICE in TABLE A to be the max PRICE from TABLE B where a.ENTRY_DATE is between b.START_DATE and b.END_DATE. So the final table should look like this:
TABLE A
ID_NO  ENTRY_DATE  PRICE
123    2020-05-01  $6.19
123    2020-08-15  $2.19
This is what I have so far, but it just ends up taking the max price that fits either row rather than doing the calculation for each row:
update TABLE_A
set PRICE=(select max(b.PRICE)
from TABLE_B b
inner join TABLE_A a on a.ID_NO=b.ID_NO
where a.ENTRY_DATE between b.START_DATE and b.END_DATE)

I don't have access to Netezza, but a usual format would be to use a correlated sub-query.
That is, instead of including TABLE_A again in the query, you refer to the outer reference to TABLE_A...
update TABLE_A
set PRICE = (
    select max(b.PRICE)
    from TABLE_B b
    where TABLE_A.ID_NO = b.ID_NO
      and TABLE_A.ENTRY_DATE between b.START_DATE and b.END_DATE
)
In this way, the correlated-sub-query is essentially invoked once for each row in TABLE_A and that invocation uses the current row from TABLE_A as its parameters.
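To make the per-row behaviour concrete, here is a minimal sketch that runs the correlated update against the sample data from the question. It uses Python's built-in sqlite3 module as a stand-in engine (an assumption: Netezza itself is not involved, and its syntax may differ in details), stores prices as plain numbers without the `$`, and the helper name `update_prices_correlated` is made up for illustration.

```python
import sqlite3


def update_prices_correlated(conn):
    """Set each TABLE_A.PRICE to the max TABLE_B.PRICE whose range covers ENTRY_DATE."""
    conn.execute("""
        UPDATE TABLE_A
        SET PRICE = (
            SELECT MAX(b.PRICE)
            FROM TABLE_B b
            WHERE b.ID_NO = TABLE_A.ID_NO
              AND TABLE_A.ENTRY_DATE BETWEEN b.START_DATE AND b.END_DATE
        )
    """)


# SQLite in-memory database as a stand-in (assumption, not Netezza).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TABLE_A (ID_NO INTEGER, ENTRY_DATE TEXT, PRICE REAL);
    CREATE TABLE TABLE_B (ID_NO INTEGER, START_DATE TEXT, END_DATE TEXT, PRICE REAL);
    INSERT INTO TABLE_A VALUES (123, '2020-05-01', NULL), (123, '2020-08-15', NULL);
    INSERT INTO TABLE_B VALUES
        (123, '2019-01-01', '2019-11-01', 7.64),
        (123, '2020-04-30', '2020-05-02', 6.19),
        (123, '2020-04-15', '2020-08-30', 2.19);
""")

update_prices_correlated(conn)

# ISO-formatted date strings sort chronologically, so BETWEEN works on TEXT here.
rows = conn.execute(
    "SELECT ID_NO, ENTRY_DATE, PRICE FROM TABLE_A ORDER BY ENTRY_DATE"
).fetchall()
print(rows)
```

One thing to keep in mind with this form: a row in TABLE_A that has no covering range in TABLE_B gets its PRICE set to NULL, because the sub-query returns no value for it.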
An alternative could be...
update TABLE_A
set PRICE = revised.PRICE
from (
    select a.ID_NO, a.ENTRY_DATE, max(b.PRICE) AS PRICE
    from TABLE_B b
    inner join TABLE_A a on a.ID_NO = b.ID_NO
    where a.ENTRY_DATE between b.START_DATE and b.END_DATE
    group by a.ID_NO, a.ENTRY_DATE
) AS revised
where TABLE_A.ID_NO = revised.ID_NO
  and TABLE_A.ENTRY_DATE = revised.ENTRY_DATE

Related

Hive - Using Lateral View Explode with Joined Table

I am building some analysis and need to prep the data by joining two tables, then unpivoting the date fields to create one record for each "date_type". I have been trying to work with the lateral view explode(array()) function, but I can't figure out how to do this with columns from two separate tables. Any help would be appreciated; I am open to completely different methods.
TableA:
loan_number  app_date
123          07/09/2022
456          07/11/2022
TableB:
loan_number  funding_date  amount
123          08/13/2022    12000
456          08/18/2022    10000
Desired Result:
loan_number  date_type     date_value  amount
123          app_date      07/09/2022  12000
456          app_date      07/11/2022  10000
123          funding_date  08/13/2022  12000
456          funding_date  08/18/2022  10000
Here is some sample code related to the example above that I was trying to make work:
SELECT
b.loan_number,
b.amount,
Date_Value
FROM TableA as a
LEFT JOIN
TableB as b
ON a.loan_number=b.loan_number
LATERAL VIEW explode(array(to_date(a.app_date),to_date(b.funding_date)) Date_List AS Date_value
You don't need lateral view explode here; a union all is enough. Try the following:
with base_data as (
    select
        a.loan_number,
        a.app_date,
        b.funding_date,
        b.amount
    from tableA a
    join tableB b on a.loan_number = b.loan_number
)
select
    loan_number,
    'app_date' as date_type,
    app_date as date_value,
    amount
from base_data
union all
select
    loan_number,
    'funding_date' as date_type,
    funding_date as date_value,
    amount
from base_data
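As a quick check of the union all approach, the sketch below runs the same query shape on the sample rows from the question. It uses Python's sqlite3 module instead of Hive (an assumption made only so the example is runnable), keeps the dates as plain text, and adds an ORDER BY that is not in the answer so the row order is deterministic.

```python
import sqlite3

# SQLite in-memory database as a stand-in for Hive (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (loan_number INTEGER, app_date TEXT);
    CREATE TABLE tableB (loan_number INTEGER, funding_date TEXT, amount INTEGER);
    INSERT INTO tableA VALUES (123, '07/09/2022'), (456, '07/11/2022');
    INSERT INTO tableB VALUES (123, '08/13/2022', 12000), (456, '08/18/2022', 10000);
""")

# Join once, then emit one row per date column via UNION ALL.
unpivoted = conn.execute("""
    WITH base_data AS (
        SELECT a.loan_number, a.app_date, b.funding_date, b.amount
        FROM tableA a
        JOIN tableB b ON a.loan_number = b.loan_number
    )
    SELECT loan_number, 'app_date' AS date_type, app_date AS date_value, amount
    FROM base_data
    UNION ALL
    SELECT loan_number, 'funding_date' AS date_type, funding_date AS date_value, amount
    FROM base_data
    ORDER BY date_type, loan_number  -- added for a stable order; not in the answer
""").fetchall()

for row in unpivoted:
    print(row)
```

Note that the inner join drops loans that have no row in tableB; the question's attempt used a left join, which would keep them with a NULL funding_date and amount.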

How to get the MAX value of unique column in sql and aggregate other?

I want to get the row with the max 'date' for each unique 'id', without aggregating the other columns.
I tried this query:
(But it doesn't work because it modifies the other columns)
SELECT id,
MAX(num),
MAX(date),-- I just want the max of this column
MAX(product_name),
MAX(other_columns)
FROM TB
GROUP BY id
Table:
id num date product_name other_columns
123 0001 2021-12-01 exit 12315413
123 0002 2021-12-02 entry 65481328
333 0001 2021-12-03 entry 13848136
333 ASDV 2021-12-04 exit 1325165
Expected Result:
id num date product_name
123 0002 2021-12-02 entry
333 ASDV 2021-12-04 exit
How to do that?
A sub-query with an inner join can take care of this in a fairly DBMS-agnostic way.
SELECT
    t.id
    ,t.num
    ,t.date
    ,t.product_name
FROM tb AS t
INNER JOIN (
    SELECT
        id
        ,MAX(date) AS date
    FROM tb
    GROUP BY id
) AS s ON t.id = s.id AND t.date = s.date
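For illustration, this sketch runs the join-on-max-date pattern over the four sample rows from the question and selects the columns of the expected result. SQLite (via Python's sqlite3) is used as an assumed stand-in engine, and the alias `max_date` is introduced here only to keep the derived column distinct from the base column.

```python
import sqlite3

# SQLite in-memory database as a stand-in engine (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tb (id INTEGER, num TEXT, date TEXT, product_name TEXT, other_columns INTEGER);
    INSERT INTO tb VALUES
        (123, '0001', '2021-12-01', 'exit',  12315413),
        (123, '0002', '2021-12-02', 'entry', 65481328),
        (333, '0001', '2021-12-03', 'entry', 13848136),
        (333, 'ASDV', '2021-12-04', 'exit',  1325165);
""")

# The sub-query finds the latest date per id; the join pulls back the full row.
latest_rows = conn.execute("""
    SELECT t.id, t.num, t.date, t.product_name
    FROM tb AS t
    INNER JOIN (
        SELECT id, MAX(date) AS max_date
        FROM tb
        GROUP BY id
    ) AS s ON t.id = s.id AND t.date = s.max_date
    ORDER BY t.id
""").fetchall()

for row in latest_rows:
    print(row)
```

If two rows for the same id share the latest date, this pattern returns both of them.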

Remove Transaction using ID and Date range in Redshift SQL

I want to create a query in Redshift that will remove transaction items (Table A) based on their validity dates (Table B).
Let's call them Table A and Table B.
Table A contains all item transactions
Table A
Item_Code Date
1. I0001 2019-12-01
2. I0002 2019-12-02
3. I0001 2020-01-01
4. I0003 2020-01-01
Table B contains the item validity dates
Table B
Item_Code Valid_From Valid_To
1. I0001 2019-01-01 2019-12-31
2. I0002 2019-01-01 2019-12-31
3. I0003 2019-01-01 2019-12-31
and my expected output will be
Item_Code Date
1. I0001 2019-12-01
2. I0002 2019-12-02
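This returns the transactions that fall outside their item's validity range, i.e. the rows that will be removed:
select *
from TableA a
where not exists (
    select 1
    from TableB b
    where b.Item_Code = a.Item_Code
      and a.Date between b.Valid_From and b.Valid_To
)
Once you are sure that is what you want, delete those rows with:
delete from TableA
where not exists (
    select 1
    from TableB b
    where b.Item_Code = TableA.Item_Code
      and TableA.Date between b.Valid_From and b.Valid_To
)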
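The preview-then-delete idea can be sketched as follows, using the sample rows from the question. This runs on SQLite through Python's sqlite3 module rather than Redshift (an assumption for the sake of a runnable example), and it correlates on Item_Code as well as the date range, which the question's title implies. The constant `OUT_OF_RANGE` is a name made up here so the same predicate is shared by the SELECT and the DELETE.

```python
import sqlite3

# SQLite in-memory database as a stand-in for Redshift (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (Item_Code TEXT, Date TEXT);
    CREATE TABLE TableB (Item_Code TEXT, Valid_From TEXT, Valid_To TEXT);
    INSERT INTO TableA VALUES
        ('I0001', '2019-12-01'), ('I0002', '2019-12-02'),
        ('I0001', '2020-01-01'), ('I0003', '2020-01-01');
    INSERT INTO TableB VALUES
        ('I0001', '2019-01-01', '2019-12-31'),
        ('I0002', '2019-01-01', '2019-12-31'),
        ('I0003', '2019-01-01', '2019-12-31');
""")

# True for a transaction with no validity row covering its date.
OUT_OF_RANGE = """
    NOT EXISTS (
        SELECT 1
        FROM TableB b
        WHERE b.Item_Code = TableA.Item_Code
          AND TableA.Date BETWEEN b.Valid_From AND b.Valid_To
    )
"""

# Step 1: preview the rows that would be removed.
to_delete = conn.execute(
    "SELECT Item_Code, Date FROM TableA WHERE" + OUT_OF_RANGE + "ORDER BY Item_Code"
).fetchall()

# Step 2: delete them with the identical predicate.
conn.execute("DELETE FROM TableA WHERE" + OUT_OF_RANGE)

remaining = conn.execute(
    "SELECT Item_Code, Date FROM TableA ORDER BY Item_Code"
).fetchall()
print(to_delete)
print(remaining)
```

Sharing one predicate between the preview and the delete is a small safeguard: what you inspected is exactly what gets removed.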
I would join both tables and put the validity check in the join condition.
This returns only the valid items; you can reverse the validity condition to find which items are invalid.
SELECT a.Item_Code, a.Date
FROM TableA a
INNER JOIN TableB b
    ON a.Item_Code = b.Item_Code
   AND a.Date >= b.Valid_From
   AND a.Date <= b.Valid_To

How to Retrieve Maximum Value of Each Group? - SQL

There is a table tbl_products that contains data as shown below:
Id Name
----------
1 P1
2 P2
3 P3
4 P4
5 P5
6 P6
And another table tbl_inputs that contains data as shown below:
Id Product_Id Price Register_Date
----------------------------------------
1 1 10 2010-01-01
2 1 20 2010-10-11
3 1 30 2011-01-01
4 2 100 2010-01-01
5 2 200 2009-01-01
6 3 500 2011-01-01
7 3 270 2010-10-15
8 4 80 2010-01-01
9 4 50 2010-02-02
10 4 92 2011-01-01
I want to select all products(id, name, price, register_date) with maximum date in each group.
For Example:
Id Name Price Register_Date
----------------------------------------
3 P1 30 2011-01-01
4 P2 100 2010-01-01
6 P3 500 2011-01-01
10 P4 92 2011-01-01
select
id
,name
,code
,price
from tbl_products tp
cross apply (
select top 1 price
from tbl_inputs ti
where ti.product_id = tp.id
order by register_date desc
) tii
Although it is not the optimal way, you can do it like this:
;with gb as (
    select
        product_id
        ,max(register_date) as max_register_date
    from tbl_inputs
    group by product_id
)
select
    ti.id
    ,ti.product_id
    ,ti.price
    ,ti.register_date
from tbl_inputs ti
join gb
    on ti.product_id = gb.product_id
    and ti.register_date = gb.max_register_date
But as I said earlier, this is not the way to go in this case.
;with cte as
(
select t1.id, t1.name, t1.code, t2.price, t2.register_date,
row_number() over (partition by product_id order by register_date desc) rn
from tbl_products t1
join tbl_inputs t2
on t1.id = t2.product_id
)
select id, name, code, price, register_date
from cte
where rn = 1
Something like this..
select id, product_id, price, max(register_date)
from tbl_inputs
group by id, product_id, price
You can use the MAX function and the GROUP BY clause. If you only need results from the table tbl_inputs, you don't even need a join. Note that because price is part of the GROUP BY, this returns one row per product and price rather than one row per product.
select product_id, max(register_date), price
from tbl_inputs
group by product_id, price
If you need fields from tbl_products, you have to use a join.
select p.name, p.code, i.id, i.price, max(i.register_date)
from tbl_products p join tbl_inputs i on p.id = i.product_id
group by p.name, p.code, i.id, i.price
Try this:
SELECT T1.id, T1.product_id, T1.price, T1.register_date
FROM tbl_inputs T1 INNER JOIN
(
    SELECT product_id, MAX(register_date) AS Max_register_date
    FROM tbl_inputs
    GROUP BY product_id
) T2 ON (T1.product_id = T2.product_id AND T1.register_date = T2.Max_register_date)
This is, of course, assuming your dates are unique. If they are not, you need to add the DISTINCT keyword to the outer SELECT statement.
edit
Sorry, I didn't explain it very well. Your dates can be duplicated; that is not a problem as long as they are unique per product id. If you can have duplicated dates per product id, then you will get more than one row per product from the select statement I suggested, and you will have to find a way to reduce it to one row per product.
For example:
If you have records like these (where the latest date for a product appears more than once in your table, with different prices)
id | product_Id | price | register_date
--------------------------------------------
1 | 1 | 10.00 | 01/01/2000
2 | 1 | 20.00 | 01/01/2000
it will result in having both of these records as outcome.
However, if the register_date is unique per product id, then you will get only one result for each product id.
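The tie case described above can be reproduced with a small sketch. It runs on SQLite through Python's sqlite3 module (an assumed stand-in engine; window functions need SQLite 3.25 or newer), writes the dates in ISO form, and shows one possible way to reduce ties to a single row: ROW_NUMBER with the row id as a tie-breaker. The choice of "highest id wins" is an assumption made for illustration, not something the question specifies.

```python
import sqlite3

# SQLite in-memory database as a stand-in engine (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_inputs (id INTEGER, product_id INTEGER, price REAL, register_date TEXT);
    INSERT INTO tbl_inputs VALUES
        (1, 1, 10.00, '2000-01-01'),
        (2, 1, 20.00, '2000-01-01');
""")

# Join on the max date: both rows share the latest date, so both come back.
joined = conn.execute("""
    SELECT T1.id, T1.product_id, T1.price, T1.register_date
    FROM tbl_inputs T1
    INNER JOIN (
        SELECT product_id, MAX(register_date) AS max_register_date
        FROM tbl_inputs
        GROUP BY product_id
    ) T2 ON T1.product_id = T2.product_id
        AND T1.register_date = T2.max_register_date
    ORDER BY T1.id
""").fetchall()

# ROW_NUMBER with a tie-breaker (highest id wins: an assumption) keeps one row.
ranked = conn.execute("""
    SELECT id, product_id, price, register_date
    FROM (
        SELECT i.*,
               ROW_NUMBER() OVER (
                   PARTITION BY product_id
                   ORDER BY register_date DESC, id DESC
               ) AS rn
        FROM tbl_inputs i
    )
    WHERE rn = 1
""").fetchall()

print(len(joined), len(ranked))
```

Which row should survive a tie is a business decision; the ORDER BY inside the window is where that rule gets encoded.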

Filling in for missing latest data with last available data

I have two tables, one (market_cap_data) with month_end_date, id, market_cap fields:
month_end_date id market_cap
2012-12-31 123456 5000
2011-12-31 123456 4000
and a second table (start_date_table) with month_end_date, id, start_date fields:
month_end_date id start_date
2011-12-31 123456 1980-12-31
I want to combine the two tables, but the start_date_table data ends a year before the market_cap_data table. Where start_date_table has no data, I want to fill in the most recent start_date. For example, instead of an outer join result like:
month_end_date id market_cap start_date
2012-12-31 123456 5000 NULL
2011-12-31 123456 4000 1980-12-31
I want it to look like
month_end_date id market_cap start_date
2012-12-31 123456 5000 1980-12-31
2011-12-31 123456 4000 1980-12-31
Tried a bunch of different things but can't figure it out.
Any help would be appreciated!
SELECT
m.month_end_date,
m.id,
m.market_cap,
CASE
WHEN s.start_date IS NOT NULL THEN s.start_date
ELSE (SELECT MAX(s2.start_date) FROM start_date_table s2 WHERE s2.id = m.id)
END AS start_date
FROM market_cap_data m
LEFT JOIN start_date_table s
ON m.id = s.id
AND m.month_end_date = s.month_end_date
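Here is a small sketch that runs this query on the two sample tables from the question, to show the NULL being filled. It uses SQLite through Python's sqlite3 module as an assumed stand-in for whatever DBMS is actually in use, and adds an ORDER BY only to make the output order predictable.

```python
import sqlite3

# SQLite in-memory database as a stand-in engine (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE market_cap_data (month_end_date TEXT, id INTEGER, market_cap INTEGER);
    CREATE TABLE start_date_table (month_end_date TEXT, id INTEGER, start_date TEXT);
    INSERT INTO market_cap_data VALUES
        ('2012-12-31', 123456, 5000),
        ('2011-12-31', 123456, 4000);
    INSERT INTO start_date_table VALUES ('2011-12-31', 123456, '1980-12-31');
""")

# Left join keeps every market-cap row; the CASE falls back to the id's
# largest known start_date when the join found no matching month.
filled = conn.execute("""
    SELECT
        m.month_end_date,
        m.id,
        m.market_cap,
        CASE
            WHEN s.start_date IS NOT NULL THEN s.start_date
            ELSE (SELECT MAX(s2.start_date) FROM start_date_table s2 WHERE s2.id = m.id)
        END AS start_date
    FROM market_cap_data m
    LEFT JOIN start_date_table s
        ON m.id = s.id
       AND m.month_end_date = s.month_end_date
    ORDER BY m.month_end_date DESC  -- added for a stable order
""").fetchall()

for row in filled:
    print(row)
```

MAX(start_date) picks the largest start_date recorded for the id. That matches "the most recent start_date" only if start dates never decrease over time; otherwise the fallback would need to pick the value from the latest month_end_date instead.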
I think you would benefit from a CASE expression. This is untested pseudocode, as I don't have a fiddle to validate against:
-- pseudocode: exact function/procedure syntax depends on your DBMS
create function get_latest_date_from_table(table_name varchar(100)) returns Date
(
    return select max(date) from #table_name
)
create procedure modify_null_dates_for_marker
(
    max_date Date;
    max_date = get_latest_date_from_table('table');
    select
        foo,
        bar,
        CASE WHEN start_date IS NULL
             THEN max_date
             ELSE start_date END AS start_date
    FROM table
)
This should give you a way to fill in the null start dates.