HANA SQL Select table into variable and for loop issue - sql

I have a table in HANA that has the following columns:
Customer(varchar), Day(varchar), Item(varchar), Quantity (decimal), Cost(decimal)
My goal is to have a sql procedure that will duplicate the table values and append them to the existing table daily, while also updating the Day column values with the next day. So it will just be the same data over and over but new values for the Day column.
I believe this needs a select * from the table into a variable, then a loop through the variable in which the Day column values are pushed forward by 1 day, and then an insert of all the updated rows. I'm stuck at testing the selection of 1 column into a variable. The code below keeps failing with the error shown after it:
DO
BEGIN
DECLARE V1 VARCHAR(20);
SELECT 'ITEM' INTO V1 FROM "TABLE_NAME";
SELECT :V1 FROM "TABLE_NAME";
END;
DBTech JDBC: fetch returns more than requested number of rows: "TABLE_NAME"."(DO statement)": line 4 col 5 (at pos 43):

If you want to double your values, you don't need loops or variables.
The following duplicates all ITEMs with the current timestamp:
INSERT INTO "TABLE_NAME"
("ITEM", "MYDATETIME")
SELECT "ITEM", NOW ( )
FROM "TABLE_NAME";

The requirement seems to be:
For each customer copy all entries of a reference day (e.g., the most recent entries) to new entries for the day following the reference day.
Such a function could be supporting e.g., the daily "roll over" of inventory entries.
In its most basic form of requirements this can be achieved in plain SQL - no procedure code required.
create column table cust_items
(customer nvarchar(20)
, item_date date
, item nvarchar(20)
, quantity decimal (10,2)
, cost decimal (10,2)) ;
insert into cust_items values ('Aardvark Inc.', add_days(current_date, -3), 'Yellow Balls', 10, 23.23);
insert into cust_items values ('Aardvark Inc.', add_days(current_date, -3), 'Bird Food', 4.5, 47.11);
insert into cust_items values ('Aardvark Inc.', add_days(current_date, -3), 'Carrot Cake', 3, 08.15);
insert into cust_items values ('Wolf Ltd.', add_days(current_date, -3), 'Red Ballon', 1, 47.11);
insert into cust_items values ('Wolf Ltd.', add_days(current_date, -3), 'Black Bile', 2, 23.23);
insert into cust_items values ('Wolf Ltd.', add_days(current_date, -3), 'Carrot Cake', 3, 08.15);
insert into cust_items values ('Wolf Ltd.', add_days(current_date, -2), 'Red Ballon', 1, 47.11);
insert into cust_items values ('Wolf Ltd.', add_days(current_date, -2), 'Black Bile', 2, 23.23);
insert into cust_items values ('Wolf Ltd.', add_days(current_date, -2), 'Carrot Cake', 3, 08.15);
select * from cust_items
order by item_date, customer, item;
/*
CUSTOMER ITEM_DATE ITEM QUANTITY COST
Aardvark Inc. 6 Apr 2022 Bird Food 4.5 47.11
Aardvark Inc. 6 Apr 2022 Carrot Cake 3 8.15
Aardvark Inc. 6 Apr 2022 Yellow Balls 10 23.23
Wolf Ltd. 6 Apr 2022 Black Bile 2 23.23
Wolf Ltd. 6 Apr 2022 Carrot Cake 3 8.15
Wolf Ltd. 6 Apr 2022 Red Ballon 1 47.11
Wolf Ltd. 7 Apr 2022 Black Bile 2 23.23
Wolf Ltd. 7 Apr 2022 Carrot Cake 3 8.15
Wolf Ltd. 7 Apr 2022 Red Ballon 1 47.11
*/
We see that the two customers have individual entries: the most recent entries for "Wolf Ltd." are for April 7th, while those for "Aardvark Inc." are for April 6th.
The first part of the task is to find the entries for the most recent ITEM_DATE per customer. A simple join with a sub-query is sufficient here:
select co.customer, add_days(co.item_date, 1) as new_date, co.item, co.quantity, co.cost
from
cust_items co
join (select customer, max(item_date) max_date
from cust_items ci
group by customer) m_date
on (co.customer, co.item_date)
= (m_date.customer, m_date.max_date);
/* new date entries for each customer, based on previous most recent entry per customer
CUSTOMER NEW_DATE ITEM QUANTITY COST
Aardvark Inc. 7 Apr 2022 Yellow Balls 10 23.23
Aardvark Inc. 7 Apr 2022 Bird Food 4.5 47.11
Aardvark Inc. 7 Apr 2022 Carrot Cake 3 8.15
Wolf Ltd. 8 Apr 2022 Red Ballon 1 47.11
Wolf Ltd. 8 Apr 2022 Black Bile 2 23.23
Wolf Ltd. 8 Apr 2022 Carrot Cake 3 8.15
*/
Note that the expression add_days(co.item_date, 1) as new_date takes care of the "moving the day one day ahead" requirement.
The second part of the requirement is INSERTing the new entries into the same table:
insert into cust_items
(select co.customer, add_days(co.item_date, 1) as new_date, co.item, co.quantity, co.cost
from
cust_items co
join (select customer, max(item_date) max_date
from cust_items ci
group by customer) m_date
on (co.customer, co.item_date)
= (m_date.customer, m_date.max_date)
);
/* execute 3 times
Statement 'insert into cust_items (select co.customer, add_days(co.item_date, 1) as new_date, co.item, ...'
successfully executed in 25 ms 530 µs (server processing time: 14 ms 94 µs) - Rows Affected: 6
Statement 'insert into cust_items (select co.customer, add_days(co.item_date, 1) as new_date, co.item, ...'
successfully executed in 9 ms 288 µs (server processing time: 3 ms 900 µs) - Rows Affected: 6
Statement 'insert into cust_items (select co.customer, add_days(co.item_date, 1) as new_date, co.item, ...'
successfully executed in 11 ms 311 µs (server processing time: 4 ms 586 µs) - Rows Affected: 6
--> number of new records always the same as only the most recent values are copied
*/
The table content now looks like this:
/*
CUSTOMER ITEM_DATE ITEM QUANTITY COST
Aardvark Inc. 6 Apr 2022 Bird Food 4.5 47.11
Aardvark Inc. 6 Apr 2022 Carrot Cake 3 8.15
Aardvark Inc. 6 Apr 2022 Yellow Balls 10 23.23
Wolf Ltd. 6 Apr 2022 Black Bile 2 23.23
Wolf Ltd. 6 Apr 2022 Carrot Cake 3 8.15
Wolf Ltd. 6 Apr 2022 Red Ballon 1 47.11
Aardvark Inc. 7 Apr 2022 Bird Food 4.5 47.11
Aardvark Inc. 7 Apr 2022 Carrot Cake 3 8.15
Aardvark Inc. 7 Apr 2022 Yellow Balls 10 23.23
Wolf Ltd. 7 Apr 2022 Black Bile 2 23.23
Wolf Ltd. 7 Apr 2022 Carrot Cake 3 8.15
Wolf Ltd. 7 Apr 2022 Red Ballon 1 47.11
Aardvark Inc. 8 Apr 2022 Bird Food 4.5 47.11
Aardvark Inc. 8 Apr 2022 Carrot Cake 3 8.15
Aardvark Inc. 8 Apr 2022 Yellow Balls 10 23.23
Wolf Ltd. 8 Apr 2022 Black Bile 2 23.23
Wolf Ltd. 8 Apr 2022 Carrot Cake 3 8.15
Wolf Ltd. 8 Apr 2022 Red Ballon 1 47.11
Aardvark Inc. 9 Apr 2022 Bird Food 4.5 47.11
Aardvark Inc. 9 Apr 2022 Carrot Cake 3 8.15
Aardvark Inc. 9 Apr 2022 Yellow Balls 10 23.23
Wolf Ltd. 9 Apr 2022 Black Bile 2 23.23
Wolf Ltd. 9 Apr 2022 Carrot Cake 3 8.15
Wolf Ltd. 9 Apr 2022 Red Ballon 1 47.11
Wolf Ltd. 10 Apr 2022 Black Bile 2 23.23
Wolf Ltd. 10 Apr 2022 Carrot Cake 3 8.15
Wolf Ltd. 10 Apr 2022 Red Ballon 1 47.11
--> only Wolf Ltd. has entries on 10/4 as it was the only one starting off with
values on 7/4.
*/
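The roll-over insert above can also be tried outside HANA. The following is only a sketch, assuming Python's built-in sqlite3 module as a stand-in database: SQLite has no add_days, so date(item_date, '+1 day') is swapped in, the row-value join is written as two plain conditions, and the roll_over helper and the fixed sample dates are made up for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for HANA; date(x, '+1 day') replaces add_days(x, 1).
ROLL_OVER_SQL = """
insert into cust_items
select co.customer, date(co.item_date, '+1 day'), co.item, co.quantity, co.cost
from cust_items co
join (select customer, max(item_date) as max_date
      from cust_items
      group by customer) m_date
  on co.customer = m_date.customer
 and co.item_date = m_date.max_date
"""


def roll_over(conn):
    """Copy each customer's most recent entries to the following day (hypothetical helper)."""
    cur = conn.execute(ROLL_OVER_SQL)
    conn.commit()
    return cur.rowcount


conn = sqlite3.connect(":memory:")
conn.execute(
    "create table cust_items "
    "(customer text, item_date text, item text, quantity real, cost real)"
)
conn.executemany(
    "insert into cust_items values (?, ?, ?, ?, ?)",
    [
        ("Aardvark Inc.", "2022-04-06", "Bird Food", 4.5, 47.11),
        ("Aardvark Inc.", "2022-04-06", "Carrot Cake", 3, 8.15),
        ("Wolf Ltd.", "2022-04-06", "Black Bile", 2, 23.23),
        ("Wolf Ltd.", "2022-04-07", "Black Bile", 2, 23.23),
    ],
)

inserted = roll_over(conn)
# Most recent date per customer after one roll-over.
latest = dict(
    conn.execute("select customer, max(item_date) from cust_items group by customer")
)
total = conn.execute("select count(*) from cust_items").fetchone()[0]
print(inserted, latest, total)
```

Calling roll_over once per day (for example from a scheduler) mirrors the daily roll-over described above; each call only copies the rows of the latest day per customer.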

Related

sql - How To Remove All Rows After 4th Occurence of Column Combination in postgresql

I have a sql query that results in a table similar to the following after grouping by name, quarter, year and ordering by year DESC, quarter DESC:
name    count  quarter  year
orange  22     4        2022
apple   1      4        2022
banana  123    3        2022
pie     93     2        2022
apple   12     2        2022
orange  0      1        2022
apple   900    4        2021
...     ...    ...      ...
I want to remove any rows that come after the 4th unique combination of quarter and year is reached (for the table above this would be any rows after the last combination of quarter 1, year 2022), like so:
name    count  quarter  year
orange  22     4        2022
apple   1      4        2022
banana  123    3        2022
pie     93     2        2022
apple   12     2        2022
orange  0      1        2022
I am using Postgres 6.10.
If the next year were reached, it would still need to work with the quarter at the top being 1 and the year 2023.
select name
,count
,quarter
,year
from
(
select *
,dense_rank() over(order by year desc, quarter desc) as dns_rnk
from t
) t
where dns_rnk <= 4
name    count  quarter  year
orange  22     4        2022
apple   1      4        2022
banana  123    3        2022
pie     93     2        2022
apple   12     2        2022
orange  0      1        2022
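To see why dense_rank() fits here, a small sketch follows. It assumes Python's built-in sqlite3 module (with SQLite 3.25 or newer for window functions) instead of Postgres; the table name t and the sample rows are taken from the question, and the alias ranked is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('create table t (name text, "count" integer, quarter integer, year integer)')
conn.executemany(
    "insert into t values (?, ?, ?, ?)",
    [
        ("orange", 22, 4, 2022),
        ("apple", 1, 4, 2022),
        ("banana", 123, 3, 2022),
        ("pie", 93, 2, 2022),
        ("apple", 12, 2, 2022),
        ("orange", 0, 1, 2022),
        ("apple", 900, 4, 2021),
    ],
)

# dense_rank() gives every (year, quarter) combination one rank without gaps,
# so "rank <= 4" keeps exactly the four most recent combinations.
rows = conn.execute(
    """
    select name, "count", quarter, year
    from (
        select *, dense_rank() over (order by year desc, quarter desc) as dns_rnk
        from t
    ) ranked
    where dns_rnk <= 4
    order by year desc, quarter desc
    """
).fetchall()

for row in rows:
    print(row)
```

Because the rank is computed from the data itself, the same query keeps working once quarter 1 of 2023 appears at the top.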

How to shuffle the outer index randomly and inner index in a different random order in a multi index dataframe

The following is some code to generate a sample dataframe:
import pandas as pd
fruits=pd.DataFrame()
fruits['month']=['jan','feb','feb','march','jan','april','april','june','march','march','june','april']
fruits['fruit']=['apple','orange','pear','orange','apple','pear','cherry','pear','orange','cherry','apple','cherry']
ind=fruits.index
ind_mnth=fruits['month'].values
fruits['price']=[30,20,40,25,30 ,45,60,45,25,55,37,60]
fruits_grp = fruits.set_index([ind_mnth, ind],drop=False)
How can I shuffle the outer index randomly and inner index in a different random order in this multi-index data frame?
Assuming this dataframe with MultiIndex as input:
          month   fruit  price
jan   0     jan   apple     30
feb   1     feb  orange     20
      2     feb    pear     40
march 3   march  orange     25
jan   4     jan   apple     30
april 5   april    pear     45
      6   april  cherry     60
june  7    june    pear     45
march 8   march  orange     25
      9   march  cherry     55
june  10   june   apple     37
april 11  april  cherry     60
First shuffle the whole DataFrame, then regroup the months by indexing on a random order:
import numpy as np
np.random.seed(0)
idx0 = np.unique(fruits_grp.index.get_level_values(0))
np.random.shuffle(idx0)
fruits_grp.sample(frac=1).loc[idx0]
output:
          month   fruit  price
jan   0     jan   apple     30
      4     jan   apple     30
april 6   april  cherry     60
      5   april    pear     45
      11  april  cherry     60
feb   1     feb  orange     20
      2     feb    pear     40
june  10   june   apple     37
      7    june    pear     45
march 8   march  orange     25
      9   march  cherry     55
      3   march  orange     25
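The same two-step idea (shuffle all rows, then regroup by a random month order) can be written as a self-contained script. This is a sketch of a variant, not the exact code above: it regroups with pd.concat and a boolean filter on the month column instead of .loc, and it uses a seeded generator only so that runs are repeatable. The names rng, shuffled, months and result are made up.

```python
import numpy as np
import pandas as pd

fruits = pd.DataFrame({
    "month": ["jan", "feb", "feb", "march", "jan", "april",
              "april", "june", "march", "march", "june", "april"],
    "fruit": ["apple", "orange", "pear", "orange", "apple", "pear",
              "cherry", "pear", "orange", "cherry", "apple", "cherry"],
    "price": [30, 20, 40, 25, 30, 45, 60, 45, 25, 55, 37, 60],
})
fruits_grp = fruits.set_index([fruits["month"].values, fruits.index], drop=False)

rng = np.random.default_rng(0)  # seeded only to make the sketch repeatable

# Step 1: shuffle every row, which randomises the inner index order.
shuffled = fruits_grp.sample(frac=1, random_state=0)

# Step 2: draw an independent random order for the outer index labels.
months = rng.permutation(fruits_grp.index.get_level_values(0).unique().to_numpy())

# Step 3: stitch the shuffled rows back together, one month block at a time.
result = pd.concat([shuffled[shuffled["month"] == m] for m in months])
print(result)
```

Each month stays in one contiguous block, while the order of the blocks and the order of the rows inside them come from two separate random draws.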

How do I create a function with a pandas dataframe that uses two column headers as arguments to then print a preexisting 3rd column value?

My dataframe looks like this (3 columns). All I want to do is write a FUNCTION that takes values of the first two columns as inputs and returns the corresponding third column (GHG intensity) as the output. I want to be able to input any property name and year and get the corresponding GHG intensity value. Please help!
Property Name Data Year
467 GALLERY 37 2018
477 Navy Pier, Inc. 2016
1057 GALLERY 37 2015
1491 Navy Pier, Inc. 2015
1576 GALLERY 37 2016
2469 The Chicago Theatre 2016
3581 Navy Pier, Inc. 2014
4060 Ida Noyes Hall 2015
4231 Chicago Cultural Center 2015
4501 GALLERY 37 2017
5303 Harpo Studios 2015
5450 The Chicago Theatre 2015
5556 Chicago Cultural Center 2016
6275 MARTIN LUTHER KING COMMUNITY CENTER 2015
6409 MARTIN LUTHER KING COMMUNITY CENTER 2018
6665 Ida Noyes Hall 2017
7621 Ida Noyes Hall 2018
7668 MARTIN LUTHER KING COMMUNITY CENTER 2017
7792 The Chicago Theatre 2018
7819 Ida Noyes Hall 2016
8664 MARTIN LUTHER KING COMMUNITY CENTER 2016
8701 The Chicago Theatre 2017
9575 Chicago Cultural Center 2017
10066 Chicago Cultural Center 2018
GHG Intensity (kg CO2e/sq ft)
467 7.50
477 22.50
1057 8.30
1491 23.30
1576 7.40
2469 4.50
3581 17.68
4060 11.20
4231 13.70
4501 7.90
5303 18.70
5450 NaN
5556 10.30
6275 14.10
6409 12.70
6665 8.30
7621 8.40
7668 12.10
7792 4.40
7819 10.20
8664 12.90
8701 4.40
9575 9.30
10066 7.50
You basically need to filter the dataframe based on 2 columns. The syntax for this would be
df[(df[COLUMN_1_NAME]==PROPERTY_1_NAME) & (df[COLUMN_2_NAME]==PROPERTY_2_NAME)]
So in your case you would do the following (searching for 'GALLERY 37' and '2016'):
property_name = "GALLERY 37"
data_year = "2016"
ghg_intensity_df = df[(df['Property Name']==property_name) & (df['Data Year']==data_year)]
ghg_intensity = ghg_intensity_df['GHG Intensity (kg CO2e/sq ft)'].iloc[0]
print(ghg_intensity)
This assumes that for every Property Name and Data Year pair there exists one and only one value.
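Since the question asks for a function, the filter can be wrapped as in the sketch below. The function name ghg_intensity and the constant GHG_COL are made up, and the sample frame only holds a few rows copied from the question. It is assumed here that "Data Year" holds integers; if the column holds strings, pass "2016" instead of 2016.

```python
import pandas as pd

GHG_COL = "GHG Intensity (kg CO2e/sq ft)"


def ghg_intensity(df, property_name, data_year):
    """Return the GHG intensity for one property and year, or None if no row matches."""
    match = df[(df["Property Name"] == property_name) & (df["Data Year"] == data_year)]
    if match.empty:
        return None
    # If several rows match, the first one is returned.
    return match[GHG_COL].iloc[0]


# A few rows copied from the question; "Data Year" is assumed to be an integer column.
df = pd.DataFrame({
    "Property Name": ["GALLERY 37", "Navy Pier, Inc.", "GALLERY 37"],
    "Data Year": [2018, 2016, 2016],
    GHG_COL: [7.5, 22.5, 7.4],
})

print(ghg_intensity(df, "GALLERY 37", 2016))
```

Returning None for a missing pair avoids the IndexError that .iloc[0] raises on an empty selection.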

Merging old table with new table with different structure

I am using SQL Server 2012. I have two tables which I need to 'merge'. The two tables are called tblOld and tblNew.
tblOld has data from say 2012 to 2013
tblNew has data from 2013 onwards and has a different structure
The dates do not overlap between the tables.
Simple example of the tables:
Old table
t_date region sub_region sales
------------------------------------------
1 Jan 2012 US QR 2
1 Jan 2012 US NT 3
1 Jan 2012 EU QR 5
2 Jan 2012 US QR 4
2 Jan 2012 US NT 6
2 Jan 2012 EU QR 10
...
31 Dec 2013 US QR 8
31 Dec 2013 US NT 9
31 Dec 2013 EU QR 15
New table
t_date region sales
-----------------------------
1 Jan 2014 US 20
1 Jan 2014 EU 50
2 Jan 2014 US 40
2 Jan 2014 EU 100
...
31 Dec 2014 US 80
31 Dec 2014 EU 150
Result I'm looking for:
t_date US QR US NT EU
-------------------------------------
1 Jan 2012 2 3 5
2 Jan 2012 4 6 10
...
31 Dec 2013 8 9 15
1 Jan 2014 20 50
2 Jan 2014 40 100
...
31 Dec 2014 80 150
So I'm trying to create a query which will give me the results above although I'm not sure how to do this or if it can be done?
SELECT t_date,
SUM(CASE WHEN region='US' AND (sub_region='QR' OR sub_region IS NULL) THEN sales ELSE 0 END) 'US QR',
SUM(CASE WHEN region='US' AND sub_region='NT' THEN sales ELSE 0 END) 'US NT',
SUM(CASE WHEN region='EU' THEN sales ELSE 0 END) 'EU'
FROM (
SELECT t_date
,region
,sub_region
,sales
FROM tblOLD
UNION ALL
SELECT t_date
,region
,NULL
,sales
FROM tblNEW
) t
GROUP BY t_date
You are looking for a UNION of the two tables:
SELECT t_date
,region
,sub_region
,sales
FROM tblOLD
UNION ALL
SELECT t_date
,region
,NULL
,sales
FROM tblNEW
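The UNION ALL plus conditional aggregation approach from this thread can be checked on a tiny data set. The sketch below assumes Python's built-in sqlite3 module in place of SQL Server, ISO date strings instead of real date columns, and made-up column aliases us_qr, us_nt and eu; only two dates from the question are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table tblOld (t_date text, region text, sub_region text, sales integer);
    create table tblNew (t_date text, region text, sales integer);
    insert into tblOld values
        ('2012-01-01', 'US', 'QR', 2),
        ('2012-01-01', 'US', 'NT', 3),
        ('2012-01-01', 'EU', 'QR', 5);
    insert into tblNew values
        ('2014-01-01', 'US', 20),
        ('2014-01-01', 'EU', 50);
    """
)

# Both SELECTs list the columns in the same order; the new table has no
# sub_region, so NULL fills that position.
rows = conn.execute(
    """
    select t_date,
           sum(case when region = 'US' and (sub_region = 'QR' or sub_region is null)
                    then sales else 0 end) as us_qr,
           sum(case when region = 'US' and sub_region = 'NT'
                    then sales else 0 end) as us_nt,
           sum(case when region = 'EU' then sales else 0 end) as eu
    from (
        select t_date, region, sub_region, sales from tblOld
        union all
        select t_date, region, null, sales from tblNew
    ) t
    group by t_date
    order by t_date
    """
).fetchall()

print(rows)
```

Note that rows from the new table show 0 in the US NT column because of ELSE 0; dropping the ELSE branch would yield NULL there instead.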

How do i unpivot or flatten this table

I have the following table in my sqlserver database:
FiguresYear FiguresMonth Apple Orange Banana Grape
2012 Jan 10 12 15 20
2013 Jan 1 2 3 5
I want to run a query which returns the following format:
FiguresYear FiguresMonth FruitName FruitValue
2012 Jan Apple 10
2012 Jan Orange 12
2012 Jan Banana 15
2012 Jan Grape 20
2013 Jan Apple 1
2013 Jan Orange 2
2013 Jan Banana 3
2013 Jan Grape 5
I am trying to use the unpivot function but can't quite get it working. Does anyone know how to do this with or without unpivot?
Like this:
SELECT *
FROM tablename AS t
unpivot
(
FruitValue
FOR FruitName IN([Apple], [Orange], [Banana], [Grape])
) AS u;
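For the "without unpivot" part of the question, one common alternative is a UNION ALL with one SELECT per fruit column. The sketch below is a swapped-in technique, not the UNPIVOT answer itself, and it assumes Python's built-in sqlite3 module as a stand-in for SQL Server; the table name tablename follows the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table tablename (FiguresYear integer, FiguresMonth text, "
    "Apple integer, Orange integer, Banana integer, Grape integer)"
)
conn.executemany(
    "insert into tablename values (?, ?, ?, ?, ?, ?)",
    [(2012, "Jan", 10, 12, 15, 20), (2013, "Jan", 1, 2, 3, 5)],
)

# One SELECT per source column: the column name becomes a literal in FruitName,
# the column value becomes FruitValue.
rows = conn.execute(
    """
    select FiguresYear, FiguresMonth, 'Apple' as FruitName, Apple as FruitValue from tablename
    union all
    select FiguresYear, FiguresMonth, 'Orange', Orange from tablename
    union all
    select FiguresYear, FiguresMonth, 'Banana', Banana from tablename
    union all
    select FiguresYear, FiguresMonth, 'Grape', Grape from tablename
    """
).fetchall()

for row in rows:
    print(row)
```

This scans the table once per column, so UNPIVOT is usually the tidier choice where it is available.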