Create a new column for table B based on information from table A - sql

I have this problem. I want to create a report that keeps everything in table B, but adds another column from table A (QtyRecv).
Condition: If RunningTotalQtyUsed (from table B) < QtyRecv, take that QtyRecv for the new column.
For example, for item A1, (RunningTotalQtyUsed) 55 < 100 (QtyRecv), -> ExpectedQtyRecv = 100.
But if RunningTotalQtyUsed exceeds QtyRecv, we take the next QtyRecv to cover that used quantity.
For example, 101 > 100, -> ExpectedQtyRecv = 138.
149 (RunningTotalQtyUsed) is still > 100 but <= (100 + 138) -> get 138.
250 > (100 + 138) but <= (100 + 138 + 121) -> get 121.
The same logic applies to item A2.
If total QtyRecv = 6 + 4 + 10 = 20 but RunningTotalQtyUsed = 31, the result should be 99999 to flag that QtyRecv can't cover the quantity used.
Table A:
Item QtyRecv
A1 100
A1 138
A1 121
A2 6
A2 4
A2 10
Table B:
Item RunningTotalQtyUsed
A1 55
A1 101
A1 149
A1 250
A2 1
A2 5
A2 9
A2 19
A2 31
Expected result:
Item RunningTotalQtyUsed ExpectedQtyRecv
A1 55 100
A1 101 138
A1 149 138
A1 250 121
A2 1 6
A2 5 6
A2 9 4
A2 19 10
A2 31 99999
What I have tried:
SELECT b.*
FROM tableB b LEFT JOIN tableA a
ON b.item = a.item
item RunningTotalQtyUsed
A1 55
A1 55
A1 55
A1 101
A1 101
A1 101
A1 149
A1 149
A1 149
A1 250
A1 250
A1 250
A2 1
A2 1
A2 1
A2 5
A2 5
A2 5
A2 9
A2 9
A2 9
A2 19
A2 19
A2 19
A2 31
A2 31
A2 31
It doesn't keep the same number of rows as table B. How can I keep every row of table B but add the ExpectedQtyRecv from table A? Thank you so much for all the help!

SELECT B_TOTAL.ITEM, B_TOTAL.SUM_RunningTotalQtyUsed, A_TOTAL.SUM_QtyRecv
FROM
(
    SELECT B.ITEM, SUM(B.RunningTotalQtyUsed) AS SUM_RunningTotalQtyUsed
    FROM TABLE_B AS B
    GROUP BY B.ITEM
) B_TOTAL
LEFT JOIN
(
    SELECT A.ITEM, SUM(A.QtyRecv) AS SUM_QtyRecv
    FROM TABLE_A AS A
    GROUP BY A.ITEM
) A_TOTAL ON B_TOTAL.ITEM = A_TOTAL.ITEM
I can't be sure, but maybe you need something like the above?
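If per-item totals aren't enough and you actually want the row-level ExpectedQtyRecv from the expected result, here is a minimal sketch of one way to do it with window functions. It assumes tableA has (or can be given) an ordering column, called recv_seq here, so the receipts accumulate in a well-defined order, and that your database supports SUM() OVER; adjust the names to your schema.

WITH recv AS (
    SELECT a.Item,
           a.QtyRecv,
           -- running total of received quantity per item, in receipt order
           SUM(a.QtyRecv) OVER (PARTITION BY a.Item ORDER BY a.recv_seq) AS CumQtyRecv
    FROM tableA a
)
SELECT b.Item,
       b.RunningTotalQtyUsed,
       COALESCE(r.QtyRecv, 99999) AS ExpectedQtyRecv   -- 99999 when no receipt covers the usage
FROM tableB b
LEFT JOIN recv r
       ON r.Item = b.Item
      AND b.RunningTotalQtyUsed >  r.CumQtyRecv - r.QtyRecv   -- past the previous receipts
      AND b.RunningTotalQtyUsed <= r.CumQtyRecv               -- but covered by this one
ORDER BY b.Item, b.RunningTotalQtyUsed;

Because the (CumQtyRecv - QtyRecv, CumQtyRecv] ranges are disjoint, each tableB row matches at most one receipt, so the row count of table B is preserved; the LEFT JOIN plus COALESCE handles the A2 / 31 case with 99999.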

Related

SUM based on another column's sign in Oracle

I have two tables.
Table A has only the last-level id (leaf_id) along with sum_data.
id sum_data
A5 40
B3 -50
C2 90
Table B has hierarchy information for the ids and the sign to be applied to each id.
Id Z has three children: A2, B2 and C2.
id parent_id leaf_id level sign
Z NULL A5 1 +
A2 Z A5 2 +
A3 A2 A5 3 -
A4 A3 A5 4 +
A5 A4 A5 5 +
Z NULL B3 1 +
B2 Z B3 2 -
B3 B2 B3 3 +
Z NULL C2 1 +
C2 Z C2 2 ignore
I need to calculate sum_data for Z based on the sign operator; the calculation works like this:
id parent_id leaf_id level sign sum_data
Z NULL A5 1 + -40 --(rolled up sum_data from A2* sign =-40 * +)
A2 Z A5 2 + -40 --(rolled up sum_data from A3* sign =-40 * +)
A3 A2 A5 3 - -40 --(rolled up sum_data from A4* sign = 40 * -)
A4 A3 A5 4 + +40 --(rolled up sum_data from A5)
A5 A4 A5 5 + 40 --got this from Table A
Z NULL B3 1 + 50 --(rolled up sum_data from B2* sign = 50 * +)
B2 Z B3 2 - 50 --(rolled up sum_data from B3* sign = -50 * -)
B3 B2 B3 3 + -50 -- got this from Table A
Z NULL C2 1 + 0
C2 Z C2 2 ignore 0 -- (90 comes from Table A, as sign is ignore it is 0)
My output should be
id sum_data
Z 10 ( -40 from A5 hierarchy + 50 from B3 hierarchy + 0 from C2 hierarchy)
Can you please help me derive the sum_data in Oracle SQL?
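From the worked example, each row's sum_data looks like its child's value multiplied by that row's own sign, with 'ignore' zeroing out the whole branch, so Z's contribution from each leaf branch is the Table A value times the product of the signs along that branch. Here is a minimal sketch under that reading, using hypothetical table names table_a and table_b and assuming every row of table_b rolls up to the single root Z:

-- per branch: the leaf value and the sign counts along the branch
WITH branch AS (
    SELECT b.leaf_id,
           MAX(a.sum_data)                                    AS leaf_value,
           SUM(CASE WHEN b.sign = 'ignore' THEN 1 ELSE 0 END) AS n_ignore,
           SUM(CASE WHEN b.sign = '-'      THEN 1 ELSE 0 END) AS n_minus
    FROM table_b b
    JOIN table_a a ON a.id = b.leaf_id
    GROUP BY b.leaf_id
)
SELECT 'Z' AS id,
       SUM(CASE WHEN n_ignore > 0 THEN 0                      -- 'ignore' zeroes the branch
                ELSE leaf_value * POWER(-1, n_minus)          -- an odd number of '-' flips the sign
           END) AS sum_data
FROM branch;

For the sample this gives 40 * -1 + (-50) * -1 + 0 = 10. If the hierarchy can have several roots rather than a single Z, the rollup would have to follow the tree (CONNECT BY or a recursive subquery factoring clause) instead of collapsing each leaf_id branch in one GROUP BY.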

Groupby and smallest on more than one index [duplicate]

This question already has answers here:
Keep other columns when doing groupby
pandas groupby, then sort within groups
I have a data frame as follows
REG LOC DATE SUM
1 A1 19-07-20 10
1 B1 19-07-20 25
1 C1 19-07-20 20
2 A2 19-07-20 25
2 B2 19-07-20 30
2 C3 19-07-20 45
1 A1 20-07-20 15
1 B1 20-07-20 20
1 C1 20-07-20 30
2 A2 20-07-20 10
2 B2 20-07-20 15
2 C3 20-07-20 30
1 A1 21-07-20 25
1 B1 21-07-20 35
1 C1 21-07-20 45
2 A2 21-07-20 20
2 B2 21-07-20 30
2 C3 21-07-20 40
I want to find the LOC values with the 2 smallest SUM for each region and date combination. For example, for date 19-07-20 and region 1 the smallest are LOC A1 and C1, and for region 2 they are A2 and B2. I am able to do it for one level with the following code but not able to introduce another level:
df.groupby(level=0, group_keys=False).apply(lambda x: x.nsmallest(2, 'SUM'))
How can I do it for 2 levels, not just one, when I want the n smallest values per combination?
Thanks

SQL : SELECT all rows with maximum values and with WHERE condition also

I have a table which looks like this:
id data version rulecode
---------------------------
a1 1 100 x1
a2 1 100 x1
a1 1 200 x2
a4 2 500 x2
a7 2 200 x1
a6 2 500 x1
a7 2 500 x2
a8 2 150 x2
a9 3 120 x1
a10 3 130 x2
a10 3 120 x1
a12 3 130 x2
a13 3 130 x1
a14 3 110 x2
a15 3 110 x1
a16 4 220 x1
a17 4 230 x2
a18 4 240 x2
a19 4 240 x1
..........................
..........................
Now I want only the rows that have the maximum version for each of the data values 1, 2 and 4.
When I tried dense_rank(), I got rows for only one of the data values:
SELECT *
FROM (SELECT *,
             dense_rank() OVER (ORDER BY version desc) as col
      FROM public.table_name
      WHERE data in (1,2,4)) x
WHERE x.col = 1
Output:
id data version rulecode
---------------------------
a1 1 200 x2
My expected output:
id data version rulecode
a1 1 200 x2
a4 2 500 x2
a6 2 500 x1
a7 2 500 x2
a18 4 240 x2
a19 4 240 x1
Note: the values in the data column run into the millions.
Can someone help me out here to get the expected output?
You seem to want a PARTITION BY:
SELECT *
FROM (SELECT *,
             DENSE_RANK() OVER (PARTITION BY data ORDER BY version desc) as seqnum
      FROM public.table_name
      WHERE data in (1, 2, 4)
     ) x
WHERE x.seqnum = 1
Using analytic functions:
WITH cte AS (
    SELECT *, MAX(version) OVER (PARTITION BY data) max_version
    FROM yourTable
)
SELECT id, data, version, rulecode
FROM cte
WHERE version = max_version AND data IN (1, 2, 4);
Note that we could have also filtered the data values inside the CTE. I will leave it as is, for a general solution to your problem.
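If you do want the filter inside the CTE, a sketch of that variant (same logic; the per-partition maxima are unchanged because whole partitions are kept or dropped):

WITH cte AS (
    SELECT *, MAX(version) OVER (PARTITION BY data) max_version
    FROM yourTable
    WHERE data IN (1, 2, 4)   -- filter applied before the window max is computed
)
SELECT id, data, version, rulecode
FROM cte
WHERE version = max_version;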

sort_index() for a string index

My dataframe looks like this:
Method Dataset
A1 B2 10 20
B3 10 20
B1 10 20
B1 10 20
A2 B2 10 20
B1 10 20
A3 B9 10 20
B5 10 20
The Dataset index is a string. How can I sort just the second (Dataset) level of the index using a list like ["B1", "B2", "B3", "B4", "B5"]? I think I'm looking for sort_index() but with a custom ordering.

How to merge common indices when creating MultiIndex DataFrame

I have a DataFrame that looks like this:
Method Dataset foo bar
0 A1 B1 10 20
1 A1 B2 10 20
2 A1 B2 10 20
3 A2 B1 10 20
4 A3 B1 10 20
5 A1 B1 10 20
6 A2 B2 10 20
7 A3 B2 10 20
I'd like to use Method and Dataset columns to turn this into a MultiIndex DataFrame. So I tried doing:
df.set_index(["Method", "Dataset"], inplace=True)
df.sort_index(inplace=True)
Which gives:
Method Dataset
A1 B1 10 20
B1 10 20
B2 10 20
B2 10 20
A2 B1 10 20
B2 10 20
A3 B1 10 20
B2 10 20
This is almost what I want, but I was expecting common values in the Dataset index to also be merged under one value, similar to the Method index:
foo bar
Method Dataset
A1 B1 10 20
10 20
B2 10 20
10 20
A2 B1 10 20
B2 10 20
A3 B1 10 20
B2 10 20
How can I achieve that?
(This might not make a big difference to how you'd use a DataFrame but I'm trying to use the to_latex() method which is sensitive to these things)
I suggest you do this at the very end, right before you write the DataFrame with to_latex; otherwise you can run into issues with data processing.
We will make the duplicated entries in the last level the empty string and reconstruct the entire MultiIndex.
import pandas as pd
import numpy as np
df.index = pd.MultiIndex.from_arrays([
    df.index.get_level_values('Method'),
    np.where(df.index.duplicated(), '', df.index.get_level_values('Dataset'))
], names=['Method', 'Dataset'])
foo bar
Method Dataset
A1 B1 10 20
10 20
B2 10 20
10 20
A2 B1 10 20
B2 10 20
A3 B1 10 20
B2 10 20
If you want to make this a bit more flexible for any number of levels (even just a simple Index) we can use this function which will replace in the last level:
def white_out_index(idx):
    """idx : pd.MultiIndex or pd.Index"""
    i0 = [idx.get_level_values(i) for i in range(idx.nlevels - 1)]
    i0.append(np.where(idx.duplicated(), '', idx.get_level_values(-1)))
    return pd.MultiIndex.from_arrays(i0, names=idx.names)

df.index = white_out_index(df.index)