Find the cell with more common subcells - vba

I haven't seen any similar question to this one. Thank you in advance for your help!
I have these two columns:
Final Product - Subcomponent
A - 1
B - 1
C - 1
D - 1
A - 2
C - 2
B - 3
C - 3
A - 4
C - 4
D - 4
A - 5
B - 5
Final product A is made with subcomponents 1, 2, 4 and 5.
B is made with subcomponents 1, 3 and 5.
C is made with subcomponents 1, 2, 3 and 4.
D is made with subcomponents 1 and 4.
What I am looking for is an algorithm in vba or pivot tables that optimizes the final production in this way:
1 repeats 4 times.
2 repeats 2 times.
3 repeats 2 times.
4 repeats 3 times.
5 repeats 2 times.
First A should be made because it has the most common components. Then B should be made because there is just one component missing compared with A. Then C, because there is just one component to be replaced, and last D, because it shares its two components with C.
I know this is not easy at all... Thank you!

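Try this code. It collects the subcomponents of each product into one comma-separated list per product, which is a starting point for comparing the products:
Sub Test()
    Dim d As Object
    Dim v As Variant
    Dim m As Long
    Dim r As Long
    ' Read columns A:B down to the last used row of column A
    m = Range("A" & Rows.Count).End(xlUp).Row
    v = Range("A1:B" & m).Value
    Set d = CreateObject("Scripting.Dictionary")
    For r = 1 To m
        If d.Exists(v(r, 1)) Then
            ' Product already seen: append this subcomponent
            d(v(r, 1)) = d(v(r, 1)) & ", " & v(r, 2)
        Else
            d(v(r, 1)) = v(r, 2)
        End If
    Next r
    ' Products go to column E, their subcomponent lists to column F
    Range("E1").Resize(d.Count).Value = Application.Transpose(d.Keys)
    Range("F1").Resize(d.Count).Value = Application.Transpose(d.Items)
End Sub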
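The grouping above does not yet order the products. As a rough sketch of one possible ordering rule (my assumption, not something the question pins down): start with the product that has the most subcomponents, then keep picking the remaining product that shares the most subcomponents with the one just made, breaking ties alphabetically. It is written in Python for brevity; `group_components` and `greedy_order` are hypothetical helper names.

```python
from collections import defaultdict

# (product, subcomponent) rows copied from the question's table
PAIRS = [
    ("A", 1), ("B", 1), ("C", 1), ("D", 1),
    ("A", 2), ("C", 2),
    ("B", 3), ("C", 3),
    ("A", 4), ("C", 4), ("D", 4),
    ("A", 5), ("B", 5),
]


def group_components(pairs):
    """Collect the set of subcomponents used by each product."""
    groups = defaultdict(set)
    for product, component in pairs:
        groups[product].add(component)
    return dict(groups)


def greedy_order(groups):
    """Order products so that each one shares as many subcomponents
    as possible with the product made just before it (greedy heuristic,
    ties broken alphabetically)."""
    remaining = dict(groups)
    # Assumed starting rule: the product with the most subcomponents
    current = max(sorted(remaining), key=lambda p: len(remaining[p]))
    order = [current]
    current_set = remaining.pop(current)
    while remaining:
        # Pick the product with the largest overlap with the previous one
        nxt = max(sorted(remaining), key=lambda p: len(remaining[p] & current_set))
        order.append(nxt)
        current_set = remaining.pop(nxt)
    return order


if __name__ == "__main__":
    print(greedy_order(group_components(PAIRS)))  # ['A', 'C', 'B', 'D']
```

For the sample rows this rule gives A, C, B, D, which is not the order proposed in the question. The question's own criterion is not fully specified, so treat the scoring function as something to adapt.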

Iterating over a dataframe twice: which is the ideal way?

I am trying to create a dataframe for a Sankey chart in Power BI, which needs source and destination columns like this:
id  Source         Destination
1   Starting a     next point b
1   next point b   final point c
1   final point c  end
2   Starting a     next point b
2   next point b
3   Starting a     next point b
3   next point b   final point c
3   final point c  end
I have a dataframe like this:
ID  flow
1   Starting a
1   next point b
1   final point c
2   Starting a
2   next point b
3   Starting a
3   next point b
3   final point c
I tried iterating over the dataframe twice, as below:
for index, row in df.iterrows():
    for j, r in df.iterrows():
        if row['ID'] == r['ID']:
            if (index + 1 == j) & ("final point c" not in row['flow']):
                df['Destination'][index] = df['flow'][j]
            elif "final point c" in row['flow']:
                df['Destination'][index] = 'End of flow'
Because this iterates over the same dataframe twice, it takes a long time to process when there are many records.
Is there a better way to do this? I looked at similar questions but couldn't find anything that matches my case.
You could use groupby+shift and a bit of masking:
end = df['flow'].str.startswith('final point')
df2 = (df.assign(destination=df.groupby('ID')['flow'].shift(-1)
                               .mask(end, end.map({True: 'end'}))
                 )
         .rename(columns={'flow': 'source'})
       )
output:
   ID  source         destination
0   1  Starting a     next point b
1   1  next point b   final point c
2   1  final point c  end
3   2  Starting a     next point b
4   2  next point b   NaN
5   3  Starting a     next point b
6   3  next point b   final point c
7   3  final point c  end
Alternative with combine_first to fill the NaNs:
end = df['flow'].str.startswith('final point').map({True: 'end', False: ''})
df2 = (df.assign(destination=df.groupby('ID')['flow'].shift(-1).combine_first(end))
         .rename(columns={'flow': 'source'})
       )
output:
   ID  source         destination
0   1  Starting a     next point b
1   1  next point b   final point c
2   1  final point c  end
3   2  Starting a     next point b
4   2  next point b
5   3  Starting a     next point b
6   3  next point b   final point c
7   3  final point c  end
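To make it clearer what groupby plus shift(-1) is doing, here is a minimal plain-Python sketch of the same logic as a single pass over the rows. It is only an illustration, not a replacement for the pandas version. It assumes the rows of each ID are contiguous and already in flow order, and `add_destinations` is a hypothetical name.

```python
def add_destinations(rows, final_prefix="final point", end_label="end"):
    """For each (id, flow) row, look up the next row of the same id.

    Assumes rows of one id are adjacent and already in flow order.
    Returns (id, source, destination) tuples; destination is None
    when a flow stops before reaching a final point.
    """
    result = []
    for i, (id_, flow) in enumerate(rows):
        if flow.startswith(final_prefix):
            # A final point always ends the flow
            destination = end_label
        elif i + 1 < len(rows) and rows[i + 1][0] == id_:
            # Same id on the next row: that row's flow is the destination
            destination = rows[i + 1][1]
        else:
            # Last row of this id, but not a final point
            destination = None
        result.append((id_, flow, destination))
    return result


if __name__ == "__main__":
    sample = [
        (1, "Starting a"), (1, "next point b"), (1, "final point c"),
        (2, "Starting a"), (2, "next point b"),
    ]
    for row in add_destinations(sample):
        print(row)
```

Each row is visited once, which is why this approach scales so much better than the nested iterrows loop in the question.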

VBA running sum

Let's say I have 2 columns (the following table includes the result)
Product ID Price Average
1 4 5
1 4 5
1 7 5
2 3 3
2 3 3
3 9 9
I want to write VBA code that loops through the rows of Product IDs and fills the 3rd column with the average of the Prices for each ID.
I guess a For statement would work, but how do I define temp variables to store each ID?
Thanks!
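As Vasily and L.Dutch said, the AVERAGEIF() function is all you need. If you want to loop through all the cells, you can use a Do While loop like this:
Sub avg()
    ' Long avoids an overflow on sheets with more than 32,767 rows
    Dim i As Long
    i = 2
    ' Stop at the first empty cell in column A
    Do While Range("A" & i).Value <> ""
        ' R1C1 notation: average column 2 where column 1 equals this row's ID
        Range("C" & i).FormulaR1C1 = "=AVERAGEIF(C1,RC1,C2)"
        i = i + 1
    Loop
End Sub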
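On the question of "temp variables to store each ID": if you did want to do the averaging by hand instead of using AVERAGEIF, the usual trick is a dictionary keyed by ID that holds a running total and a count. A small sketch of that idea in Python (in VBA a Scripting.Dictionary would play the same role; `group_averages` is a hypothetical name):

```python
from collections import defaultdict


def group_averages(rows):
    """Return, for every (product_id, price) row, the average price
    of all rows that share its product_id."""
    totals = defaultdict(float)  # running sum of prices per ID
    counts = defaultdict(int)    # number of rows per ID
    for product_id, price in rows:
        totals[product_id] += price
        counts[product_id] += 1
    # Second pass: write each row's group average
    return [totals[pid] / counts[pid] for pid, _ in rows]


if __name__ == "__main__":
    data = [(1, 4), (1, 4), (1, 7), (2, 3), (2, 3), (3, 9)]
    print(group_averages(data))  # [5.0, 5.0, 5.0, 3.0, 3.0, 9.0]
```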

How to define a range using R for loop to create graphs

I have a dataset in excel that looks like this:
MA M1 M2 T1 T2 W1 W2 Th1 Th2 F1 F2
100 1 2 2 1 2 0 0 2 2 1
100 2 0 2 1 2 2 1 2 2 0
101 1 3 0 1 1 0 1 0 1 1
101 0 2 1 1 0 1 1 1 1 1
102 1 1 1 2 0 1 0 0 2 2
102 1 2 0 1 1 0 1 1 0 3
I am trying to create a column chart for each code (100,101,102) where each code will have 2 data sets and the horizontal values will be m1, m2, t1, etc.
So in the end I want 3 column graphs. I am trying to use a for loop to create these graphs in VBA, and here is what I have been trying:
Sub MA()
    Dim i As Integer
    Dim row1 As Integer, row2 As Integer
    For i = 1 To 6 Step 2
        Dim MAChart As Chart
        Set MAChart = ActiveSheet.Shapes.AddChart.Chart
        With MAChart
            row1 = i + 1
            row2 = i + 2
            .ChartType = xlColumnClustered
            .SetSourceData Source:=ActiveSheet.Range("Q& row1 & : & Z & row2")
        End With
    Next i
End Sub
I keep getting an "Application defined or object defined" error. I am having trouble defining the range of each chart since it changes based on i. I would love to find a clean way to make a series of charts using a for loop without redefining the range/dataset each time for each different chart. Does anyone know a good way to do this??
The line below treats the whole address as one string literal, so it does not evaluate to a valid range:
Range("Q& row1 & : & Z & row2")
Try the following instead; you have no need for row1 and row2. Note how the string is built up from pieces so that it forms a valid range address:
Range("Q" & 1 + i & ":Z" & 2 + i)
Used with your code it would look something like this. Note that I have also moved your Dim out of the loop: the variable does not need to be declared on every iteration, but it does need to be Set on every iteration.
Sub MA()
    Dim i As Integer
    Dim MAChart As Chart
    For i = 1 To 6 Step 2
        Set MAChart = ActiveSheet.Shapes.AddChart.Chart
        With MAChart
            .ChartType = xlColumnClustered
            ' Rows i + 1 and i + 2 are the two data rows for this chart
            .SetSourceData Source:=ActiveSheet.Range("Q" & i + 1 & ":Z" & i + 2)
        End With
    Next i
End Sub
You may also want to set the position of each chart, because the code above creates them all on top of each other.
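If it helps to see the string building on its own, here is a tiny Python sketch that produces the same addresses the loop would pass to Range for i = 1, 3, 5. The column letters Q and Z are taken from the question; `chart_ranges` is a hypothetical name.

```python
def chart_ranges(first_col="Q", last_col="Z", start=1, stop=6, step=2):
    """Build one address per chart: rows i + 1 to i + 2 for each loop value i
    (mirrors 'For i = 1 To 6 Step 2' in the VBA above)."""
    return [
        f"{first_col}{i + 1}:{last_col}{i + 2}"
        for i in range(start, stop + 1, step)
    ]


if __name__ == "__main__":
    print(chart_ranges())  # ['Q2:Z3', 'Q4:Z5', 'Q6:Z7']
```

The point is that only the row numbers are computed; the column letters and the colon are fixed pieces of text joined around them.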

how do I conditionally subtract in Excel?

I am trying to do the following, where columns A and B are data and C is the result:
A B C
1 5 (B1-A1)=4
2 3 (B2-A1)=2
3 5 (B3-A1)=4
4 7 (B4-A2)=5
5 4 (B5-A2)=2
6 9 (B6-A2)=7
.
.
.
.
How do I do this automatically in Excel or in Excel Visual Basic?
Sub sequence()
    Dim i As Integer
    Dim j As Integer
    ' j is the row in column A whose value is subtracted from the current block of three rows
    j = 2
    For i = 2 To 25 Step 3
        Cells(i, 3) = Cells(i, 2) - Cells(j, 1)
        Cells(i + 1, 3) = Cells(i + 1, 2) - Cells(j, 1)
        Cells(i + 2, 3) = Cells(i + 2, 2) - Cells(j, 1)
        j = j + 1
    Next i
End Sub
The VBA code above solves this, assuming the data starts in row 2.
Adjust the bounds of the For loop to your data; it currently covers rows 2 to 25.
A B C
1 4 =B2-A2
1 2 =B3-A3
1 3 =B4-A4
=A2+1 5 =B5-A5
=A3+1 6 =B6-A6
=A4+1 7 =B7-A7
=A5+1 6 =B8-A8
=A6+1 7 =B9-A9
=A7+1 9 =B10-A10
You can initialize the first 3 rows of column A with 1, then put =A2+1 in the 4th row and drag the formula down. You can then subtract column A from column B.
The only drawback is that column A will no longer be a sequence that increases by 1 on every row; instead it stays constant for three rows and then increases by 1.
OFFSET with ROW() is your friend for any repeating n-th row/column problem.
=B1-OFFSET(A$1,ROUNDUP(ROW()/3,0)-1,0), copied down column C.
1 5 4
2 3 2
3 5 4
4 7 5
5 4 2
6 9 7
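The index arithmetic in that formula, ROUNDUP(ROW()/3,0)-1, maps rows 1 to 3 to offset 0, rows 4 to 6 to offset 1, and so on. A short Python sketch of the same mapping, using the question's sample data (`conditional_subtract` is a hypothetical name, and rows are assumed to start at 1 as in the question):

```python
import math


def conditional_subtract(a, b, block=3):
    """C[row] = B[row] - A[ceil(row / block)], with 1-based rows,
    mirroring =B1-OFFSET(A$1,ROUNDUP(ROW()/3,0)-1,0)."""
    result = []
    for row in range(1, len(b) + 1):
        offset = math.ceil(row / block) - 1  # 0 for rows 1-3, 1 for rows 4-6, ...
        result.append(b[row - 1] - a[offset])
    return result


if __name__ == "__main__":
    col_a = [1, 2, 3, 4, 5, 6]
    col_b = [5, 3, 5, 7, 4, 9]
    print(conditional_subtract(col_a, col_b))  # [4, 2, 4, 5, 2, 7]
```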
You can use the $ in the function ($B5-$A1) and drag the cursor with the cell over the C column to the last element written.

Merge two excel files using a common column

I have two Excel sheets. I have to merge them so that the values in one are matched with the values in the other. For example:
First Excel    Second Excel
1  t           1  tes1
2  5           3  tes3
3  t           4  tes4
4  g
Notice that in the first column of the 2nd excel, 2 is missing, so I want the first excel to look like this,
1 tes1 t
2 5
3 tes3 t
4 tes4 g
I am new to Excel. Any help with this would be highly appreciated.
Sub left_join()
    Dim res As Variant
    Dim i As Long, lastUsedRowSh1 As Long, lastUsedRowSh2 As Long
    Dim cell As Range
    ' Sheet 3 receives the result: key, looked-up value, original value
    Sheets(3).Cells.ClearContents
    Sheets(1).Range("a:b").Copy Destination:=Sheets(3).Range("a1")
    Sheets(3).Columns(2).Insert Shift:=xlToRight
    ' Qualify Rows.Count with the sheet being measured, not the active sheet
    lastUsedRowSh1 = Sheets(1).Cells(Sheets(1).Rows.Count, "A").End(xlUp).Row
    lastUsedRowSh2 = Sheets(2).Cells(Sheets(2).Rows.Count, "A").End(xlUp).Row
    i = 1
    For Each cell In Sheets(1).Range("a1:a" & lastUsedRowSh1)
        ' Executing On Error also resets Err, so each lookup starts clean
        On Error Resume Next
        res = Application.WorksheetFunction.VLookup(cell.Value, Sheets(2).Range("a1:b" & lastUsedRowSh2), 2, 0)
        ' Write the match only when the key exists in sheet 2
        If Err.Number = 0 Then Sheets(3).Range("b" & i).Value = res
        i = i + 1
    Next cell
    On Error GoTo 0
End Sub
You can also solve this with a simple formula.
Foglio1
A B
1 t
2 5
3 t
4 g
Foglio2
A B
1 tes1
3 tes3
4 tes4
Foglio3
Copy the content of Foglio1 into Foglio3, then enter this formula
=IF(ISERROR(VLOOKUP(Foglio1!A1,Foglio2!$A$1:$B$3,2,0))=TRUE,"",VLOOKUP(Foglio1!A1,Foglio2!$A$1:$B$3,2,0))
and drag it down. Regards.
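Both answers implement what is usually called a left join: keep every row of the first sheet and fill in the value from the second sheet when the key exists there, otherwise leave it blank. A minimal Python sketch of that logic, using the sample values from the question (`left_join` here is a hypothetical helper, not related to the VBA macro of the same name):

```python
def left_join(first, second, missing=""):
    """Keep every (key, value) row of `first`; add the matching value
    from `second` (a list of (key, value) rows) or `missing` if absent.
    Output rows are (key, value_from_second, value_from_first)."""
    lookup = dict(second)  # key -> value, like the VLOOKUP table
    return [(key, lookup.get(key, missing), value) for key, value in first]


if __name__ == "__main__":
    first_sheet = [(1, "t"), (2, "5"), (3, "t"), (4, "g")]
    second_sheet = [(1, "tes1"), (3, "tes3"), (4, "tes4")]
    for row in left_join(first_sheet, second_sheet):
        print(row)
    # (1, 'tes1', 't')
    # (2, '', '5')
    # (3, 'tes3', 't')
    # (4, 'tes4', 'g')
```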