Excel formula or VBA: search for matching values and move every match found to a new column

I'm new to Excel and have only very basic knowledge of it. I hope someone will be able to help me, as I have been looking for a suitable formula or VBA code for a few days now with no luck.
I need a formula or VBA to find the same value in a range and, each time a matching value is found, move it to a new column.
For example:
The value in A1 is checked against all values in column H, and if the same value is found it is moved to B1. Then A1 is checked again against the values left in column H, and if the same value is found it is moved to C1. Every time a value equal to A1 is found in column H it is moved to the next new column (D1, E1, ...) until all values in column H have been checked and the matching ones moved. After this, the value in A2 is checked against all values in column H and all matching values are moved to B2, C2, D2, etc. This continues until all values have been moved from column H.
This is how the data looks before moving:
0   A    B    C    D    E    F    G    H
------------------------------------------
1   123                                123
2   256                                123
3   333                                123
4                                      123
5                                      123
6                                      256
7                                      256
8                                      333
9                                      333
10                                     333
11                                     333
12                                     333
13                                     333
After movement:
0   A    B    C    D    E    F    G    H
------------------------------------------
1   123  123  123  123  123  123
2   256  256  256
3   333  333  333  333  333  333  333
4
5
6
7
8
9
10
11

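Something like this:
Sub tester()
    Dim cell As Range, c As Range
    Dim i As Long
    With ActiveSheet
        ' Walk each lookup value in column A (down to the first blank cell).
        For Each cell In .Range("A1:" & .Range("A1").End(xlDown).Address)
            i = 1 ' column offset for the next match, restarting at column B for each row
            ' Compare the lookup value against every value in column H.
            For Each c In .Range("H1:" & .Range("H1").End(xlDown).Address)
                If cell.Value = c.Value Then
                    cell.Offset(0, i).Value = c.Value ' write the match into the next free column
                    i = i + 1
                End If
            Next c
        Next cell
    End With
End Sub
Note: This copies the matching values without deleting them from column H, and it does not clear the results from run to run.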
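To make the grouping logic of the macro easier to check outside Excel, here is a minimal Python sketch of the same idea. The function name spread_matches and the two lists are illustrative assumptions standing in for columns A and H; it is not a replacement for the VBA above.

```python
from collections import Counter


def spread_matches(keys, pool):
    """For each key, build a row: the key followed by every equal value in pool."""
    counts = Counter(pool)  # how many times each value occurs in the pool
    return [[key] + [key] * counts[key] for key in keys]


# Sample data taken from the question: column A and column H.
column_a = [123, 256, 333]
column_h = [123] * 5 + [256] * 2 + [333] * 6

for row in spread_matches(column_a, column_h):
    print(row)
```

Each printed row corresponds to one worksheet row in the "after" table: the first element is the column A value and the rest are the matches that would land in columns B onward.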

Related

Pandas create new column with specific row values from dict

I have a dataframe:
ID val
1 a
2 b
3 c
4 d
5 a
7 d
6 v
8 j
9 k
10 a
I have a dictionary as follows:
{'aa': 3, 'bb': 3, 'cc': 4}
In the dictionary, each numerical value indicates a number of records. The sum of the numerical values is equal to the number of rows in the data frame. In this example 3 + 3 + 4 = 10 and I have 10 rows in the data frame.
I am trying to split the data frame into groups of rows whose sizes are given by the dictionary, and to fill the matching key into a new column. The desired output is as follows:
ID  val  new_col
1   a    aa
2   b    aa
3   c    aa
4   d    bb
5   a    bb
6   v    bb
7   d    cc
8   j    cc
9   k    cc
10  a    cc
The order of the fill is not important as long as the count of records matches the count given in the dict. I am trying to solve this by iterating through the dict, but I am not able to isolate the specific number of records of the data frame for each new key-value pair.
I have also tried pd.cut, using the dict values as bins and the keys as column values. However, I am getting the error ValueError: bins must increase monotonically.
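import numpy as np
import pandas as pd
d = {'aa': 3, 'bb': 3, 'cc': 4}
# Repeat each key by its count, then flatten the pieces into one array of labels.
df['new_col'] = pd.Series([np.repeat(i, j) for i, j in d.items()]).explode().to_numpy()
df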
Out[64]:
ID val new_col
0 1 a aa
1 2 b aa
2 3 c aa
3 4 d bb
4 5 a bb
5 7 d bb
6 6 v cc
7 8 j cc
8 9 k cc
9 10 a cc
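The label expansion in the answer above can also be sketched without any library calls, which may make the idea easier to follow. The helper name expand_labels is an assumption introduced here for illustration only.

```python
def expand_labels(counts):
    """Return each key repeated by its count, following the dictionary's order."""
    return [key for key, n in counts.items() for _ in range(n)]


d = {'aa': 3, 'bb': 3, 'cc': 4}
labels = expand_labels(d)
print(labels)
```

Because the counts sum to the number of rows, the resulting list has one label per row, so it could be assigned to a new column in the same way as the array in the answer.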

Removing duplicates from many excel sheets

Is there a fast way to remove duplicate rows across two Excel spreadsheets? After searching, I can do it in VBA by comparing rows at the same position in the two sheets (e.g. row 1 with row 1). But I want to check whether a row from one sheet appears anywhere in the other. If exactly the same row exists in the other sheet, it should be removed.
Thanks in advance for any kind of help.
I can think of a workaround for this:
Create a column at the end of each row which is the concatenation of all the columns of that row. Let's say these are the two tables on the two Excel sheets:
sheet1
A B C D(Concat)
1 2 3 123
4 5 6 456
7 8 9 789
1 3 5 135
4 3 2 432
sheet2
A B C D(Concat)
2 3 4 234
1 1 1 111
1 2 3 123
2 2 2 222
4 5 6 456
We will now identify the duplicate rows based on the last concatenated column. Using the formula =IF(ISNUMBER(MATCH(D4,Sheet1!D:D,0)),"DUP","NONDUP") in the second sheet, we can identify the rows which are already present in sheet1 irrespective of the sequence of the row in sheet1 wrt sheet2.
Result on Sheet2 shows up as below:
A B C D E(Result)
2 3 4 234 NONDUP
1 1 1 111 NONDUP
1 2 3 123 DUP
2 2 2 222 NONDUP
4 5 6 456 DUP
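The same membership check can be sketched in Python to show the logic independently of the worksheet formula. This is a minimal sketch using the sample rows above; the function name mark_duplicates is an assumption. It compares whole rows as tuples instead of concatenated strings, which sidesteps one limitation of plain concatenation: without a delimiter, the rows (1, 23) and (12, 3) would both produce the key "123".

```python
def mark_duplicates(sheet1_rows, sheet2_rows):
    """Label each row of sheet2 as DUP if the identical row occurs anywhere in sheet1."""
    seen = {tuple(row) for row in sheet1_rows}  # set lookup is independent of row position
    return ["DUP" if tuple(row) in seen else "NONDUP" for row in sheet2_rows]


sheet1 = [(1, 2, 3), (4, 5, 6), (7, 8, 9), (1, 3, 5), (4, 3, 2)]
sheet2 = [(2, 3, 4), (1, 1, 1), (1, 2, 3), (2, 2, 2), (4, 5, 6)]

print(mark_duplicates(sheet1, sheet2))
```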

VLOOKUP returns 0 for blank cells

I have data in SheetW like this:
A B C D
1 EID Y/N
2 1001 n
3 1004 n
4 1005 n
5 1006 y
6 1009 n
7 1006 y
8 1007 n
9 1008 y
10 1010
I'm using VLOOKUP on another sheet, sheetYN, to fill in Y/N based on the table above, writing the formula from VBA.
My VBA code:
With Sheets("sheetYN")
    LastRow2 = .Cells(.Rows.Count, "A").End(xlUp).Row + 1
End With
For X = 2 To LastRow2
    wsSheet1.Cells(X, 2).Formula = _
        "=VLOOKUP(A" & X & ",'SheetW'!$A$2:$B$321,2,FALSE)"
Next X
The result looks like the table below, but rows with an empty EID are filled with 0. How can I display Y/N as blank instead of 0 when the EID is blank?
1 EID Y/N
2 0
3 1001 n
4 0
5 1004 n
6 0
7 1005 n
8 0
9 1006 y
10 0
11 1009 n
12 1006 y
13 1007 n
14 1008 y
15 1010 0
A simple IF statement should do the trick. Check whether the cell next to the formula is blank: if it is, return ""; if it is not, use your formula.
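As a rough illustration of what such a formula could look like, the Python sketch below only builds the formula text for a given row. The function name blank_safe_vlookup is hypothetical; the lookup range is the one from the question.

```python
def blank_safe_vlookup(row, lookup_range="'SheetW'!$A$2:$B$321"):
    """Build an Excel formula that yields "" when the lookup cell in column A is empty."""
    cell = f"A{row}"
    return f'=IF({cell}="","",VLOOKUP({cell},{lookup_range},2,FALSE))'


print(blank_safe_vlookup(2))
```

When the same text is written from VBA, every double quote inside the formula string has to be doubled.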
The easiest quick fix may be simply to accept the output 'as is' and format column B with something like ;;;General, effectively hiding the cell's contents.
An alternative quick fix would be to trap the errors arising from failed lookups (i.e. #N/A results rather than 0) within the existing formula, by changing it to:
wsSheet1.Cells(X, 2).Formula = "=iferror(VLOOKUP(A" & X & ",'SheetW'!$A$2:$B$321,2,FALSE),"""")"
so where there are no values to be looked up the formula displays nothing.

Hiding rows by using a criteria on specific cells is too slow

I am new to VBA. I have a sheet of data containing more than 5000 rows; the row count can increase or decrease, so it is not fixed. There are 20 columns, and the user enters a range in two input cells, for example 5 in cell A3 and 9 in cell A5.
What I need to do is filter row-wise: find the rows with values that fall in that range and hide the others. The complication is that only 5 of the 20 columns should be checked. When any one of the cell values in those 5 columns matches, the row should stay visible; all other rows should be hidden.
example:
A          B       C       D       E       F       G       H       I       J
           2014.Jan                2014.FEB                2014.MAR
           value1  value2  value3  value1  value2  value3  value1  value2  value3
Material1  5       10      15      20      25      30      35      40      45
Material2  7       8       9       10      11      12      13      14      15
Material3  2       3       4       5       6       7       8       9       10
Suppose the user enters the range 0 to 9. Then I need to check columns D, G and J and see whether each number falls between 0 and 9. If any one of them matches, the row should not be hidden.
The output should be:
A          B       C       D       E       F       G       H       I       J
           2014.Jan                2014.FEB                2014.MAR
           value1  value2  value3  value1  value2  value3  value1  value2  value3
Material2  7       8       9       10      11      12      13      14      15
Material3  2       3       4       5       6       7       8       9       10
I tried the code below as a sample of the solution, but it takes a long time because it loops over every cell.
Application.EnableEvents = False
LastRow = Cells(Cells.Rows.Count, "H").End(xlUp).Row
On Error Resume Next
For Each c In Range("M21:M" & LastRow)
    If c.Value >= Range("L10").Value And c.Value <= Range("N10").Value Then
        c.EntireRow.Hidden = True
    Else
        c.EntireRow.Hidden = False
    End If
Next
On Error GoTo 0
Application.EnableEvents = True
Is there a better way to do it?
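For reference, the selection rule described in the question (keep a row when at least one of the checked columns falls in the entered range) can be written down as a short Python sketch. It only illustrates the rule on the sample data; it is not a faster VBA implementation, and the function name visible_rows is an assumption.

```python
def visible_rows(rows, low, high, check_indexes):
    """Keep the rows in which at least one checked cell lies in [low, high]."""
    return [
        row for row in rows
        if any(low <= row[i] <= high for i in check_indexes)
    ]


rows = [
    ("Material1", 5, 10, 15, 20, 25, 30, 35, 40, 45),
    ("Material2", 7, 8, 9, 10, 11, 12, 13, 14, 15),
    ("Material3", 2, 3, 4, 5, 6, 7, 8, 9, 10),
]

# Columns D, G and J are indexes 3, 6 and 9 when column A is index 0.
kept = visible_rows(rows, 0, 9, (3, 6, 9))
print([row[0] for row in kept])
```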

SPSS using value of one cell to call another cell

Below is some data:
Test Day1 Day2 Score
A 1 2 100
B 1 3 62
C 3 4 90
D 2 4 20
E 4 5 80
I am trying to take the values from the columns Day1 and Day2 and use them as row numbers into the column Score. For example, for Test A I would like the sum of 100 and 62, because those are the values in the first and second rows of Score. For Test B I would like the sum of 100, 62 and 90.
Does anyone have any ideas on how to go about doing this? I am looking to use something similar to the indirect function in Excel? Thank You
The trick is to turn the variable "Score" into a row. I could not think of an easy way to avoid SAVE/GET, so there is room for improvement.
file handle tmp
/name = "C:\DATA\Temp".
***.
data list free /Test (a1) Day1 (f8) Day2 (f8) Score (f8).
begin data
A 1 2 100
B 1 3 62
C 3 4 90
D 2 4 20
E 4 5 80
end data.
comp f = 1.
var wid all (12).
save out "tmp\data.sav".
***.
get "tmp\data.sav"
/keep score.
flip.
comp f = 1.
match files
/file "tmp\data.sav"
/table *
/by f
/drop case_lbl.
comp stat = 0.
do rep var = var001 to var005
/k = 1 to 5.
if range(k, Day1, Day2) stat = sum(stat, var).
end rep.
list Test Day1 Day2 Score stat.
The result:
Test Day1 Day2 Score stat
A 1 2 100 162
B 1 3 62 252
C 3 4 90 110
D 2 4 20 172
E 4 5 80 100
Number of cases read: 5 Number of cases listed: 5
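As a cross-check of the numbers in the result listing, the same calculation can be sketched in a few lines of Python, treating Day1 and Day2 as 1-based, inclusive row numbers into Score. The function name range_sums is an assumption used only for this illustration.

```python
def range_sums(records):
    """records: (test, day1, day2, score) tuples; sum Score over rows day1..day2 inclusive."""
    scores = [score for _, _, _, score in records]
    # Row numbers are 1-based, so rows day1..day2 map to the slice [day1 - 1:day2].
    return {test: sum(scores[d1 - 1:d2]) for test, d1, d2, _ in records}


data = [
    ("A", 1, 2, 100),
    ("B", 1, 3, 62),
    ("C", 3, 4, 90),
    ("D", 2, 4, 20),
    ("E", 4, 5, 80),
]

print(range_sums(data))
```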