How can I compare two sets of data having two columns in excel? Picture below will elaborate - vba

Below are two sets of data, each with two columns. I want matching rows from the two sets to line up next to each other.

This is a manual solution with formulas and sorting.
Imagine the following data in columns A to E:
Enter the following formulas into columns G to K:
Column G: =IFERROR(IF(VLOOKUP(D:D,A:B,2,FALSE)=E:E,1,2),3)
Column H: =IF(G:G<3,D:D,"")
Column I: =IFERROR(VLOOKUP(H:H,A:B,2,FALSE),"")
Column J: =D:D
Column K: =IFERROR(VLOOKUP(J:J,D:E,2,FALSE),"")
Column G (sort by) now shows:
1 if part and quantity matched
2 if only part matched
3 if nothing matched
So if you now select the data in A3:K10 and sort by column G (sort by) in ascending order, the full matches are listed first, then the part-only matches, then the rows with no match.
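For anyone who wants to check the same 1/2/3 classification outside Excel, here is a minimal pandas sketch. The sample parts and quantities are made up for illustration, and the column names `part`, `qty` and `sort_by` are assumptions, not part of the original sheet. It also assumes each part appears only once in the left-hand set.

```python
import pandas as pd

# Hypothetical sample data (assumed for illustration only).
left = pd.DataFrame({"part": ["P1", "P2", "P3"], "qty": [10, 20, 30]})   # like columns A:B
right = pd.DataFrame({"part": ["P1", "P2", "P4"], "qty": [10, 25, 40]})  # like columns D:E

# Look up each right-hand part in the left-hand set (the VLOOKUP step).
# Parts that are not found come back as NaN.
left_qty = right["part"].map(left.set_index("part")["qty"])

def classify(found_qty, qty):
    """Return 1 if part and quantity match, 2 if only the part matches, 3 otherwise."""
    if pd.isna(found_qty):
        return 3
    return 1 if found_qty == qty else 2

right["sort_by"] = [classify(f, q) for f, q in zip(left_qty, right["qty"])]

# Sort so that full matches come first, then part-only matches, then unmatched rows.
result = right.sort_values("sort_by", kind="stable").reset_index(drop=True)
print(result)
```

The stable sort keeps the original row order inside each of the three groups, which is roughly what sorting by column G does in the sheet.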

How to sum rows in my pandas DataFrame with a specific condition?

Could anyone help me ?
I want to sum the values with the format:
print (...+....+)
for example:
a b
France 2
Italie 15
Croatie 7
I want to compute the sum for France and Croatie.
Thank you for your help!
One possible solution:
set column a as the index,
using loc select rows for the "wanted" values,
take column b,
sum the values found.
So the code can be:
result = df.set_index('a').loc[['France', 'Croatie']].b.sum()
Note the double square brackets. The outer pair is the indexing operator of loc.
The inner pair, together with what is inside it, is a Python list of the index values to select.
To subtract two sums (one for some set of countries and the second for another set),
you can run e.g.:
wrk = df.set_index('a').b
result = wrk.loc[['Italie', 'USA']].sum() - wrk.loc[['France', 'Croatie']].sum()
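To try this on the sample data from the question, a self-contained sketch could look like the following. The `isin` variant is an alternative I'm adding, not part of the answer above. It may be handy when some requested labels are missing from the data, because recent pandas versions raise a KeyError when `loc` is given a list containing a label that is not in the index.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({"a": ["France", "Italie", "Croatie"], "b": [2, 15, 7]})

# Approach from the answer: index by country, pick labels with loc, sum column b.
by_loc = df.set_index("a").loc[["France", "Croatie"], "b"].sum()

# Alternative with a boolean mask; labels that do not occur are simply ignored.
by_mask = df.loc[df["a"].isin(["France", "Croatie"]), "b"].sum()

print(by_loc, by_mask)  # 9 9
```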

How to change rows in pandas based on an attribute of the other rows

I have a dataframe with two columns: A (a continuous variable) and B (discrete, 1 or 0). The df is initially sorted by A.
I need to reorder the dataframe so that each set of X rows contains Y rows with value 1 in column B and (X-Y) rows with value 0 (when possible!). Within each set, variable A should be in descending order. X and Y are input by the user.
Example:
X=4, Y=3
Rows 0-11 are OK, since the sets (0-3), (4-7) and (8-11) each have 3 rows with 1 in column B and only one row with 0, and variable A is descending. However, rows 12-15 are not OK, since there are 2 rows with 1 (variable B) and two with 0. Row 17 would replace row 15 to make this set valid. There is no problem if the last rows have 0 in variable B, since no rows with value 1 are left.
The code should be general enough to run on dataframes with different number of rows.
Any ideas?
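One possible approach, offered as a rough sketch rather than a definitive answer: split the rows into the 1s and the 0s, keep each group in descending order of A, and build every block from the next Y ones and the next X-Y zeros. The function name `interleave_blocks` and the sample data are hypothetical. The sketch assumes that a block should contain exactly Y ones while both kinds of rows remain, and that it is acceptable to fill a block with whatever is left once one kind runs out.

```python
import pandas as pd

def interleave_blocks(df, x, y):
    """Greedy sketch: blocks of x rows holding y ones and x - y zeros where possible."""
    if x <= 0 or not 0 <= y <= x:
        raise ValueError("need x > 0 and 0 <= y <= x")
    ones = df[df["B"] == 1].sort_values("A", ascending=False)
    zeros = df[df["B"] == 0].sort_values("A", ascending=False)
    blocks = []
    i = j = 0  # how many ones / zeros have been consumed so far
    while i < len(ones) or j < len(zeros):
        take_ones = min(y, len(ones) - i)
        take_zeros = min(x - take_ones, len(zeros) - j)
        if take_ones + take_zeros < x:
            # Zeros ran out: top the block up with remaining ones, if any.
            take_ones = min(x - take_zeros, len(ones) - i)
        block = pd.concat([ones.iloc[i:i + take_ones], zeros.iloc[j:j + take_zeros]])
        # Inside a block, keep A in descending order.
        blocks.append(block.sort_values("A", ascending=False))
        i += take_ones
        j += take_zeros
    if not blocks:
        return df.iloc[0:0]
    return pd.concat(blocks, ignore_index=True)

# Hypothetical sample data, already sorted by A in descending order.
sample = pd.DataFrame({"A": [10, 9, 8, 7, 6, 5, 4, 3],
                       "B": [1, 0, 0, 1, 1, 0, 1, 0]})
out = interleave_blocks(sample, x=4, y=3)
print(out)
```

With X=4 and Y=3 the first block gets three 1s and one 0. The second block only has a single 1 left, so it is filled up with 0s. A is descending inside each block but not across the whole frame.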

Mark accumulated values on a QlikView column if condition is fulfilled

I have a table in Qlikview with 2 columns:
A B
a 10
b 45
c 30
d 15
Based on this table, I have a formula with full accumulation defined as:
SUM(B)/SUM(TOTAL B)
As a result,
A B D
b 45 45/100=0.45
c 30 75/100=0.75
d 15 90/100=0.90
a 10 100/100=1
My question is: how do I mark in colour the values in column A whose value in column D is <= 0.8?
The challenge is that D is defined with full accumulation, but if I reference D in a formula, it doesn't consider the full accumulation!
I tried defining a formula E=if(D>0.8,'Y','N'), but unfortunately it doesn't use the visible (accumulated) value of D; it uses D without accumulation. If this had worked, I would have hidden (not disabled) E and referenced it from the Text colour option of the dimension column. Any ideas? Thanks.
You can't get an expression column's value from within a dimension or its properties, because the expression columns rely on the dimensions provided. It would create an endless loop. Your options are:
Apply your background colour to the expression columns, not the dimensions. This would actually make more sense as the accumulated values would have the colour, not the dimension.
When loading this specific table, have QlikView create a new column that contains the accumulated values of B. This would mean, however, that the order of your chart-table would need to be fixed for the accumulations to make any sense.
Use aggregation to create a temporary table and accumulate the values using RangeSum(). Note that this will only accumulate properly if the table is ordered in ascending order of column A:
=IF(Aggr(RangeSum(Above(Sum(B),0,10)),A)/100>0.8,
rgb(0,0,0),
rgb(255,0,0)
)
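To make the accumulation arithmetic concrete outside QlikView, here is a small pandas sketch that reproduces column D from the question and flags the rows at or below 0.8. It is only meant as a cross-check of the numbers; the column name `within_80` is made up, and the rows are ordered by B descending as in the question's result table.

```python
import pandas as pd

# Table from the question.
table = pd.DataFrame({"A": ["a", "b", "c", "d"], "B": [10, 45, 30, 15]})

# Order by B descending, then divide the running total by the grand total.
ordered = table.sort_values("B", ascending=False).reset_index(drop=True)
ordered["D"] = ordered["B"].cumsum() / ordered["B"].sum()

# Rows whose accumulated share is at most 0.8 are the ones to colour.
ordered["within_80"] = ordered["D"] <= 0.8
print(ordered)
```

Here b (0.45) and c (0.75) are flagged, while d (0.90) and a (1.00) are not.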

Add values from a column when two other columns match

I have an ecology data table with about 12,000 rows. There are three columns: site, species, and value. I need to add up the values for each set of matching site and species - for example, all "red maple" values at "site A". I have the data sorted by site and species, so I can do it by hand, but it's slow going. The number of site/species matches varies, so I can't just add up the values in sets of three or anything.
Similar types of questions have talked about pivot tables, but none have needed to match two columns and add a third column, and I haven't been able to figure out how to extrapolate to my situation.
I'm reasonably comfortable coding and would like to do something that looks like this pseudocode, but I'm not clear on the syntax in VBA:
For each row x
    sum = sum + c(x)
    if a(x) <> a(x+1) or b(x) <> b(x+1) then
        d(x) = sum
        sum = 0
    end if
next
Any ideas?
In a PivotTable, put site in Row Labels and species in Column Labels (or vice versa) and Sum of value in Σ Values.
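If the table can be exported from Excel, the same grouping can also be sketched in pandas. This is an alternative to the PivotTable, not something from the question: the sample rows below are invented, and the column names `site`, `species` and `value` are assumed to match the three columns described.

```python
import pandas as pd

# Hypothetical sample rows (assumed for illustration only).
data = pd.DataFrame({
    "site": ["site A", "site A", "site A", "site B", "site B"],
    "species": ["red maple", "red maple", "white oak", "red maple", "red maple"],
    "value": [1.0, 2.5, 4.0, 3.0, 0.5],
})

# One row per site/species pair with the summed value (long layout).
totals = data.groupby(["site", "species"], as_index=False)["value"].sum()

# Same numbers in the PivotTable layout: sites as rows, species as columns.
wide = data.pivot_table(index="site", columns="species", values="value", aggfunc="sum")

print(totals)
print(wide)
```

Neither call needs the data to be sorted first, and the number of rows per site/species pair can vary freely.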

Excel: one column has duplicates of each value, I need to take averages of the corresponding two values from the other columns

Example:
column A column B
A 1
A 2
B 2
B 2
C 1
C 1
I would somehow like to get the following result:
column A column B
A 1.5
B 2
C 1
(which are averages of 1 and 2, 2 and 2 and 1 and 1)
How do I achieve that?
Thanks
If you're using Excel 2007 or above, you can also use the shorter AVERAGEIF function:
=AVERAGEIF($A$1:$A$6,D1,$B$1:$B$6)
Less typing, easier to read.
In D1:D3, type A, B, C. Then in E1, put this formula
=SUMIF($A$1:$A$6,D1,$B$1:$B$6)/COUNTIF($A$1:$A$6,D1)
and fill down to E3. If you want to replace the existing data, copy E1:E3 and paste-special-values over itself. Then delete A:C.
Alternatively, you can add headers to your data, say "Letter" and "Number". Then create a Pivot Table from your data. Put Letter in the rows section and Number in the Data section. Change your Data section from SUM to AVERAGE and you'll get the same result.
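For comparison, the same per-letter average can be sketched in pandas, using the example data from the question and the header names "Letter" and "Number" suggested above. This is just a cross-check of the expected numbers, not a replacement for the Excel answers.

```python
import pandas as pd

# Example data from the question, with the suggested headers.
df = pd.DataFrame({"Letter": list("AABBCC"), "Number": [1, 2, 2, 2, 1, 1]})

# Average of Number for each distinct Letter.
averages = df.groupby("Letter")["Number"].mean()
print(averages)
```

This gives 1.5 for A, 2.0 for B and 1.0 for C, matching the result asked for.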