Re-writing an AVERAGEIFS statement into a STDEV statement - vba

I am looking to rewrite my AVERAGEIFS formula as a STDEV formula. I currently have an AVERAGEIFS formula that takes the type in column "M" and the name in column "N", finds the rows in columns "A" and "B" with the same type and name, and averages the values in column "C" for those rows.
=AVERAGEIFS(C:C,A:A,M4,B:B,N4)
=AVERAGEIFS(C:C,A:A,M5,B:B,N5)
=AVERAGEIFS(C:C,A:A,M6,B:B,N6)
etc...
I would like to do the same with STDEV; however, the input requirements are different, as STDEVIFS doesn't exist.
=STDEV(BG:BG,C:C,BL4,K:K,BM4) will give the standard deviation of all the referenced columns. How can I fix this so that it works like my AVERAGEIFS formula, but for STDEV?
A       B           C      M       N           O
x       x           x      x       x           x
x       x           x      x       x           x
Type    Name        Mass   Type    Name        AVG Mass
Fruit   Apple       3      Fruit   Apple       4.25
Veggie  Tomato      5      Fruit   Orange      6.5
Veggie  Lettuce     1      Veggie  Tomato      6.333
Veggie  Tomato      7      Veggie  Lettuce     2.3333
Fruit   Orange      6      Fruit   Watermelon  5.5
Fruit   Apple       5      Veggie  Watermelon  4
Fruit   Watermelon  5
Veggie  Watermelon  3
Fruit   Apple       3
Fruit   Apple       6
Veggie  Watermelon  5
Fruit   Watermelon  6
Fruit   Orange      7
Veggie  Lettuce     3
Veggie  Lettuce     3
Veggie  Tomato      7
x = information not included

You would use an array form of STDEV with an IF() inside:
=STDEV(IF(($A$4:$A$19=M4)*($B$4:$B$19=N4),$C$4:$C$19))
Being an array formula, it must be confirmed with Ctrl-Shift-Enter instead of Enter when exiting edit mode. If done correctly, Excel will put {} around the formula.
Array formulas differ from normal formulas in that they evaluate every cell in the referenced range, so full column references should be avoided.
You can do as Ron stated in his answer and name the ranges to limit the references. Or you can use a table which will do the same or just limit the range as I have done here.
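To sanity-check the result outside Excel, the same filter-then-STDEV logic can be sketched in Python. This is only an illustration under stated assumptions: the helper name `stdev_ifs` and the in-memory `ROWS` list are hypothetical and not part of Excel. `statistics.stdev` computes the sample standard deviation, which is also what Excel's STDEV returns.

```python
from statistics import stdev

# (Type, Name, Mass) rows copied from columns A:C of the question's sample.
ROWS = [
    ("Fruit", "Apple", 3), ("Veggie", "Tomato", 5), ("Veggie", "Lettuce", 1),
    ("Veggie", "Tomato", 7), ("Fruit", "Orange", 6), ("Fruit", "Apple", 5),
    ("Fruit", "Watermelon", 5), ("Veggie", "Watermelon", 3), ("Fruit", "Apple", 3),
    ("Fruit", "Apple", 6), ("Veggie", "Watermelon", 5), ("Fruit", "Watermelon", 6),
    ("Fruit", "Orange", 7), ("Veggie", "Lettuce", 3), ("Veggie", "Lettuce", 3),
    ("Veggie", "Tomato", 7),
]


def stdev_ifs(rows, type_, name):
    """Hypothetical helper: sample standard deviation of Mass for matching rows."""
    # Keep only the masses whose Type and Name both match the criteria,
    # mirroring IF((Type=M4)*(Name=N4), Mass) in the array formula.
    masses = [mass for t, n, mass in rows if t == type_ and n == name]
    # A sample standard deviation needs at least two points; return None otherwise.
    return stdev(masses) if len(masses) >= 2 else None


print(stdev_ifs(ROWS, "Fruit", "Apple"))  # masses 3, 5, 3, 6
```

With fewer than two matching rows Excel would show an error, whereas this sketch returns None.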

You can use an IF function to return either the appropriate value or a Boolean FALSE, which STDEV will ignore. So, given your data sample, an example might be (entered with Ctrl+Shift+Enter as an array formula):
=STDEV(IF((Type=M5)*(Name=N5),Mass))
Type is the Named Range in Column A
Name is the Named Range in Column B
Mass is the Named Range in Column C
M5 contains the Type you are looking for
N5 contains the Name you are looking for
Note: You can use whole (or partial) column arguments instead of Named Ranges in the formulas above -- it would just be less efficient.

Related

Minimum number of Common Items in 2 Dynamic Stacks

I have a verbal algorithm question, so I have no code yet. The question is this: how can I create an algorithm for 2 dynamic stacks, each of which may or may not contain duplicate string items? For example, I have 3 breads, 4 lemons and 2 pens in the first stack, say s1, and 5 breads, 3 lemons and 5 pens in the second stack, say s2. I want to count the occurrences of each item in each stack and print, for each item, the smaller of its two counts, for example:
bread --> 3
lemon --> 3
pen --> 2
How can I traverse 2 stacks and print the number of duplicated occurrences until the end of stacks? If you are confused about anything, I can edit my question depending on your confusion. Thanks.
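One possible way to approach this, sketched in Python under the assumption that each stack can be read as a plain list of strings: count the items in each stack, then keep the smaller count for every item that appears in both. The function name `common_counts` is hypothetical.

```python
from collections import Counter


def common_counts(stack1, stack2):
    """For each item present in both stacks, return the smaller of its two counts."""
    # Counter & Counter keeps min(count1, count2) per item and drops
    # items that are missing from either side.
    return dict(Counter(stack1) & Counter(stack2))


# The example from the question: 3/4/2 items in s1, 5/3/5 items in s2.
s1 = ["bread"] * 3 + ["lemon"] * 4 + ["pen"] * 2
s2 = ["bread"] * 5 + ["lemon"] * 3 + ["pen"] * 5

for item, count in common_counts(s1, s2).items():
    print(f"{item} --> {count}")
```

Each stack is traversed once to build its counts, so the work is linear in the total number of items.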

Compare multiple values based on cell Value

I have 3 datasets.
The Master dataset has
A B C D
11 T Jim India
12 U Mary UK
13 V Bob US
14 P Peter India
India dataset
A B H K
10 11 T Jim
10 13 0 Krestel
10 14 P Peter
10 15 L Robert
If column D contains India, then the details in columns A, B and C should match columns B, H and K respectively in the India dataset. (The combination of columns A, B and C should be present in the India dataset; if not, the row should be highlighted or a comment added in the last column of the Master dataset.)
I have been doing this manually by adding several helper columns in all the datasets using concatenation and then using vlookup.
Is it possible to automate this process using vba?
Any help will be appreciated.
Actually, I think that you can achieve this through spreadsheet functions alone, without the need of VBA. Check the usage of the function VLOOKUP.
The idea would be to deploy a formula in, say, column "E" of the Master dataset that would check for an entry in the relevant country dataset matching the values of A, B and C. You will need to build the reference to the range VLOOKUP uses taking into account the country name.
Hope this serves you as a good guide.
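The matching logic described above (look up the combination of Master columns A, B and C among columns B, H and K of the country dataset) can be sketched outside the spreadsheet as follows. This is an illustration only: the tuples, the status strings and the function name `check_master` are assumptions, not part of the question.

```python
# Master rows: (A, B, C, D); India rows: (A, B, H, K), copied from the question.
master = [
    (11, "T", "Jim", "India"),
    (12, "U", "Mary", "UK"),
    (13, "V", "Bob", "US"),
    (14, "P", "Peter", "India"),
]
india = [
    (10, 11, "T", "Jim"),
    (10, 13, "0", "Krestel"),
    (10, 14, "P", "Peter"),
    (10, 15, "L", "Robert"),
]


def check_master(master_rows, country_rows, country):
    """Return one status per master row: 'match', 'missing' or 'not checked'."""
    # Build the lookup keys from columns B, H and K of the country dataset.
    keys = {(b, h, k) for _a, b, h, k in country_rows}
    statuses = []
    for a, b, c, d in master_rows:
        if d != country:
            statuses.append("not checked")  # row belongs to another country
        elif (a, b, c) in keys:
            statuses.append("match")
        else:
            statuses.append("missing")  # candidate for highlighting or a comment
    return statuses


print(check_master(master, india, "India"))
```

Using the combined key plays the same role as the concatenated helper column that the question describes.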

How to find conditional cumulative sums in an excel table using VBA macro

Let's say I have two columns.
3.5463 11
4.5592 12
1.6993 111
0.92521 112
1.7331 121
2.1407 122
1.4082 1111
2.0698 1112
2.3973 1121
2.4518 1122
1.1719 1211
1.153 1212
0.67139 1221
0.64744 1222
1.3705 11111
0.9557 11112
0.64868 11121
0.7325 11211
0.58874 11212
0.86673 11221
0.17075 11222
0.64026 12111
0.80229 12112
0.43422 12122
1.0405 12211
0.63376 12212
0.56491 12221
0.34626 12222
0.81631 111111
0.91837 111112
0.70013 111121
0.87384 111122
1.1474 111211
0.47411 111221
0.12249 111222
0.56728 112111
0.88169 112112
0.14509 112121
0.68655 112211
0.36274 112212
1.1652 121111
0.99314 121112
0.42024 121121
0.23937 121122
1.0346 122111
0.64642 122112
0.15632 122121
0.41725 122122
0.40793 122211
In the first column, there is a number. With every one of those numbers, in the second column, is an associated ID. Now, there are some blank rows that do not contain any numbers in them.
Define one of these numbers to be a "daughter" of another number if the ID of the first number is the same as the ID of the second, with an extra digit on the end. For example, both IDs 11211 and 11212 are daughters of 1121, because the ID of 1121 has an extra digit, either a 1 or a 2, added onto the end to form the ID of its daughters. Thus, 1121 is the parent of both 11211 and 11212.
Here is what I want the macro to do. It must output a third column which contains, for every row, a cumulative sum of the number in the first column of that row, plus the number of its parent, plus the number of the parent's parent, etc., all the way up until it reaches either 11 or 12. It will begin by simply outputting the numbers in column 1 for 11 and 12 in the third column. Then, in a loop beginning with 111, it will add up the cumulative sum of every row (the number in that row plus the third column output of the parent), only if that row has a number and an ID, and only if the parent exists and has an output in column 3. So for example, the number in the 3rd column of the row with ID 11222 should be the number in column 1 of that row, plus that of 1122, plus that of 112, plus that of 11. So, 0.17075+2.4518+0.92521+3.5463, or 7.09406. However, if you try to do this for ID 111221, you will notice that the row where the parent 11122 should be is empty. Thus, the parent does not exist, and no value will be output in column 3 for 111221.
I would greatly appreciate it if someone has some time on their hands to code up this VBA macro for me in exchange for an accepted solution.
Thanks
I don't think a macro is needed, just some formulas. First, I put a header on my columns of data, such as "value," and "id." If you then highlight the column labels (i.e., A and B) and sort by B ("id") then A ("value"), you'll group your blank rows. You can then delete those rows. Now you have the data almost ready.
When I did this, I converted the id column to text, as opposed to a number value, so if I sort the table by id, the pattern will be, "11, 111, 1111," and so on, instead of, "11, 12, 111, 112, 121."
Then, I added columns to separate the separate characters or levels of the ids. This is to help with parents and children. You can use text-to-columns, or a MID formula, but what I did was have 6 more columns to the right. For each id row, each column would either have a "1," a "2," or a blank (null) value.
Then I added another column, calling it "level." I used a formula like COUNTA across all my id splitting columns. So, for 11, my level value was 2. 111 would be 3, 11221 would be 5, and so on. This gives me the id level (parent, child, grandchild, etc).
Then I added my final column to the right to compute my cumulative sum of the values. In concept I have one big nested IF statement, but in practice, I needed two. My formula says, if the row above me has a lower level number (i.e., it is some kind of parent), add the value of the current row to the value of the above row. Otherwise, keep going up a row till I do get a parent, and add the current row value to that number.
My final formula for all but the first 5 rows of data was (in the 6th row of data):
=if(K6<K7,L6+C7,if(K5<K7,L5+C7,if(K4<K7,L4+C7,if(K3<K7,L3+C7,if(K2<K7,L2+C7,C7)))))
The values were in column C, the original id in column D, the id split columns were E through J, the level column was K, and my formula was in L. This formula can be copied down the table. For the first 4 rows, you just need 1 less IF statement for each row you go up. The fifth row of data might take the above formula; it depends on how it deals with the column headers in row one. The formula on the 4th row of data might be:
=if(K4<K5,L4+C5,if(K3<K5,L3+C5,if(K2<K5,L2+C5,if(K1<K5,L1+C5,C5))))
I'm still learning how to format these comments, so I'll try to provide a sample of the layout I have...
     C         D       E  F  G  H  I  J  K    L
1    value     id      1  2  3  4  5  6  lvl  cumul_sum
2    3.546300  11      1  1              2    3.546300
3    1.699300  111     1  1  1           3    5.245600
4    1.408200  1111    1  1  1  1        4    6.653800
5    1.370500  11111   1  1  1  1  1     5    8.024300
6    0.816310  111111  1  1  1  1  1  1  6    8.840610
7    0.918370  111112  1  1  1  1  1  2  6    8.942670
8    0.955700  11112   1  1  1  1  2     5    7.609500
So for example, the number in the 3rd column of the row with ID 11222 should be the number in column 1 of that row, plus that of 1122, plus that of 112, plus that of 11. So, 0.17075+2.4518+0.92521+3.5463, or 7.09406. However, if you try to do this for ID 111221, you will notice that the row where the parent 11122 should be is empty. Thus, the parent does not exist, and no value will be outputted in column 3 for 111221.
As a native worksheet array formula¹ in D1,
=IF(LEN(B1), SUM(SUMIFS(A$1:INDEX(A:A, MATCH(1E+99, A:A)),
B$1:INDEX(B:B, MATCH(1E+99, A:A)), LEFT(B1, ROW(INDIRECT("2:"&LEN(B1)))))), TEXT(,))
The above does not compensate for missing parents (null string). It totals everything it can find and uses zero for missing parents.
As a VBA UDF² in E1,
Function conditionalCumulativeSum(nums As Range, _
                                  ids As Range, sib As Range, _
                                  Optional nullOnBlank As Boolean = True)
    Dim i As Integer
    'truncate any full column reference to the UsedRange
    Set nums = Intersect(nums, nums.Parent.UsedRange)
    'match the nums and ids ranges
    Set ids = ids.Resize(nums.Rows.Count, nums.Columns.Count)
    For i = Len(sib.Value2) To 2 Step -1
        If nullOnBlank And IsError(Application.Match(--Left(sib, i), ids, 0)) Then
            conditionalCumulativeSum = vbNullString
            Exit For
        End If
        conditionalCumulativeSum = conditionalCumulativeSum + _
            Application.SumIfs(nums, ids, Left(sib, i))
    Next i
    If i = 0 Then conditionalCumulativeSum = vbNullString
End Function
The above defaults to return a null string when it encounters any missing parent through the hereditary chain. This can be turned off by adding FALSE as the optional fourth parameter and then the UDF will behave identically to the native formula.
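The parent-chain logic can also be checked outside Excel. The following is a minimal Python sketch, not a translation of the UDF: it assumes the IDs are stored as strings in a dictionary that maps each ID to its value, and the name `cumulative_sum` is hypothetical. It mirrors the default behaviour described above by returning None as soon as any ancestor is missing.

```python
def cumulative_sum(values, node_id):
    """Sum the value of node_id and all of its ancestors up to the 2-digit root.

    values maps ID strings to numbers. Returns None if node_id is too short
    or if any ID along the chain is missing.
    """
    if len(node_id) < 2:
        return None
    total = 0.0
    # Walk from the full ID down to its two-digit prefix (11 or 12):
    # each shorter prefix is the parent of the previous one.
    for length in range(len(node_id), 1, -1):
        prefix = node_id[:length]
        if prefix not in values:
            return None  # missing row or missing parent
        total += values[prefix]
    return total


# A small subset of the question's data; 11122 is deliberately absent.
values = {
    "11": 3.5463, "111": 1.6993, "112": 0.92521,
    "1112": 2.0698, "1122": 2.4518,
    "11222": 0.17075, "111221": 0.47411,
}

print(cumulative_sum(values, "11222"))   # 0.17075 + 2.4518 + 0.92521 + 3.5463
print(cumulative_sum(values, "111221"))  # None, because parent 11122 is missing
```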
[Image: results from sample data]
¹ Array formulas need to be finalized with Ctrl+Shift+Enter. If entered correctly, Excel will wrap the formula in braces (e.g. { and }). You do not type the braces in yourself. Once entered into the first cell correctly, they can be filled or copied down or right just like any other formula. Try to reduce your full-column references to ranges more closely representing the extents of your actual data. Array formulas evaluate every cell in the referenced ranges, so calculation time grows with the size of those ranges, and it is good practise to narrow them to a minimum. See Guidelines and examples of array formulas for more information.
² A User Defined Function (aka UDF) is placed into a standard module code sheet. Tap Alt+F11 and when the VBE opens, immediately use the pull-down menus to Insert ► Module (Alt+I,M). Paste the function code into the new module code sheet titled something like Book1 - Module1 (Code). Tap Alt+Q to return to your worksheet(s).

Multi-parameter matching + weighted random pick in Redis

Let's say I have a set of objects with properties:
Object Quantity Color Shape Kind
----------------------------------------
APPLE 12 RED ROUND FRUIT
APPLE 3 GREEN ROUND FRUIT
ORANGE 6 ORANGE ROUND FRUIT
CARROT 0 RED CONICAL VEGETABLE
RADISH 24 RED ROUND VEGETABLE
Object and all properties except quantity are represented as strings. Quantity is a number.
I must compose a random list of objects, based on user's query.
Query contains values for all string properties (that is, all properties except quantity).
Value in query may be either exact property value, or a wildcard (meaning "any value would do for this property"), or a negation — "NOT this exact property value".
Query result is an object, picked by weighted random from all object with matching properties. Weight for the random pick is the quantity.
For example:
Query -> Probabilities -> Example
random result
-----------------------------------------------------------------------------
* ROUND FRUIT -> APPLE 12 / APPLE 3 = APPLE 15 -> APPLE
!GREEN ROUND FRUIT -> APPLE 12 / ORANGE 6 -> ORANGE
RED * * -> CARROT 0 / APPLE 12 / RADISH 24
= APPLE 12 / RADISH 24 -> RADISH
RED CONICAL VEGETABLE -> CARROT 0
= (none) -> (none)
For self-education purposes, I would like to build this system using Redis for data storage.
The question is — how to do this elegantly and with least amount of application logic (as opposed to in-Redis operations)? Weights and negation kind of spoil the picture. Otherwise it would be nicely doable with sets.
Any hints are welcome.
Since Redis can only query keys and not values, a good option is to store the individual property values of each object in separate Redis sets.
For example, when you add the object ...
APPLE 12 RED ROUND FRUIT
you would store it as
hmset obj:1 name apple qty 12 color red shape round kind fruit
and then ...
sadd name:apple obj:1
sadd color:red obj:1
sadd shape:round obj:1
This way you have a way to interrogate sets directly and be able to pick the object using a random number based on, for example, the total number of items in the set returned.
Hope that helps. If you need more explanation, hit me up.
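The matching and weighting rules from the question can be sketched in plain Python to make the intended behaviour concrete. This is application-side logic only, not Redis commands. In Redis, the per-property sets above would presumably be combined with set intersection and difference (for example SINTER and SDIFF), with the weighted pick still done by the application. The function names `matches`, `candidates` and `pick` are hypothetical.

```python
import random

# (Object, Quantity, Color, Shape, Kind) rows from the question.
OBJECTS = [
    ("APPLE", 12, "RED", "ROUND", "FRUIT"),
    ("APPLE", 3, "GREEN", "ROUND", "FRUIT"),
    ("ORANGE", 6, "ORANGE", "ROUND", "FRUIT"),
    ("CARROT", 0, "RED", "CONICAL", "VEGETABLE"),
    ("RADISH", 24, "RED", "ROUND", "VEGETABLE"),
]


def matches(value, pattern):
    """'*' matches anything, '!X' matches anything except X, else exact match."""
    if pattern == "*":
        return True
    if pattern.startswith("!"):
        return value != pattern[1:]
    return value == pattern


def candidates(objects, color, shape, kind):
    """Return {object name: total quantity} over matching rows with quantity > 0."""
    weights = {}
    for name, qty, c, s, k in objects:
        if qty > 0 and matches(c, color) and matches(s, shape) and matches(k, kind):
            weights[name] = weights.get(name, 0) + qty
    return weights


def pick(objects, color, shape, kind, rng=random):
    """Weighted random pick among the candidates, or None if there are none."""
    weights = candidates(objects, color, shape, kind)
    if not weights:
        return None
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


print(candidates(OBJECTS, "*", "ROUND", "FRUIT"))
print(pick(OBJECTS, "RED", "*", "*"))
```

Rows with quantity 0 are dropped before the pick, which matches the question's CARROT example.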

Excel: one column has duplicates of each value, I need to take averages of the corresponding two values from the other columns

Example:
column A column B
A 1
A 2
B 2
B 2
C 1
C 1
I would somehow like to get the following result:
column A column B
A 1.5
B 2
C 1
(which are averages of 1 and 2, 2 and 2 and 1 and 1)
How do I achieve that?
Thanks
If you're using Excel 2007 or above, you can also use the shorter AVERAGEIF function:
=AVERAGEIF($A$1:$A$6,D1,$B$1:$B$6)
Less typing, easier to read..
In D1:D3, type A, B, C. Then in E1, put this formula
=SUMIF($A$1:$A$6,D1,$B$1:$B$6)/COUNTIF($A$1:$A$6,D1)
and fill down to E3. If you want to replace the existing data, copy E1:E3 and paste-special-values over itself. Then delete A:C.
Alternatively, you can add headers to your data, say "Letter" and "Number". Then create a Pivot Table from your data. Put Letter in the rows section and Number in the Data section. Change your Data section from SUM to AVERAGE and you'll get the same result.
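For comparison, the same group-and-average step that SUMIF/COUNTIF, AVERAGEIF and the Pivot Table perform can be sketched in Python. The function name `average_by_key` and the list of pairs are assumptions made for illustration.

```python
from collections import defaultdict


def average_by_key(pairs):
    """Group (key, value) pairs by key and return the mean value per key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    # Equivalent to SUMIF(...) / COUNTIF(...) for each distinct key.
    return {key: sum(vals) / len(vals) for key, vals in groups.items()}


# Column A / column B pairs from the question.
data = [("A", 1), ("A", 2), ("B", 2), ("B", 2), ("C", 1), ("C", 1)]
print(average_by_key(data))
```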