Conditional formatting: max value, comparing rows with specific data - formatting

I am making an exercise tracking sheet with Google Sheets, and ran into a problem. The sheet has a table for raw data such as day, exercise type chosen from a validated list, and sets, reps, weight, you name it. To find the useful information for analysis, I have set up a pivot table. I want to find the max values for each type of value per exercise.
For example, comparing all the three instances of "DL-m BB" in column D, the table should highlight the highest values between all them: H9 would be the record weight, F5 record volume and so on, and for "SQ-lb BB box" H12 would be max weight and F3 max volume. Eventually the table will have several hundred rows per year, and finding max values per exercise per attribute is going to be too much of a task, time better spent elsewhere.

The Conditional Formatting can be set as follow for the two examples you give above. A separate rule is set for each. They are set from the same cell (H1) adding an additional rule.
Apply to Range
H1:H1000
Custom Formula is
=$H1=max(filter($D:$H,$D:$D="DL-m BB"))
Add another rule
Apply to Range
H1:H1000
Custom Formula is
=$H1=max(filter($D:$H,$D:$D="SQ-lb box BB"))
Place this on your Pivot table page (Try M1 - it must be outside of the PT)
It list max correctly.
=UNIQUE(query($D2:$K,"SELECT D,Max(F),Max(G),Max(H),Max(I),Max(J), Max(K) Where D !='' Group By D Label Max(F) 'Max F', Max(G) 'Max G', Max(H) 'Max H', Max(I) 'Max I',Max(J) 'Max J',Max(K) 'Max K'"))
The below query lists the max for F
=query($D2:$F,"SELECT Max(F) Where D !='' Group By D label Max(F)''")
I have been trying this with conditional formatting and it almost works. Maybe you will see something I don't. Still trying.
This works.
function onOpen(){
keepUnique()
}
function keepUnique(){ //create array og unique non black values to find max for
var col = 4 ; // choose the column you want to use as data source (0 indexed, it works at array level)
var ss = SpreadsheetApp.getActiveSpreadsheet();
var sh = ss.getSheets()[1];
var data = sh.getRange(2, col, sh.getLastRow()-2).getValues();
var newdata = new Array();
for(nn in data){
var duplicate = false;
for(j in newdata){
if(data[nn][0] == newdata[j][0] || data[nn][0]==""){
duplicate = true;
}
}
if(!duplicate){
newdata.push([data[nn][0]]);
}}
colorMax(newdata)
}
function colorMax(newdata){
var ss = SpreadsheetApp.getActiveSpreadsheet();
var sh = ss.getSheets()[1];
lc=sh.getLastColumn()
var data = sh.getRange(2, 4, sh.getLastRow()-2,lc).getValues(); //get col 4 to last col
for(k=2;k<lc;k++){
for(i=0;i<newdata.length;i++){
var maxVal=0
for(j=0;j<data.length;j++){
if(data[j][0]==newdata[i][0]){
if(data[j][k]>maxVal){maxVal=data[j][k];var row=j+2} //find max value and max value row number
}}
var c =sh.getRange(row,k+4,1,1)//get cell to format
var cv=c.getValue()
c.setFontColor("red") //set font red
}}
}

Related

Pandas: Creating an index year ago for multiple items

I can create an index vs the previous year when I have just one item, but I'm trying to figure out how to do this when I have multiple items. Here is my data set:
rng = pd.date_range('1/1/2011', periods=3, freq='Y')
rng = np.repeat(rng,3)
country = ["USA","Brazil","Japan"]*3
df = pd.DataFrame({'Country':country,'date':rng,'value':range(20,29)})
If I only had one item/country I can do something like this:
df['pct_iya'] = 100*(df['value'].pct_change()+1)
I'm trying to get this to work with multiple items. Here is the expected result:
Maybe this could work with a groupby, but my attempt did not work...
df['pct_iya2'] = df.groupby(['Country','date'])['value'].pct_change()
Answer: Use a group by (excluding date) and than add one to the percent change (ex +15percent goes from .15 to 1.15), then multiple you 100.
df['pct_iya'] = 100*(df.groupby(['Country'])['value'].pct_change()+1)

Conditional Running Tally / Cumulative Sum: Looking for Formula or Script

I work in a warehouse, and I'm using Google Sheets to keep track of inventory. Adding and subtracting is easy, but I've been tasked with creating a "reserve" system:
A number of pieces of stock are reserved for upcoming jobs. When that stock is ordered and received, the reserve quantity is "satisfied" and decreases by the number of pieces received. The problem with setting it up just like the ADD and SUBTRACT function is that not all stock received is "reserved", and my RESERVE totals end up being "-57", "-72", "-112", etc.
I have a large dataset of form responses logged in four columns: Timestamp, Item ID#, Action (ADD, SUBTRACT, or RESERVE), and QTY. What I'm looking for is a way to create a running tally in column E for each unique Item ID#, using the values in column D "QTY". I need for any value <0 to return "0".
Example Sheet
I've been able to create a running tally formula that satisfies my conditions for one Item ID# at a time. To avoid creating a separate column for each Item ID#, though, I need to figure out how to apply it separately to each unique Item ID#, and array it down Column E so each new form response is calculated automatically.
=if(C2="RESERVE",E1+D2,if(and(C2="ADD",(E1+D2)<0),0,E1+D2))
The closest thing to a solution I've been able to find is a script created by user79865 for this question titled: "Running Total In Google Sheets with Array". Unfortunately, trying to plug this into Google Sheets Script Editor gives me an error popup:
TypeError: Cannot read property "length" from undefined. (line 2, file "runningtotal")
I have no programming background and never dreamed I'd be looking at code just to make a running tally.
If anybody can offer any insight into this, fixing or replacing the script or offering an ARRAYFORMULA solution, I'd really appreciate it!
function runningTotal(names, dates, amounts) {
var sum, totals = [], n = names.length;
if (dates.length != n || amounts.length != n) {
return 'Error: need three columns of equal length';
}
for (var i = 0; i < n; i++) {
if (names[i][0]) {
sum = 0;
for (var j = 0; j < n; j++) {
if (names[j][0] == names[i][0] && dates[j][0] <= dates[i][0]) {
sum = sum + amounts[j][0];
}
}
}
else {
sum = '';
}
totals.push([sum]);
}
return totals;
}

Reduce Complexity of This Formula?

I have a question about optimizing a formula I've been using in Google Sheets:
=ARRAYFORMULA(
IF(
IFERROR(
MATCH($B2 & A2, ($B$1:B1) & ($A$1:A2), 0),
0
) = 0,
1,
0))
The formula works by counting all the unique values in column A (ID) given that it appears in the date range of column B (Date), to give an output in column C (Count).
Notice how the count values are only 0 and 1, and will only show a 1 if it is the ID's first appearance in the date range.
Example data below.
ID Date Count
138 Oct-13 1
138 Oct-13 0
29 Oct-13 1
29 Nov-13 1
138 Nov-13 1
138 Nov-13 0
The issue is once I get over 10000 lines to parse, the formula grinds to a slow pace, and takes upwards of an hour to finish computing. I'm wondering if anyone has a suggestion on how to optimize this formula so I don't need to have it running for so long.
Thanks,
I've been playing around with some formulas, and I think this one works better, but is still becoming quite slow after 10000 lines.
=IF(COUNTIF((FILTER($A$1:$A2, $B$1:$B2 = $B2)),$A2) = 1, 1, 0)
Edit
Here is an additional formula posted on the Google Product Forum which only has to be put in one cell, and autofills down. This is the best answer I've found so far.
=ArrayFormula(IF(LEN(A2:A),--(MATCH(A2:A&B2:B,A2:A&B2:B,0)=ROW(A2:A)-1),))
I wasn't able to find a formula-only solution that I could say outperforms what you have. I did, however, come up with a custom function that runs in linear time, so it ought to perform well. I'd be curious to know how it compares to your final solution.
/**
* Returns 1 for rows in the given range that have not yet occurred in the range,
* or 0 otherwise.
*
* #param {A2:B8} range A range of cells
* #param {2} key_col Relative position of a column to key by, e.g. the sort
* column (optional; may improve performance)
* #return 1 if the values in the row have not yet occurred in the range;
* otherwise 0.
* #customfunction
*/
function COUNT_FIRST_OF_GROUP(range, key_col) {
if (!Array.isArray(range)) {
return 1;
}
const grouped = {};
key_col = typeof key_col === 'undefined' ? 0 : key_col - 1; // convert from 1-based to 0-based
return range.map(function(rowCells) {
const group = groupFor_(grouped, rowCells, key_col);
const rowStr = JSON.stringify(rowCells); // a bit of a hack to identify unique rows, but probably a good compromise
if (rowStr in group) {
return 0;
} else {
group[rowStr] = true;
return 1;
}
});
}
/** #private */
function groupFor_(grouped, row, key_col) {
if (key_col < 0) {
return grouped; // no key column; use one big group for all rows
}
const key = JSON.stringify(row[key_col]);
if (!(key in grouped)) {
grouped[key] = {};
}
return grouped[key];
}
To use it, in Google Sheets go to Tools > Script editor..., paste it into the editor, and click Save. Then, in your spreadsheet, use the function like so:
=COUNT_FIRST_OF_GROUP(A2:B99, 2)
It will autofill for all rows in the range. You can see it in action here.
If certain assumptions are fulfilled, Like, 1. Same ID numbers always occur together(If not, maybe you could SORT them by ID first and then date later), then,
=ARRAYFORMULA(1*(A2:A10000&B2:B10000<>A1:A9999&B1:B9999))
If dates are recognised, I think you could use + instead of & . Again, Various assumptions were made here and there.

dimple.js aggregating non-numerical values

Say i have a csv containing exactly one column Name, here's the first few values:
Name
A
B
B
C
C
C
D
Now i wish to create a bar plot using dimple.js that displays the distribution of the Name (i.e the count of each distinct Name) but i can't find a way to do that in dimple without having the actual counts of each name. I looked up the documentation of chart.addMeasureAxis(
https://github.com/PMSI-AlignAlytics/dimple/wiki/dimple.chart#addMeasureAxis) and found that for the measure argument:
measure (Required): This should be a field name from the data. If the field contains numerical values they will be aggregated, if it contains any non-numeric values the distinct count of values will be used.
So basically what that says is it doesn't matter how many times each category occurred, dimple is just going to treat each of them as if they occurred once ??
I tried the following chart.addMeasureAxis('y', 'Name') and the result was a barplot that had each bar at value of 1 (i.e each name occurred only once).
Is there a way to achieve this without changing the data set ?
You do need something else in your data which distinguishes the rows:
e.g.
Type Name
X A
Y A
Z A
X B
Y B
A C
You could do it with:
chart.addCategoryAxis("x", "Name");
chart.addMeasureAxis("y", "Type");
or add a count in:
Name Count
A 1
A 1
A 1
B 1
B 1
C 1
Then it's just
chart.addCategoryAxis("x", "Name");
chart.addMeasureAxis("y", "Count");
But there isn't a way to do it with just the name.
Hope this helps.. First use d3.nest() to get the unique values of the dataset. then add the count of each distinct values to the newly created object.
var countData = d3.nest()
.key(function (d) {
return d["name];
})
.entries(yourDate);
countData.forEach(function (c) {
c.count = c.values.length;
});
after this created the dimple chart from the new "countData Object"
var myChart = new dimple.chart(svg, countData);
Then for 'x' and 'y' axis,
myChart.addCategoryAxis("x", "key");
myChart.addMeasureAxis("y", "count");
Please look in to d3.nest() to understand the "countData" object.

Convert cell text in progressive number

I have written this SQL in PostgreSQL environment:
SELECT
ST_X("Position4326") AS lon,
ST_Y("Position4326") AS lat,
"Values"[4] AS ppe,
"Values"[5] AS speed,
"Date" AS "timestamp",
"SourceId" AS smartphone,
"Track" as session
FROM
"SingleData"
WHERE
"OsmLineId" = 44792088
AND
array_length("Values", 1) > 4
AND
"Values"[5] > 0
ORDER BY smartphone, session;
Now I have imported the result in Matlab and I have six vectors and one cell (because the text from the UUIDs was converted in cell) all of 5710x1 size.
Now I would like convert the text in the cell, in a progressive number, like 1, 2, 3... for each different session code.
In Excel it is easy with FIND.VERT(obj, matrix, col), but I do not know how do it in Matlab.
Now I have a big cell with a lot of codes like:
ff95465f-0593-43cb-b400-7d32942023e1
I would like convert this cell in an array of numbers where at the first occurrence of
ff95465f-0593-43cb-b400-7d32942023e1 -> 1
and so on. And you put 2 when a different code appear, and so on.
OK, I have solve.
I put the single session code in a second cell C.
At this point, with a for loop, I obtain:
%% Converting of the UUIDs into integer
C = unique(session);
N = length(session);
session2 = zeros(N, 1);
for i = 1:N
session2(i) = find(strcmp(C, session(i)));
end
Thanks to all!