Use python to pull variable sized range of cells between 2 known fields from Excel file - openpyxl

I want to pull data from an Excel file that matches certain fields. See sample worksheet below:
The fields I need are the dispatch and arrival times but I also need the values in between the two cells in the first column marked "Incident Notes:" and "Event log" (highlighted in bold).
For instance, in the first event, the cells I need are:
08/02/2022 4:02:29
08/02/2022 4:04:22
Pizza with cheese
Did not tip
So there are 2 cells in that range between "Incident Notes" and "Event log" in the first call. In the second call however, there is only a single cell:
"Italian sub". In the third call there are 3 cells:
"Hot wings"
"Rude"
"No street parking".
This is the code I have so far but I am stuck on how to pull a range of cells that vary in number between 2 known fields.
import openpyxl

wb = openpyxl.load_workbook('TestSpreadsheet.xlsx')
sheet = wb['Sheet1']
for row in range(1, sheet.max_row):
    myrow = sheet[row]
    callstatus = sheet['C' + str(row)].value
    if callstatus == 'Dispatch:':
        calltime = sheet['C' + str(row + 1)].value
        print(f" Status: {callstatus} at {calltime}")
        # TODO: figure out how to pull range of cells between
        # Incident Notes: and Event log: cells
I know I want some kind of while loop that does something like
while after 'incident notes:' cell and before 'event log' cell:
    print cells in between
but I am stuck on this.

You can search for the known words that delineate each section, then use offsets to find the data relative to that position, since the layout seems consistent in your example data.
The offset method lets you select another cell a given number of rows and columns away from the current cell.
Below is example code for your sample:
import openpyxl


def fill_list(row_cell):
    # Get the cell values by offset positions
    data_list = []
    current_row = 0
    search_term2 = 'incident notes:'
    search_term3 = 'event log:'

    # The two time values sit two rows below the 'Event No:' cell
    data_list.append(row_cell.offset(column=2, row=2).value)
    data_list.append(row_cell.offset(column=4, row=2).value)

    # Walk down the column until the 'Incident Notes:' cell is found
    while row_cell.offset(row=current_row).value.lower() != search_term2:
        current_row += 1
    current_row += 1  # step past the 'Incident Notes:' cell itself

    # Collect every cell until the 'Event log:' cell is reached
    while row_cell.offset(row=current_row).value.lower() != search_term3:
        data_list.append(row_cell.offset(row=current_row).value)
        current_row += 1

    return data_list, row_cell.row + current_row


filename = 'foo.xlsx'
search_term1 = 'event no:'

wb = openpyxl.load_workbook(filename)
ws = wb['Sheet1']

row_next = 0
for row_cell in ws['A']:
    cur_cell_val = row_cell.value
    # Jump down to the next row after the last processed data
    if row_cell.row < row_next:
        continue
    # Skip empty or non-text cells, which have no .lower()
    if isinstance(cur_cell_val, str) and cur_cell_val.lower() == search_term1:
        event_list, row_next = fill_list(row_cell)
        print(event_list)
Output
['08/02/2022 4:02:29', '08/02/2022 4:04:22', 'Pizza with cheese', 'Did not tip']
['08/02/2022 5:01:12', '08/02/2022 5:14:18', 'Italian sub']
['08/02/2022 6:12:37', '08/02/2022 6:14:50', 'Hotwings', 'Rude', 'No street parking']
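The "between two known cells" part can also be sketched without offsets, as a small state machine over the values of column A. This is only a minimal sketch and not part of the answer above: the helper name `cells_between` and the sample list are hypothetical, and in practice the list might come from something like `[c.value for c in ws['A']]`.

```python
def cells_between(values, start_marker, end_marker):
    """Yield one list per section: the values found after start_marker
    and before end_marker (markers are compared case-insensitively)."""
    start = start_marker.lower()
    end = end_marker.lower()
    collecting = False
    section = []
    for value in values:
        # Only text cells can be markers; other cells are treated as data
        text = value.strip().lower() if isinstance(value, str) else None
        if text == start:
            collecting = True
            section = []
        elif text == end and collecting:
            collecting = False
            yield section
        elif collecting and value is not None:
            section.append(value)


# Hypothetical column A contents, modelled on the sample worksheet
column_a = [
    "Event No:", "Incident Notes:", "Pizza with cheese", "Did not tip",
    "Event log:", None,
    "Event No:", "Incident Notes:", "Italian sub", "Event log:",
]

notes = list(cells_between(column_a, "Incident Notes:", "Event log:"))
print(notes)  # [['Pizza with cheese', 'Did not tip'], ['Italian sub']]
```

Because the helper only tracks whether it is inside a section, it does not care how many cells each section contains.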
---------Additional answers-------
For the offset method, both column and row are optional. As you can see from the code extract from the module below, both are set to 0 by default.
def offset(self, row=0, column=0):
    """Returns a cell location relative to this cell.
    :param row: number of rows to offset
    :type row: int
Not setting either will of course mean that you are referencing the same cell as your current one. When the target cell is in the same row or column as the current cell, there is no need to pass the unchanged attribute as 0.
The dates are down to the Excel formatting used on the cells. I don't know what formatting you have on any of the cells, so for the test output I set the date cells as text. Your sheet appears to be using a custom format for these date/time cells. If that format were like 'dd-mmm-yy hh:mm:ss' it would display as shown in Excel; however, it's probably using forward slashes, so Python is interpreting the value as datetime.datetime(2022, 8, 2, 4, 2, 29). If you cannot or do not want to change the cell formatting, then just convert in Python. The exact conversion will probably depend on what you want to do with the dates. If it's just for display, you can do the following. The first line displays the default form with hyphens; the second line changes the hyphens to forward slashes.
import datetime
print(datetime.datetime(2022, 2, 8, 5, 1, 12))
print(str(datetime.datetime(2022, 2, 8, 5, 1, 12)).replace('-', '/'))
Output
2022-02-08 05:01:12
2022/02/08 05:01:12
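If a specific layout is needed, `strftime` gives finer control than replacing characters. A small sketch, assuming the cell value arrives as a `datetime.datetime` and that a day/month/year layout is wanted (the format string is my assumption about the desired output, not something stated in the question):

```python
import datetime

# Stand-in for a value read from a date-formatted cell
cell_value = datetime.datetime(2022, 2, 8, 5, 1, 12)

# %d/%m/%Y gives zero-padded day/month/year; %H:%M:%S gives 24-hour time
formatted = cell_value.strftime("%d/%m/%Y %H:%M:%S")
print(formatted)  # 08/02/2022 05:01:12
```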

Related

Return values from other workbook

I have a question about a formula that would resolve my issue.
In my main workbook I need to compare data from two sources.
One of the columns must retrieve data (amounts) from another workbook.
I want a formula that will find all amounts in column G and skip all blank cells. I tried the VLOOKUP, INDEX and SMALL functions, but with no effect.
Each day the amounts are different, and I need to match them in the main file and find exceptions.
Any ideas?
How about an array formula such as the following?
=INDEX($G$2:$G$20,SMALL(IF(($G$2:$G$20)=0,"",ROW($G$2:$G$20)),ROW()-1)-ROW($G$2:$G$20)+1)
The formula would have to be placed into cell I2 as an array formula (which must be entered by pressing Ctrl + Shift + Enter). Then you can drag the formula down to get all the other values.
It doesn't have to be in column I, but it has to be in row 2, because this formula gets the n-th number from the list that is not equal to 0. The n-th place is (in this formula) ROW()-1. So for row 2 it will be 2-1=1 and thus the 1st number. By dragging the formula down you get the 2nd, 3rd, etc. number. If you start with the formula in cell I5 instead, it would have to be adjusted as follows:
=INDEX($G$2:$G$20,SMALL(IF(($G$2:$G$20)=0,"",ROW($G$2:$G$20)),ROW()-4)-ROW($G$2:$G$20)+1)
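For readers who find the array formula hard to follow, the logic it implements (pick the n-th entry that is not blank or zero) can be sketched in Python. The function name `nth_nonzero` and the sample column are hypothetical, and the sketch works on a plain list rather than on a worksheet:

```python
def nth_nonzero(values, n):
    """Return the n-th (1-based) entry that is neither blank nor zero."""
    kept = [v for v in values if v not in (None, "", 0)]
    return kept[n - 1]


# Hypothetical contents of G2:G7, with blanks and a zero mixed in
column_g = [120.5, None, 0, 75, "", 310]

print([nth_nonzero(column_g, n) for n in (1, 2, 3)])  # [120.5, 75, 310]
```

Here `n` plays the role of ROW()-1 in the formula: the first result row asks for the 1st remaining value, the next row for the 2nd, and so on.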
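You could loop through the column and store each value > 0 in an array and then compare, or you could loop through the column and compare directly.
Something like:
Dim value As Range
Dim i As Long
Dim lastRow As Long
' mainTable (a Range) and otherSheet (a Worksheet) are placeholders for your own objects
lastRow = otherSheet.Cells(otherSheet.Rows.Count, 7).End(xlUp).Row   ' 7 for G
For Each value In mainTable
    i = 1   ' Cells is 1-based; restart the scan for each value
    Do While i <= lastRow
        If otherSheet.Cells(i, 7).Value = value.Value Then
            ' do your stuff
        End If
        i = i + 1
    Loop
Next value
I think that could be the right approach.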
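The "compare and find exceptions" step can also be illustrated outside Excel. The following is a rough sketch, assuming the amounts from both workbooks have already been read into plain lists; `find_exceptions` and the sample amounts are hypothetical names and data. It uses `collections.Counter` so that duplicate amounts are matched one-to-one:

```python
from collections import Counter


def find_exceptions(main_amounts, other_amounts):
    """Return the amounts left unmatched on each side, counting duplicates."""
    main_count = Counter(main_amounts)
    other_count = Counter(other_amounts)
    # Counter subtraction keeps only positive counts, i.e. the surplus on each side
    missing_in_other = sorted((main_count - other_count).elements())
    missing_in_main = sorted((other_count - main_count).elements())
    return missing_in_other, missing_in_main


main = [100, 250, 250, 40]
other = [250, 100, 75]
print(find_exceptions(main, other))  # ([40, 250], [75])
```

With floating-point amounts, exact equality may be too strict; rounding to cents before counting would be one way to handle that.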

"Range" object must move along when adding a new row

I'm trying to color different rows depending on an if-statement. Example:
' # 1
If (test = True) Then
    Worksheets("sheet1").Range("A1:J1").Interior.Color = varColor1
Else
    Worksheets("sheet1").Range("A1:J1").Interior.Color = varColor2
End If
' # 2
If (test2 = True) Then
    Worksheets("sheet1").Range("A2:J2").Interior.Color = varColor1
Else
    Worksheets("sheet1").Range("A2:J2").Interior.Color = varColor2
End If
' # 3
'...etc
My question is: if I add a new row to the Excel sheet, the value of the "Range" becomes incorrect. For instance, if I add a row between row A and row B, the program will color the newly inserted row (as this new row becomes B), and the actual row B that I want to color is now C. How can I ensure that the correct row is still colored when a row is added? Of course I can change the "Range" values manually, but I have a lot of them, so it would take very long to change them all. There must be another way to do it.
If you define a named range (Ctrl-F3), Excel adjusts the definition when rows or columns are inserted: the name keeps pointing at the same cells after a row is inserted above it, and it expands if you insert a column within the range.
Worksheets("sheet1").Range("test1_range").Interior.Color = varColor1
Be careful if you insert a column at the left edge of the named range: it will not be included in the named range.

If dropdown value is selected add formatting and values to cells to the right

I am making a sheet where a user selects a value from a dropdown and then needs to see the required parameters based on the selection.
Example:
If I select X from the dropdown, then three new input fields would be created and formatted as input cells with the parameter as the default value (so the yellowish-orange input format with default text explaining what the parameter is, such as "start date" for one and "end date" for the next).
My start:
Sub test()
    If ActiveCell = "setEnrollnet" Then
        ActiveCell.Offset(0, 1) = startDate
        ActiveCell.Offset(1).EntireRow.Insert
        ActiveCell.Offset(1, 1) = 0
        ActiveCell.Offset(0, 2) = endDate
        ActiveCell.Offset(1).EntireRow.Insert
        ActiveCell.Offset(1, 2) = 0
    End If
End Sub
This does not work well and it does not apply formatting. I need help figuring out how to do both format and input the correct text based on selection in another cell.
I suggest that, for each value in the dropdown list, you define the input fields in a separate sheet, together with field labels and formats. After a selection from the dropdown, you scan the parameter table and copy whatever is listed for that parameter to your input sheet.
Example (range named InputParams):
ParamVal   InputLabel   InputField
X          StartDate    (empty but formatted)
X          EndDate      (empty but formatted)
Y          Foo          default_Foo
Y          Bar          default_Bar
Upon selecting "X" from the dropdown list, you walk through InputParams, and if InputParams(Idx, 1) = DropDownValue, you copy cells InputParams(Idx, 2) and InputParams(Idx, 3) to your input range. By default, formats are copied along with content.
It is also desirable to define a named range for the input area, sized for the maximum number of parameters expected, so that you can clear content and formats with a single statement before re-populating this range when walking through the InputParams list.
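The table-driven lookup described above can be sketched in Python to show the idea independently of Excel. The names `INPUT_PARAMS` and `fields_for` are hypothetical, and the table is reduced to plain tuples, so cell formats are not modelled here:

```python
# Hypothetical parameter table: (dropdown value, input label, default value)
INPUT_PARAMS = [
    ("X", "StartDate", ""),
    ("X", "EndDate", ""),
    ("Y", "Foo", "default_Foo"),
    ("Y", "Bar", "default_Bar"),
]


def fields_for(selection, params=INPUT_PARAMS):
    """Return the (label, default) pairs listed for the selected value."""
    return [(label, default)
            for value, label, default in params
            if value == selection]


print(fields_for("Y"))  # [('Foo', 'default_Foo'), ('Bar', 'default_Bar')]
```

Adding a new dropdown value then only requires new rows in the table, not new code.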

Clear all cells in column who = ""?

I have a script that plots data points for me, and I'm running into an issue where there are several blank spots in the data. One of my columns is calculated, with a formula that sets the cell to "" if any of the cells used in the calculation are blank. The plots that use truly blank cells show gaps in the data correctly, but Excel doesn't treat a cell whose formula results in "" as blank.
Therefore, I need some code that can search the entire column of data and clear the formula out of the cells whose value equals "", thereby making them blank and displaying them as gaps in the plot.
I know I can use the Find and What commands to find the first cell that evaluates to "", but how can I use them to find all such cells in the column?
The row range for the data is always constant, between 4 and 243, and the column number I am searching (within a loop) evaluates to 3*(iCounter - 1) + 2, if that helps anyone.
(The range I am searching is equal to Range(Cells(4, 3*(iCounter - 1) + 2), Cells(243, 3*(iCounter - 1) + 2)).)
Click on any cell in the column you wish to cleanup and run this:
Sub ClearThem()
    Dim BigR As Range, r As Range
    Set BigR = Intersect(ActiveCell.EntireColumn, ActiveSheet.UsedRange)
    For Each r In BigR
        If r.Value = "" Then r.Clear
    Next r
End Sub

Auto Fill Row B with the last four characters of Row A

Basically, I want a VBA script to fill Row B with the last four characters that are in Row A.
Row A contains a telephone number with around 12 digits in it.
Assuming that you meant to say
"I have a series of telephone numbers in column A. I would like to
create a second column in which I have just the last four digits of
these numbers. I am new to Excel. Could someone please help me get
started on this?"
The answer would go like this:
In Excel you can create formulas that compute "something" - often based on the contents of other cells. For your specific situation, there is a function called RIGHT(object, length) which takes two arguments:
object = a string (or a reference to a string)
length = the number of characters (starting from the right) that you want.
You can see this for yourself by typing the following in a cell:
=RIGHT("hello world", 5)
When you hit <enter>, you will see that the cell shows the value world.
You can extend this concept by using a cell reference rather than a fixed string. Imagine you have "hello world" in cell A1. Now you can put the following in cell B1:
=RIGHT(A1, 5)
and you will see the value "world" in B1.
Now here is the cool trick. Assume you have a bunch of numbers in column A (say starting at row 2, since row 1 contains some header information - the title of the column). Then you can write the following in cell B2:
=RIGHT(A2, 4)
to get the last four digits. Now select that cell, and double-click on the little box in its bottom right-hand corner (the fill handle).
Like magic, Excel figures out "you want to do this with all the cells in this column, for as many rows as there is data in column A. I can do that!" and your formula will propagate to all cells in column B, with the row number adjusted, so in row 3 the formula will be
=RIGHT(A3, 4)
and so on.
Try
Dim ws As Worksheet
Set ws = Worksheets("Sheet1")
With ws.Range("B2:B99")
    .Formula = "=Right(A2, 4)"
    .Value = .Value
End With
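For comparison, the same "last four characters" extraction is a one-line slice in Python. This is just an illustrative sketch: `last_four` and the sample numbers are hypothetical, and the values are converted to text first because a phone number stored as a number cannot be sliced directly.

```python
def last_four(phone_number):
    """Return the last four characters of a phone number given as text or int."""
    return str(phone_number)[-4:]


numbers = ["310255501234", 442071838750]
print([last_four(n) for n in numbers])  # ['1234', '8750']
```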