Determine if worksheet is empty in openpyxl - openpyxl

I have an application where I write to a worksheet to the last column + 2 if there is already data and to the last column + 1 if the worksheet is empty. I get what I think is an empty worksheet as follows:
from openpyxl.workbook.workbook import Workbook
book = Workbook()
sheet = book.active
When I do sheet.max_row and sheet.max_column, I get 1 for both properties. I would like to get zero, so I do the following:
if sheet.max_row == 1 and sheet.max_column == 1 and not sheet['A1'].value:
    start_col = 1
else:
    start_col = sheet.max_column + 2
Having to check the value of the cell seems a bit overkill, not to mention error-prone. I initially expected the following to work:
if sheet.max_column == 0:
    start_col = 1
else:
    start_col = sheet.max_column + 2
Is there a reason max_row and max_column are always >= 1? Is this a bug, an intentional feature or a reasonable side-effect of something else? Finally, is there a workaround (e.g. something like sheet.isempty())?
By the way, I found the following comment while browsing the bug tracker: https://bitbucket.org/openpyxl/openpyxl/issues/514/cell-max_row-reports-higher-than-actual#comment-21091771:
Rows with empty cells are still rows. They might be formatted for future use or other stuff, we don't know. 2.3 has slightly improved heuristics but will still contain rows of empty cells.
This (and another comment I can no longer find a link for) led me to believe that the first cell effectively always exists. In this case, is it worth submitting a feature request?

The only reliable way to determine if a worksheet is "empty" is to look at the _cells attribute. However, this is part of the internal API and liable to change. In fact it is almost certain to change.
The max_row and max_column attributes are used internally for counters and must be 1 or more.
A feature request without relevant code and tests would be rejected and I'm fairly sceptical about the idea of "empty" worksheet in general.

Since I did not find a direct answer, I am adding this.
from openpyxl import load_workbook
book = load_workbook('sample.xlsx')
sheet = book.active
if [cell.value for cells in sheet.rows for cell in cells] == [None]:
    print("Sheet is empty")
sheet.rows can also be replaced with sheet.columns; both are generators that iterate over the cells, by row and by column respectively. Collecting the cell values with a list comprehension lets us check whether the sheet is empty.
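The comparison with [None] only matches when the sheet yields exactly one cell. A slightly more tolerant variant, offered as a minimal sketch rather than an official openpyxl feature, treats a sheet as empty when no cell holds a value. The helper name sheet_is_empty is hypothetical; it only assumes that the sheet exposes iter_rows() and that each cell has a .value attribute, as openpyxl worksheets do. Note that it treats formatted cells without a value as empty, and that it scans every cell.

```python
def sheet_is_empty(sheet):
    """Return True if no cell on the sheet holds a value.

    Hypothetical helper: relies only on sheet.iter_rows() yielding rows of
    cells that expose a .value attribute. A sheet that yields no cells at
    all also counts as empty, because all() of an empty iterable is True.
    """
    return all(
        cell.value is None
        for row in sheet.iter_rows()
        for cell in row
    )
```

With openpyxl installed, it could be used as sheet_is_empty(book.active) to choose between start_col = 1 and start_col = sheet.max_column + 2.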

Related

Issue with duplicated values in VBA

I would like to replace this formula with a function in VBA : =IFERROR(INDEX(SZCategoryData!E:E,MATCH(1,('SZCategory tailored'!B$3=SZCategoryData!F:F)*('SZCategory tailored'!A12=SZCategoryData!A:A),0)),"")
I used this function:
Sub BRM_ID1()
    For i = 2 To 224
        For j = 4 To 224
            If Worksheets("SZCategoryData").Cells(i, 6).Value = "BRM_ID" Then
                Worksheets("SZCategory tailored").Cells(j, 2).Value = Worksheets("SZCategoryData").Cells(i, 5)
            End If
        Next
    Next
End Sub
I have column A (Task_id): 1211, 1211, 1212, 1213, 1214 in my sheet SZCategoryData, and column B (the BRM_ID associated with each task ID), which I need to copy from SZCategoryData to column C in another sheet, SZCategory tailored.
Sometimes a task ID doesn't have an associated BRM_ID. The problem with my code is that it copies values one after another without checking whether each value belongs to the right task ID. For example, task ID 1212 has no BRM_ID in column B, but instead of leaving the cell empty, the code copies the BRM_ID of 1213 (the next one).
I am not completely certain I understand your code, but hopefully this will get you a bit closer to a solution. As Mat's Mug correctly noted, you need more descriptive names with your variables. This makes it far easier to understand your code. It wouldn't hurt to turn on Option Explicit either.
Here's the modified code:
Sub BRM_ID1()
    Dim SourceData As Worksheet
    ' Highly recommend not relying on ActiveWorkbook. Only using it as a qualifier since that is the current qualifier (though implicit).
    Set SourceData = ActiveWorkbook.Worksheets("SZCategoryData")
    Dim TailoredData As Worksheet
    Set TailoredData = ActiveWorkbook.Worksheets("SZCategory tailored")
    Dim SourceRow As Long
    ' You're going to run into issues with the hardcoded min and max values here.
    For SourceRow = 2 To 224
        Dim DestinationRow As Long
        ' Here as well.
        For DestinationRow = 4 To 224
            ' Note that I am assuming that you want to match the value in TailoredData.Cells(DestinationRow, 6).
            ' You will need to adjust this depending on where your match value is.
            If SourceData.Cells(SourceRow, 6).Value = TailoredData.Cells(DestinationRow, 6).Value Then
                TailoredData.Cells(DestinationRow, 2).Value = SourceData.Cells(SourceRow, 5)
            End If
        Next
    Next
End Sub
If I am understanding your code and problem correctly, you were having issues because your code was simply checking whether the value of a cell was equal to "BRM_ID". In reality, you need to check whether the Task_ID of the TailoredData matches the Task_ID of the SourceData. I took a stab at aligning this correctly, but I have no idea where your Task_IDs and BRM_IDs are stored. Your question mentions columns A, B, and C, but your indices (5 and 6) don't match that.
Lastly, I would strongly recommend getting your hands dirty with arrays and dictionaries. Once you get this solution running, it will work, but it won't work for long. The code is fragile. In other words, if one detail changes, the code will cease to work correctly. For example, if the length of your data changes from 224 rows to 224,000 rows you will need to fix the code to reflect this (and expect a serious increase in processing time as well).
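To illustrate the dictionary idea, here is a minimal sketch in Python rather than VBA (in VBA the same structure could be built with a Scripting.Dictionary). The function names and the sample BRM_ID strings are hypothetical; only the task IDs come from the question. The lookup is built once from the source rows, so each destination row is matched by task ID instead of by position, and a task without a BRM_ID stays empty.

```python
def build_lookup(source_rows):
    """Map each task ID to its BRM_ID, skipping rows that have no BRM_ID."""
    lookup = {}
    for task_id, brm_id in source_rows:
        if brm_id not in (None, ""):
            # Keep the first BRM_ID seen for a task ID (duplicates are ignored).
            lookup.setdefault(task_id, brm_id)
    return lookup


def fill_destination(task_ids, lookup):
    """Return the BRM_ID for each destination task ID, or "" when there is none."""
    return [lookup.get(task_id, "") for task_id in task_ids]


# Hypothetical sample data: task 1212 has no BRM_ID.
source_rows = [
    (1211, "BRM-A"),
    (1211, "BRM-A"),
    (1212, None),
    (1213, "BRM-B"),
    (1214, "BRM-C"),
]
lookup = build_lookup(source_rows)
result = fill_destination([1211, 1212, 1213, 1214], lookup)
print(result)  # ['BRM-A', '', 'BRM-B', 'BRM-C']
```

Because the lookup is keyed by task ID, the number of rows no longer needs to be hardcoded, and each source row is read only once.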
This will get you started with learning VBA, but I would strongly recommend working on improving the code further (or, ideally, work on improving your Excel skills and avoiding VBA as much as possible so that you are only solving problems with VBA that you can't reasonably solve with the built-in functionality Excel offers).
Best of luck!

Excel VBA for hiding cells in one sheet if they match cells in another sheet

I am new to VBA and am having problems learning the rules of variables (I think that's the problem here).
I have two worksheets in a spreadsheet. I need to make code that automatically hides a row on worksheet 2 if that same value in column a is on worksheet 1, column a.
Here's one of the variations of code I've tried:
Dim Sheet2Value As Variant
Dim Sheet1Value As Variant
'
Sheet2Value = Sheets("Sheet2").Range("A:A").Value
Sheet1Value = Sheets("Sheet1").Range("A:A").Value
'
If Sheet2Value = Sheet1Value Then
    Sheets("BMAC=N").EntireRow.Hidden = False
Else
    Sheets("BMAC=N").EntireRow.Hidden = True
End If
I get a type mismatch error but I'm not sure exactly why. I chose variant because I don't know what I'm doing, but both columns in excel will be set to "General".
Can anyone help with this? What concept am I missing?
Thanks so much for your time.
A few things:
You cannot compare entire columns in one step:
Sheet2Value = Sheets("Sheet2").Range("A:A").Value
This assigns a whole array of values, and comparing two arrays with = raises the type mismatch. You need to loop through the collection of cells; see this: Fast compare method of 2 columns
You cannot hide a row without defining a range to hide:
Sheets("BMAC=N").Range("Some_address").EntireRow.Hidden
Finally, I'd suggest shortening your code to:
Sheets("BMAC=N").Range("A1").EntireRow.Hidden = (value1<>value2)
Good luck!
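The cell-by-cell comparison can be sketched independently of Excel. The following is a minimal illustration in Python, assuming the two columns have already been read into plain lists; the function name rows_to_hide is hypothetical. It returns the 1-based row numbers on the second sheet whose column A value also appears in column A of the first sheet, which are the rows the question wants hidden.

```python
def rows_to_hide(sheet1_col_a, sheet2_col_a):
    """Return 1-based row numbers on sheet 2 whose value also occurs on sheet 1."""
    # Blank cells are ignored so that empty rows are never treated as matches.
    seen = {value for value in sheet1_col_a if value not in (None, "")}
    return [
        row
        for row, value in enumerate(sheet2_col_a, start=1)
        if value in seen
    ]


print(rows_to_hide(["a", "b"], ["x", "a", "", "b"]))  # [2, 4]
```

Using a set for the first column keeps the check to a single pass over each column instead of comparing every cell with every other cell.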

AutoFilter Criteria Using Array (Error) - Too Large String?

Update: Through some additional testing I discovered:
1) 255 Characters does seem to be the breaking point (character limit). Setting the filter with an array with a character length of 245 worked fine -- I was able to save and reopen without any errors. I added another criteria to the array to make the length 262, saved the file, and then got the same error.
2) The sheet in the removed records message refers to the sheet index, not the sheet name, and it does indeed reference the sheet with the autofiltering. End Update
My Issue -- I've written code to set a dataset's AutoFilter based on selected items in several slicers. Sometimes when I open up the file I get the error (paraphrased): Excel found unreadable content in the workbook. Do you want to repair the file? Then a dialog pops up and says Removed Records: Sorting from /xl/worksheets/sheet2.xml part.
The code works as designed; the dataset reflects whatever is selected in the slicers (even many selections).
I set the array (a string array) as follows and then use the array to set the criteria:
If sCache.Name = "Slicer_Test" Then
    For Each sItem In ActiveWorkbook.SlicerCaches(sCache.Name).SlicerItems
        If sItem.Selected = True Then
            ReDim Preserve sArr(0 To sCount)
            sArr(sCount) = sItem.Name
            sCount = sCount + 1
        End If
    Next sItem
    filterRng.AutoFilter Field:=9, Criteria1:=sArr, Operator:=xlFilterValues
    ReDim sArr(0 To 0)
End If
I replicate the above code for each slicer.
Where I think the problem stems from is that the three largest slicers contain 27, 120, and 322 items, respectively. So as you can imagine, when all the items in the largest slicer are selected, the array's string length is over 5K characters long... like I mentioned above, the code works as designed. I found this thread, which mentions a character maximum?
I've tried removing the filters before saving/closing the workbook, but that doesn't always work, and this file will be used by many other people. So I'm wondering if 1) anyone has a suggestion for a way to work around this error, or 2) there might be a way to accomplish the filtering without using a terribly long array...
Any thoughts on this will be much appreciated!
A co-worker of mine helped me resolve the issue.
Apparently when using this syntax:
Criteria1:=sArr
Excel reads the array as one long string instead of looking at it as an array that contains many string elements.
The fix is to use the Array() function like so:
Criteria1:=Array(sArr)
This seems to prevent Excel from corrupting the file.
Sorting the data before applying the AutoFilter may also help the filter run faster.

Why is my conditional format offset when added by VBA?

I was trying to add conditional formats like this:
If the expression =($G5<>"") is true, set the interior to green; apply this to $A$5:$H$25.
I tried this manually and it worked as expected. I then adapted it to VBA with the following code, which runs, but not as expected:
With ActiveSheet.UsedRange.Offset(1)
    .FormatConditions.Delete
    'set used row range to green interior color, if "Erledigt Datum" is not empty
    With .FormatConditions.Add(Type:=xlExpression, _
            Formula1:="=($" & cstrDefaultProgressColumn & _
            .row & "<>"""")")
        .Interior.ColorIndex = 4
    End With
End With
The problem is that .row provides the right row when I debug, but the conditional formula that gets added is one or more rows off, depending on how I set the row. So I end up with conditional formatting that is offset from the row that should have been formatted.
In the dialog it then shows =($G6<>"") or G3 or G100310 or something like that, but not my desired G5.
Setting the row has to be dynamic, because this is used to set up conditional formats on different worksheets, whose data can start at different rows.
I was suspecting my With arrangement, but it did not fix this problem.
edit: To be more specific, this is NOT a UsedRange problem, having the same trouble with this:
Dim rngData As Range
Set rngData = ActiveSheet.Range("A:H") 'ActiveSheet.UsedRange.Offset(1)
rngData.FormatConditions.Delete
With rngData.FormatConditions.Add(Type:=xlExpression, _
        Formula1:="=($" & cstrDefaultProgressColumn & _
        1 & "<>"""")")
    .Interior.ColorIndex = 4
End With
My Data looks like this:
1 -> empty cells
2 -> empty cells
3 -> empty cells
4 -> TitleCols -> A;B;C;...;H
5 -> Data to TitleCols
. .
. .
. .
25
When I execute this edited code on Excel 2007 and lookup the formula in the conditional dialog it is =($G1048571<>"") - it should be =($G1<>""), then everything works fine.
What's even stranger: this is an edited version of working code that used to add conditional formats for each row. Then I realized that it's possible to write one expression that formats a whole row or parts of it. I thought this would be adapted in a minute, and now this ^^
edit: Additional task information
I use conditional formatting here because this function is meant to set up a table that reacts to user input. So, if it is set up properly and a user edits a cell in the conditional column of this table, the corresponding row turns green across the used range.
Now, because there might be rows before the main header row, the number of data columns may vary, and the targeted column may also change, I do of course use some specific information.
To keep it minimal, I use named ranges to determine the correct offset and the correct default progress column.
GetTitleRow is used to determine the header-row by NamedRange or header-contents.
With ActiveSheet.UsedRange.Offset(GetTitleRow(ActiveSheet.UsedRange) - _
ActiveSheet.UsedRange.Rows(1).row + 1)
I corrected my Formula1, because the previous construct was not well formed.
Formula1:="=(" & Cells(.row, _
Range(strMatchCol1).Column).Address(RowAbsolute:=False) & _
"<>"""")"
strMatchCol1 - is the name of a range.
Got it, lol. Set the ActiveCell before doing the grunt work...
ActiveSheet.Range("A1").Activate
Excel is pulling its automagic range adjusting, which throws off the formula when the FormatCondition is added.
The reason that Conditional Formatting and Data Validation exhibit this strange behavior is because the formulas they use are outside the normal calculation chain. They have to be so that you can refer to the active cell in the formula. If you're in G1, you can't type =G1="" because you'll create a circular reference. But in CF or DV, you can type that formula. Those formulas are disassociated with the current cell unlike real formulas.
When you enter a CF formula, it's always relative to the active cell. If, in CF, you make a formula
=ISBLANK($G2)
and you're in A5, Excel converts it to
=ISBLANK(R[-3]C7)
and when that gets put into the CF, it ends up being relative to the cell it's applied to. So in row 2, the formula comes out to
=ISBLANK($G65535)
(for Excel 2003). It offsets -3 rows and that wraps to the bottom of the spreadsheet.
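The wrap-around can be modelled as simple modular arithmetic. The sketch below is in Python and is only an illustration of the behaviour described here, not Excel's actual implementation; the function and parameter names are hypothetical. The relative offset is fixed by where the active cell was when the formula was entered, and a row that would fall above row 1 wraps to the bottom of the sheet.

```python
def wrapped_row(applied_row, formula_row, active_row, total_rows):
    """Row that a relative row reference resolves to, modelled as wrap-around.

    formula_row: row typed into the formula (2 for $G2)
    active_row:  row of the active cell when the formula was entered
    applied_row: row of the cell the condition is evaluated for
    total_rows:  rows per sheet (65536 in Excel 2003, 1048576 in Excel 2007)
    """
    offset = formula_row - active_row  # 2 - 5 = -3, i.e. R[-3]
    return (applied_row + offset - 1) % total_rows + 1


# $G2 entered while A5 is active, evaluated for row 2 on an Excel 2003 sheet:
print(wrapped_row(2, 2, 5, 65536))  # 65535
# The same formula evaluated for row 5 refers to row 2 again:
print(wrapped_row(5, 2, 5, 65536))  # 2
```

Activating the first cell of the target range before adding the condition makes the offset zero for that cell, which is why the Range("A1").Activate workaround gives the expected formula.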
You can use Application.ConvertFormula to make the formula relative to some other cell. If I'm in row 5 and the start of my range is in row 2, I make the formula relative to row 8. That way the R[-3] will put the formula in A5 as $G5 (three rows up from A8).
Sub test()
    Dim cstrDefaultProgressColumn As String
    Dim sFormula As String
    cstrDefaultProgressColumn = "$G"
    With ActiveSheet.UsedRange.Offset(1)
        .FormatConditions.Delete
        'set used row range to green interior color, if "Erledigt Datum" is not empty
        'Build formula
        sFormula = "=ISBLANK(" & cstrDefaultProgressColumn & .Row & ")"
        'convert to r1c1
        sFormula = Application.ConvertFormula(sFormula, xlA1, xlR1C1)
        'convert to a1 and make relative
        sFormula = Application.ConvertFormula(sFormula, xlR1C1, xlA1, , ActiveCell.Offset(ActiveCell.Row - .Cells(1).Row))
        With .FormatConditions.Add(Type:=xlExpression, _
                Formula1:=sFormula)
            .Interior.ColorIndex = 4
        End With
    End With
End Sub
I only offset .Cells(1) row-wise because the column is absolute in this example. If both row and column are relative in your CF formula, you need more offsetting. Also, this only works if the active cell is below the first cell in your range. To make it more general purpose, you would have to determine where the activecell is relative to the range and offset appropriately. If the offset put you above row 1, you would need to code it so that it referred to a cell nearer the bottom of the total number of rows for your version of Excel.
If you thought selecting was a bit of a kludge, I'm sure you'll agree that this is worse. Even though I abhor unnecessary Selecting and Activating, Conditional Formatting and Data Validation are two places where it's a necessary evil.
A brief example:
Sub Format_Range()
    Dim oRange As Range
    Dim iRange_Rows As Integer
    Dim iCnt As Integer
    'First, create a named range manually in Excel (eg. "FORMAT_RANGE")
    'In your case that would be range "$A$5:$H$25".
    'You only need to do this once,
    'through VBA you can afterwards dynamically adapt size + location at any time.
    'If you don't feel comfortable with that, you can create headers
    'and look for the headers dynamically in the sheet to retrieve
    'their position dynamically too.
    'Setting this range makes it independent
    'from which sheet in the workbook is active
    'No unnecessary .Activate is needed and certainly no hard coded "A1" cell.
    '(which makes it more potentially subject to bugs later on)
    Set oRange = ThisWorkbook.Names("FORMAT_RANGE").RefersToRange
    iRange_Rows = oRange.Rows.Count
    For iCnt = 1 To iRange_Rows
        If oRange(iCnt, 1) <> oRange(iCnt, 2) Then
            oRange(iCnt, 2).Interior.ColorIndex = 4
        End If
    Next iCnt
End Sub
Regarding my comments given on the other reply:
If you have to do this for many rows, it is definitely faster to load the entire range into memory (an array) and check the conditions within the array, after which you write only to those cells that need to be written (formatted).
I could agree that this technique is not "necessary" in this case - however it is good practise because it is flexible for many (any type of) customizations afterwards and easier to debug (using the immediate / locals / watches window).
I'm not a fan of Offset although I don't state it doesn't work as it should and in some limited scenarios I could say that the chance for problems "could" be small: I experienced that some business users tend to use it constantly (here offset +3, there offset -3, then again -2, etc...); although it is easy to write, I can tell you it is hell to revise. It is also very often subject to bugs when changes are made by end users.
I am very much "for" the use of headers (although I'm also a fan of reducing database capabilities for Excel, because for many it results in avoiding Access), because they give you a lot of flexibility. Although I used columns 1 and 2 here, it is better to retrieve the column number dynamically based on the location of the header's named range. Then, if another column is inserted, no bugs will appear.
Last but not least, it may sound exaggerated, but the last time, I used a class module with properties and functions to perform all retrievals of potential data within each sheet dynamically, perform checks on all bugs I could think of and some additional functions to execute specific tasks.
So if you need many types of data from a specific sheet, you can instantiate that class and have all the data at your disposal, accessible through defined functions. I haven't noticed anyone doing it so far, but it causes little trouble in return for a bit more work (you can use the same principles over and over again).
Now I don't think that this is what you need; but there may come a day that you need to make large tools for end users who don't know how it works but will complain a lot about things because of something they might have done themselves (even when it's not your "fault"); it's good to keep this in mind.

Excel Macro Autofilter issue with variable

I have a table of data whose top row holds filters. I have a loop that changes which filter needs to be used; inside the loop, the variable filterColumn is assigned a new value on every iteration.
When I try to use filterColumn to determine which filter will be 'switched on', I get an error:
Autofilter method of Range Class Failed
ActiveSheet.Range("$U$83:$CV$1217").AutoFilter Field:=filterColumn, Criteria1:="<>"
What is the correct syntax in order to use a variable to determine which field the filter is in?
Problem solved: I found the solution. I was referencing the filter column's position in terms of the whole worksheet, when in fact I should have been referencing its position within the group of filters. For example, the filter I wanted to change was in 'CF', which is the 84th column of the worksheet, but it is the 64th filter in the group.
Dim filterColumn As Integer
filterColumn = 2
ActiveSheet.Range("$U$83:$CV$1217").AutoFilter Field:=filterColumn, _
Criteria1:="<>"
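The relative numbering of the Field argument can be checked with a little arithmetic. The following Python sketch is only an illustration (the helper names are hypothetical): it converts column letters to worksheet column numbers and then computes the position of a column within a filtered range that starts at column U.

```python
def column_number(letters):
    """Convert an Excel column label such as "CF" to its 1-based number."""
    number = 0
    for ch in letters.upper():
        number = number * 26 + (ord(ch) - ord("A") + 1)
    return number


def autofilter_field(target_column, first_filter_column):
    """Position of target_column counted from the first column of the filtered range."""
    return column_number(target_column) - column_number(first_filter_column) + 1


print(column_number("CF"))          # 84
print(autofilter_field("CF", "U"))  # 64
```

For the range $U$83:$CV$1217, column CF is worksheet column 84 but field 64, which matches the numbers given in the accepted solution.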
EDIT: I tried #HeadofCatering's solution and initially it failed. However I filled in values in the referenced columns and it worked (my solution also failed under reverse conditions - make the column headers blank and it fails).
However, this doesn't quite mesh with what I've (and probably you've) seen: you can definitely add filters to columns with blank headers. One thing was consistent in the failures I saw, though: filterColumn referenced a column that was outside of the sheet's UsedRange. You may want to verify that the column you are referencing is actually within ActiveSheet.UsedRange (easy way: run ActiveSheet.UsedRange.Select in the Immediate Window and see if your column is selected). Since you are referencing a decent number of columns, it is possible that there are no values past a certain point (including column headers), and when you specify the column to filter, you are actually specifying something outside of your UsedRange.
An interesting (this is new to me as well) thing to test is taking a blank sheet, filling in values in cells A1 and B1, selecting columns A:G and manually adding AutoFilters - this will only add filters to columns A and B (a related situation can be found if you try to add filters to a completely blank sheet).
Sorry for the babble - chances are this isn't even your problem :)
Old solution (doesn't work when conditions described above are used)
I may be overkilling it, but try setting the sheet values as well (note I used a sample range here):
Sub SOTest()
    Dim ws As Worksheet
    Dim filterColumn As Integer
    ' Set the sheet object and declare your variable
    Set ws = ActiveSheet
    filterColumn = 2
    ' Now try the filter
    ws.Range("$A$1:$E$10").AutoFilter Field:=filterColumn, Criteria1:="<>"
End Sub