openpyxl: Is it possible to load a workbook (with data_only=False), work on it, save it and reopen the saved file (with data_only=True)?

Basically the title. I already have an Excel file (with a lot of formulas) that I have to use as a template, and I have to copy certain columns and paste them into other columns.
Since I have to make some graphs in between, I need the numeric data of the Excel file, so my plan is the following:
1.- Load the file with data_only = False.
2.- Make the for loops needed to copy and paste info from one worksheet to another.
3.- Save the copied data as another Excel file.
4.- Open the newly created Excel file, this time with data_only = True, so I can work with the numeric values.
The problem is that data_only doesn't seem to work on the newly created file: when I build a list that filters out NoneType values and strings from a column that holds actual numerical values, I get an empty list.
#I made the following
from openpyxl import load_workbook
wb = load_workbook('file_name.xlsx', data_only = True)
S1 = wb['Sheet 1']
S2 = wb['Sheet 2']
#Determination of min and max cols and rows
col_min = S1.min_column
col_max = S1.max_column
row_min = S1.min_row
row_max = S1.max_row
for i in range(row_min + 2, row_max + 1):
    for j in range(col_min + Value, Value + 2): #Value is a column offset that is not defined in this snippet
        S2.cell(row = i+6, column = j+10-Value).value = S1.cell(row = i, column = j).value
wb.save('transition.xlsx') #save() returns None, so there is nothing to assign
wb1 = load_workbook('transition.xlsx', data_only = True) #To obtain only numerical values
S2 = wb1['Sheet 2'] #Re define my Sheet 2 values
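A likely explanation, offered as a sketch rather than a definitive diagnosis: openpyxl never evaluates formulas. With data_only=True it returns the result that Excel cached the last time Excel itself saved the file, and a workbook saved by openpyxl carries no cached results, so formula cells come back as None. The example below reproduces this with a throwaway file (the cell layout and temporary path are made up for illustration).

```python
import os
import tempfile
from openpyxl import Workbook, load_workbook

# Build a small workbook with one formula and save it with openpyxl only.
wb = Workbook()
ws = wb.active
ws["A1"] = 2
ws["A2"] = 3
ws["A3"] = "=SUM(A1:A2)"

# Hypothetical location for the intermediate file.
path = os.path.join(tempfile.mkdtemp(), "transition.xlsx")
wb.save(path)

# data_only=False returns the formula text.
formula_value = load_workbook(path, data_only=False).active["A3"].value

# data_only=True returns the cached result, which openpyxl did not write.
cached_value = load_workbook(path, data_only=True).active["A3"].value

print(formula_value)  # =SUM(A1:A2)
print(cached_value)   # None
```

If this is the cause, two possible workarounds are to open and re-save the intermediate file in Excel before reading it with data_only=True, or to read the numeric values from the original template with a second load_workbook call that uses data_only=True.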

Related

VBA - to compare 2 tables from 2 sheets, and get result in another sheets creating comparison table

This is beyond my abilities.
I know how to write VBA code with VLookup or HLookup for a single comparison. However, what I am trying to do is beyond my knowledge.
I need to compare 2 tables. The 1st is the requirements and the 2nd is a DB extract.
Both tables contain the same "Action" column and other columns with an Employer role as the header. The values in the Action column are the same in both tables but in a different order (those values need to act as the primary key).
The columns with an Employer role as the header have the same header values in both tables, but the columns are in a different order.
The number of columns with an Employer role as the header is not constant and changes every time I get these files. The columns in the extract are in a different order than in the requirements.
The number of values in the "Action" column (primary key) is also not constant and changes every time I receive the files, so I cannot set a specific range.
Example of Requirements table
Example of Extract table
Example of what is expected
A new target worksheet needs to be created, where the Comparison table will be placed.
I need VBA to create the Comparison table in the newly created worksheet.
This table should have the "Action" column + all columns with an Employer role as header from the requirements + all columns with an Employer role as header from the extract, set in the same order as the columns in the requirements + a comparison table which compares the values between the Employer roles from Requirements and Extract and shows the values YES or NO.
Try the next code, please. It will return the result in a newly created sheet (after the last existing one). The new sheet is named "New Sheet". If it exists, it is cleared and reused:
Sub testMatchTables()
Dim sh As Worksheet, sh1 As Worksheet, shNew As Worksheet
Dim tbl1 As ListObject, tbl2 As ListObject, rightH As Long
Dim arrR, arrE, arrH1, arrH2, arrFin, i As Long, j As Long, k As Long
Dim first As Long, sec As Long, refFirst As Long, refSec As Long
Set sh = Worksheets("Requirements") 'use here the sheet keeping the first table
Set sh1 = Worksheets("Extract Table") 'use here your appropriate sheet
Set tbl1 = sh.ListObjects(1) 'use here your first table name (instead of 1)
Set tbl2 = sh1.ListObjects(1) 'use here your second table name (instead of 1)
arrR = tbl1.Range.value 'put the table range in an array
arrE = tbl2.Range.value 'put the table range in an array
'working with arrays will considerably increase the processing speed
ReDim arrFin(1 To UBound(arrR), 1 To UBound(arrR, 2) * 3 + 2) 'redim the array to keep the processing result
'UBound is a function returning the upper bound (here, the number of elements) of an array dimension
arrH1 = Application.Index(arrR, 1, 0) 'make a slice in the array (1D array), the first row, which keeps the headers
arrH2 = Application.Index(arrE, 1, 0) 'make a slice in the array (1D array), the first row, which keeps the headers
'build the column headers:
For i = 1 To UBound(arrFin, 2)
If i <= UBound(arrH1) Then 'firstly the headers of the first table are filled in the final array
arrFin(1, i) = arrH1(i)
ElseIf refSec = 0 Then 'refSec is the column where a blank column will exist
first = first + 1 'the code increments this variable so that only the next column is made empty
If first = 1 Then
arrFin(1, i) = Empty: refFirst = i 'make the empty column between the two tables data and create a reference
'to be subtracted from the already incremented i variable
Else
arrFin(1, i) = arrH1(i - refFirst) 'place each header column values
If i - refFirst = UBound(arrH1) Then refSec = i + 1 'when the code reaches the end of the first array
'it creates a reference for referencing the second time
End If
Else
sec = sec + 1 'the same philosophy as above, to create the second empty column
If sec = 1 Then
arrFin(1, i) = Empty 'create the empty column (for each processed row)
Else
arrFin(1, i) = arrH1(i - refSec) 'fill the header columns
End If
End If
Next
Dim C As Long, r As Long, eT As Long, T As Long
eT = UBound(arrR) 'mark the ending of the first array (where to be the first empty column)
T = UBound(arrR, 2) * 2 + 2 'mark the beginning of the third final array part
'after the second empty column
For i = 2 To UBound(arrR) 'iterating between the first array rows
For j = 2 To UBound(arrE) 'iterating the second array rows
If arrR(i, 1) = arrE(j, 1) Then 'if the both arrays first column matches
arrFin(i, 1) = arrR(i, 1): arrFin(i, T + 1) = arrR(i, 1) 'put the Action values in the first area columns
arrFin(i, eT) = arrR(i, 1) 'put the Action values in the last area column
For C = 2 To UBound(arrR, 2) 'iterate between the array columns
rightH = Application.match(arrR(1, C), arrH2, 0) 'find the match of the first array header in the second one
arrFin(i, C) = arrR(i, C): arrFin(i, C + eT - 1) = arrE(j, rightH) 'place the matching header in the final array
If arrR(i, C) = arrE(j, rightH) Then
arrFin(i, T + C) = "TRUE" 'place 'TRUE' in case of matching
Else
arrFin(i, T + C) = "FALSE" 'place 'FALSE' in case of NOT matching
End If
Next C
End If
Next j
Next i
On Error Resume Next 'necessary so that the code continues if worksheet "New Sheet" does not exist
Set shNew = Worksheets("New Sheet")
If err.Number = 9 Then 'if it raises error number 9, this means that the sheet does not exist
err.Clear: On Error GoTo 0 'clear the error and make the code return other errors, if any
Set shNew = Worksheets.Add(After:=Worksheets(Worksheets.count)) 'set shNew as new inserted sheet
shNew.name = "New Sheet" 'name the newly inserted sheet
Else
shNew.cells.Clear: On Error GoTo 0 'if the sheet exists, it is cleared and the code is made to return errors again
End If
'set the range where the final array to drop its values:
With shNew.Range("A1").Resize(UBound(arrFin), UBound(arrFin, 2))
.value = arrFin 'drop the array content
.EntireColumn.AutoFit 'AutoFit the involved columns
End With
End Sub
Please, test it and send some feedback.
Edited:
I commented the code in as much detail as I could. If something is still unclear, please do not hesitate to ask for clarification.
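For readers who want to check the matching logic outside Excel, here is a minimal Python sketch of the same idea. It is not a translation of the VBA above: it assumes each table is a list of rows with the headers in the first row and "Action" in the first column, that every role header in the requirements also exists in the extract, and that actions missing from the extract are simply skipped. The function name compare_tables and the sample data are made up for illustration. It writes YES/NO as the question asks, whereas the VBA writes TRUE/FALSE.

```python
def compare_tables(requirements, extract):
    """Compare two tables given as lists of rows; row 0 holds the headers
    and column 0 holds the "Action" key."""
    req_header, ext_header = requirements[0], extract[0]
    roles = req_header[1:]  # role columns, in the order of the requirements table
    # Index the extract rows by their Action value for direct lookup.
    ext_rows = {row[0]: row for row in extract[1:]}
    # Map each header name to its column position in the extract.
    ext_col = {name: idx for idx, name in enumerate(ext_header)}

    # Header: requirements block, blank column, extract block, blank column, flags.
    result = [["Action"] + roles + [""] + roles + [""] + roles]
    for row in requirements[1:]:
        action = row[0]
        ext_row = ext_rows.get(action)
        if ext_row is None:
            continue  # assumption: actions missing from the extract are skipped
        req_vals = row[1:]
        # Reorder the extract values to follow the requirements column order.
        ext_vals = [ext_row[ext_col[role]] for role in roles]
        flags = ["YES" if a == b else "NO" for a, b in zip(req_vals, ext_vals)]
        result.append([action] + req_vals + [""] + ext_vals + [""] + flags)
    return result


# Made-up sample data: same actions and roles, both in a different order.
requirements = [["Action", "Manager", "Clerk"],
                ["Create", "Y", "N"],
                ["Delete", "Y", "Y"]]
extract = [["Action", "Clerk", "Manager"],
           ["Delete", "Y", "N"],
           ["Create", "N", "Y"]]

result = compare_tables(requirements, extract)
for line in result:
    print(line)
```

Looking rows up through a dictionary keyed on "Action" avoids the nested loop over both tables, which matters little for small tables but keeps the logic easy to follow.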

Pandas, Copy cells from workbook to another

I am having issues copying cells from excel workbook and pasting as values to another workbook.
I get an error on line rowSelected.append(sheet.cell(row = i, column = j).value) with the message AttributeError: 'str' object has no attribute 'cell'
Can anyone help with this?
import openpyxl
#Prepare the spreadsheets to copy from and paste to.
#File to be copied
wb = openpyxl.load_workbook(r"") #Add file name
sheet = wb["BusinessDetails"] #Add Sheet name
#File to be pasted into
template = openpyxl.load_workbook(r"") #Add file name
temp_sheet = template["Sheet1"] #Add Sheet name
#Copy range of cells as a nested list
#Takes: start cell, end cell, and sheet you want to copy from.
def copyRange(startCol, startRow, endCol, endRow, sheet):
    rangeSelected = []
    #Loops through selected Rows
    for i in range(startRow,endRow + 1,1):
        #Appends the row to a RowSelected list
        rowSelected = []
        for j in range(startCol,endCol+1,1):
            rowSelected.append(sheet.cell(row = i, column = j).value)
        #Adds the RowSelected List and nests inside the rangeSelected
        rangeSelected.append(rowSelected)
    return rangeSelected
#Paste range
#Paste data from copyRange into template sheet
def pasteRange(startCol, startRow, endCol, endRow, sheetReceiving,copiedData):
    countRow = 0
    for i in range(startRow,endRow+1,1):
        countCol = 0
        for j in range(startCol,endCol+1,1):
            sheetReceiving.cell(row = i, column = j).value = copiedData[countRow][countCol]
            countCol += 1
        countRow += 1
def createData():
    print("Processing...")
    selectedRange = copyRange(1,2,4,14,sheet) #Change the 4 number values
    pastingRange = pasteRange(1,3,4,15,temp_sheet,selectedRange) #Change the 4 number values
    #You can save the template as another file to create a new file here too.
    template.save(r"")
    print("Range copied and pasted!")
copyRange(2,4,30,78,"BusinessDetails")
pasteRange(2,4,30,78,"Sheet1")
It must be:
copyRange(2,4,30,78,sheet)
pasteRange(2,4,30,78,temp_sheet)
i.e. you need to pass the sheet objects, not the sheet names to your functions.
Update as per comment:
rangeSelected = copyRange(2,4,30,78,sheet)
pasteRange(2,4,30,78,temp_sheet, rangeSelected)
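To make the difference concrete, here is a small self-contained sketch that uses in-memory workbooks instead of the files from the question. The helper names copy_range and paste_range and the sample rows are made up for illustration; the sheet names are taken from the question. It shows the working call with worksheet objects and reproduces the AttributeError when a sheet name is passed instead.

```python
from openpyxl import Workbook


def copy_range(start_col, start_row, end_col, end_row, sheet):
    """Return the cell values of a rectangular range as a nested list."""
    return [
        [sheet.cell(row=r, column=c).value for c in range(start_col, end_col + 1)]
        for r in range(start_row, end_row + 1)
    ]


def paste_range(start_col, start_row, sheet_receiving, copied_data):
    """Write a nested list into the sheet, starting at the given top-left cell."""
    for r_offset, row_values in enumerate(copied_data):
        for c_offset, value in enumerate(row_values):
            sheet_receiving.cell(row=start_row + r_offset,
                                 column=start_col + c_offset).value = value


# In-memory stand-ins for the two workbooks.
wb = Workbook()
sheet = wb.active
sheet.title = "BusinessDetails"
sheet.append(["id", "name"])
sheet.append([1, "Acme"])
sheet.append([2, "Globex"])

template = Workbook()
temp_sheet = template.active
temp_sheet.title = "Sheet1"

selected = copy_range(1, 2, 2, 3, sheet)   # pass the worksheet object
paste_range(1, 3, temp_sheet, selected)    # pass the target sheet and the copied data

try:
    copy_range(1, 2, 2, 3, "BusinessDetails")  # passing the name reproduces the error
except AttributeError as exc:
    error_message = str(exc)
    print(error_message)
```

Deriving the pasted area from the size of the copied data, rather than passing a second set of end coordinates, is a design choice of this sketch; it removes the risk of the two ranges having different sizes.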

can openpyxl amend chart data labels to values in selected cells?

I have an Excel spreadsheet which contains values across three columns: A = Task, B = Date, and C = Team. I've written some simple code to build an Excel scatter chart that places the values in column C on the y axis and the values in column B on the x axis. However, I would like the chart data labels to be taken from column A. Is there any way of doing this?
At the moment I'm getting the data labels from column B by using chart.dataLabels.showCatName = True, which is useful but not what I need. I think it might be possible by specifying the labels via the chart.dataLabels parameter dLbl, which stands for data label list. The source code for it is here: http://openpyxl.readthedocs.io/en/2.3.5/_modules/openpyxl/chart/label.html .
The most useful posts I found on this issue are here https://groups.google.com/forum/#!topic/openpyxl-users/jB0pPVUtUsM, and here Adding labels to ScatterChart in openpyxl
At the moment the chart looks like this.
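from openpyxl.chart import ScatterChart, Reference, Series
from openpyxl.chart.label import DataLabelList
chart = ScatterChart()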
chart.title = "Team's Schedule"
chart.style = 7
chart.x_axis.title = 'Days'
chart.y_axis.title = 'Team'
xvalues = Reference(chart_sheet, min_col=2, min_row=2, max_row=20)
values = Reference(chart_sheet, min_col=3, min_row=1, max_row=20)
series = Series(values, xvalues, title_from_data=True)
chart.series.append(series)
s1 = chart.series[0]
s1.marker.symbol = "triangle"
s1.marker.graphicalProperties.solidFill = "FF0000" # Marker filling
s1.marker.graphicalProperties.line.solidFill = "FF0000" # Marker outline
s1.graphicalProperties.line.noFill = True
chart.dataLabels = DataLabelList()
chart.dataLabels.showCatName = True
chart_sheet.add_chart(chart, "F2")

Dynamically add items from a file to a ComboBox

I am working on an application that allows the user to dynamically add to and remove items from an excel file. The quantity of items shall be unlimited.
I am looking for a way to grab the items from the excel file and transfer them to the ComboBox.
To make myself clearer: The problem is not iterating through cells, but getting cell values into the ComboBox. I need a method that captures the content of all cells with values in a given column, where the end of range is unknown and then transfer the values to a ComboBox.
The Combobox only accepts values, not any empty cells. I also don't want fields in the ComboBox that say "No Value".
I have tried iterating through cells and using range methods, but this doesn't get the values into the ComboBox.
What I have so far is:
wb = load_workbook (source_file)
ws = wb.active
self.value_1 = ws['B2'].value
self.value_2 = ws['B3'].value
self.value_3 = ws['B4'].value
self.value_4 = ws['B5'].value
self.value_5 = ws['B6'].value
self.value_6 = ws['B7'].value
self.value_7 = ws['B8'].value
self.value_8 = ws['B9'].value
self.value_9 = ws['B10'].value
self.value_10 = ws['B11'].value
stock_items = [self.value_1, self.value_2, self.value_3, self.value_4, self.value_5,
               self.value_6, self.value_7, self.value_8, self.value_9, self.value_10]
self.combo_items_list = []
for stock_item in stock_items:
    if stock_item is not None:
        self.combo_items_list.append(stock_item)
self.combo.addItems(self.combo_items_list)
This works as expected, but what troubles me is that I have to add a line of code for each item I grab from the Excel file, besides having to put an extra entry into the stock_items list. If there were 5,000 items in the file, that would result in 5,000 lines of code and 5,000 entries in the list.
Is there a more efficient and elegant way to handle the issue with "counter" or pandas?
Thanks in advance.
I found a way to do this nicely using pandas, not openpyxl:
import pandas as pd
# get sheet and whole column
sales = pd.read_excel("Inventory.xlsx")
# filter out any None values
sales_article = sales["Artigo"].dropna()
# transform into list
sales_list = sales_article.values.tolist()
# add list to ComboBox
self.combo.addItems(sales_list)
In openpyxl 2.4, worksheets have an iter_cols method that allows you to select a range of cells and have them returned as columns, just like iter_rows returns them as rows. This is the simplest and most efficient way to do what you want to do.
See https://openpyxl.readthedocs.io/en/default/tutorial.html#accessing-many-cells for details.
An example for your use case (iter_cols yields one tuple of cells per column, hence the nested loop):
cells = [cell.value for col in ws.iter_cols(min_col=2, max_col=2, min_row=2) for cell in col if cell.value is not None]
wb = load_workbook(source_file)
ws = wb.active
lastrow = ws.max_row # last row of the sheet that contains data
for row in range(2, lastrow + 1):
    value = ws['B' + str(row)].value
    if value is not None:
        self.combo_items_list.append(value)
self.combo.addItems(self.combo_items_list)
See worksheet docs for other ways to get range of excel rows.
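One of those other ways, sketched here under a few assumptions: iter_rows with values_only=True yields plain value tuples instead of cell objects, so a single column can be collected without knowing the last row in advance. The helper name column_values and the sample rows are made up for illustration, and the workbook is built in memory instead of being loaded from source_file.

```python
from openpyxl import Workbook


def column_values(ws, column, first_row=2):
    """Return all non-empty values of one column, from first_row down to the last used row."""
    rows = ws.iter_rows(min_row=first_row, min_col=column, max_col=column,
                        values_only=True)
    # Each row is a one-element tuple because only one column is selected.
    return [row[0] for row in rows if row[0] is not None]


# In-memory stand-in for the inventory file, with one empty cell in column B.
wb = Workbook()
ws = wb.active
ws.append(["Code", "Item"])
ws.append([1, "Bolt"])
ws.append([2, None])
ws.append([3, "Nut"])

items = column_values(ws, column=2)
print(items)  # ['Bolt', 'Nut']
```

The resulting list could then be handed to self.combo.addItems(...); if the column can contain numbers, converting each value with str() first is probably needed, since the combo box expects strings.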
wb = load_workbook(source_file)
ws = wb.active
self.combo_items_list = []
# loop from row 2 (start) to row 11 (end)
for counter in range(2, 12):
    value = ws['B' + str(counter)].value
    # add the value only if it is available and not None
    if value is not None:
        self.combo_items_list.append(value)
self.combo.addItems(self.combo_items_list)
Sorry if I am getting it wrong. I don't know the syntax of the given language, but I saw a logical answer and decided to post.

VBA for loop with VLookups avoiding duplicates

I am compiling a list of employees CPD classes and the hours they are, producing a printout essentially for each one. I am pulling data from multiple worksheets and am having trouble with VLOOKUP continually finding the same data. My code currently is:
result = lookup.Find(name) Is Nothing
If (result = False) Then
    For Each cell In lookup
        className = Application.VLookup(name, lookup, 7, True)
        classHours = Application.VLookup(name, lookup, 9, True)
        Sheets("employee output").Range("B" & counter).Formula = className
        Sheets("employee output").Range("C" & counter).Formula = classHours
        empHours = empHours + classHours
        counter = counter + 1
        numClasses = numClasses + 1
        corporate = Range("B" & i) 'VBA precompiles the loop, so this doesn't work
        Set lookup = Range("corporate:K") 'doesn't work either
    Next
End If
So I need to limit the scope of the lookup range or somehow avoid finding the same data. Any help would be greatly appreciated...
I assume your lookup variable represents the range containing your data.
Your data duplication probably comes from your For Each loop: you are iterating through each cell of your range. That means that once the name is found, the code inside the loop is executed as many times as there are cells in the range lookup.
Try changing your loop to the following:
Dim row As Range
For Each row In lookup.Rows
    'your code goes here
Next
With this, the code will only run once for each row.
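Note that even with one pass per row, VLookup returns the same row every time it is called with the same name and the same range, so the loop body still has to read the values from the current row. The underlying idea, scanning each row once and keeping every row whose name matches, can be sketched in Python as follows. This is an illustration of the logic, not VBA: the function name classes_for and the sample rows are made up, and the column numbers 7 and 9 are taken from the VLookup calls in the question.

```python
def classes_for(name, rows, name_col=1, class_col=7, hours_col=9):
    """Collect (class name, hours) for every row whose name column matches.

    Column numbers are 1-based, like the column index passed to VLOOKUP.
    """
    matches = []
    for row in rows:
        if row[name_col - 1] == name:
            matches.append((row[class_col - 1], row[hours_col - 1]))
    return matches


# Made-up sample data: 9 columns per row, name in column 1,
# class name in column 7 and hours in column 9.
rows = [
    ["Ann", "", "", "", "", "", "Safety", "", 2],
    ["Bob", "", "", "", "", "", "Ethics", "", 1],
    ["Ann", "", "", "", "", "", "First Aid", "", 3],
]

ann_classes = classes_for("Ann", rows)
ann_hours = sum(hours for _, hours in ann_classes)
print(ann_classes)  # [('Safety', 2), ('First Aid', 3)]
print(ann_hours)    # 5
```

Because each row is visited exactly once, a class can only appear twice in the output if it appears twice in the data.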