Copy whole worksheet with openpyxl

Can someone please give me an example of how to copy
a whole worksheet, including styles (for rows and columns),
to a second worksheet in the same workbook?
(Copying into a new workbook would also be fine.)
Thank you.
P.S.: I tried a deepcopy, but saving failed after I changed data cells.
The purpose: the first worksheet is my template, and I want to fill several worksheets with my data.
I managed to copy the values, but only some of the styles.
I am using the latest version of openpyxl, so please no 1.x methods.

From version 2.4 onwards you can do this with copy_worksheet:
>>> source = wb.active
>>> target = wb.copy_worksheet(source)
For older versions you can probably copy the source code from here.
UPDATE: You can't simply graft this code into older versions of the library.
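To make the template use case from the question concrete, here is a minimal sketch, assuming openpyxl 2.4 or later is installed. The sheet titles and cell contents are made up for illustration. As far as the openpyxl documentation describes it, copy_worksheet copies cell values and styles plus some worksheet attributes, does not copy images or charts, and only works within a single workbook.

```python
from openpyxl import Workbook
from openpyxl.styles import Font

# Build a tiny "template" sheet with one styled header cell.
wb = Workbook()
template = wb.active
template.title = "Template"
template["A1"] = "Name"
template["A1"].font = Font(bold=True)

# Copy the template inside the same workbook, then fill the copy with data.
report = wb.copy_worksheet(template)
report.title = "Report 1"  # hypothetical title; the default is "Template Copy"
report["A2"] = "Alice"

# The copy keeps the header value and its bold font,
# while the template itself stays untouched.
print(report["A1"].value, report["A1"].font.bold, template["A2"].value)
```

Calling wb.save(...) afterwards writes both sheets to one file.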

You can't do this easily. The best approach is probably the one described in bug 171

I had the same problem. I solved it by using copy instead of deepcopy. I found the solution on this site.
I hope this works for you!
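The page that answer linked to is not preserved here, so the following is only a guess at what "copy instead of deepcopy" looks like in practice: a sketch that copies each cell's value and makes shallow copies of its style objects. The helper name copy_sheet_contents is hypothetical, and the code assumes openpyxl 2.6 or later, where cell.column is an integer.

```python
from copy import copy

from openpyxl import Workbook
from openpyxl.styles import Font


def copy_sheet_contents(source, target):
    """Copy values and cell styles from one worksheet to another."""
    for row in source.iter_rows():
        for cell in row:
            new_cell = target.cell(row=cell.row, column=cell.column, value=cell.value)
            if cell.has_style:
                # Shallow copies are enough here; assigning each one
                # registers the style with the target sheet's workbook.
                new_cell.font = copy(cell.font)
                new_cell.border = copy(cell.border)
                new_cell.fill = copy(cell.fill)
                new_cell.alignment = copy(cell.alignment)
                new_cell.protection = copy(cell.protection)
                new_cell.number_format = cell.number_format


# Demo with made-up data: copy from one workbook into another.
src_wb = Workbook()
src = src_wb.active
src["A1"] = "Total"
src["A1"].font = Font(bold=True)
src["B1"] = 12.5
src["B1"].number_format = "0.00"

dst_wb = Workbook()
dst = dst_wb.active
copy_sheet_contents(src, dst)
```

Unlike copy_worksheet, this also works across workbooks, but it only handles cells: column widths, row heights and merged ranges would need to be copied separately.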

Regarding the links to the openpyxl repository in the answers by charlie-clark and sam: these no longer work, since the repo was moved from Bitbucket to a GitLab-based host (Heptapod).
The updated link for issue 171 is https://foss.heptapod.net/openpyxl/openpyxl/-/issues/171

Note that the copy_worksheet method does not copy data validations. You can copy them manually using openpyxl's add_data_validation. A full working example:
import openpyxl

# source_excel, sheet_name and new_sheet_name are placeholders you set yourself
# Copy worksheet
workbook = openpyxl.load_workbook(source_excel)
sheet_to_copy = workbook[sheet_name]
new_sheet = workbook.copy_worksheet(sheet_to_copy)
new_sheet.title = new_sheet_name
# Copy data validations
for data_validation in sheet_to_copy.data_validations.dataValidation:
    new_sheet.add_data_validation(data_validation)
# Save result
workbook.save(source_excel)

Related

using openpyxl to update certain columns in a complex spreadsheet

Hi, I have a pretty complex spreadsheet and I need to update only certain parts of its contents.
I tried openpyxl, which works OK, except that all the pivot tables, styles and formulas are lost after saving the workbook. Is there a way around that?
Check which version of openpyxl you have installed. Support for Pivot Tables was only added in version 2.5.0-a1 (2017-05-30), and that only supported one Pivot Table until version 2.5.0-b2 (2018-01-19). See changelog.
Alternatives: Simplify, xlwings, content protection, and win32com (most likely).
If you are using 2.5 and it doesn't work, you'll probably have to simplify your spreadsheet if you want to continue to use openpyxl. E.g. split it into a source and an output file, and use Python to update the source file.
You could try running xlwings to host the Python code inside the spreadsheet itself.
Another possible solution would be to lock the contents (turn on protection) for everything except the sections you plan to update. This is unlikely to work, as Python changes the contents directly, but it may respect the lock.
Finally, the option most likely to work: drive the changes through the win32com module. I think this, unlike openpyxl, requires Excel to be installed, as it uses Excel's own functionality to make the changes. Examples of usage are here and here.
A brief example (taken from here):
import win32com.client as win32
excel = win32.gencache.EnsureDispatch('Excel.Application')
wb = excel.Workbooks.Add()
ws = wb.Worksheets("Sheet1")
ws.Cells(1,1).Value = "Cell A1"
ws.Cells(1,1).Offset(2,4).Value = "Cell D2"
ws.Range("A2").Value = "Cell A2"
ws.Range("A3:B4").Value = "A3:B4"
ws.Range("A6:B7,A9:B10").Value = "A6:B7,A9:B10"
wb.SaveAs('ranges_and_offsets.xlsx')
excel.Application.Quit()

Import online xls data into MS access via VBA

I need to import exchange rate data stored in an Excel spreadsheet online into an Access data table. However first I need to manipulate it so I would like to import it into an array and then write the array to the table. The code I used for Excel doesn't seem to work with Access...
Dim arr As Variant
Workbooks.Open ("http://www.rba.gov.au/statistics/tables/xls-hist/f11hist.xls")
arr = ActiveWorkbook.Worksheets("Data").Range("A12:X" & Range("A1045876").End(xlDown).Row)
'data manipulation omitted
'add to data table
Clearly this doesn't work in Access, but I've got no idea how to open the file and read the data. Any help appreciated!
Your question is overly broad, so the answer is generic. You can use the Microsoft Excel object library in an MS Access application by adding a reference to that library; you can then use its methods much as you did in Excel. More details in: https://msdn.microsoft.com/en-us/library/office/ff194944.aspx. Hope this helps.
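I think this line can help you:
DoCmd.TransferSpreadsheet acImport,,"name of the Access table","path to the Excel file (ex:c:.....)",True
This imports an Excel sheet into Access. The third argument is the name of the Access table that receives the data, the fourth is the path to the Excel file, and the final True says that the first row contains field names.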
Behind the scenes, Excel does this by first downloading the file and then reading it.
Access can't do this, but you can use VBA to download the file (see "Download file from URL") and then create a link to a worksheet or a named range in it. Alternatively, open the file via automation.
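The download-then-read pattern itself is language-independent. As an illustration only (in Python rather than VBA, and not a replacement for the Access solution), a minimal sketch of the download step could look like this. The helper name download_to_temp is hypothetical, and only the standard library is used.

```python
import shutil
import tempfile
import urllib.request
from pathlib import Path


def download_to_temp(url: str, suffix: str = ".xls") -> Path:
    """Download `url` into a temporary file and return the file's path."""
    with urllib.request.urlopen(url) as response:
        # delete=False keeps the file on disk after the handle is closed,
        # so another program or library can open it afterwards.
        with tempfile.NamedTemporaryFile(delete=False, suffix=suffix) as tmp:
            shutil.copyfileobj(response, tmp)
            return Path(tmp.name)
```

The returned path can then be handed to whatever reads the spreadsheet, for example a linked table or an automation call on the Access side.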

Delete top row of excel file using SSIS

I have an Excel file with a header row that I want to delete. The header row in this file consists of cells A1 to W1 merged into one. This causes a problem when I try to read the file, because I am expecting column names. Proper column names exist in the second row of the file, which is why I want to delete the first.
To accomplish this I thought I'd be able to use the 'Excel Source' item in SSIS since it supports a SQL option to write a query. What I want to do is something like this:
SELECT * from ExcelFile WHERE Row > 1
My file only has data in columns A thru W.
I don't know what syntax I can use in the query to do this. The query builder that is in the Excel Source item will allow me to do many things with columns but I don't see an option for doing anything with rows. Searching online and using the help didn't get me anywhere.
None of these solutions will work because the Excel driver will be confused by the merged first line. You won't be able to use any driver features such as skip first row to do this. You need to run some script to open the Excel file and delete the row manually.
There is some basic sample script at this site:
http://www.sqlservercentral.com/Forums/Topic1327014-1292-1.aspx
The code below is adapted from the code written by snsingh at that site.
You would obviously want to use connection manager properties, not hard-coded paths.
Excel needs to be installed on the SSIS Server for it to work - this is the only way to use Excel automation.
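Dim filename As String
Dim appExcel As Object
Dim newBook As Object
Dim oSheet1 As Object
' Start Excel through automation (Excel must be installed on the server)
appExcel = CreateObject("Excel.Application")
filename = "C:\test.xls"
' Suppress prompts such as the overwrite confirmation on save
appExcel.DisplayAlerts = False
newBook = appExcel.Workbooks.Open(filename)
oSheet1 = newBook.Worksheets("Sheet1")
' Delete the merged header row so the column names move up to row 1
oSheet1.Range("A1").EntireRow.Delete()
' FileFormat 56 is xlExcel8, the Excel 97-2003 .xls format
newBook.SaveAs(filename, FileFormat:=56)
appExcel.Workbooks.Close()
appExcel.Quit()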
You don't need any special query syntax.
Go to the control flow and pull in a data flow task.
Add an Excel source and add a connection manager for the Excel sheet.
Open your connection manager and check the box that says "Column names in first row".
That's it; then add your destination.

how to read excel files by (column, row) with npoi and VISUAL BASIC

I'd like to be able to read an Excel file by column index, row by row, in VB .NET.
I can do it extremely easily with python XLRD, and with vb6 via "ADOX.Catalog"
Basically, this should be enough for my needs:
wb = open(excelfile)
ws = wb.worksheet(0)
cell = ws.get_cell(col, row)
How can I do that?
Is there a way to get the columns of a worksheet?
I haven't found any documentation of NPOI
I haven't found any *.vb file samples in the NPOI downloads (alpha, beta and stable versions)
Is there a unified Excel file opener that will detect the file version and pick the correct opener class, or do I have to do it by file extension?
Subjective question: am I the only one who thinks this API is overly complex?
IMPORTANT EDIT: I CAN'T INSTALL EXCEL ON THE MACHINE, so I'm looking for non-Excel solutions.
P.S. I'm new to .NET.
Not sure what npoi is, but this may help, assuming you have declared them:
xlObject = New Excel.Application()
xlWB = xlObject.Workbooks(1) ' Excel collections are 1-based; I think it also takes a string (name of wb)
sheet = xlWB.Worksheets(1)
cell = sheet.Range("B2") ' specific cell
Are you talking about xls vs xlsx? It should handle both fine...
And NO! It's not nearly as intuitive or simple as it should be!

Sparkline Printing to PDF

I have an issue where when I export to PDF via VBA my sparkline graphs are not printed. I've browsed your site, and a few others, trying to come up with a solution. Unfortunately I can't get it to work.
I'm the only one that uses the application, so the process is completely visible. I've tried to do all of the following before the export line in an effort to get the sparklines to 'refresh':
application.screenupdating = false then application.screenupdating = true
application.visible = true (based on forum here, even though it was never hidden)
select the cell where sparkline is located
select entire sheet where sparkline(s) are located
select.copy the cell where sparkline is located
application.wait to see if it would refresh
application.calculate to see if it would refresh
I really can't think of anything else to try. The spreadsheet is designed to create a report for a single entity, print the report, and then move on to create the next report for a different entity (pulls data from Access, creates over 200 10 page reports).
Any help is appreciated.
Thanks - Kris.
I had the same issue and I tried all of your listed ideas as well as DoEvents, but after testing the code line for line I found the offending code was:
.Axes.Vertical.MinScaleType = xlSparkScaleGroup
For some reason xlSparkScaleGroup interfered with the update of the sparklines on the page, and when I ran the update-print-update-print cycle the sparklines were missing from the sheet. My solution was to simply remove this code and set the scales manually. Something like this:
.Axes.Vertical.MinScaleType = xlSparkScaleCustom
.Axes.Vertical.CustomMinScaleValue = Application.WorksheetFunction.Min(Range(SLAddress))
.Axes.Vertical.MaxScaleType = xlSparkScaleCustom
.Axes.Vertical.CustomMaxScaleValue = Application.WorksheetFunction.Max(Range(SLAddress))
where SLAddress was the address of the sparkline data I was using. I hope this helps solve your issue and maybe Microsoft will actually fix this issue.
I had to use a temp file to make it work. Basically, I save the current file as a temp file using SaveCopyAs, open the temp file (which lets the sparklines refresh), print, close the temp file, and then start the process over again.
Hope they fix this at some point.
Kris.