Proper usage of load_workbook in openpyxl when saving data to worksheet in existing file - pandas

I have a dataframe called results and an Excel file named vlpandas.xlsx. I set a default path for the working directory as follows:
excel_dir = 'Users/Documents/Pythonfiles'
Based on the example from this post, How to save a new sheet in an existing excel file, using Pandas?, I did the following:
book = load_workbook(excel_dir)
But I got an error after running the command above. To fix it, I followed the usage documented by openpyxl, and the following works:
book = load_workbook(filename = 'vlpandas.xlsx')
But then I get an exception when I run the command below, something about workbook.py in the openpyxl directory.
writer = pd.ExcelWriter(excel_dir, engine='openpyxl')
and then I want to complete my task, saving the data to a new worksheet in the existing file vlpandas.xlsx, with the following lines of code:
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
## Your dataframe to append.
results.to_excel(writer, '3rd_sheet')
writer.save()
So my two questions are:
1) What is the proper usage of load_workbook from openpyxl?
2) Why am I getting an error with the command line below? I also changed my working folder path using the command os.chdir:
writer = pd.ExcelWriter(excel_dir, engine='openpyxl')
Regards,
Gus
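For what it's worth, one likely cause of both errors is that excel_dir names a folder, while load_workbook and pd.ExcelWriter each expect the path of the workbook file itself. Below is a minimal sketch of building that path, using the directory and file name from the question; the helper name workbook_path is hypothetical and only there for illustration.

```python
import os


def workbook_path(excel_dir, filename):
    """Join a directory and a file name into the full path of the workbook.

    Hypothetical helper for illustration; it only builds the path string.
    """
    return os.path.join(excel_dir, filename)


# Directory and file name as given in the question.
excel_dir = 'Users/Documents/Pythonfiles'
excel_file = workbook_path(excel_dir, 'vlpandas.xlsx')
print(excel_file)
```

The resulting excel_file, rather than excel_dir, is what would be passed to load_workbook(excel_file) and pd.ExcelWriter(excel_file, engine='openpyxl'). Recent pandas versions also accept mode='a' on ExcelWriter to add a sheet to an existing file.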

Related

Pandas/Openpyxl - Save Current Date into xlsx Filename

Trying to save an xlsx file and include the current date in the file name during the process. Currently, I'm using the below code but I receive the error invalid format string - uncertain what format I can use to accomplish this.
I saw this method recommended in another thread but it doesn't work for me. I've tried several other solutions as well but nothing seems to work. Any guidance would be appreciated.
from openpyxl import load_workbook
from datetime import datetime, date
import os
from glob import glob
import pandas as pd
file = glob('C:\\Users\\all*.xlsx')[0]
wb1 = load_workbook(file)
ws1 = wb1.worksheets[0]
for row in ws1['A2':'D5']:
    for cell in row:
        cell.value = None
wb1.save('file1'+now.strftime("%Y%m%d%")+'.xlsx')
The error is in the save command, where you have an extra % at the end of the format string. Also, now on its own is not sufficient: it needs the call parentheses, and because the code imports datetime from the datetime module, it has to be written as datetime.now(). So, change the last line from
wb1.save('file1'+now.strftime("%Y%m%d%")+'.xlsx')
to
wb1.save('file1'+datetime.now().strftime("%Y%m%d")+'.xlsx')
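The corrected format string can also be wrapped in a small function, which makes it easy to check the result before saving. This is only a sketch: the name dated_filename and its parameters are made up for illustration, and the date in the example call is arbitrary.

```python
from datetime import datetime


def dated_filename(prefix, when=None, ext='.xlsx'):
    """Return prefix + YYYYMMDD + ext, using the current date by default."""
    when = when or datetime.now()
    # "%Y%m%d" has no trailing "%"; a lone "%" is what made the format invalid.
    return prefix + when.strftime('%Y%m%d') + ext


print(dated_filename('file1', datetime(2024, 1, 5)))  # file120240105.xlsx
```

With this in place, the save call would read wb1.save(dated_filename('file1')).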

How do I fix "Exception caught in workbook destructor. Explicit close() may be required for workbook."

I want to export values from a dataframe in python to excel.
If I don't add 'writer.save()' at the end, I get the following exception: Exception caught in workbook destructor. Explicit close() may be required for workbook.
I have not created a separate workbook.
On adding 'writer.save()', I get PermissionError: [Errno 13] Permission denied: 'F:........\docs.xlsx'
I tried executing my program with and without writer.save(); either way I get the exception or the PermissionError.
# file_compare is a function comparing 2 excel files
# newpath is the path for storing newly created excel file
new_df= file_compare(df1, df2)
_logger.info('Creating new excel file')
writer = pandas.ExcelWriter(newpath, engine='xlsxwriter')
new_df.to_excel(writer, index=False, header=True)
writer.save()
_logger.info('File successfully created.')
I want the file to be saved at the new file location.

Python write data to Excel

I want to write some data from a CSV to an Excel sheet.
Here is my Code:
import xlrd
import xlsxwriter
import requests
# download
link="...."
s=requests.get(link).content
c=pd.read_csv(io.StringIO(s.decode('utf-8')))
book = xlrd.open_workbook("***.xlsx","w")
worksheet = book.sheet_by_index(10)
worksheet.write('B2',5)
workbook.close()
Actually, I want to use the data from the requested link, but the code already stops at a simple write. It works if I add a new sheet, but I can't write to an existing sheet. Does anyone have an idea?

Is there a way to save all hyperlinked documents in excel?

I know this question has been answered before, but the problem I'm facing is a bit different, and this is why I'm asking for your help :)
So, I'm working with multiple Excel files that contain multiple hyperlinks that lead to documents such as Excel files, PDFs, DOCs and sometimes even images. The problem with these hyperlinks is that they do not lead to a "normal" website, but to a special internal software, which in turn links to a local address on my computer that contains the desired file. That means that there is no direct link that could be grabbed with a simple VBA code.
Let's have an example, assuming the internal software name is "John":
I see in the Excel documents this link: John://3434545345/345345345
When I click on it, it opens the file, which is located, for example, in: C:/local/Cutekitten.pdf
After this long intro, my question is: Is there a way to automate the process of saving each document, instead of manually opening it and saving it? Could it be solved with VBA code? Or does it require a different approach? I was actually thinking of bypassing this problem by finding a way to open all hyperlinks at once with VBA, and then maybe finding some code (not VBA?) that saves all open documents.
P.S Please keep in mind that I can't download EXE files or any other "suspicious" files due to workplace restrictions.
Any help will be much appreciated,
Thanks! :)
You may try this
'''
Input: test.xlsx
name link
1 location/file1.jpg
2 location/file2.xlsx
3 location/file3.pdf
4 location/file4.mp4
'''
import os
from urllib.request import urlretrieve
from openpyxl import load_workbook

wb = load_workbook('test.xlsx')
print(wb.sheetnames)
# ['Sheet 1', 'Sheet 2', 'Sheet 3']
ws1 = wb['Sheet 1']
## alternate (older openpyxl, now deprecated) -> ws1 = wb.get_sheet_by_name('Sheet 1')
## Result directory
directory = 'result'
if not os.path.exists(directory):
    os.makedirs(directory)
for row in ws1:
    name, link = row[0].value, row[1].value
    # Skip executables; str() also copes with empty cells (None)
    if '.exe' in str(link):
        continue
    print(name, '\t', link)
    # In Python 3, urlretrieve needs a full URL including the scheme
    if link and '://' in str(link):
        urlretrieve(link, os.path.join(directory, os.path.basename(link)))
'''
Output:
Table 1 None
None None
name link
1 location/file1.jpg
2 location/file2.xlsx
3 location/file3.pdf
4 location/file4.mp4
'''
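The filtering and naming step of the loop above can be tried out on its own, without a workbook or any download. The sketch below assumes the link cells hold either strings or None; the function name plan_downloads is hypothetical, and it checks the file extension with endswith rather than the substring test used above.

```python
import os


def plan_downloads(links, directory='result'):
    """Pair each usable link with a destination path inside `directory`."""
    plan = []
    for link in links:
        # Skip empty cells and executables.
        if not link or link.lower().endswith('.exe'):
            continue
        # Keep only the file name so no nested folders are needed in `directory`.
        plan.append((link, os.path.join(directory, os.path.basename(link))))
    return plan


links = [None, 'location/file1.jpg', 'location/setup.exe', 'location/file3.pdf']
for source, target in plan_downloads(links):
    print(source, '->', target)
```

Each (source, target) pair could then be handed to whatever does the actual transfer, for example a download call or a file copy.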
Sub PathsText()
    Dim List(), Path As String
    Dim i, x As Integer
    Dim s As InlineShape
    Dim fso As FileSystemObject, ts As TextStream
    Set fso = New FileSystemObject
    Set ts = fso.OpenTextFile("C:\MyFolder\List.txt", 8, True)
    With ts
        .WriteLine (ActiveDocument.InlineShapes.Count)
    End With
    For Each s In ActiveDocument.InlineShapes
        Path = s.LinkFormat.SourcePath & "\" _
            & s.LinkFormat.SourceName
        With ts
            .WriteLine (Path)
        End With
    Next s
End Sub
I guess the important part is whether "Path = s.LinkFormat.SourcePath" works.
If so, and if you have a ton of Excel files to do it for, I'd put those (the ones with the links) in one folder. I'd make a new empty workbook with one button on a single sheet to call the code. The button would run "For Each workbook in directory, For Each sheet in workbook, For Each link on sheet" and add each source path to a text file.
Once you had a list of paths the next step would be obvious. I wish I had demo code, but I migrated to C# years ago. My monstrously slow cell-by-cell search, modify and graph inventions threatened to bury me.
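As a rough idea of what that next step might look like in Python: the sketch below reads a text file with one path per line, such as the List.txt written by the macro, and copies every file it can find into a target folder. The function name copy_listed_files is made up for illustration, and it assumes the listed paths are reachable from the machine running it.

```python
import os
import shutil


def copy_listed_files(list_file, dest_dir):
    """Copy every existing file named in list_file into dest_dir."""
    os.makedirs(dest_dir, exist_ok=True)
    copied = []
    with open(list_file) as handle:
        for line in handle:
            path = line.strip()
            # Skip blank lines and anything that is not an existing file,
            # such as the count line the macro writes first.
            if not path or not os.path.isfile(path):
                continue
            # copy2 returns the path of the newly created copy.
            copied.append(shutil.copy2(path, dest_dir))
    return copied
```

A call such as copy_listed_files(r'C:\MyFolder\List.txt', 'result') would return the list of copies it made, which makes it easy to see which entries were skipped.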

workbookfactory for getting a file is throwing error

I am a Selenium beginner.
FileInputStream fis= new FileInputStream("C:\Documents\Landing_page.xls");
Workbook wb = new WorkbookFactory(fis);
I am getting a compilation error on the above 2 lines even after importing all the necessary jar files.
If you look at the JavaDocs for WorkbookFactory, you'll see it isn't a class you instantiate.
Instead, what you want to do is, as taken from the POI docs:
Workbook wb = WorkbookFactory.create(new File("MyExcel.xls"));
You need to call the create(File) method, and you'll want to use the File directly rather than an InputStream for lower memory use.