How do I import an Excel workbook into a Jupyter notebook? I am using TensorFlow.
xl.file =pd.excelfile('c:\users\owner\downloads\book1.xlsx')
book1 = pd.excelfile('book1.xlsx')
It looks like you are confusing the filename with the pandas method to read a file.
import pandas as pd
filename = r'c:\users\owner\downloads\book1.xlsx'  # raw string, so \u is not parsed as an escape
dataframe = pd.read_excel(filename)
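A note on the path itself: in an ordinary Python 3 string literal, `\u` starts a Unicode escape, so a Windows path containing `\users` will not parse unless the backslashes are protected. A minimal sketch of three common ways to write the same path (the path is the one from the question; nothing here touches the file system):

```python
from pathlib import PureWindowsPath

# 1. Raw string: backslashes are kept literally
raw = r'c:\users\owner\downloads\book1.xlsx'

# 2. Escaped backslashes in a normal string
escaped = 'c:\\users\\owner\\downloads\\book1.xlsx'

# 3. Forward slashes; PureWindowsPath renders them with backslashes
forward = str(PureWindowsPath('c:/users/owner/downloads/book1.xlsx'))

print(raw == escaped == forward)  # True
```

Any of the three can be passed to pd.read_excel.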
I am using the following code:
import os
import numpy as np
import pandas as pd
from openpyxl import load_workbook
def dump2ExcelTest(df, fname, sheetNameIn='Sheet1'):
    if os.path.exists(fname):
        writer = pd.ExcelWriter(fname, engine='openpyxl', mode='a')
        book = load_workbook(fname)
        writer.book = book
    else:
        writer = pd.ExcelWriter(fname, engine='openpyxl', mode='w')
    df.to_excel(writer, sheet_name = sheetNameIn)
    writer.save()
    writer.close()
x1 = np.random.randn(100, 2)
df1 = pd.DataFrame(x1)
dump2ExcelTest(df1, r'Y:\summary\test3.xlsx')
On trying to open test3.xlsx I get the following warning window:
However, if I just do df1.to_excel(r'Y:\summary\test3.xlsx') then test3.xlsx opens fine.
I am not sure what to do about this as there is nothing in the log file.
I believe the problem is the way ExcelWriter opens the file and tracks the existing workbook contents. I'm not sure exactly what is going on under the hood, but you have to do both of the following:
specify the proper startrow for the append
copy the existing sheet information to the writer
I've used a context manager for slightly cleaner syntax.
This is your example, but writing and appending as you intended.
import os
import numpy as np
import pandas as pd
from openpyxl import load_workbook
def dump2ExcelTest(df, fname, sheetNameIn='Sheet1'):
    if not os.path.exists(fname):
        # first call: create the workbook and stop, so the data is not written twice
        df.to_excel(fname, sheet_name=sheetNameIn, engine='openpyxl')
        return
    with pd.ExcelWriter(fname, engine='openpyxl', mode='a') as writer:
        writer.book = load_workbook(fname)
        if sheetNameIn not in writer.book.sheetnames:
            raise ValueError(f"sheet {sheetNameIn} not in workbook")
        # grab the proper start row and copy existing sheets to new writer
        start_row = writer.book[sheetNameIn].max_row
        writer.sheets = {ws.title: ws for ws in writer.book.worksheets}
        df.to_excel(writer, sheet_name=sheetNameIn, startrow=start_row, header=False)
x1 = np.random.randn(100, 2)
df1 = pd.DataFrame(x1)
dump2ExcelTest(df1, "test3.xlsx")
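As far as I know, newer pandas releases no longer allow assigning to writer.book or writer.sheets, so the function above may fail there. Below is a sketch of the same idea for pandas 1.4 or later with openpyxl installed. The name append_df_to_sheet is hypothetical; if_sheet_exists='overlay' is the ExcelWriter option that lets to_excel write into a sheet that already exists.

```python
import os

import pandas as pd


def append_df_to_sheet(df, fname, sheet_name="Sheet1"):
    """Create fname on first use, otherwise append df below the existing rows."""
    if not os.path.exists(fname):
        df.to_excel(fname, sheet_name=sheet_name, engine="openpyxl")
        return
    with pd.ExcelWriter(fname, engine="openpyxl", mode="a",
                        if_sheet_exists="overlay") as writer:
        if sheet_name in writer.book.sheetnames:
            # max_row is 1-based, startrow is 0-based, so this is the first free row
            start_row = writer.book[sheet_name].max_row
            df.to_excel(writer, sheet_name=sheet_name,
                        startrow=start_row, header=False)
        else:
            # new sheet in an existing workbook: write it with its header
            df.to_excel(writer, sheet_name=sheet_name)
```

Calling append_df_to_sheet(df1, "test3.xlsx") twice should leave one header row followed by both blocks of data in Sheet1.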
I have a master Excel sheet where the data looks like this: https://i.stack.imgur.com/IS4cw.png
I have a script that imports the CSV files, combines them, and saves the result to the master Excel sheet.
import pandas as pd
from openpyxl import load_workbook
import tkinter as tk
from tkinter import filedialog
root = tk.Tk()
root.withdraw()
root.call('wm', 'attributes', '.', '-topmost', True)
files = filedialog.askopenfilename(multiple=True)
%gui tk
var = root.tk.splitlist(files)
filePaths = []
for f in var:
    df = pd.read_csv(f,skiprows=8, index_col=None, header='infer',parse_dates=True, squeeze=True, encoding='ISO-8859-1',names=['Date', 'Time', 'Temperature', 'Humidty'])
    filePaths.append(df)
df = pd.concat(filePaths, axis=0, join='outer', ignore_index=True, sort=True)
book = load_workbook(r'C:\Users\Administrator\Documents\Hebin\Scripts\Temperature Distribution chart/july/12.xlsx')
writer = pd.ExcelWriter(r'C:\Users\Administrator\Documents\Hebin\Scripts\Temperature Distribution chart/july/12.xlsx', engine='openpyxl')
writer.book = book
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
df.to_excel(writer, "Sheet1", columns=['Date', 'Time','Temperature', 'Humidty'],index=False)
writer.save()
The problem is that the newly imported data is written starting at row 1 instead of after the last row of the previously saved data. How can I append the data in order every time without entering the row number by hand?
The ExcelWriter can have its mode set to either write ('w') or append ('a'). The default is write.
writer = pd.ExcelWriter(r'C:\Users\Administrator\Documents\Hebin\Scripts\Temperature Distribution chart/july/12.xlsx', engine='openpyxl', mode='a')
I am able to create an Excel file that has the data from a data frame in one sheet and a button that runs a macro in a second sheet.
What I need is to have both the data from the data frame and the button in the same sheet.
This is the code I found and have tried to modify:
import pandas as pd
import xlsxwriter
df = pd.DataFrame({'Data': [10, 20, 30, 40]})
writer = pd.ExcelWriter('hellot.xlsx', engine='xlsxwriter')
worksheet = workbook.add_worksheet()
#df.to_excel(writer, sheet_name='Sheet1')
workbook = writer.book
workbook.filename = 'test.xlsm'
worksheet = workbook.add_worksheet()
workbook.add_vba_project('./vbaProject.bin')
worksheet.write('A3', 'Press the button to say hello.')
#Add a button tied to a macro in the VBA project.
worksheet.insert_button('A1', {'macro': 'start',
                               'caption': 'Press Me',
                               'width': 80,
                               'height': 30})
df.to_excel(writer, sheet_name ='Sheet2')
writer.save()
workbook.close()
I know that you only asked how to insert the button in the same sheet, but I decided to check how macros work with xlsxwriter, so I wrote a complete walkthrough on how to add a macro.
1) First, we need to manually create a file that contains the macro, so that we can extract it as a bin file and inject it later using xlsxwriter. Create a new Excel file, go to the Developer tab, open Visual Basic, choose Insert Module, and write the following code:
Sub TestMsgBox()
    MsgBox "Hello World!"
End Sub
Save the file with xlsm extension to contain the macro, e.g. as Book1.xlsm.
2) Now we need to extract the bin file. Open a command prompt and change to the directory where you saved Book1.xlsm. Then, in File Explorer, go to the folder where Python is installed (or to the virtual environment folder) and search for vba_extract.py. Copy this script into the same folder as Book1.xlsm. Then type in the command prompt:
python vba_extract.py Book1.xlsm
This way you will extract the macro and create the vbaProject.bin file in the same folder.
3) Now it's time to create the final file. Delete Book1.xlsm and vba_extract.py, as they're not needed anymore, and run the following code:
import pandas as pd
# Create a test dataframe
df = pd.DataFrame({'Data': [10, 20, 30, 40]})
# Import it through the xlsxwriter
writer = pd.ExcelWriter('hello_world.xlsx', engine='xlsxwriter')
df.to_excel(writer, sheet_name='Sheet1', index=False)
# Create the workbook and the worksheet
workbook = writer.book
workbook.filename = 'hello_world.xlsm' # rename the workbook to xlsm extension
worksheet = writer.sheets['Sheet1']
# Inject the bin file we extracted earlier
workbook.add_vba_project('./vbaProject.bin')
# Insert a description
worksheet.write('B1', 'Press the button to say hello.')
#Add a button tied to a macro in the VBA project.
worksheet.insert_button('B2', {'macro': 'TestMsgBox',
                               'caption': 'Press Me',
                               'width': 80, 'height': 30})
# Finally write the file
writer.close()  # writer.save() on older pandas; it was removed in pandas 2.0
Now the button is in the same sheet as your data, and it works.
I am completely new to Python. I am trying to write code that loads a data frame from an xlsx file chosen in a file browser and then shows a drop-down list of all the sheets in that file.
I have found two pieces of code: one for browsing to and reading an Excel file, and a second for a drop-down list of all the sheets in a selected xlsx file. What I need to do is combine the two. First I would like to select an xlsx file, and then I would like to select which sheet to read (from the drop-down list).
Function to browse and read excel file
import tkinter as tk
from tkinter import filedialog
import pandas as pd
root= tk.Tk()
canvas1 = tk.Canvas(root, width = 300, height = 300, bg = 'white')
canvas1.pack()
def getExcel ():
    global df
    import_file_path = filedialog.askopenfilename()
    df = pd.read_excel(import_file_path, sheet_name='Loan Tape')
    df.keys()
    df
    root.destroy()
browseButton_Excel = tk.Button(text='Import Excel File', command=getExcel, bg='yellow', fg='black', font=('arial', 12, 'bold'))
canvas1.create_window(150, 150, window=browseButton_Excel)
root.mainloop()
Function to create a drop down list of all sheets in the xlsx file
import pandas as pd  # needed for pd.ExcelFile below
import tkinter as tk
from tkinter import *
root = Tk()
root.title("Select a sheet")
mainframe = Frame(root)
mainframe.grid(column=0,row=0, sticky=(N,W,E,S) )
mainframe.columnconfigure(0, weight = 1)
mainframe.rowconfigure(0, weight = 1)
mainframe.pack(pady = 100, padx = 100)
tkvar = StringVar(root)
xl = pd.ExcelFile(r'Full file path.xlsx')
choices=xl.sheet_names
tkvar.set('Nothing selected') # set the default option
popupMenu = OptionMenu(mainframe, tkvar, *choices)
Label(mainframe, text="Select a sheet").grid(row = 1, column = 1)
popupMenu.grid(row = 2, column =1)
def change_dropdown(*args):
    print( tkvar.get() )
    root.destroy()
tkvar.trace('w', change_dropdown)
root.mainloop()
What I need is to combine these two pieces of code: first select an xlsx file, then select which sheet to read from the drop-down list.
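One possible way to combine the two snippets, offered as a sketch rather than a tested GUI design: read the sheet names from the chosen file first, build the drop-down from them, and only then call read_excel. The names list_sheets and choose_and_read are hypothetical, and tkinter is imported inside the GUI function so that the helper can be used without a display.

```python
import pandas as pd


def list_sheets(path):
    """Return the sheet names of the workbook at path."""
    with pd.ExcelFile(path) as xl:
        return xl.sheet_names


def choose_and_read():
    """Ask for a workbook, then for one of its sheets; return a DataFrame or None."""
    import tkinter as tk
    from tkinter import filedialog

    root = tk.Tk()
    root.title("Select a sheet")
    path = filedialog.askopenfilename(filetypes=[("Excel files", "*.xlsx")])
    if not path:  # the file dialog was cancelled
        root.destroy()
        return None

    choice = tk.StringVar(root, value="Nothing selected")
    tk.Label(root, text="Select a sheet").pack(padx=100, pady=(100, 5))
    tk.OptionMenu(root, choice, *list_sheets(path)).pack(padx=100, pady=(0, 100))

    selected = {}

    def on_change(*args):
        # Store the value before destroying the window; the variable
        # cannot be read once the Tk interpreter is gone.
        selected["sheet"] = choice.get()
        root.destroy()

    choice.trace_add("write", on_change)
    root.mainloop()

    if "sheet" not in selected:  # window closed without a selection
        return None
    return pd.read_excel(path, sheet_name=selected["sheet"])
```

In a notebook or script, df = choose_and_read() would then hold the selected sheet, or None if either step was cancelled.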
I can read the text in cells, but I can't read the text in text boxes.
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import re,os,sys,time
import openpyxl
from openpyxl import load_workbook
from openpyxl import Workbook
from openpyxl.drawing import *
reload(sys)
sys.setdefaultencoding('utf8')
wb = load_workbook(u'2.xlsx')
sheetnames = wb.get_sheet_names()
for i in range(0,len(sheetnames)):
    sheet = wb.get_sheet_by_name(sheetnames[i])
    for row in sheet.rows:
        for cell in row:
            if cell.value:
                print cell.value
I tried unzipping the xlsx file and found the text box content in the xl\drawings\drawing[0-9].xml files.
Can openpyxl.drawing.text read the text box? I have no idea.
How can I do this? Thanks.
In the end I had to unzip the xlsx file:
import zipfile
zipFile = zipfile.ZipFile(os.path.join(os.getcwd(), u''+str(flist)+''))
for file in zipFile.namelist():
    zipFile.extract(file, r'tmp')
zipFile.close()
num = 0
if os.path.exists(r'tmp/xl/drawings'):
    xmldir = os.listdir(r'tmp/xl/drawings')
    for xmlfile in xmldir:
        xml = os.path.basename(xmlfile)
        if os.path.splitext(xml)[1] == '.xml':
            a = open(u'tmp/xl/drawings/'+str(xml)+'').read()
            b = a.replace('\n','').replace(' ','')
            # each <a:p> element is one paragraph of text box content
            c = re.findall(r'<a:p>(.*?)</a:p>',b)
            for i in c:
                text = "".join(re.findall(r'(?<=<a:t>).*?(?=</a:t>)',u''+str(i)+'',re.S)).replace(' ','').replace(' ','').replace('\\u6d3b\\u52a8','').replace('&lt;','<').replace('&gt;','>').replace('&amp;','&')
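The same extraction can probably be done without unpacking to a tmp folder or matching tags with regular expressions. The sketch below reads the drawing parts straight from the archive and lets an XML parser find the `<a:p>` and `<a:t>` elements, which also takes care of decoding entities such as `&amp;`. It uses only the standard library and targets Python 3; textbox_paragraphs is a hypothetical name, and I am assuming the text boxes live in xl/drawings/drawingN.xml as described above.

```python
import re
import zipfile
import xml.etree.ElementTree as ET

# DrawingML main namespace, normally bound to the "a:" prefix in drawing parts
A_NS = "http://schemas.openxmlformats.org/drawingml/2006/main"


def textbox_paragraphs(xlsx_path):
    """Return the text of every non-empty <a:p> paragraph in the drawings."""
    paragraphs = []
    with zipfile.ZipFile(xlsx_path) as zf:
        for name in zf.namelist():
            if not re.fullmatch(r"xl/drawings/drawing\d+\.xml", name):
                continue
            root = ET.fromstring(zf.read(name))
            for p in root.iter("{%s}p" % A_NS):
                # a paragraph may be split into several <a:t> runs
                text = "".join(t.text or "" for t in p.iter("{%s}t" % A_NS))
                if text:
                    paragraphs.append(text)
    return paragraphs
```

For the file in the question, textbox_paragraphs('2.xlsx') would return one string per text box paragraph. Unlike the code above, this keeps spaces inside the text.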