How to export data to different Excel sheets in Python pandas?

How can I export data to different Excel sheets in pandas?
import pandas as pd
df = pd.ExcelFile('my_excel.xlsx')
new_sheet1 = calculated_df.to_excel('my_excel.xlsx', 'sheet1')
new_sheet2 = calculated_df2.to_excel('my_excel.xlsx', 'sheet2')
When I use the above code, sheet2 overwrites sheet1. How can I export to different Excel tabs?
Thank you!
Export data to different excel tabs via Pandas

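Each call to to_excel with a file path creates a new workbook, which is why the second call replaced the first. You can use pandas.ExcelWriter to write both frames into the same workbook:
list_df = [calculated_df, calculated_df2]
with pd.ExcelWriter("my_excel.xlsx") as writer:
    # One sheet per DataFrame: sheet1, sheet2, ...
    for idx, df in enumerate(list_df, start=1):
        df.to_excel(writer, sheet_name=f"sheet{idx}", index=False)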
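If the workbook already exists and a sheet should be added to it rather than the whole file rewritten, pd.ExcelWriter can be opened in append mode. This is a minimal sketch, assuming the openpyxl engine is installed. The two small frames and the temporary path are placeholders, not values from the question:

```python
import os
import tempfile

import pandas as pd

# Placeholder frames standing in for calculated_df and calculated_df2.
calculated_df = pd.DataFrame({"a": [1, 2]})
calculated_df2 = pd.DataFrame({"b": [3, 4]})

path = os.path.join(tempfile.mkdtemp(), "my_excel.xlsx")

# First pass: create the workbook with one sheet.
with pd.ExcelWriter(path, engine="openpyxl") as writer:
    calculated_df.to_excel(writer, sheet_name="sheet1", index=False)

# Later: reopen in append mode so sheet1 is kept and sheet2 is added.
with pd.ExcelWriter(path, engine="openpyxl", mode="a") as writer:
    calculated_df2.to_excel(writer, sheet_name="sheet2", index=False)

# sheet_name=None reads every sheet back as {sheet name: DataFrame}.
sheets = pd.read_excel(path, sheet_name=None)
print(list(sheets))
```

Append mode relies on the openpyxl engine; the xlsxwriter engine can only create new files.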

Related

Adding a frame border to a dataframe in Excel

It seems like a simple task, but I've been having a hard time adding a border frame to my Excel-written table (using the xlsxwriter engine). The only way I could do it was to get the size of my df and the starting row/column, then loop over each cell and format it, which is redundant. Is there a solution I'm not seeing? I tried the styleframe module in vain.
Reproducible example:
import pandas as pd
import numpy as np
from styleframe import StyleFrame, Styler, utils
df = pd.DataFrame(np.random.randint(0,100,size=(100, 2)), columns=list('AB'))
df = df.style.set_properties(**{'text-align': 'center'})
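writer = pd.ExcelWriter("Test.xlsx", engine='xlsxwriter')
df.to_excel(writer, sheet_name= 'Random', index=False)
# Handles to the underlying xlsxwriter workbook and worksheet
workbook = writer.book
worksheet = writer.sheets['Random']
format_x = workbook.add_format({'border': 2})
# Sets a default format for the whole of columns A:B, not just the table
worksheet.set_column('A:B',20,format_x)
writer.close()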
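One way to avoid the per-cell loop is to apply the border through a single conditional format that covers the table's range. The sketch below assumes the xlsxwriter engine is installed and writes to a temporary path chosen for illustration. It draws a thin border around every cell in the range; an outline-only frame would need separate formats for the edges, and Excel may restrict which border styles a conditional format can use:

```python
import os
import tempfile

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(0, 100, size=(100, 2)), columns=list("AB"))
path = os.path.join(tempfile.mkdtemp(), "Test.xlsx")

with pd.ExcelWriter(path, engine="xlsxwriter") as writer:
    df.to_excel(writer, sheet_name="Random", index=False)
    workbook = writer.book
    worksheet = writer.sheets["Random"]

    # Zero-based bounds of the written table: header row plus data rows.
    first_row, first_col = 0, 0
    last_row, last_col = len(df), len(df.columns) - 1

    border_fmt = workbook.add_format({"border": 1})
    # 'no_errors' matches every cell that holds no error value, so the
    # format reaches the whole range in one call.
    worksheet.conditional_format(
        first_row, first_col, last_row, last_col,
        {"type": "no_errors", "format": border_fmt},
    )
```

The bounds come from the frame's shape, so the same call works for a table of any size as long as startrow and startcol are left at their defaults.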

How to determine the names of Excel files that raise errors while being read by pandas?

I have many Excel files (about 500), each with several sheets. The code below extracts data from specific rows and columns of the second sheet and collects the results in an Excel file named 'FluidsVolumeReport.xlsx'.
import pandas as pd
import glob
import numpy as np
filenames = glob.glob('*.xlsx')
dlist = []
for file in filenames:
    Dict = {}
    xl = pd.ExcelFile(file)
    Sheet_Names_list = xl.sheet_names
    df = pd.DataFrame()
    df = pd.read_excel(file, sheet_name= Sheet_Names_list[1])
    for row in range(3, 8):
        Dict.update({"Date": df.iat[2, 2]})
        Dict.update({df.iat[9, 25]: df.iat[9, 35]})
        Dict.update({df.iat[10, 25]: df.iat[10, 35]})
        Dict.update({df.iat[row, 40]: df.iat[row, 47]})
    dlist.append(Dict)
dflist = []
for i in range(0, len(dlist)):
    Dict = dlist[i]
    df = pd.DataFrame(data=Dict, index=[0])
    dflist.append(df)
df = pd.concat(dflist, axis=0, sort=False, ignore_index=True)
df.sort_values(by='Date', inplace=True)
df.replace(0.0, np.nan, inplace=True)
df.to_excel('FluidsVolumeReport.xlsx')
The code fails on some Excel files, either because the file cannot be opened or because the sheet does not cover the specified range, and it then stops with an error:
IndexError: index 25 is out of bounds for axis 0 with size 14
My aim is to write code that skips any Excel files that cause errors and reports their names. Please help me complete my code.
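As suggested by @rahlf23, you could do something like this (to keep it simple). The try block goes inside the loop so that one bad file does not stop the remaining files from being processed:
...
for file in filenames:
    try:
        ...
        dlist.append(Dict)
    except IndexError:
        # Report the offending file and move on to the next one
        print(file)
dflist = []
...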
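A slightly fuller sketch of the same idea collects the failing names in a list instead of only printing them, and also catches files that cannot be opened at all. The helper names read_second_sheet, extract_record, and collect are made up for this illustration; the cell positions are the ones from the question:

```python
import pandas as pd


def read_second_sheet(path):
    """Load the second sheet of a workbook (sheet index 1)."""
    return pd.read_excel(path, sheet_name=1)


def extract_record(df):
    """Collect the cells the question reads; raises IndexError if the sheet is too small."""
    record = {"Date": df.iat[2, 2]}
    record[df.iat[9, 25]] = df.iat[9, 35]
    record[df.iat[10, 25]] = df.iat[10, 35]
    for row in range(3, 8):
        record[df.iat[row, 40]] = df.iat[row, 47]
    return record


def collect(filenames, load=read_second_sheet):
    """Return (records, failed); failed holds (filename, error message) pairs."""
    records, failed = [], []
    for name in filenames:
        try:
            records.append(extract_record(load(name)))
        except Exception as exc:  # broad on purpose: unreadable files and bad layouts
            failed.append((name, f"{type(exc).__name__}: {exc}"))
    return records, failed
```

Calling collect(glob.glob('*.xlsx')) would then give the good records, ready for pd.DataFrame(records), and the list of skipped files with the reason each one failed. The load parameter exists only so the logic can be exercised without real workbooks.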

RDKit - Export pandas data frame with mol image

I would like to know whether it is possible to export a pandas dataframe with molecular images directly to an Excel file.
Thanks in advance,
RDKit's PandasTools has the function SaveXlsxFromFrame.
http://www.rdkit.org/Python_Docs/rdkit.Chem.PandasTools-module.html#SaveXlsxFromFrame
XlsxWriter must be installed.
import pandas as pd
from rdkit import Chem
from rdkit.Chem import PandasTools
smiles = ['c1ccccc1', 'c1ccccc1O', 'c1cc(O)ccc1O']
df = pd.DataFrame({'ID':['Benzene', 'Phenol', 'Hydroquinone'], 'SMILES':smiles})
df['Mol Image'] = [Chem.MolFromSmiles(s) for s in df['SMILES']]
PandasTools.SaveXlsxFromFrame(df, 'test.xlsx', molCol='Mol Image')

How to get column header in excel generated via python ExcelWriter

I am fetching data from a Django database via a raw query and exporting it to Excel. The Excel file is generated, but the column header is missing.
Please suggest a way to get the header.
import pandas as pd
from pandas import ExcelWriter
df1 = pd.DataFrame(row1)
try:
    from StringIO import StringIO
except ImportError:
    from io import StringIO
import xlwt
wb = xlwt.Workbook()
writer = ExcelWriter("XYZ.xlsx",options={'remove_timezone': True})
xl_out = StringIO()
writer.path = xl_out
ws1 = wb.add_sheet("abc")
for col_num, value in enumerate(df1.columns.values):
    ws1.write(1,col_num + 1, 'value')
df1.to_excel(writer,"abc", index= True, header=True)
writer.save()
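One possible cause, which the question does not confirm, is that the rows returned by a raw query are plain tuples, so the DataFrame has no real column names to write. A DB-API cursor exposes the names through cursor.description, and they can be passed to the DataFrame constructor. The sketch below uses an in-memory sqlite3 database as a stand-in for the Django connection, with a made-up table and columns, and assumes the openpyxl engine is installed:

```python
import io
import sqlite3

import pandas as pd

# sqlite3 stands in for the Django connection; both expose DB-API cursors.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE abc (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO abc VALUES (?, ?)", [(1, "x"), (2, "y")])

cursor = conn.execute("SELECT id, name FROM abc")
rows = cursor.fetchall()
# cursor.description holds one tuple per column; the name is its first item.
columns = [col[0] for col in cursor.description]

df1 = pd.DataFrame(rows, columns=columns)

# Excel files are binary, so the in-memory buffer is BytesIO, not StringIO.
xl_out = io.BytesIO()
with pd.ExcelWriter(xl_out, engine="openpyxl") as writer:
    df1.to_excel(writer, sheet_name="abc", index=False, header=True)
xl_out.seek(0)
```

With the writer handling the header row, the separate xlwt workbook and the manual ws1.write loop are no longer needed.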

How to specify column type (I need string) using pandas.to_csv method in Python?

import pandas as pd
data = {'x':['011','012','013'],'y':['022','033','041']}
Df = pd.DataFrame(data = data, dtype = str)
Df.to_csv("path/to/save.csv")
The result I've obtained looks like this:
To achieve this, it is easier to export directly to an xlsx file, even without setting the dtype of the DataFrame.
import pandas as pd
data = {'x':['011','012','013'],'y':['022','033','041']}
Df = pd.DataFrame(data = data)
# The context manager saves and closes the file on exit
with pd.ExcelWriter('path/to/save.xlsx') as writer:
    Df.to_excel(writer, sheet_name="Sheet1")
I've also tried some other methods, like prepending an apostrophe or quoting all fields with ", but they had no effect.
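The screenshot from the question is not available, so the following assumes the problem is that leading zeros disappear once the CSV is opened or read back. A CSV file carries no column types, so to_csv cannot mark a column as text; the zeros are still in the file, and it is the reader that decides how to interpret them. A small sketch using an in-memory buffer instead of a file on disk:

```python
import io

import pandas as pd

data = {"x": ["011", "012", "013"], "y": ["022", "033", "041"]}
df = pd.DataFrame(data, dtype=str)

# The CSV text itself keeps the zeros.
csv_text = df.to_csv(index=False)
print(csv_text)

# Default parsing infers integers and drops the leading zeros ...
as_numbers = pd.read_csv(io.StringIO(csv_text))
# ... while dtype=str keeps every field as text.
as_text = pd.read_csv(io.StringIO(csv_text), dtype=str)
```

When the file has to be opened in Excel itself, writing xlsx as described above sidesteps the issue, because the cells are stored as text.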