ExcelWriter format number to currency by column name

I'm converting the float columns to currency data type with the following:
df = pd.DataFrame({'col0': [71513.0, 200000.0, None],
                   'col1': [True, False, False],
                   'col2': [100.0, 200.0, 0.0]})
df[['col0', 'col2']] = df[['col0', 'col2']].astype(float).astype("Int32").applymap(\
    lambda x: "${:,.0f}".format(x) if isinstance(x, int) else x)
I am outputting the table with the following:
writer = pd.ExcelWriter('output.xlsx', engine='xlsxwriter')
df.to_excel(writer, index= False)
workbook = writer.book
ws = writer.sheets['Sheet1']
writer.close()  # close() also saves the file, so no separate save() call is needed
However, when I output the table this way, the currency is stored as text.
How would I format the Excel sheet itself (instead of the pandas column) based on the column name, so that the value stays a number but is displayed as currency?

This is how it worked for me.
First, remove the column formatting within df so the values stay numeric:
df = pd.DataFrame({'col0': [71513.0, 200000.0, None],
                   'col1': [True, False, False],
                   'col2': [100.0, 200.0, 0.0]})
Then drop the index parameter from to_excel (so the index is written to column 0), define a number format, and assign it to columns 1 and 3:
writer = pd.ExcelWriter('output.xlsx', engine='xlsxwriter')
df.to_excel(writer) # index= False)
workbook = writer.book
ws = writer.sheets['Sheet1']
format1 = workbook.add_format({'num_format': '$#,##0.00'})
ws.set_column(1, 1, 18, format1)  # col0 (column 0 holds the index)
ws.set_column(3, 3, 18, format1)  # col2
writer.close()  # close() saves the file; writer.save() was removed in pandas 2.0
Reference to the documentation: https://xlsxwriter.readthedocs.io/example_pandas_column_formats.html
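To target columns by name rather than by hard-coded position, one option is to look up each name's position with df.columns.get_loc and shift it by the number of index columns written to the sheet. The sketch below assumes the xlsxwriter engine is installed; the helper names currency_column_positions and write_currency_sheet are made up for illustration and are not part of pandas or XlsxWriter.

```python
import pandas as pd


def currency_column_positions(df, names, index_written=True):
    """Return zero-based worksheet column positions for the named DataFrame columns.

    Hypothetical helper: when the index is written to the sheet it occupies the
    first column(s), so every data column shifts right by that many positions.
    """
    offset = df.index.nlevels if index_written else 0
    return [df.columns.get_loc(name) + offset for name in names]


def write_currency_sheet(df, path, names, index=False):
    """Write df to an Excel file and apply a currency format to the named columns.

    Hypothetical helper; assumes the xlsxwriter package is installed.
    """
    with pd.ExcelWriter(path, engine="xlsxwriter") as writer:
        df.to_excel(writer, sheet_name="Sheet1", index=index)
        workbook = writer.book
        worksheet = writer.sheets["Sheet1"]
        money = workbook.add_format({"num_format": "$#,##0.00"})
        for pos in currency_column_positions(df, names, index_written=index):
            # The cells stay numeric; only the display format changes.
            worksheet.set_column(pos, pos, 18, money)


df = pd.DataFrame({"col0": [71513.0, 200000.0, None],
                   "col1": [True, False, False],
                   "col2": [100.0, 200.0, 0.0]})

print(currency_column_positions(df, ["col0", "col2"], index_written=False))  # [0, 2]

# To produce the file (requires xlsxwriter):
# write_currency_sheet(df, "output.xlsx", ["col0", "col2"])
```

With the index written, the same lookup yields positions 1 and 3, which matches the hard-coded values used in the answer.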

Related

Creating new dataframe by search result in df

I am reading search values from a txt file and using each value to find matching rows in a dataframe.
for lines in lines_list:
    sn = lines
    if sn in df[df['SERIAL'].str.contains(sn)]:
        condition = df[df['SERIAL'].str.contains(sn)]
        df_new = pd.DataFrame(condition)
        df_new.to_csv('try.csv',mode='a', sep=',', index=False)
When I check the try.csv file, it has many more lines than the txt file.
The df itself has a lot of lines, more than the txt file.
I want to save the whole row of each search result into a dataframe or file.
I tried to append the search results to a new dataframe or csv.

First, create the list of lines:
with open("text.txt", "r") as f:
    serials = [line.strip() for line in f]
Then write a function for apply that compares each value against the list and filters out the rest:
import numpy as np

def apply_func(x):
    # keep the value if it is one of the search terms, otherwise mark it as missing
    if str(x) in serials:
        return x
    return np.nan
And get the output:
df["SERIAL"] = df["SERIAL"].apply(apply_func)
df.dropna(subset=["SERIAL"], inplace=True)  # drop only the rows whose serial did not match
df.to_csv("new_df.csv", mode="a", index=False)
Or try the filter method:
with open("text.txt", "r") as f:
    serials = [line.strip() for line in f]
df = df.set_index("SERIAL").filter(items=serials, axis=0).reset_index()
df.to_csv("new_df.csv", mode="a", index=False)
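Another way to express the same exact-match filter is a boolean mask built with Series.isin, which keeps whole rows and needs no loop. This is a minimal sketch with made-up data: the SERIAL and MODEL values and the search list are hypothetical stand-ins for the real dataframe and the stripped lines of the text file.

```python
import pandas as pd

# Hypothetical data standing in for the real dataframe.
df = pd.DataFrame({
    "SERIAL": ["A100", "B200", "C300", "D400"],
    "MODEL": ["x", "y", "z", "w"],
})

# Stands in for the stripped lines read from the text file.
serials = ["B200", "D400", "Z999"]

# Keep every row whose SERIAL is exactly one of the search values.
matches = df[df["SERIAL"].isin(serials)]

# With no path, to_csv returns the CSV text; pass a filename to write a file once.
csv_text = matches.to_csv(index=False)
print(csv_text)
```

Note that isin matches whole values, whereas str.contains matches substrings, so a single search value can select several rows with str.contains. Writing once after filtering also avoids the row build-up that mode='a' causes across repeated runs.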

Saving multiple dataframe in sheets and books based on the condition

I am able to save multiple dataframes in multiple Excel sheets.
writer = pd.ExcelWriter('Cloud.xlsx', engine='xlsxwriter')
frames = {'Image': df1, 'Objects': df2, 'Text': df3 , 'Labels': df4}
for sheet, frame in frames.items():
    frame.to_excel(writer, sheet_name = sheet)
writer.close()
Now I want to create multiple files based on a column of the dataframes. For example, I want to create 4 Excel files:
df1
Category  URL           Obj
A         example.com   Chair
A         example2.com  table
B         example3.com  glass
B         example4.com  tv
All my dataframes have 7 categories, and I want to create 7 files based on the category.

I think you need:
import os

frames = {'Image': df1, 'Objects': df2, 'Text': df3 , 'Labels': df4}
for sheet, frame in frames.items():
    for cat, g in frame.groupby('Category'):
        if not os.path.isfile(f'{cat}.xlsx'):
            # file does not exist yet: create it
            writer = pd.ExcelWriter(f'{cat}.xlsx')
        else:
            # file exists: append a new sheet (append mode needs openpyxl)
            writer = pd.ExcelWriter(f'{cat}.xlsx', mode='a', engine='openpyxl')
        g.to_excel(writer, sheet_name = sheet)
        writer.close()  # close() saves the file; writer.save() was removed in pandas 2.0
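An alternative that avoids reopening files in append mode is to regroup the data first, so that each category's workbook is written exactly once. The sketch below is only one possible arrangement: split_by_category and write_books are hypothetical helper names, and the second dataframe is invented for illustration.

```python
import pandas as pd


def split_by_category(frames, column="Category"):
    """Regroup {sheet: DataFrame} into {category: {sheet: sub-DataFrame}}."""
    books = {}
    for sheet, frame in frames.items():
        for cat, group in frame.groupby(column):
            books.setdefault(cat, {})[sheet] = group
    return books


def write_books(books):
    """Write one workbook per category with one sheet per source frame.

    Needs an Excel engine such as openpyxl or xlsxwriter to be installed.
    """
    for cat, sheets in books.items():
        with pd.ExcelWriter(f"{cat}.xlsx") as writer:
            for sheet, group in sheets.items():
                group.to_excel(writer, sheet_name=sheet, index=False)


df1 = pd.DataFrame({
    "Category": ["A", "A", "B", "B"],
    "URL": ["example.com", "example2.com", "example3.com", "example4.com"],
    "Obj": ["Chair", "table", "glass", "tv"],
})
# Hypothetical second frame, only here to show several sheets per workbook.
df3 = pd.DataFrame({
    "Category": ["A", "B"],
    "URL": ["example.com", "example3.com"],
    "Text": ["hello", "world"],
})

books = split_by_category({"Image": df1, "Text": df3})
print(sorted(books))  # ['A', 'B']

# write_books(books)  # would create A.xlsx and B.xlsx
```

Because each file is opened once, rerunning the script overwrites the workbooks instead of failing on sheets that already exist.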

Dataframe index rows all 0's

I'm iterating through PDFs to obtain the text entered in the form fields. When I send the rows to a csv file, it only exports the last row. When I print the results from the dataframe, all the row indexes are 0. I have tried various solutions from Stack Overflow, but I can't get anything to work: what should be 0, 1, 2, 3, etc. comes out as 0, 0, 0, 0, etc.
Here is what I get when printing the results; only the last row exports to the csv file:
            0
0  1938282828
            0
0  1938282828
          0
0  22222222
My code:
infile = glob.glob('./*.pdf')
for i in infile:
    if i.endswith('.pdf'):
        pdreader = PdfFileReader(open(i,'rb'))
        diction = pdreader.getFormTextFields()
        myfieldvalue2 = str(diction['ID'])
        df = pd.DataFrame([myfieldvalue2])
        print(df)
Thank you for any help!

You are replacing the same dataframe on each iteration:
infile = glob.glob('./*.pdf')
for i in infile:
    if i.endswith('.pdf'):
        pdreader = PdfFileReader(open(i,'rb'))
        diction = pdreader.getFormTextFields()
        myfieldvalue2 = str(diction['ID'])
        df = pd.DataFrame([myfieldvalue2]) # this creates new df each time
        print(df)
Corrected code:
infile = glob.glob('./*.pdf')
df = pd.DataFrame()
for i in infile:
    if i.endswith('.pdf'):
        pdreader = PdfFileReader(open(i,'rb'))
        diction = pdreader.getFormTextFields()
        myfieldvalue2 = str(diction['ID'])
        # DataFrame.append was removed in pandas 2.0; ignore_index renumbers the rows 0, 1, 2, ...
        df = pd.concat([df, pd.DataFrame([myfieldvalue2])], ignore_index=True)
print(df)
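A common alternative is to collect the values in a plain list and build the dataframe once after the loop, which gives a fresh 0, 1, 2, ... index without any extra work. The sketch below uses a hard-coded list in place of the values read from the PDFs (an assumption made so that it runs without any PDF files), and contrasts it with stacking one-row dataframes.

```python
import pandas as pd

# Stand-ins for the 'ID' field values that would be read from each PDF.
ids = ["1938282828", "1938282828", "22222222"]

# Collect one record per file, then build the DataFrame a single time.
rows = []
for value in ids:
    rows.append({"ID": value})

df = pd.DataFrame(rows)
print(df.index.tolist())  # [0, 1, 2]

# For contrast: stacking one-row frames without ignore_index keeps every row's
# original label, which is where the repeated 0 index comes from.
stacked = pd.concat([pd.DataFrame([value]) for value in ids])
print(stacked.index.tolist())  # [0, 0, 0]
```

Building the frame once also avoids copying the accumulated data on every iteration.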

coloring cell while exporting dataframe to excel

I have dataframe "df" as below:
x = [1,3,5,7]
y1 = [3,2,2,2]
y2 = [2,5,2,2]
y3 = [7,2,2,1]
df = pd.DataFrame({'x': x, 'y1': y1, 'y2': y2, 'y3': y3})
writer = pd.ExcelWriter('output.xlsx')
df.to_excel(writer, sheet_name='Sheet1')
writer.close()
I want the Excel output file to show the same color for values that column x shares with the other columns.

You can use styles if the colors are specified by a dictionary:
def color(a):
    d = {1:'yellow', 3:'green', 5:'blue', 7:'red'}
    d1 = {k: 'background-color:' + v for k, v in d.items()}
    # map each cell to its CSS string; cells without a color get ''
    # (column-wise Series.map is used because DataFrame.applymap is deprecated since pandas 2.1)
    df1 = a.apply(lambda col: col.map(d1)).fillna('')
    return df1
df.style.apply(color, axis=None).to_excel('styled.xlsx', engine='openpyxl', index=False)

Is it possible to change a graphs x and y axis major/minor units using openpyxl?

I have tried the following, but none of it works.
chart.auto_axis = False
chart.x_axis.unit = 365
chart.set_y_axis({'minor_unit': 100, 'major_unit':365})
Changing the max and min scale for both axes is straightforward:
chart.x_axis.scaling.min = 0
chart.x_axis.scaling.max = 2190
chart.y_axis.scaling.min = 0
chart.y_axis.scaling.max = 2
so I'm hoping there is a straightforward solution to this as well. Here is an MCVE.
from openpyxl import load_workbook, Workbook
import datetime
from openpyxl.chart import ScatterChart, Reference, Series
wb = Workbook()
ws = wb.active
rows = [
    ['data point 1', 'data point2'],
    [25, 1],
    [100, 2],
    [500, 3],
    [800, 4],
    [1200, 5],
    [2100, 6],]
for row in rows:
    ws.append(row)
chart = ScatterChart()
chart.title = "Example Chart"
chart.style = 18
chart.y_axis.title = 'y'
chart.x_axis.title = 'x'
chart.x_axis.scaling.min = 0
chart.y_axis.scaling.min = 0
chart.x_axis.scaling.max = 2190
chart.y_axis.scaling.max = 6
xvalues = Reference(ws, min_col=1, min_row=2, max_row=7)
yvalues = Reference(ws, min_col=2, min_row=2, max_row=7)
series = Series(values=yvalues, xvalues=xvalues, title="DP 1")
chart.series.append(series)
ws.add_chart(chart, "D2")
wb.save("chart.xlsx")
I need to automate changing the axis units to 365 or whatever else is required.

Very late answer, but I figured out how to do this just after finding this question.
You need to set the axis major unit to 365.25, and the number format to show just the year:
chart.x_axis.number_format = 'yyyy'
chart.x_axis.majorUnit = 365.25
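Putting this into the question's example, the axis objects of an openpyxl ScatterChart expose majorUnit and minorUnit attributes that can be set directly. The following is a minimal sketch: the unit values (365, 1, 0.5) and the output filename chart_units.xlsx are illustrative assumptions, and the year number format is left out because the sample x values are plain numbers rather than dates.

```python
from openpyxl import Workbook
from openpyxl.chart import Reference, ScatterChart, Series

wb = Workbook()
ws = wb.active
rows = [
    ["data point 1", "data point2"],
    [25, 1], [100, 2], [500, 3], [800, 4], [1200, 5], [2100, 6],
]
for row in rows:
    ws.append(row)

chart = ScatterChart()
chart.title = "Example Chart"
chart.x_axis.title = "x"
chart.y_axis.title = "y"
chart.x_axis.scaling.min = 0
chart.x_axis.scaling.max = 2190
chart.y_axis.scaling.min = 0
chart.y_axis.scaling.max = 6

# Spacing of the major and minor divisions on each axis (illustrative values).
chart.x_axis.majorUnit = 365
chart.y_axis.majorUnit = 1
chart.y_axis.minorUnit = 0.5

xvalues = Reference(ws, min_col=1, min_row=2, max_row=7)
yvalues = Reference(ws, min_col=2, min_row=2, max_row=7)
chart.series.append(Series(values=yvalues, xvalues=xvalues, title="DP 1"))

ws.add_chart(chart, "D2")
wb.save("chart_units.xlsx")
```

With an x maximum of 2190 and a major unit of 365, the axis is divided into six equal intervals.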
