df = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})
fileName = QFileDialog.getSaveFileName(self,"Save",os.getcwd(),"CSV Files (*.csv)")
if fileName:
    with open(fileName, "w") as file:
        file.write(df)
I am trying to save my DataFrame to CSV using a path chosen with QFileDialog instead of a hard-coded df.to_csv path, but this doesn't work.
fileName = QFileDialog.getSaveFileName(self,"Save",os.getcwd(),"CSV Files (*.csv)")
print(fileName)
This returns a tuple of strings; the first element is the path.
You should write:
fileName, _ = QFileDialog.getSaveFileName(self,"Save",os.getcwd(),"CSV Files (*.csv)")
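Once the path has been unpacked, the DataFrame can be written with df.to_csv instead of file.write. A minimal sketch, assuming the return value described above (a tuple whose first element is the path, and whose path is an empty string when the dialog is cancelled). The helper name save_dataframe is hypothetical, and the dialog call is left as a comment so the snippet runs without a GUI.

```python
import pandas as pd


def save_dataframe(df, dialog_result):
    """Write df to the path chosen in a save dialog.

    dialog_result is assumed to be the (path, selected_filter) tuple
    returned by QFileDialog.getSaveFileName.
    """
    path, _selected_filter = dialog_result
    if not path:  # empty path: the user cancelled the dialog
        return False
    df.to_csv(path, index=False)  # let pandas do the CSV serialization
    return True


# Inside the widget, the call would look roughly like:
# result = QFileDialog.getSaveFileName(self, "Save", os.getcwd(), "CSV Files (*.csv)")
# save_dataframe(df, result)
```

Passing the whole tuple into the helper keeps the unpacking and the cancel check in one place.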
I want to load a CSV file into a DataFrame. I have tried many options, but none of them work.
df = pd.read_csv(r"C:\Users\DJ_PRATIK28\Downloads\titanic.xlsx","r", encoding="utf-8")
The file is an Excel workbook (.xlsx), not a CSV, so use pd.read_excel:
df = pd.read_excel(r"C:\Users\DJ_PRATIK28\Downloads\titanic.xlsx")
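If the same code has to handle both kinds of file, the reader can be chosen from the file extension. A minimal sketch: the helper name load_table is hypothetical, and the UTF-8 encoding for CSV files is an assumption carried over from the question.

```python
from pathlib import Path

import pandas as pd


def load_table(path):
    """Pick a pandas reader based on the file extension."""
    suffix = Path(path).suffix.lower()
    if suffix in (".xlsx", ".xls"):
        # read_excel needs an Excel engine such as openpyxl to be installed
        return pd.read_excel(path)
    if suffix == ".csv":
        return pd.read_csv(path, encoding="utf-8")
    raise ValueError(f"Unsupported file type: {suffix!r}")
```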
from pandas import Series
data = [1000, 2000, 3000]
index = ['a', 'b', 'c']
s = Series(data = data, index = index)
s.to_excel('/content/drive/MyDrive/Colab Notebooks/data.xlsx')
This does not work. It fails with:
[Errno 2] No such file or directory:
But those directories do exist, and so do files with the same names!
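When [Errno 2] is raised while writing a file, it generally means that some directory on the path is not visible to the running process, since the file itself does not need to exist yet. A minimal diagnostic sketch using only the standard library; the helper name describe_target is hypothetical. On Colab, one thing worth checking (an assumption about this setup, not something the question confirms) is whether Google Drive has been mounted in the current session.

```python
import os


def describe_target(path):
    """Report whether the target's parent directory and the file itself exist."""
    parent = os.path.dirname(os.path.abspath(path))
    return {
        "parent": parent,
        "parent_exists": os.path.isdir(parent),  # must be True for a write to succeed
        "file_exists": os.path.isfile(path),     # may be False before the first write
    }


# Example: describe_target('/content/drive/MyDrive/Colab Notebooks/data.xlsx')
```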
I have a few thousand files of text and would like to analyze them for trends/word patterns, etc. I am familiar with both Pandas and SQL but am not sure how to "load" all these files into a table/system such that I can run code on them. Any advice?
If all the text files share the same columns, you can use something like this:
import pandas as pd
import glob
path = r'C:/location_rawdata_files'  # use the path where you stored all the .txt files
all_files = glob.glob(path + "/*.txt")
lst = []
for filename in all_files:
    df = pd.read_csv(filename, index_col=None)
    lst.append(df)
df = pd.concat(lst, axis=0, ignore_index=True)
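If the files are free text and not column-delimited data, pd.read_csv is not a good fit. An alternative is to keep one row per file and analyze the text column. A minimal sketch, assuming UTF-8 encoded .txt files; the helper names load_text_files and word_counts are hypothetical, and splitting on whitespace is only a crude stand-in for real tokenization.

```python
import glob
import os
from collections import Counter

import pandas as pd


def load_text_files(folder, pattern="*.txt"):
    """Return a DataFrame with one row per file: filename and full text."""
    rows = []
    for filename in sorted(glob.glob(os.path.join(folder, pattern))):
        with open(filename, encoding="utf-8") as fh:
            rows.append({"filename": os.path.basename(filename), "text": fh.read()})
    return pd.DataFrame(rows, columns=["filename", "text"])


def word_counts(docs):
    """Count lower-cased, whitespace-separated words across all documents."""
    counts = Counter()
    for text in docs["text"]:
        counts.update(text.lower().split())
    return counts
```

From here the table can be queried with pandas as usual, or written to a SQL database with DataFrame.to_sql.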
I have two functions for exporting dicts to CSV and importing them back, because I have many dictionaries of different kinds.
def export_dict(dict, filename):
    df = pd.DataFrame.from_dict(dict, orient="index")
    df.to_csv(CSV_PATH + filename)
def import_dict(filename):
    df = pd.read_csv(filename, header = None, keep_default_na=False)
    d = df.to_dict('split')
    d = dict(d["data"])
    return d
The dicts look like this:
d1 = {'123456': 8, '654321': 90, '123654': 157483}
d2 = {'ouwejfw': [4], 'uwlfhsn0': [3, 1, 89], 'afwefwe': [3, 4], 'a3rz0dsd': []}
So basically the first case is a simple one, where a string is a key (but it could be an int) and a number is the value.
The second case is where a string is the key and a list of values of different sizes is the value.
I can write and read the first one without problems. But the second one breaks everything because the lists have different sizes. I cannot shorten them, and I cannot afford the extra space of padding columns, because I have many millions of entries.
Can somebody tell me how to read and write both kinds of dict correctly?
Thank you!
You can treat each dictionary value as a single item when storing it in the DataFrame:
import pandas as pd
# CSV_PATH is the directory prefix used in the question and must be defined beforehand
def export_dict(d, filename):
    # One row per key; the whole value (number or list) goes into a single cell
    df = pd.DataFrame()
    df["keys"] = list(d.keys())
    df["values"] = list(d.values())
    df.to_csv(CSV_PATH + filename)
d2 = {'ouwejfw': [4], 'uwlfhsn0': [3, 1, 89], 'afwefwe': [3, 4], 'a3rz0dsd': []}
export_dict(d2, "your_filename.csv")
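The export above stores each list as its string form, so reading the file back needs a step that turns the strings into Python objects again. A minimal round-trip sketch, assuming every value is a plain Python literal (a number or a list of numbers) so that ast.literal_eval can parse it. The functions here take a full path and are not the originals; keys come back as strings, so integer keys would need converting by the caller.

```python
import ast

import pandas as pd


def export_dict(d, path):
    """Write one row per key, with the whole value stored in a single cell."""
    df = pd.DataFrame({"keys": list(d.keys()), "values": list(d.values())})
    df.to_csv(path, index=False)


def import_dict(path):
    """Read the file back and parse each stored value as a Python literal."""
    # dtype=str keeps keys such as '123456' as strings and leaves values unparsed
    df = pd.read_csv(path, dtype=str, keep_default_na=False)
    return {k: ast.literal_eval(v) for k, v in zip(df["keys"], df["values"])}
```

Because each value occupies one cell, lists of different lengths do not add columns to the file.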
Per this example the to_excel method should save the Excel file with background color. However, my saved Excel file does not have any color in it.
I tried to write using both openpyxl and xlsxwriter engines. In both cases, the Excel file was saved, but the cell color/style was lost.
I can read the file back and reformat with openpyxl, but if this to_excel method is supposed to work, why doesn't it?
Here is the sample code.
import pandas as pd # version 0.24.2
dict = {'A': [1, 1, 1, 1, 1], 'B':[2, 1, 2, 1, 2], 'C':[1, 2, 1, 2, 1]}
df = pd.DataFrame(dict)
df_styled = df.style.apply(lambda x: ["background: #ffa31a" if x.iloc[0] < v else " " for v in x], axis=1)
df_styled
''' in my jupyter notebook, this displayed my dataframe with background color when condition is met, (all the 2s highlighted)'''
'''Save the styled data frame to excel using to_excel'''
df_styled.to_excel('example_file_openpyxl.xlsx', engine='openpyxl')
df_styled.to_excel('example_file_xlsxwriter.xlsx', engine='xlsxwriter')
I stumbled across this myself, and as far as I'm aware there isn't support for exporting to Excel like this yet. I've adjusted your code to follow the Excel export example in the documentation.
This is the Excel export example from the documentation:
df.style.\
    applymap(color_negative_red).\
    apply(highlight_max).\
    to_excel('styled.xlsx', engine='openpyxl')
This is your code adjusted:
import numpy as np
import pandas as pd
dict = {'A': [1, 1, 1, 1, 1], 'B':[2,1,2,1,2], 'C':[1,2,1,2,1]}
df = pd.DataFrame(dict)
def highlight(df, color = "yellow"):
    # CSS applied to every cell where the condition holds
    attr = 'background-color: {}'.format(color)
    # Boolean frame: True where a cell is greater than the first value in its row
    df_bool = pd.DataFrame(df.apply(lambda x: [True if x.iloc[0] < v else False for v in x],axis=1).apply(pd.Series),
                           index=df.index)
    df_bool.columns = df.columns
    return pd.DataFrame(np.where(df_bool, attr, ""),
                        index=df.index, columns=df.columns)
df.style. \
    apply(highlight, axis=None).\
    to_excel("styled.xlsx", engine="openpyxl")
Inside the highlight function, I create a boolean dataframe based on the conditions applied in the list comprehension above. Then, I assign styling based on the result of this dataframe.
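The same boolean frame can also be built without the row-wise list comprehension. A minimal sketch, assuming the rule is "highlight every cell that is greater than the first column's value in the same row"; the helper name highlight_above_first is hypothetical. The Styler call is left as a comment because it needs jinja2 and openpyxl installed.

```python
import numpy as np
import pandas as pd


def highlight_above_first(df, color="yellow"):
    """Return a frame of CSS strings, shaped like df, for Styler.apply(axis=None)."""
    attr = "background-color: {}".format(color)
    # Compare every column against the first column, aligned row by row
    mask = df.gt(df.iloc[:, 0], axis=0)
    return pd.DataFrame(np.where(mask, attr, ""), index=df.index, columns=df.columns)


df = pd.DataFrame({'A': [1, 1, 1, 1, 1], 'B': [2, 1, 2, 1, 2], 'C': [1, 2, 1, 2, 1]})
styles = highlight_above_first(df)

# To export with the styling applied:
# df.style.apply(highlight_above_first, axis=None).to_excel("styled.xlsx", engine="openpyxl")
```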