Pandas - xls to xlsx converter - pandas

I want Python to take any .xls file from a given location and save it as .xlsx under the original file name. How can I do that, so that any time I paste a file into that location it is converted to .xlsx with its original file name?
import pandas as pd
import os
for filename in os.listdir('./'):
    if filename.endswith('.xls'):
        df = pd.read_excel(filename)
        df.to_excel(??)

Your code seems to be perfectly fine. In case you are only missing the correct way to write it with the given name, here you go.
import pandas as pd
import os
for filename in os.listdir('./'):
    if filename.endswith('.xls'):
        df = pd.read_excel(filename)
        df.to_excel(f"{os.path.splitext(filename)[0]}.xlsx")
A possible extension to convert any file that gets pasted inside the folder can be implemented with an infinite loop, for instance:
import pandas as pd
import os
import time
while True:
    files = os.listdir('./')
    for filename in files:
        out_name = f"{os.path.splitext(filename)[0]}.xlsx"
        if filename.endswith('.xls') and out_name not in files:
            df = pd.read_excel(filename)
            df.to_excel(out_name)
    time.sleep(10)
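The polling loop above can also be split so that the check for "what still needs converting" is separate from the pandas call. This is only a minimal sketch: the helper names `pending_conversions` and `convert_pending` are hypothetical, and it assumes an Excel engine that can read `.xls` (such as xlrd) is installed.

```python
from pathlib import Path


def pending_conversions(folder):
    """Return (source, target) pairs for .xls files that have no .xlsx twin yet."""
    folder = Path(folder)
    pairs = []
    for src in sorted(folder.glob("*.xls")):
        dst = src.with_suffix(".xlsx")
        if not dst.exists():
            pairs.append((src, dst))
    return pairs


def convert_pending(folder):
    """Convert every pending .xls file in `folder`; return the files written."""
    import pandas as pd  # imported here so the path logic works without pandas

    written = []
    for src, dst in pending_conversions(folder):
        # Reads the first sheet only, as in the answer above. index=False is a
        # choice made in this sketch; the answer keeps the default index column.
        pd.read_excel(src).to_excel(dst, index=False)
        written.append(dst)
    return written
```

Calling `convert_pending('./')` inside the `while True:` loop, followed by `time.sleep(10)`, would give the same behaviour as the loop above.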

Related

Import multiple files in pandas

I am trying to import multiple files in pandas. I have created 3 files in the folder
['File1.xlsx', 'File2.xlsx', 'File3.xlsx'] as read by files = os.listdir(cwd)
import os
import pandas as pd
cwd = os.path.abspath(r'C:\Users\abc\OneDrive\Import Multiple files')
files = os.listdir(cwd)
df = pd.DataFrame()
for file in files:
    if file.endswith('.xlsx'):
        df = df.append(pd.read_excel(file), ignore_index=True)
df.head()
# df.to_excel('total_sales.xlsx')
print(files)
Upon running the code, I am getting the error (even though the file does exist in the folder)
FileNotFoundError: [Errno 2] No such file or directory: 'File1.xlsx'
Ideally, I want code where I define the files in a list and then read them in a loop using the path and that list.
I think the following should work
import os
import pandas as pd
cwd = os.path.abspath(r'C:\Users\abc\OneDrive\Import Multiple files')
# os.listdir returns bare file names, so join each one with the folder path
paths = [os.path.join(cwd, path) for path in os.listdir(cwd) if path.endswith('.xlsx')]
# ignore_index is an argument of pd.concat, not of pd.read_excel
df = pd.concat((pd.read_excel(path) for path in paths), ignore_index=True)
df.head()
The idea is to build a list of full paths, then read all the files and concatenate them into a single DataFrame.
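The `FileNotFoundError` in the question most likely arises because `os.listdir` returns bare file names, which are then looked up relative to the current working directory rather than the folder stored in `cwd`. A small sketch of the full-path approach as reusable functions follows; `excel_paths` and `read_all` are hypothetical names, and reading `.xlsx` assumes an engine such as openpyxl is installed.

```python
import os


def excel_paths(folder):
    """Return the full paths of the .xlsx files in `folder`, sorted by name."""
    return [
        os.path.join(folder, name)
        for name in sorted(os.listdir(folder))
        if name.endswith(".xlsx")
    ]


def read_all(folder):
    """Concatenate the first sheet of every workbook in `folder` into one DataFrame."""
    import pandas as pd  # imported here so excel_paths works without pandas

    frames = [pd.read_excel(path) for path in excel_paths(folder)]
    # pd.concat raises ValueError if `frames` is empty (no .xlsx files found).
    return pd.concat(frames, ignore_index=True)
```

With the folder from the question, `read_all(cwd).head()` would stand in for the last two lines of the answer.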

Can't import my CSV file in Jupyter notebook

I have put my CSV file in the same folder I am running Jupyter Notebook from, but I still can't import it.
You need to read it into a DataFrame first:
import pandas as pd
df = pd.read_csv('name.csv')  # the file name of your CSV
df
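If `pd.read_csv('name.csv')` still fails with a file-not-found error, a common cause is that the notebook's working directory is not the folder you expect. The sketch below is one way to check; `find_csv` is a hypothetical helper and `name.csv` is the placeholder from the answer.

```python
import os


def find_csv(name, folder=None):
    """Return the absolute path of `name`, or raise listing the CSV files that do exist."""
    folder = os.getcwd() if folder is None else folder
    path = os.path.join(folder, name)
    if os.path.isfile(path):
        return os.path.abspath(path)
    # Collect the CSV files that are actually there to make the error useful.
    nearby = sorted(f for f in os.listdir(folder) if f.lower().endswith(".csv"))
    raise FileNotFoundError(
        f"{name!r} not found in {folder!r}; CSV files there: {nearby}"
    )
```

In a notebook cell, `pd.read_csv(find_csv('name.csv'))` either loads the file or reports which folder was searched and what it contains.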

How to convert the outcome from np.mean to csv?

So I wrote a script to get the average grey value of each image in a folder. When I execute print(np.mean(img)) I get all the values in the terminal, but I don't know how to get the values into a CSV file.
import glob
import cv2
import numpy as np
import csv
import pandas as pd
files = glob.glob("/media/rene/Windows8_OS/PROMON/Recorded Sequences/6gParticles/650rpm/*.png")
for file in files:
    img = cv2.imread(file)
    finalArray = np.mean(img)
    print(finalArray)
So far it works, but I need to have the values in a CSV file. I tried csvwriter and pandas but did not manage to get a file containing the grey scale values.
Is this what you're looking for?
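import glob
import cv2
import numpy as np
import pandas as pd
files = glob.glob("/media/rene/Windows8_OS/PROMON/Recorded Sequences/6gParticles/650rpm/*.png")
mean_lst = []
for file in files:
    img = cv2.imread(file)
    mean_lst.append(np.mean(img))  # mean over all pixels and channels
# one row per image, in the same order as `files`
pd.DataFrame({"mean": mean_lst}).to_csv("path/to/file.csv", index=False)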
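Since the question also mentions trying csvwriter, here is a sketch of the same result using only the standard library. It assumes the means have already been collected as (file name, value) pairs; `write_means` and the column names `file` and `mean` are hypothetical.

```python
import csv


def write_means(rows, out_path):
    """Write (file_name, mean_value) pairs to a CSV file with a header row."""
    # newline="" stops the csv module from emitting blank lines on Windows.
    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["file", "mean"])
        writer.writerows(rows)
```

In the loop above you would collect `rows.append((file, np.mean(img)))` and call `write_means(rows, "path/to/file.csv")` once at the end, which also keeps each value next to the image it came from.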

Generating a NetCDF from a text file

Using Python can I open a text file, read it into an array, then save the file as a NetCDF?
The following script I wrote was not successful.
import os
import pandas as pd
import numpy as np
import PIL.Image as im
path = r'C:\path\to\data'  # raw string so the backslashes are not read as escapes
grb = [[]]
for fn in os.listdir(path):
    file = os.path.join(path, fn)
    if os.path.isfile(file):
        df = pd.read_table(file, skiprows=6)
        grb.append(df)
df2 = np.array(grb)
#imarray = im.fromarray(df2) ##cannot handle this data type
#imarray.save('Save_Array_as_TIFF.tif')
I once used xray, since renamed xarray, to get a NetCDF file into an ASCII dataframe. I just googled, and apparently it has a to_netcdf function.
Import xarray; it lets you work with labeled data much as you would in pandas.
So give this a try (to_netcdf is a method of xarray objects, so convert the DataFrame first):
df.to_xarray().to_netcdf(file_path)
xarray slow to save netCDF

How can I download a zipped file from the internet using pandas 0.17.1 and Python 3.5?

What am I doing wrong? Here is what I am trying to do:
import pandas as pd
url='http://data.octo.dc.gov/feeds/crime_incidents/archive/crime_incidents_2013_CSV.zip'
df = pd.read_csv(url, compression='gzip',
                 header=0, sep=',', quotechar='"',
                 engine='python')
@Abbas, thanks so much. Indeed, I ran it step by step, and here is what I came up with. It is not the fastest, but it works fine.
I ran it with pandas 0.18.1 on Python 3.5.1 on a Mac.
from zipfile import ZipFile
from urllib.request import urlopen
import pandas as pd
import os
URL = 'http://data.octo.dc.gov/feeds/crime_incidents/archive/crime_incidents_2013_CSV.zip'
zip_name = 'zipFile.zip'
# download the zip file and save it to disk
with urlopen(URL) as response, open(zip_name, 'wb') as output:  # note the flag: "wb"
    output.write(response.read())
# read the zip file as a pandas DataFrame
df = pd.read_csv(zip_name)  # pandas version 0.18.1 takes zip files
# if keeping the zip file on disk is not wanted, then:
os.remove(zip_name)  # remove the copy of the zip file on disk
I hope this helps. Thanks!
The answer by Cy Bu didn't quite work for me in Python 3.6 on Windows. I was getting an invalid argument error when trying to open the file. I modified it slightly:
import os
from urllib.request import urlopen, Request
import pandas as pd
url = 'http://data.octo.dc.gov/feeds/crime_incidents/archive/crime_incidents_2013_CSV.zip'  # URL from the question
r = Request(url, headers={'User-Agent': 'Mozilla/5.0'})
b2 = [z for z in url.split('/') if '.zip' in z][0]  # gets just the '.zip' file name from the URL
with open(b2, "wb") as target:
    target.write(urlopen(r).read())  # saves the file to disk
data = pd.read_csv(b2, compression='zip')  # opens the saved zip file
os.remove(b2)  # removes the zip file
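Both answers above write the archive to disk before reading it. A minimal sketch of an in-memory variant follows, assuming the archive holds at least one `.csv` member; `first_csv_member` is a hypothetical helper, and the bytes would come from `urlopen(...).read()` as in the answers.

```python
import io
import zipfile


def first_csv_member(zip_bytes):
    """Return (member_name, file_like) for the first .csv in a zip archive held in memory."""
    archive = zipfile.ZipFile(io.BytesIO(zip_bytes))
    for name in archive.namelist():
        if name.lower().endswith(".csv"):
            # Read the member fully so the result no longer depends on `archive`.
            return name, io.BytesIO(archive.read(name))
    raise ValueError("no .csv member found in the archive")
```

The returned file-like object can be passed straight to pandas, for example `name, handle = first_csv_member(urlopen(r).read())` followed by `data = pd.read_csv(handle)`, so nothing has to be removed from disk afterwards.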
IIUC, here is a solution: instead of passing the zip file directly to pandas, first unzip it and then pass the CSV file:
from io import BytesIO
from zipfile import ZipFile
from urllib.request import urlopen
import pandas as pd
url = urlopen("http://data.octo.dc.gov/feeds/crime_incidents/archive/crime_incidents_2013_CSV.zip")
zipfile = ZipFile(BytesIO(url.read()))  # zip archives are binary, so wrap the bytes in BytesIO
csv_name = zipfile.namelist()[0]  # name of the first member of the archive
with open(csv_name, 'wb') as f:
    f.write(zipfile.read(csv_name))  # extract that member to the working directory
df = pd.read_csv(csv_name)
And will produce a DataFrame like this: