Generating a NetCDF from a text file - pandas

Using Python, can I open a text file, read it into an array, and then save it as a NetCDF file?
The following script I wrote was not successful.
import os
import pandas as pd
import numpy as np
import PIL.Image as im
path = 'C:\path\to\data'
grb = [[]]
for fn in os.listdir(path):
    file = os.path.join(path,fn)
    if os.path.isfile(file):
        df = pd.read_table(file,skiprows=6)
        grb.append(df)
df2 = pd.np.array(grb)
#imarray = im.fromarray(df2) ##cannot handle this data type
#imarray.save('Save_Array_as_TIFF.tif')

I once used xarray (formerly named xray) to get a NetCDF file into an ASCII dataframe. I just checked, and its Dataset objects have a to_netcdf method.
Install xarray; it lets you work with labelled data much like pandas dataframes, and a DataFrame can be converted with to_xarray().
So give this a try:
df.to_xarray().to_netcdf(file_path)
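The suggestion above can be sketched end to end as follows. This is a minimal sketch, not a tested recipe for the asker's files: it assumes the text file is whitespace-separated with a header row, and the helper names `read_table_text` and `save_as_netcdf` are hypothetical. Writing the file needs xarray plus a NetCDF backend (for example netCDF4 or scipy) to be installed.

```python
import io

import pandas as pd


def read_table_text(text, skiprows=0):
    """Parse whitespace-separated text (assumed layout) into a DataFrame."""
    return pd.read_csv(io.StringIO(text), sep=r"\s+", skiprows=skiprows)


def save_as_netcdf(df, file_path):
    """Convert the DataFrame to an xarray Dataset and write it to disk."""
    dataset = df.to_xarray()      # needs the xarray package
    dataset.to_netcdf(file_path)  # needs a NetCDF backend such as netCDF4 or scipy
```

For a file on disk, `pd.read_csv(path, sep=r"\s+", skiprows=6)` would replace the `io.StringIO` wrapper, after which `save_as_netcdf(df, "out.nc")` writes the result.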
xarray slow to save netCDF

Related

How to access a dataframe from a Python dataframe list through a cell from a date column in the dataframe

I have created a list (df) which contains some dataframes after importing csv files. Instead of accessing these dataframes using df[0], df[1] etc., I would like to access them in a much easier way, with something like df[20/04/22] or df[date=='20/04/22'] or something similar. I am really new to Python and programming, so thank you very much in advance. I attach simplified code (it contains only 2 items in the list) to keep things simple.
I came up with two ways of achieving that, but each time I have some trouble implementing them.
1. Through my directory path names. Each csv (dataframe) file name includes the date, something like: "5f05d5d83a442d4f78db0a19_2022-04-01.csv"
2. Each csv (dataframe) includes a date column (object type), which I have changed to datetime64 type so I can work with plots. So I thought that maybe what I ask would be possible through this column.
import pandas as pd
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import datetime
from datetime import date
from datetime import time
from pandas.tseries.offsets import DateOffset
import glob
import os
path = "C:/Users/dsdadsdsaa/"
all_files = glob.glob(path + '*.csv')
df = []
for filename in all_files:
    dataframe = pd.read_csv(filename, index_col=None, header=0)
    df.append(dataframe)
for i in range(0,2):
    df[i]['date'] = pd.to_datetime(df[i]['date'])
    df[i]['time'] = pd.to_datetime(df[i]['time'])
df[0]
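One way the first idea (using the date in the file name) might be sketched is with a dictionary instead of a list, keyed by the date string. This assumes every file name ends in `_YYYY-MM-DD.csv` as in the example name above; the helper names `date_key` and `load_by_date` are hypothetical.

```python
import re
from pathlib import Path

import pandas as pd

# Assumed naming scheme: "<anything>_YYYY-MM-DD.csv"
DATE_IN_NAME = re.compile(r"_(\d{4}-\d{2}-\d{2})\.csv$")


def date_key(filename):
    """Return the 'YYYY-MM-DD' part of a name like 'abc_2022-04-01.csv'."""
    match = DATE_IN_NAME.search(Path(filename).name)
    if match is None:
        raise ValueError(f"no date found in {filename!r}")
    return match.group(1)


def load_by_date(filenames):
    """Read each csv file and store its DataFrame under its date string."""
    frames = {}
    for filename in filenames:
        frame = pd.read_csv(filename, index_col=None, header=0)
        frame["date"] = pd.to_datetime(frame["date"])
        frames[date_key(filename)] = frame
    return frames
```

With `frames = load_by_date(glob.glob(path + '*.csv'))`, a single day would then be reached as `frames["2022-04-01"]` rather than `df[0]`.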

How can I convert my text file into netcdf file. I have observation datasets of simply one meteorological station between 1980 and 2018

I tried to convert my text file into a NetCDF (nc) file with the help of the YouTube link I shared here. I cannot open this nc file in GrADS. I guess the reason is that these lines of code do not add metadata or similar information to the nc file.
I would therefore like to improve the code I have so that I can open the file on other platforms. I need to open this NetCDF file in RCMES so I can carry out quantile mapping bias correction operations.
I am also open to suggestions for other ways/programming languages/platforms to perform this conversion task.
Below is the code I used.
import netCDF4 as nc
import numpy as np
import pandas as pd
import xarray
# here the csv file is read into a pandas dataframe
df = pd.read_csv('C:/Users/Asus/Documents/ArcGIS/ArcGIS Copy/evaporation/Downscaling Files/netcdfye dönecek csvler/Aydin_cnrm_Prec_rcp451.txt')
df
# converting the pandas dataframe into an xarray Dataset
xr = df.to_xarray()
xr
# lastly, writing the xarray Dataset to an nc file
xr.to_netcdf('Aydin_cnrm_Prec_rcp451.nc')
Instead of using Python to create a NetCDF file from an ASCII/txt file, I tried using cdo, which I installed on Ubuntu.
The following commands solved the problem:
cdo -f nc input,r1x1 tmp.nc < Aydin_cnrm_Prec_rcp45.txt
cdo -r -chname,var1,prec -settaxis,1980-01-01,00:00:00,1mon tmp.nc prec.nc
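For comparison, a rough Python counterpart of the `-settaxis,1980-01-01,00:00:00,1mon` and `-chname,var1,prec` steps might look like the sketch below. It assumes the text file holds one value per month in order; the helper names and the `units` argument are hypothetical, and whether GrADS or RCMES accept the resulting file is not verified here (they may also expect latitude and longitude dimensions, which this sketch does not add).

```python
import pandas as pd


def with_monthly_time_axis(values, start="1980-01-01", name="prec"):
    """Attach a month-start time index to a plain sequence of values."""
    time = pd.date_range(start=start, periods=len(values), freq="MS", name="time")
    return pd.Series(list(values), index=time, name=name)


def to_netcdf_with_metadata(series, path, units):
    """Write the series as a NetCDF variable with a time coordinate."""
    data_array = series.to_xarray()    # needs the xarray package
    data_array.attrs["units"] = units  # caller-supplied unit string (assumption)
    data_array.to_netcdf(path)         # needs a NetCDF backend (netCDF4 or scipy)
```

The values themselves could come from `pd.read_csv(...)` as in the question, for example `with_monthly_time_axis(df.iloc[:, 0])` if the first column holds the precipitation.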

Pandas - xls to xlsx converter

I want Python to take ANY .xls file from a given location and save it as .xlsx with the original file name. How can I do that, so that any time I paste a file into that location it is converted to .xlsx with its original file name?
import pandas as pd
import os
for filename in os.listdir('./'):
    if filename.endswith('.xls'):
        df = pd.read_excel(filename)
        df.to_excel(??)
Your code seems to be perfectly fine. In case you are only missing the correct way to write it with the given name, here you go.
import pandas as pd
import os
for filename in os.listdir('./'):
    if filename.endswith('.xls'):
        df = pd.read_excel(filename)
        df.to_excel(f"{os.path.splitext(filename)[0]}.xlsx")
A possible extension to convert any file that gets pasted inside the folder can be implemented with an infinite loop, for instance:
import pandas as pd
import os
import time
while True:
    files = os.listdir('./')
    for filename in files:
        out_name = f"{os.path.splitext(filename)[0]}.xlsx"
        if filename.endswith('.xls') and out_name not in files:
            df = pd.read_excel(filename)
            df.to_excel(out_name)
    time.sleep(10)

How to convert the outcome from np.mean to csv?

I wrote a script to get the average grey value of each image in a folder. When I execute print(np.mean(img)) I get all the values in the terminal, but I don't know how to write the values to a CSV file.
import glob
import cv2
import numpy as np
import csv
import pandas as pd
files = glob.glob("/media/rene/Windows8_OS/PROMON/Recorded Sequences/6gParticles/650rpm/*.png")
for file in files:
    img = cv2.imread(file)
    finalArray = np.mean(img)
    print(finalArray)
So far it works, but I need to have the values in a CSV file. I tried csvwriter and pandas but did not manage to get a file containing the grey scale values.
Is this what you're looking for?
import glob
import cv2
import numpy as np
import pandas as pd
files = glob.glob("/media/rene/Windows8_OS/PROMON/Recorded Sequences/6gParticles/650rpm/*.png")
mean_lst = []  # one mean grey value per image
for file in files:
    img = cv2.imread(file)
    mean_lst.append(np.mean(img))
pd.DataFrame({"mean": mean_lst}).to_csv("path/to/file.csv", index=False)
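Since the question also mentions csvwriter, the same idea might be sketched with the standard `csv` module, here keeping the file name next to each mean. The helper names `mean_rows` and `write_rows` are hypothetical, and the images are passed in as (name, array) pairs so the sketch does not depend on cv2.

```python
import csv

import numpy as np


def mean_rows(images):
    """Return one (name, mean) tuple per (name, array) pair."""
    return [(name, float(np.mean(img))) for name, img in images]


def write_rows(rows, csv_path):
    """Write a header plus one row per image to csv_path."""
    with open(csv_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["file", "mean"])
        writer.writerows(rows)
```

In the loop above this would be used as `write_rows(mean_rows((f, cv2.imread(f)) for f in files), "path/to/file.csv")`.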

How can I download a zipped file from the internet using pandas 0.17.1 and Python 3.5

What am I doing wrong? Here is what I am trying to do:
import pandas as pd
url='http://data.octo.dc.gov/feeds/crime_incidents/archive/crime_incidents_2013_CSV.zip'
df = pd.read_csv(url, compression='gzip',
                 header=0, sep=',', quotechar='"',
                 engine = 'python')
@Abbas, thanks so much. Indeed I ran it step by step and here is what I came up with. Not the fastest, but it works fine.
I ran it with pandas 0.18.1 on Python 3.5.1 on Mac.
from zipfile import ZipFile
from urllib.request import urlopen
import pandas as pd
import os
URL = \
'http://data.octo.dc.gov/feeds/crime_incidents/archive/crime_incidents_2013_CSV.zip'
# open and save the zip file onto computer
url = urlopen(URL)
output = open('zipFile.zip', 'wb') # note the flag: "wb"
output.write(url.read())
output.close()
# read the zip file as a pandas dataframe
df = pd.read_csv('zipFile.zip') # pandas version 0.18.1 takes zip files
# if keeping the zip file on disk is not wanted, then:
os.remove('zipFile.zip') # remove the copy of the zipfile on disk
I hope this helps. Thanks!
The answer by Cy Bu didn't quite work for me in Python 3.6 on Windows. I was getting an invalid argument error when trying to open the file. I modified it slightly:
import os
from urllib.request import urlopen, Request
import pandas as pd
url = 'http://data.octo.dc.gov/feeds/crime_incidents/archive/crime_incidents_2013_CSV.zip'
r = Request(url, headers={'User-Agent': 'Mozilla/5.0'})
b2 = [z for z in url.split('/') if '.zip' in z][0] #gets just the '.zip' part of the url
with open(b2, "wb") as target:
    target.write(urlopen(r).read()) #saves the file to disk
data = pd.read_csv(b2, compression='zip') #opens the saved zip file
os.remove(b2) #removes the zip file
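The download-then-read steps above could also be done without a temporary file on disk, by opening the archive from memory. This is only a sketch: it assumes the archive's first member is the CSV of interest, and the helper names `read_csv_from_zip_bytes` and `read_csv_from_zip_url` are hypothetical.

```python
import io
import zipfile
from urllib.request import urlopen

import pandas as pd


def read_csv_from_zip_bytes(data, member=None):
    """Read one CSV member of an in-memory zip archive into a DataFrame."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        # Assumption: default to the first file listed in the archive.
        name = member or archive.namelist()[0]
        with archive.open(name) as handle:
            return pd.read_csv(handle)


def read_csv_from_zip_url(url):
    """Download the archive into memory and parse its CSV member."""
    with urlopen(url) as response:
        return read_csv_from_zip_bytes(response.read())
```

If the server rejects the default user agent, `urlopen` can be given a `Request` with a `User-Agent` header, as in the answer above.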
IIUC, here is a solution: instead of passing the zip file directly to pandas, first unzip it and then pass the csv file:
from io import BytesIO
from zipfile import ZipFile
from urllib.request import urlopen
import pandas as pd
url = urlopen("http://data.octo.dc.gov/feeds/crime_incidents/archive/crime_incidents_2013_CSV.zip")
zipfile = ZipFile(BytesIO(url.read()))
name = zipfile.namelist()[0]  # first file inside the archive
f = open(name, 'wb')
f.write(zipfile.open(name).read())
f.close()
df = pd.read_csv(name)
And will produce a DataFrame like this: