Python and netCDF script does not work anymore - variables

I am using a Python 2.6 script that I have been using for quite a while now, and I am getting an error that shouldn't be there. The script is run from the directory where the netCDF file is located. Here is the code:
from numpy import *
import numpy as numpy
from netCDF4 import Dataset
import datetime as DT
from time import strftime
import os
floc = '/media/USB-HDD/NCEP_NCAP data/data_2010/' # directory where the file resides
fname = 'cfsr_Scotland_2010' # name of the netCDF file
in_ext = '.nc' # file extension of the netCDF
basetime = DT.datetime(2010,01,01,0,0,0) # Initial time (start) for the netCDF
ncfile = Dataset(floc+fname+in_ext,'r') # netCDF assigned name
time = ncfile.variables['time']
lon = ncfile.variables['lon']
lat = ncfile.variables['lat']
uwind = ncfile.variables['10u']
vwind = ncfile.variables['10v']
ht = ncfile.variables['height']
I get the error at the line where ncfile is opened, which is odd because I checked the way it's written:
Traceback (most recent call last):
File "CFSR2WIND.py", line 24, in <module>
ncfile = Dataset(floc+fname+in_ext,'r') # netCDF assigned name
File "netCDF4.pyx", line 1317, in netCDF4.Dataset.__init__ (netCDF4.c:14608)
RuntimeError: No such file or directory
Does anybody know what caused this, and how can it be solved?
thank you
george

Try using the netcdf module from scipy instead:
from scipy.io.netcdf import netcdf_file as Dataset
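That import is a drop-in replacement for the rest of the script, with the caveat that scipy's reader only handles classic (netCDF-3) files; a minimal sketch:
from scipy.io.netcdf import netcdf_file as Dataset

# Opens classic netCDF-3 files; HDF5-based netCDF-4 files still
# need the netCDF4 package.
ncfile = Dataset('/media/USB-HDD/NCEP_NCAP data/data_2010/cfsr_Scotland_2010.nc', 'r')
time = ncfile.variables['time']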
A couple of other suggestions:
Importing numpy. You're importing it twice, and pulling in everything with * is a bit dangerous because it floods the namespace. By convention, most people abbreviate numpy as np and load it with import numpy as np. Then you can call numpy functions as, for example, np.mean().
Concatenating the path, filename, and file extension. It's OK to use string concatenation with the + sign, but another way is the join method: the full filename would be something like filename = ''.join([floc, fname, in_ext]) (see the sketch below).
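Putting both suggestions together, here is a minimal sketch; the path and names come from the question, and the os.path.exists check is an added diagnostic (netCDF4 raises RuntimeError: No such file or directory when it cannot resolve the path, so verifying it first pinpoints a missing mount or a typo):
import os
import numpy as np
from netCDF4 import Dataset

floc = '/media/USB-HDD/NCEP_NCAP data/data_2010/'  # directory where the file resides
fname = 'cfsr_Scotland_2010'                       # name of the netCDF file
in_ext = '.nc'                                     # file extension

# Build the path once and verify it before handing it to netCDF4.
fullpath = ''.join([floc, fname, in_ext])
if not os.path.exists(fullpath):
    raise IOError('file not found: %s' % fullpath)

ncfile = Dataset(fullpath, 'r')
time = ncfile.variables['time']
If the check fails, the external drive is probably no longer mounted at /media/USB-HDD, which would explain why a script that worked for a long time suddenly stops.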

Related

How can I convert my text file into a netCDF file? I have observation datasets from just one meteorological station between 1980 and 2018.

I tried to convert my text file into a NetCDF (.nc) file with the help of the YouTube link I shared here. I cannot open this .nc file in GrADS. I guess the reason is that I cannot add metadata or coordinates to the .nc file with these lines of code.
I would therefore like to improve the code I have so that I can open the file on other platforms. I need to open this NetCDF file in RCMES so that I can carry out quantile-mapping bias-correction operations.
I am also open to suggestions for other ways/programming languages/platforms to perform this conversion task.
Below is the code I used.
import numpy as np
import pandas as pd
import xarray
# here csv file is converted into pandas dataframe
df = pd.read_csv('C:/Users/Asus/Documents/ArcGIS/ArcGIS Copy/evaporation/Downscaling Files/netcdfye dönecek csvler/Aydin_cnrm_Prec_rcp451.txt')
df
#converting pandas dataframe into xarray
xr = df.to_xarray()
xr
#lastly from xarray to nc file conversion
xr.to_netcdf('Aydin_cnrm_Prec_rcp451.nc')
Instead of using Python to create a NetCDF file from an ASCII/txt file, I tried using CDO, which I installed on Ubuntu.
The following commands solved the problem:
cdo -f nc input,r1x1 tmp.nc < Aydin_cnrm_Prec_rcp45.txt
cdo -r -chname,var1,prec -settaxis,1980-01-01,00:00:00,1mon tmp.nc prec.nc
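For completeness, the same repair can be done in Python with xarray; this is a minimal sketch, assuming the text file holds one monthly precipitation value per line starting at 1980-01-01 (the file name, the variable name prec, and the time axis mirror the CDO commands above; the units attribute is an assumption):
import pandas as pd
import xarray as xr

# One value per line; build an explicit monthly time coordinate so
# tools like GrADS and RCMES can interpret the series.
values = pd.read_csv('Aydin_cnrm_Prec_rcp45.txt', header=None)[0]
time = pd.date_range('1980-01-01', periods=len(values), freq='MS')

ds = xr.Dataset({'prec': ('time', values.values)}, coords={'time': time})
ds['prec'].attrs['units'] = 'mm'  # assumed unit; adjust to the data
ds.to_netcdf('prec.nc')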

Reading CSV file and manipulating the components (a newbie question)

So, I have just started learning Python. I am trying to read a .csv file (https://www.dropbox.com/s/fp1g32uv2cljd1n/adcpDat.csv?dl=0).
I can read in the file, but when I try to select one of its components it returns a Traceback (most recent call last) error.
import os
import csv
import pandas as pd
import numpy as np
os.chdir("/Users/K1/Documents/Work/UGA/Cruise/GC600-MP/Data/ADCP/")
print("Current Working Directory ", os.getcwd())
adcpDat = pd.read_csv("adcpDat.csv")
print(adcpDat.shape)
The output is
Current Working Directory /Users/K1/Documents/Work/UGA/Cruise/GC600-MP/Data/ADCP
(805945, 1)
but when I run, for example,
adcpDat[3]
it just returns an error.
How can I pick the components?
You first have to specify the column name, then the row index:
adcpDat['columnname'][3]
In the case of your csv file it would be:
adcpDat['tADCP'][3]
This is because the first line of the csv file specifies the column name, which is tADCP.
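If you prefer purely positional indexing, pandas also offers iloc; a short sketch using the file from the question:
import pandas as pd

adcpDat = pd.read_csv('adcpDat.csv')

# Label-based: column name first, then the row label/index.
print(adcpDat['tADCP'][3])

# Position-based: row 3 of the first (and only) column.
print(adcpDat.iloc[3, 0])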

How to find all files in all subdirectories in Python

I want to return a list of ALL files located below a certain starting point. I am using Python.
Currently,
import os
import pandas as pd

path = 'c://users.../'
f = []
for currentpath, folders, files in os.walk(path):
    for file in files:
        # print(os.path.join(currentpath, file))
        f.append(file)
df = pd.DataFrame(f)
df.columns = ['file_name']
print(df.shape)
df
works fine, but I have ~70k files in ~10k subfolders/directories, and it is incredibly slow.
I heard glob.glob() is quicker, but:
import glob

root_dir = 'c://users/.../'
for filename in glob.iglob(root_dir + '**/*', recursive=True):
    print(filename)
But this only returns the names of the subfolders. Is there a quick way to compile the full file list into a file for future processing?
You can use pathlib to speed up this operation; it uses scandir under the hood (https://pypi.org/project/scandir/), which is much quicker than the approach you are using.
from pathlib import Path

def tree(directory):
    # print the directory tree rooted at `directory`
    print(f'+ {directory}')
    for path in sorted(directory.rglob('*')):
        depth = len(path.relative_to(directory).parts)
        spacer = ' ' * depth
        print(f'{spacer}+ {path.name}')

tree(Path(path))  # will print out your directory tree
If you want to collect the paths into a list instead, you can do:
files = [file for file in Path(path).rglob('*')]
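To reproduce the DataFrame from the question with pathlib, a minimal sketch (is_file() filters out the directories, which is what the glob attempt was printing):
from pathlib import Path
import pandas as pd

path = 'c://users.../'  # root directory from the question

# rglob('*') yields both files and directories; keep file names only.
f = [p.name for p in Path(path).rglob('*') if p.is_file()]

df = pd.DataFrame(f, columns=['file_name'])
print(df.shape)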

Generating a NetCDF from a text file

Using Python, can I open a text file, read it into an array, and then save it as a NetCDF file?
The following script I wrote was not successful.
import os
import pandas as pd
import numpy as np
import PIL.Image as im

path = 'C:\path\to\data'
grb = [[]]
for fn in os.listdir(path):
    file = os.path.join(path, fn)
    if os.path.isfile(file):
        df = pd.read_table(file, skiprows=6)
        grb.append(df)
df2 = pd.np.array(grb)
#imarray = im.fromarray(df2) ##cannot handle this data type
#imarray.save('Save_Array_as_TIFF.tif')
I once used xray or xarray (the project renamed itself) to get a NetCDF file into an ASCII dataframe... I just googled, and apparently they have a to_netcdf function.
Import xarray and it allows you to treat dataframes much like pandas does.
So give this a try (a pandas DataFrame first needs to become an xarray object, since to_netcdf lives on xarray Datasets):
df.to_xarray().to_netcdf(file_path)
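A minimal end-to-end sketch of that idea, with the path and skiprows taken from the question (stacking the per-file frames with pd.concat is an assumption about how the pieces belong together, and the output name output.nc is arbitrary):
import os
import pandas as pd

path = 'C:\\path\\to\\data'

# Read every table file in the directory into one DataFrame.
frames = []
for fn in os.listdir(path):
    fpath = os.path.join(path, fn)
    if os.path.isfile(fpath):
        frames.append(pd.read_table(fpath, skiprows=6))
combined = pd.concat(frames)

# Convert to an xarray Dataset and write it out as NetCDF.
combined.to_xarray().to_netcdf('output.nc')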

Accessing carray of pointcloud using pytables

I am having a hard time understanding how to access the data in a carray.
http://carray.pytables.org/docs/manual/index.html
I have a carray that I can view in a group structure using vitables - but how to open it and retrieve the data it beyond me.
The data are a point cloud, three levels down in the hierarchy, that I want to make a scatter plot of and export as a .obj file.
I then have to loop through (many) clouds and do the same thing.
Is there anyone that can give me a simple example of how to do this?
This was my attempt:
import carray as ca
fileName = 'hdf5_example_db.h5'
a = ca.open(rootdir=fileName)
print a
I managed to solve my issue. I wasn't treating the carray any differently from the rest of the hierarchy. I needed to first load the entire database, then refer to the data I needed. I ended up not needing carray at all and just stuck with h5py:
from __future__ import print_function
import h5py
import numpy as np

# read the hdf5 format file
fileName = 'hdf5_example_db.h5'
f = h5py.File(fileName, 'r')

# full path of the carray-type data (which is in ply format)
dataspace = '/objects/object_000/object_model'

# view the data
print(f[dataspace])

# print to ply file
with open('object_000.ply', 'w') as fo:
    for line in f[dataspace]:
        fo.write(line+'\n')
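To get the scatter plot mentioned in the question, a hedged sketch: it assumes the stored ply text has vertex records of at least three floats per line (header and face lines are skipped by the length check and try/except), and it decodes bytes for Python 3:
import h5py
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # registers the 3d projection

f = h5py.File('hdf5_example_db.h5', 'r')
dataspace = '/objects/object_000/object_model'

# Keep only lines that parse as x y z floats (vertex records).
pts = []
for line in f[dataspace]:
    if isinstance(line, bytes):
        line = line.decode()
    parts = line.split()
    if len(parts) < 3:
        continue
    try:
        pts.append([float(v) for v in parts[:3]])
    except ValueError:
        pass  # ply header or other non-numeric line
pts = np.array(pts)

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.scatter(pts[:, 0], pts[:, 1], pts[:, 2], s=1)
plt.show()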