Loop through multiple netcdf files

Loop through multiple netcdf files - grads

I would like to loop through multiple netcdf files in GrADS. The files run from May 1, 2018 to June 30, 2018 named as:
2018050100.nc...2018063000.nc
The function is to regrid the netcdf files to 0.17 degrees. How do I do this?

You don't need GrADS for this at all.
CDO (Climate Data Operators) can regrid every file with a short shell loop.
1) Create a grid file and save it in text format with name "grd"
gridtype = lonlat
xsize =
ysize =
xfirst =
xinc =
yfirst =
yinc =
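2) Go to your shell and run the following loop. Each regridded file is written under the same name into a subfolder (here called "regridded", pick any name you like):
mkdir -p regridded
for i in *.nc; do
    cdo remapbil,grd "$i" "regridded/$i"
done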
3)Now you have a folder full of your regridded files
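The same loop can be driven from Python, which makes it easy to visit exactly the dates named in the question. This is a minimal sketch, assuming the grid description file is called grd as above and that the output goes to a hypothetical regridded/ subfolder. It only builds the cdo command lines; the commented subprocess call would run them.

```python
from datetime import date, timedelta

def build_remap_commands(start, end, grid_file="grd", out_dir="regridded"):
    """Return one `cdo remapbil` argument list per day from start to end inclusive.

    File names follow the YYYYMMDD00.nc pattern from the question; the
    output folder name is an assumption made for this sketch.
    """
    commands = []
    day = start
    while day <= end:
        name = day.strftime("%Y%m%d") + "00.nc"
        commands.append(["cdo", f"remapbil,{grid_file}", name, f"{out_dir}/{name}"])
        day += timedelta(days=1)
    return commands

commands = build_remap_commands(date(2018, 5, 1), date(2018, 6, 30))
print(len(commands), "files to regrid")
print(" ".join(commands[0]))

# To actually run them (requires CDO on the PATH and an existing out_dir):
# import subprocess
# for cmd in commands:
#     subprocess.run(cmd, check=True)
```

Passing each command as a list avoids shell quoting issues if a file name ever contains spaces.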

Related

How can I use a loop to apply a function to a list of csv files?

I'm trying to loop through all files in a directory and add "indicator" data to them. I had the code working where I could select 1 file and do this, but now am trying to make it work on all files. The problem is when I make the loop it says
ValueError: Invalid file path or buffer object type: <class 'list'>
The goal would be for each loop to read another file from list, make changes, and save file back to folder with changes.
Here is complete code w/o imports. I copied 1 of the "file_path"s from the list and put in comment at bottom.
### open dialog to select file
#file_path = filedialog.askopenfilename()
###create list from dir
listdrs = os.listdir('c:/Users/17409/AppData/Local/Programs/Python/Python38/Indicators/Sentdex Tutorial/stock_dfs/')
###append full path to list
string = 'c:/Users/17409/AppData/Local/Programs/Python/Python38/Indicators/Sentdex Tutorial/stock_dfs/'
listdrs_path = [ string + x for x in listdrs]
print (listdrs_path)
###start loop, for each "file" in listdrs run the 2 functions below and overwrite saved csv.
for file in listdrs_path:
file_path = listdrs_path
data = pd.read_csv(file_path, index_col=0)
########################################
####function 1
def get_price_hist(ticker):
# Put stock price data in dataframe
data = pd.read_csv(file_path)
#listdr = os.listdir('Users\17409\AppData\Local\Programs\Python\Python38\Indicators\Sentdex Tutorial\stock_dfs')
print(listdr)
# Convert date to timestamp and make index
data.index = data["Date"].apply(lambda x: pd.Timestamp(x))
data.drop("Date", axis=1, inplace=True)
return data
df = data
##print(data)
######Indicator data#####################
def get_indicators(data):
# Get MACD
data["macd"], data["macd_signal"], data["macd_hist"] = talib.MACD(data['Close'])
# Get MA10 and MA30
data["ma10"] = talib.MA(data["Close"], timeperiod=10)
data["ma30"] = talib.MA(data["Close"], timeperiod=30)
# Get RSI
data["rsi"] = talib.RSI(data["Close"])
return data
#####end functions#######
data2 = get_indicators(data)
print(data2)
data2.to_csv(file_path)
###################################################
#here is an example of what path from list looks like
#'c:/Users/17409/AppData/Local/Programs/Python/Python38/Indicators/Sentdex Tutorial/stock_dfs/A.csv'

The problem is inside the for loop. The current filename is in the loop variable file, but you pass file_path to pd.read_csv, and file_path has been assigned the whole list (listdrs_path). That is why you get the ValueError. Use the loop variable directly, like this:
### open dialog to select file
#file_path = filedialog.askopenfilename()
###create list from dir
listdrs = os.listdir('c:/Users/17409/AppData/Local/Programs/Python/Python38/Indicators/Sentdex Tutorial/stock_dfs/')
###append full path to list
string = 'c:/Users/17409/AppData/Local/Programs/Python/Python38/Indicators/Sentdex Tutorial/stock_dfs/'
listdrs_path = [ string + x for x in listdrs]
print (listdrs_path)
###start loop, for each "file" in listdrs run the 2 functions below and overwrite saved csv.
for file_path in listdrs_path:
    data = pd.read_csv(file_path, index_col=0)
    ########################################
    ####function 1
    def get_price_hist(ticker):
        # Put stock price data in dataframe
        data = pd.read_csv(file_path)
        #listdr = os.listdir('Users\17409\AppData\Local\Programs\Python\Python38\Indicators\Sentdex Tutorial\stock_dfs')
        print(listdr)
        # Convert date to timestamp and make index
        data.index = data["Date"].apply(lambda x: pd.Timestamp(x))
        data.drop("Date", axis=1, inplace=True)
        return data
    df = data
    ##print(data)
    ######Indicator data#####################
    def get_indicators(data):
        # Get MACD
        data["macd"], data["macd_signal"], data["macd_hist"] = talib.MACD(data['Close'])
        # Get MA10 and MA30
        data["ma10"] = talib.MA(data["Close"], timeperiod=10)
        data["ma30"] = talib.MA(data["Close"], timeperiod=30)
        # Get RSI
        data["rsi"] = talib.RSI(data["Close"])
        return data
    #####end functions#######
    data2 = get_indicators(data)
    print(data2)
    data2.to_csv(file_path)
Let me know if it helps.
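The pattern behind the fix is: iterate over the paths, and inside the loop work with one path at a time. Below is a minimal sketch of that pattern using only the standard library, so it runs without pandas or talib. The function names and the 3-row simple moving average are assumptions for illustration; the average stands in for the talib indicators.

```python
import csv
from pathlib import Path

def add_moving_average(rows, column="Close", window=3):
    """Append a simple moving-average column to a list of row dicts."""
    values = [float(row[column]) for row in rows]
    for i, row in enumerate(rows):
        if i + 1 < window:
            row[f"ma{window}"] = ""  # not enough history yet
        else:
            row[f"ma{window}"] = sum(values[i + 1 - window:i + 1]) / window
    return rows

def process_folder(folder, pattern="*.csv"):
    """Read, modify and overwrite every matching file, one path per iteration."""
    processed = []
    for file_path in sorted(Path(folder).glob(pattern)):
        # file_path is a single path here, never the whole list
        with file_path.open(newline="") as f:
            rows = list(csv.DictReader(f))
        if not rows:
            continue
        rows = add_moving_average(rows)
        with file_path.open("w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
            writer.writeheader()
            writer.writerows(rows)
        processed.append(file_path.name)
    return processed

# Example call (the folder name is hypothetical):
# process_folder("stock_dfs")
```

Defining the helper functions once, outside the loop, keeps the loop body down to read, transform, write.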

Can't convert 'bytes' object to str implicitly for DCM to raw file

I am learning how to convert DCM files to a raw file. I got the code from GitHub:
https://github.com/xiasun/dicom2raw/blob/master/dicom2raw.py
It raises the error "Can't convert 'bytes' object to str implicitly" on the line
"allInOne += dataset.PixelData"
I tried using encode("utf-8"), but that leaves allInOne empty.
By the way, is there any code to generate the .mhd file corresponding to the .raw file?
import dicom
import os
import numpy
import sys
dicomPath = "C:/DataLuna16pen/dcmdata/"
lstFilesDCM = [] # create an empty list
for dirName, subdirList, fileList in os.walk(dicomPath):
    allInOne = ""
    print(subdirList)
    i=0
    for filename in fileList:
        i+=1
        if "".join(filename).endswith((".dcm", ".DCM")):
            path = dicomPath + "".join(filename)
            dataset = dicom.read_file(path)
            for n,val in enumerate(dataset.pixel_array.flat):
                dataset.pixel_array.flat[n] = val / 60
                if val < 0:
                    dataset.pixel_array.flat[n] = 0
            dataset.PixelData = numpy.uint8(dataset.pixel_array).tostring()
            allInOne += dataset.PixelData
            print ("slice " + "".join(filename) + " done ",end=" ")
            print (i)
newFile = open("./all_in_one.raw", "wb")
newFile.write(allInOne)
newFile.close()
print ("RAW file generated")

There are several things:
1) At the time of this answer, PyDicom did not read compressed DICOMs (lossless JPEG) properly. Check the Transfer Syntax of your files to see whether this is the case. As a workaround you can decompress them first with dcmdjpeg (a DCMTK tool).
2) Do not treat the byte array as a string. numpy's tostring() actually returns bytes, so allInOne should start as b"" rather than "" for the += to work in Python 3.
3) For writing .mha/.mhd files, take a look at MedPy. You can also use ITK directly, either through its Python wrapper or through SimpleITK, a simplified interface to ITK.
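To illustrate the bytes point and the .mhd question together, here is a minimal sketch that accumulates slices in a bytearray and writes a small MetaImage header by hand. It is not the MedPy or SimpleITK route mentioned above; the function names are hypothetical, plain Python lists stand in for the DICOM pixel arrays, and ElementSpacing is left out, so readers will fall back to their default spacing.

```python
import os

def slice_to_bytes(pixels, divisor=60):
    """Scale pixel values by divisor, clamp to 0..255 and return them as bytes."""
    return bytes(min(255, max(0, int(v / divisor))) for v in pixels)

def write_raw_and_mhd(slices, width, height, raw_path, mhd_path):
    """Write all slices to one .raw file plus a matching .mhd header."""
    volume = bytearray()  # a bytes container, not a str
    for pixels in slices:
        volume += slice_to_bytes(pixels)
    with open(raw_path, "wb") as raw_file:
        raw_file.write(volume)
    header = [
        "ObjectType = Image",
        "NDims = 3",
        "BinaryData = True",
        "BinaryDataByteOrderMSB = False",
        f"DimSize = {width} {height} {len(slices)}",
        "ElementType = MET_UCHAR",
        # The header refers to the raw file by name, so keep both in one folder
        f"ElementDataFile = {os.path.basename(raw_path)}",
    ]
    with open(mhd_path, "w") as mhd_file:
        mhd_file.write("\n".join(header) + "\n")
    return len(volume)

# Example call with two hypothetical 2x2 slices:
# write_raw_and_mhd([[0, 60, 120, -30], [6000, 30000, 59, 61]], 2, 2,
#                   "all_in_one.raw", "all_in_one.mhd")
```

A bytearray is extended in place, which also avoids building a new bytes object for every slice.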

How to split a PDF every n page using PyPDF2?

I'm trying to learn how to split a pdf every n page.
In my case I want to split a 64p PDF into several chunks containing four pages each: file 1: p.1-4, file 2: p.5-8 etc.
I'm trying to understand PyPDF2 but my noobness overwhelms me:
from PyPDF2 import PdfFileWriter, PdfFileReader
pdf = PdfFileReader('my_pdf.pdf')
I guess I need to make a loop of sorts using addPage and write files till there's no pages left?

Little late but I ran into your question while looking for help trying to do the same thing.
I ended up doing the following, which does what you're asking. Mind you it's probably more than you're asking for, but the answer is in there. It's a rough first draft, in heavy need of refactoring and some variable renaming.
import os
from PyPDF2 import PdfFileReader, PdfFileWriter

def split_pdf(in_pdf, step=1):
    """Splits a given pdf into separate pdfs and saves
    those to a subfolder of the parent pdf's folder, called
    <filename>_splitted.
    Arguments:
        in_pdf: [str] Absolute path (and filename) of the
            input pdf or just the filename, if the file
            is in the current directory.
        step: [int] Desired number of pages in each of the
            output pdfs.
    Returns:
        [int] Number of pdf files written.
    """
    #TODO: Add choice for output dir
    #TODO: Add logging instead of prints
    count = 0
    try:
        with open(in_pdf, 'rb') as in_file:
            input_pdf = PdfFileReader(in_file)
            num_pages = input_pdf.numPages
            input_dir, filename = os.path.split(in_pdf)
            filename = os.path.splitext(filename)[0]
            output_dir = os.path.join(input_dir, filename + "_splitted")
            os.makedirs(output_dir, exist_ok=True)
            naming = f'{filename}_p'
            for start in range(0, num_pages, step):
                # The last chunk may hold fewer than `step` pages
                end = min(start + step, num_pages)
                output_pdf = PdfFileWriter()
                for i in range(start, end):
                    output_pdf.addPage(input_pdf.getPage(i))
                nums = f'{start + 1}' if end - start == 1 else f'{start + 1}-{end}'
                out_name = f'{naming}{nums}.pdf'
                with open(os.path.join(output_dir, out_name), 'wb') as outfile:
                    output_pdf.write(outfile)
                print(f'{out_name} written to {output_dir}')
                count += 1
        print(f'{count} pdf files written to {output_dir}')
    except FileNotFoundError:
        print('Cannot find the specified file. Check your input:', in_pdf)
    return count
Hope it helps you.
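The core of the splitting logic is working out which page indices belong to each output file. A minimal sketch of just that step, independent of PyPDF2 (the function name page_ranges is an assumption for illustration):

```python
def page_ranges(num_pages, step):
    """Return (start, end) index pairs, end exclusive, covering all pages."""
    if step < 1:
        raise ValueError("step must be at least 1")
    return [(start, min(start + step, num_pages))
            for start in range(0, num_pages, step)]

# The 64-page document from the question, split every 4 pages
chunks = page_ranges(64, 4)
print(len(chunks), chunks[0], chunks[-1])  # prints: 16 (0, 4) (60, 64)
```

Each pair can then drive one writer: add pages start to end - 1 and save the file. The min() call is what lets the last chunk be shorter when the page count is not a multiple of step.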

from PyPDF2 import PdfFileReader, PdfFileWriter
import os

# Method to split the pdf at every given n pages.
# It belongs to a class that also provides write_to_file (not shown in this answer).
def split_at_every(self, infile, step=1):
    # Open the input file and read the number of pages
    input_pdf = PdfFileReader(open(infile, "rb"))
    pdf_len = input_pdf.numPages
    # Get the complete file name along with its path and split the text to take only the first part.
    fname = os.path.splitext(os.path.basename(infile))[0]
    # Get the list of page numbers in the order of given step
    # If there are 10 pages in a pdf, and the step is 2
    # page_numbers = [0,2,4,6,8]
    page_numbers = list(range(0, pdf_len, step))
    # Loop through the pdf pages
    for ind, val in enumerate(page_numbers):
        # Check if the index is last in the given page numbers
        # If the index is not the last one, carry on with the If block.
        if ind + 1 != len(page_numbers):
            # Initialize the PDF Writer
            output_1 = PdfFileWriter()
            # Loop through the pdf pages starting from the value of current index till the value of next index
            # Ex : page numbers = [0,2,4,6,8]
            # If the current index is 0, loop from 1st page till the 2nd page in the pdf doc.
            for page in range(page_numbers[ind], page_numbers[ind + 1]):
                # Get the data from the given page number
                page_data = input_pdf.getPage(page)
                # Add the page data to the pdf_writer
                output_1.addPage(page_data)
            # Frame the output file name (uses the last page of the chunk)
            output_1_filename = '{}_page_{}.pdf'.format(fname, page + 1)
            # Write the output content to the file and save it.
            self.write_to_file(output_1_filename, output_1)
        else:
            output_final = PdfFileWriter()
            output_final_filename = "Last_Pages"
            # Loop through the pdf pages starting from the value of current index till the last page of the pdf doc.
            # Ex : page numbers = [0,2,4,6,8]
            # If the current index is 8, loop from 8th page till the last page in the pdf doc.
            for page in range(page_numbers[ind], pdf_len):
                # Get the data from the given page number
                page_data = input_pdf.getPage(page)
                # Add the page data to the pdf_writer
                output_final.addPage(page_data)
            # Frame the output file name
            output_final_filename = '{}_page_{}.pdf'.format(fname, page + 1)
            # Write the output content to the file and save it.
            self.write_to_file(output_final_filename, output_final)

Adding all file feature data (shapefiles) from folder into an MXD with ArcPy

I want to ask about scripting using ArcPy for handling feature data inside an ArcGIS map document (MXD).
I have a folder that has some feature data in shapefile (shp) form.
D:\tes\2240.shp
D:\tes\2250.shp
D:\tes\22460.shp
etc.
I want to create an ArcPy script that can add the data above to an MXD. I can add files individually using this script:
import arcpy
mxd = arcpy.mapping.MapDocument(r"D:\tes\Operation.mxd")
df = arcpy.mapping.ListDataFrames(mxd, "Layers")[0]
targetGroupLayer = arcpy.mapping.ListLayers(mxd, "Actual", df)[0]
addLayer = arcpy.mapping.Layer(r"D:\data\2440.shp")
arcpy.mapping.AddLayerToGroup(df, targetGroupLayer, addLayer, "TOP")
addLayer = arcpy.mapping.Layer(r"D:\data\2450.shp")
arcpy.mapping.AddLayerToGroup(df, targetGroupLayer, addLayer, "TOP")
addLayer = arcpy.mapping.Layer(r"D:\data\2460.shp")
arcpy.mapping.AddLayerToGroup(df, targetGroupLayer, addLayer, "TOP")
mxd.saveACopy(r"D:\tes\Operation_2.mxd")
del mxd, addLayer
I want to change the path source data of the script above, at this part
addLayer = arcpy.mapping.Layer(r"D:\data\2440.shp")
so the script can add all shp data in the folder using the extension, not each file name hardcoded. Something kind of like this:
addLayer = arcpy.mapping.Layer(r"D:\data\*.shp")
What's the proper way to do that?

Create a list of all the shapefiles in the directory, then loop through it.
import os
import arcpy
mxd = arcpy.mapping.MapDocument(r"D:\tes\Operation.mxd")
df = arcpy.mapping.ListDataFrames(mxd, "Layers")[0]
# set workspace to directory of interest
arcpy.env.workspace = r"D:\data"
# create list of all files ending in .shp
list_shapefiles = arcpy.ListFiles("*.shp")
targetGroupLayer = arcpy.mapping.ListLayers(mxd, "Actual", df)[0]
# loop through list, adding each shapefile to group layer
for shapefile in list_shapefiles:
    # AddLayerToGroup expects a Layer object, not a file name
    addLayer = arcpy.mapping.Layer(os.path.join(arcpy.env.workspace, shapefile))
    arcpy.mapping.AddLayerToGroup(df, targetGroupLayer, addLayer, "TOP")
mxd.saveACopy(r"D:\tes\Operation_2.mxd")
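If you want to check which files the wildcard will pick up before touching the map document, the listing step can be tried outside ArcGIS. A minimal sketch using only the standard library; the function name list_shapefiles is an assumption, and it is a stand-in for arcpy.ListFiles, not a replacement for the arcpy.mapping calls.

```python
import glob
import os

def list_shapefiles(folder):
    """Return the full paths of all .shp files in folder, sorted by name."""
    return sorted(glob.glob(os.path.join(folder, "*.shp")))

# Example call (path taken from the question):
# for path in list_shapefiles(r"D:\data"):
#     print(path)
```

Because it returns full paths, each entry could be passed straight to arcpy.mapping.Layer inside the loop.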

How to set the "band description" option/tag of a GeoTIFF file using GDAL (gdalwarp/gdal_translate)

Does anybody know how to change or set the "Description" option/tag of a GeoTIFF file using GDAL?
To specify what I mean, this is an example of gdalinfo return from a GeoTIFF file with set "Description":
Band 1 Block=64x64 Type=UInt16, ColorInterp=Undefined
Description = AVHRR Channel 1: 0.58 micrometers -- 0.68 micrometers
Min=0.000 Max=814.000
Minimum=0.000, Maximum=814.000, Mean=113.177, StdDev=152.897
Metadata:
LAYER_TYPE=athematic
STATISTICS_MAXIMUM=814
STATISTICS_MEAN=113.17657236931
STATISTICS_MINIMUM=0
STATISTICS_STDDEV=152.89720574652
In the example you can see: Description = AVHRR Channel 1: 0.58 micrometers -- 0.68 micrometers
How do I set this parameter using GDAL?

In Python you can set the band description like this:
from osgeo import gdal, osr
import numpy
# Define output image name, size and projection info:
OutputImage = 'test.tif'
SizeX = 20
SizeY = 20
CellSize = 1
X_Min = 563220.0
Y_Max = 699110.0
N_Bands = 10
srs = osr.SpatialReference()
srs.ImportFromEPSG(2157)
srs = srs.ExportToWkt()
GeoTransform = (X_Min, CellSize, 0, Y_Max, 0, -CellSize)
# Create the output image:
Driver = gdal.GetDriverByName('GTiff')
Raster = Driver.Create(OutputImage, SizeX, SizeY, N_Bands, 2) # Datatype = 2 same as gdal.GDT_UInt16
Raster.SetProjection(srs)
Raster.SetGeoTransform(GeoTransform)
# Iterate over each band
for band in range(N_Bands):
    BandNumber = band + 1
    BandName = 'SomeBandName ' + str(BandNumber).zfill(3)
    RasterBand = Raster.GetRasterBand(BandNumber)
    RasterBand.SetNoDataValue(0)
    RasterBand.SetDescription(BandName) # This sets the band name!
    # NumPy arrays are (rows, columns), i.e. (SizeY, SizeX)
    RasterBand.WriteArray(numpy.ones((SizeY, SizeX)))
# close the output image
Raster = None
print("Done.")
Unfortunately, I'm not sure if ArcGIS or QGIS are able to read the band descriptions. However, the band names are clearly visible in TuiView.

GDAL includes a python application called gdal_edit.py which can be used to modify the metadata of a file in place. I am not familiar with the Description field you are referring to, but this tool should be the one to use.
Here is the man page: gdal_edit.py
Here is an example script using an ortho-image I downloaded from the USGS Earth-Explorer.
#!/bin/sh
# Image to modify
IMAGE_PATH='11skd505395.tif'
# Field to modify
IMAGE_FIELD='TIFFTAG_IMAGEDESCRIPTION'
# Print the tiff image description tag
gdalinfo $IMAGE_PATH | grep $IMAGE_FIELD
# Change the Field
CMD="gdal_edit.py -mo ${IMAGE_FIELD}='Lake-Tahoe' $IMAGE_PATH"
echo $CMD
$CMD
# Print the new field value
gdalinfo $IMAGE_PATH | grep $IMAGE_FIELD
Output
$ ./gdal-script.py
TIFFTAG_IMAGEDESCRIPTION=OrthoVista
gdal_edit.py -mo TIFFTAG_IMAGEDESCRIPTION='Lake-Tahoe' 11skd505395.tif
TIFFTAG_IMAGEDESCRIPTION='Lake-Tahoe'
Here is another link that should provide useful info.
https://gis.stackexchange.com/questions/111610/how-to-overwrite-metadata-in-a-tif-file-with-gdal

Here's a single purpose python commandline script to edit band description in place.
''' Set image band description to specified text'''
import os
import sys
from osgeo import gdal
gdal.UseExceptions()
if len(sys.argv) < 4:
    print(f"Usage: {sys.argv[0]} [in_file] [band#] [text]")
    sys.exit(1)
infile = sys.argv[1] # source filename and path
inband = int(sys.argv[2]) # source band number
descrip = sys.argv[3] # description text
data_in = gdal.Open(infile, gdal.GA_Update)
band_in = data_in.GetRasterBand(inband)
old_descrip = band_in.GetDescription()
band_in.SetDescription(descrip)
new_descrip = band_in.GetDescription()
# de-reference the band and the dataset, which triggers gdal to save
band_in = None
data_in = None
print(f"Description was: {old_descrip}")
print(f"Description now: {new_descrip}")
In use:
$ python scripts\gdal-edit-band-desc.py test-edit.tif 1 "Red please"
Description was:
Description now: Red please
$ gdal-edit-band-desc test-edit.tif 1 "Red please also"
$ python t:\ENV.558\scripts\gdal-edit-band-desc.py test-edit.tif 1 "Red please also"
Description was: Red please
Description now: Red please also
Properly it should be added to gdal_edit.py, but I don't know enough to feel safe adding it directly.

gdal_edit.py with the -mo flag can be used to edit the band descriptions, with the bands numbered starting from 1:
gdal_edit.py -mo BAND_1=AVHRR_Channel_1_p58_p68_um -mo BAND_2=AVHRR_Channel_2 avhrr.tif
I didn't try it with the special characters but that might work if you use the right quotes.
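On the quoting question: one way to sidestep shell quoting is to build the argument list in Python and hand it to subprocess, so spaces and colons in the value need no escaping. This is a minimal sketch; the helper name build_gdal_edit_args is an assumption, only the -mo flag shown above is used, and whether a BAND_1 item shows up as the band Description is not verified here.

```python
import shlex

def build_gdal_edit_args(image_path, metadata):
    """Return the argument list for gdal_edit.py with one -mo KEY=VALUE per item."""
    args = ["gdal_edit.py"]
    for key, value in metadata.items():
        args += ["-mo", f"{key}={value}"]
    args.append(image_path)
    return args

args = build_gdal_edit_args(
    "avhrr.tif",
    {"BAND_1": "AVHRR Channel 1: 0.58 micrometers -- 0.68 micrometers"},
)
# shlex.join shows how the command would have to be quoted in a POSIX shell
print(shlex.join(args))

# To run it (requires GDAL's gdal_edit.py on the PATH):
# import subprocess
# subprocess.run(args, check=True)
```

With a list of arguments no shell is involved, so the description text reaches gdal_edit.py exactly as written.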
