Exporting a 3D numpy to a VTK file for viewing in Paraview/Mayavi - numpy

For those that want to export a simple 3D numpy array (along with axes) to a .vtk (or .vtr) file for post-processing and display in Paraview or Mayavi there's a little module called PyEVTK that does exactly that. The module supports structured and unstructured data etc..
Unfortunately, even though the code works fine in unix-based systems I couldn't make it work (keeps crashing) on any windows installation which simply makes things complicated. Ive contacted the developer but his suggestions did not work
Therefore my question is:
How can one use the from vtk.util import numpy_support function to export a 3D array (the function itself doesn't support 3D arrays) to a .vtk file? Is there a simple way to do it without creating vtkDatasets etc etc?
Thanks a lot!

It's been forever and I had entirely forgotten asking this question but I ended up figuring it out. I've written a post about it in my blog (PyScience) providing a tutorial on how to convert between NumPy and VTK. Do take a look if interested:
pyscience.wordpress.com/2014/09/06/numpy-to-vtk-converting-your-numpy-arrays-to-vtk-arrays-and-files/

It's not a direct answer to your question, but if you have tvtk (if you have mayavi, you should have it), you can use it to write your data to vtk format. (See: http://code.enthought.com/projects/files/ETS3_API/enthought.tvtk.misc.html )
It doesn't use PyEVTK, and it supports a broad range of data sources (more than just structured and unstructured grids), so it will probably work where other things aren't.
As a quick example (Mayavi's mlab interface can make this much less verbose, especially if you're already using it.):
import numpy as np
from enthought.tvtk.api import tvtk, write_data
data = np.random.random((10,10,10))
grid = tvtk.ImageData(spacing=(10, 5, -10), origin=(100, 350, 200),
dimensions=data.shape)
grid.point_data.scalars = np.ravel(order='F')
grid.point_data.scalars.name = 'Test Data'
# Writes legacy ".vtk" format if filename ends with "vtk", otherwise
# this will write data using the newer xml-based format.
write_data(grid, 'test.vtk')
And a portion of the output file:
# vtk DataFile Version 3.0
vtk output
ASCII
DATASET STRUCTURED_POINTS
DIMENSIONS 10 10 10
SPACING 10 5 -10
ORIGIN 100 350 200
POINT_DATA 1000
SCALARS Test%20Data double
LOOKUP_TABLE default
0.598189 0.228948 0.346975 0.948916 0.0109774 0.30281 0.643976 0.17398 0.374673
0.295613 0.664072 0.307974 0.802966 0.836823 0.827732 0.895217 0.104437 0.292796
0.604939 0.96141 0.0837524 0.498616 0.608173 0.446545 0.364019 0.222914 0.514992
...
...

TVTK of Mayavi has a beautiful way of writing vtk files. Here is a test example I have written for myself following #Joe and tvtk documentation. The advantage it has over evtk, is the support for both ascii and html.Hope it will help other people.
from tvtk.api import tvtk, write_data
import numpy as np
#data = np.random.random((3, 3, 3))
#
#i = tvtk.ImageData(spacing=(1, 1, 1), origin=(0, 0, 0))
#i.point_data.scalars = data.ravel()
#i.point_data.scalars.name = 'scalars'
#i.dimensions = data.shape
#
#w = tvtk.XMLImageDataWriter(input=i, file_name='spoints3d.vti')
#w.write()
points = np.array([[0,0,0], [1,0,0], [1,1,0], [0,1,0]], 'f')
(n1, n2) = points.shape
poly_edge = np.array([[0,1,2,3]])
print n1, n2
## Scalar Data
#temperature = np.array([10., 20., 30., 40.])
#pressure = np.random.rand(n1)
#
## Vector Data
#velocity = np.random.rand(n1,n2)
#force = np.random.rand(n1,n2)
#
##Tensor Data with
comp = 5
stress = np.random.rand(n1,comp)
#
#print stress.shape
## The TVTK dataset.
mesh = tvtk.PolyData(points=points, polys=poly_edge)
#
## Data 0 # scalar data
#mesh.point_data.scalars = temperature
#mesh.point_data.scalars.name = 'Temperature'
#
## Data 1 # additional scalar data
#mesh.point_data.add_array(pressure)
#mesh.point_data.get_array(1).name = 'Pressure'
#mesh.update()
#
## Data 2 # Vector data
#mesh.point_data.vectors = velocity
#mesh.point_data.vectors.name = 'Velocity'
#mesh.update()
#
## Data 3 additional vector data
#mesh.point_data.add_array( force)
#mesh.point_data.get_array(3).name = 'Force'
#mesh.update()
mesh.point_data.tensors = stress
mesh.point_data.tensors.name = 'Stress'
# Data 4 additional tensor Data
#mesh.point_data.add_array(stress)
#mesh.point_data.get_array(4).name = 'Stress'
#mesh.update()
write_data(mesh, 'polydata.vtk')
# XML format
# Method 1
#write_data(mesh, 'polydata')
# Method 2
#w = tvtk.XMLPolyDataWriter(input=mesh, file_name='polydata.vtk')
#w.write()

I know it is a bit late and I do love your tutorials #somada141. This should work too.
def numpy2VTK(img, spacing=[1.0, 1.0, 1.0]):
# evolved from code from Stou S.,
# on http://www.siafoo.net/snippet/314
# This function, as the name suggests, converts numpy array to VTK
importer = vtk.vtkImageImport()
img_data = img.astype('uint8')
img_string = img_data.tostring() # type short
dim = img.shape
importer.CopyImportVoidPointer(img_string, len(img_string))
importer.SetDataScalarType(VTK_UNSIGNED_CHAR)
importer.SetNumberOfScalarComponents(1)
extent = importer.GetDataExtent()
importer.SetDataExtent(extent[0], extent[0] + dim[2] - 1,
extent[2], extent[2] + dim[1] - 1,
extent[4], extent[4] + dim[0] - 1)
importer.SetWholeExtent(extent[0], extent[0] + dim[2] - 1,
extent[2], extent[2] + dim[1] - 1,
extent[4], extent[4] + dim[0] - 1)
importer.SetDataSpacing(spacing[0], spacing[1], spacing[2])
importer.SetDataOrigin(0, 0, 0)
return importer
Hope it helps!

Here's a SimpleITK version with the function load_itk taken from here:
import SimpleITK as sitk
import numpy as np
if len(sys.argv)<3:
print('Wrong number of arguments.', file=sys.stderr)
print('Usage: ' + __file__ + ' input_sitk_file' + ' output_sitk_file', file=sys.stderr)
sys.exit(1)
def quick_read(filename):
# Read image information without reading the bulk data.
file_reader = sitk.ImageFileReader()
file_reader.SetFileName(filename)
file_reader.ReadImageInformation()
print('image size: {0}\nimage spacing: {1}'.format(file_reader.GetSize(), file_reader.GetSpacing()))
# Some files have a rich meta-data dictionary (e.g. DICOM)
for key in file_reader.GetMetaDataKeys():
print(key + ': ' + file_reader.GetMetaData(key))
def load_itk(filename):
# Reads the image using SimpleITK
itkimage = sitk.ReadImage(filename)
# Convert the image to a numpy array first and then shuffle the dimensions to get axis in the order z,y,x
data = sitk.GetArrayFromImage(itkimage)
# Read the origin of the ct_scan, will be used to convert the coordinates from world to voxel and vice versa.
origin = np.array(list(reversed(itkimage.GetOrigin())))
# Read the spacing along each dimension
spacing = np.array(list(reversed(itkimage.GetSpacing())))
return data, origin, spacing
def convert(data, output_filename):
image = sitk.GetImageFromArray(data)
writer = sitk.ImageFileWriter()
writer.SetFileName(output_filename)
writer.Execute(image)
def wait():
print('Press Enter to load & convert or exit using Ctrl+C')
input()
quick_read(sys.argv[1])
print('-'*20)
wait()
data, origin, spacing = load_itk(sys.argv[1])
convert(sys.argv[2])

Related

passing panda dataframe data to functions and its not outputting the results

In my code, I am trying to extract data from csv file to use in the function, but it doesnt output anything, and gives no error. My code works because I tried it with just numpy array as inputs. not sure why it doesnt work with panda.
import numpy as np
import pandas as pd
import os
# change the current directory to the directory where the running script file is
os.chdir(os.path.dirname(os.path.abspath(__file__)))
# finding best fit line for y=mx+b by iteration
def gradient_descent(x,y):
m_iter = b_iter = 1 #starting point
iteration = 10000
n = len(x)
learning_rate = 0.05
last_mse = 10000
#take baby steps to reach global minima
for i in range(iteration):
y_predicted = m_iter*x + b_iter
#mse = 1/n*sum([value**2 for value in (y-y_predicted)]) # cost function to minimize
mse = 1/n*sum((y-y_predicted)**2) # cost function to minimize
if (last_mse - mse)/mse < 0.001:
break
# recall MSE formula is 1/n*sum((yi-y_predicted)^2), where y_predicted = m*x+b
# using partial deriv of MSE formula, d/dm and d/db
dm = -(2/n)*sum(x*(y-y_predicted))
db = -(2/n)*sum((y-y_predicted))
# use current predicted value to get the next value for prediction
# by using learning rate
m_iter = m_iter - learning_rate*dm
b_iter = b_iter - learning_rate*db
print('m is {}, b is {}, cost is {}, iteration {}'.format(m_iter,b_iter,mse,i))
last_mse = mse
#x = np.array([1,2,3,4,5])
#y = np.array([5,7,8,10,13])
#gradient_descent(x,y)
df = pd.read_csv('Linear_Data.csv')
x = df['Area']
y = df['Price']
gradient_descent(x,y)
My code works because I tried it with just numpy array as inputs. not sure why it doesnt work with panda.
Well no, your code also works with pandas dataframes:
df = pd.DataFrame({'Area': [1,2,3,4,5], 'Price': [5,7,8,10,13]})
x = df['Area']
y = df['Price']
gradient_descent(x,y)
Above will give you the same output as with numpy arrays.
Try to check what's in Linear_Data.csv and/or add some print statements in the gradient_descent function just to check your assumptions. I would suggest to first of all add a print statement before the condition with the break statement:
print(last_mse, mse)
if (last_mse - mse)/mse < 0.001:
break

Idiomatic tensorflow expression of for-loop which considers value of preceding iteration

It might not even be possible, but I would like to express the following code without the for-loop.
tf.scan is prohibitively slow and therefore not a good solution.
I am perfectly happy to accept any answer which gives a solution or an argument why this is not possible.
import tensorflow as tf
import matplotlib.pyplot as plt
# Some data
random_series = tf.reshape(tf.math.cumsum(tf.random.normal([100])),[1,-1])
# a mesh_grid of "co-distance"
random_mesh_gain = 1 - tf.matmul(random_series,tf.math.reciprocal(random_series), True, False)
# lower triangular matrix
random_tri = tf.linalg.band_part(random_mesh_gain, 0, -1)
# some lambda working on a series-type data
drop = 0.5
lambda_map = lambda series: tf.math.floor(series/interval)*interval - drop
# apply map
random_img = tf.map_fn(lambda_map, random_tri)
# init preceding and output list sl_full
preceding = - drop * tf.ones(tf.transpose(random_img).shape[0],dtype=tf.float32)
sl_full = []
for a in tf.transpose(random_img):
prec_a_max = tf.reduce_max(tf.stack([preceding, a]),axis=0)
preceding = prec_a_max
sl_full.append(prec_a_max)
# create tensor from list
sl_full = tf.transpose(tf.stack(sl_full))
plt.imshow(sl_full,origin="lower")

Why does histogram equalization on a 16-bit image show a strange result?

I have a 16-bit image which I want to rescale to 8-bit while achieving a high contrast. Now I tried histogram equalization as follows:
image_equ = cv.equalizeHist(cv_image.astype(np.uint8))
But the output is super strange:
What is happening? Is the rescaling to 8-bit first maybe the problem?
cv2.equalizeHist does not support uint16 input, and cv_image.astype(np.uint8) results overflows.
The solution is using different library, or implement the equalization using NumPy.
We can find the NumPy implementation of uint8 equalization in the OpenCV documentation:
Histograms - 2: Histogram Equalization
We can adjust the code (using NumPy) for uint16 input and output:
Replace 256 with 65536 (256 = 2^8 and 65536 = 2^16).
Replace 255 with 65535.
Replace uint8 with uint16.
Assuming the original code is correct, the following should work for uint16:
hist, bins = np.histogram(img.flatten(), 65536, [0, 65536]) # Collect 16 bits histogram (65536 = 2^16).
cdf = hist.cumsum()
cdf_m = np.ma.masked_equal(cdf, 0) # Find the minimum histogram value (excluding 0)
cdf_m = (cdf_m - cdf_m.min())*65535/(cdf_m.max()-cdf_m.min())
cdf = np.ma.filled(cdf_m,0).astype('uint16')
# Now we have the look-up table...
img2 = cdf[img]
Complete code sample (building sample 16 bits input):
import cv2
import numpy as np
# Build sample input for testing.
################################################################################
img = cv2.imread('chelsea.png', cv2.IMREAD_GRAYSCALE) # Read sample input image.
cv2.imshow('img', img) # Show input for testing.
img = img.astype(np.uint16) * 16 + 1000 # Make the image 16 bit, but the pixels range is going to be [1000, 5080] not full range (for example).
################################################################################
#equ = cv2.equalizeHist(img) # error: (-215:Assertion failed) _src.type() == CV_8UC1 in function 'cv::equalizeHist'
# https://docs.opencv.org/4.x/d5/daf/tutorial_py_histogram_equalization.html
hist, bins = np.histogram(img.flatten(), 65536, [0, 65536]) # Collect 16 bits histogram (65536 = 2^16).
cdf = hist.cumsum()
cdf_m = np.ma.masked_equal(cdf, 0) # Find the minimum histogram value (excluding 0)
cdf_m = (cdf_m - cdf_m.min())*65535/(cdf_m.max()-cdf_m.min())
cdf = np.ma.filled(cdf_m,0).astype('uint16')
# Now we have the look-up table...
equ = cdf[img]
# Show result for testing.
cv2.imshow('equ', equ)
cv2.waitKey()
cv2.destroyAllWindows()
Input (before scaling to 16 bits):
Output:

Getting data from odo.resource(source) to odo.resource(target)

I'm trying to extend the odo library with functionality to convert a GDAL dataset (raster with spatial information) to a NetCDF file.
Reading in the gdal dataset goes fine. But in the creation stage of the netcdf I need some metadata of the gdal dataset (metadata that is not know yet when calling odo.odo(source,target) ). How could I achieve this?
a short version of my code so far:
import odo
from odo import resource, append
import gdal
import netCDF4 as nc4
import numpy as np
#resource.register('.+\.tif')
def resource_gdal(uri, **kwargs):
ds = gdal.Open(uri)
# metadata I need to transfer to netcdf
b = ds.GetGeoTransform() #bbox, interval
return ds
#resource.register('.+\.nc')
def resource_netcdf(uri, dshape=None, **kwargs):
ds = nc4.Dataset(uri,'w')
# create lat lon dimensions and variables
ds.createDimension(lat, dshape[0].val)
ds.createDimension(lon, dshape[1].val)
lat = ds.createVariable('lat','f4', ('lat',))
lon = ds.createVariable('lon','f4', ('lon',))
# create a range from the **gdal metadata**
lat_array = np.arange(dshape[0].val)*b[1]+b[0]
lon_array = np.arange(dshape[1].val)*b[5]+b[3]
# assign the range to the netcdf variable
lat[:] = lat_array
lon[:] = lon_array
# create the variable which will hold the gdal data
data = ds.createVariable('data', 'f4', ('lat', 'lon',))
return data
#append.register(nc4.Variable, gdal.Dataset)
def append_gdal_to_nc4(tgt, src, **kwargs):
arr = src.ReadAsArray()
tgt[:] = arr
return tgt
Thanks!
I don't have much experience with odo, but from browsing the source code and docs it looks like resource_netcdf() should not be involved in translating gdal data to netcdf. Translating should be the job of a gdal_to_netcdf() function decorated by convert.register. In such a case, the gdal.Dataset object returned by resource_gdal would have all sufficient information (georeferencing, pixel size) to make a netcdf.

live plotting using pyserial and matplotlib

I can capture data from serial device via pyserial, at this time I can only export data to text file, the text file has format like below, it's have 3 columns
>21 21 0
>
>41 41 0.5
>
>73 73 1
>
....
>2053 2053 5
>
>2084 2084 5.5
>
>2125 2125 6
Now I want to use matplotlib to generate live graph has 2 figure (x,y) x,y are second and third column, first comlumn,'>', and lines don't have data can be remove
thank folks!
============================
Update : today, after follow these guides from
http://www.blendedtechnologies.com/realtime-plot-of-arduino-serial-data-using-python/231
http://eli.thegreenplace.net/2008/08/01/matplotlib-with-wxpython-guis
pyserial - How to read the last line sent from a serial device
now I can live plot with threading but eliben said that this Guis only plot single value each time, that lead to me the very big limitation, beacause my purpose is plotting 2 or 3 column, here is the code was modified from blendedtechnologies
Here is serial handler :
from threading import Thread
import time
import serial
last_received = ''
def receiving(ser):
global last_received
buffer = ''
while True:
buffer = buffer + ser.read(ser.inWaiting())
if '\n' in buffer:
lines = buffer.split('\n') # Guaranteed to have at least 2 entries
last_received = lines[-2]
#If the Arduino sends lots of empty lines, you'll lose the
#last filled line, so you could make the above statement conditional
#like so: if lines[-2]: last_received = lines[-2]
buffer = lines[-1]
class SerialData(object):
def __init__(self, init=50):
try:
self.ser = ser = serial.Serial(
port='/dev/ttyS0',
baudrate=9600,
bytesize=serial.EIGHTBITS,
parity=serial.PARITY_NONE,
stopbits=serial.STOPBITS_ONE,
timeout=0.1,
xonxoff=0,
rtscts=0,
interCharTimeout=None
)
except serial.serialutil.SerialException:
#no serial connection
self.ser = None
else:
Thread(target=receiving, args=(self.ser,)).start()
def next(self):
if not self.ser:
return 100 #return anything so we can test when Arduino isn't connected
#return a float value or try a few times until we get one
for i in range(40):
raw_line = last_received[1:].split(' ').pop(0)
try:
return float(raw_line.strip())
except ValueError:
print 'bogus data',raw_line
time.sleep(.005)
return 0.
def __del__(self):
if self.ser:
self.ser.close()
if __name__=='__main__':
s = SerialData()
for i in range(500):
time.sleep(.015)
print s.next()
For me I modified this segment so it can grab my 1st column data
for i in range(40):
raw_line = last_received[1:].split(' ').pop(0)
try:
return float(raw_line.strip())
except ValueError:
print 'bogus data',raw_line
time.sleep(.005)
return 0.
and generate graph base on these function on the GUI file
from Arduino_Monitor import SerialData as DataGen
def __init__(self):
wx.Frame.__init__(self, None, -1, self.title)
self.datagen = DataGen()
self.data = [self.datagen.next()]
................................................
def init_plot(self):
self.dpi = 100
self.fig = Figure((3.0, 3.0), dpi=self.dpi)
self.axes = self.fig.add_subplot(111)
self.axes.set_axis_bgcolor('black')
self.axes.set_title('Arduino Serial Data', size=12)
pylab.setp(self.axes.get_xticklabels(), fontsize=8)
pylab.setp(self.axes.get_yticklabels(), fontsize=8)
# plot the data as a line series, and save the reference
# to the plotted line series
#
self.plot_data = self.axes.plot(
self.data,
linewidth=1,
color=(1, 1, 0),
)[0]
So my next question is how to realtime grab at least 2 column and passing 2 columns'data to the GUIs that it can generate graph with 2 axis.
self.plot_data.set_xdata(np.arange(len(self.data))) #my 3rd column data
self.plot_data.set_ydata(np.array(self.data)) #my 2nd column data
Well, this reads your string and converts the numbers to floats. I assume you'll be able to adapt this as needed.
import numpy as np
import pylab as plt
str = '''>21 21 0
>
>41 41 0.5
>
>73 73 1
>
>2053 2053 5
>
>2084 2084 5.5
>
>2125 2125 6'''
nums = np.array([[float(n) for n in sub[1:].split(' ') if len(n)>0] for sub in str.splitlines() if len(sub)>1])
fig = plt.figure(0)
ax = plt.subplot(2,1,1)
ax.plot(nums[:,0], nums[:,1], 'k.')
ax = plt.subplot(2,1,2)
ax.plot(nums[:,0], nums[:,2], 'r+')
plt.show()
Here you have an Eli Bendersky's example of how plotting data arriving from a serial port
some time back I had the same problem. I wasted a lot of writing same ting over and over again. so I wrote a python package for it.
https://github.com/girish946/plot-cat
you just have to write the logic for obtaining the data from serial port.
the example is here: https://github.com/girish946/plot-cat/blob/master/examples/test-ser.py