Argparse passing constants and variables - argparse

I'm trying to use argparse to build a scalable SNMP check for Nagios.
The issue I'm running into is passing both constants and variables through add_argument().
Example:
./SNMP.py -j 10 20 -l
-j would store the str ".1.5.5.8"
the arguments after would set the warn integer level and the critical integer level bypassing the defaults set in parser.add_argument()
-l would store a different OID str but would use the default warn and critical levels stored in parser.add_argument()
Thanks!
In short, this is the code I have to get around the problem:
import argparse
parser = argparse.ArgumentParser(description = "This is used to parse latency, jitter, and packet loss on an HDX")
parser.add_argument("-j", action = 'append', dest = 'jitter',
                    default = [".2.51.5.9.4","20 40"])
args = parser.parse_args()
warn, crit = args.jitter[-1].split()

In [16]: parser=argparse.ArgumentParser()
In [17]: parser.add_argument("-j", action = 'append', dest = 'jitter',
...: default = [".2.51.5.9.4","20 40"])
Out[17]: _AppendAction(option_strings=['-j'], dest='jitter', nargs=None, const=None, default=['.2.51.5.9.4', '20 40'], type=None, choices=None, help=None, metavar=None)
In [18]: parser.parse_args([])
Out[18]: Namespace(jitter=['.2.51.5.9.4', '20 40'])
In [19]: parser.parse_args(['-j','1'])
Out[19]: Namespace(jitter=['.2.51.5.9.4', '20 40', '1'])
So the append action puts the default in the Namespace, and appends any values supplied with -j to that list. Also -j may be repeated, adding more values.
Some people think this is an error: that values should be appended to [], and that the default should only appear when -j is not used at all. The current behavior is simple and predictable.
An alternative is to leave the default as None or [], and add the default values yourself after parsing if args.jitter is None:
In [22]: parser.add_argument("-j", action = 'append', dest = 'jitter', nargs=2)
Out[22]: _AppendAction(option_strings=['-j'], dest='jitter', nargs=2, const=None, default=None, type=None, choices=None, help=None, metavar=None)
In [23]: parser.parse_args([])
Out[23]: Namespace(jitter=None)
In [24]: parser.parse_args(['-j','20','40'])
Out[24]: Namespace(jitter=[['20', '40']])
So testing would be something like:
if args.jitter is None:
    args.jitter = [...]
I added nargs to show that what gets appended is a sublist.
See http://bugs.python.org/issue16399 for more discussion of append with defaults.
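One way to get the behavior asked for in the question (an option that always maps to a fixed OID and optionally takes warn/crit levels) is to give each option nargs='*' and fill in the defaults after parsing, as suggested above. This is only a minimal sketch: the latency OID, the default levels, and the helper names build_parser and resolve_checks are assumptions made up for illustration.

```python
import argparse

# Hypothetical values for illustration; only the jitter OID comes from the question.
OIDS = {"jitter": ".1.5.5.8", "latency": ".1.5.5.9"}
DEFAULT_LEVELS = [20, 40]  # assumed default warn/crit levels


def build_parser():
    parser = argparse.ArgumentParser()
    # default=None means "option not given"; an empty list means
    # "option given without explicit levels".
    parser.add_argument("-j", dest="jitter", nargs="*", type=int, default=None)
    parser.add_argument("-l", dest="latency", nargs="*", type=int, default=None)
    return parser


def resolve_checks(args):
    """Map each requested check to (oid, warn, crit)."""
    checks = {}
    for name, oid in OIDS.items():
        levels = getattr(args, name)
        if levels is None:
            continue  # the option was not used at all
        if not levels:
            levels = DEFAULT_LEVELS  # option used without levels
        if len(levels) != 2:
            raise ValueError("%s expects exactly two levels: warn crit" % name)
        warn, crit = levels
        checks[name] = (oid, warn, crit)
    return checks


# Equivalent to: ./SNMP.py -j 10 20 -l
args = build_parser().parse_args(["-j", "10", "20", "-l"])
checks = resolve_checks(args)
```

Keeping the OID constants outside the parser avoids mixing them into the default list, so the append-with-default behavior described above never comes into play.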

Related

How do I make `mypy` recognize a type as `_DTypeLike[_ScalarType]`?

In the following snippet, mypy reports an Any component in the type of the result of np.ndarray.astype:
"""Bug."""
import numpy as np
from typing_extensions import reveal_type
# How do I make mypy recognize the type as `_DTypeLike[_ScalarType]`?
reveal_type(np.array([]).astype(float))
reveal_type(np.array([]).astype(np.float64))
reveal_type(np.array([]).astype(np.dtype(float)))
reveal_type(np.array([]).astype(np.dtype(np.float64)))
# > mypy bug.py
# bug.py:6: note: Revealed type is "numpy.ndarray[Any, numpy.dtype[Any]]"
# bug.py:7: note: Revealed type is "numpy.ndarray[Any, numpy.dtype[numpy.floating[numpy._typing._64Bit]]]"
# bug.py:8: note: Revealed type is "numpy.ndarray[Any, numpy.dtype[numpy.floating[Any]]]"
# bug.py:9: note: Revealed type is "numpy.ndarray[Any, numpy.dtype[numpy.floating[numpy._typing._64Bit]]]"
# ^^^
Type stubs for numpy contain this:
@overload
def astype(
    self,
    dtype: _DTypeLike[_ScalarType],
    order: _OrderKACF = ...,
    casting: _CastingKind = ...,
    subok: bool = ...,
    copy: bool | _CopyMode = ...,
) -> NDArray[_ScalarType]: ...
@overload
def astype(
    self,
    dtype: DTypeLike,
    order: _OrderKACF = ...,
    casting: _CastingKind = ...,
    subok: bool = ...,
    copy: bool | _CopyMode = ...,
) -> NDArray[Any]: ...
# A subset of `npt.DTypeLike` that can be parametrized w.r.t. `np.generic`
_DTypeLike = Union[
    "np.dtype[_SCT]",
    Type[_SCT],
    _SupportsDType["np.dtype[_SCT]"],
]
DTypeLike = Union[
    DType[Any],
    # default data type (float64)
    None,
    # array-scalar types and generic types
    Type[Any],  # NOTE: We're stuck with `Type[Any]` due to object dtypes
    # anything with a dtype attribute
    _SupportsDType[DType[Any]],
    # character codes, type strings or comma-separated fields, e.g., 'float64'
    str,
    _VoidDTypeLike,
]
I verified that isinstance(np.float64(1), np.generic).
Also, I read that
If there are multiple equally good matching variants, mypy will select the variant that was defined first.
https://mypy.readthedocs.io/en/stable/more_types.html#type-checking-calls-to-overloads
How should I call astype in order to make mypy understand that the returned array is always of type (np.)float(64)?

How to convert a string inside a function to a variable name that holds a Pandas DataFrame outside the function? [duplicate]

I know that some other languages, such as PHP, support a concept of "variable variable names" - that is, the contents of a string can be used as part of a variable name.
I heard that this is a bad idea in general, but I think it would solve some problems I have in my Python code.
Is it possible to do something like this in Python? What can go wrong?
If you are just trying to look up an existing variable by its name, see How can I select a variable by (string) name?. However, first consider whether you can reorganize the code to avoid that need, following the advice in this question.
You can use dictionaries to accomplish this. Dictionaries are stores of keys and values.
>>> dct = {'x': 1, 'y': 2, 'z': 3}
>>> dct
{'x': 1, 'y': 2, 'z': 3}
>>> dct["y"]
2
You can use variable key names to achieve the effect of variable variables without the security risk.
>>> x = "spam"
>>> z = {x: "eggs"}
>>> z["spam"]
'eggs'
For cases where you're thinking of doing something like
var1 = 'foo'
var2 = 'bar'
var3 = 'baz'
...
a list may be more appropriate than a dict. A list represents an ordered sequence of objects, with integer indices:
lst = ['foo', 'bar', 'baz']
print(lst[1]) # prints bar, because indices start at 0
lst.append('potatoes') # lst is now ['foo', 'bar', 'baz', 'potatoes']
For ordered sequences, lists are more convenient than dicts with integer keys, because lists support iteration in index order, slicing, append, and other operations that would require awkward key management with a dict.
Use the built-in getattr function to get an attribute on an object by name. Modify the name as needed.
obj.spam = 'eggs'
name = 'spam'
getattr(obj, name) # returns 'eggs'
It's not a good idea. If you are accessing a global variable you can use globals().
>>> a = 10
>>> globals()['a']
10
If you want to access a variable in the local scope you can use locals(), but assigning to the returned dict does not reliably change the actual local variables.
A better solution is to use getattr or store your variables in a dictionary and then access them by name.
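The difference between the two calls can be checked directly. A small sketch, assuming CPython; the function names and the name dynamic_name are made up for the demonstration:

```python
def assign_via_locals():
    x = 1
    # This writes into a dict, not into the function's real local variable.
    locals()["x"] = 2
    return x


def assign_via_globals(name, value):
    # This really creates (or rebinds) a module-level name.
    globals()[name] = value


result_local = assign_via_locals()
assign_via_globals("dynamic_name", 42)
result_global = globals()["dynamic_name"]

print(result_local)   # the local x was not changed by the locals() write
print(result_global)  # the global name exists after the globals() write
```

So globals() can be used to create names dynamically, while locals() inside a function is effectively read-only for this purpose.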
New coders sometimes write code like this:
my_calculator.button_0 = tkinter.Button(root, text=0)
my_calculator.button_1 = tkinter.Button(root, text=1)
my_calculator.button_2 = tkinter.Button(root, text=2)
...
The coder is then left with a pile of named variables, with a coding effort of O(m * n), where m is the number of named variables and n is the number of times that group of variables needs to be accessed (including creation). The more astute beginner observes that the only difference in each of those lines is a number that changes based on a rule, and decides to use a loop. However, they get stuck on how to dynamically create those variable names, and may try something like this:
for i in range(10):
    my_calculator.('button_%d' % i) = tkinter.Button(root, text=i)
They soon find that this does not work.
If the program requires arbitrary variable "names," a dictionary is the best choice, as explained in other answers. However, if you're simply trying to create many variables and you don't mind referring to them with a sequence of integers, you're probably looking for a list. This is particularly true if your data are homogeneous, such as daily temperature readings, weekly quiz scores, or a grid of graphical widgets.
This can be assembled as follows:
my_calculator.buttons = []
for i in range(10):
    my_calculator.buttons.append(tkinter.Button(root, text=i))
This list can also be created in one line with a comprehension:
my_calculator.buttons = [tkinter.Button(root, text=i) for i in range(10)]
The result in either case is a populated list, with the first element accessed with my_calculator.buttons[0], the next with my_calculator.buttons[1], and so on. The "base" variable name becomes the name of the list and the varying identifier is used to access it.
Finally, don't forget other data structures, such as the set - this is similar to a dictionary, except that each "name" doesn't have a value attached to it. If you simply need a "bag" of objects, this can be a great choice. Instead of something like this:
keyword_1 = 'apple'
keyword_2 = 'banana'
if query == keyword_1 or query == keyword_2:
    print('Match.')
You will have this:
keywords = {'apple', 'banana'}
if query in keywords:
    print('Match.')
Use a list for a sequence of similar objects, a set for an arbitrarily-ordered bag of objects, or a dict for a bag of names with associated values.
Whenever you want to use variable variables, it's probably better to use a dictionary. So instead of writing
$foo = "bar"
$$foo = "baz"
you write
mydict = {}
foo = "bar"
mydict[foo] = "baz"
This way you won't accidentally overwrite previously existing variables (which is the security aspect) and you can have different "namespaces".
Use globals() (disclaimer: this is a bad practice, but is the most straightforward answer to your question, please use other data structure as in the accepted answer).
You can actually assign variables to global scope dynamically. For instance, if you want 10 variables i_0, i_1 ... i_9 that can be accessed in the global scope:
for i in range(10):
    globals()['i_{}'.format(i)] = 'a'
This will assign 'a' to all 10 of these variables; of course, you can change the value dynamically as well. All of these variables can now be accessed like any other globally declared variable:
>>> i_5
'a'
Instead of a dictionary you can also use namedtuple from the collections module, which makes access easier.
For example:
from collections import namedtuple
# using dictionary
variables = {}
variables["first"] = 34
variables["second"] = 45
print(variables["first"], variables["second"])
# using namedtuple
Variables = namedtuple('Variables', ['first', 'second'])
v = Variables(34, 45)
print(v.first, v.second)
The SimpleNamespace class could be used to create new attributes with setattr, or subclass SimpleNamespace and create your own function to add new attribute names (variables).
from types import SimpleNamespace
variables = {"b":"B","c":"C"}
a = SimpleNamespace(**variables)
setattr(a,"g","G")
a.g = "G+"
something = a.b  # read an attribute that was created from the dict keys
If you don't want to use any object, you can still use setattr() inside your current module:
import sys
current_module = module = sys.modules[__name__] # i.e the "file" where your code is written
setattr(current_module, 'variable_name', 15) # 15 is the value you assign to the var
print(variable_name) # >>> 15, created from a string
You have to use the globals() built-in function to achieve that behaviour:
def var_of_var(k, v):
    globals()[k] = v
print(variable_name) # NameError: name 'variable_name' is not defined
some_name = 'variable_name'
globals()[some_name] = 123
print(variable_name) # 123
some_name = 'variable_name2'
var_of_var(some_name, 456)
print(variable_name2) # 456
Variable variables in Python
"""
<?php
$a = 'hello';
$e = 'wow';
?>
<?php
$$a = 'world';
?>
<?php
echo "$a ${$a}\n";
echo "$a ${$a[1]}\n";
?>
<?php
echo "$a $hello";
?>
"""
a = 'hello' #<?php $a = 'hello'; ?>
e = 'wow' #<?php $e = 'wow'; ?>
vars()[a] = 'world' #<?php $$a = 'world'; ?>
print(a, vars()[a]) #<?php echo "$a ${$a}\n"; ?>
print(a, vars()[vars()['a'][1]]) #<?php echo "$a ${$a[1]}\n"; ?>
print(a, hello) #<?php echo "$a $hello"; ?>
Output:
hello world
hello wow
hello world
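Using globals(), locals(), or vars() will produce the same results when the code runs at module level (inside a function, locals() and vars() refer to the function's own namespace instead)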
#<?php $a = 'hello'; ?>
#<?php $e = 'wow'; ?>
#<?php $$a = 'world'; ?>
#<?php echo "$a ${$a}\n"; ?>
#<?php echo "$a ${$a[1]}\n"; ?>
#<?php echo "$a $hello"; ?>
print('locals():\n')
a = 'hello'
e = 'wow'
locals()[a] = 'world'
print(a, locals()[a])
print(a, locals()[locals()['a'][1]])
print(a, hello)
print('\n\nglobals():\n')
a = 'hello'
e = 'wow'
globals()[a] = 'world'
print(a, globals()[a])
print(a, globals()[globals()['a'][1]])
print(a, hello)
Output:
locals():
hello world
hello wow
hello world
globals():
hello world
hello wow
hello world
Bonus (creating variables from strings)
# Python 2.7.16 (default, Jul 13 2019, 16:01:51)
# [GCC 8.3.0] on linux2
Creating variables and unpacking tuple:
g = globals()
listB = []
for i in range(10):
    g["num%s" % i] = i ** 10
    listB.append("num{0}".format(i))
def printNum():
    print "Printing num0 to num9:"
    for i in range(10):
        print "num%s = " % i,
        print g["num%s" % i]
printNum()
listA = []
for i in range(10):
    listA.append(i)
listA = tuple(listA)
print listA, '"Tuple to unpack"'
listB = str(str(listB).strip("[]").replace("'", "") + " = listA")
print listB
exec listB
printNum()
Output:
Printing num0 to num9:
num0 = 0
num1 = 1
num2 = 1024
num3 = 59049
num4 = 1048576
num5 = 9765625
num6 = 60466176
num7 = 282475249
num8 = 1073741824
num9 = 3486784401
(0, 1, 2, 3, 4, 5, 6, 7, 8, 9) "Tuple to unpack"
num0, num1, num2, num3, num4, num5, num6, num7, num8, num9 = listA
Printing num0 to num9:
num0 = 0
num1 = 1
num2 = 2
num3 = 3
num4 = 4
num5 = 5
num6 = 6
num7 = 7
num8 = 8
num9 = 9
I'm answering the question How to get the value of a variable given its name in a string?
which is closed as a duplicate with a link to this question. (Editor's note: It is now closed as a duplicate of How can I select a variable by (string) name?)
If the variables in question are part of an object (part of a class for example) then some useful functions to achieve exactly that are hasattr, getattr, and setattr.
So for example you can have:
class Variables(object):
    def __init__(self):
        self.foo = "initial_variable"
    def create_new_var(self, name, value):
        setattr(self, name, value)
    def get_var(self, name):
        if hasattr(self, name):
            return getattr(self, name)
        else:
            # A bare string cannot be raised; use a real exception type
            raise AttributeError("Class does not have a variable named: " + name)
Then you can do:
>>> v = Variables()
>>> v.get_var("foo")
'initial_variable'
>>> v.create_new_var(v.foo, "is actually not initial")
>>> v.initial_variable
'is actually not initial'
I have tried both in python 3.7.3, you can use either globals() or vars()
>>> food #Error
>>> milkshake #Error
>>> food="bread"
>>> drink="milkshake"
>>> globals()[food] = "strawberry flavor"
>>> vars()[drink] = "chocolate flavor"
>>> bread
'strawberry flavor'
>>> milkshake
'chocolate flavor'
>>> globals()[drink]
'chocolate flavor'
>>> vars()[food]
'strawberry flavor'
Reference:
https://www.daniweb.com/programming/software-development/threads/111526/setting-a-string-as-a-variable-name#post548936
The consensus is to use a dictionary for this - see the other answers. This is a good idea in most cases; however, it has several consequences:
you'll be responsible for this dictionary yourself, including garbage collection (of in-dict variables) etc.
variable variables have no locality or globality of their own; it depends on the scope of the dictionary
if you want to rename a variable, you'll have to do it manually
however, you are much more flexible, e.g.
you can decide to overwrite existing variables or ...
... choose to implement const variables
to raise an exception on overwriting for different types
etc.
That said, I've implemented a variable variables manager-class which provides some of the above ideas. It works for python 2 and 3.
You'd use the class like this:
from variableVariablesManager import VariableVariablesManager
myVars = VariableVariablesManager()
myVars['test'] = 25
print(myVars['test'])
# define a const variable
myVars.defineConstVariable('myconst', 13)
try:
    myVars['myconst'] = 14 # <- this raises an error, since 'myconst' must not be changed
    print("not allowed")
except AttributeError as e:
    pass
# rename a variable
myVars.renameVariable('myconst', 'myconstOther')
# preserve locality
def testLocalVar():
    myVars = VariableVariablesManager()
    myVars['test'] = 13
    print("inside function myVars['test']:", myVars['test'])
testLocalVar()
print("outside function myVars['test']:", myVars['test'])
# define a global variable
myVars.defineGlobalVariable('globalVar', 12)
def testGlobalVar():
    myVars = VariableVariablesManager()
    print("inside function myVars['globalVar']:", myVars['globalVar'])
    myVars['globalVar'] = 13
    print("inside function myVars['globalVar'] (having been changed):", myVars['globalVar'])
testGlobalVar()
print("outside function myVars['globalVar']:", myVars['globalVar'])
If you wish to allow overwriting of variables with the same type only:
myVars = VariableVariablesManager(enforceSameTypeOnOverride = True)
myVars['test'] = 25
myVars['test'] = "Cat" # <- raises Exception (different type on overwriting)
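To make the idea of const variables and same-type enforcement more concrete, here is a rough sketch of how such checks could be layered on a plain dict. This is not the VariableVariablesManager implementation (its source is not shown here); the class name GuardedVars and its methods are hypothetical.

```python
class GuardedVars(dict):
    """Dict-backed variable store with const names and optional type checks."""

    def __init__(self, enforce_same_type=False):
        super().__init__()
        self._consts = set()
        self._enforce_same_type = enforce_same_type

    def define_const(self, name, value):
        # Store the value once and remember that the name is read-only.
        dict.__setitem__(self, name, value)
        self._consts.add(name)

    def __setitem__(self, name, value):
        if name in self._consts:
            raise AttributeError("%r is const and must not be changed" % name)
        if (self._enforce_same_type and name in self
                and type(self[name]) is not type(value)):
            raise TypeError("%r must keep type %s"
                            % (name, type(self[name]).__name__))
        dict.__setitem__(self, name, value)


my_vars = GuardedVars(enforce_same_type=True)
my_vars["test"] = 25
my_vars.define_const("myconst", 13)
```

Note that dict.update() bypasses __setitem__ in a dict subclass, so a fuller version would need to guard that path as well.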
Any set of variables can also be wrapped up in a class.
"Variable" variables may be added to the class instance during runtime by directly accessing the built-in dictionary through __dict__ attribute.
The following code defines Variables class, which adds variables (in this case attributes) to its instance during the construction. Variable names are taken from a specified list (which, for example, could have been generated by program code):
# some list of variable names
L = ['a', 'b', 'c']
class Variables:
    def __init__(self, L):
        for item in L:
            self.__dict__[item] = 100
v = Variables(L)
print(v.a, v.b, v.c)
#will produce 100 100 100
This is extremely risky, but you can use exec():
a = 'b=5'
exec(a)
c = b*2
print (c)
Result:
10
The setattr() method sets the value of the specified attribute of the specified object.
Syntax goes like this –
setattr(object, name, value)
Example –
setattr(self, 'id', 123)
which is equivalent to self.id = 123
As you might have observed, setattr() expects an object to be passed along with the name and value in order to create or modify an attribute.
We can use setattr() with a workaround to be able to use it within modules. Here's how –
import sys
x = "pikachu"
value = 46
thismodule = sys.modules[__name__]
setattr(thismodule, x, value)
print(pikachu)

Idiomatic tensorflow expression of for-loop which considers value of preceding iteration

It might not even be possible, but I would like to express the following code without the for-loop.
tf.scan is prohibitively slow and therefore not a good solution.
I am perfectly happy to accept any answer which gives a solution or an argument why this is not possible.
import tensorflow as tf
import matplotlib.pyplot as plt
# Some data
random_series = tf.reshape(tf.math.cumsum(tf.random.normal([100])),[1,-1])
# a mesh_grid of "co-distance"
random_mesh_gain = 1 - tf.matmul(random_series,tf.math.reciprocal(random_series), True, False)
# upper triangular part (band_part with num_lower=0, num_upper=-1)
random_tri = tf.linalg.band_part(random_mesh_gain, 0, -1)
# some lambda working on a series-type data
drop = 0.5
lambda_map = lambda series: tf.math.floor(series/interval)*interval - drop
# apply map
random_img = tf.map_fn(lambda_map, random_tri)
# init preceding and output list sl_full
preceding = - drop * tf.ones(tf.transpose(random_img).shape[0],dtype=tf.float32)
sl_full = []
for a in tf.transpose(random_img):
    prec_a_max = tf.reduce_max(tf.stack([preceding, a]),axis=0)
    preceding = prec_a_max
    sl_full.append(prec_a_max)
# create tensor from list
sl_full = tf.transpose(tf.stack(sl_full))
plt.imshow(sl_full,origin="lower")
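Read closely, the loop above keeps a running elementwise maximum over the columns of random_img, starting from -drop. In other words, it is a cumulative maximum along one axis with a floor. The sketch below checks that equivalence in NumPy rather than TensorFlow, with a small random array standing in for random_img. In TensorFlow the same running maximum could presumably be computed with an associative scan such as tfp.math.scan_associative from TensorFlow Probability, but that is an assumption about an extra dependency that the question does not use.

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.normal(size=(5, 8)).astype(np.float32)  # stand-in for random_img
drop = 0.5

# Reference: the explicit loop from the question, one column at a time.
preceding = -drop * np.ones(img.shape[0], dtype=np.float32)
columns = []
for a in img.T:
    preceding = np.maximum(preceding, a)
    columns.append(preceding)
looped = np.stack(columns).T

# Loop-free: running maximum along axis 1, floored at the initial value -drop.
vectorized = np.maximum(np.maximum.accumulate(img, axis=1), -drop)

print(np.array_equal(looped, vectorized))
```

Because max is associative, the dependency on the preceding iteration does not force a sequential loop; any cumulative-max primitive gives the same result.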

Redis not returning result after upgrading Celery from 3.1 to 4.0

I recently upgraded my Celery installation to 4.0. After a few days of wrestling with the upgrade process, I finally got it to work... sort of. Some tasks will return, but the final task will not.
I have a class, SFF, that takes in and parses a file:
# Constructor with I/O file
def __init__(self, file):
    # File data that's gonna get used a lot
    sffDescriptor = file.fileno()
    fileName = abspath(file.name)
    # Get the pointer to the file
    filePtr = mmap.mmap(sffDescriptor, 0, flags=mmap.MAP_SHARED, prot=mmap.PROT_READ)
    # Get the header info
    hdr = filePtr.read(HEADER_SIZE)
    self.header = SFFHeader._make(unpack(HEADER_FMT, hdr))
    # Read in the palette maps
    print self.header.onDemandDataSize
    print self.header.onLoadDataSize
    palMapsResult = getPalettes.delay(fileName, self.header.palBankOff - HEADER_SIZE, self.header.onDemandDataSize, self.header.numPals)
    # Read the sprite list nodes
    nodesStart = self.header.sprListOff
    nodesEnd = self.header.palBankOff
    print nodesEnd - nodesStart
    sprNodesResult = getSprNodes.delay(fileName, nodesStart, nodesEnd, self.header.numSprites)
    # Get palette data
    self.palettes = palMapsResult.get()
    # Get sprite data
    spriteNodes = sprNodesResult.get()
    # TESTING
    spritesResultSet = ResultSet([])
    numSpriteNodes = len(spriteNodes)
    # Split the nodes into chunks of size 32 elements
    for x in xrange(0, numSpriteNodes, 32):
        spritesResult = getSprites.delay(spriteNodes, x, x+32, fileName, self.palettes, self.header.palBankOff, self.header.onDemandDataSizeTotal)
        spritesResultSet.add(spritesResult)
        break # REMEMBER TO REMOVE FOR ENTIRE SFF
    self.sprites = spritesResultSet.join_native()
It doesn't matter if it's a single task that returns the entire spritesResult, or if I split it using a ResultSet, the outcome is always the same: the Python console I'm using just hangs at either spritesResultSet.join_native() or spritesResult.get() (depending on how I format it).
Here is the task in question:
@task
def getSprites(nodes, start, end, fileName, palettes, palBankOff, onDemandDataSizeTotal):
    sprites = []
    with open(fileName, "rb") as file:
        sffDescriptor = file.fileno()
        sffData = mmap.mmap(sffDescriptor, 0, flags=mmap.MAP_SHARED, prot=mmap.PROT_READ)
        for node in nodes[start:end]:
            sprListNode = dict(SprListNode._make(node)._asdict()) # Need to convert it to a dict since values may change.
            #print node
            #print sprListNode
            # If it's a linked sprite, the data length is 0, so get the linked index.
            if sprListNode['dataLen'] == 0:
                sprListNodeTemp = SprListNode._make(nodes[sprListNode['index']])
                sprListNode['dataLen'] = sprListNodeTemp.dataLen
                sprListNode['dataOffset'] = sprListNodeTemp.dataOffset
                sprListNode['compression'] = sprListNodeTemp.compression
            # What does the offset need to be?
            dataOffset = sprListNode['dataOffset']
            if sprListNode['loadMode'] == 0:
                dataOffset += palBankOff #- HEADER_SIZE
            elif sprListNode['loadMode'] == 1:
                dataOffset += onDemandDataSizeTotal #- HEADER_SIZE
            #print sprListNode
            # Seek to the data location and "read" it in. First 4 bytes are just the image length
            start = dataOffset + 4
            end = dataOffset + sprListNode['dataLen']
            #sffData.seek(start)
            compressedSprite = sffData[start:end]
            # Create the sprite
            sprite = Sprite(sprListNode, palettes[sprListNode['palNo']], np.fromstring(compressedSprite, dtype=np.uint8))
            sprites.append(sprite)
    return json.dumps(sprites, cls=SpriteJSONEncoder)
I know it reaches the return statement, because if I put a print right above it, it will print in the Celery window. I also know that the task is running to completion because I get the following message from the worker:
[2016-11-16 00:03:33,639: INFO/PoolWorker-4] Task framedatabase.tasks.getSprites[285ac9b1-09b4-4cf1-a251-da6212863832] succeeded in 0.137236133218s: '[{"width": 120, "palNo": 30, "group": 9000, "xAxis": 0, "yAxis": 0, "data":...'
Here are my celery settings in settings.py:
# Celery settings
BROKER_URL='redis://localhost:1717/1'
CELERY_RESULT_BACKEND='redis://localhost:1717/0'
CELERY_IGNORE_RESULT=False
CELERY_IMPORTS = ("framedatabase.tasks", )
... and my celery.py:
from __future__ import absolute_import
import os
from celery import Celery
# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'framedatabase.settings')
from django.conf import settings # noqa
app = Celery('framedatabase', backend='redis://localhost:1717/1', broker="redis://localhost:1717/0",
include=['framedatabase.tasks'])
# Using a string here means the worker will not have to
# pickle the object when using Windows.
app.config_from_object('django.conf:settings', namespace='CELERY')
app.autodiscover_tasks()
@app.task(bind=True)
def debug_task(self):
    print('Request: {0!r}'.format(self.request))
Found the problem. Apparently it was leading to deadlock as mentioned in the section "Avoid launching synchronous subtasks" in the Celery documentation here: http://docs.celeryproject.org/en/latest/userguide/tasks.html#tips-and-best-practices
So I got rid of the line:
sprNodesResult.get()
And changed the final result to a chain:
self.sprites = chain(getSprNodes.s(fileName, nodesStart, nodesEnd, self.header.numSprites),
getSprites.s(0,32,fileName,self.palettes,self.header.palBankOff,self.header.onDemandDataSizeTotal))().get()
And it works! Now I just have to find a way to split this the way I want!

Exporting a 3D numpy to a VTK file for viewing in Paraview/Mayavi

For those that want to export a simple 3D numpy array (along with axes) to a .vtk (or .vtr) file for post-processing and display in Paraview or Mayavi there's a little module called PyEVTK that does exactly that. The module supports structured and unstructured data etc..
Unfortunately, even though the code works fine on Unix-based systems, I couldn't make it work (it keeps crashing) on any Windows installation, which complicates things. I've contacted the developer, but his suggestions did not work.
Therefore my question is:
How can one use the numpy_support module (from vtk.util import numpy_support) to export a 3D array to a .vtk file, given that its conversion function doesn't support 3D arrays itself? Is there a simple way to do it without creating vtkDatasets etc.?
Thanks a lot!
It's been forever and I had entirely forgotten asking this question but I ended up figuring it out. I've written a post about it in my blog (PyScience) providing a tutorial on how to convert between NumPy and VTK. Do take a look if interested:
pyscience.wordpress.com/2014/09/06/numpy-to-vtk-converting-your-numpy-arrays-to-vtk-arrays-and-files/
It's not a direct answer to your question, but if you have tvtk (if you have mayavi, you should have it), you can use it to write your data to vtk format. (See: http://code.enthought.com/projects/files/ETS3_API/enthought.tvtk.misc.html )
It doesn't use PyEVTK, and it supports a broad range of data sources (more than just structured and unstructured grids), so it will probably work where other things aren't.
As a quick example (Mayavi's mlab interface can make this much less verbose, especially if you're already using it.):
import numpy as np
from enthought.tvtk.api import tvtk, write_data
data = np.random.random((10,10,10))
grid = tvtk.ImageData(spacing=(10, 5, -10), origin=(100, 350, 200),
                      dimensions=data.shape)
# Flatten the array itself in Fortran order (x varies fastest)
grid.point_data.scalars = data.ravel(order='F')
grid.point_data.scalars.name = 'Test Data'
# Writes legacy ".vtk" format if filename ends with "vtk", otherwise
# this will write data using the newer xml-based format.
write_data(grid, 'test.vtk')
And a portion of the output file:
# vtk DataFile Version 3.0
vtk output
ASCII
DATASET STRUCTURED_POINTS
DIMENSIONS 10 10 10
SPACING 10 5 -10
ORIGIN 100 350 200
POINT_DATA 1000
SCALARS Test%20Data double
LOOKUP_TABLE default
0.598189 0.228948 0.346975 0.948916 0.0109774 0.30281 0.643976 0.17398 0.374673
0.295613 0.664072 0.307974 0.802966 0.836823 0.827732 0.895217 0.104437 0.292796
0.604939 0.96141 0.0837524 0.498616 0.608173 0.446545 0.364019 0.222914 0.514992
...
...
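The legacy header shown above is simple enough that a structured-points file can also be written by hand with NumPy alone, which may help where neither tvtk nor PyEVTK is usable. This is a minimal sketch rather than a replacement for write_data: the function name write_structured_points and its parameters are made up here, and it only covers a single scalar array in ASCII form.

```python
import numpy as np


def write_structured_points(path, data, spacing=(1.0, 1.0, 1.0),
                            origin=(0.0, 0.0, 0.0), name="scalars"):
    """Write a 3D array indexed [x, y, z] as a legacy ASCII VTK file."""
    nx, ny, nz = data.shape
    header = [
        "# vtk DataFile Version 3.0",
        "vtk output",
        "ASCII",
        "DATASET STRUCTURED_POINTS",
        "DIMENSIONS {} {} {}".format(nx, ny, nz),
        "SPACING {} {} {}".format(*spacing),
        "ORIGIN {} {} {}".format(*origin),
        "POINT_DATA {}".format(data.size),
        "SCALARS {} double".format(name),  # the name must not contain spaces
        "LOOKUP_TABLE default",
    ]
    # Legacy VTK expects x to vary fastest, i.e. Fortran order here.
    values = data.ravel(order="F").astype(float)
    with open(path, "w") as fh:
        fh.write("\n".join(header) + "\n")
        # Nine values per line, mirroring the layout of the sample output.
        for start in range(0, values.size, 9):
            chunk = values[start:start + 9]
            fh.write(" ".join("{:g}".format(v) for v in chunk) + "\n")
```

A call such as write_structured_points('test.vtk', data, spacing=(10, 5, -10), origin=(100, 350, 200)) would produce a header with the same fields as the excerpt above.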
TVTK from Mayavi has a beautiful way of writing VTK files. Here is a test example I wrote for myself, following @Joe and the tvtk documentation. The advantage it has over evtk is the support for both ASCII and XML output. Hope it will help other people.
from tvtk.api import tvtk, write_data
import numpy as np
#data = np.random.random((3, 3, 3))
#
#i = tvtk.ImageData(spacing=(1, 1, 1), origin=(0, 0, 0))
#i.point_data.scalars = data.ravel()
#i.point_data.scalars.name = 'scalars'
#i.dimensions = data.shape
#
#w = tvtk.XMLImageDataWriter(input=i, file_name='spoints3d.vti')
#w.write()
points = np.array([[0,0,0], [1,0,0], [1,1,0], [0,1,0]], 'f')
(n1, n2) = points.shape
poly_edge = np.array([[0,1,2,3]])
print(n1, n2)
## Scalar Data
#temperature = np.array([10., 20., 30., 40.])
#pressure = np.random.rand(n1)
#
## Vector Data
#velocity = np.random.rand(n1,n2)
#force = np.random.rand(n1,n2)
#
##Tensor Data with
comp = 5
stress = np.random.rand(n1,comp)
#
#print stress.shape
## The TVTK dataset.
mesh = tvtk.PolyData(points=points, polys=poly_edge)
#
## Data 0 # scalar data
#mesh.point_data.scalars = temperature
#mesh.point_data.scalars.name = 'Temperature'
#
## Data 1 # additional scalar data
#mesh.point_data.add_array(pressure)
#mesh.point_data.get_array(1).name = 'Pressure'
#mesh.update()
#
## Data 2 # Vector data
#mesh.point_data.vectors = velocity
#mesh.point_data.vectors.name = 'Velocity'
#mesh.update()
#
## Data 3 additional vector data
#mesh.point_data.add_array( force)
#mesh.point_data.get_array(3).name = 'Force'
#mesh.update()
mesh.point_data.tensors = stress
mesh.point_data.tensors.name = 'Stress'
# Data 4 additional tensor Data
#mesh.point_data.add_array(stress)
#mesh.point_data.get_array(4).name = 'Stress'
#mesh.update()
write_data(mesh, 'polydata.vtk')
# XML format
# Method 1
#write_data(mesh, 'polydata')
# Method 2
#w = tvtk.XMLPolyDataWriter(input=mesh, file_name='polydata.vtk')
#w.write()
I know it is a bit late, and I do love your tutorials @somada141. This should work too.
import vtk
def numpy2VTK(img, spacing=(1.0, 1.0, 1.0)):
    # evolved from code from Stou S.,
    # on http://www.siafoo.net/snippet/314
    # This function, as the name suggests, converts numpy array to VTK
    importer = vtk.vtkImageImport()
    img_data = img.astype('uint8')
    img_string = img_data.tobytes()  # raw uint8 bytes (tostring() is deprecated)
    dim = img.shape
    importer.CopyImportVoidPointer(img_string, len(img_string))
    importer.SetDataScalarType(vtk.VTK_UNSIGNED_CHAR)
    importer.SetNumberOfScalarComponents(1)
    extent = importer.GetDataExtent()
    # The array is indexed [z, y, x], so dim[2] is the x extent
    importer.SetDataExtent(extent[0], extent[0] + dim[2] - 1,
                           extent[2], extent[2] + dim[1] - 1,
                           extent[4], extent[4] + dim[0] - 1)
    importer.SetWholeExtent(extent[0], extent[0] + dim[2] - 1,
                            extent[2], extent[2] + dim[1] - 1,
                            extent[4], extent[4] + dim[0] - 1)
    importer.SetDataSpacing(spacing[0], spacing[1], spacing[2])
    importer.SetDataOrigin(0, 0, 0)
    return importer
Hope it helps!
Here's a SimpleITK version with the function load_itk taken from here:
import sys
import SimpleITK as sitk
import numpy as np
if len(sys.argv)<3:
    print('Wrong number of arguments.', file=sys.stderr)
    print('Usage: ' + __file__ + ' input_sitk_file' + ' output_sitk_file', file=sys.stderr)
    sys.exit(1)
def quick_read(filename):
    # Read image information without reading the bulk data.
    file_reader = sitk.ImageFileReader()
    file_reader.SetFileName(filename)
    file_reader.ReadImageInformation()
    print('image size: {0}\nimage spacing: {1}'.format(file_reader.GetSize(), file_reader.GetSpacing()))
    # Some files have a rich meta-data dictionary (e.g. DICOM)
    for key in file_reader.GetMetaDataKeys():
        print(key + ': ' + file_reader.GetMetaData(key))
def load_itk(filename):
    # Reads the image using SimpleITK
    itkimage = sitk.ReadImage(filename)
    # Convert the image to a numpy array first and then shuffle the dimensions to get axis in the order z,y,x
    data = sitk.GetArrayFromImage(itkimage)
    # Read the origin of the ct_scan, will be used to convert the coordinates from world to voxel and vice versa.
    origin = np.array(list(reversed(itkimage.GetOrigin())))
    # Read the spacing along each dimension
    spacing = np.array(list(reversed(itkimage.GetSpacing())))
    return data, origin, spacing
def convert(data, output_filename):
    image = sitk.GetImageFromArray(data)
    writer = sitk.ImageFileWriter()
    writer.SetFileName(output_filename)
    writer.Execute(image)
def wait():
    print('Press Enter to load & convert or exit using Ctrl+C')
    input()
quick_read(sys.argv[1])
print('-'*20)
wait()
data, origin, spacing = load_itk(sys.argv[1])
# convert() needs both the array and the output file name
convert(data, sys.argv[2])