Spin constraining in PyIron+Sphinx - pyiron

I want to constrain the spin of the bulk atoms while letting the free surface atoms of my supercell relax their magnetic moments. Is it possible in PyIron+Sphinx to constrain the spin of a subset of atoms (not all of them) in the supercell?

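Yes, in principle it is possible:
from pyiron import Project
import numpy as np
pr = Project('spin_constraint')  # project name is arbitrary; it was missing from the original snippet
spx = pr.create.job.Sphinx('spx')
spx.structure = pr.create.structure.bulk('Fe', a=2.83, cubic=True)
spx.structure.set_initial_magnetic_moments([2, 2])
spx.fix_spin_constraint = True
# Constrain the first atom only; the second atom is free to relax its moment
spx.structure.spin_constraint = np.array([True, False])
spx.calc_static()
spx.run()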
Short explanation: spx.fix_spin_constraint = True initializes the attribute spx.structure.spin_constraint, which contains only True for all atoms at the beginning. For the atoms which should not be constrained, you can set False.

how to use cycler in matplotlib? [duplicate]

Is it possible to query the current state of the matplotlib color cycle? In other words is there a function get_cycle_state that will behave in the following way?
>>> plot(x1, y1)
>>> plot(x2, y2)
>>> state = get_cycle_state()
>>> print state
2
Where I expect the state to be the index of the next color that will be used in a plot. Alternatively, if it returned the next color ("r" for the default cycle in the example above), that would be fine too.
Accessing the color cycle iterator
There's no "user-facing" (a.k.a. "public") method to access the underlying iterator, but you can access it through "private" (by convention) methods. However, you can't get the state of an iterator without changing it.
Setting the color cycle
Quick aside: You can set the color/property cycle in a variety of ways (e.g. ax.set_color_cycle in versions <1.5 or ax.set_prop_cycle in >=1.5). Have a look at the example here for version 1.5 or greater, or the previous style here.
Accessing the underlying iterator
However, while there's no public-facing method to access the iterable, you can access it for a given axes object (ax) through the _get_lines helper class instance. ax._get_lines is a touch confusingly named, but it's the behind-the-scenes machinery that allows the plot command to process all of the odd and varied ways that plot can be called. Among other things, it's what keeps track of what colors to automatically assign. Similarly, there's ax._get_patches_for_fill to control cycling through default fill colors and patch properties.
At any rate, the color cycle iterable is ax._get_lines.color_cycle for lines and ax._get_patches_for_fill.color_cycle for patches. On matplotlib >=1.5, this has changed to use the cycler library, and the iterable is called prop_cycler instead of color_cycle and yields a dict of properties instead of only a color.
All in all, you'd do something like:
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
color_cycle = ax._get_lines.color_cycle
# or ax._get_lines.prop_cycler on version >= 1.5
# Note that prop_cycler cycles over dicts, so you'll want next(cycle)['color']
You can't view the state of an iterator
However, this object is a "bare" iterator. We can easily get the next item (e.g. next_color = next(color_cycle)), but that means that the next color after that is what will be plotted. By design, there's no way to get the current state of an iterator without changing it.
In v1.5 or greater, it would be nice to get the cycler object that's used, as we could infer its current state. However, the cycler object itself isn't accessible (publicly or privately) anywhere. Instead, only the itertools.cycle instance created from the cycler object is accessible. Either way, there's no way to get to the underlying state of the color/property cycler.
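The point about iterators can be seen without matplotlib at all. The sketch below uses a plain itertools.cycle as a stand-in for the axes' color cycle (the three-color list and the peek helper are illustrative names, not part of matplotlib). Reading the next value necessarily advances the cycle, and the only way to "look ahead" is to build a new iterator that yields the consumed value again first.

```python
import itertools

# Stand-in for the axes' color cycle (illustrative colors, not matplotlib's).
color_cycle = itertools.cycle(['r', 'g', 'b'])

first = next(color_cycle)    # reading the "current" color consumes it
second = next(color_cycle)   # so the next call yields the color after it


def peek(iterator):
    """Return the next value plus a new iterator that still yields it first."""
    value = next(iterator)
    return value, itertools.chain([value], iterator)


upcoming, color_cycle = peek(color_cycle)
replayed = next(color_cycle)  # same value as `upcoming`
```

Note that peek returns a replacement iterator; the original object has still been advanced, which is why this trick only helps if you control where the iterator is stored.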
Match the color of the previously plotted item instead
In your case, it sounds like you're wanting to match the color of something that was just plotted. Instead of trying to determine what the color/property will be, set the color/etc of your new item based on the properties of what's plotted.
For example, in the case you described, I'd do something like this:
import matplotlib.pyplot as plt
import numpy as np
def custom_plot(x, y, **kwargs):
    ax = kwargs.pop('ax', plt.gca())
    base_line, = ax.plot(x, y, **kwargs)
    ax.fill_between(x, 0.9*y, 1.1*y, facecolor=base_line.get_color(), alpha=0.5)
x = np.linspace(0, 1, 10)
custom_plot(x, x)
custom_plot(x, 2*x)
custom_plot(x, -x, color='yellow', lw=3)
plt.show()
It's not the only way, but it's cleaner than trying to get the color of the plotted line beforehand, in this case.
Here's a way that works in 1.5 which will hopefully be future-proof as it doesn't rely on methods prepended with underscores:
colors = plt.rcParams["axes.prop_cycle"].by_key()["color"]
This will give you a list of the colors defined in order for the present style.
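As a rough sketch of how that list could answer the original question: if every line on the axes took exactly one color from the default cycle, the next color is the list entry at index len(ax.lines) modulo the list length. That is an assumption, not something matplotlib guarantees; it breaks as soon as a line is plotted with an explicit color or the axes uses a custom cycle. The Agg backend is selected only so the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt

# Colors of the current style, in cycle order (public API).
colors = plt.rcParams["axes.prop_cycle"].by_key()["color"]

fig, ax = plt.subplots()
ax.plot([0, 1], [0, 1])
ax.plot([0, 1], [1, 0])

# Assumption: each existing line consumed exactly one entry of the default cycle.
predicted = colors[len(ax.lines) % len(colors)]

# Plot a third line and compare its actual color with the prediction.
third_line, = ax.plot([0, 1], [0.5, 0.5])
actual = third_line.get_color()
```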
Note: In the latest versions of matplotlib (>= 1.5) _get_lines has changed. You now need to use next(ax._get_lines.prop_cycler)['color'] in Python 2 or 3 (or ax._get_lines.prop_cycler.next()['color'] in Python 2) to get the next color from the color cycle.
Wherever possible use the more direct approach shown in the lower part of @joe-kington's answer. As _get_lines is not API-facing, it might change again in a backward-incompatible manner in the future.
Sure, this will do it.
# rainbow
import itertools
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(0, 2*np.pi)
ax = plt.subplot(1, 1, 1)
ax.plot(np.sin(x))
ax.plot(np.cos(x))
rainbow = ax._get_lines.color_cycle  # only exists in matplotlib < 1.5
print(rainbow)
# The cycle never ends, so take just the first 10 items
for color in itertools.islice(rainbow, 10):
    print(color, end=' ')
Gives:
<itertools.cycle object at 0x034CB288>
r c m y k b g r c m
The itertools function that matplotlib uses here is itertools.cycle.
Edit: Thanks for the comment; it seems that it is not possible to copy an iterator. An idea would be to dump a full cycle and keep track of which value you are using. Let me get back to you on that.
Edit2: All right, this will give you the next color and make a new iterator that behaves as if next was not called. This does not preserve the order of coloring, just the next color value; I leave that to you.
This gives the following output. Notice that steepness in the plot corresponds to index, e.g. the first g is the lowest graph, and so on.
# rainbow
import matplotlib.pyplot as plt
import numpy as np
import collections
import itertools
x = np.linspace(0, 2*np.pi)
ax = plt.subplot(1, 1, 1)
def create_rainbow():
    # Collect colors until the first repeat, i.e. one full cycle
    rainbow = [next(ax._get_lines.color_cycle)]
    while True:
        nextval = next(ax._get_lines.color_cycle)
        if nextval not in rainbow:
            rainbow.append(nextval)
        else:
            return rainbow
def next_color(axis_handle=ax):
    rainbow = create_rainbow()
    double_rainbow = collections.deque(rainbow)
    nextval = next(axis_handle._get_lines.color_cycle)
    double_rainbow.rotate(-1)
    return nextval, itertools.cycle(double_rainbow)
for i in range(1, 10):
    nextval, ax._get_lines.color_cycle = next_color(ax)
    print("Next color is:", nextval)
    ax.plot(i*(x))
plt.savefig("SO_rotate_color.png")
plt.show()
Console
Next color is: g
Next color is: c
Next color is: y
Next color is: b
Next color is: r
Next color is: m
Next color is: k
Next color is: g
Next color is: c
I just want to add to what @Andi said above. Since color_cycle is deprecated in matplotlib 1.5, you have to use prop_cycler. However, Andi's solution (ax._get_lines.prop_cycler.next()['color']) returned this error for me:
AttributeError: 'itertools.cycle' object has no attribute 'next'
The code that worked for me was next(ax._get_lines.prop_cycler), which actually isn't far off from @joe-kington's original response.
Personally, I ran into this problem when making a twinx() axis, which reset the color cycler. I needed a way to make the colors cycle correctly because I was using style.use('ggplot'). There might be an easier/better way to do this, so feel free to correct me.
Since matplotlib uses itertools.cycle we can actually look through the entire color cycle and then restore the iterator to its previous state:
def list_from_cycle(cycle):
    first = next(cycle)
    result = [first]
    for current in cycle:
        if current == first:
            break
        result.append(current)
    # Reset iterator state:
    for current in cycle:
        if current == result[-1]:
            break
    return result
This should return the list without changing the state of the iterator.
Use it with matplotlib >= 1.5:
>>> list_from_cycle(ax._get_lines.prop_cycler)
[{'color': 'r'}, {'color': 'g'}, {'color': 'b'}]
or with matplotlib < 1.5:
>>> list_from_cycle(ax._get_lines.color_cycle)
['r', 'g', 'b']
The simplest way possible I could find without doing the whole loop through the cycler is ax1.lines[-1].get_color().
How to access the color (and complete style) cycle?
The current state is stored in ax._get_lines.prop_cycler.
There are no built-in methods to expose the "base list" for a generic itertools.cycle, and in particular for ax._get_lines.prop_cycler (see below).
I have posted here a few functions to get info on an itertools.cycle.
One could then use
style_cycle = ax._get_lines.prop_cycler
curr_style = get_cycle_state(style_cycle) # <-- my (non-builtin) function
curr_color = curr_style['color']
to get the current color without changing the state of the cycle.
TL;DR
Where is the color (and complete style) cycle stored?
The style cycle is stored in two different places, one for the default, and one for the current axes (assuming import matplotlib.pyplot as plt and ax is an axis handler):
default_prop_cycler = plt.rcParams['axes.prop_cycle']
current_prop_cycle = ax._get_lines.prop_cycler
Note these have different classes.
The default is a "base cycle setting" and it does not know about any current state for any axes, while the current knows about the cycle to follow and its current state:
print('type(default_prop_cycler) =', type(default_prop_cycler))
print('type(current_prop_cycle) =', type(current_prop_cycle))
[]: type(default_prop_cycler) = <class 'cycler.Cycler'>
[]: type(current_prop_cycle) = <class 'itertools.cycle'>
The default cycle may have several keys (properties) to cycle, and one can get only the colors:
print('default_prop_cycler.keys =', default_prop_cycler.keys)
default_prop_cycler2 = plt.rcParams['axes.prop_cycle'].by_key()
print(default_prop_cycler2)
print('colors =', default_prop_cycler2['color'])
[]: default_prop_cycler.keys = {'color', 'linestyle'}
[]: {'color': ['r', 'g', 'b', 'y'], 'linestyle': ['-', '--', ':', '-.']}
[]: colors = ['r', 'g', 'b', 'y']
One could even change the cycler to use for a given axes, after defining that custom_prop_cycler, with
ax.set_prop_cycle(custom_prop_cycler)
But there are no built-in methods to expose the "base list" for a generic itertools.cycle, and in particular for ax._get_lines.prop_cycler.
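For completeness, here is a minimal sketch of what a custom_prop_cycler could look like. The colors and line styles are taken from the example output above; combining two cyclers with + is the cycler library's way of stepping through several properties in lockstep. The Agg backend is selected only so the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
from cycler import cycler

# Color and linestyle advance together, one pair per plotted line.
custom_prop_cycler = (cycler(color=['r', 'g', 'b', 'y']) +
                      cycler(linestyle=['-', '--', ':', '-.']))

fig, ax = plt.subplots()
ax.set_prop_cycle(custom_prop_cycler)

# Plot four lines and record the style each one picked up from the cycle.
lines = [ax.plot([0, 1], [i, i])[0] for i in range(4)]
styles = [(line.get_color(), line.get_linestyle()) for line in lines]

# The Cycler object (unlike the itertools.cycle on the axes) exposes its base lists.
base = custom_prop_cycler.by_key()
```

Keeping a reference to the Cycler you passed in is therefore the easiest way to retain access to the "base list" for an axes you configured yourself.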
In matplotlib version 2.2.3 there is a get_next_color() method on the _get_lines property:
from matplotlib import pyplot as plt
fig, ax = plt.subplots()
next_color = ax._get_lines.get_next_color()
get_next_color() returns an html color string, and advances the color cycle iterator.
minimal working example
I have struggled with this quite a few times already.
This is a minimal working example for Andi's answer.
code
import numpy as np
import matplotlib.pyplot as plt
xs = np.arange(10)
fig, ax = plt.subplots()
for ii in range(3):
    color = next(ax._get_lines.prop_cycler)['color']
    lbl = 'line {:d}, color {:}'.format(ii, color)
    ys = np.random.rand(len(xs))
    ax.plot(xs, ys, color=color, label=lbl)
ax.legend()

Locally weighted smoothing for binary valued random variable

I have a random variable as follows:
f(x) = 1 with probability g(x)
f(x) = 0 with probability 1-g(x)
where 0 < g(x) < 1.
Assume g(x) = x. Let's say I am observing this variable without knowing the function g and obtained 200 samples as follows:
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import binned_statistic
list = np.ndarray(shape=(200,2))
g = np.random.rand(200)
for i in range(len(g)):
    list[i] = (g[i], np.random.choice([0, 1], p=[1-g[i], g[i]]))
print(list)
plt.plot(list[:,0], list[:,1], 'o')
Plot of 0s and 1s
Now, I would like to retrieve the function g from these points. The best I could think of is to draw a histogram and use the mean statistic:
bin_means, bin_edges, bin_number = binned_statistic(list[:,0], list[:,1], statistic='mean', bins=10)
plt.hlines(bin_means, bin_edges[:-1], bin_edges[1:], lw=2)
Histogram mean statistics
Instead, I would like to have a continuous estimation of the generating function.
I guess it is about kernel density estimation but I could not find the appropriate pointer.
This is straightforward without explicitly fitting an estimator:
import seaborn as sns
g = sns.lmplot(x= , y= , y_jitter=.02 , logistic=True)
Plug in your exogenous variable for x= and, analogously, your dependent variable for y=. y_jitter jitters the points for better visibility if you have a lot of data points. logistic=True is the main point here: it gives you the logistic regression line of the data.
Seaborn is built on top of matplotlib and works well with pandas, in case you want to put your data in a DataFrame.
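If a locally weighted estimate is preferred over a parametric logistic fit, one option is a kernel-weighted running mean (a Nadaraya-Watson estimator). The sketch below is only an illustration under assumptions: the Gaussian kernel, the bandwidth of 0.1, and the helper name kernel_smooth are choices made here, not something from the question or from seaborn.

```python
import numpy as np


def kernel_smooth(x_obs, y_obs, x_eval, bandwidth=0.1):
    """Nadaraya-Watson estimate of E[y | x] using a Gaussian kernel."""
    # diffs[i, j]: scaled distance from evaluation point i to observation j
    diffs = (x_eval[:, None] - x_obs[None, :]) / bandwidth
    weights = np.exp(-0.5 * diffs**2)
    # Weighted mean of the 0/1 outcomes around each evaluation point
    return weights @ y_obs / weights.sum(axis=1)


rng = np.random.default_rng(0)
x_obs = rng.random(200)
y_obs = (rng.random(200) < x_obs).astype(float)  # 1 with probability g(x) = x

x_eval = np.linspace(0, 1, 101)
g_hat = kernel_smooth(x_obs, y_obs, x_eval)  # continuous estimate of g
```

Because the result is a weighted average of zeros and ones, it always stays between 0 and 1. The bandwidth trades smoothness against bias, so it is worth trying a few values; plt.plot(x_eval, g_hat) overlays the estimate on the scatter plot.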

nesting openmdao "assemblies"/drivers - working from a 0.13 analogy, is this possible to implement in 1.X?

I am using NREL's DAKOTA_driver openmdao plugin for parallelized Monte Carlo sampling of a model. In 0.X, I was able to nest assemblies, allowing an outer optimization driver to direct the DAKOTA_driver sampling evaluations. Is it possible for me to nest this setup within an outer optimizer? I would like the outer optimizer's workflow to call the DAKOTA_driver "assembly" then the get_dakota_output component.
import pandas as pd
import subprocess
from subprocess import call
import os
import numpy as np
from dakota_driver.driver import pydakdriver
from openmdao.api import IndepVarComp, Component, Problem, Group
from mpi4py import MPI
import sys
from itertools import takewhile
sigm = .005
n_samps = 20
X_bar=[0.065 , sigm] #2.505463e+03*.05]
dacout = 'dak.sout'
class get_dak_output(Component):
    mean_coe = 0
    def execute(self):
        comm = MPI.COMM_WORLD
        rank = comm.Get_rank()
        nam ='ape.net_aep'
        csize = 10000
        with open(dacout) as f:
            for i,l in enumerate(f):
                pass
        numlines = i
        dakchunks = pd.read_csv(dacout, skiprows=0, chunksize = csize, sep='there_are_no_seperators')
        linespassed = 0
        vals = []
        for dchunk in dakchunks:
            for line in dchunk.values:
                linespassed += 1
                if linespassed < 49 or linespassed > numlines - 50: continue
                else:
                    split_line = ''.join(str(s) for s in line).split()
                    if len(split_line)==2:
                        if (len(split_line) != 2 or
                                split_line[0] in ('nan', '-nan') or
                                split_line[1] != nam):
                            continue
                        else:vals.append(float(split_line[0]))
        self.coe_vals = sorted(vals)
        self.mean_coe = np.mean(self.coe_vals)
class ape(Component):
    def __init__(self):
        super(ape, self).__init__()
        self.add_param('x', val=0.0)
        self.add_output('net_aep', val=0.0)
    def solve_nonlinear(self, params, unknowns, resids):
        print 'hello'
        x = params['x']
        comm = MPI.COMM_WORLD
        rank = comm.Get_rank()
        outp = subprocess.check_output("python test/exampleCall.py %f"%(float(x)),
                                       shell=True)
        unknowns['net_aep'] = float(outp.split()[-1])
top = Problem()
root = top.root = Group()
root.add('ape', ape())
root.add('p1', IndepVarComp('x', 13.0))
root.connect('p1.x', 'ape.x')
drives = pydakdriver(name = 'top.driver')
drives.UQ('sampling', use_seed=False)
#drives.UQ()
top.driver = drives
#top.driver = ScipyOptimizer()
#top.driver.options['optimizer'] = 'SLSQP'
top.driver.add_special_distribution('p1.x','normal', mean=0.065, std_dev=0.01, lower_bounds=-50, upper_bounds=50)
top.driver.samples = n_samps
top.driver.stdout = dacout
#top.driver.add_desvar('p2.y', lower=-50, upper=50)
#top.driver.add_objective('ape.f_xy')
top.driver.add_objective('ape.net_aep')
top.setup()
top.run()
bak = get_dak_output()
bak.execute()
print('\n')
print('E(aep) is %f'%bak.mean_coe)
There are two different options for this situation. Both will work in parallel, and both can be currently supported. But only one of them will work when you want to use analytic derivatives:
1) Nested Problems: You create one problem class that has a DOE driver in it. You pass the list of cases you want run into that driver, and it runs them in parallel. Then you put that problem into a parent problem as a component.
The parent problem doesn't know that it has a sub-problem. It just thinks it has a single component that uses multiple processors.
This is the most similar way to how you would have done it in 0.x. However, I don't recommend going this route because it won't work if you ever want to use analytic derivatives.
If you use this way, the dakota driver can stay pretty much as is. But you'll have to use a special sub-problem class. This isn't an officially supported feature yet, but it's very doable.
2) Using a multi-point approach, you would create a Group class that represents your model. You would then create one instance of that group for each Monte Carlo run you want to do. You put all of these instances into a parallel group inside your overall problem.
This approach avoids the sub-problem messiness. It's also much more efficient for actual execution. It will have a somewhat greater setup cost than the first method. But in my opinion it's well worth the one-time setup cost to get the advantage of analytic derivatives. The only issue is that it would probably require some changes to the way the dakota_driver works. You would want to get a list of evaluations from the driver, then hand them out to the individual children groups.

Contour plotting orbitals in pyquante2 using matplotlib

I'm currently writing line and contour plotting functions for my PyQuante quantum chemistry package using matplotlib. I have some great functions that evaluate basis sets along a (npts,3) array of points, e.g.
from somewhere import basisset, line
bfs = basisset(h2) # Generate a basis set
points = line((0,0,-5),(0,0,5)) # Create a line in 3d space
bfmesh = bfs.mesh(points)
for i in range(bfmesh.shape[1]):
    plot(bfmesh[:,i])
This is fast because it evaluates all of the basis functions at once, and I got some great help from stackoverflow here and here to make them extra-nice.
I would now like to update this to do contour plotting as well. The slow way I've done this in the past is to create two one-d vectors using linspace(), mesh these into a 2D grid using meshgrid(), and then iterating over all xyz points and evaluating each one:
f = np.empty((50,50),dtype=float)
xvals = np.linspace(0,10)
yvals = np.linspace(0,20)
z = 0
for i, x in enumerate(xvals):
    for j, y in enumerate(yvals):
        f[j, i] = bf(x,y,z)  # rows follow y, columns follow x, matching meshgrid
X,Y = np.meshgrid(xvals,yvals)
contourplot(X,Y,f)
(this isn't real code -- may have done something dumb)
What I would like to do is to generate the mesh in more or less the same way I do in the contour plot example, "unravel" it to a (npts,3) list of points, evaluate the basis functions using my new fast routines, then "re-ravel" it back to X,Y matrices for plotting with contourplot.
The problem is that I don't have anything that I can simply call .ravel() on: I have the 1d meshes of xvals and yvals, the 2D versions X,Y, and the single z value.
Can anyone think of a nice, pythonic way to do this?
If you can express f as a function of X and Y, you could avoid the Python for-loops this way:
import matplotlib.pyplot as plt
import numpy as np
def bf(x, y):
    return np.sin(np.sqrt(x**2+y**2))
xvals = np.linspace(0,10)
yvals = np.linspace(0,20)
X, Y = np.meshgrid(xvals,yvals)
f = bf(X,Y)
plt.contour(X,Y,f)
plt.show()
yields a contour plot of f over the grid.
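When the evaluation routine only accepts an (npts, 3) array of points, as described in the question, the 2D grid can be flattened, evaluated in one call, and reshaped back. In this sketch evaluate_on_points is a hypothetical stand-in for bfs.mesh(points); the real routine returns one column per basis function, in which case each column would be reshaped the same way.

```python
import numpy as np


def evaluate_on_points(points):
    """Hypothetical stand-in for bfs.mesh(points): maps (npts, 3) to (npts,)."""
    x, y, z = points.T
    return np.sin(np.sqrt(x**2 + y**2 + z**2))


xvals = np.linspace(0, 10)
yvals = np.linspace(0, 20)
z = 0.0
X, Y = np.meshgrid(xvals, yvals)

# "Unravel": one row per grid point, with the constant z as the third column.
points = np.column_stack([X.ravel(), Y.ravel(), np.full(X.size, z)])

# "Re-ravel": ravel() and reshape() use the same element order, so the
# values land back on the grid positions they came from.
f = evaluate_on_points(points).reshape(X.shape)
```

The resulting f has the same shape as X and Y, so it can be handed straight to plt.contour(X, Y, f).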

pandas access axis by user-defined name

I am wondering whether there is any way to access axes of pandas containers (DataFrame, Panel, etc...) by user-defined name instead of integer or "index", "columns", "minor_axis" etc...
For example, with the following data container:
df = DataFrame(randn(3,2),columns=['c1','c2'],index=['i1','i2','i3'])
df.index.name = 'myaxis1'
df.columns.name = 'myaxis2'
I would like to do this:
df.sum(axis='myaxis1')
df.xs('c1', axis='myaxis2') # cross section
Also very useful would be:
df.reshape(['myaxis2','myaxis1'])
(in this case not so relevant, but it could become so if the dimension increases)
The reason is that I work a lot with multi-dimensional arrays of varying dimensions, like "time", "variable", "percentile" etc., and the same piece of code is often applied to objects which can be DataFrame, Panel or even Panel4D or DataFrame with MultiIndex. For now I often test the shape of the object, or the general settings of the script, in order to know which axis is the relevant one to compute a sum or mean. But I think it would be much more convenient to forget about how the container is implemented in detail (DataFrame, Panel etc.), and simply think about the nature of the problem (say I want to average over time; I do not want to think about whether I work in "probabilistic" mode with several percentiles, or in "deterministic" mode with a single time series).
Writing this post I have (re)discovered the very useful axes attribute. The above code could be translated into:
nms = [ax.name for ax in df.axes]
axid1 = nms.index('myaxis1')
axid2 = nms.index('myaxis2')
df.sum(axis=axid1)
df.xs('c1', axis=axid2) # cross section
and the "reshape" feature (does not apply to 3-d case though...):
newshape = ['myaxis2','myaxis1']
axid = [nms.index(nm) for nm in newshape]
df.swapaxes(*axid)
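The name lookup above can be wrapped in a small helper so that calling code only ever mentions axis names. This is a sketch for the DataFrame case only; axis_number is a hypothetical helper name, not a pandas function, and the data values are chosen just to make the result easy to check.

```python
import numpy as np
import pandas as pd


def axis_number(obj, name):
    """Return the integer axis of `obj` whose index is named `name`."""
    names = [ax.name for ax in obj.axes]
    return names.index(name)  # raises ValueError if no axis has that name


df = pd.DataFrame(np.arange(6.0).reshape(3, 2),
                  columns=['c1', 'c2'], index=['i1', 'i2', 'i3'])
df.index.name = 'myaxis1'
df.columns.name = 'myaxis2'

# Sum along the axis called 'myaxis1' (the rows), leaving one value per column.
col_sums = df.sum(axis=axis_number(df, 'myaxis1'))

# Cross section of column 'c1', addressed through the axis called 'myaxis2'.
c1 = df.xs('c1', axis=axis_number(df, 'myaxis2'))
```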
Well, I have to admit that I have found these solutions while writing this post (and this is already very convenient), but it could be generalized to account for DataFrame (or other) with MultiIndex axes, do a search on all axes and labels...
In my opinion it would be a major improvement to the user-friendliness of pandas (ok, forgetting about the actual structure could have a performance cost, but the user worried about performance can be careful in how he/she organizes the data).
What do you think?
This is still experimental, but look at this page:
http://pandas.pydata.org/pandas-docs/dev/dsintro.html#panelnd-experimental
import pandas
import numpy as np
from pandas.core import panelnd
MyPanel4D = panelnd.create_nd_panel_factory(
    klass_name = 'MyPanel4D',
    axis_orders = ['axis4', 'axis3', 'axis2', 'axis1'],
    axis_slices = {'axis3': 'items',
                   'axis2': 'major_axis',
                   'axis1': 'minor_axis'},
    slicer = 'Panel',
    stat_axis=2)
mp4d = MyPanel4D(np.random.rand(5,4,3,2))
print mp4d
Results in this
<class 'pandas.core.panelnd.MyPanel4D'>
Dimensions: 5 (axis4) x 4 (axis3) x 3 (axis2) x 2 (axis1)
Axis4 axis: 0 to 4
Axis3 axis: 0 to 3
Axis2 axis: 0 to 2
Axis1 axis: 0 to 1
Here's the caveat: when you slice it like mp4d[0] you are going to get back a Panel, unless you create a hierarchy of custom objects (unfortunately, support for 'renaming' Panel/DataFrame will need to wait for 0.12-dev; it's non-trivial and there haven't been any requests).
So for higher-dimensional objects you can impose your own name structure. The axis aliasing should work like you are suggesting, but I think there are some bugs there.