I search a way to find all the vector from a np.meshgrid(xrange, xrange, xrange) that are related by k = -k.
For the moment I do that :
#numba.njit
def find_pairs(array):
boolean = np.ones(len(array), dtype=np.bool_)
pairs = []
idx = [i for i in range(len(array))]
while len(idx) > 1:
e1 = idx[0]
for e2 in idx:
if (array[e1] == -array[e2]).all():
boolean[e2] = False
pairs.append([e1, e2])
idx.remove(e1)
if e2 != e1:
idx.remove(e2)
break
return boolean, pairs
# Give array of 3D vectors
krange = np.fft.fftfreq(N)
comb_array = np.array(np.meshgrid(krange, krange, krange)).T.reshape(-1, 3)
# Take idx of the pairs k, -k vector and boolean selection that give position of -k vectors
boolean, pairs = find_pairs(array)
It works but the execution time grow rapidly with N...
Maybe someone has already deal with that?
The main problem is that comb_array has a shape of (R, 3) where R = N**3 and the nested loop in find_pairs runs at least in quadratic time since idx.remove runs in linear time and is called in the for loop. Moreover, there are cases where the for loop does not change the size of idx and the loop appear to run forever (eg. with N=4).
One solution to solve this problem in O(R log R) is to sort the array and then check for opposite values in linear time:
import numpy as np
import numba as nb
# Give array of 3D vectors
krange = np.fft.fftfreq(N)
comb_array = np.array(np.meshgrid(krange, krange, krange)).T.reshape(-1, 3)
# Sorting
packed = comb_array.view([('x', 'f8'), ('y', 'f8'), ('z', 'f8')])
idx = np.argsort(packed, axis=0).ravel()
sorted_comb = comb_array[idx]
# Find pairs
#nb.njit
def findPairs(sorted_comb, idx):
n = idx.size
boolean = np.zeros(n, dtype=np.bool_)
pairs = []
cur = n-1
for i in range(n):
while cur >= i:
if np.all(sorted_comb[i] == -sorted_comb[cur]):
boolean[idx[i]] = True
pairs.append([idx[i], idx[cur]])
cur -= 1
break
cur -= 1
return boolean, pairs
findPairs(sorted_comb, idx)
Note that the algorithm assume that for each row, there are only up to one valid matching pair. If there are several equal rows, they are paired 2 by two. If your goal is to extract all the combination of equal rows in this case, then please note that the output will grow exponentially (which is not reasonable IMHO).
This solution is pretty fast even for N = 100. Most of the time is spent in the sort that is not very efficient (unfortunately Numpy does not provide a way to do a lexicographic argsort of the row efficiently yet though this operation is fundamentally expensive).
def signed_angle_between_vecs(target_vec, start_vec, plane_normal=None):
start_vec = np.array(start_vec)
target_vec = np.array(target_vec)
start_vec = start_vec/np.linalg.norm(start_vec)
target_vec = target_vec/np.linalg.norm(target_vec)
if plane_normal is None:
arg1 = np.dot(np.cross(start_vec, target_vec), np.cross(start_vec, target_vec))
else:
arg1 = np.dot(np.cross(start_vec, target_vec), plane_normal)
arg2 = np.dot(start_vec, target_vec)
return np.arctan2(arg1, arg2)
from scipy.spatial.transform import Rotation as R
world_frame_axis = input_rotation_object.apply(canonical_axis)
angle = signed_angle_between_vecs(canonical_axis, world_frame_axis)
axis_angle = np.cross(world_frame_axis, canonical_axis) * angle
C = R.from_rotvec(axis_angle)
transformed_world_frame_axis_to_canonical = C.apply(world_frame_axis)
I am trying to align world_frame_axis to canonical_axis by performing a rotation around the normal vector generated by the cross product between the two vectors, using the signed angle between the two axes.
However, this code does not work. If you start with some arbitrary rotation as input_rotation_object you will see that transformed_world_frame_axis_to_canonical does not match canonical_axis.
What am I doing wrong?
not a python coder so I might be wrong but this looks suspicious:
start_vec = start_vec/np.linalg.norm(start_vec)
from the names I would expect that np.linalg.norm normalizes the vector already so the line should be:
start_vec = np.linalg.norm(start_vec)
and all the similar lines too ...
Also the atan2 operands are not looking right to me. I would (using math):
a = start_vec / |start_vec | // normalized start
b = target_vec / |target_vec| // normalized end
u = a // normalized one axis of plane
v = cross(u ,b)
v = cross(v ,u)
v = v / |v| // normalized second axis of plane perpendicular to u
dx = dot(u,b) // target vector in 2D aligned to start
dy = dot(v,b)
ang = atan2(dy,dx)
beware the ang might negated (depending on your notations) if the case either add minus sign or reverse the order in cross(u,v) to cross(v,u) Also you can do sanity check with comparing result to unsigned:
ang' = acos(dot(a,b))
in absolute values they should be the same (+/- rounding error).
I'm trying to come up with a function that plots n points inside the unit circle, but I need them to be sufficiently spread out.
ie. something that looks like this:
Is it possible to write a function with two parameters, n (number of points) and min_d (minimum distance apart) such that the points are:
a) equidistant
b) no pairwise distance exceeds a given min_d
The problem with sampling from a uniform distribution is that it could happen that two points are almost on top of each other, which I do not want to happen. I need this kind of input for a network diagram representing node clusters.
EDIT: I have found an answer to a) here: Generator of evenly spaced points in a circle in python, but b) still eludes me.
At the time this answer was provided, the question asked for random numbers. This answer thus gives a solution drawing random numbers. It ignores any edits made to the question afterwards.
On may simply draw random points and for each one check if the condition of the minimum distance is fulfilled. If not, the point can be discarded. This can be done until a list is filled with enough points or some break condition is met.
import numpy as np
import matplotlib.pyplot as plt
class Points():
def __init__(self,n=10, r=1, center=(0,0), mindist=0.2, maxtrials=1000 ) :
self.success = False
self.n = n
self.r = r
self.center=np.array(center)
self.d = mindist
self.points = np.ones((self.n,2))*10*r+self.center
self.c = 0
self.trials = 0
self.maxtrials = maxtrials
self.tx = "rad: {}, center: {}, min. dist: {} ".format(self.r, center, self.d)
self.fill()
def dist(self, p, x):
if len(p.shape) >1:
return np.sqrt(np.sum((p-x)**2, axis=1))
else:
return np.sqrt(np.sum((p-x)**2))
def newpoint(self):
x = (np.random.rand(2)-0.5)*2
x = x*self.r-self.center
if self.dist(self.center, x) < self.r:
self.trials += 1
if np.all(self.dist(self.points, x) > self.d):
self.points[self.c,:] = x
self.c += 1
def fill(self):
while self.trials < self.maxtrials and self.c < self.n:
self.newpoint()
self.points = self.points[self.dist(self.points,self.center) < self.r,:]
if len(self.points) == self.n:
self.success = True
self.tx +="\n{} of {} found ({} trials)".format(len(self.points),self.n,self.trials)
def __repr__(self):
return self.tx
center =(0,0)
radius = 1
p = Points(n=40,r=radius, center=center)
fig, ax = plt.subplots()
x,y = p.points[:,0], p.points[:,1]
plt.scatter(x,y)
ax.add_patch(plt.Circle(center, radius, fill=False))
ax.set_title(p)
ax.relim()
ax.autoscale_view()
ax.set_aspect("equal")
plt.show()
If the number of points should be fixed, you may try to run find this number of points for decreasing distances until the desired number of points are found.
In the following case, we are looking for 60 points and start with a minimum distance of 0.6 which we decrease stepwise by 0.05 until there is a solution found. Note that this will not necessarily be the optimum solution, as there is only maxtrials of retries in each step. Increasing maxtrials will of course bring us closer to the optimum but requires more runtime.
center =(0,0)
radius = 1
mindist = 0.6
step = 0.05
success = False
while not success:
mindist -= step
p = Points(n=60,r=radius, center=center, mindist=mindist)
print p
if p.success:
break
fig, ax = plt.subplots()
x,y = p.points[:,0], p.points[:,1]
plt.scatter(x,y)
ax.add_patch(plt.Circle(center, radius, fill=False))
ax.set_title(p)
ax.relim()
ax.autoscale_view()
ax.set_aspect("equal")
plt.show()
Here the solution is found for a minimum distance of 0.15.
Given an array of 2D points (#pts x 2) and an array of which points are connected to which (#bonds x 2 int array with indices of pts), how can I efficiently return an array of polygons formed from the bonds?
There can be 'dangling' bonds (like in the top left of the image below) that don't close a polygon, and these should be ignored.
Here's an example:
import numpy as np
xy = np.array([[2.72,-2.976], [2.182,-3.40207],
[-3.923,-3.463], [2.1130,4.5460], [2.3024,3.4900], [.96979,-.368],
[-2.632,3.7555], [-.5086,.06170], [.23409,-.6588], [.20225,-.9540],
[-.5267,-1.981], [-2.190,1.4710], [-4.341,3.2331], [-3.318,3.2654],
[.58510,4.1406], [.74331,2.9556], [.39622,3.6160], [-.8943,1.0643],
[-1.624,1.5259], [-1.414,3.5908], [-1.321,3.6770], [1.6148,1.0070],
[.76172,2.4627], [.76935,2.4838], [3.0322,-2.124], [1.9273,-.5527],
[-2.350,-.8412], [-3.053,-2.697], [-1.945,-2.795], [-1.905,-2.767],
[-1.904,-2.765], [-3.546,1.3208], [-2.513,1.3117], [-2.953,-.5855],
[-4.368,-.9650]])
BL= np.array([[22,23], [28,29], [8,9],
[12,31], [18,19], [31,32], [3,14],
[32,33], [24,25], [10,30], [15,23],
[5,25], [12,13], [0,24], [27,28],
[15,16], [5,8], [0,1], [11,18],
[2,27], [11,13], [33,34], [26,33],
[29,30], [7,17], [9,10], [26,30],
[17,22], [5,21], [19,20], [17,18],
[14,16], [7,26], [21,22], [3,4],
[4,15], [11,32], [6,19], [6,13],
[16,20], [27,34], [7,8], [1,9]])
I can't tell you how to implement it with numpy, but here's an outline of a possible algorithm:
Add a list of attached bonds to each point.
Remove the points that have only one bond attached, remove this bond as well (these are the dangling bonds)
Attach two boolean markers to each bond, indicating if the bond has already been added to a polygon in one of the two possible directions. Each bond can only be used in two polygons. Initially set all markers to false.
Select any initial point and repeat the following step until all bonds have been used in both directions:
Select a bond that has not been used (in the respective direction). This is the first edge of the polygon. Of the bonds attached to the end point of the selected one, choose the one with minimal angle in e.g. counter-clockwise direction. Add this to the polygon and continue until you return to the initial point.
This algorithm will also produce a large polygon containing all the outer bonds of the network. I guess you will find a way to recognize this one and remove it.
For future readers, the bulk of the implementation of Frank's suggestion in numpy is below. The extraction of the boundary follows essentially the same algorithm as walking around a polygon, except using the minimum angle bond, rather than the max.
def extract_polygons_lattice(xy, BL, NL, KL):
''' Extract polygons from a lattice of points.
Parameters
----------
xy : NP x 2 float array
points living on vertices of dual to triangulation
BL : Nbonds x 2 int array
Each row is a bond and contains indices of connected points
NL : NP x NN int array
Neighbor list. The ith row has neighbors of the ith particle, padded with zeros
KL : NP x NN int array
Connectivity list. The ith row has ones where ith particle is connected to NL[i,j]
Returns
----------
polygons : list
list of lists of indices of each polygon
PPC : list
list of patches for patch collection
'''
NP = len(xy)
NN = np.shape(KL)[1]
# Remove dangling bonds
# dangling bonds have one particle with only one neighbor
finished_dangles = False
while not finished_dangles:
dangles = np.where([ np.count_nonzero(row)==1 for row in KL])[0]
if len(dangles) >0:
# Make sorted bond list of dangling bonds
dpair = np.sort(np.array([ [d0, NL[d0,np.where(KL[d0]!=0)[0]] ] for d0 in dangles ]), axis=1)
# Remove those bonds from BL
BL = setdiff2d(BL,dpair.astype(BL.dtype))
print 'dpair = ', dpair
print 'ending BL = ', BL
NL, KL = BL2NLandKL(BL,NP=NP,NN=NN)
else:
finished_dangles = True
# bond markers for counterclockwise, clockwise
used = np.zeros((len(BL),2), dtype = bool)
polygons = []
finished = False
while (not finished) and len(polygons)<20:
# Check if all bond markers are used in order A-->B
todoAB = np.where(~used[:,0])[0]
if len(todoAB) > 0:
bond = BL[todoAB[0]]
# bb will be list of polygon indices
# Start with orientation going from bond[0] to bond[1]
nxt = bond[1]
bb = [ bond[0], nxt ]
dmyi = 1
# as long as we haven't completed the full outer polygon, add next index
while nxt != bond[0]:
n_tmp = NL[ nxt, np.argwhere(KL[nxt]).ravel()]
# Exclude previous boundary particle from the neighbors array, unless its the only one
# (It cannot be the only one, if we removed dangling bonds)
if len(n_tmp) == 1:
'''The bond is a lone bond, not part of a triangle.'''
neighbors = n_tmp
else:
neighbors = np.delete(n_tmp, np.where(n_tmp == bb[dmyi-1])[0])
angles = np.mod( np.arctan2(xy[neighbors,1]-xy[nxt,1],xy[neighbors,0]-xy[nxt,0]).ravel() \
- np.arctan2( xy[bb[dmyi-1],1]-xy[nxt,1], xy[bb[dmyi-1],0]-xy[nxt,0] ).ravel(), 2*np.pi)
nxt = neighbors[angles == max(angles)][0]
bb.append( nxt )
# Now mark the current bond as used
thisbond = [bb[dmyi-1], bb[dmyi]]
# Get index of used matching thisbond
mark_used = np.where((BL == thisbond).all(axis=1))
if len(mark_used)>0:
#print 'marking bond [', thisbond, '] as used'
used[mark_used,0] = True
else:
# Used this bond in reverse order
used[mark_used,1] = True
dmyi += 1
polygons.append(bb)
else:
# Check for remaining bonds unused in reverse order (B-->A)
todoBA = np.where(~used[:,1])[0]
if len(todoBA) >0:
bond = BL[todoBA[0]]
# bb will be list of polygon indices
# Start with orientation going from bond[0] to bond[1]
nxt = bond[0]
bb = [ bond[1], nxt ]
dmyi = 1
# as long as we haven't completed the full outer polygon, add nextIND
while nxt != bond[1]:
n_tmp = NL[ nxt, np.argwhere(KL[nxt]).ravel()]
# Exclude previous boundary particle from the neighbors array, unless its the only one
# (It cannot be the only one, if we removed dangling bonds)
if len(n_tmp) == 1:
'''The bond is a lone bond, not part of a triangle.'''
neighbors = n_tmp
else:
neighbors = np.delete(n_tmp, np.where(n_tmp == bb[dmyi-1])[0])
angles = np.mod( np.arctan2(xy[neighbors,1]-xy[nxt,1],xy[neighbors,0]-xy[nxt,0]).ravel() \
- np.arctan2( xy[bb[dmyi-1],1]-xy[nxt,1], xy[bb[dmyi-1],0]-xy[nxt,0] ).ravel(), 2*np.pi)
nxt = neighbors[angles == max(angles)][0]
bb.append( nxt )
# Now mark the current bond as used --> note the inversion of the bond order to match BL
thisbond = [bb[dmyi], bb[dmyi-1]]
# Get index of used matching [bb[dmyi-1],nxt]
mark_used = np.where((BL == thisbond).all(axis=1))
if len(mark_used)>0:
used[mark_used,1] = True
dmyi += 1
polygons.append(bb)
else:
# All bonds have been accounted for
finished = True
# Check for duplicates (up to cyclic permutations) in polygons
# Note that we need to ignore the last element of each polygon (which is also starting pt)
keep = np.ones(len(polygons),dtype=bool)
for ii in range(len(polygons)):
polyg = polygons[ii]
for p2 in polygons[ii+1:]:
if is_cyclic_permutation(polyg[:-1],p2[:-1]):
keep[ii] = False
polygons = [polygons[i] for i in np.where(keep)[0]]
# Remove the polygon which is the entire lattice boundary, except dangling bonds
boundary = extract_boundary_from_NL(xy,NL,KL)
print 'boundary = ', boundary
keep = np.ones(len(polygons),dtype=bool)
for ii in range(len(polygons)):
polyg = polygons[ii]
if is_cyclic_permutation(polyg[:-1],boundary.tolist()):
keep[ii] = False
elif is_cyclic_permutation(polyg[:-1],boundary[::-1].tolist()):
keep[ii] = False
polygons = [polygons[i] for i in np.where(keep)[0]]
# Prepare a polygon patch collection
PPC = []
for polyINDs in polygons:
pp = Path(xy[polyINDs],closed=True)
ppp = patches.PathPatch(pp, lw=2)
PPC.append(ppp)
return polygons, PPC
I asked this question yesterday about storing a plot within an object. I tried implementing the first approach (aware that I did not specify that I was using qplot() in my original question) and noticed that it did not work as expected.
library(ggplot2) # add ggplot2
string = "C:/example.pdf" # Setup pdf
pdf(string,height=6,width=9)
x_range <- range(1,50) # Specify Range
# Create a list to hold the plot objects.
pltList <- list()
pltList[]
for(i in 1 : 16){
# Organise data
y = (1:50) * i * 1000 # Get y col
x = (1:50) # get x col
y = log(y) # Use natural log
# Regression
lm.0 = lm(formula = y ~ x) # make linear model
inter = summary(lm.0)$coefficients[1,1] # Get intercept
slop = summary(lm.0)$coefficients[2,1] # Get slope
# Make plot name
pltName <- paste( 'a', i, sep = '' )
# make plot object
p <- qplot(
x, y,
xlab = "Radius [km]",
ylab = "Services [log]",
xlim = x_range,
main = paste("Sample",i)
) + geom_abline(intercept = inter, slope = slop, colour = "red", size = 1)
print(p)
pltList[[pltName]] = p
}
# close the PDF file
dev.off()
I have used sample numbers in this case so the code runs if it is just copied. I did spend a few hours puzzling over this but I cannot figure out what is going wrong. It writes the first set of pdfs without problem, so I have 16 pdfs with the correct plots.
Then when I use this piece of code:
string = "C:/test_tabloid.pdf"
pdf(string, height = 11, width = 17)
grid.newpage()
pushViewport( viewport( layout = grid.layout(3, 3) ) )
vplayout <- function(x, y){viewport(layout.pos.row = x, layout.pos.col = y)}
counter = 1
# Page 1
for (i in 1:3){
for (j in 1:3){
pltName <- paste( 'a', counter, sep = '' )
print( pltList[[pltName]], vp = vplayout(i,j) )
counter = counter + 1
}
}
dev.off()
the result I get is the last linear model line (abline) on every graph, but the data does not change. When I check my list of plots, it seems that all of them become overwritten by the most recent plot (with the exception of the abline object).
A less important secondary question was how to generate a muli-page pdf with several plots on each page, but the main goal of my code was to store the plots in a list that I could access at a later date.
Ok, so if your plot command is changed to
p <- qplot(data = data.frame(x = x, y = y),
x, y,
xlab = "Radius [km]",
ylab = "Services [log]",
xlim = x_range,
ylim = c(0,10),
main = paste("Sample",i)
) + geom_abline(intercept = inter, slope = slop, colour = "red", size = 1)
then everything works as expected. Here's what I suspect is happening (although Hadley could probably clarify things). When ggplot2 "saves" the data, what it actually does is save a data frame, and the names of the parameters. So for the command as I have given it, you get
> summary(pltList[["a1"]])
data: x, y [50x2]
mapping: x = x, y = y
scales: x, y
faceting: facet_grid(. ~ ., FALSE)
-----------------------------------
geom_point:
stat_identity:
position_identity: (width = NULL, height = NULL)
mapping: group = 1
geom_abline: colour = red, size = 1
stat_abline: intercept = 2.55595281266726, slope = 0.05543539319091
position_identity: (width = NULL, height = NULL)
However, if you don't specify a data parameter in qplot, all the variables get evaluated in the current scope, because there is no attached (read: saved) data frame.
data: [0x0]
mapping: x = x, y = y
scales: x, y
faceting: facet_grid(. ~ ., FALSE)
-----------------------------------
geom_point:
stat_identity:
position_identity: (width = NULL, height = NULL)
mapping: group = 1
geom_abline: colour = red, size = 1
stat_abline: intercept = 2.55595281266726, slope = 0.05543539319091
position_identity: (width = NULL, height = NULL)
So when the plot is generated the second time around, rather than using the original values, it uses the current values of x and y.
I think you should use the data argument in qplot, i.e., store your vectors in a data frame.
See Hadley's book, Section 4.4:
The restriction on the data is simple: it must be a data frame. This is restrictive, and unlike other graphics packages in R. Lattice functions can take an optional data frame or use vectors directly from the global environment. ...
The data is stored in the plot object as a copy, not a reference. This has two
important consequences: if your data changes, the plot will not; and ggplot2 objects are entirely self-contained so that they can be save()d to disk and later load()ed and plotted without needing anything else from that session.
There is a bug in your code concerning list subscripting. It should be
pltList[[pltName]]
not
pltList[pltName]
Note:
class(pltList[1])
[1] "list"
pltList[1] is a list containing the first element of pltList.
class(pltList[[1]])
[1] "ggplot"
pltList[[1]] is the first element of pltList.
For your second question: Multi-page pdfs are easy -- see help(pdf):
onefile: logical: if true (the default) allow multiple figures in one
file. If false, generate a file with name containing the
page number for each page. Defaults to ‘TRUE’.
For your main question, I don't understand if you want to store the plot inputs in a list for later processing, or the plot outputs. If it is the latter, I am not sure that plot() returns an object you can store and retrieve.
Another suggestion regarding your second question would be to use either Sweave or Brew as they will give you complete control over how you display your multi-page pdf.
Have a look at this related question.