Graph drawing using graph-tool

Has anyone used the drawing capabilities of graph-tool and run into the issue of overlapping nodes after calculating the layout in various ways?
On the same note, has anyone found a solution for increasing the size of some of the nodes, say based on their degree, while ensuring that they won't then overlap with other nodes?

For degree-based node sizes, you can define a vertex property in the graph. If you have a dictionary containing the degree of each node, for instance, you can do:
import graph_tool as gt
from graph_tool.draw import sfdp_layout, graph_draw

G = gt.Graph(directed=False)
v_size = G.new_vertex_property("int")

# `nodes` is your list of node ids and `degree` your dict mapping id -> degree
for n in nodes:
    v = G.add_vertex()
    v_size[v] = degree[n]

pos = sfdp_layout(G)
graph_draw(G, pos,
           vertex_size=v_size,
           output="graph.png")
Hope that helps.
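As a possible shortcut (a sketch, not part of the original answer): graph-tool can build the degree map for you with degree_property_map, and prop_to_size rescales it into a reasonable pixel range. Increasing sfdp_layout's preferred edge length K spreads vertices apart, which can reduce, though not strictly prevent, overlap between enlarged nodes:
import graph_tool.all as gt

# Example graph from graph-tool's built-in collection; substitute your own.
G = gt.collection.data["karate"]

deg = G.degree_property_map("total")        # vertex degree as a property map
v_size = gt.prop_to_size(deg, mi=5, ma=25)  # rescale degrees into [5, 25] px

# A larger preferred edge length K spreads vertices out, which can reduce
# (but does not guarantee the absence of) node overlap.
pos = gt.sfdp_layout(G, K=3)
gt.graph_draw(G, pos, vertex_size=v_size, output="graph_degree.png")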

Related

How to vertically align dagre parent nodes in cytoscape.js

I have a graph made in cytoscape.js, using the dagre extension, that looks like this:
[screenshot of the graph]
How can I get the parent nodes to line up vertically? Since applying a separate layout to only the parent nodes does not work (it applies to all nodes), I am stumped.
Unfortunately, these layout extensions are poorly maintained, so they lack many features.
I suggest opening an issue in the algorithm's repository explaining how it could be improved; in this case, that parent nodes should be aligned for a better-looking visualization:
https://github.com/cytoscape/cytoscape.js-dagre
You could also contribute to the dagre project and add this aesthetic criterion yourself.
Failing that, you can tweak the node positions after the layout has run: devise an algorithm that lines the parent nodes up vertically and apply it in code.
For example, to keep nodes close to their parents while maintaining a good aspect ratio, you can sort the nodes at level n+1 by the barycenter (average x position) of their parents at level n (see the sketch after the code below).
I saw from the screenshot that you have groups, and the nodes within a group have different parents, so if you align nodes directly under their parents you could end up with
nodes that overlap
groups that overlap
groups that are too wide
(let me know if I have made the problem clear)
As a reminder, here is how to position nodes in cytoscape.js:
cyGraph.startBatch(); // batch the changes and apply them once at endBatch()

// Random layout shown as a placeholder; substitute your own positions.
cyGraph.nodes().positions((node, i) => {
  return {
    x: Math.random() * cyGraph.width(),
    y: Math.random() * cyGraph.height(),
  };
});

cyGraph.endBatch();
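To make the barycenter suggestion above concrete, here is a small sketch of the ordering step (written in Python for readability; the data structures levels, parents, and x are hypothetical, since the answer gives no code for this):
# levels[n] is the list of node ids at level n; parents[v] lists v's
# parents one level up; x[v] is the node's current x-coordinate.
def order_by_barycenter(levels, parents, x, spacing=50):
    for n in range(1, len(levels)):
        # barycenter = average x of a node's parents at the level above
        bary = {v: sum(x[p] for p in parents[v]) / len(parents[v])
                for v in levels[n]}
        # sort the level by barycenter, then assign evenly spaced x slots
        levels[n].sort(key=lambda v: bary[v])
        for i, v in enumerate(levels[n]):
            x[v] = i * spacing
    return x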

How to speed up graph coloring problem in python PuLP

I am trying to solve the classic graph coloring problem using python PuLP. We have n nodes, a collection of edges in the form edges = [(node1, node2), (node2, node4), ...], and we are trying to find the minimum number of node colors so that no connected nodes share a color.
My implementation works, but is slow. It is made of three constraints, plus the one optimization of initializing node0 to color 0 to somewhat limit the search space. The code is as follows:
from pulp import LpProblem, LpMinimize, LpVariable, lpSum

nodes = range(node_count)
n_colors = 10
# colors = range(node_count)  # upper bound: every node gets its own color
colors = range(n_colors)

prob = LpProblem("coloring", LpMinimize)

# variable x[n][c] is 1 if node n has color c
xnc = LpVariable.dicts("x", (nodes, colors), cat='Binary')

# indicator variables marking which colors were used
used_colors = LpVariable.dicts("used", colors, cat='Binary')

# minimize the colors used, weighting each color by its integer value
prob += lpSum([used_colors[c] * c for c in colors])
# prob += lpSum([used_colors[c] for c in colors])  # plain-count alternative

# fix the first node to color 0 to constrain the starting point
prob += xnc[0][0] == 1

# every node uses exactly one color
for n in nodes:
    prob += lpSum([xnc[n][c] for c in colors]) == 1

# connected nodes have different colors
for e in edges:
    e1, e2 = e[0], e[1]
    for c in colors:
        prob += xnc[e1][c] + xnc[e2][c] <= 1

# mark a color as used if any node has that color
for n in nodes:
    for c in colors:
        prob += xnc[n][c] <= used_colors[c]

prob.solve()
I see that there are symmetries, and I know I could reduce them by requiring that any newly used color be at most max(colors_already_used) + 1, so that if node 0 has color 0, node 1 either has the same color or color 1. But I am not sure how to encode this, because max is not allowed given the linear nature of the problem in PuLP, as far as I know. I achieve a similar effect above by weighting each used color by its integer value, which speeds things up a bit, but I do not think it works as the efficient, deterministic constraint I am seeking.
Also, limiting the number of colors seems to have a nice effect on speed, but I am not sure it is worth the preprocessing cost of finding a heuristic bound before starting the optimization, since it is not clear in advance how many colors will be needed.
What other constraints could I add, or what other ways could I speed it up? I am mostly interested in better ways to formulate the problem, but I am also open to computational optimizations, e.g. parallelization, if they can be done in PuLP.
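A standard way to encode that restriction without a max() (a sketch using the variable names from the code above; this is a textbook symmetry-breaking device, not something from the post) is to order the colors so that color c can be used only if color c-1 is also used:
# Symmetry breaking: color c may be used only if color c-1 is used.
# The solver can then no longer permute color labels, so "use colors
# 0..k-1" is the only representative of each k-coloring.
for c in colors:
    if c >= 1:
        prob += used_colors[c] <= used_colors[c - 1]

# With this ordering in place, the objective can simply count colors:
# prob += lpSum([used_colors[c] for c in colors])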

How can I find if a point lies inside or outside of ConvexHull?

I have an array of points in numpy and I compute their convex hull with scipy.spatial.ConvexHull (not scipy.spatial.Delaunay.convex_hull). Now, if I add a new point to my array, how can I tell whether it lies inside the convex hull of the point cloud or not? What is the best way to do this in numpy?
I should mention that I saw this question and it does not solve my problem (it uses scipy.spatial.Delaunay.convex_hull to compute the hull).
I know this question is old, but in case some people find it as I did, here is the answer:
I'm using scipy 1.1.0 and python 3.6
import numpy as np
from scipy.spatial import ConvexHull

def isInHull(P, hull):
    '''
    Determine which of the points in P lie inside the hull.
    :param P: an m x n array of m points in n dimensions
    :param hull: a scipy.spatial.ConvexHull object
    :return: list of booleans, True where the point is inside the hull
    '''
    A = hull.equations[:, 0:-1]
    b = np.transpose(np.array([hull.equations[:, -1]]))
    return np.all((A @ np.transpose(P)) <= np.tile(-b, (1, len(P))), axis=0)
Here we use the facet equations of all the hull's planes to determine whether each point lies outside the hull; we never build a Delaunay object. The function takes as input P, an m x n array of m points in n dimensions. The hull is constructed with
hull = ConvexHull(X)
where X is the p x n array of the p points that constitute the point cloud on which the convex hull is built.
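For completeness, a minimal usage sketch (the arrays here are made-up illustration data):
# Hypothetical data: 30 random 2-D points forming the cloud, 5 query points.
X = np.random.rand(30, 2)
P = np.random.rand(5, 2)

hull = ConvexHull(X)
print(isInHull(P, hull))  # array of booleans, one per query point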

Constructing a bubble trellis plot with lattice in R

First off, this is a homework question. The problem is ex. 2.6 from pg.26 of An Introduction to Applied Multivariate Analysis. It's laid out as:
Construct a bubble plot of the earthquake data using latitude and longitude as the scatterplot and depth as the circles, with greater depths giving smaller circles. In addition, divide the magnitudes into three equal ranges and label the points in your bubble plot with a different symbol depending on the magnitude group into which the point falls.
I have figured out that symbols, which is in base graphics, does not work well with lattice. Also, I haven't figured out whether lattice can change symbol size (i.e. bubble size). I bought the lattice book in a fit of desperation last night, and as I see in some of the examples, it is possible to set symbol color and shape for each "cut" or panel. I am therefore working under the assumption that symbol size can also be manipulated, but I haven't been able to figure out how.
My code looks like:
plot(xyplot(lat ~ long | cut(mag, 3), data = quakes,
            layout = c(3, 1), xlab = "Longitude", ylab = "Latitude",
            panel = function(x, y) {
              grid.circle(x, y, r = sqrt(quakes$depth), draw = TRUE)
            }))
Where I attempt to use the grid package to draw the circles, but when this executes, I just get a blank plot. Could anyone please point me in the right direction? I would be very grateful!
Here is some code for creating the plot you need without using the lattice package. I obviously had to generate my own fake data, so you can disregard all of that and go straight to the plotting commands if you want.
####################################################################
# Pseudo data
n = 20
latitude = sample(1:100, n)
longitude = sample(1:100, n)
depth = runif(n, 0, .5)
magnitude = sample(1:100, n)

groups = rep(NA, n)
for (i in 1:n) {
  if (magnitude[i] <= 33) {
    groups[i] = 1
  } else if (magnitude[i] > 33 & magnitude[i] <= 66) {
    groups[i] = 2
  } else {
    groups[i] = 3
  }
}

####################################################################
# The actual code for generating the plot
plot(latitude[groups == 1], longitude[groups == 1], col = "blue", pch = 19,
     ylim = c(0, 100), xlim = c(0, 100), xlab = "Latitude", ylab = "Longitude")
points(latitude[groups == 2], longitude[groups == 2], col = "red", pch = 15)
points(latitude[groups == 3], longitude[groups == 3], col = "green", pch = 17)
points(latitude[groups == 1], longitude[groups == 1], col = "blue", cex = 1/depth[groups == 1])
points(latitude[groups == 2], longitude[groups == 2], col = "red", cex = 1/depth[groups == 2])
points(latitude[groups == 3], longitude[groups == 3], col = "green", cex = 1/depth[groups == 3])
You just need to add default.units = "native" to grid.circle()
plot(xyplot(lat ~ long | cut(mag, 3), data = quakes,
            layout = c(3, 1), xlab = "Longitude", ylab = "Latitude",
            panel = function(x, y) {
              grid.circle(x, y, r = sqrt(quakes$depth), draw = TRUE,
                          default.units = "native")
            }))
Obviously you need to tinker with some of the settings to get what you want.
I have written a package called tactile that adds a function for producing bubble plots using lattice.
tactile::bubbleplot(depth ~ lat*long | cut(mag, 3), data=quakes,
layout=c(3,1), xlab="Longitude", ylab="Latitude")

Creating grid and interpolating (x,y,z) for contour plot sagemath

I have values in the form (x, y, z). By creating a list_plot3d plot I can clearly see that they are not quite evenly spaced; they usually form little "blobs" of 3 to 5 points on the xy plane. So, for the interpolation and the final contour plot to come out better, or should I say smoother(?), do I have to create a rectangular grid (like the squares on a chessboard) so that the blobs of data are somehow smoothed out? I understand this might be trivial to some people, but I am trying it for the first time and I am struggling a bit. I have been looking at scipy packages like scipy.interpolate.interp2d, but the graphs produced at the end are really bad. Maybe a brief tutorial on 2d interpolation in sagemath for an amateur like me? Some advice? Thank you.
EDIT:
https://docs.google.com/file/d/0Bxv8ab9PeMQVUFhBYWlldU9ib0E/edit?pli=1
This is mostly the kind of graph it produces, along with this message:
Warning: No more knots can be added because the number of B-spline coefficients already exceeds the number of data points m. Probably causes: either s or m too small. (fp>s) kx,ky=3,3 nx,ny=17,20 m=200 fp=4696.972223 s=0.000000
To get this graph I just run this command:
f_interpolation = scipy.interpolate.interp2d(*zip(*matrix(C)), kind='cubic')
plot_interpolation = contour_plot(
    lambda x, y: f_interpolation(x, y)[0],
    (22.419, 22.439), (37.06, 37.08),
    cmap='jet', contours=numpy.arange(0, 1400, 100), colorbar=True)
plot_all = plot_interpolation
plot_all.show(axes_labels=["m", "m"])
Here matrix(C) can be a huge matrix, like 10000 x 3, or even a lot more, like 1000000 x 3. The problem of bad graphs persists even with less data, as in the picture attached above, where matrix(C) was only 200 x 3. That is why I begin to think that, apart from a possible glitch in the program, my approach to this command might be totally wrong; hence my asking for advice about using a grid rather than just "throwing" my data into a command.
I've had a similar problem using the scipy.interpolate.interp2d function. My understanding is that the issue arises because interp1d/interp2d and related functions use an older wrapping of FITPACK for the underlying calculations. I was able to get a problem similar to yours to work using the spline classes, which rely on a newer wrapping of FITPACK. These can be recognized by the capitalized class names in the listing at http://docs.scipy.org/doc/scipy/reference/interpolate.html. Within the scipy installation, the newer functions appear to live in scipy/interpolate/fitpack2.py, while the functions using the older wrappings are in fitpack.py.
For your purposes, RectBivariateSpline is what I believe you want. Here is some sample code for implementing RectBivariateSpline:
import numpy as np
from scipy import interpolate
# Generate unevenly spaced x/y data for axes
npoints = 25
maxaxis = 100
x = (np.random.rand(npoints)*maxaxis) - maxaxis/2.
y = (np.random.rand(npoints)*maxaxis) - maxaxis/2.
xsort = np.sort(x)
ysort = np.sort(y)
# Generate the z-data, which first requires converting the x/y data
# into grids. indexing='ij' makes z[i, j] correspond to
# (xsort[i], ysort[j]), which is the layout RectBivariateSpline expects.
xg, yg = np.meshgrid(xsort, ysort, indexing='ij')
z = xg**2 - yg**2
# Generate the interpolated, evenly spaced data
# Note that the min/max of x/y isn't necessarily 0 and 100 since
# randomly chosen points were used. If we want to avoid extrapolation,
# the explicit min/max must be found
interppoints = 100
xinterp = np.linspace(xsort[0],xsort[-1],interppoints)
yinterp = np.linspace(ysort[0],ysort[-1],interppoints)
# Generate the spline that will be used for interpolation.
# The default spline degree is cubic (kx=ky=3). Higher-order
# interpolation can be used by setting kx and ky to larger integers,
# i.e. interpolate.RectBivariateSpline(xsort, ysort, z, kx=5, ky=5)
kernel = interpolate.RectBivariateSpline(xsort, ysort, z)
# Now calculate the linear, interpolated data
zinterp = kernel(xinterp, yinterp)
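To actually look at the result, one possibility is a quick matplotlib contour plot (this plotting step is my addition, not part of the original answer):
import matplotlib.pyplot as plt

# kernel() returns values with shape (len(xinterp), len(yinterp)), so
# transpose to put x along the horizontal axis for contourf.
plt.contourf(xinterp, yinterp, zinterp.T, levels=20, cmap='jet')
plt.colorbar()
plt.xlabel("x")
plt.ylabel("y")
plt.savefig("interpolated_contour.png")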