Verifying iterative edge insertion in Gremlinpython - tinkerpop

I am trying to iteratively add vertices and edges. It seems to work, there are no errors, but I want to verify that the edges are also correctly added.
The loops below insert at least the vertices, as printing the list lengths at the end shows, but the edges are either 1) not inserted, or 2) collected into a list incorrectly.
Any help is much appreciated!
def vertices01(nodename, rangelb, rangeub, prop1name, prop1val, prop2name):
    t = g.addV(nodename).property(prop1name, prop1val).property(prop2name, rangelb)
    for i in range(rangelb + 1, rangeub):
        t.addV(nodename).property(prop1name, prop1val).property(prop2name, i)
    t.iterate()
def edges01(from_propname, from_propval, to_propname, rangelb, rangeub, edge_name, edge_prop1name):
    to_propval = rangelb
    edge_prop1val = rangelb
    t = g.V().has(from_propname, from_propval).as_("a").V().has(to_propname, to_propval).as_("b").addE(edge_name).from_("a").to("b").property(edge_prop1name, edge_prop1val)
    for i in range(rangelb, rangeub):
        to_propval = i + 1
        edge_prop1val = i
        # changing this to t.has(...) seems to not influence the results (still 0 picked up by the loop)
        t.has(from_propname, from_propval).as_("a").V().has(to_propname, to_propval).as_("b").addE(edge_name).from_("a").to("b").property(edge_prop1name, edge_prop1val)
    t.iterate()
vertices01("ABC", 1, 21, "aa01", 1, "bb01")
edges01("aa01", 1, "bb01", 1, 10, "aa01-to-bb01", "aa01-to-bb01-propX")
ls1 = []
ls1 = g.V().outE("aa01-to-bb01").has("aa01-to-bb01-propX", 2).toList()
print(len(ls1))
ls2 = []
ls2 = g.V().has("aa01", 1).toList()
print(len(ls2))
> results:
0
20
Expected results:
> results:
1
20
EDIT: I have changed this bit in the edges01 loop:
t = g.V().has(from_propname, from_propval) ...
to
t.has(from_propname, from_propval) ...
But the results are still 0.

You are starting the traversal over again each time with t = g.V()... in the code that adds edges, so only the very last traversal created gets iterated. In the code that creates the vertices you are extending the same traversal. That is the difference.
UPDATED
You should be able to do something along these lines:
t = (g.V().has('some-property', 'some-value').as_('a')
      .V().has('some-property', 'some-value').as_('b'))
and then inside the loop:
t.addE('myedge').from_('a').to('b')
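To see why only the last traversal survives, here is a plain-Python stand-in for a fluent builder (illustrative only, not the gremlinpython API): methods append a step to the same object and return it, so reassigning inside a loop throws away everything accumulated so far.

```python
class FakeTraversal:
    # Toy stand-in for a traversal: steps accumulate on the object itself.
    def __init__(self):
        self.steps = []

    def addE(self, label):
        self.steps.append(("addE", label))
        return self  # fluent style, as in gremlinpython

# Extending one traversal, as in vertices01: all steps accumulate.
t = FakeTraversal()
for i in range(3):
    t.addE("edge%d" % i)
print(len(t.steps))   # 3

# Rebuilding the traversal each iteration, as in the original edges01:
for i in range(3):
    t2 = FakeTraversal().addE("edge%d" % i)
print(len(t2.steps))  # 1 -- only the last traversal's step remains
```

Iterating `t2` therefore only ever executes the final chain, which is why the earlier edges never reach the graph.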

How to extract variable values that equal a certain value (pyomo)?

I am building a routing optimization model using pyomo on python.
I have solved my model, but I am trying to extract the decision variable information from it. My model is binary, and the values I am looking for are the values of my model.z decision variable that equal 1.
When I write instance.pprint() I get the following sample of output. I therefore want to code something that gives me only the decision variables that are equal to 1, such as z(1,4).
Sample of my code is shown below:
model.I = RangeSet(5)
model.J = RangeSet(5)
model.z = Var(model.I, model.J, domain=Binary)

def constraint(model, i):
    return sum(model.z[i,j] - model.z[j,i] for j in model.J if i != j) == 0
model.constraint = Constraint(model.I, rule=constraint)

print()
z_values = pd.Series(model.z[i,j].extract_values(), name=model.z.name)
print(z_values)
I have tried the above code, but as some of my values are 0 (because they have not been visited), I have been getting the following error message:
ValueError: Error retrieving component z[5,4]: The component has not been constructed.
Ideally the output should be something like this:
(0,3) -- 1
(1,2) -- 1
(2,4) -- 1
(3,1) -- 1
(4,5) -- 1
(5,0) -- 1
Any ideas?
This should work (and answers your other, derivative question):
# value extract
import pyomo.environ as pyo

nodes = [1, 2, 3, 4, 5, 6]

model = pyo.ConcreteModel()
model.N = pyo.Set(initialize=nodes)
model.Z = pyo.Var(model.N, model.N, domain=pyo.Binary, initialize=0)  # only initializing here for demo...

# blah blah constraints & solve

# stuff in some fake results...
model.Z[1, 2] = 1
model.Z[2, 6] = 1
model.Z[3, 5] = 1
model.Z[6, 3] = 1

# model.display()

# make a dictionary of the route...
# recall that binary "1" variables evaluate as True
route = {start: stop for (start, stop) in model.Z.index_set() if pyo.value(model.Z[start, stop])}
# print(route)

start_node = 1
print(f'from {start_node} ', end='')
while start_node in route.keys():
    end_node = route.get(start_node)
    print(f'-> {end_node} ', end='')
    start_node = end_node
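The route-following loop at the end works on any plain dict, independently of pyomo; here is a stand-alone version with made-up arcs (the same fake results as above):

```python
# fake solved arcs (start -> stop), as extracted from the binary Z variables
route = {1: 2, 2: 6, 6: 3, 3: 5}

start_node = 1
path = [start_node]
while start_node in route:
    start_node = route[start_node]
    path.append(start_node)

print(path)  # [1, 2, 6, 3, 5]
```

The loop terminates as soon as it reaches a node with no outgoing arc, so a dead-end (or the depot, in a closed tour) ends the walk naturally.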

Adding multiple labels to a branch in a phylogenetic tree using geom_label

I am very new to R, so I am sorry if this question is obvious. I would like to add multiple labels to branches in a phylogenetic tree, but I can only figure out how to add one label per branch. I am using the following code:
treetext = "(Japon[&&NHX:S=2],(((Al,Luteo),(Loam,(Niet,Cal))),(((Car,Bar),(Aph,Long[&&NHX:S=1],((Yam,Stig),((Zey,Semp),(A,(Hap,(This,That))))))));"
mytree <- read.nhx(textConnection(treetext))
ggtree(mytree) + geom_tiplab() +
  geom_label(aes(x = branch, label = S))
I can add multiple symbols to a branch using the code below, but it is so labor-intensive that I might as well do it by hand:
ggtree(mytree) +
  geom_tiplab() +
  geom_nodepoint(aes(subset = node == 32, x = x - .5),
                 size = 5, colour = "black", shape = 15) +
  geom_nodepoint(aes(subset = node == 32, x = x - 2),
                 size = 5, colour = "gray", shape = 15)
A solution using the "ape" package would be:
library(ape)
mytree <- rtree(7) # A random tree
labels1 <- letters[1:6]
labels2 <- LETTERS[1:6]
plot(mytree)
# Setting location of label with `adj`
nodelabels(labels1, adj = c(1, -1), frame = 'none')
# We could also use `pos =` 1: below node; 3: above node
nodelabels(labels2, pos = 1, frame = 'n')
You might want to tweak the adj parameter to set the location as you desire it.
As I couldn't parse the treetext object you provided as an example (unbalanced braces), and I'm not familiar with how read.nhx() stores node labels, you might need a little R code to extract the label elements. You can use a bare nodelabels() call to plot the node numbers on the tree, to make sure your label vectors are in the correct order.
If you wanted labels to appear on edges rather than at nodes, the function is ape::edgelabels().

Create line network from closest points with boundaries

I have a set of points and I want to create a line / road network from them. First, I need to determine the closest point to each point. For that, I used a KD-tree and wrote code like this:
def closestPoint(source, X=None, Y=None):
    df = pd.DataFrame(source).copy(deep=True)  # ensure source is a DataFrame; work on a copy to keep the data source
    if X is None and Y is None:
        raise ValueError("Please specify coordinate")
    elif not X in df.keys() and not Y in df.keys():
        raise ValueError("X and/or Y is/are not in column names")
    else:
        df["coord"] = tuple(zip(df[X], df[Y]))  # create a coordinate
        if (df["coord"].duplicated):
            uniq = df.drop_duplicates("coord")["coord"]
            uniqval = list(uniq.get_values())
            dupl = df[df["coord"].duplicated()]["coord"]
            duplval = list(dupl.get_values())
            for kq, vq in uniq.items():
                clstu = spatial.KDTree(uniqval).query(vq, k=3)[1]
                df.at[kq, "coord"] = [vq, uniqval[clstu[1]]]
                if [uniqval[clstu[1]], vq] in list(df["coord"]):
                    df.at[kq, "coord"] = [vq, uniqval[clstu[2]]]
            for kd, vd in dupl.items():
                clstd = spatial.KDTree(duplval).query(vd, k=1)[1]
                df.at[kd, "coord"] = [vd, duplval[clstd]]
        else:
            val = df["coord"].get_values()
            for k, v in df["coord"].items():
                clst = spatial.KDTree(val).query(v, k=3)[1]
                df.at[k, "coord"] = [v, val[clst[1]]]
                if [val[clst[1]], v] in list(df["coord"]):
                    df.at[k, "coord"] = [v, val[clst[2]]]
    return df["coord"]
The code returns the closest points. However, I need to ensure that no double lines are created (e.g. (x,y) to (x1,y1) and (x1,y1) to (x,y)), and I also need to ensure that each point is used as the starting point of only one line and the end point of only one line, even if it is the closest point to several other points.
Below is the visualization of the result:
Result of the code
What I want:
What I want
I've also tried to separate the origin and target coordinates and do it like this:
df["coord"] = tuple(zip(df[X], df[Y]))  # create a coordinate
df["target"] = ""  # create a column for target points
count = 2  # create a count iteration
if (df["coord"].duplicated):
    uniq = df.drop_duplicates("coord")["coord"]
    uniqval = list(uniq.get_values())
    for kq, vq in uniq.items():
        clstu = spatial.KDTree(uniqval).query(vq, k=count)[1]
        while not vq in (list(df["target"]) and list(df["coord"])):
            clstu = spatial.KDTree(uniqval).query(vq, k=count)[1]
            df.set_value(kq, "target", uniqval[clstu[count-1]])
        else:
            count += 1
            clstu = spatial.KDTree(uniqval).query(vq, k=count)[1]
            df.set_value(kq, "target", uniqval[clstu[count-1]])
but this returns an error:
IndexError: list index out of range
Can anyone help me with this? Many thanks!
Answering now about the global strategy, here is what I would do (rough pseudo-algorithm):
current_point = one starting point in uniqval
while uniqval is not empty:
    construct a KDTree from uniqval and use it for the next line
    next_point = point in uniqval closest to current_point
    record next_point as the target for current_point
    remove current_point from uniqval
    current_point = next_point
What you will obtain is a linear graph joining all your points, using closest neighbors "in some way". I don't know if it will fit your needs. You would also obtain a linear graph by taking next_point at random...
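A minimal stdlib-only sketch of that pseudo-algorithm, using a brute-force nearest-neighbour search instead of a KD-tree, on some made-up points:

```python
import math

def chain_points(points):
    # Greedy nearest-neighbour chaining: each point is used at most once
    # as a start and once as an end, so no duplicate or reversed segments.
    remaining = list(points)
    current = remaining.pop(0)  # arbitrary starting point
    segments = []
    while remaining:
        nxt = min(remaining, key=lambda p: math.dist(current, p))
        segments.append((current, nxt))
        remaining.remove(nxt)
        current = nxt
    return segments

points = [(0, 0), (1, 0), (5, 0), (2, 0)]
segments = chain_points(points)
print(segments)  # [((0, 0), (1, 0)), ((1, 0), (2, 0)), ((2, 0), (5, 0))]
```

Because each visited point is removed from the candidate pool, the two constraints from the question (no double lines, one start and one end per point) hold by construction.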
It is hard to comment on your global strategy without more detail about the kind of road network you want to obtain. So let me just comment on your specific code and explain why the "out of range" error happens; I hope this can help.
First, are you aware that (list_a and list_b) returns list_a if it is empty, and list_b otherwise? Second, isn't the condition (vq in list(df["coord"])) always True? If so, your while loop always ends up executing the else branch, and at the last iteration of the for loop, (count - 1) becomes greater than the total number of (unique) points. Your KDTree query therefore does not return enough points, and clstu[count-1] is out of range.
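The first point is easy to verify directly: and between two lists short-circuits rather than intersecting them, so the membership test only ever looks at one of the two lists.

```python
list_a = []
list_b = [1, 2]

combined = list_a and list_b
print(combined)   # [] -- an empty (falsy) list_a is returned unchanged

combined2 = [0] and list_b
print(combined2)  # [1, 2] -- a non-empty list_a short-circuits to list_b
```

So `vq in (list(df["target"]) and list(df["coord"]))` checks membership in only one of the two columns, never both.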

Retrieve indices for rows of a PyTables table matching a condition using `Table.where()`

I need the indices (as numpy array) of the rows matching a given condition in a table (with billions of rows) and this is the line I currently use in my code, which works, but is quite ugly:
indices = np.array([row.nrow for row in the_table.where("foo == 42")])
It also takes half a minute, and I'm sure the list creation is one of the reasons why.
I could not find an elegant solution yet, and I'm still struggling with the pytables docs, so does anybody know a magical way to do this more beautifully and maybe also a bit faster? Maybe there is a special query keyword I am missing, since I have the feeling that pytables should be able to return the matched row indices as a numpy array.
tables.Table.get_where_list() gives the indices of the rows matching a given condition, e.g. indices = the_table.get_where_list("foo == 42")
I read the source of pytables; where() is implemented in Cython, but it seems it is not fast enough. Here is a more involved method that can speed things up.
Create some data first:
from tables import *
import numpy as np

class Particle(IsDescription):
    name = StringCol(16)      # 16-character string
    idnumber = Int64Col()     # signed 64-bit integer
    ADCcount = UInt16Col()    # unsigned short integer
    TDCcount = UInt8Col()     # unsigned byte
    grid_i = Int32Col()       # 32-bit integer
    grid_j = Int32Col()       # 32-bit integer
    pressure = Float32Col()   # float (single-precision)
    energy = Float64Col()     # double (double-precision)

h5file = open_file("tutorial1.h5", mode="w", title="Test file")
group = h5file.create_group("/", 'detector', 'Detector information')
table = h5file.create_table(group, 'readout', Particle, "Readout example")
particle = table.row
for i in range(1001000):
    particle['name'] = 'Particle: %6d' % (i)
    particle['TDCcount'] = i % 256
    particle['ADCcount'] = (i * 256) % (1 << 16)
    particle['grid_i'] = i
    particle['grid_j'] = 10 - i
    particle['pressure'] = float(i*i)
    particle['energy'] = float(particle['pressure'] ** 4)
    particle['idnumber'] = i * (2 ** 34)
    # Insert a new particle record
    particle.append()
table.flush()
h5file.close()
Read the column in chunks, append the indices to a list, and finally concatenate the list into an array. You can change the chunk size according to your memory:
h5file = open_file("tutorial1.h5")
table = h5file.get_node("/detector/readout")

size = 10000  # chunk size
col = "energy"
buf = np.zeros(size, dtype=table.coldtypes[col])  # reusable read buffer
res = []
for start in range(0, table.nrows, size):
    length = min(size, table.nrows - start)
    data = table.read(start, start + length, field=col, out=buf[:length])
    tmp = np.where(data > 10000)[0]
    tmp += start  # shift in-chunk positions to global row indices
    res.append(tmp)
res = np.concatenate(res)
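The chunk-plus-offset bookkeeping is the essential trick. A stdlib-only stand-in (a plain list instead of the PyTables column, made-up data and condition) shows how local match positions get shifted into global row indices:

```python
data = list(range(100))  # stand-in for a column read from the table
size = 32                # chunk size

res = []
for start in range(0, len(data), size):
    chunk = data[start:start + size]
    # positions within the chunk, shifted by the chunk offset
    res.extend(start + i for i, v in enumerate(chunk) if v % 10 == 0)

print(res)  # [0, 10, 20, 30, 40, 50, 60, 70, 80, 90]
```

Without the `start +` shift, every chunk would report indices counted from its own beginning, and the collected indices would repeat and collide.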

QUANTSTRAT - apply.paramset issue

I am trying to optimize MACD parameters for a trading strategy but unfortunately I am stuck with paramset.label value. This is the code:
################################# MACD PARAMETERS OPTIMIZATION
.fastMA <- (20:40)
.slowMA <- (30:70)
.nsamples = 10
strat.st <- 'volStrat'
# Paramset
add.distribution(strat.st,
paramset.label = 'EMA',
component.type = 'indicator',
component.label = 'macd.out',
variable = list(n = .fastMA),
label = 'nFast'
)
add.distribution(strat.st,
paramset.label = 'EMA',
component.type = 'indicator',
component.label = 'macd.out',
variable = list(n = .slowMA),
label = 'nSlow'
)
add.distribution.constraint(strat.st,
paramset.label = 'EMA',
distribution.label.1 = 'nFast',
distribution.label.2 = 'nSlow',
operator = '<',
label = 'nFast<nSlow'
)
results <- apply.paramset(strat.st,
paramset.label = 'EMA',
portfolio = portfolio2.st,
account = account.st,
nsamples = .nsamples,
verbose = TRUE)
stats <- results$tradeStats
print(stats)
When I run it, this error occurs for every sample:
evaluation # 1:
$param.combo
nFast nSlow
379 23 51
[1] "Processing param.combo 379"
nFast nSlow
379 23 51
result of evaluating expression:
<simpleError in strategy[[components.type]][[index]]: subscript out of bounds>
got results for task 1
numValues: 1, numResults: 1, stopped: FALSE
returning status FALSE
And then, for the last one, this is the error:
evaluation # 10:
$param.combo
nFast nSlow
585 40 60
[1] "Processing param.combo 585"
nFast nSlow
585 40 60
result of evaluating expression:
<simpleError in strategy[[components.type]][[index]]: subscript out of bounds>
got results for task 10
numValues: 10, numResults: 10, stopped: FALSE
first call to combine function
evaluating call object to combine results:
fun(result.1, result.2, result.3, result.4, result.5, result.6,
result.7, result.8, result.9, result.10)
error calling combine function:
<simpleError in fun(result.1, result.2, result.3, result.4, result.5, result.6, result.7, result.8, result.9, result.10): attempt to select less than one element>
numValues: 10, numResults: 10, stopped: TRUE
I really don't understand how I can fix it.
Can anyone tell me how to solve this?
Thank you so much.
You didn't post the code before the OPTIMIZATION part, so this is only a guess at the direction.
I understand you want to test 20:40 and 30:70, but in your optimization code you add two distributions that both point to component.label = 'macd.out'.
I ran a similar test: although both distributions use MA-type indicators, they generally should not point to the same indicator data. My code worked when one distribution pointed to component.label = 'fast' and the other to component.label = 'slow'; because they reference different data, the two parameters can be varied and compared.
You can try to debug in this direction.