Graph longest path using linear programming - optimization

I have a weighted directed graph where there are no cycles, and I wish to define the constraints so that I can solve a maximization of the weights of a path with linear programming. However, I can't wrap my head around how to do that.
For this I wish to use the LPSolve tool. I thought about making an adjacency matrix, but I don't know how I could make that work with LPSolve.
How can I define the possible paths from each node using constraints and make it generic enough that it would be simple to adapt to other graphs?

Since you have a weighted directed graph, it is sufficient to define a binary variable x_e for each edge e and to add constraints specifying that the source node has flow balance 1 (there is one more outgoing edge selected than incoming edge), the destination node has flow balance -1 (there is one more incoming edge than outgoing edge selected), and every other node has flow balance 0 (there are the same number of outgoing and incoming edges selected). Since your graph has no cycles, this will result in a path from the source to the destination (assuming one exists). You can maximize the weights of the selected edges.
I'll continue the exposition in R using the lpSolve package. Consider a graph with the following edges:
(edges <- data.frame(source=c(1, 1, 2, 3), dest=c(2, 3, 4, 4), weight=c(2, 7, 3, -4)))
# source dest weight
# 1 1 2 2
# 2 1 3 7
# 3 2 4 3
# 4 3 4 -4
The longest path from 1 to 4 is 1 -> 2 -> 4, with weight 5 (1 -> 3 -> 4 has weight 3).
We need the flow balance constraints for each of our four nodes:
source <- 1
dest <- 4
(nodes <- unique(c(edges$source, edges$dest)))
# [1] 1 2 3 4
(constr <- t(sapply(nodes, function(n) (edges$source == n) - (edges$dest == n))))
# [,1] [,2] [,3] [,4]
# [1,] 1 1 0 0
# [2,] -1 0 1 0
# [3,] 0 -1 0 1
# [4,] 0 0 -1 -1
(rhs <- ifelse(nodes == source, 1, ifelse(nodes == dest, -1, 0)))
# [1] 1 0 0 -1
Now we can put everything together into our model and solve:
library(lpSolve)
mod <- lp(direction = "max", objective.in = edges$weight,
const.mat = constr,
const.dir = rep("=", length(nodes)),
const.rhs = rhs,
all.bin = TRUE)
edges[mod$solution > 0.999,]
# source dest weight
# 1 1 2 2
# 3 2 4 3
mod$objval
# [1] 5
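For readers who want to check the formulation without a solver, here is a minimal Python sketch of the same flow-balance model. It is not lpSolve: it simply enumerates every 0/1 edge selection, which is only viable for tiny graphs like this one. The names `feasible` and `objective` are made up for the illustration.

```python
from itertools import product

# Edges of the example graph: (source, dest, weight).
edges = [(1, 2, 2), (1, 3, 7), (2, 4, 3), (3, 4, -4)]
source, dest = 1, 4
nodes = sorted({n for s, d, _ in edges for n in (s, d)})

# Flow-balance matrix: +1 if the edge leaves the node, -1 if it enters it.
constr = [[(s == n) - (d == n) for s, d, _ in edges] for n in nodes]
# Right-hand side: +1 at the source, -1 at the destination, 0 elsewhere.
rhs = [1 if n == source else -1 if n == dest else 0 for n in nodes]


def feasible(x):
    """True if the 0/1 edge selection x satisfies every flow-balance row."""
    return all(sum(a * xi for a, xi in zip(row, x)) == b
               for row, b in zip(constr, rhs))


def objective(x):
    """Total weight of the selected edges."""
    return sum(w * xi for (_, _, w), xi in zip(edges, x))


# Brute force over all binary selections (a stand-in for the solver).
best = max((x for x in product((0, 1), repeat=len(edges)) if feasible(x)),
           key=objective)
print(best, objective(best))
```

The selection it finds picks the first and third edges (1 -> 2 and 2 -> 4), matching the lpSolve result above.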


Is there a way to implement equations as Dymos path constraints?

For example, if I have a function h_max(mach) and I want the altitude to always respect this predefined altitude-mach relationship throughout the flight envelope, how could I implement this?
I have tried calculating the limit quantity (in this case, h_max) as its own state, calculating another state as h_max-h, and then constraining that to be greater than 0 through a path constraint. This approach has worked, but it involved two explicit components, a group and a lot of extra coding just to get a constraint working. I was wondering if there was a better way?
Thanks so much in advance.
The next version of Dymos, 1.7.0 will be released soon and will support this.
In the meantime, you can install the latest development version of Dymos directly from GitHub to have access to this capability:
python -m pip install git+
Then, you can define boundary and path constraints with an equation. Note the equation must have an equals sign in it, and then lower, upper, or equals will apply to the result of the equation.
In reality, dymos is just inserting an OpenMDAO ExecComp for you under the hood, so the one caveat to this is that your expression must be compatible with complex-step differentiation.
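To make the complex-step caveat concrete, here is a minimal, library-free Python sketch. It illustrates the general technique only and says nothing about OpenMDAO's internals; `complex_step`, `smooth_margin` and `abs_margin` are names invented for this example. The point is that an expression built from analytic operations carries derivative information in its imaginary part, while something like Python's built-in `abs()` discards it.

```python
def complex_step(f, x, h=1e-30):
    """Approximate df/dx at real x from a single complex evaluation of f."""
    return f(complex(x, h)).imag / h


Y_MIN = 5.0  # illustrative bound, in the spirit of an expression like "y - Y_MIN"


def smooth_margin(y):
    # Analytic in y, so the imaginary part carries the derivative.
    return y * y - Y_MIN


def abs_margin(y):
    # abs() of a complex number returns its real modulus, so the
    # imaginary perturbation is lost and the derivative comes out as 0.
    return abs(y) - Y_MIN


d_smooth = complex_step(smooth_margin, 3.0)  # close to the exact value 2 * 3 = 6
d_abs = complex_step(abs_margin, 3.0)        # 0.0, although the true slope is 1
print(d_smooth, d_abs)
```

So when writing a constraint equation, it is worth sticking to operations that behave sensibly for complex inputs.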
Here's an example of the brachistochrone that uses constraint expressions to set the final y value to a specific value while satisfying a path constraint defined with a second equation.
import openmdao.api as om
import dymos as dm
from dymos.examples.plotting import plot_results
from dymos.examples.brachistochrone import BrachistochroneODE
import matplotlib.pyplot as plt
# Initialize the Problem and the optimization driver
p = om.Problem(model=om.Group())
p.driver = om.ScipyOptimizeDriver()
# Create a trajectory and add a phase to it
traj = p.model.add_subsystem('traj', dm.Trajectory())
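# NOTE: this call was truncated in the post. GaussLobatto with 10 segments is an
# assumption; it is consistent with the variable sizes in the report below.
phase = traj.add_phase('phase0',
                       dm.Phase(ode_class=BrachistochroneODE,
                                transcription=dm.GaussLobatto(num_segments=10)))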
# Set the variables
phase.set_time_options(fix_initial=True, duration_bounds=(.5, 10))
phase.add_state('x', fix_initial=True, fix_final=True)
phase.add_state('y', fix_initial=True, fix_final=False)
phase.add_state('v', fix_initial=True, fix_final=False)
phase.add_control('theta', continuity=True, rate_continuity=True,
units='deg', lower=0.01, upper=179.9)
phase.add_parameter('g', units='m/s**2', val=9.80665)
Y_FINAL = 5.0
Y_MIN = 5.0
phase.add_boundary_constraint(f'bcf_y = y - {Y_FINAL}', loc='final', equals=0.0)
phase.add_path_constraint(f'path_y = y - {Y_MIN}', lower=0.0)
# Minimize time at the end of the phase
phase.add_objective('time', loc='final', scaler=10)
p.model.linear_solver = om.DirectSolver()
# Setup the Problem
p.setup()
# Set the initial values
p['traj.phase0.t_initial'] = 0.0
p['traj.phase0.t_duration'] = 2.0
p.set_val('traj.phase0.states:x', phase.interp('x', ys=[0, 10]))
p.set_val('traj.phase0.states:y', phase.interp('y', ys=[10, 5]))
p.set_val('traj.phase0.states:v', phase.interp('v', ys=[0, 9.9]))
p.set_val('traj.phase0.controls:theta', phase.interp('theta', ys=[5, 100.5]))
# Solve for the optimal trajectory
dm.run_problem(p)
# Check the results
print('final time')
print(p.get_val('traj.phase0.timeseries.time')[-1])
p.list_problem_vars()
Note the constraints from the list_problem_vars() call that come from timeseries_exec_comp - this is the OpenMDAO ExecComp that Dymos automatically inserts for you.
--- Constraint Report [traj] ---
--- phase0 ---
[final] 0.0000e+00 == bcf_y [None]
[path] 0.0000e+00 <= path_y [None]
/usr/local/lib/python3.8/dist-packages/openmdao/recorders/ UserWarning:The existing case recorder file, dymos_solution.db, is being overwritten.
Model viewer data has already been recorded for Driver.
Full total jacobian was computed 3 times, taking 0.057485 seconds.
Total jacobian shape: (71, 51)
Jacobian shape: (71, 51) (12.51% nonzero)
FWD solves: 12 REV solves: 0
Total colors vs. total size: 12 vs 51 (76.5% improvement)
Sparsity computed using tolerance: 1e-25
Time to compute sparsity: 0.057485 sec.
Time to compute coloring: 0.054118 sec.
Memory to compute coloring: 0.000000 MB.
/usr/local/lib/python3.8/dist-packages/openmdao/core/ DerivativesWarning:Constraints or objectives [('traj.phases.phase0.timeseries.timeseries_exec_comp.path_y', inds=[(0, 0)])] cannot be impacted by the design variables of the problem.
Optimization terminated successfully (Exit mode 0)
Current function value: [18.02999766]
Iterations: 14
Function evaluations: 14
Gradient evaluations: 14
Optimization Complete
final time
Design Variables
name val size indices
-------------------------- -------------- ---- ---------------------------------------------
traj.phase0.t_duration [1.80299977] 1 None
traj.phase0.states:x |12.14992234| 9 [1 2 3 4 5 6 7 8 9]
traj.phase0.states:y |22.69124774| 10 [ 1 2 3 4 5 6 7 8 9 10]
traj.phase0.states:v |24.46289861| 10 [ 1 2 3 4 5 6 7 8 9 10]
traj.phase0.controls:theta |266.48489386| 21 [ 0 1 2 3 4 5 ... 4 15 16 17 18 19 20]
name val size indices alias
----------------------------------------------------------- ------------- ---- --------------------------------------------- ----------------------------------------------------
timeseries.timeseries_exec_comp.bcf_y [0.] 1 [29] traj.phases.phase0->final_boundary_constraint->bcf_y
timeseries.timeseries_exec_comp.path_y |15.73297378| 30 [ 0 1 2 3 4 5 ... 3 24 25 26 27 28 29] traj.phases.phase0->path_constraint->path_y
traj.phase0.collocation_constraint.defects:x |6e-08| 10 None None
traj.phase0.collocation_constraint.defects:y |7e-08| 10 None None
traj.phase0.collocation_constraint.defects:v |3e-08| 10 None None
traj.phase0.continuity_comp.defect_control_rates:theta_rate |0.0| 9 None None
name val size indices
------------- ------------- ---- -------
traj.phase0.t [18.02999766] 1 -1

How can I merge two data frames on a range of dates? [duplicate]

Consider the following data.tables. The first defines a set of regions with start and end positions for each group 'x':
d1 <- data.table(x = letters[1:5], start = c(1,5,19,30, 7), end = c(3,11,22,39,25))
setkey(d1, x, start)
# x start end
# 1: a 1 3
# 2: b 5 11
# 3: c 19 22
# 4: d 30 39
# 5: e 7 25
The second data set has the same grouping variable 'x', and positions 'pos' within each group:
d2 <- data.table(x = letters[c(1,1,2,2,3:5)], pos = c(2,3,3,12,20,52,10))
setkey(d2, x, pos)
# x pos
# 1: a 2
# 2: a 3
# 3: b 3
# 4: b 12
# 5: c 20
# 6: d 52
# 7: e 10
Ultimately I'd like to extract the rows in 'd2' where 'pos' falls within the range defined by 'start' and 'end', within each group x. The desired result is
# x pos start end
# 1: a 2 1 3
# 2: a 3 1 3
# 3: c 20 19 22
# 4: e 10 7 25
The start/end positions for any group x will never overlap but there may be gaps of values not in any region.
Now, I believe I should be using a rolling join. From what I can tell, I cannot use the "end" column in the join.
I've tried
d1[d2, roll = TRUE, nomatch = 0, mult = "all"][start <= end]
and got
# x start end
# 1: a 2 3
# 2: a 3 3
# 3: c 20 22
# 4: e 10 25
which is the right set of rows. However, "pos" has become "start" and the original "start" has been lost. Is there a way to preserve all the columns with the roll join so I could report "start", "pos", "end" as desired?
Overlap joins were implemented with commit 1375 in data.table v1.9.3, and are available in the current stable release, v1.9.4. The function is called foverlaps. From NEWS:
29) Overlap joins #528 is now here, finally!! Except for type="equal" and maxgap and minoverlap arguments, everything else is implemented. Check out ?foverlaps and the examples there on its usage. This is a major feature addition to data.table.
Let's consider x, an interval defined as [a, b], where a <= b, and y, another interval defined as [c, d], where c <= d. The interval y is said to overlap x at all, iff d >= a and c <= b. And y is entirely contained within x, iff a <= c and d <= b. For the different types of overlaps implemented, please have a look at ?foverlaps.
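As a quick sanity check of those two definitions, here is a small Python sketch. The function names `overlaps_any` and `within` are chosen for this illustration and are not part of data.table.

```python
def overlaps_any(x, y):
    """y = (c, d) overlaps x = (a, b) at all: d >= a and c <= b."""
    (a, b), (c, d) = x, y
    return d >= a and c <= b


def within(x, y):
    """y = (c, d) is entirely contained in x = (a, b): a <= c and d <= b."""
    (a, b), (c, d) = x, y
    return a <= c and d <= b


region = (5, 11)
print(overlaps_any(region, (10, 14)))  # True: the intervals share 10..11
print(within(region, (10, 14)))        # False: 14 lies beyond the end
print(within(region, (7, 7)))          # True: a zero-width interval inside
```

The last call shows that a single position can be treated as an interval whose start and end coincide.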
Your question is a special case of an overlap join: in d1 you have true physical intervals with start and end positions. In d2 on the other hand, there are only positions (pos), not intervals. To be able to do an overlap join, we need to create intervals in d2 as well. This is achieved by creating an additional variable pos2, which is identical to pos (d2[, pos2 := pos]). Thus, we now have an interval in d2, albeit with identical start and end coordinates. This 'virtual, zero-width interval' in d2 can then be used in foverlaps to do an overlap join with d1:
require(data.table) ## 1.9.3
d2[, pos2 := pos]
foverlaps(d2, d1, by.x = names(d2), type = "within", mult = "all", nomatch = 0L)
# x start end pos pos2
# 1: a 1 3 2 2
# 2: a 1 3 3 3
# 3: c 19 22 20 20
# 4: e 7 25 10 10
by.y by default is key(y), so we skipped it. by.x by default takes key(x) if it exists, and if not takes key(y). But a key doesn't exist for d2, and we can't set the columns from y, because they don't have the same names. So, we set by.x explicitly.
The type of overlap is within, and we'd like to have all matches, only if there is a match.
NB: foverlaps uses data.table's binary search feature (along with roll where necessary) under the hood, but some function arguments (types of overlaps, maxgap, minoverlap etc..) are inspired by the function findOverlaps() from the Bioconductor package IRanges, an excellent package (and so is GenomicRanges, which extends IRanges for Genomics).
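To give a feel for why binary search helps here, below is a rough Python sketch of a per-group lookup on the question's data. It is not how foverlaps is implemented; it relies on the question's guarantee that regions within a group never overlap, and `find_region` is a name invented for the example.

```python
from bisect import bisect_right
from collections import defaultdict

d1 = [("a", 1, 3), ("b", 5, 11), ("c", 19, 22), ("d", 30, 39), ("e", 7, 25)]
d2 = [("a", 2), ("a", 3), ("b", 3), ("b", 12), ("c", 20), ("d", 52), ("e", 10)]

# Group the regions by x and sort each group by start position.
grouped = defaultdict(list)
for x, start, end in d1:
    grouped[x].append((start, end))
starts = {}
for x, group in grouped.items():
    group.sort()
    starts[x] = [s for s, _ in group]


def find_region(x, pos):
    """Return the (start, end) region of group x containing pos, or None."""
    if x not in grouped:
        return None
    i = bisect_right(starts[x], pos) - 1  # last region with start <= pos
    if i >= 0 and pos <= grouped[x][i][1]:
        return grouped[x][i]
    return None


# Keep only the positions that fall inside a region (like nomatch = 0).
result = [(x, pos, *r) for x, pos in d2 if (r := find_region(x, pos))]
for row in result:
    print(row)
```

Each position is resolved with one search in its own group, so no large intermediate table is built.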
So what's the advantage?
A benchmark on the code above on your data results in foverlaps() slower than Gabor's answer (Timings: Gabor's data.table solution = 0.004 vs foverlaps = 0.021 seconds). But does it really matter at this granularity?
What would be really interesting is to see how well it scales - in terms of both speed and memory. In Gabor's answer, we join based on the key column x. And then filter the results.
What if d1 has about 40K rows and d2 has 100K rows (or more)? For each row in d2 that matches x in d1, all those rows will be matched and returned, only to be filtered later. Here's an example of your Q scaled only slightly:
Generate data:
n = 20e3L; k = 100e3L
idx1 = sample(100, n, TRUE)
idx2 = sample(100, n, TRUE)
d1 = data.table(x = sample(letters[1:5], n, TRUE),
start = pmin(idx1, idx2),
end = pmax(idx1, idx2))
d2 = data.table(x = sample(letters[1:15], k, TRUE),
pos1 = sample(60:150, k, TRUE))
setkey(d1)
d2[, pos2 := pos1]
system.time(ans1 <- foverlaps(d2, d1, by.x=1:3, type="within", nomatch=0L))
# user system elapsed
# 3.028 0.635 3.745
This took ~ 1GB of memory in total, out of which ans1 is 420MB. Most of the time spent here is on subset really. You can check it by setting the argument verbose=TRUE.
Gabor's solutions:
## new session - data.table solution
setkey(d1, x)
system.time(ans2 <- d1[d2, allow.cartesian=TRUE, nomatch=0L][between(pos1, start, end)])
# user system elapsed
# 15.714 4.424 20.324
And this took a total of ~3.5GB.
I just noted that Gabor already mentions the memory required for intermediate results. So, trying out sqldf:
# new session - sqldf solution
system.time(ans3 <- sqldf("select * from d1 join
d2 using (x) where pos1 between start and end"))
# user system elapsed
# 73.955 1.605 77.049
Took a total of ~1.4GB. So, it definitely uses less memory than the one shown above.
[The answers were verified to be identical after removing pos2 from ans1 and setting key on both answers.]
Note that this overlap join is designed for problems where d2 doesn't necessarily have identical start and end coordinates (e.g. genomics, the field where I come from, where d2 is usually about 30-150 million or more rows).
foverlaps() is stable, but is still under development, meaning some arguments and names might get changed.
NB: Since I mentioned GenomicRanges above, it is also perfectly capable of solving this problem. It uses interval trees under the hood, and is quite memory efficient as well. In my benchmarks on genomics data, foverlaps() is faster. But that's for another (blog) post, some other time.
data.table v1.9.8+ has a new feature - non-equi joins. With that, this operation becomes even more straightforward:
require(data.table) #v1.9.8+
# no need to set keys on `d1` or `d2`
d2[d1, .(x, pos=x.pos, start, end), on=.(x, pos>=start, pos<=end), nomatch=0L]
# x pos start end
# 1: a 2 1 3
# 2: a 3 1 3
# 3: c 20 19 22
# 4: e 10 7 25
1) sqldf This is not data.table but complex join criteria are easy to specify in a straightforward manner in SQL:
sqldf("select * from d1 join d2 using (x) where pos between start and end")
x start end pos
1 a 1 3 2
2 a 1 3 3
3 c 19 22 20
4 e 7 25 10
2) data.table For a data.table answer try this:
setkey(d1, x)
setkey(d2, x)
d1[d2][between(pos, start, end)]
x start end pos
1: a 1 3 2
2: a 1 3 3
3: c 19 22 20
4: e 7 25 10
Note that this does have the disadvantage of forming the possibly large intermediate result d1[d2], which SQL may not do. The remaining solutions may have this problem too.
3) dplyr This suggests the corresponding dplyr solution. We also use between from data.table:
library(dplyr)
library(data.table) # between
d1 %>%
inner_join(d2) %>%
filter(between(pos, start, end))
Joining by: "x"
x start end pos
1 a 1 3 2
2 a 1 3 3
3 c 19 22 20
4 e 7 25 10
4) merge/subset Using only the base of R:
subset(merge(d1, d2), start <= pos & pos <= end)
x start end pos
1: a 1 3 2
2: a 1 3 3
3: c 19 22 20
4: e 7 25 10
Added: Note that the data.table solution here is much faster than the one in the other answer:
dt1 <- function() {
d1 <- data.table(x=letters[1:5], start=c(1,5,19,30, 7), end=c(3,11,22,39,25))
d2 <- data.table(x=letters[c(1,1,2,2,3:5)], pos=c(2,3,3,12,20,52,10))
setkey(d1, x, start)
idx1 = d1[d2, which=TRUE, roll=Inf] # last observation carried forwards
setkey(d1, x, end)
idx2 = d1[d2, which=TRUE, roll=-Inf] # next observation carried backwards
idx = which(!is.na(idx1) & !is.na(idx2)) # rows of d2 matched by both rolls
ans1 <<- cbind(d1[idx1[idx]], d2[idx, list(pos)])
}
dt2 <- function() {
d1 <- data.table(x=letters[1:5], start=c(1,5,19,30, 7), end=c(3,11,22,39,25))
d2 <- data.table(x=letters[c(1,1,2,2,3:5)], pos=c(2,3,3,12,20,52,10))
setkey(d1, x)
ans2 <<- d1[d2][between(pos, start, end)]
}
library(rbenchmark)
benchmark(dt1(), dt2())[1:4]
## test replications elapsed relative
## 1 dt1() 100 1.45 1.667
## 2 dt2() 100 0.87 1.000 <-- from (2) above
Overlap joins are available in dplyr 1.1.0 via the function join_by.
With join_by, you can do an overlap join with between, or manually with >= and <=:
inner_join(d2, d1, by = join_by(x, between(pos, start, end)))
# x pos start end
#1 a 2 1 3
#2 a 3 1 3
#3 c 20 19 22
#4 e 10 7 25
inner_join(d2, d1, by = join_by(x, pos >= start, pos <= end))
# x pos start end
#1 a 2 1 3
#2 a 3 1 3
#3 c 20 19 22
#4 e 10 7 25
Using fuzzyjoin :
result <- fuzzyjoin::fuzzy_inner_join(d2, d1,
by = c('x', 'pos' = 'start', 'pos' = 'end'),
match_fun = list(`==`, `>=`, `<=`))
# x.x pos x.y start end
# <chr> <dbl> <chr> <dbl> <dbl>
#1 a 2 a 1 3
#2 a 3 a 1 3
#3 c 20 c 19 22
#4 e 10 e 7 25
Since fuzzyjoin returns all the columns we might need to do some cleaning to keep the columns that we want.
result %>% select(x = x.x, pos, start, end)
# A tibble: 4 x 4
# x pos start end
# <chr> <dbl> <dbl> <dbl>
#1 a 2 1 3
#2 a 3 1 3
#3 c 20 19 22
#4 e 10 7 25

Loop through irregular list of numbers to append rows to summary table

I'm trying to write code that will loop through a list of integers, which relate to a number of sensors, to provide summary statistics (at this stage just cor()).
corr_table <-data.frame(ID = integer()
, HxT = double())
for(j in gt_thrsh_key){ #this is currently set to 2:5 for testing - its a list of sensors I want to summarise
# extract humidity and time vectors
x <- sqldf(sprintf("SELECT humidity FROM data_agg_2 WHERE ID = %s",j))
y <- sqldf(sprintf("SELECT time_elapsed FROM data_agg_2 WHERE ID = %s",j))
# format into row
new_row <- data.frame(ID = c(j), HxT = c(cor(x,y))) #insert new variables into row
# append to dataframe
corr_table <- rbind(corr_table, new_row)
print(sprintf("Sensor %s has been summarised.",j)) # check 1
print(cor(x,y)) # check 2
assign("data_agg_2", data_agg_2, envir = .GlobalEnv)
I get output:
[1] "Sensor 2 has been summarised." "Sensor 3 has been summarised." "Sensor 4 has been summarised." "Sensor 5 has been summarised."
humidity -0.08950285
1 2 -0.08950285 #INCORRECT
2 3 -0.08950285 #INCORRECT
3 4 -0.08950285 #INCORRECT
4 5 -0.08950285 #correct
This is only the correct measurement for the final iteration of loop (id = 5), so somehow I must be overwriting previous entries. Does anyone know why this is happening? Or can you recommend a better way to perform this loop?
EDIT: check 2 which prints the cor() of x and y through the loop confirms that only the final run of loop is calculating a value. Has anyone seen this before?
Here is a base R solution that uses lapply() to generate the correlations and write them to a list(). The list is converted to a data frame with do.call(rbind,...).
# simulate some data
set.seed(19041798) # ensure consistency across multiple runs
ID <- rep(1:10,20)
humidity <- rnorm(200,mean = 30,sd = 15)
elapsed_time <- rpois(200,2.5)
data <- data.frame(ID,humidity, elapsed_time)
uniqueIDs <- unique(data$ID)
correlationList <- lapply(uniqueIDs,function(x){
y <- subset(data,ID == x)
HxT <- cor(y$humidity,y$elapsed_time)
# return as data frame
data.frame(ID = x,HxT = HxT)
})
correlations <- do.call(rbind,correlationList)
...and the output:
> correlations
ID HxT
1 1 -0.1805885
2 2 -0.3166290
3 3 0.1749233
4 4 -0.2517737
5 5 0.1428092
6 6 0.3112812
7 7 -0.3180825
8 8 0.3774637
9 9 -0.3790178
10 10 -0.3070866
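The same split-apply-combine idea can be sketched in plain Python, for readers who want to see the pattern without R. This is only an illustration on a tiny made-up data set; `pearson` is a hand-written helper, not a library call.

```python
from collections import defaultdict
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Made-up rows of (ID, humidity, elapsed_time).
rows = [
    (1, 30.0, 1.0), (1, 32.0, 2.0), (1, 34.0, 3.0),
    (2, 40.0, 3.0), (2, 38.0, 4.0), (2, 36.0, 5.0),
]

# Split: collect the two columns separately for each sensor ID.
groups = defaultdict(lambda: ([], []))
for sensor_id, humidity, elapsed in rows:
    groups[sensor_id][0].append(humidity)
    groups[sensor_id][1].append(elapsed)

# Apply and combine: one summary row per sensor.
correlations = [{"ID": k, "HxT": pearson(h, t)}
                for k, (h, t) in sorted(groups.items())]
for row in correlations:
    print(row)
```

Because each summary row is computed from its own subset, no iteration can overwrite the result of another.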
sqldf() version
We can restructure the code from the original post so it extracts all the data it needs through a single SQL query, and performs all subsequent processing in R.
First, we simulate 60,000 rows of data.
set.seed(19041798) # ensure consistency across multiple runs
ID <- rep(1:30,2000)
humidity <- rnorm(60000,mean = 30,sd = 15)
elapsed_time <- rpois(60000,2.5)
data <- data.frame(ID,humidity, elapsed_time)
Next, we extract data for the first 5 sensors from the data with sqldf(), as well as the vector of uniqueIDs.
# select ID <= 5
sqlStmt <- "select ID, humidity,elapsed_time from data where ID <= 5"
dataSubset <- sqldf(sqlStmt)
sqlStmt <- "select distinct ID from data where ID <= 5"
uniqueIDs <- sqldf(sqlStmt)[[1]]
At this point, the dataSubset data frame has 10,000 observations. We use lapply() with the vector of uniqueIDs to generate correlations by ID, count the complete.cases() included in each correlation, and write the results to a list of data frames.
correlationList <- lapply(uniqueIDs,function(x){
y <- subset(dataSubset,ID == x)
count <- sum(complete.cases(y)) # number of obs included in cor()
HxT <- cor(y$humidity,y$elapsed_time)
# return as data frame
data.frame(ID = x,count = count,HxT = HxT)
})
Finally, a do.call(rbind,...) and a print, and we have our list of correlations including counts of rows used to calculate the correlation.
correlations <- do.call(rbind,correlationList)
...and the output:
> correlations
ID count HxT
1 1 2000 0.015640244
2 2 2000 0.017143573
3 3 2000 -0.011283180
4 4 2000 0.052482666
5 5 2000 0.002083603

Gurobi Optimization Result Writing into Csv file

I am using Gurobi 7 to solve my MIP. I have several different variables. However, I am specifically interested in two of those, "x" and "y" namely. For the reference, I am giving my code that shows how I added x and y variables into the solver:
# Creating Variables
x = {}
y = {}
# Adding Variables
for i in range(I):
    x[i+1,P[i]-d[0]] = m.addVar(vtype=GRB.BINARY, name="x%s" % str([i+1,P[i]-d[0]]))
    x[i+1,P[i]] = m.addVar(vtype=GRB.BINARY, name="x%s" % str([i+1,P[i]]))
for i in range(I):
    for k in range(len(rangevalue)):
        y[i+1, rangevalue[k] - E[i]] = m.addVar(vtype=GRB.BINARY,
                                               name="y%s" % str([i+1, rangevalue[k] - E[i]]))
Even though the above code may not really make any sense, I just wanted to show it in case you may use it for my problem.
After I solve the problem, I get the following results:
Variable X
x[1, 3] 1
sigmaminus[1] 874
x[2, 2] 1
sigmaminus[2] 1010
x[3, 2] 1
sigmaminus[3] 1945
x[4, 4] 1
sigmaplus[4] 75
x[5, 4] 1
sigmaminus[5] 1153
x[6, 5] 1
sigmaminus[6] 280
x[7, 3] 1
sigmaplus[7] 1138
x[8, 2] 1
sigmaplus[8] 538
x[9, 1] 1
sigmaplus[9] 2432
x[10, 5] 1
sigmaminus[10] 480
omega[1] 12
OMEGA[1] 12
omega[2] 9
OMEGA[2] 12
omega[3] 8
OMEGA[3] 9
omega[4] 8
OMEGA[4] 8
OMEGA[5] 8
y[1, 2] 1
y[2, 9] 1
y[3, 5] 1
y[4, 6] 1
y[5, 4] 1
y[6, 6] 1
y[7, 3] 1
y[8, 11] 1
y[9, 8] 1
y[10, 1] 1
phiplus[6] 1
phiminus[7] 1
phiminus[10] 1
I specifically want to display the x and y variables with their indexes. Other variables are not necessary. My question is: how can I write these results into a CSV file in one column, as follows?
I do not need their corresponding value which can only be "1" since they are binary variables. I just need to write the variables which have the value "1".
I would do something along these lines:
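import csv

if m.SolCount == 0:
    print("Model has no solution")
else:
    # Collect the names of the x and y variables that are set to 1
    var_names = []
    for var in m.getVars():
        # Or use list comprehensions instead
        if var.VarName[0] in ('x', 'y') and var.X > 0.1:
            var_names.append(var.VarName)

    # Write to csv, one variable name per row (Python 3 file mode)
    with open('out.csv', 'w', newline='') as myfile:
        wr = csv.writer(myfile, quoting=csv.QUOTE_ALL)
        for name in var_names:
            wr.writerow([name])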
I hope this helps. I am going to test this snippet a bit later. Update: works as intended.
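If you want to try the filtering and CSV logic without a Gurobi licence at hand, here is a small self-contained sketch. The `Var` namedtuple is a hypothetical stand-in that only mimics the `VarName` and `X` attributes of a solved variable, and the values are loosely taken from the output in the question; the CSV is written to an in-memory buffer instead of a file.

```python
import csv
import io
from collections import namedtuple

# Hypothetical stand-in for a solved variable (only VarName and X are mimicked).
Var = namedtuple("Var", ["VarName", "X"])
solution = [
    Var("x[1, 3]", 1.0), Var("sigmaminus[1]", 874.0),
    Var("x[2, 2]", 0.0), Var("y[1, 2]", 1.0), Var("omega[1]", 12.0),
]


def selected_names(variables, prefixes=("x[", "y[")):
    """Names of the x/y variables whose value rounds to 1."""
    return [v.VarName for v in variables
            if v.VarName.startswith(prefixes) and v.X > 0.5]


# Write one name per row; names contain a comma, so csv quotes them.
buffer = io.StringIO()
writer = csv.writer(buffer)
for name in selected_names(solution):
    writer.writerow([name])
csv_text = buffer.getvalue()
print(csv_text)
```

Matching on `"x["` and `"y["` rather than on the first character is a small precaution against other variables whose names happen to start with x or y.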

A programming challenge with Mathematica

I am interfacing an external program with Mathematica. I am creating an input file for the external program. It's about converting geometry data from Mathematica-generated graphics into a predefined format. Here is an example geometry.
Figure 1
The geometry can be described in many ways in Mathematica. One laborious way is the following.
This generates the required 3D geometry in GraphicsComplex format of MMA.
This geometry is described as the following input file for my external program.
# x y z [m]
1. -1. 0.
0. -1. 0.5
0. -1. -0.5
1. -0.3333 0.
0. -0.3333 0.5
0. -0.3333 -0.5
1. 0.3333 0.
0. 0.3333 0.5
0. 0.3333 -0.5
1. 1. 0.
0. 1. 0.5
0. 1. -0.5
10. -1. 0.
10. -0.3333 0.
10. 0.3333 0.
10. 1. -0.
# type node_id1 node_id2 node_id3 node_id4 elem_id1 elem_id2 elem_id3 elem_id4
1 1 4 5 2 4 2 10 0
1 2 5 6 3 1 5 3 10
1 3 6 4 1 2 6 10 0
1 4 7 8 5 7 5 1 0
1 5 8 9 6 4 8 6 2
1 6 9 7 4 5 9 3 0
1 7 10 11 8 8 4 11 0
1 8 11 12 9 7 9 5 11
1 9 12 10 7 8 6 11 0
2 1 2 3 1 2 3
2 10 12 11 9 8 7
10 4 1 13 14 1 3
10 7 4 14 15 4 6
10 10 7 15 16 7 9
# end of input file
Now the description I have from the documentation of this external program is pretty short. I am quoting it here.
First keyword NODES states total number of
nodes. After this line there should be no comment or empty lines. Next lines consist of
three values x, y and z node coordinates and number of lines must be the same as number
of nodes.
Next keyword is PANEL and states how many panels we have. After that we have lines
defining each panel. First integer defines panel type
ID 1 – quadrilateral panel - is defined by four nodes and four neighboring panels.
Neighboring panels are panels that share same sides (pair of nodes) and is needed for
velocity and pressure calculation (methods 1 and 2). Missing neighbors (for example for
panels near the trailing edge) are filled with value 0 (see Figure 1).
ID 2 – triangular panel – is defined by three nodes and three neighboring panels.
ID 10 – wake panel – is quadrilateral panel defined with four nodes and with two
(neighboring) panels which are located on the trailing edge (panels to which wake panel is
applying Kutta condition).
Panel types 1 and 2 must be defined before type 10 in input file.
Important to notice is the surface normal; order of nodes defining panels should be
counter clockwise. By the right-hand rule if fingers are bended to follow numbering,
thumb will show normal vector that should point “outwards” geometry.
We are given a 3D CAD model in a file called One.obj, and it imports fine into MMA.
cd = Import["One.obj"]
The output is a MMA Graphics3D object
Now I can easily access the geometry data as MMA internally reads them.
{ver1, pol1} = cd[[1]][[2]] /. GraphicsComplex -> List;
MyPol = pol1 // First // First;
Graphics3D[GraphicsComplex[ver1,MyPol],Axes-> True]
How can we use the vertices and polygon information contained in ver1 and pol1 and write them to a text file as described in the input file example above? In this case we will only have ID2 type (triangular) panels.
Using the Mathematica triangulation, how can we find the surface area of this 3D object? Is there any built-in function that can compute surface area in MMA?
No need to create the wake panel or ID10 type elements right now. An input file with only triangular elements will be fine.
Sorry for such a long post, but it's a puzzle that I have been trying to solve for a long time. I hope some of you experts have the right insight to crack it.
Q1 and Q2 are easy enough that you could drop the "challenge" labels in your question. Q3 could use some clarification.
edges = cd[[1, 2, 1]];
polygons = cd[[1, 2, 2, 1, 1, 1]];
Update Q1
The main problem is to find the neighbor of each polygon. The following does this:
(* Split every triangle in 3 edges, with nodes in each edge sorted *)
triangleEdges = (Sort /@ Subsets[#, {2}]) & /@ polygons;
(* Generate a list of edges *)
singleEdges = Union[Flatten[triangleEdges, 1]];
(* Define a function which, given an edge (node number list), returns the bordering *)
(* triangle numbers. It's done by working through each of the triangles' edges *)
edgesNeighbors[_] = {};
MapIndexed[(
edgesNeighbors[#1[[1]]] = Flatten[{edgesNeighbors[#1[[1]]], #2[[1]]}];
edgesNeighbors[#1[[2]]] = Flatten[{edgesNeighbors[#1[[2]]], #2[[1]]}];
edgesNeighbors[#1[[3]]] = Flatten[{edgesNeighbors[#1[[3]]], #2[[1]]}];
) &, triangleEdges
];
(* Build a triangle relation table. Each '1' indicates a triangle relation *)
relations = ConstantArray[0, {triangleEdges // Length, triangleEdges // Length}];
Scan[
(n = edgesNeighbors[##];
If[Length[n] == 2,
{n1, n2} = n;
relations[[n1, n2]] = 1; relations[[n2, n1]] = 1];
) &, singleEdges
];
(* Build a neighborhood list *)
triangleNeigbours =
Table[Flatten[Position[relations[[i]], 1]], {i,triangleEdges // Length}];
(* Test: Which triangles border on triangle number 1? *)
triangleNeigbours[[1]]
(* ==> {32, 61, 83} *)
(* Check this *)
polygons[[{1, 32, 61, 83}]]
(* ==> {{1, 2, 3}, {3, 2, 52}, {1, 3, 50}, {19, 2, 1}} *)
(* Indeed, they all share an edge with #1 *)
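For comparison, the same neighbour search can be sketched in a few lines of Python using a dictionary from edges to triangles instead of a relation matrix. This is an alternative illustration rather than a translation of the code above; `triangle_neighbors` is a made-up name and the test mesh is a plain tetrahedron, not the wing.

```python
from collections import defaultdict
from itertools import combinations


def triangle_neighbors(polygons):
    """Map each 1-based triangle index to the triangles sharing an edge with it."""
    edge_to_triangles = defaultdict(list)
    for t, nodes in enumerate(polygons, start=1):
        # Each triangle has 3 undirected edges; sort so (a, b) == (b, a).
        for edge in combinations(sorted(nodes), 2):
            edge_to_triangles[edge].append(t)

    neighbors = {t: [] for t in range(1, len(polygons) + 1)}
    for tris in edge_to_triangles.values():
        if len(tris) == 2:  # edge shared by exactly two triangles
            a, b = tris
            neighbors[a].append(b)
            neighbors[b].append(a)
    return {t: sorted(ns) for t, ns in neighbors.items()}


# A tetrahedron: every face borders the other three.
tetrahedron = [(1, 3, 2), (1, 2, 4), (1, 4, 3), (2, 3, 4)]
print(triangle_neighbors(tetrahedron))
```

Edges that belong to only one triangle (an open boundary) simply contribute no neighbour.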
You can use the low level output functions described here to output these. I'll leave the details to you (that's my challenge to you).
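As a rough idea of what the output step could look like, here is a Python sketch that formats vertices and triangular panels in the layout of the sample file in the question. Several things are assumptions: the comment header lines are copied from the sample, the NODES and PANEL keyword lines mentioned in the documentation are left out because their exact syntax is not shown, and the order of the neighbour ids within a line is not verified against the external program. `format_input` is a hypothetical helper.

```python
def format_input(vertices, polygons, neighbors):
    """Return the input-file text for a mesh of triangular (type 2) panels."""
    lines = ["# x y z [m]"]
    lines += [f"{x:g} {y:g} {z:g}" for x, y, z in vertices]
    lines.append("# type node_id1 node_id2 node_id3 elem_id1 elem_id2 elem_id3")
    for t, nodes in enumerate(polygons, start=1):
        # Pad missing neighbours with 0, as the documentation describes.
        elems = (list(neighbors.get(t, [])) + [0, 0, 0])[:3]
        lines.append(" ".join(str(v) for v in [2, *nodes, *elems]))
    lines.append("# end of input file")
    return "\n".join(lines) + "\n"


# Tetrahedron example with 1-based node ids and outward-facing node order.
vertices = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
polygons = [(1, 3, 2), (1, 2, 4), (1, 4, 3), (2, 3, 4)]
neighbors = {1: [2, 3, 4], 2: [1, 3, 4], 3: [1, 2, 4], 4: [1, 2, 3]}

text = format_input(vertices, polygons, neighbors)
print(text)
```

Writing `text` to disk is then a single file write.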
The area of the wing is the summed area of the individual polygons. The individual areas can be calculated as follows:
polygonArea[pts_List] :=
Module[{dtpts = Append[pts, pts[[1]]]},
If[Length[pts] < 3,
0,
1/2 Sum[Det[{dtpts[[i]], dtpts[[i + 1]]}], {i, 1, Length[dtpts] - 1}]
]
]
based on this Mathworld page.
The area is signed BTW, so you may want to use Abs.
The above area function is only usable for general polygons in 2D. For the area of a triangle in 3D the following can be used:
polygonArea[pts_List?(Length[#] == 3 &)] :=
Norm[Cross[pts[[2]] - pts[[1]], pts[[3]] - pts[[1]]]]/2
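To spell out how the per-triangle areas add up to a surface area, here is a short Python sketch of the same cross-product formula summed over a mesh. The function names are invented for the example, and the mesh is a tetrahedron with 1-based vertex indices, not the wing from the question.

```python
from math import sqrt


def triangle_area(p1, p2, p3):
    """Area of a 3D triangle: half the norm of the cross product of two edges."""
    ux, uy, uz = (p2[i] - p1[i] for i in range(3))
    vx, vy, vz = (p3[i] - p1[i] for i in range(3))
    cx = uy * vz - uz * vy
    cy = uz * vx - ux * vz
    cz = ux * vy - uy * vx
    return sqrt(cx * cx + cy * cy + cz * cz) / 2


def surface_area(vertices, polygons):
    """Sum of the triangle areas; polygons hold 1-based vertex indices."""
    return sum(triangle_area(*(vertices[i - 1] for i in tri))
               for tri in polygons)


vertices = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
polygons = [(1, 3, 2), (1, 2, 4), (1, 4, 3), (2, 3, 4)]
print(surface_area(vertices, polygons))  # three faces of 1/2 plus one of sqrt(3)/2
```

Unlike the signed 2D formula, this area is always non-negative, so no Abs is needed.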