2 MIDI files in one project - Csound

I'm trying to play two separate MIDI files in one project. Bronson2.mid I have
working fine. How should I approach bass.mid to get it to work alongside the
other file? I have a feeling it's something to do with MIDI numbers, but I'm new to Csound and unsure. I've posted the code below. I'm getting an error message that the MIDI file won't load, even though I've made sure it's in the project folder. Any help is appreciated, thanks.
<CsoundSynthesizer>
<CsOptions>
-F Bronson2.mid
-F bass.mid
</CsOptions>
<CsInstruments>
ksmps = 10
0dbfs = 1
instr 1
iamp ampmidi 1
ifr cpsmidi
aosc oscil iamp, ifr, 1
afilt lpf18 aosc*.5, 1500, .89, .9
out afilt*.02
endin
instr 5
iamp ampmidi 1
ifr cpsmidi
aosc oscil iamp, ifr, 1
afilt lpf18 aosc*.5, 1500, .89, .9
out afilt*.02
endin
instr 4 ;hihat closed
aamp expon .25, 0.1, .001 ;short fade out... an a-rate exponential line, a-rate makes it more accurate
arand rand aamp ;random noise faded
out arand*.1
endin
instr 2 ;snare
iamp = p4
aenv1 expon iamp, 0.03, 0.01 ;short fade out
a1 oscili aenv1, 147, 1 ;'ring' faded, 147 Hz works well for snare ring
aamp expon .25, 0.2, .001 ;short fade out
arand rand aamp ;random noise faded
out a1+arand*.2 ;mix
endin
instr 3; kick
ipitch = p4
k1 expon ipitch, .2, 50 ;detune...with user-controlled starting point.
aenv expon 1, p3, 0.01 ;fade over note
a1 poscil aenv, k1, 1 ;pitched tone, faded out
out a1*.4*.2 ;scale down volume
endin
</CsInstruments>
<CsScore>
f 0 3600
f1 0 1024 10 1
t 0 120
{16 CNT ;38 seconds long
i4 [0 + 4*$CNT.] 0.25
i4 [0.5 + 4*$CNT.] 0.25
i4 [1 + 4*$CNT.] 0.25
i4 [1.5 + 4*$CNT.] 0.25
i4 [2 + 4*$CNT.] 0.25
i4 [2.5 + 4*$CNT.] 0.25
i4 [3 + 4*$CNT.] 0.25
i4 [3.5 + 4*$CNT.] 0.25
i2 [1 + 4*$CNT.] .25 .45
i2 [3 + 4*$CNT.] .25 .45
i3 [0 + 4*$CNT.] .25 100
i3 [2 + 4*$CNT.] .25 100
i3 [2.5 + 4*$CNT.] .25 100
}
</CsScore>
</CsoundSynthesizer>

Related

Two Loop Network on GAMS

Regarding the Two Loop Network formulation on GAMS, I'm struggling with one of the equations.
I can solve the problem without the energy conservation loop constraint but, once I add it, the problem becomes infeasible. Also, I'm not sure if the two loops are well defined.
I would appreciate it if someone spots my error.
Thank you.
Set
n 'nodes' / 1, 2, 3, 4, 5, 6, 7 /
a(n,n) 'arcs/pipes arbitrarily directed'
/1.2, 4.(2,5,6), 3.(2,5), 7.(5,6)/
rn(n) 'reservoir' / 1 /
dn(n) 'demand nodes' /2, 3, 4, 5, 6, 7/
m 'number of loops' /c1, c2/
c1(n,n) 'loop 1'
/2.(3,4), 5.(3,4)/
c2(n,n) 'loop 2'
/4.(5,6), 7.(5,6)/
k 'Options available for the diameters'
/ k1, k2, k3, k4, k5, k6, k7, k8, k9, k10, k11, k12, k13, k14 /;
dn(n) = yes;
dn(rn) = no;
display a;
display dn;
display rn;
display m;
display c1;
display c2;
Alias(n,np);
Table node(n,*) 'node data'
demand elevation minhead
* m^3/sec m m
1 210 30
2 0.0278 150 30
3 0.0278 160 30
4 0.0333 155 30
5 0.0750 150 30
6 0.0917 165 30
7 0.0444 200 30 ;
display node;
Table Diam(k,*) 'Diameter and cost information'
Diameter Cost
* m $/m
k1 0.0254 2
k2 0.0508 5
k3 0.0762 8
k4 0.1016 11
k5 0.1524 16
k6 0.2032 23
k7 0.2540 32
k8 0.3048 50
k9 0.3556 60
k10 0.4064 90
k11 0.4572 130
k12 0.5080 170
k13 0.5588 300
k14 0.6096 550;
Scalar
length 'pipe length in meters' /1000/
roughcoef 'roughness coefficient for every pipe' /130/
Vmin 'Minimum velocity (m/s)' /0.3/
Vmax 'Maximum velocity (m/s)' /3.0/
dmin 'minimum diameter of pipe' /0.0254/
dmax 'maximum diameter of pipe' /0.6096/
davg 'diameter average for starting point';
davg = sqrt(dmin*dmax);
Variable
x(n,n) 'absolute flow through each arc'
y(n,n,k) 'takes value 1 when a pipe in arc(n,n) has diameter e(k) and 0 otherwise'
t(n,n) 'auxiliary variable for modeling the flow going in the forward direction'
r(n,n) 'auxiliary variable for modeling the flow going in the reverse direction'
u(n) 'variable representing the head of node n'
d(n,n) 'representing the diameter of pipe in link (n,n), takes the same value as some e(n)'
v(n,n) 'Water velocity'
q(n,n) 'real variable representing the flow direction by being the flow sign, being 1 if flow goes forward or -1 if in reverse direction for a link (n,n)'
Custo 'total cost';
Binary Variable y, t, r;
NonNegative Variable x, d;
Equation
UniPipeDiam(n,n) 'Unique Pipe Diameter'
PipeDiam(n,n) 'Pipe Diameter Real Value'
FlowDirection(n,n) 'Flow Direction'
FlowSign(n,n) 'Flow Sign'
FlowConservation(n) 'Flow Conservation at Each Node'
HeadLoss(n,n) 'Head Loss'
EnerConserLoop1(n,np) 'Energy Conservation in Loop 1'
EnerConserLoop2(n,np) 'Energy Conservation in Loop 2'
Objective 'Objective Function: Capital Cost'
Velocity(n,n) 'Velocity calculation'
VelocUp(n,np) 'Upper bound velocity'
VelocDown(n,np) 'Lower bound velocity';
UniPipeDiam(a).. sum(k, y(a,k)) =e= 1;
PipeDiam(a(n,np)).. d(n,np) =e= sum(k, Diam(k,'Diameter')*y(n,np,k));
FlowDirection(a(n,np)).. t(a) + r(a) =e= 1;
FlowSign(a(n,np)).. q(a) =e= t(a) - r(a);
FlowConservation(dn(n)).. sum(a(np,n), x(a)*q(a)) - sum(a(n,np), x(a)*q(a)) =e= node(n,"demand");
HeadLoss(a(n,np)).. u(n) - u(np) =e= [10.667]*[roughcoef**(-1.852)]*length*[d(a)**(-4.8704)]*[x(a)**(2)]*q(a);
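* HeadLoss encodes the Hazen-Williams loss, which in SI units is
* h = 10.667 * L * Q**1.852 / (C**1.852 * d**4.8704); the flow exponent used
* above is 2 rather than the usual 1.852, which may be worth double-checking.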
Velocity(a(n,np)).. v(a) =e= (4.0*x(a))/(Pi*d(a)**2.0);
VelocUp(a).. v(a) =l= Vmax;
VelocDown(a).. v(a) =g= Vmin;
EnerConserLoop1(n,np).. sum(a(n,np)$c1(n,np), q(a) * (u(n) - u(np))) =e= 0;
EnerConserLoop2(n,np).. sum(a(n,np)$c2(n,np),q(a) * (u(n) - u(np))) =e= 0;
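* The arcs in c1 are oriented opposite to those in a (e.g. 2.3 in c1 versus 3.2 in a),
* so the condition a(n,np)$c1(n,np) matches no arcs and EnerConserLoop1 reduces to
* 0 =e= 0, while the arcs in c2 (4.5, 4.6, 7.5, 7.6) do appear in a; the orientation
* of loop 1 may be worth revisiting.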
Objective.. Custo =e= sum(a(n,np), sum(k, length*Diam(k,'Cost')*y(n,np,k)));
*bounds
d.lo(n,np)$a(n,np) = dmin;
d.up(n,np)$a(n,np) = dmax;
u.lo(rn) = node(rn,"elevation");
u.lo(dn) = node(dn,"elevation") + 5.0 + 5.0*node(dn,"demand");
u.up(dn) = 300.0;
* initial values
d.l(n,np)$a(n,np) = davg;
u.l(n) = u.lo(n) + 5.0;
Model network / all /;
network.domLim = 1000;
Option Iterlim = 50000;
option MINLP = baron;
solve network using minlp minimizing Custo;

Dirichlet regression coefficients

I am starting with this example of Dirichlet regression here.
My variable y is a vector of N = 3 elements, and the Dirichlet regression model estimates N - 1 coefficients.
Let's say I am interested in all 3 coefficients; how can I get them?
Thanks!
library(brms)
library(rstan)
library(dplyr)
bind <- function(...) cbind(...)
N <- 20
df <- data.frame(
y1 = rbinom(N, 10, 0.5), y2 = rbinom(N, 10, 0.7),
y3 = rbinom(N, 10, 0.9), x = rnorm(N)
) %>%
mutate(
size = y1 + y2 + y3,
y1 = y1 / size,
y2 = y2 / size,
y3 = y3 / size
)
df$y <- with(df, cbind(y1, y2, y3))
make_stancode(bind(y1, y2, y3) ~ x, df, dirichlet())
make_standata(bind(y1, y2, y3) ~ x, df, dirichlet())
fit <- brm(bind(y1, y2, y3) ~ x, df, dirichlet())
summary(fit)
Family: dirichlet
Links: muy2 = logit; muy3 = logit; phi = identity
Formula: bind(y1, y2, y3) ~ x
Data: df (Number of observations: 20)
Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
total post-warmup draws = 4000
Population-Level Effects:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
muy2_Intercept 0.29 0.10 0.10 0.47 1.00 2830 2514
muy3_Intercept 0.56 0.09 0.38 0.73 1.00 2833 2623
muy2_x 0.04 0.11 -0.17 0.24 1.00 3265 2890
muy3_x -0.00 0.10 -0.20 0.19 1.00 3229 2973
Family Specific Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
phi 39.85 9.13 23.83 59.78 1.00 3358 2652
Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
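There is no third coefficient to extract under the default parameterization: the first response category (y1 here) is the reference, its linear predictor is fixed at 0 on the logit scale, and that is why only muy2 and muy3 appear in the summary. A minimal sketch of two workarounds, assuming current brms behavior (posterior_epred and the refcat argument are standard brms API, but double-check your version):
# expected proportions are available for all three categories,
# including the reference category y1
ep <- posterior_epred(fit) # array: draws x observations x 3 categories
apply(ep, 3, mean)         # posterior-mean expected share of y1, y2, y3
# alternatively, refit with another reference category so that
# coefficients for y1 (muy1_Intercept, muy1_x) are estimated directly
fit2 <- brm(bind(y1, y2, y3) ~ x, df, dirichlet(refcat = "y2"))
summary(fit2)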

Why is MIP best bound infinite for this problem?

I have the following MIP problem. Upper bound for pre_6_0 should not be infinite because it is calculated from inp1, inp2, inp3, and inp4, all of which are bounded on both sides.
Maximize
obj: pre_6_0
Subject To
c1: inp0 >= -84
c2: inp0 <= 174
c3: inp1 >= -128
c4: inp1 <= 128
c5: inp2 >= -128
c6: inp2 <= 128
c7: inp3 >= -128
c8: inp3 <= 128
c9: inp4 >= -128
c10: inp4 <= 128
c11: pre_6_0 + 0.03125 inp1 - 0.0078125 inp2 - 0.00390625 inp3
+ 0.00390625 inp4 = -2.5
c12: - 0.0078125 inp0 + pre_6_1 = -2.5
c13: - 0.00390625 inp0 - 0.01171875 inp3 + pre_6_2 = 6.5
c14: - 0.0078125 inp0 + pre_6_3 = -1.5
c15: - 0.00390625 inp0 - 0.0078125 inp3 + pre_6_4 = 6.5
Bounds
pre_6_0 Free
inp0 Free
inp1 Free
inp2 Free
inp3 Free
inp4 Free
pre_6_1 Free
pre_6_2 Free
pre_6_3 Free
pre_6_4 Free
Generals
pre_6_0 inp0 inp1 inp2 inp3 inp4 pre_6_1 pre_6_2 pre_6_3 pre_6_4
The MIP best bound is infinite because no feasible integer solution exists.
Indeed, all the variables in your ILP have been restricted to general integer values (Generals section).
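To see why, follow the divisibility through c12, c13, and c15 (all coefficients are multiples of 1/256):
c12 gives pre_6_1 = -2.5 + inp0/128, so inp0 must be congruent to 64 (mod 128); within [-84, 174] that leaves inp0 = 64 or inp0 = -64.
c13 gives pre_6_2 = 6.5 + inp0/256 + 3*inp3/256; for pre_6_2 to be integral with inp3 in [-128, 128], inp0 = 64 forces inp3 = -64 and inp0 = -64 forces inp3 = 64.
c15 gives pre_6_4 = 6.5 + inp0/256 + 2*inp3/256, which then evaluates to 6.25 or 6.75 respectively, never an integer.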
Here is an example using GLPK to solve the ILP.
15 rows, 10 columns, 25 non-zeros
10 integer variables, none of which are binary
...
Solving LP relaxation...
GLPK Simplex Optimizer, v4.65
5 rows, 10 columns, 15 non-zeros
0: obj = -8.000000000e+00 inf = 1.631e+01 (5)
5: obj = -3.750000000e-01 inf = 0.000e+00 (0)
* 8: obj = 3.000000000e+00 inf = 0.000e+00 (0)
OPTIMAL LP SOLUTION FOUND
Integer optimization begins...
Long-step dual simplex will be used
+ 8: mip = not found yet <= +inf (1; 0)
+ 8: mip = not found yet <= tree is empty (0; 3)
PROBLEM HAS NO INTEGER FEASIBLE SOLUTION
Time used: 0.0 secs
Memory used: 0.1 Mb (63069 bytes)

Create histogram-like bins for a range including negative numbers

I have numbers in a range from -4 to 4, including 0, as in
-0.526350041828112
-0.125648350883331
0.991377353361933
1.079241128983
1.06322905224238
1.17477528478982
-0.0651086035371559
0.818471811380787
0.0355593553368815
I need to create histogram-like buckets, and have been trying to use this:
BEGIN { delta = (delta == "" ? 0.1 : delta) }
{
bucketNr = int(($0+delta) / delta)
cnt[bucketNr]++
numBuckets = (numBuckets > bucketNr ? numBuckets : bucketNr)
}
END {
for (bucketNr=1; bucketNr<=numBuckets; bucketNr++) {
end = beg + delta
printf "%0.1f %0.1f %d\n", beg, end, cnt[bucketNr]
beg = end
}
}
from Create bins with awk histogram-like
The output would look like
-2.4 -2.1 8
-2.1 -1.8 25
-1.8 -1.5 108
-1.5 -1.2 298
-1.2 -0.9 773
-0.9 -0.6 1067
-0.6 -0.3 1914
-0.3 0.0 4174
0.0 0.3 3969
0.3 0.6 2826
0.6 0.9 1460
0.9 1.2 752
1.2 1.5 396
1.5 1.8 121
1.8 2.1 48
2.1 2.4 13
2.4 2.7 1
2.7 3.0 1
I'm thinking I would have to run this twice, once with delta set to 0.3 and once with delta -0.3, and cat the two outputs together.
But I'm not sure this intuition is correct.
This might work for you:
BEGIN { delta = (delta == "" ? 0.1 : delta) }
{
bucketNr = int(($0<0?$0-delta:$0)/delta)
cnt[bucketNr]++
maxBucket = (maxBucket > bucketNr ? maxBucket : bucketNr)
minBucket = (minBucket < bucketNr ? minBucket : bucketNr)
}
END {
beg = minBucket*delta
for (bucketNr=minBucket; bucketNr<=maxBucket; bucketNr++) {
end = beg + delta
printf "%0.1f %0.1f %d\n", beg, end, cnt[bucketNr]
beg = end
}
}
It's basically the code you posted + handling negative numbers.
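Assuming you save the script as buckets.awk and your numbers sit one per line in data.txt (both file names are placeholders), you would run it as, for example:
awk -v delta=0.3 -f buckets.awk data.txt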

Faster way to split a string and count characters using R?

I'm looking for a faster way to calculate GC content for DNA strings read in from a FASTA file. This boils down to taking a string and counting the number of times that the letter 'G' or 'C' appears. I also want to specify the range of characters to consider.
I have a working function that is fairly slow, and it's causing a bottleneck in my code. It looks like this:
##
## count the number of GCs in the characters between start and stop
##
gcCount <- function(line, st, sp){
chars = strsplit(as.character(line),"")[[1]]
numGC = 0
for(j in st:sp){
##nested ifs faster than an OR (|) construction
if(chars[[j]] == "g"){
numGC <- numGC + 1
}else if(chars[[j]] == "G"){
numGC <- numGC + 1
}else if(chars[[j]] == "c"){
numGC <- numGC + 1
}else if(chars[[j]] == "C"){
numGC <- numGC + 1
}
}
return(numGC)
}
Running Rprof gives me the following output:
> a = "GCCCAAAATTTTCCGGatttaagcagacataaattcgagg"
> Rprof(filename="Rprof.out")
> for(i in 1:500000){gcCount(a,1,40)};
> Rprof(NULL)
> summaryRprof(filename="Rprof.out")
$by.self
self.time self.pct total.time total.pct
"gcCount" 77.36 76.8 100.74 100.0
"==" 18.30 18.2 18.30 18.2
"strsplit" 3.58 3.6 3.64 3.6
"+" 1.14 1.1 1.14 1.1
":" 0.30 0.3 0.30 0.3
"as.logical" 0.04 0.0 0.04 0.0
"as.character" 0.02 0.0 0.02 0.0
$by.total
total.time total.pct self.time self.pct
"gcCount" 100.74 100.0 77.36 76.8
"==" 18.30 18.2 18.30 18.2
"strsplit" 3.64 3.6 3.58 3.6
"+" 1.14 1.1 1.14 1.1
":" 0.30 0.3 0.30 0.3
"as.logical" 0.04 0.0 0.04 0.0
"as.character" 0.02 0.0 0.02 0.0
$sampling.time
[1] 100.74
Any advice for making this code faster?
Better to not split at all, just count the matches:
gcCount2 <- function(line, st, sp){
sum(gregexpr('[GCgc]', substr(line, st, sp))[[1]] > 0)
}
That's an order of magnitude faster.
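As a side note, the no-match case is also safe: gregexpr() returns -1 when the pattern is absent, so the sum is 0. A quick sanity check on the question's example string (hand-counted, there are 16 G/C characters in positions 1 to 40):
gcCount2("GCCCAAAATTTTCCGGatttaagcagacataaattcgagg", 1, 40) # 16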
A small C function that just iterates over the characters would be yet another order of magnitude faster.
A one-liner:
table(strsplit(toupper(a), '')[[1]])
I don't know that it's any faster, but you might want to look at the R package seqinR - http://pbil.univ-lyon1.fr/software/seqinr/home.php?lang=eng. It is an excellent, general bioinformatics package with many methods for sequence analysis. It's on CRAN (which seems to be down as I write this).
GC content would be:
mysequence <- s2c("agtctggggggccccttttaagtagatagatagctagtcgta")
GC(mysequence) # 0.4761905
That's from a string; you can also read in a FASTA file using read.fasta().
There's no need to use a loop here.
Try this:
gcCount <- function(line, st, sp){
chars = strsplit(as.character(line),"")[[1]][st:sp]
length(which(tolower(chars) == "g" | tolower(chars) == "c"))
}
Try this function from the stringi package:
> stri_count_fixed("GCCCAAAATTTTCCGG",c("G","C"))
[1] 3 5
or you can use the regex version to count all of G, g, C, and c at once:
> stri_count_regex("GCCCAAAATTTTCCGGggcc",c("G|g|C|c"))
[1] 12
or you can use the tolower function first and then stri_count:
> stri_trans_tolower("GCCCAAAATTTTCCGGggcc")
[1] "gcccaaaattttccggggcc"
Time performance, using the microbenchmark package:
> microbenchmark(gcCount(x,1,40),gcCount2(x,1,40), stri_count_regex(x,c("[GgCc]")))
Unit: microseconds
expr min lq median uq max neval
gcCount(x, 1, 40) 109.568 112.42 113.771 116.473 146.492 100
gcCount2(x, 1, 40) 15.010 16.51 18.312 19.213 40.826 100
stri_count_regex(x, c("[GgCc]")) 15.610 16.51 18.912 20.112 61.239 100
Another example, for a longer string; stri_dup replicates a string n times:
> stri_dup("abc",3)
[1] "abcabcabc"
As you can see, for longer sequences stri_count is faster :)
> y <- stri_dup("GCCCAAAATTTTCCGGatttaagcagacataaattcgagg",100)
> microbenchmark(gcCount(y,1,40*100),gcCount2(y,1,40*100), stri_count_regex(y,c("[GgCc]")))
Unit: microseconds
expr min lq median uq max neval
gcCount(y, 1, 40 * 100) 10367.880 10597.5235 10744.4655 11655.685 12523.828 100
gcCount2(y, 1, 40 * 100) 360.225 369.5315 383.6400 399.100 438.274 100
stri_count_regex(y, c("[GgCc]")) 131.483 137.9370 151.8955 176.511 221.839 100
Thanks to all for this post.
To optimize a script in which I want to calculate the GC content of 100M sequences of 200 bp, I ended up testing the different methods proposed here. Ken Williams' method performed best (2.5 hours), better than seqinr (3.6 hours). Using stringr's str_count reduced it to 1.5 hours.
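For reference, the stringr call was presumably along these lines (a sketch, not the original code):
library(stringr)
str_count("GCCCAAAATTTTCCGGatttaagcagacataaattcgagg", "[GCgc]") # 16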
In the end I coded it in C++ and called it using Rcpp, which cut the computation time down to 10 minutes!
Here is the C++ code:
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
float pGC_cpp(std::string s) {
// count occurrences of 'G' and 'C' (uppercase input assumed)
int count = 0;
for (std::size_t i = 0; i < s.size(); i++)
if (s[i] == 'G') count++;
else if (s[i] == 'C') count++;
// GC content as a percentage of the sequence length
float pGC = (float)count / s.size();
pGC = pGC * 100;
return pGC;
}
I call it from R by typing:
sourceCpp("pGC_cpp.cpp")
pGC_cpp("ATGCCC")