Finding permeation events in each pore of an AQP protein embedded in a lipid bilayer through VMD scripting

My project is based on MD simulation analysis of a system consisting of a water box and a lipid bilayer with an aquaporin embedded in it. A 150 ns simulation was performed on this system to study water permeation and flow through the lipid bilayer. One part of my analysis requires counting the water permeation events through each channel of the embedded protein (the protein contains four monomers forming four water channels). I am performing my analysis using VMD.
I found this script while searching the internet: https://www.ks.uiuc.edu/Training/Tutorials/science/nanotubes/files/permeation.tcl. However, it does not give the results I need.
I want to find the permeation events through each pore/water channel separately, whereas this script only counts the water permeation events through the AQP layer as a whole. I don't have enough expertise to change the script to fit my requirement.

The first question is, of course, whether the simulation has the information that you need. After all, if we can't discover that, then we've got a problem!
If we look at the analysis code itself, we can see that all it is actually doing is using the Z coordinate of the water molecules in each frame and ignoring the other coordinates (which would be required for estimating which pore was used). It decides what is going on with them using a tiny little state machine per molecule. The relevant code is this (after conventionalising the input):
for {set fr 0} {$fr < $numFrame} {incr fr} {
    molinfo top set frame $fr
    set oldList $labelList
    set labelList {}
    foreach z [$wat get z] oldLab $oldList segname $segList resid $ridList {
        if {$z > $upperEnd} {
            set newLab 2
            if {$oldLab == -1} {
                puts "$segname:$resid permeated through the nanotubes along +z direction at frame $fr"
                if {$fr >= $skipFrame} {
                    incr num1
                }
            }
        } elseif {$z < $lowerEnd} {
            set newLab -2
            if {$oldLab == 1} {
                puts "$segname:$resid permeated through the nanotubes along -z direction at frame $fr"
                if {$fr >= $skipFrame} {
                    incr num2
                }
            }
        } elseif {abs($oldLab) > 1} {
            set newLab [expr {$oldLab / 2}]
        } else {
            set newLab $oldLab
        }
        lappend labelList $newLab
    }
}
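To make the per-molecule state machine concrete, here is a minimal Python sketch of the same labelling logic applied to the z trajectory of a single molecule. It is an illustration only, not part of the Tcl script: the function name `count_permeations` and the starting label of 0 are assumptions made for the example.

```python
def count_permeations(z_series, lower_end, upper_end):
    """Count +z and -z permeation events for one molecule's z trajectory."""
    label = 0  # assumption: the molecule starts with no history
    up = down = 0
    for z in z_series:
        if z > upper_end:
            # Above the layer; it counts only if the molecule entered from below.
            if label == -1:
                up += 1
            label = 2
        elif z < lower_end:
            # Below the layer; it counts only if the molecule entered from above.
            if label == 1:
                down += 1
            label = -2
        elif abs(label) > 1:
            # Just entered the layer: 2 -> 1 (from above), -2 -> -1 (from below).
            label //= 2
        # Otherwise the molecule is still inside and keeps its label.
    return up, down


# Made-up trajectories with the layer between z = -5 and z = 5.
full_crossing = count_permeations([-10, 0, 10], lower_end=-5, upper_end=5)
turned_back = count_permeations([10, 0, 10], lower_end=-5, upper_end=5)
print(full_crossing, turned_back)
```

A molecule that enters from above and leaves upwards again is not counted, which is why the label has to remember the side of entry.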
Perhaps a start would be to collect the X and Y coordinates of the molecules immediately after the transit events and to plot those? I don't know if that will help, but maybe?
# Start with empty event lists so the later output loops work even if nothing permeates
set permeateUpwards {}
set permeateDownwards {}
for {set fr 0} {$fr < $numFrame} {incr fr} {
    molinfo top set frame $fr
    set oldList $labelList
    set labelList {}
    foreach x [$wat get x] y [$wat get y] z [$wat get z] oldLab $oldList segname $segList resid $ridList {
        if {$z > $upperEnd} {
            set newLab 2
            if {$oldLab == -1} {
                puts "$segname:$resid permeated through the nanotubes along +z direction at frame $fr"
                if {$fr >= $skipFrame} {
                    incr num1
                }
                # Remember event for later
                lappend permeateUpwards $x $y
            }
        } elseif {$z < $lowerEnd} {
            set newLab -2
            if {$oldLab == 1} {
                puts "$segname:$resid permeated through the nanotubes along -z direction at frame $fr"
                if {$fr >= $skipFrame} {
                    incr num2
                }
                # Remember event for later
                lappend permeateDownwards $x $y
            }
        } elseif {abs($oldLab) > 1} {
            set newLab [expr {$oldLab / 2}]
        } else {
            set newLab $oldLab
        }
        lappend labelList $newLab
    }
}
Now that we have those lists, we can try to print them to a file so that you can plot them:
set f [open "downwards.csv" w]
foreach {x y} $permeateDownwards {
    puts $f "$x,$y"
}
close $f
set f [open "upwards.csv" w]
foreach {x y} $permeateUpwards {
    puts $f "$x,$y"
}
close $f
There are plenty of tools that can plot a series of points in a CSV, and you can look at that and see if what you've got is at least reasonable.
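If the plotted points do fall into four clusters, one possible next step is to assign each event to the nearest pore centre and count per pore. The Python sketch below shows that idea under loud assumptions: the pore centres and event coordinates are invented for illustration (in practice you would estimate the centres from your own structure), and the position just after a transit is only a proxy for the pore that was used, since a water molecule can drift sideways once it leaves the channel.

```python
import math
from collections import Counter


def assign_to_pores(points, pore_centres):
    """Count events per pore, using the nearest pore centre in the xy plane."""
    counts = Counter()
    for x, y in points:
        nearest = min(
            pore_centres,
            key=lambda name: math.dist((x, y), pore_centres[name]),
        )
        counts[nearest] += 1
    return counts


# Hypothetical pore centres (x, y); replace with values from your system.
pore_centres = {
    "A": (15.0, 15.0),
    "B": (-15.0, 15.0),
    "C": (-15.0, -15.0),
    "D": (15.0, -15.0),
}

# Made-up events standing in for the rows of upwards.csv or downwards.csv.
events = [(14.2, 16.1), (-13.9, 14.0), (15.5, 13.8), (-16.0, -15.2)]

counts = assign_to_pores(events, pore_centres)
print(dict(counts))
```

Points that land far from every centre are worth inspecting by hand before trusting the per-pore totals.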

Reading file line by line in Perl6, how to do idiomatically?

I have a rudimentary script in Perl6 which runs very slowly, about 30x slower than the exact perl5 translation.
CONTROL {
    when CX::Warn {
        note $_;
        exit 1;
    }
}
use fatal;
role KeyRequired {
    method AT-KEY (\key) {
        die "Key {key} not found" unless self.EXISTS-KEY(key);
        nextsame;
    }
}
for dir(test => /^nucleotide_\d**2_\d**2..3\.tsv$/) -> $tsv {
    say $tsv;
    my $qqman = $tsv.subst(/\.tsv$/, '.qqman.tsv');
    my $out = open $qqman, :w;
    put "\t$qqman";
    my UInt $line-no = 0;
    for $tsv.lines -> $line {
        if $line-no == 0 {
            $line-no = 1;
            $out.put(['SNP', 'CHR', 'BP', 'P', 'zscore'].join("\t"));
            next
        }
        if $line ~~ /.+X/ {
            next
        }
        $line-no++;
        my @line = $line.split(/\s+/);
        my $chr = @line[0];
        my $nuc = @line[1];
        my $p = @line[3];
        my $zscore = @line[2];
        my $snp = "'rs$line-no'";
        $out.put([$snp, $chr, $nuc, $p, $zscore].join("\t"));
        #$out.put();
    }
    last
}
this is idiomatic in Perl5's while.
This is a very simple script, which only alters columns of text in a file. This Perl6 script runs in 30 minutes. The Perl5 translation runs in 1 minute.
I've tried reading Using Perl6 to process a large text file, and it's Too Slow.(2014-09) and Perl6 : What is the best way for dealing with very big files? but I'm not seeing anything that could help me here :(
I'm running Rakudo version 2018.03 built on MoarVM version 2018.03
implementing Perl 6.c.
I realize that Rakudo hasn't matured to Perl5's level (yet, I hope), but how can I get this to read the file line by line in a more reasonable time frame?
There are a bunch of things I would change.
/.+X/ can be simplified to just /.X/ or even $line.substr(1).contains('X')
$line.split(/\s+/) can be simplified to $line.words
$tsv.subst(/\.tsv$/, '.qqman.tsv') can be simplified to $tsv.substr(0, *-4) ~ '.qqman.tsv'
uint instead of UInt
given .head {} instead of for … {last}
given dir(test => /^nucleotide_\d**2_\d**2..3\.tsv$/).head -> $tsv {
    say $tsv;
    # drop the trailing '.tsv' (4 characters) and append the new extension
    my $qqman = $tsv.substr(0, *-4) ~ '.qqman.tsv';
    my $out = open $qqman, :w;
    put "\t$qqman";
    my uint $line-no = 0;
    for $tsv.lines -> $line {
        FIRST {
            $line-no = 1;
            $out.put(('SNP', 'CHR', 'BP', 'P', 'zscore').join("\t"));
            next
        }
        next if $line.substr(1).contains('X');
        ++$line-no;
        my ($chr,$nuc,$zscore,$p) = $line.words;
        my $snp = "'rs$line-no'";
        $out.put(($snp, $chr, $nuc, $p, $zscore).join("\t"));
        #$out.put();
    }
}

Concatenating lists in Raku

I'm looking for a simpler solution.
I have a list of prefixes with corresponding suffixes and a list of roots.
my @prefixes = 'A'..'E';
my @suffixes = 'a'..'e';
my @roots = 1, 2;
I would like to make all the possible 'words': A1a, B1b...A2a...E2e.
my @words;
for @roots -> $r {
    for @prefixes.kv -> $i, $p {
        my $s = @suffixes[$i];
        my $word = [~] $p, $r, $s;
        @words.push: $word;
    }
}
say @words; # [A1a B1b C1c D1d E1e A2a B2b C2c D2d E2e]
I suppose that it is possible to do it much easier using something like zip or cross, but can't figure out how...
My solution would be:
say @roots.map: |(@prefixes >>~>> * <<~<< @suffixes);
Create a WhateverCode for metaopping concatenation, slipping the result to get a Seq with only scalar values at the end.
A few more ways to write it:
say @roots X[&join] (@prefixes Z @suffixes);
say @roots.map({ |(@prefixes Z @suffixes)».join($_) });
say @roots.map({ (@prefixes X~ $_) Z~ @suffixes }).flat;
say (|@prefixes xx *) Z~ (@roots X~ @suffixes);
my @formats = (@prefixes Z @suffixes).flat.map(* ~ '%s' ~ *);
say @formats X[&sprintf] @roots;
(Note: This one prints them in a different order.)
say do for @roots -> $root {
    |do for (@prefixes Z @suffixes) -> [$prefix, $suffix] {
        $prefix ~ $root ~ $suffix
    }
}

TCL multiple assignment (as in Perl or Ruby)

In Ruby or Perl one can assign more than one variable at once by using parentheses. For example (in Ruby):
(i,j) = [1,2]
(k,m) = foo() #foo returns a two element array
Can one accomplish the same in Tcl in an elegant way? I know that you can do:
foreach varname { i j } val { 1 2 } { set $varname $val }
foreach varname { k m } val [ foo ] { set $varname $val }
But I was hoping for something shorter, with fewer braces.
Since Tcl 8.5, you can do
lassign {1 2} i j
lassign [foo] k m
Note the somewhat unintuitive left-to-right order of value sources -> variables. It's not a unique design choice: e.g. scan and regexp use the same convention. I'm one of those who find it a little less readable, but once one has gotten used to it it's not really a problem.
If one really needs a Ruby-like syntax, it can easily be arranged:
proc mset {vars vals} {
    uplevel 1 [list lassign $vals {*}$vars]
}
mset {i j} {1 2}
mset {k m} [foo]
Before Tcl 8.5 you can use
foreach { i j } { 1 2 } break
foreach { k m } [ foo ] break
which at least has fewer braces than in your example.
Documentation: break, foreach, lassign, list, proc, uplevel

Optimising MST using Prim's algorithm from O(n^3) to O(n^2)

Following is my pseudocode for converting a connected graph to an MST using Prim's algorithm. However, I am getting a complexity of n^3 rather than n^2. Please help me figure out the non-required steps. I have an adjacency matrix "a" to store the weights of the graph edges and a 2D matrix "check" storing "1" for vertices already in the tree and "0" for the remaining ones.
Please also note that this can be done in nlog(n) as well, but I don't want to refer to any existing pseudocode and want to try it on my own. I would appreciate an answer optimizing my own approach.
Initialize check. //check[0][1]==1
while(--no_of_vertices)
{
minimum_outer = infinite
for(i from 1 to no_of_vertices)
{
minimum_inner = infinite
select i from check having value 1
for(j from 1 to no_of_vertices )
{
select j from check having value 0
if(a[i-j] < minimum_inner)
minimum_inner = a[i-j]
temp_j = j;
}
if(minimum_inner<minimum_outer)
{
minimum_outer = minimum_inner
final_i = i
final_j = temp_j
}
}
//until here basically what I have done is, selected an i-j with lowest
//weight where "i" is chosen from vertices already in MST and "j" from
//remaining vertices
check[final_j][1] = 1
print final_i - final_j
cost_of_mst += a[final_i][final_j]
}
The reason your algorithm runs in O(V^3) time is that in each iteration you go through the entire adjacency matrix, which takes O(V^2) and performs some redundant work.
Whenever you add a vertex to the spanning tree, there are at most |V|-1 new edges that may be added to the solution. At each iteration, you should only check whether these edges change the minimal weight to each of the other vertices.
The algorithm should look like:
1. Select a random vertex v, add v to S, assign an array A where A[i] = d{v, i}
2. While there are vertices that belong to G and not to S do:
2.1. Iterate through A, select the minimum value A[i], add vertex i to S
2.2. for each edge e={i, j} connected to vertex i do:
2.2.1. if d{i, j} < A[j] then A[j] = d{i ,j}
This way you are performing O(V) actions for each vertex you add instead of O(V^2), and the overall running time is O(V^2)
Here's an edit to your code:
Select a random vertex x
Initialize DistanceTo // DistanceTo[i] = d{x, i}
Initialize Parent // Parent[i] = x
Initialize Visited // Visited[i] = false if i!=x, Visited[x] = true
while(--no_of_vertices)
{
    min_val = infinite
    min_index = 1
    for(i from 1 to DistanceTo.size)
    {
        if (Visited[i] == false && DistanceTo[i] < min_val)
        {
            min_val = DistanceTo[i]
            min_index = i
        }
    }
    Visited[min_index] = true
    // the edge that joins min_index to the tree is the one recorded in Parent
    print edge {Parent[min_index], min_index} was added to the MST
    cost_of_mst += min_val
    for(i from 1 to DistanceTo.size)
    {
        if (Visited[i] == false && d{min_index, i} < DistanceTo[i])
        {
            DistanceTo[i] = d{min_index, i}
            Parent[i] = min_index
        }
    }
}
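For comparison, here is a runnable Python sketch of the same O(V^2) idea on an adjacency matrix. It is one way to write it, not the only one: the function name `prim_mst`, the `parent` list used to recover which edge was added, the use of `math.inf` for missing edges, and the small example graph are all assumptions made for illustration.

```python
import math


def prim_mst(weights, start=0):
    """Prim's algorithm on an adjacency matrix; returns (edges, total cost)."""
    n = len(weights)
    visited = [False] * n
    distance_to = list(weights[start])  # cheapest known edge from the tree to i
    parent = [start] * n                # tree endpoint of that cheapest edge
    visited[start] = True
    edges = []
    total = 0
    for _ in range(n - 1):
        # O(V): pick the unvisited vertex that is cheapest to attach.
        best = -1
        for i in range(n):
            if not visited[i] and (best == -1 or distance_to[i] < distance_to[best]):
                best = i
        visited[best] = True
        edges.append((parent[best], best))
        total += distance_to[best]
        # O(V): only edges leaving the new vertex can improve distance_to.
        for i in range(n):
            if not visited[i] and weights[best][i] < distance_to[i]:
                distance_to[i] = weights[best][i]
                parent[i] = best
    return edges, total


INF = math.inf  # marks a missing edge
weights = [
    [INF, 1, 4, 3],
    [1, INF, 2, INF],
    [4, 2, INF, 5],
    [3, INF, 5, INF],
]
edges, total = prim_mst(weights)
print(edges, total)
```

Each of the V-1 rounds does two linear scans, so the whole run is O(V^2); the sketch assumes the graph is connected, as in the question.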

How to receive message from 'any' channel in PROMELA/SPIN

I'm modeling an algorithm in Spin.
I have a process that has several channels, and at some point I know a message is going to come but I don't know from which channel. So I want to wait (block) the process until a message comes from any of the channels. How can I do that?
I think you need Promela's if construct (see http://spinroot.com/spin/Man/if.html).
In the process you're referring to, you probably need the following:
byte var;
if
:: ch1?var -> skip
:: ch2?var -> skip
:: ch3?var -> skip
fi
If none of the channels have anything on them, then "the selection construct as a whole blocks" (quoting the manual), which is exactly the behaviour you want.
To quote the relevant part of the manual more fully:
"An option [each of the :: lines] can be selected for execution only when its guard statement is executable [the guard statement is the part before the ->]. If more than one guard statement is executable, one of them will be selected non-deterministically. If none of the guards are executable, the selection construct as a whole blocks."
By the way, I haven't syntax checked or simulated the above in Spin. Hopefully it's right. I'm quite new to Promela and Spin myself.
If you want to have your number of channels variable without having to change the implementation of the send and receive parts, you might use the approach of the following producer-consumer example:
#define NUMCHAN 4
chan channels[NUMCHAN];
init {
    chan ch1 = [1] of { byte };
    chan ch2 = [1] of { byte };
    chan ch3 = [1] of { byte };
    chan ch4 = [1] of { byte };
    channels[0] = ch1;
    channels[1] = ch2;
    channels[2] = ch3;
    channels[3] = ch4;
    // Add further channels above, in
    // accordance with NUMCHAN
    // First let the producer write
    // something, then start the consumer
    run producer();
    atomic { _nr_pr == 1 ->
        run consumer();
    }
}
proctype consumer() {
    byte var, i;
    chan theChan;
    i = 0;
    do
    :: i == NUMCHAN -> break
    :: else ->
        theChan = channels[i];
        if
        :: skip // non-deterministic skip
        :: nempty(theChan) ->
            theChan ? var;
            printf("Read value %d from channel %d\n", var, i+1)
        fi;
        i++
    od
}
proctype producer() {
    byte var, i;
    chan theChan;
    i = 0;
    do
    :: i == NUMCHAN -> break
    :: else ->
        theChan = channels[i];
        if
        :: skip;
        :: theChan ! 1;
            printf("Write value 1 to channel %d\n", i+1)
        fi;
        i++
    od
}
The do loop in the consumer process steps through the indices 0 to NUMCHAN-1 and, for each channel, non-deterministically either skips it or reads from it if there is something to read. Naturally, during a simulation with Spin the probability to read from channel NUMCHAN is much smaller than that of channel 0, but this does not make any difference in model checking, where any possible path is explored.