Include data in GAMS with an Excel file

I want to include my data in GAMS from an Excel file:
$onecho > Dateninput.txt
set=i rng=Sets!A2 rdim=1
set=o rng=Sets!D2 rdim=1
set=k rng=Sets!G2 rdim=1
set=n rng=Sets!J2 rdim=1
set=t rng=Sets!M2 rdim=1
par=p rng=Parameter!A2:D10 cdim=1 rdim=1
par=d rng=Parameter!G2:J5 rdim=1 cdim=1
par=l rng=Parameter!B33:O58 rdim=2 cdim=2
par=ti rng=Parameter!A18 rdim=1
par=M rng=Parameter!G26 dim=0
par=h rng=Parameter!G17:J21 rdim=1 cdim=1
par=cap rng=Parameter!A26:D29 rdim=1 cdim=1
$offecho
$call gdxxrw i=Daten.xlsx o=inputs.gdx #Dateninput.txt
$GDXIN inputs.gdx
$LOAD i, o, k, n, t, p, d, ti, l, M, h, cap,
$gdxin
.....
execute_unload 'ergebnisse.gdx';
The 'inputs.gdx' file shows all sets and parameters. My output file 'ergebnisse.gdx' shows only the sets and the one-dimensional parameters, but not the two-dimensional parameters. Could you please help me find what I did wrong?
Thank you very much !

Related

Snakemake multiple input files with expand but no repetitions

I'm new to snakemake and I don't know how to figure out this problem.
I've got my rule which has two inputs:
rule test
input_file1=f1
input_file2=f2
f1 is in [A{1}$, A{2}£, B{1}€, B{2}¥]
f2 is in [C{1}, C{2}]
The numbers are wildcards that come from an expand call. I need to find a way to pass to the file f1 and f2 a pair of files that match exactly with the number. For example:
f1 = A1
f2 = C1
or
f1 = B1
f2 = C1
I have to avoid combinations such as:
f1 = A1
f2 = C2
I could write a function that matches the files this way, but it would have to handle input_file1 and input_file2 at the same time. I thought of writing a function that creates a dictionary of the allowed combinations, but how would I "iterate" over it during the expand?
Thanks
Assuming rule test produces an output file named {f1}.{f2}.txt, you need some mechanism that correctly pairs f1 and f2 and creates a list of {f1}.{f2}.txt files.
How you create this list is up to you. expand is just a convenience function for that, and in this case you may want to avoid it.
Here's a super simple example:
import re
fin1 = ['A1$', 'A2£', 'B1€', 'B2¥']
fin2 = ['C1', 'C2']
outfiles = []
for x in fin1:
    for y in fin2:
        ## Here you pair f1 and f2. This is a very trivial way of doing it:
        if y[1] in x:
            outfiles.append('%s.%s.txt' % (x, y))
wildcard_constraints:
    f1 = '|'.join([re.escape(x) for x in fin1]),
    f2 = '|'.join([re.escape(x) for x in fin2]),
rule all:
    input:
        outfiles,
rule test:
    input:
        input_f1 = '{f1}.txt',
        input_f2 = '{f2}.txt',
    output:
        '{f1}.{f2}.txt',
    shell:
        r"""
        cat {input} > {output}
        """
This pipeline will execute the following commands:
cat A2£.txt C2.txt > A2£.C2.txt
cat A1$.txt C1.txt > A1$.C1.txt
cat B1€.txt C1.txt > B1€.C1.txt
cat B2¥.txt C2.txt > B2¥.C2.txt
If you touch the starting input files with touch 'A1$.txt' 'A2£.txt' 'B1€.txt' 'B2¥.txt' 'C1.txt' 'C2.txt' you should be able to run this example.
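The `y[1] in x` test above relies on single-digit numbers sitting at a fixed position. A slightly more general sketch, assuming every file name contains exactly one run of digits (an assumption, not something the question states), extracts the number with a regular expression and groups the names by it. The helper names `sample_number` and `paired_outputs` are hypothetical; the resulting list could be used in place of `outfiles` in rule all.

```python
import re
from collections import defaultdict

fin1 = ['A1$', 'A2£', 'B1€', 'B2¥']
fin2 = ['C1', 'C2']

def sample_number(name):
    """Return the first run of digits in a file name, as a string."""
    match = re.search(r'\d+', name)
    if match is None:
        raise ValueError('no number found in %r' % name)
    return match.group()

def paired_outputs(first, second):
    """Pair each name in `first` with the names in `second` that share its number."""
    by_number = defaultdict(list)
    for y in second:
        by_number[sample_number(y)].append(y)
    return ['%s.%s.txt' % (x, y)
            for x in first
            for y in by_number[sample_number(x)]]

outfiles = paired_outputs(fin1, fin2)
print(outfiles)
```

Grouping by the extracted number avoids the nested loop and keeps working if the numbers grow beyond one digit, as long as the one-number-per-name assumption holds.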

How to read excel two dimensional parameter in Gams?

I have a GAMS model and I want to read sets and parameters from Excel into GAMS, as shown below:
How can I read this parameter in GAMS?
Thanks
For that table you need two indices (i.e. sets): for example, set i for the column containing a, b and c, and set j for the row containing d, e and f. Try this:
parameter d(i,j) "Data with column of a, b and c and row of d, e and f";
$Call GDXXRW.exe i=C:\Input.xlsx par=d rng=Sheet1!C1:F4 Rdim=1 Cdim=1 o=C:\Input.gdx
$GDXIN C:\Input.gdx
$LOAD d
$GDXIN
Display d;
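To make the roles of Rdim=1 and Cdim=1 more concrete, here is a small Python sketch of the mapping they describe: the first column of the range supplies the first index, the first row supplies the second index, and each remaining cell becomes one parameter value. This only illustrates the idea and is not how GDXXRW is implemented; the cell values and the helper name `table_to_parameter` are invented.

```python
# Hypothetical 4x4 cell range: top-left cell empty, column headers in the
# first row, row labels in the first column. The numbers are invented.
cells = [
    [None, 'd', 'e', 'f'],
    ['a',  1.0, 2.0, 3.0],
    ['b',  4.0, 5.0, 6.0],
    ['c',  7.0, 8.0, 9.0],
]

def table_to_parameter(cells):
    """Map a table with one header row and one label column to {(i, j): value}."""
    col_labels = cells[0][1:]
    param = {}
    for row in cells[1:]:
        row_label, values = row[0], row[1:]
        for col_label, value in zip(col_labels, values):
            if value is not None:  # skip empty cells
                param[(row_label, col_label)] = value
    return param

d = table_to_parameter(cells)
print(d[('a', 'e')])
```

Each key (i, j) in the dictionary plays the role of one entry d(i,j) in the GAMS parameter.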

Fortran: read variables that are not present in a file

I need help understanding this 50-line program:
      implicit none
      integer maxk, maxb, maxs
      parameter (maxk=6000, maxb=1000, maxs=5)
      integer nk, nspin, nband, ik, is, ib
      double precision e(maxb, maxs, maxk), k(maxk)
      double precision ef, kmin, kmax, emin, emax
      logical overflow
      read(5,*) ef
      read(5,*) kmin, kmax
      read(5,*) emin, emax
      read(5,*) nband, nspin, nk
      overflow = (nband.gt.maxb) .or. (nk.gt.maxk) .or. (nspin.gt.maxs)
      if (overflow) stop 'Dimensions in gnubands too small'
      write(6,"(2a)") '# GNUBANDS: Utility for SIESTA to transform ',
     .  'bands output into Gnuplot format'
      write(6,"(a)") '#'
      write(6,"(2a)") '# ',
     .  ' Emilio Artacho, Feb. 1999'
      write(6,"(2a)") '# ------------------------------------------',
     .  '--------------------------------'
      write(6,"(a,f10.4)") '# E_F = ', ef
      write(6,"(a,2f10.4)") '# k_min, k_max = ', kmin, kmax
      write(6,"(a,2f10.4)") '# E_min, E_max = ', emin, emax
      write(6,"(a,3i6)") '# Nbands, Nspin, Nk = ', nband, nspin, nk
      write(6,"(a)") '#'
      write(6,"(a)") '# k E'
      write(6,"(2a)") '# ------------------------------------------',
     .  '--------------------------------'
      read(5,*) (k(ik),((e(ib,is,ik),ib=1,nband), is=1,nspin), ik=1,nk)
      do is = 1, nspin
        do ib = 1, nband
          write(6,"(2f14.6)") ( k(ik), e(ib,is,ik), ik = 1, nk)
          write(6,"(/)")
        enddo
      enddo
This is a free format Fortran file. The name of the program is gnubands, and it rearranges numbers from an input (which the user specifies). I would like to know how this program operates. Here is what I do not understand. The program takes input from a file and reads
ef, kmin,kmax,emin,emax,nband,nspin,nk
However, none of these variable names appears in the input file. I opened the input file in vi and searched for them using /, without any results. Nevertheless, the program appears to pick up all the values correctly. What is happening?
Also, I do not understand the read format
read(5,*) (k(ik),((e(ib,is,ik),ib=1,nband), is=1,nspin), ik=1,nk)
I am not familiar with the syntax and would like to know what it is saying or any references.
Some tutorial PDF of SIESTA shows that the input for gnubands.f is something like this:
whose header part is to be read by the first four read statements of gnubands.f. With this input, the variables are set as
ef = -5.018...
kmin = 0.000...
kmax = 3.338...
emin = -25.187...
emax = 143.069...
nband = 18
nspin = 1
nk = 150
by giving the input file from the standard input (assumed unit number 5) as
gfortran -o gnubands.x gnubands.f
./gnubands.x < your_data_file.bands
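The way the four header read statements pick up values by position rather than by name can be mimicked in a few lines of Python. This is a simplified sketch: it assumes each read consumes exactly one line, whereas Fortran list-directed input may continue onto further lines if a record is too short. The numbers are invented for illustration and are not the values from the tutorial; `read_header` is a hypothetical helper.

```python
import io

# Invented header lines; a real .bands file would hold the SIESTA values.
sample = io.StringIO(
    "-1.5\n"
    "0.0 2.0\n"
    "-10.0 10.0\n"
    "2 1 3\n"
)

def read_header(stream):
    """Read four records, assigning values to names purely by position."""
    ef, = (float(tok) for tok in stream.readline().split())
    kmin, kmax = (float(tok) for tok in stream.readline().split())
    emin, emax = (float(tok) for tok in stream.readline().split())
    nband, nspin, nk = (int(tok) for tok in stream.readline().split())
    return {'ef': ef, 'kmin': kmin, 'kmax': kmax, 'emin': emin,
            'emax': emax, 'nband': nband, 'nspin': nspin, 'nk': nk}

header = read_header(sample)
print(header)
```

Nothing in the input names the variables; the first number on the first line becomes ef simply because the first read statement lists ef.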
Note that there are (and should be) no keywords like "ef" or "EF" or "Ef" (capitalization does not matter), because the numbers are directly read into the variables in gnubands.f. This is in contrast to other cases like using XML files, where (human-readable) tags or keywords are embedded in the file itself (e.g., pseudopotential files used by Quantum ESPRESSO). I guess your confusion might be coming from the use of namelist for obtaining input values, which looks like
namelist /your_inp/ a, b, c
read( funit, nml = your_inp )
with an input file
&your_inp
a = 1.0
b = "method1"
c = 77
/
In this case, the variable names (here, a, b, and c) appear literally in the input file.
Historically, unit 5 (in your read(5,*)) is stdin, so either
(1) you are supplying the values when you run the code, or
(2) I guess that when you run SIESTA (gnubands is a postprocessor for it), it creates a file, possibly named fort.5. Check that.
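As for the last read statement in the question, it is an implied-DO list: the outermost loop runs over ik, and for each ik it reads k(ik) followed by e(ib,is,ik) with ib varying fastest and is next. A minimal Python sketch of that consumption order, using 0-based indices and invented data (the function name `read_bands` is hypothetical), could look like this:

```python
def read_bands(tokens, nband, nspin, nk):
    """Consume tokens in the order of the implied-DO list.

    For each ik: first k(ik), then e(ib, is, ik) with ib varying fastest
    and is next. Indices here are 0-based, unlike Fortran.
    """
    values = iter(tokens)
    k = [0.0] * nk
    e = {}
    for ik in range(nk):
        k[ik] = float(next(values))
        for ispin in range(nspin):
            for ib in range(nband):
                e[(ib, ispin, ik)] = float(next(values))
    return k, e

# Invented data: nk = 2 k-points, nspin = 1, nband = 3.
text = """
0.0  -1.0 -0.5 0.2
0.1  -0.9 -0.4 0.3
"""
k, e = read_bands(text.split(), nband=3, nspin=1, nk=2)
print(k)
print(e[(2, 0, 1)])
```

Because list-directed input treats line breaks as ordinary separators, splitting the whole text into tokens is a reasonable stand-in for what the single read statement does.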

How to awk every nth line starting from different lines each iteration

I would like awk to print every nth line out of a file starting from line 0. Then, after awk has gone through the whole file, I would like it to print every nth line starting from line 1...then print every nth line starting from line 2...etc, up to printing every nth line starting from line n-1. My sad attempt thus far:
#!/bin/bash
rm *.sad *.sadd *.out
# Create loop index
for i in $(seq 20 1 36);
do
    listm+=($i)
done
# Create input file
for j in "${listm[@]}"
do
    if [ $j -eq 20 ];
    then
        awk 'NR % 20 == 0' vel_VMDout > atomvel.dat
        awk '{print $2,$3,$4}' atomvel.dat > velocity.dat
    else
        awk 'NR % 20 == 1' vel_VMDout > $j.sad
        egrep -v "^[[:space:]]*$|^#" $j.sad > $j.sadd
        awk '{print $2, $3, $4}' $j.sadd > $j.out
        paste velocity.dat $j.out > taste
    fi
done
Let me try to clarify this by providing the input and what the output should look like. The input is an xyz file of an MD simulation consisting of frames of the atoms' xyz coordinates.
INPUT:
This image shows the 1st snapshot and part of the second snapshot. Because these are snapshots, the ordering of the atoms does not change. Thus, I am trying to print the xyz coordinates from each snapshot for each specific atom in its own columns, as shown below. This would eventually make a file consisting of 3N columns, where N is the number of atoms.
OUTPUT:
As you can see, each atom's coordinates are in their own columns and the total file is an Nx3N array. My bash script was my attempt to do this, but it could only handle the first two atoms. I wanted to print every nth line (the coordinates of the nth atom) so they look like the output. I really appreciate your patience, all.
Generating sample data
This is a step that should not be necessary; the question should have included usable sample data and the required output from that sample data.
At one level, it won't help much because you don't have my random number generator program, but the script below shows how I generated the data that follows, and it illustrates the lengths to which it might be necessary to go when the question doesn't supply readable data. I generated some data that looks similar to the data in the question (at least superficially):
18
Generated by VMD in absentia
C 0.979485 -6.665347 0.575383
C 1.191999 -3.002386 2.859484
C 3.151517 -5.610077 0.429413
C 3.439828 -6.454984 1.319724
C 3.726201 -0.123038 2.096854
C 1.363325 -3.031238 0.016019
C 6.090283 -3.915340 2.396358
C 0.407755 -7.957784 -0.846842
C 0.203074 -0.796428 2.659573
O 2.600610 -2.259674 -0.260378
O 4.773839 -6.765097 0.588508
H 2.743424 -2.890016 2.906452
H 2.810233 -6.641054 -0.797672
H 6.854169 -3.191721 -0.925670
O 2.914233 -1.060001 0.776983
H 3.803923 -1.497032 2.908799
H 5.669443 -7.227666 -0.647552
H 0.092455 -5.850637 2.959987
18
Generated by VMD in absentia
C 6.042840 -7.254720 2.093573
C 2.551942 -6.044322 2.061072
C 3.523150 -6.167163 2.451689
C 5.197316 -3.429866 -0.412062
C 2.548777 -6.422851 1.282846
C 3.775197 -2.012031 1.377440
C 3.405112 -3.206415 -0.879886
C 1.448359 -5.419629 0.467291
C 3.661964 -2.789234 2.644294
O 4.214854 -2.439574 -0.951704
O 5.297609 -2.320418 2.709898
H 2.653940 -4.431080 -0.511743
H 5.040635 -0.676199 -0.590970
H 1.546725 -1.294582 2.562937
O 4.231461 -7.180908 1.629901
H 3.297836 -1.557133 -0.133280
H 3.442481 -4.489962 2.111930
H 1.423611 -7.982655 0.715618
18
Generated by VMD in absentia
C 1.432495 -7.686243 2.525734
C 5.038409 -4.976270 2.826846
C 6.184137 -7.303094 2.711561
C 3.208125 -0.606556 1.978725
C 2.171859 -6.792060 0.678988
C 6.521124 -5.622797 -0.773797
C 1.725619 -5.768633 -0.223397
C 3.602427 -2.325680 1.762008
C 1.937521 -1.686895 1.743159
O 0.745526 -0.114246 -0.949490
O 4.754360 -6.531145 1.998913
H 1.114732 -1.158810 1.486939
H 6.410490 -5.411647 0.062737
H 4.164330 -6.743763 1.802804
O 2.587841 -3.979700 2.609748
H 2.192073 -2.815376 -0.809569
H 5.501795 -2.326438 1.325829
H 3.285032 -1.212541 1.284453
18
Generated by VMD in absentia
C 3.564424 -3.117406 -0.032879
C 2.894745 -0.632591 0.532311
C 3.384916 -5.383135 1.179585
C 0.793488 -0.894539 -0.886891
C 1.348785 -6.501867 1.648604
C 2.189941 -2.438067 0.616090
C 2.043378 -4.966472 0.691603
C 3.124161 -5.792896 0.545362
C 5.741472 -0.640590 2.825374
O 0.300550 -7.149663 0.942726
O 1.344387 -0.121382 2.169401
H 4.963296 -0.964665 -0.230523
H 6.651423 -4.905053 2.509626
H 5.059694 -6.166516 0.102255
O 5.046864 -3.288883 0.853948
H 2.389007 -3.057664 1.806301
H 2.365876 -0.956860 1.458959
H 2.892502 -0.097422 -0.531714
The script I used to do it was:
random -n $((4 * 18)) -T '%8:6[0:7]F %8:6[-8:0]F %8:6[-1:3]F' |
awk 'BEGIN { n = split("CCCCCCCCCOOHHHOHHH", atoms, ""); atoms[0] = atoms[n] }
NR % n == 1 { print n; print " Generated by VMD in absentia" }
{ print "", atoms[NR%18], " ", $0 }'
The -n option to random says how many rows to generate; I chose 72. The -T option is a template, and the notation %8:6[0:7]F means use %8.6F format to print uniformly distributed random numbers between 0 and 7. The awk script takes the data that is so generated and interpolates the noise (the number of atoms and a variant on the 'generated by VMD' line), as well as tagging the lines with the appropriate atomic symbol.
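Since the random program used above is not generally available, a rough Python equivalent may be useful for generating comparable test data. This is a sketch under assumptions: it uses the same atom order and the same value ranges as the template (0 to 7, -8 to 0, -1 to 3), but it will not reproduce the exact numbers shown, and the function name `random_frames` is hypothetical.

```python
import random

ATOMS = "CCCCCCCCCOOHHHOHHH"  # same atom order as the awk script above

def random_frames(n_frames, atoms=ATOMS, seed=0):
    """Return XYZ-style text: a count line, a comment line, then one line per atom."""
    rng = random.Random(seed)
    lines = []
    for _ in range(n_frames):
        lines.append(str(len(atoms)))
        lines.append(" Generated by VMD in absentia")
        for symbol in atoms:
            x = rng.uniform(0, 7)
            y = rng.uniform(-8, 0)
            z = rng.uniform(-1, 3)
            lines.append(" %s   %8.6f %8.6f %8.6f" % (symbol, x, y, z))
    return "\n".join(lines) + "\n"

data = random_frames(4)
print(data, end="")
```

Redirecting the printed output to a file named data would give an input with the same layout as the sample above: four frames of 20 lines each.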
Processing the sample data
Given some data, you then need to munge it to get the required output. This script more or less does the job. There are endless ways it could be improved, of course, such as taking file names as command line arguments, using temporary file names instead of fixed names, cleaning up the intermediate files, and handling different compounds and different atoms (nitrogen, phosphorus, etc.). However, it should adapt reasonably easily.
input="data"
output="output"
n=$(sed 1q "$input")
n2=$(($n+2))
for ((i = 3; i <= n2; i++))
do
    colno=$(printf "%.2d" $(($i-2)))
    awk -v N=$n2 -v R=$i \
        ' BEGIN { name["C"] = "Carbon"; name["H"] = "Hydrogen"; name["O"] = "Oxygen";
                  R0 = R % N }
          NR > 2 && NR <= R { count[$1]++; }
          NR == R { printf "%-32.32s\n", name[$1] " " count[$1]; }
          NR % N == R0 { xyz = sprintf("%s %s %s", $2, $3, $4); printf "%-32.32s\n", xyz }
        ' "$input" > "column.$colno"
done
paste -d ' ' column.* > "$output"
The first four lines set up the control parameters, collecting the number of lines per unit of data from the input file, and adjusting things accordingly. The for loop iterates over offsets 3 to $n2 inclusive (skipping the two header lines), and runs the awk script. That encodes atom types (BEGIN), determines which atom it is processing this time (NR > 2 && NR <= R and NR == R), and then arranges to print the triplets of data for the relevant atom. The formatting is carefully organized so that the column headings and the actual xyz-triplets are uniformly spaced. These are written to a file column.$colno. When all's done, the column.* files are pasted to generate a single output file, which looks like this:
Carbon 1 Carbon 2 Carbon 3 Carbon 4 Carbon 5 Carbon 6 Carbon 7 Carbon 8 Carbon 9 Oxygen 1 Oxygen 2 Hydrogen 1 Hydrogen 2 Hydrogen 3 Oxygen 3 Hydrogen 4 Hydrogen 5 Hydrogen 6
0.979485 -6.665347 0.575383 1.191999 -3.002386 2.859484 3.151517 -5.610077 0.429413 3.439828 -6.454984 1.319724 3.726201 -0.123038 2.096854 1.363325 -3.031238 0.016019 6.090283 -3.915340 2.396358 0.407755 -7.957784 -0.846842 0.203074 -0.796428 2.659573 2.600610 -2.259674 -0.260378 4.773839 -6.765097 0.588508 2.743424 -2.890016 2.906452 2.810233 -6.641054 -0.797672 6.854169 -3.191721 -0.925670 2.914233 -1.060001 0.776983 3.803923 -1.497032 2.908799 5.669443 -7.227666 -0.647552 0.092455 -5.850637 2.959987
6.042840 -7.254720 2.093573 2.551942 -6.044322 2.061072 3.523150 -6.167163 2.451689 5.197316 -3.429866 -0.412062 2.548777 -6.422851 1.282846 3.775197 -2.012031 1.377440 3.405112 -3.206415 -0.879886 1.448359 -5.419629 0.467291 3.661964 -2.789234 2.644294 4.214854 -2.439574 -0.951704 5.297609 -2.320418 2.709898 2.653940 -4.431080 -0.511743 5.040635 -0.676199 -0.590970 1.546725 -1.294582 2.562937 4.231461 -7.180908 1.629901 3.297836 -1.557133 -0.133280 3.442481 -4.489962 2.111930 1.423611 -7.982655 0.715618
1.432495 -7.686243 2.525734 5.038409 -4.976270 2.826846 6.184137 -7.303094 2.711561 3.208125 -0.606556 1.978725 2.171859 -6.792060 0.678988 6.521124 -5.622797 -0.773797 1.725619 -5.768633 -0.223397 3.602427 -2.325680 1.762008 1.937521 -1.686895 1.743159 0.745526 -0.114246 -0.949490 4.754360 -6.531145 1.998913 1.114732 -1.158810 1.486939 6.410490 -5.411647 0.062737 4.164330 -6.743763 1.802804 2.587841 -3.979700 2.609748 2.192073 -2.815376 -0.809569 5.501795 -2.326438 1.325829 3.285032 -1.212541 1.284453
3.564424 -3.117406 -0.032879 2.894745 -0.632591 0.532311 3.384916 -5.383135 1.179585 0.793488 -0.894539 -0.886891 1.348785 -6.501867 1.648604 2.189941 -2.438067 0.616090 2.043378 -4.966472 0.691603 3.124161 -5.792896 0.545362 5.741472 -0.640590 2.825374 0.300550 -7.149663 0.942726 1.344387 -0.121382 2.169401 4.963296 -0.964665 -0.230523 6.651423 -4.905053 2.509626 5.059694 -6.166516 0.102255 5.046864 -3.288883 0.853948 2.389007 -3.057664 1.806301 2.365876 -0.956860 1.458959 2.892502 -0.097422 -0.531714
Your task is to understand why all the bits of the awk script are present. For example, why is R0 needed (hint: experiment without the R0 calculation, and use R in its place)?
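For comparison, the same reshaping can be done in a single pass without intermediate column files. The following Python sketch is an alternative to the shell and awk approach, not a translation of it; it assumes every frame has the same number of atoms in the same order, and the function name `frames_to_rows` and the two-atom sample are invented.

```python
def frames_to_rows(text):
    """Turn XYZ frames into (header, rows): one row per frame, 3N numbers each."""
    lines = text.splitlines()
    n = int(lines[0])   # atoms per frame, taken from the first line
    block = n + 2       # count line + comment line + n atom lines
    names = {"C": "Carbon", "H": "Hydrogen", "O": "Oxygen"}
    header, counts, rows = [], {}, []
    for start in range(0, len(lines), block):
        atom_lines = lines[start + 2:start + block]
        if len(atom_lines) < n:
            break       # ignore a truncated final frame
        row = []
        for line in atom_lines:
            symbol, x, y, z = line.split()
            if start == 0:  # build the column labels from the first frame
                counts[symbol] = counts.get(symbol, 0) + 1
                header.append("%s %d" % (names.get(symbol, symbol), counts[symbol]))
            row.extend([x, y, z])
        rows.append(row)
    return header, rows

sample = (
    "2\n comment\n C 0.1 0.2 0.3\n O 1.1 1.2 1.3\n"
    "2\n comment\n C 0.4 0.5 0.6\n O 1.4 1.5 1.6\n"
)
header, rows = frames_to_rows(sample)
print(header)
for row in rows:
    print(" ".join(row))
```

Each printed row holds one snapshot, with three coordinates per atom, which matches the layout of the output file shown above.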

Plotting a function directly from a text file

Is there a way to plot a function based on values from a text file?
I know how to define a function in gnuplot and then plot it but that is not what I need.
I have a table with constants for functions that are updated regularly. When this update happens I want to be able to run a script that draws a figure with the new curve. Since there are quite a few figures to draw, I want to automate the procedure.
Here is an example table with constants:
location a b c
1 1 3 4
2
There are two ways I see to solve the problem but I do not know if and how they can be implemented.
I can use awk to produce the string f(x)=1*(x)**2+3*(x)+4, write it to a file, and somehow make gnuplot read this new file and plot it over a certain x range.
Or I can use awk inside gnuplot, something like f(x) = awk /1/ {print "f(x)="$2 etc., or use awk directly in the plot command.
In any case, I'm stuck and have not found a solution to this problem online. Do you have any suggestions?
Another possibility, which gives a somewhat generic version, is the following:
Assume, the parameters are stored in a file parameters.dat with the first line containing the variable names and all others the parameter sets, like
location a b c
1 1 3 4
The script file looks like this:
file = 'parameters.dat'
par_names = system('head -1 '.file)
par_cnt = words(par_names)
# which parameter set to choose
par_line_num = 2
# select the respective string
par_line = system(sprintf('head -%d ', par_line_num).file.' | tail -1')
par_string = ''
do for [i=1:par_cnt] {
    eval(word(par_names, i).' = '.word(par_line, i))
}
f(x) = a*x**2 + b*x + c
plot f(x) title sprintf('location = %d', location)
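The same name-to-value mapping can also be done outside gnuplot, by generating the gnuplot commands from the table. The Python sketch below is one possible way to do that, assuming a whitespace-separated table whose first line holds the variable names; the function name `gnuplot_script` is hypothetical.

```python
def gnuplot_script(table_text, line_num=2):
    """Build gnuplot commands from a whitespace-separated parameter table.

    line_num is 1-based and counts the header line, as in the script above.
    """
    lines = table_text.splitlines()
    names = lines[0].split()
    values = lines[line_num - 1].split()
    assignments = ["%s = %s" % (name, value) for name, value in zip(names, values)]
    return "\n".join(assignments + [
        "f(x) = a*x**2 + b*x + c",
        "plot f(x) title sprintf('location = %d', location)",
    ]) + "\n"

table = "location a b c\n1 1 3 4\n"
script = gnuplot_script(table)
print(script, end="")
```

The printed text could be written to a file and loaded in gnuplot, or piped into gnuplot -persist.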
This question (gnuplot store one number from data file into variable) had some hints for me in the first answer.
In my case I have a file which contains parameters for a parabola. I have saved the parameters in gnuplot variables. Then I plot the function containing the parameter variables for each timestep.
#!/usr/bin/gnuplot
datafile = "parabola.txt"
set terminal pngcairo size 1000,500
set xrange [-100:100]
set yrange [-100:100]
titletext(timepar, apar, cpar) = sprintf("In timestep %d we have parameter a = %f, parameter c = %f", timepar, apar, cpar)
do for [step=1:400] {
    set output sprintf("parabola%04d.png", step)
    # read parameters from file, where the first line is the header, thus the +1
    a=system("awk '{ if (NR == " . step . "+1) printf \"%f\", $1}' " . datafile)
    c=system("awk '{ if (NR == " . step . "+1) printf \"%f\", $2}' " . datafile)
    # convert parameters to numeric format
    a=a+0.
    c=c+0.
    set title titletext(step, a, c)
    plot c+a*x**2
}
This gives a series of png files called parabola0001.png,
parabola0002.png,
parabola0003.png,
…, each showing a parabola with the parameters read from the file called parabola.txt. The title contains the parameters of the given time step.
For understanding the gnuplot system() function you have to know that:
stuff inside double quotes is not parsed by gnuplot
the dot is for concatenating strings in gnuplot
the double quotes for the awk printf command have to be escaped, to hide them from gnuplot parser
To test this gnuplot script, save it into a file with an arbitrary name, e.g. parabolaplot.gplot, and make it executable (chmod a+x parabolaplot.gplot). The parabola.txt file can be created with
awk 'BEGIN {for (i=1; i<=1000; i++) printf "%f\t%f\n", i/200, i/100}' > parabola.txt
awk '/1/ {print "plot "$2"*x**2+"$3"*x+"$4}' | gnuplot -persist
This selects the matching line and plots it.
This is yet another question about how to extract specific values into variables with gnuplot (maybe it would be worth creating a Wiki entry about this topic).
There is no need for using awk, you can do this simply with gnuplot only (hence platform-independent), even with gnuplot 4.6.0 (March 2012).
You can do a stats (check help stats) and assign the values to variables.
Data: SO15007620_Parameters.txt
location a b c
1 1 3 4
2 -1 2 3
3 2 1 -1
Script: (works with gnuplot 4.6.0, March 2012)
### read parameters from separate file into variables
reset
FILE = "SO15007620_Parameters.txt"
myLine = 1 # line index 0-based
stats FILE u (a=$2, b=$3, c=$4) every ::myLine::myLine nooutput
f(x) = a*x**2 + b*x + c
plot f(x) w l lc rgb "red" ti sprintf("f(x) = %gx^2 + %gx + %g", a,b,c)
### end of script
Result: