Prolog reading the table in txt - file-io

I am having a problem reading a text file that contains this information:
T11, R1, 6:00-18:00
T12, R1, 6:00-18:00
T13, R1, 18:00-6:00
So far I have Prolog code that reads it, but only if I wrap each line in '' and end it with a period. It converts the file into one list, but I need a separate list for each line. I also tried to use:
/*rows(Total,Rows_list):-
atomic_list_concat(Rows_list,nl, Total),
write(Rows_list), nl.*/
But it does not work and reports an error about the string being too long.
main :-
open('taxi.txt', read, Str),
read_file(Str,Lines),
close(Str),
write(Lines),
nl.
read_file(Stream,[]) :-
at_end_of_stream(Stream).
read_file(Stream,[X|L]) :-
\+ at_end_of_stream(Stream),
read(Stream,X),
read_file(Stream,L).
/*rows(Total,Rows_list):-
atomic_list_concat(Rows_list,nl, Total),
write(Rows_list), nl.*/

read/2 isn't appropriate to parse 'free text' files, because it's meant to parse fully structured Prolog terms, like those written by writeq/1, or listing/0.
Usually, the easiest way to parse files is with a DCG. Since it parses character by character, you need to pay some attention to details:
:- use_module(library(dcg/basics)).
read_file(Stream,[]) :-
at_end_of_stream(Stream).
read_file(Stream,[X|L]) :-
\+ at_end_of_stream(Stream),
read_line_to_codes(Stream, Codes),
( phrase(parse_record(Record), Codes) -> assertz(Record) ; writeln('ERROR')),
read_file(Stream,L).
parse_record(taxi(T1, R1, (H1:M1)-(H2:M2),
T2, R2, (H3:M3)-(H4:M4),
T3, R3, (H5:M5)-(H6:M6))) -->
parse_triple(T1,R1, (H1:M1)-(H2:M2)), " ",
parse_triple(T2,R2, (H3:M3)-(H4:M4)), " ",
parse_triple(T3,R3, (H5:M5)-(H6:M6)).
parse_triple(T,R, (H1:M1)-(H2:M2)) -->
string(Ts), ", ", string(Rs), ", ",
integer(H1), ":", integer(M1),
"-", integer(H2), ":", integer(M2),
{atom_codes(T,Ts), atom_codes(R,Rs)}.
A useful feature of DCGs is that they can be tested fairly easily with inline data:
?- phrase(parse_record(R),"T11, R1, 6:00-18:00 T12, R1, 6:00-18:00 T13, R1, 18:00-6:00").
R = taxi('T11', 'R1', (6:0)- (18:0), 'T12', 'R1', (6:0)- (18:0), 'T13', 'R1', (18:0)- (6:0))
Edit: I definitely need more coffee, as I didn't notice the list argument you passed to read_file. The code should read
read_file(Stream,[X|L]) :-
\+ at_end_of_stream(Stream),
read_line_to_codes(Stream, Codes),
( phrase(parse_record(X), Codes) -> true ; writeln('ERROR')),
read_file(Stream,L).

Related

Is it possible to define two variables with the same name in two different segments?

I want to define two variables in two different segments with the same name and different values in Assembly language.
I use Emu8086 and I tried it as below but it didn't work:
DATASG1 SEGMENT 'DATA'
DATA1 DB 10H
DATASG1 ENDS
DATASG2 SEGMENT 'DATA'
DATA1 DB 10H
DATASG2 ENDS
DATASG SEGMENT 'CODE'
MAIN PROC NEAR
MOV AX, DATASG1
MOV DS, AX
MOV AL, DATA1
MAIN ENDP
DATASG ENDS
END MAIN
My questions are:
Is it possible?
If yes, how?

Two variables with inconsistent names as input for a Snakemake rule

How can I pair up input data for rules in snakemake if the naming isn't consistent and they are all in the same folder?
For example if I want to use each pair of samples as input for each rule:
PT1 T5
S6 T7
S1 T20
In this example I would want to have 3 pairs PT1 & T5, S6 & T7, S1 & T20 so to start with, I would want to create 3 folders:
PT1vsT5
S6vsT7
S1vsT20
And then perform an analysis with manta and output the results into these 3 folders accordingly.
In the following pipeline I want the GERMLINE sample to be the first element in each line (PT1, S6, S1) and TUMOR the second one (T5, T7, T20).
rule all:
input:
expand("/{samples_g}vs{samples_t}", samples_g = GERMLINE, samples_t = TUMOR),
expand("/{samples_g}vs{samples_t}/runWorkflow.py", samples_g = GERMLINE, samples_t = TUMOR),
# Create folders
rule folders:
output: "./{samples_g}vs{samples_t}"
shell: "mkdir {output}"
# Manta configuration
rule manta_config:
input:
g = BAMPATH + "/{samples_g}.bam",
t = BAMPATH + "/{samples_t}.bam"
output:
wf = "{samples_g}vs{samples_t}/runWorkflow.py"
params:
ref = IND,
out_dir = "{samples_g}vs{samples_t}/runWorkflow.py"
shell:
"python configManta.py --normalBam {input.g} --tumorBam {input.t} --referenceFasta {params.ref} --runDir {params.out_dir} "
Could I do it by using a .txt file containing the pairs as input and then looping over it? If so, how should I do it? Otherwise, how could it be done?
You can generate the list of input or output files "manually" using any appropriate Python code. For instance, you could proceed as follows to generate the first of your input lists:
In [1]: GERMLINE = ("PT1", "S6", "S1")
In [2]: TUMOR = ("T5", "T7", "T20")
In [3]: ["/{}vs{}".format(sample_g, sample_t) for (sample_g, sample_t) in zip(GERMLINE, TUMOR)]
Out[3]: ['/PT1vsT5', '/S6vsT7', '/S1vsT20']
So this would be applied as follows:
rule all:
input:
["/{}vs{}".format(sample_g, sample_t) for (sample_g, sample_t) in zip(GERMLINE, TUMOR)],
["/{}vs{}/runWorkflow.py".format(sample_g, sample_t) for (sample_g, sample_t) in zip(GERMLINE, TUMOR)],
(Note that I put sample_g and sample_t in singular form, as it sounded more logical in this context, where those variables represent individual sample names and not lists of several samples.)
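As for the part of the question about keeping the pairs in a .txt file: a minimal sketch, assuming one whitespace-separated germline/tumor pair per line. The names PAIRS_TEXT and parse_pairs are hypothetical, and in a real Snakefile the text would come from reading the file instead of an inline string. The leading "/" is left out here on the assumption that the folders should be relative to the working directory.

```python
# Hypothetical contents of the pairs file (germline first, tumor second).
PAIRS_TEXT = """PT1 T5
S6 T7
S1 T20
"""

def parse_pairs(text):
    """Return a list of (germline, tumor) tuples, one per non-empty line."""
    pairs = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 2:
            raise ValueError("expected two sample names, got: {!r}".format(line))
        pairs.append((fields[0], fields[1]))
    return pairs

PAIRS = parse_pairs(PAIRS_TEXT)

# Same list-comprehension idea as above, but driven by the parsed pairs.
RUN_DIRS = ["{}vs{}".format(g, t) for (g, t) in PAIRS]
WORKFLOWS = ["{}/runWorkflow.py".format(d) for d in RUN_DIRS]

print(RUN_DIRS)
print(WORKFLOWS)
```

Lists such as RUN_DIRS and WORKFLOWS could then be used as the input of rule all in place of the expand() calls, which would otherwise combine every germline sample with every tumor sample.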

Fortran: read variables that are not present in a file

I need help understanding this 50 line program
implicit none
integer maxk, maxb, maxs
parameter (maxk=6000, maxb=1000, maxs=5)
integer nk, nspin, nband, ik, is, ib
double precision e(maxb, maxs, maxk), k(maxk)
double precision ef, kmin, kmax, emin, emax
logical overflow
read(5,*) ef
read(5,*) kmin, kmax
read(5,*) emin, emax
read(5,*) nband, nspin, nk
overflow = (nband.gt.maxb) .or. (nk.gt.maxk) .or. (nspin.gt.maxs)
if (overflow) stop 'Dimensions in gnubands too small'
write(6,"(2a)") '# GNUBANDS: Utility for SIESTA to transform ',
. 'bands output into Gnuplot format'
write(6,"(a)") '#'
write(6,"(2a)") '# ',
. ' Emilio Artacho, Feb. 1999'
write(6,"(2a)") '# ------------------------------------------',
. '--------------------------------'
write(6,"(a,f10.4)") '# E_F = ', ef
write(6,"(a,2f10.4)") '# k_min, k_max = ', kmin, kmax
write(6,"(a,2f10.4)") '# E_min, E_max = ', emin, emax
write(6,"(a,3i6)") '# Nbands, Nspin, Nk = ', nband, nspin, nk
write(6,"(a)") '#'
write(6,"(a)") '# k E'
write(6,"(2a)") '# ------------------------------------------',
. '--------------------------------'
read(5,*) (k(ik),((e(ib,is,ik),ib=1,nband), is=1,nspin), ik=1,nk)
do is = 1, nspin
do ib = 1, nband
write(6,"(2f14.6)") ( k(ik), e(ib,is,ik), ik = 1, nk)
write(6,"(/)")
enddo
enddo
This is a free format Fortran file. The program is called gnubands, and it rearranges the numbers in an input file (which the user specifies). I would like to know how this program operates. Here is what I do not understand. The program takes input from a file, from which it reads
ef, kmin,kmax,emin,emax,nband,nspin,nk
However, none of these variable names appear in the input file. I opened the input file in vi and searched for them using /, but got no matches. Nevertheless, the program appears to pick up all the values correctly. What is happening?
Also, I do not understand the read format
read(5,*) (k(ik),((e(ib,is,ik),ib=1,nband), is=1,nspin), ik=1,nk)
I am not familiar with the syntax and would like to know what it means, or to be pointed to any references.
A SIESTA tutorial PDF shows a sample input for gnubands.f, whose header part is read by the first four read statements of gnubands.f. With this input, the variables are set as
ef = -5.018...
kmin = 0.000...
kmax = 3.338...
emin = -25.187...
emax = 143.069...
nband = 18
nspin = 1
nk = 150
by giving the input file from the standard input (assumed unit number 5) as
gfortran -o gnubands.x gnubands.f
gnubands.x < your_data_file.bands
Note that there are (and should be) no keywords like "ef" or "EF" or "Ef" (capitalization does not matter), because the numbers are directly read into the variables in gnubands.f. This is in contrast to other cases like using XML files, where (human-readable) tags or keywords are embedded in the file itself (e.g., pseudopotential files used by Quantum ESPRESSO). I guess your confusion might be coming from the use of namelist for obtaining input values, which looks like
namelist /your_inp/ a, b, c
read( funit, nml = your_inp )
with an input file
&your_inp
a = 1.0
b = "method1"
c = 77
/
In this case, the variable names (here, a, b, and c) appear literally in the input file.
Historically, unit 5 (as in your read(5,*)) is stdin, so either
(1) you are supplying the values yourself when you run the code, or
(2) I guess that when you run SIESTA (gnubands is a postprocessor for it), it creates a file, possibly named fort.5. Check that.
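The second part of the question, the list in read(5,*) (k(ik),((e(ib,is,ik),ib=1,nband), is=1,nspin), ik=1,nk), is an implied-do loop: the innermost index (ib) varies fastest and the outermost (ik) slowest, and list-directed input simply takes the next numbers from the file, continuing across line breaks. As a rough sketch of that order in Python (the function name read_bands_body and the tiny data set are made up for illustration, and the indices are 0-based here):

```python
def read_bands_body(tokens, nband, nspin, nk):
    """Consume numbers in the same order as the implied-do read statement."""
    stream = iter(tokens)
    k = [0.0] * nk
    e = {}
    for ik in range(nk):                 # outermost loop: ik = 1, nk
        k[ik] = float(next(stream))      # k(ik) comes first for each k-point
        for isp in range(nspin):         # middle loop: is = 1, nspin
            for ib in range(nband):      # innermost loop: ib = 1, nband
                e[(ib, isp, ik)] = float(next(stream))
    return k, e

# Made-up body with nband=2, nspin=1, nk=2; line breaks are deliberately
# irregular, because only the order of the numbers matters.
text = """0.0 -1.0 2.0
0.5 -0.8
2.5
"""

k, e = read_bands_body(text.split(), nband=2, nspin=1, nk=2)
print(k)
print(e)
```

So for every k-point the file supplies the k value followed by all band energies for spin 1, then all band energies for spin 2 (if nspin is 2), and so on.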

Fortran read file into array - transposed dimensions

I'm trying to read a file into memory in a Fortran program. The file has N rows with two values in each row. This is what I currently do (it compiles and runs, but gives me incorrect output):
program readfromfile
implicit none
integer :: N, i, lines_in_file
real*8, allocatable :: cs(:,:)
N = lines_in_file('datafile.txt') ! a function I wrote, which works correctly
allocate(cs(N,2))
open(15, file='datafile.txt', status='old')
read(15,*) cs
do i=1,N
print *, cs(i,1), cs(i,2)
enddo
end
What I hoped to get was the data loaded into the variable cs, with lines as the first index and columns as the second, but when the above code runs, it first prints a line with two "left column" values, then a line with two "right column" values, then a line with the next two "left column" values, and so on.
Here's a more visual description of the situation:
In my data file: Desired output: Actual output:
A1 B1 A1 B1 A1 A2
A2 B2 A2 B2 B1 B2
A3 B3 A3 B3 A3 A4
A4 B4 A4 B4 B3 B4
I've tried switching the indices when allocating cs, but with the same results (or a segfault, depending on whether I also switch the indices in the print statement). I've also tried reading the values row by row, but because of the irregular format of the data file (comma-delimited, not column-aligned) I couldn't get this working at all.
How do I read the data into memory the best way to achieve the results I want?
I do not see any commas in your data file, but they should not make any difference with list-directed input anyway. Just read it the same way you print it:
do i=1,N
read (15,*) cs(i,1), cs(i,2)
enddo
Otherwise, if you read the whole array in one statement, it is filled in column-major order, i.e., cs(1,1), cs(2,1), ..., cs(N,1), cs(1,2), cs(2,2), ... This is the order in which the array is stored in memory.
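To make the column-major fill order concrete, here is a small sketch in plain Python (the helper fill_column_major is hypothetical and only mimics how a whole-array read assigns values). It also illustrates one possible alternative, assuming you are free to declare the array as cs(2,N): read it in one statement and then transpose it.

```python
def fill_column_major(values, nrows, ncols):
    """Fill an nrows x ncols table with the first index varying fastest."""
    table = [[None] * ncols for _ in range(nrows)]
    stream = iter(values)
    for col in range(ncols):
        for row in range(nrows):
            table[row][col] = next(stream)
    return table

# Values in the order they appear in the file (read row by row).
values = ["A1", "B1", "A2", "B2", "A3", "B3", "A4", "B4"]

# Whole-array read into an array shaped (N, 2): pairs get split up.
scrambled = fill_column_major(values, 4, 2)

# Whole-array read into an array shaped (2, N): each column holds one file row,
# so transposing gives one file row per table row.
by_pairs = fill_column_major(values, 2, 4)
transposed = [list(row) for row in zip(*by_pairs)]

for row in scrambled:
    print(row)
for row in transposed:
    print(row)
```

With the (N, 2) shape the first column is filled completely before the second one starts, which is why values from the same file row end up in different places.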

Erlang: Reading integers from file

I am trying to read an integer value from a simple text file.
The rows in my input file look like this:
1 4 15 43
2 4 12 33
... (many rows of 4 integers)
I have opened the file in the following way:
{ok, IoDev} = file:open("blah.txt",[read])
But the only thing I manage to read with any of the available functions is bytes.
What I finally want to get from the file are tuples of integers.
{1, 4, 15, 43}
{2, 4, 12, 33}
...
You must first use file:read_line/1 to read a line of text, and use re:split/2 to get a list of strings containing numbers. Then use the list_to_integer BIF to get integers.
Here's an example (surely there is a better solution):
#!/usr/bin/env escript
%% -*- erlang -*-
main([Filename]) ->
{ok, Device} = file:open(Filename, [read]),
read_integers(Device).
read_integers(Device) ->
case file:read_line(Device) of
eof ->
ok;
{ok, Line} ->
% Strip the trailing \n (ASCII 10)
StrNumbers = re:split(string:strip(Line, right, 10), "\s+", [notempty]),
[N1, N2, N3, N4] = lists:map(fun erlang:list_to_integer/1,
lists:map(fun erlang:binary_to_list/1,
StrNumbers)),
io:format("~w~n", [{N1, N2, N3, N4}]),
read_integers(Device)
end.
(EDIT)
I found a somewhat simpler solution that uses io:fread to read formatted input. It works well in your case but fails badly when the file is badly constructed.
#!/usr/bin/env escript
%% -*- erlang -*-
main([Filename]) ->
{ok, Device} = file:open(Filename, [read]),
io:format("~w~n", [read_integers(Device)]).
read_integers(Device) ->
read_integers(Device, []).
read_integers(Device, Acc) ->
case io:fread(Device, [], "~d~d~d~d") of
eof ->
lists:reverse(Acc);
{ok, [D1, D2, D3, D4]} ->
read_integers(Device, [{D1, D2, D3, D4} | Acc]);
{error, What} ->
io:format("io:fread error: ~w~n", [What]),
read_integers(Device, Acc)
end.
I hadn't seen the edited part of the answer above.
/*
You can try fread/3
read() ->
{ok, IoDev} = file:open("x", [read]),
read([], IoDev).
read(List, IoDev) ->
case io:fread(IoDev, "", "~d~d~d~d") of
{ok, [A,B,C,D]} ->
read([{A,B,C,D} | List], IoDev);
eof ->
lists:reverse(List);
{error, What} ->
failed
end.
*/