Read text file using fortran, where the starting 18 lines are strings and the rest of the lines are real numbers in 1000*8 array? - file-io

I have a text file where the first few lines are text and the rest of the lines contain data in the form of real numbers. I just require the real numbers array to be stored in a new file. For this, I read the total lines of the file for which output is correct and then trying to read the real numbers from the particular line numbers. I am unable to understand as to how to read this data.
Below is a part of file. Also I have many files like these to read.
AptitudeQT paperI: 12233105
Latitude : 30.00 S
Longitude: 46.45 E
Attemptone Time: 2017-03-30-09-03
End Time: 2017-03-30-14-55
Height(agl): m
Pressure: hPa
Temperature: deg C
Humidity: %RH
Uvelocity: cm/s
Vvelocity: cm/s
WindSpeed: cm/s
WindDirection: deg
---------------------------------------
10 1008.383 27.655 62.200 -718.801 -45.665 720.250 266.500
20 1007.175 27.407 62.950 -792.284 -18.481 792.500 268.800

There are many examples how to skip/read lines like this
But to sum it up, option A is to skip header and read only data:
! Skip first 17 lines
do i = 1, 17
read (unit,*,IOSTAT=stat) ! Dummy read
if ( stat /= 0 ) stop "error"
end do
! Read data
do i = 1, 1000
read (unit,*,IOSTAT=stat) data(:,i)
if ( stat /= 0 ) stop "error"
end do
If you have many files like this, I suggest wrapping this in a subroutine/function.
Option B is to use unix tail utility to discard header (more info here):
tail -n +18 file.txt

Related

how to split large text files into smaller text files using vba?

I have a database textfile.
It is large text file about 387,480 KB. This file contains table name, headers of the table and values. I need to split this file into multiple files each containing table creation and insertion with a file name as table name.
Please can anyone help me??
I don't see how Excel will open a 347MB file. You can try to load it into Access, and do the split, using VBA. However, the process of importing a file that large may fragment enough to blow Access up to #GB, and then it's all over. SQL Server would handle this kind of job. Alternatively, you could use Python or R to do the work for you.
### Python:
import pandas as pd
for i,chunk in enumerate(pd.read_csv('C:/your_path/main.csv', chunksize=3)):
chunk.to_csv('chunk{}.csv'.format(i))
### R
setwd("C:/your_path/")
mydata = read.csv("annualsinglefile.csv")
# If you want 5 different chunks with same number of lines, lets say 30.
# Chunks = split(mydata,sample(rep(1:5,30))) ## 5 Chunks of 30 lines each
# If you want 100000 samples, put any range of 20 values within the range of number of rows
First_chunk <- sample(mydata[1:100000,]) ## this would contain first 100000 rows
# Or you can print any number of rows within the range
# Second_chunk <- sample(mydata[100:70,] ## this would contain last 30 rows in reverse order if your data had 100 rows.
# If you want to write these chunks out in a csv file:
write.csv(First_chunk,file="First_chunk.csv",quote=F,row.names=F,col.names=T)
# write.csv(Second_chunk,file="Second_chunk.csv",quote=F,row.names=F,col.names=T)

How can I remove lines from a file with more than a certain number of entries

I've looked at the similar question about removing lines with more than a certain number of characters and my problem is similar but a bit trickier. I have a file that is generated after analyzing some data and each line is supposed to contain 29 numbers. For example:
53.0399 0.203827 7.28285 0.0139936 129.537 0.313907 11.3814 0.0137903 355.008 \
0.160464 12.2717 0.120802 55.7404 0.0875189 11.3311 0.0841887 536.66 0.256761 \
19.4495 0.197625 46.4401 2.38957 15.8914 17.1149 240.192 0.270649 19.348 0.230\
402 23001028 23800855
53.4843 0.198886 7.31329 0.0135975 129.215 0.335697 11.3673 0.014766 355.091 0\
.155786 11.9938 0.118147 55.567 0.368255 11.449 0.0842612 536.91 0.251735 18.9\
639 0.184361 47.2451 0.119655 18.6589 0.592563 240.477 0.298805 20.7409 0.2548\
56 23001585
50.7302 0.226066 7.12251 0.0158698 237.335 1.83226 15.4057 0.059467 -164.075 5\
.14639 146.619 1.37761 55.6474 0.289037 11.4864 0.0857042 536.34 0.252356 19.3\
91 0.198221 46.7011 0.139855 20.1464 0.668163 240.664 0.284125 20.3799 0.24696\
23002153
But every once in a while, a line like the first one appears that has an extra 8 digit number at the end from analyzing an empty file (so it just returns the file ID number but not on a new line like it should). So I just want to find lines that have this extra 30th number and remove just that 30th entry. I figure I could do this with awk but since I have little experience with it I'm not sure how. So if anyone can help I'd appreciate it.
Thanks
Summary: Want to find lines in a text file with an extra entry in a row and remove the last extra entry so all rows have same number of entries.
With awk, you tell it how many fields there are per record. The extras are ignored
awk '{NF = 29; print}' filename
If you want to save that back to the file, you have to do a little extra work
awk '{NF = 29; print}' filename > filename.new && mv filename.new filename

[Q]uestion about reading and saving a large txt-file via {RSQLite} line by line into a DB

Since my hardware is very limited (a dual core with 32bit Win7 and 4GB of ram - I need to make the best of it.....) I try to save a large text-file (about 1.2GB) into a DB, which I can then trigger by SQL-like queries to do some analytics on particular subgroups.
To be honest I'm not familiar with this area and since I could not find help regarding my issues via "googling", I just quickly show what I came up with and how I thought things would look like:
First I check how many columns my txt-file has:
k <- length(scan("data.txt", nlines=1, sep="\t", what="character"))
Then I open a connection to the text file so that it does not need to be opened
again for every single line:
filecon<-file("data.txt", open="r")
Then I initialize a connection (dbcon) to an SQLite database
dbcon<- dbConnect(dbDriver("SQLite"), dbname="mydb.dbms")
I find out where the position of the first line is
pos<-seek(filecon, rw="r")
Since the first line contains the column-names I save them for later use
col_names <- unlist(strsplit(readLines(filecon, n=1), "\t"))
Next, I test to read the first 10 lines, line by line,
and save them into a DB, which themself (should) contain k - columns with columns-names = col_names.
for(i in 1:10) {
# prints the iteration number in hundreds
if(i %% 100 == 0) {
print(i)
}
# read one line into a variable tt
tt<-readLines(filecon, n=1)
# parse tt into a variable tt2, since tt is a string
tt2<-unlist(strsplit(tt, "\t"))
# Every line, read and parsed from the text file, is immediately saved
# in the SQLite database table "results" using the command dbWriteTable()
dbWriteTable(conn=dbcon, name="results", value=as.data.frame(t(tt2[1:k]),stringsAsFactors=T), col.names=col_names, append=T)
pos<-c(pos, seek(filecon, rw="r"))
}
If I run this I get the following error
Warning messages:
1: In value[[3L]](cond) :
RS-DBI driver: (error in statement: table results has 738 columns but 13 values were supplied)
Why should I supply 738 columns? If I change k (which is 12) to 738, the code works but then I need to trigger the columns by V1, V2, V3,.... and not by the column-names I intended to supply
res <- dbGetQuery(dbcon, "select V1, V2, V3, V4, V5, V6 from results")
Any help or even a small hint is very much appreciated!

ksh loop putting results of file into variable

I'm not sure the best way to handle this, I'm guessing it's using a while loop. I have a .txt file with a set of numbers ( these numbers can change based on another script that runs )
ex:
0
36
41
53
60
Each number is on it's own line. For each number I want to get that number and execute a script using it. So in this example I would call a script to stop database 0, after that completes call a script to stop database 36 and so on until it's complete with all numbers in the list.
1) Is a while loop the best way to handle this?
2) I'm having trouble trying to determine what the [[condition]] needs to be to get each number 1 at a time, where can i find some additional help on this?
while [[ condition ]] ; do
command1
done
For testing purposes the file that contains all the numbers is test.txt. The script that will execute is a python script - "amgr.py stop (number from test.txt)"
Here's a simpler way:
cat test.txt | xargs amgr.py stop
this will get each line of your file, and then put it as an extra parameter for your amgr.py:
amgr.py stop 0
amgr.py stop 36
and so on..
I ended up using this method to give me the results i was looking for.
while read -r line ; do
amgr.py stop $line
done<test.txt

3D Graph in Octave/Matlab from a CSV file

I'm new to Octave/Matlab and I want to plot a 3D-Graph.
I was able to do so using a predefined formula, like this:
x=1:.1:5;
y=1:.1:5;
[xx,yy] = meshgrid(x,y);
z = sin(xx)+sin(yy);
mesh(x,y,z);
But now the question is how to do the same getting the data from a CSV (for example). I know I can use the function csvread, but the big question is how to format the CSV to contain such data.
An example of doing the same graph above but this time grabbing the data from Excel/CSV would be appreciated. Thanks!
Done! I was finally able to do it!
Here's how I did it:
1) I've created a file in Excel with the X values in the cells A2:A42, and the Y values in the cells B1:AP1 (so you form a rectangle).
2) Then in the cells in the middle I put the formula I want (ie =sin(A$2)+sin($B1))
3) Saved the file as CSV (but separated by spaces!) and manually edited it to look this way (the way QtOctave opens matrix files, in Matlab it might be different). For example (note the extra space before each column):
# Created by Octave 3.2.4, Thu Jan 12 19:32:05 2012 ART <diego#notebook2>
# name: z
# type: matrix
# rows: 3
# columns: 3
1 2 3
4 5 6
7 8 9
(if you're not sure how to do it, do what I did: create a simple matrix and export it to see how the exported file looks like!)
4) Octave has a function under Data -> Load matrix from file, which loads that kind of files. Or actually running this command (varname is the name of the resulting variable):
load("-text", "file-where-the-data-is", "varname")
5) Create the graph (ex is the name of the matrix I've just imported):
x=1:.1:5;
y=1:.1:5;
mesh(x,y,ex)