Using gawk for Windows (included in GnuWin32), how do you append the filename to each line of a text file?
This is an example of what I want:
Filename -> text.txt
"aaaa","bbbb","c"
The result should be:
"aaaa","bbbb","c","text.txt"
There is a built-in variable FILENAME (at least in the Linux version) that holds the name of the file being processed. Use it:
awk '
BEGIN {
FS = OFS = ",";
}
{
print $0 OFS "\"" FILENAME "\""
}
' input-file
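On Windows the single-quoted one-liner may not survive cmd.exe, which does not treat single quotes as quoting characters. A minimal sketch of one workaround is to keep the program in a script file and run it with -f. The script name addname.awk is hypothetical, and the sub() call assumes the input might carry CRLF line endings that the awk build does not translate:

```shell
# Work in a scratch directory so nothing existing is overwritten.
cd "$(mktemp -d)" || exit 1
printf '"aaaa","bbbb","c"\n' > text.txt

# addname.awk is a hypothetical script name; any name works.
cat > addname.awk <<'EOF'
{
    sub(/\r$/, "")                    # drop a trailing CR, if there is one
    print $0 "," "\"" FILENAME "\""   # append the quoted file name as a new field
}
EOF

result=$(awk -f addname.awk text.txt)
printf '%s\n' "$result"
```

With GnuWin32 the last command would be spelled gawk -f addname.awk text.txt; no quoting is needed because the program text never appears on the command line.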
Related
For example, say I had the following lines:
1,r,other,columns,....,
4,w,...,
2,w,etc...
3,r
1,w
2,r
I would want my output written to a file (or overwrite the existing file) as:
1,r/w,other,columns,....,
4,w,...,
2,r/w,etc...
3,r
Where order does not matter in the end.
The first field of each line (fields are comma-delimited) is the key to match. When two lines share a key, one has 'r' and the other 'w' as its second field, and I want to combine them into one line as in the example above.
Update
I've managed to get it working with the command:
awk -F, '{a[$1]=a[$1]?a[$1] OFS $2:$2} END{for (i in a) print i FS a[i]}' OFS="/" file
However, this erases all the columns that come after the second. How can I preserve those columns?
$ cat tst.awk
BEGIN { FS=OFS="," }
{
key = $1
perms[key] = (key in perms ? perms[key] "/" : "") $2
}
$3 != "" {
sub(/([^,]*,){2}/,"")
vals[key] = $0
}
END {
for (key in perms) {
print key, perms[key] (key in vals ? OFS vals[key] : "")
}
}
$ awk -f tst.awk file
1,r/w,other,columns,....,
2,w/r,etc...
3,r
4,w,...,
I have 20 files, and I want to print the first column of each file into a different file, so I need 20 output files.
I have tried the following command, but it puts all the output into a single file.
awk '{print $1}' /home/gee/SNP_data/20* > out_file
How can I write the output to a different file for each of my 20 input files?
1st solution: Try the following.
awk '
FNR==1{
if(file){
close(file)
}
file="out_file_"FILENAME".txt"
}
{
print $1 > (file)
}
' /home/gee/SNP_data/20*
Explanation: Adding explanation for above code.
awk ' ##Starting awk program here.
FNR==1{ ##checking condition if FNR==1 then do following.
if(file){ ##Checking condition if variable file is NOT NULL then do following.
close(file) ##Using close to close the opened output file in backend, to avoid too many opened files error.
} ##Closing BLOCK for if condition.
file="out_file_"FILENAME".txt" ##Setting variable file value to string out_file_ then FILENAME(which is Input_file) and append .txt to it.
} ##Closing BLOCK for condition for FNR==1 here.
{
print $1 > (file) ##Printing first field to variable file here.
}
' /home/gee/SNP_data/20* ##Mentioning Input_file path here to pass files here.
2nd solution: In case you need output files named Output_file_1.txt and so on, try the following. I have created an awk variable named out_file whose value you can change to set a different prefix for the output file names.
awk -v out_file="Output_file_" '
FNR==1{
if(file){
close(file)
}
++count
file=out_file count".txt"
}
{
print $1 > (file)
}
' /home/gee/SNP_data/20*
Awk has a built-in redirection operator. You can use it like this:
awk '{ print $1 > ("out_" FILENAME) }' /home/gee/SNP_data/20*
or, even better:
awk 'FNR==1 { close(f); f=("out_" FILENAME) } { print $1 > f }' /home/gee/SNP_data/20*
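The former just illustrates the redirection operator; the latter is the robust version, because it closes each output file before opening the next one and so avoids running out of open file descriptors.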
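One caveat when the inputs are passed with a directory, as in /home/gee/SNP_data/20*: FILENAME holds the name exactly as given on the command line, directory included, so a prefix such as "out_" ends up in front of the whole path and points into a directory that most likely does not exist. A minimal sketch of one way around this is to strip the directory part first. The data directory and file names below are invented for the demonstration:

```shell
cd "$(mktemp -d)" || exit 1
mkdir data
printf 'a 1\nb 2\n' > data/201
printf 'c 3\n'      > data/202

awk 'FNR == 1 {
    if (out != "") close(out)   # finish the previous output file
    name = FILENAME
    sub(/.*\//, "", name)       # keep only the part after the last slash
    out = "out_" name           # output lands in the current directory
}
{ print $1 > out }' data/20*

first=$(cat out_201)
second=$(cat out_202)
```

The output files are created in the current directory here; prepend a directory to out if they should go somewhere else.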
I have a script that looks like this:
#! /bin/awk -f
BEGIN { print "start" }
{ print $0 }
END { print "end" }
Call the script like this: ./myscript.awk test.txt
Pretty simple - takes a file and adds "start" to the start and "end" to the end.
Now I want to take the input filename, let's call it test.txt, and print the output to a file called test.out.
So I tried to print the input filename:
BEGIN { print "fname: '" FILENAME "'" }
But that printed: fname: '' :(
The rest I can figure out, I think. I have the following to print to a hard-coded filename:
#! /bin/awk -f
BEGIN { print "start" > "test.out" }
{ print $0 >> "test.out" }
END { print "end" >> "test.out" }
And that works great.
So the questions are:
how do I get the input filename?
Assuming somehow I get the input file name in a variable, e.g. FILENAME which contains "test.txt" how would I make another variable, e.g. OUTFILE, which contains "test.out"?
Note: I will be doing much more awk processing so please don't suggest to use sed or other languages :))
Try something like this:
#! /bin/awk -f
BEGIN {
# gensub() is gawk-only; the anchored pattern changes only a trailing ".txt"
file = gensub(/\.txt$/, ".out", 1, ARGV[1])
print "start" > file
}
{ print $0 >> file }
END {
print "end" >> file
close(file)
}
I'd also suggest calling close() on the file in the END rule. Credit to Sundeep for pointing out that FILENAME is empty in BEGIN.
$ echo 'foo' > ip.txt
$ awk 'NR==1{op=FILENAME; sub(/\.[^.]+$/, ".log", op); print "start" > op}
{print > op}
END{print "end" > op}' ip.txt
$ cat ip.log
start
foo
end
Save FILENAME to a variable, change the extension using sub, and then print as required.
From the gawk manual:
Inside a BEGIN rule, the value of FILENAME is "", because there are no input files being processed yet.
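A quick way to see this for yourself, sketched with a throwaway test.txt: FILENAME is still empty in BEGIN, while ARGV[1] already holds the first command-line operand and FILENAME is set once the first record has been read.

```shell
cd "$(mktemp -d)" || exit 1
echo 'foo' > test.txt

seen=$(awk 'BEGIN   { printf "begin=[%s] argv=[%s] ", FILENAME, ARGV[1] }
            NR == 1 { printf "main=[%s]\n", FILENAME }' test.txt)
printf '%s\n' "$seen"
```

ARGV[1] is only a safe substitute when the first operand really is a file name; a command-line assignment such as var=value would show up there as well.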
If you're using GNU awk (gawk), you can use the patterns BEGINFILE and ENDFILE:
awk 'BEGINFILE{
outfile=FILENAME;
sub(/\.txt$/,".out",outfile);
print "start" > outfile
}
ENDFILE{
print "stop" >outfile
}' file1.txt file2.txt
You can then use the variable outfile in your main {...} block.
Doing so allows you to process more than one file in a single awk command.
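BEGINFILE and ENDFILE are gawk extensions. For other awks, a rough equivalent can be sketched with FNR == 1 plus an END rule. The helper function finish() is a name made up for this example, and unlike BEGINFILE this approach never fires for an empty input file:

```shell
cd "$(mktemp -d)" || exit 1
printf 'one\n' > file1.txt
printf 'two\n' > file2.txt

awk 'function finish() {
    # Write the trailer to the current output file, if any, and close it.
    if (outfile != "") { print "end" > outfile; close(outfile) }
}
FNR == 1 {                          # first record of each input file
    finish()
    outfile = FILENAME
    sub(/\.txt$/, ".out", outfile)  # test.txt -> test.out
    print "start" > outfile
}
{ print > outfile }
END { finish() }' file1.txt file2.txt

out1=$(cat file1.out)
out2=$(cat file2.out)
```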
I am running the following awk script:
awk 'BEGIN { FS="|" ; OFS="|" }; { printf $0, $1 "_" $2 }' .someFile
Unfortunately, the concatenation of fields 1 and 2 is printed on a new line. It looks like the last field contains a newline character.
How can I trim it?
If you want to use printf (which may have been accidental), I think you can use this:
awk 'BEGIN { FS = OFS = "|" } { printf "%s%s%s_%s\n", $0, OFS, $1, $2 }' .someFile
printf should always be used with a format string. printf doesn't add the Output Record Separator to the end of what it prints, so you have to do that yourself using \n in the format string or by adding %s and passing ORS as the last argument to printf.
In this case, I think you can just use print though:
awk 'BEGIN { FS = OFS = "|" } { print $0, $1 "_" $2 }' .someFile
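If the appended field still shows up on its own line, one possible cause is Windows-style CRLF line endings, which leave a carriage return at the end of the last field. That is only an assumption, since the question does not show the file. A minimal sketch that strips the carriage return before printing, using a made-up sample file:

```shell
cd "$(mktemp -d)" || exit 1
printf 'a|b|c\r\n' > someFile    # sample line with a CRLF ending

joined=$(awk 'BEGIN { FS = OFS = "|" }
{
    sub(/\r$/, "")               # remove the trailing CR from the record
    print $0, $1 "_" $2          # print adds ORS (a newline) by itself
}' someFile)
printf '%s\n' "$joined"
```

Modifying $0 with sub() makes awk re-split the fields, so $1 and $2 are still valid afterwards.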
My input file has a plain-text representation of the newline character in it separating the fields:
First line\nSecond line\nThird line
I would expect the following to replace that text \n with a newline:
$ awk 'BEGIN { FS = "\\n"; OFS = "\n" } { print $1 }' test.txt
First line\nSecond line\nThird line
But it doesn't (gawk 4.0.1 / OpenBSD nawk 20110810).
I'm allowed to separate on just the \:
$ awk 'BEGIN { FS = "\\"; OFS = "\n" } { print $1, $2 }' test.txt
First line
nSecond line
I can also use a character class in gawk:
$ awk 'BEGIN { FS = "[[:punct:]]n"; OFS = "\n" } { $1 = $1; print $0 }' test.txt
First line
Second line
Third line
But I feel like I should be able to specify the exact separator.
A field separator is a regexp, and when a regexp is given as a string (a dynamic regexp) each backslash has to be escaped twice: once for the string literal and once for the regexp itself:
$ awk 'BEGIN { FS = "\\\\n"; OFS = "\n" } { print $1 }' file
First line
See the man page for details.
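To make the two levels of escaping concrete, here is a small sketch that rebuilds the sample input and splits it both ways. In the string form, "\\\\n" is first reduced by the string parser to the three characters \\n, which the regexp engine then reads as a literal backslash followed by n. A regexp literal skips the string step, so /\\n/ is enough there:

```shell
cd "$(mktemp -d)" || exit 1
# %s keeps the backslash-n sequences as literal text in the file.
printf '%s\n' 'First line\nSecond line\nThird line' > test.txt

# Dynamic regexp (a string): four backslashes.
split=$(awk 'BEGIN { FS = "\\\\n"; OFS = "\n" } { $1 = $1; print }' test.txt)

# Regexp literal: two backslashes.
viagsub=$(awk '{ gsub(/\\n/, "\n"); print }' test.txt)

printf '%s\n' "$split"
```

The original attempt, FS = "\\n", fails because the string parser turns it into \n, which the regexp engine then treats as a real newline character.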
Here sed might be a better tool for this task:
sed 's/\\n/\n/g'