Looking for an awk one-liner to reformat the text in a file of the following format, where the number of fields and the number of lines are arbitrary:
abcd,abce,test1
bbcd,bbee,bbvc,test2
ccdd,ccbb,ccbd,ccab,testxyz
The desired output pairs every field except the last with the last field of its line:
abcd,test1
abce,test1
bbcd,test2
bbee,test2
bbvc,test2
ccdd,testxyz
ccbb,testxyz
ccbd,testxyz
ccab,testxyz
Assuming all lines have at least 2 fields:
awk -F, '{OFS=","; for(i=1;i<NF;i++) print $i,$NF}' file
does what you expect.
If there can be lines with just one field and you simply want to print them unchanged:
awk -F, '{OFS=","; for(i=1;i<NF;i++) print $i,$NF; if(NF==1) print $0}' file
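A quick way to check the loop is to feed it part of the sample input from the question. This is only a minimal sketch: the variable name `out` is made up for illustration, and FS/OFS are set once in a BEGIN block instead of on every line, which should be equivalent here.

```shell
# Feed two of the question's sample lines to the loop and capture the result.
out=$(printf '%s\n' 'abcd,abce,test1' 'bbcd,bbee,bbvc,test2' |
  awk 'BEGIN { FS = OFS = "," }
       { for (i = 1; i < NF; i++) print $i, $NF }')
printf '%s\n' "$out"
```

The loop stops at NF-1, so the last field is never paired with itself.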
This might work for you (GNU sed):
sed -r 's/,(.*(,[^,]*))$/\2\n\1/;P;D' file
If the line contains two or more commas, the first comma is replaced by a comma, the last field and a newline. P prints the first line of the pattern space, D deletes it, and the cycle restarts on what remains without reading new input.
I am running Ubuntu Linux. I need to print the file names and line numbers of lines containing more than 7 columns. There are several hundred thousand files.
I am able to print the number of columns per file using awk. However the output I am after is something like
file1.csv-463, which is to say that file1.csv has more than 7 fields on line 463. I am using the command awk -F"," '{print NF}' * to print the number of fields across all files.
Please could I request help?
If you have GNU awk, try the following. It checks whether NF is greater than 7 and, if so, prints the file name along with the line number; nextfile then skips to the next input file, which saves time because the rest of the current file need not be read.
awk -F',' 'NF>7{print FILENAME,FNR;nextfile}' *.csv
The above prints only the first match in each file. To print all matching lines, try the following:
awk -F',' 'NF>7{print FILENAME,FNR}' *.csv
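The question asks for output like file1.csv-463, whereas `print FILENAME,FNR` separates the two values with a space. A minimal sketch of the dash-joined form follows; the two demo files and the `dir` and `out` names are hypothetical and exist only so the command has something to run on.

```shell
# Hypothetical demo files in a throwaway directory.
dir=$(mktemp -d)
printf '%s\n' 'a,b,c' '1,2,3,4,5,6,7,8' > "$dir/file1.csv"
printf '%s\n' 'x,y' > "$dir/file2.csv"

# Print "name-line" for every line with more than 7 fields.
out=$(cd "$dir" && awk -F',' 'NF > 7 { print FILENAME "-" FNR }' *.csv)
printf '%s\n' "$out"
rm -rf "$dir"
```

With several hundred thousand files, the `*.csv` glob may exceed the system's argument-length limit; running awk through `find . -name '*.csv' -exec awk ... {} +` is one possible way around that.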
This might work for you (GNU sed):
sed -Ens 's/\S+/&/8;T;F;=;p' *.csv | paste - - -
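The substitution replaces the eighth run of non-whitespace characters (\S+) with itself, so it succeeds only when there is an eighth whitespace-separated column; if it fails, T ends processing of that line. For comma-separated columns the pattern would have to split on commas instead.
Otherwise, output the file name (F), the line number (=) and the current line (p).
The output is fed into paste, which joins each group of three lines into one.
N.B. The -s option treats the files separately, so line numbers restart for each file; without it, lines are numbered consecutively across the entire input.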
Is it possible to parse a .csv file and look for the 13th entry containing a particular value.
So data for example would be
10,1,a,bhd,5,7,10,,,8,9,3,19,0
I only want to extract lines which have a value of 3 in the 13th field if that makes sense.
Tried it with a bash while loop using cut etc., but it was messy.
Not sure if there is an awk / sed method.
Thanks in advance.
This is beginner level awk.
awk -F, '$13==3' file
-F, is for setting field separator to comma, $13 is the 13th field's value. For each line, if $13==3 evaluates true the line is printed.
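To see the filter in action, here is a small sketch with two rows. The first row is made up so that its 13th field is 3; the second is the sample row from the question, where the 3 actually sits in field 12 and field 13 is 19, so it is not selected. The variable name `out` is only for illustration.

```shell
# Row 1 (made up): field 13 is 3. Row 2 (from the question): field 13 is 19.
out=$(printf '%s\n' '10,1,a,bhd,5,7,10,,,8,9,19,3,0' '10,1,a,bhd,5,7,10,,,8,9,3,19,0' |
  awk -F, '$13 == 3')
printf '%s\n' "$out"
```

The comparison is numeric, so a field such as 3.0 would also match; `$13 == "3"` would force a string comparison if that matters.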
I have a .txt file with 2 columns and a separator. Some lines only contain 1 column though, so I want to remove those.
example of lines are
Line to keep,
Iamnotyours:email#email.com
Line to remove,
Iamnotyours:
Given your posted sample input all you need is:
grep -v ':$' file
or if you insist on awk for some reason:
awk '!/:$/' file
If that's not all you need then edit your question to clarify your requirements.
awk to the rescue!
$ awk -F: 'NF==2' file
prints only the lines with two fields
$ awk -F: 'NF>1' file
prints lines with more than one field. In your case the separator is always present, so the field count is two even when the second field is empty. You need to check whether the second field is empty instead:
$ awk -F: '$2!=""' file
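The difference between counting fields and testing the second field can be sketched on the two data lines from the question. The variable names `input`, `by_count` and `by_value` are made up for this illustration.

```shell
# The two data lines from the question: one complete, one with an empty second field.
input=$(printf '%s\n' 'Iamnotyours:email#email.com' 'Iamnotyours:')

# Counting fields keeps both lines, because the trailing ":" still yields 2 fields.
by_count=$(printf '%s\n' "$input" | awk -F: 'NF == 2')

# Testing the second field keeps only the line where it is non-empty.
by_value=$(printf '%s\n' "$input" | awk -F: '$2 != ""')

printf 'NF == 2:\n%s\n\n$2 != "":\n%s\n' "$by_count" "$by_value"
```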
I have a script which returns a few lines of output, and I am trying to print the last two words of the last line (irrespective of the number of lines in the output):
$ ./test.sh
service is running..
check are getting done
status is now open..
the test is passed
I tried running as below but it prints last word of each line.
$ ./test.sh | awk '{ print $NF }'
running..
done
open..
passed
How do I print the last two words "is passed" using awk or sed?
Just say:
awk 'END {print $(NF-1), $NF}'
Most awks keep the last record (but not the earlier ones) available in the END block, so its fields are still accessible there.
Then it is just a matter of printing the penultimate and the last field, $(NF-1) and $NF.
For robustness, in case your last line may contain only 1 field or your awk doesn't retain the field values in the END section, save the fields and their count yourself:
awk '{n=split($0,a)} END{print (n>1?a[n-1] OFS:"") a[n]}'
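As a sketch of the robust variant, the following keeps the field count returned by split() in its own variable, so nothing depends on what the awk in use preserves in END. The names `n` and `out` are chosen here for illustration, and printf stands in for ./test.sh.

```shell
# printf stands in for the script; only the last line matters.
out=$(printf '%s\n' 'service is running..' 'status is now open..' 'the test is passed' |
  awk '{ n = split($0, a) }    # remember the fields and their count for each line
       END { print (n > 1 ? a[n-1] OFS : "") a[n] }')
printf '%s\n' "$out"
```

If the last line has a single word, only that word is printed.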
This might work for you (GNU sed):
sed '$s/.*\(\<..*\<.*\)/\1/p;d' file
This deletes every line of the file, but on the last line it first replaces the whole line with its last two words and, if the substitution succeeds, prints the result.
I have a file containing lines of sentences. I want to print all lines that have more than 3 words. Words are separated by whitespace.
How could I do this with awk?
Use awk like this:
awk 'NF>3' file
With GNU sed:
sed -E '/\s*(\S+\s+){3}\S+/!d' file
The variable NF indicates the number of fields on the current input line.
awk 'NF>3' file
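A small sketch of the NF test on made-up sentences (the input lines and the name `out` are only for illustration). With the default field separator, leading and repeated whitespace does not create extra fields, and matching lines are printed unchanged.

```shell
# Three made-up lines with 3, 4 and 5 words.
out=$(printf '%s\n' 'one two three' 'one two three four' '  padded   line with five words' |
  awk 'NF > 3')
printf '%s\n' "$out"
```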