How to find the total of the second column using awk?

Input file(filename:cat)
item1,200
item2,499
item3,699
item4,800
The awk command I tried
awk -F"," '{x+=$2}END{print x}'cat
Error
The command above produces no output. Is there a way to fix this?

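Edited and final command (a space was missing between the closing quote and the filename, so the shell appended cat to the program text and awk was left waiting on standard input)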
awk -F"," '{x+=$2}END{print x}' cat
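As a minimal sketch of the corrected command in use, the snippet below recreates the sample input in a temporary file (standing in for the file named cat) and sums the second column. The function name sum_col2 is hypothetical, and the `x + 0` in the END block is an optional addition that forces a numeric 0 to be printed when the input is empty.

```shell
# Sum the second comma-separated column of the file(s) given as arguments.
# sum_col2 is a hypothetical helper name used only for this illustration.
sum_col2() {
  # x + 0 forces numeric output, so an empty file prints 0 instead of a blank line
  awk -F',' '{ x += $2 } END { print x + 0 }' "$@"
}

# Recreate the sample input from the question in a temporary file.
tmp=$(mktemp)
printf '%s\n' 'item1,200' 'item2,499' 'item3,699' 'item4,800' > "$tmp"

sum_col2 "$tmp"   # prints 2198
rm -f "$tmp"
```

Note the space between the closing quote of the program and the filename argument; without it, awk receives no input file.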

Related

awk command to print columns with column data

cat file1.txt | awk -F '{print $1 "|~|" $2 "|~|" $3}' > file2.txt
I am using the above command to extract the first three columns from file1.txt and write them to file2.txt.
But I am only getting the column names and not the column data.
How can I do that?
The delimiter is |~|.
file1.txt has values as :
a|~|b|~|c|~|d|~|e
1|~|2|~|3|~|4|~|5
11|~|22|~|33|~|44|~|55
111|~|222|~|333|~|444|~|555
My expected output is:
a|~|b|~|c
1|~|2|~|3
11|~|22|~|33
111|~|222|~|333
With your shown samples, please try the following awk code. Set the field separator to |~|, strip leading whitespace from each line, then print the first three fields.
awk -F'\\|~\\|' -v OFS='|~|' '{sub(/^[[:blank:]]+/,"");print $1,$2,$3}' Input_file
In case you want to keep the leading spaces (which were in the initial post before the edit), try the following:
awk -F'\\|~\\|' -v OFS='|~|' '{print $1,$2,$3}' Input_file
NOTE: After a chat with the user, it turned out the code was not working because gunzip -c was being used incorrectly: its output was being saved into a variable, and the awk program was run on that variable. Once that command was corrected to produce the right file, the awk program ran fine on it. Adding this as a reference for future readers.
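The reason the separator needs escaping is that | is the alternation operator in awk regular expressions. A variant that avoids backslashes puts each | inside a bracket expression. The sketch below is one way to write that; the function name first_three is hypothetical, and the two input lines are taken from the sample data.

```shell
# Print the first three |~|-delimited fields of each line.
# first_three is a hypothetical helper name; it reads files given as
# arguments, or standard input when called without arguments.
first_three() {
  # [|] matches a literal pipe, so no backslash escaping is needed
  awk -F'[|]~[|]' -v OFS='|~|' '{ print $1, $2, $3 }' "$@"
}

printf '%s\n' 'a|~|b|~|c|~|d|~|e' '1|~|2|~|3|~|4|~|5' | first_three
# prints:
# a|~|b|~|c
# 1|~|2|~|3
```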
One approach would be:
awk -v FS="," -v OFS="|~|" '{gsub(/[|][~][|]/,","); sub(/^\s*/,""); print $1,$2,$3}' file1.txt
The approach simply replaces every "|~|" with a "," and sets the output field separator to "|~|". All leading whitespace is trimmed with sub().
Example Use/Output
With your data in file1.txt, you would have:
$ awk -v FS="," -v OFS="|~|" '{gsub(/[|][~][|]/,","); sub(/^\s*/,""); print $1,$2,$3}' file1.txt
a|~|b|~|c
1|~|2|~|3
11|~|22|~|33
111|~|222|~|333
Let me know if this is what you intended. You can simply redirect, e.g. > file2.txt to write to the second file.
For such cases, my bash+awk script rcut comes in handy:
rcut -Fd'|~|' -f-3 ip.txt
The -F option enables fixed string input delimiter (which is given using the -d option). And by default, the output field separator will also be same as -d when -F is active. -f-3 is similar to cut syntax to specify first three fields.
For better speed, use hck command:
hck -Ld'|~|' -D'|~|' -f-3 ip.txt
Here, -L enables literal field separator and -D specifies output field separator.
Another benefit is that hck supports -z option to automatically handle common compressed formats based on filename extension (adding this since OP had an issue with compressed input).
Another way:
sed 's/|~|/\t/g' file1.txt | awk '{print $1"|~|"$2"|~|"$3}' > file2.txt
First replace the |~| delimiter with a tab, then use the default awk field separator and print the columns you need.

awk print several substrings

I would like to print several substrings with awk.
Here is an example of what I usually do:
awk '{print substr($0,index($0,string),10)}' test.txt > result.txt
This lets me print 10 characters starting at the first occurrence of my string.
But the result is only the first substring, not several as I expected.
Here is an example using the string "ATGC":
test.txt
ATGCATATAAATGCTTTTTTTTT
result.txt
ATGCATATAA
instead of
ATGCATATAA
ATGCTTTTTT
What do I have to add?
I'm sure the answer is easy for you guys!
Thank you for your help.
If you have gawk (GNU awk), you can make use of FPAT:
awk -v FPAT='ATGC.{6}' '{for(i=1;i<=NF;i++)print $i}' file
With your example:
$ awk -v FPAT='ATGC.{6}' '{for(i=1;i<=NF;i++)print $i}' <<<"ATGCATATAAATGCTTTTTTTTT"
ATGCATATAA
ATGCTTTTTT
Another option relies on fixed character positions, so it fits only lines shaped like the sample:
awk '{print substr($0,1,10),RS substr($0,length -12,10)}' file
ATGCATATAA
ATGCTTTTTT
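For awk implementations without FPAT, one possible approach is to loop with index() and substr(), cutting the line down after each match. This is a sketch, not taken from the answers above; the function name print_matches and its two parameters are hypothetical. Unlike the FPAT pattern, it also prints a shorter substring when fewer than 10 characters remain after a match.

```shell
# Print n characters starting at every occurrence of a search string.
# print_matches is a hypothetical helper: $1 = search string, $2 = length.
print_matches() {
  awk -v s="$1" -v n="$2" '{
    line = $0
    # index() returns 0 when s no longer occurs in the remaining text
    while ((i = index(line, s)) > 0) {
      print substr(line, i, n)
      line = substr(line, i + length(s))   # continue just after this match
    }
  }'
}

printf '%s\n' 'ATGCATATAAATGCTTTTTTTTT' | print_matches ATGC 10
# prints:
# ATGCATATAA
# ATGCTTTTTT
```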

Running awk command in awk script

I am just looking to run a simple script that runs an awk command inside of the awk script.
sample_enrollment.csv file: "EffectiveDate","Status","EmployeeID","ClientID"
Below is the Lab4_1.awk
#!/bin/bash
BEGIN{FS=","}
{
awk 'gsub(/EfectiveDate/, "Effective Date")'
}
I am running the command from the command line like this
awk -f lab4_1.awk sample_enrollment.csv
The error I am getting seems to indicate that the single quotes in the awk gsub command are wrong. I have tried many variations of this awk command without any luck. I am only asking about this portion, as I will need to add more to the awk script once this works.
Any help would be appreciated. Thank you
I don't think there is any need for two awk commands here. Based on your shown attempt, it can be done in a single awk like this:
awk -F, '{gsub(/EffectiveDate/, "Effective Date")} 1' Input_file
As I mentioned in the comments, if you have more requirements, let us know with samples in code tags in your post and we can help from there.
EDIT: As OP mentioned a script is needed so now adding code in a bash script format too.
cat script
#!/bin/bash
awk '{gsub("EffectiveDate","Effective Date")} 1' Input_file
......... do my other stuff too here in bash or awk...........
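If the goal is to keep using awk -f, the script file should contain awk statements only, with no bash shebang and no nested awk '...' call. The sketch below writes such a file and a one-line sample CSV into a temporary directory (the temporary directory and the variable names workdir and result are assumptions made for this illustration) and runs it the way the question does.

```shell
workdir=$(mktemp -d)

# An awk script file holds awk code only: patterns and actions, no quoting.
cat > "$workdir/lab4_1.awk" <<'EOF'
BEGIN { FS = "," }
{ gsub(/EffectiveDate/, "Effective Date") }   # edit the current record in place
{ print }                                      # print every record, edited or not
EOF

# One-line sample input taken from the question.
printf '%s\n' '"EffectiveDate","Status","EmployeeID","ClientID"' > "$workdir/sample_enrollment.csv"

result=$(awk -f "$workdir/lab4_1.awk" "$workdir/sample_enrollment.csv")
echo "$result"   # prints "Effective Date","Status","EmployeeID","ClientID"

rm -rf "$workdir"
```

Further rules can be appended to the script file as additional pattern { action } lines.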

Get last field using awk

I am new to awk and want to use it in a shell script to select the AMI name in my automation pipeline.
{"us-west_ami" :"ami-123" }
I want to select "ami-123" and pass it to a new job.
I tried to use print $NF, but it does not select the last value.
If it is JSON, use the right tool, e.g. jq:
kent$ jq '."us-west_ami"' <<<'{"us-west_ami" :"ami-123" }'
"ami-123"
print $NF does indeed select the last field, but first you need to define the record and field separators (RS and FS). In this case it would be easiest to use GNU awk and define FPAT:
$ awk 'BEGIN{FPAT="\"[^\"]+\""}{print $NF}' file
"ami-123"
See the gawk manual for more details on FPAT.
grep is not the right tool to parse JSON, but for the given input this will work:
$ grep -oP '"us-west_ami"(\s+)?:\K[^,}]*' <<< '{"us-west_ami" :"ami-123" }'
"ami-123"
To save it in a variable:
$ myvar=$(grep -oP '"us-west_ami"(\s+)?:\K[^,}]*' <<<'{"us-west_ami" :"ami-123" }')
$ echo "$myvar"
"ami-123"
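If the value is needed without the surrounding quotes and only plain awk is available, one possible sketch is to split on the double-quote character and take the fourth field. This assumes the input is always a single line with exactly one key and one string value, as in the question; anything more complex is better handled by jq (whose -r option prints raw strings without quotes). The variable names json and ami are hypothetical.

```shell
# Sample input from the question.
json='{"us-west_ami" :"ami-123" }'

# Splitting on " gives: $1 = {   $2 = us-west_ami   $3 = " :"   $4 = ami-123
ami=$(printf '%s\n' "$json" | awk -F'"' '{ print $4 }')

echo "$ami"   # prints ami-123
```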

awk extract of a series of lines

I am stuck trying to find the right awk solution to extract the versions between "[]" from
Version Repository Repository URL
[1.0.0.44] repo-0 file://test/test-1.0.0.44-features.xml
[1.0.0.21] repo-0 file://test/test-1.0.0.21-features.xml
Is there a quick, efficient one-liner anyone can help with, please?
With awk, using square brackets as the field separators, output field 2 except for record number 1:
awk -F '[][]' 'NR > 1 {print $2}'
Or, grep with -o is useful for extracting substrings
grep -oP '(?<=\[)[^]]+'
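As a worked example on the sample input, the sketch below uses the same bracket field separator but selects lines with NF > 1 instead of NR > 1, a small substitution so that any line without brackets is skipped rather than only the first one. The function name extract_versions is hypothetical.

```shell
# Print the text between the first [ and ] on each line that has brackets.
# extract_versions is a hypothetical helper name.
extract_versions() {
  # With [ or ] as separators, a bracketed line splits into more than one
  # field and the version lands in $2; lines without brackets have NF == 1.
  awk -F'[][]' 'NF > 1 { print $2 }' "$@"
}

extract_versions <<'EOF'
Version Repository Repository URL
[1.0.0.44] repo-0 file://test/test-1.0.0.44-features.xml
[1.0.0.21] repo-0 file://test/test-1.0.0.21-features.xml
EOF
# prints:
# 1.0.0.44
# 1.0.0.21
```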