How to separate columns in Linux using awk or cat [duplicate]

This question already has answers here:
bash: shortest way to get n-th column of output
(8 answers)
Closed 2 years ago.
I have a file with 900.000 columns, the structure is:
1613 1200000012000500000011111.......
112345 1200000012000500000011111.......
1287659 1200000012000500000011111.......
1234 1200000012000500000011111.......
712826 1200000012000500000011111.......
I need only the numbers before the space, written to a new file like this:
1613
112345
1287659
1234
712826
I tried with
cat -df.txt |cut -d |,| -f7
but it does not work.

A couple of approaches, based on your attempts:
awk
As also suggested by RavinderSingh13, a straightforward awk approach is
awk '{print $1}' yourfile.txt
Remember that awk doesn't need cat.
cut
With vanilla cut as well, this should work:
cut -f1 -d' ' yourfile.txt
Here you ask cut to print the first field (-f1), where the delimiter is a single space (-d' '). Remember that cut doesn't need cat either (unlike me, always forgetting).
Other very nice approaches with grep and sed can be found in this question
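As a quick check of both commands, here is a minimal sketch that builds a small sample file (the name sample_cols.txt and its contents are made up for illustration) and extracts the first column each way. It also shows one difference worth knowing: awk splits on runs of whitespace and ignores leading blanks, while cut treats every single space as a delimiter.

```shell
# Hypothetical sample data: an ID, a space, then a long digit string.
tmpdir=$(mktemp -d)
sample="$tmpdir/sample_cols.txt"
printf '%s\n' '1613 12000000120005' '112345 12000000120005' '1234 12000000120005' > "$sample"

# awk: default whitespace splitting, print field 1.
awk_out=$(awk '{print $1}' "$sample")

# cut: single space as delimiter, print field 1.
cut_out=$(cut -d' ' -f1 "$sample")

printf 'awk:\n%s\ncut:\n%s\n' "$awk_out" "$cut_out"

# With a leading space, awk still finds the ID, but cut sees an empty first field.
lead_awk=$(printf ' 1613 12\n' | awk '{print $1}')
lead_cut=$(printf ' 1613 12\n' | cut -d' ' -f1)

rm -r "$tmpdir"
```

For a file shaped like the one in the question (no leading blanks, exactly one space), the two commands should give the same result; redirect either one with > newfile.txt to get the new file.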

Related

Convert multiple lines to a line separated by brackets and "|"

I have the following data in multiple lines:
1
2
3
4
5
6
7
8
9
10
I want to convert them to lines separated by "|" and "()":
(1)|(2)|(3)|(4)|(5)|(6)|(7)|(8)|(9)|10
I made a mistake, sorry. I want to convert them to lines separated by "|" and "()":
(1)|(2)|(3)|(4)|(5)|(6)|(7)|(8)|(9)|(10)
What I have tried is:
seq 10 | sed -r 's/(.*)/(\1)/'|paste -sd"|"
What's the best unix one-liner to do that?
This might work for you (GNU sed):
sed 's/.*/(&)/;H;1h;$!d;x;s/\n/|/g' file
Surround each line by parens.
Append all lines to the hold space except for the first line which replaces the hold space.
Delete all lines except the last.
On the last line, swap to the hold space and replace all newlines by |'s.
N.B. When a line is deleted no further commands are invoked and the command cycle begins again. That is why the last two commands are only executed on the last line of the file.
Alternative:
sed -z 's/\n$//;s/.*/(&)/mg;y/\n/|/' file
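The hold-space logic of the first sed command above can be mirrored in awk, which may make the steps easier to follow. This is only an illustrative sketch: the variable name hold is made up and plays the role of sed's hold space, and three input lines stand in for the real file.

```shell
result=$(printf '%s\n' 1 2 3 | awk '
  { line = "(" $0 ")" }              # surround each line with parens
  NR == 1 { hold = line; next }      # first line replaces the hold buffer
  { hold = hold "\n" line }          # later lines are appended after a newline
  END { gsub(/\n/, "|", hold); print hold }   # at the end, newlines become bars
')
printf '%s\n' "$result"
```

Nothing is printed until the END block runs, which corresponds to sed deleting every line except the last.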
With your shown samples please try following awk code. This should work in any version of awk.
awk -v OFS="|" '{val=(val?val OFS:"") "("$0")"} END{print val}' Input_file
Using GNU sed
$ sed -Ez ':a;s/([0-9]+)\n/(\1)|/;ta;s/\|$/\n/' input_file
(1)|(2)|(3)|(4)|(5)|(6)|(7)|(8)|(9)|(10)
Here is another simple awk command:
awk 'NR>1 {printf "%s|", p} {p="(" $0 ")"} END {print p}' file
(1)|(2)|(3)|(4)|(5)|(6)|(7)|(8)|(9)|(10)
Here it is:
sed -z 's/^/(/;s/\n/)|(/g;s/|($//' your_input
where -z allows you to treat the whole file as a single string with embedded \ns.
In detail, the sed script above consists of 3 commands separated by ;s:
s/^/(/ inserts a ( at the beginning of the whole file,
s/\n/)|(/g changes every \n to )|(;
s/|($// removes the trailing |( that results from the \n at EOF, which is likely present in your file since you are on Linux.
With perl:
$ seq 10 | perl -pe 's/.*/($&)/; s/\n/|/ if !eof'
(1)|(2)|(3)|(4)|(5)|(6)|(7)|(8)|(9)|(10)
s/.*/($&)/ to surround input lines with ()
s/\n/|/ if !eof will change newline to | except for the last input line.
Here's a solution with paste (just for fun):
$ seq 10 | paste -d'()' /dev/null - /dev/null | paste -sd'|'
(1)|(2)|(3)|(4)|(5)|(6)|(7)|(8)|(9)|(10)
Using any awk:
$ seq 10 | awk '{printf "%s(%s)", sep, $0; sep="|"} END{print ""}'
(1)|(2)|(3)|(4)|(5)|(6)|(7)|(8)|(9)|(10)

How to skip separator inclusion in file using awk command [duplicate]

This question already has an answer here:
awk: function to escape regex operators from a string
(1 answer)
Closed 6 months ago.
Trying to replace | by , using awk
$ awk '{gsub("|",","); print}' sample.txt | tee sample.txt
The sample file contains ||| characters and the target is to replace them with ,,,. When the above command is run, the output is ,|,|,| where it should be ,,,
Try awk '{gsub(/\|/,","); print}' sample.txt | tee output.txt. Note that "|" needs to be escaped with "\" because gsub treats its first argument as a regular expression, and that the result is written by tee to a different file. Writing back to the same file that is being read may not be safe.
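A minimal sketch of the fix, under the assumption that the file looks like the sample below (the file name and its contents are invented for illustration). It shows two spellings of a literal bar and one common way to update the original file without reading and writing it at the same time: write to a temporary file, then rename it.

```shell
tmpdir=$(mktemp -d)
f="$tmpdir/sample.txt"
printf '%s\n' 'a|b|c' '|||' > "$f"

# Regex literal with the bar escaped.
out1=$(awk '{gsub(/\|/, ","); print}' "$f")

# Bracket expression: the bar is literal inside [], so no backslash is needed.
out2=$(awk '{gsub(/[|]/, ","); print}' "$f")

# Write to a temporary file first, then replace the original only if awk succeeded.
awk '{gsub(/[|]/, ","); print}' "$f" > "$f.tmp" && mv "$f.tmp" "$f"
out3=$(cat "$f")

printf '%s\n' "$out3"
rm -r "$tmpdir"
```

The bracket form is sometimes preferred because it reads the same whether the pattern is written as a regex literal or as a string.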

convert hex to decimal with awk incorrect (with --non-decimal-data or strtonum) [duplicate]

This question already has answers here:
Printing long integers in awk
(7 answers)
Closed 3 years ago.
awk hex to decimal result is incorrect, not equal with bash/python
echo 0x06375FDFAE88312A |awk --non-decimal-data '{printf "%d\n",$1}'
or
echo 0x06375FDFAE88312A |awk '{printf "%d\n",strtonum($1)}'
the result is 447932102257160448, but with python the result is 447932102257160490
python -c "print(int('0x06375FDFAE88312A', 16))"
You need to use the --bignum option, as this answer suggests (supported in gawk since version 4.1).
echo 0x06375FDFAE88312A |awk --bignum '{printf "%d\n",strtonum($1)}'
echo 0x06375FDFAE88312A |awk --bignum --non-decimal-data '{printf "%d\n",$1}'
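The problem is that AWK typically represents numbers as double-precision floating-point values by default. A double has a 53-bit significand, so integers larger than 2^53 (about 9 * 10^15) cannot all be stored exactly, and the 18-digit value here gets rounded to the nearest representable number.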
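To make the size of the number concrete, here is a small sketch that does the conversion with the shell's own integer handling instead of awk. It assumes a shell whose printf accepts hexadecimal constants and works with 64-bit integers, as bash does; the variable names are illustrative only.

```shell
hex=0x06375FDFAE88312A

# Integer conversion in the shell: no floating point involved, so no rounding.
exact=$(printf '%d' "$hex")

# 2^53, the point above which a double can no longer hold every integer.
limit=9007199254740992

if [ "$exact" -gt "$limit" ]; then
  echo "$exact is larger than 2^53, so a double may round it"
fi
```

This matches the Python result from the question, while the plain awk runs return the nearest double instead.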

getting specific value using awk command in linux [duplicate]

This question already has answers here:
How do I use shell variables in an awk script?
(7 answers)
Closed 4 years ago.
I have the following file.
cat test.txt
NE|East
OR|East
WB|East
HP|North
HR|North
JK|North
NR|North
PB|North
I have a variable circle which stores the following value.
circle="JK"
Now, I want the value matching my variable. I have used the following code, but it doesn't give me any output. However, when I manually write "JK", it shows me the desired result.
awk -F '|' '{if($1==$circle) print $2;}' test.txt
awk -F '|' '{if($1 == "JK") print $2;}' test.txt
North
Please suggest. Help is much appreciated.
Could you please try following.
circle="JK"
awk -v var="$circle" -F'|' '$1==var{print $2}' Input_file
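The original attempt fails because the awk program is in single quotes, so the shell never expands $circle; awk then reads it as a field reference through an unset awk variable, which evaluates to $0. A minimal sketch of two ways to hand the value over, using a shortened copy of the file from the question (the temporary path is made up):

```shell
tmpdir=$(mktemp -d)
printf '%s\n' 'NE|East' 'HP|North' 'JK|North' 'PB|North' > "$tmpdir/test.txt"

circle="JK"

# Option 1: -v assigns the shell value to an awk variable before the program runs.
via_v=$(awk -F'|' -v var="$circle" '$1 == var {print $2}' "$tmpdir/test.txt")

# Option 2: pass it through the environment and read it from ENVIRON.
via_env=$(circle="$circle" awk -F'|' '$1 == ENVIRON["circle"] {print $2}' "$tmpdir/test.txt")

printf '%s\n%s\n' "$via_v" "$via_env"
rm -r "$tmpdir"
```

One difference between the two: -v interprets backslash escape sequences in the value, whereas ENVIRON passes the string through unchanged.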

awk extract of a series of lines

I am stuck trying to find the right awk solution to extract the versions between "[]" from
Version Repository Repository URL
[1.0.0.44] repo-0 file://test/test-1.0.0.44-features.xml
[1.0.0.21] repo-0 file://test/test-1.0.0.21-features.xml
Are there any quick, efficient one-liners anyone can help with, please?
With awk, using square brackets as the field separators, output field 2 except for record number 1:
awk -F '[][]' 'NR > 1 {print $2}'
Or, grep with -o is useful for extracting substrings (the -P option used here requires GNU grep with PCRE support):
grep -oP '(?<=\[)[^]]+'
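A small sketch of the awk approach run against the sample from the question, held in a shell variable for convenience. With [ and ] as separators, the text before the opening bracket is field 1 (empty here) and the version is field 2. The second variant is an assumption-driven alternative: instead of skipping line 1, it keeps only lines that actually contain a bracket.

```shell
input='Version Repository Repository URL
[1.0.0.44] repo-0 file://test/test-1.0.0.44-features.xml
[1.0.0.21] repo-0 file://test/test-1.0.0.21-features.xml'

# Skip the header by record number.
by_line=$(printf '%s\n' "$input" | awk -F '[][]' 'NR > 1 {print $2}')

# Skip any line without brackets: such lines have only one field.
by_fields=$(printf '%s\n' "$input" | awk -F '[][]' 'NF > 1 {print $2}')

printf '%s\n' "$by_line"
```

The NF test may be the safer choice when the header is not guaranteed to be the first line.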