awk or cut field with filename that has space - awk

Given the line below:
-rw-r--r-- 1 user user 0 Aug 26 15:20 /home/user/public_html/this\ space.ext
I want to extract the last column. Expected output:
/home/user/public_html/this\ space.ext
What I tried with cut:
ls -lh /home/user/public_html/this\ space.ext | cut -d ' ' -f9
output:
/home/user/public_html/this\
What I tried with awk:
ls -lh /home/user/public_html/this\ space.ext | awk '{print $9}'
output:
/home/user/public_html/this\

with awk
$ echo "-rw-r--r-- 1 user user 0 Aug 26 15:20 /home/user/public_html/this\ space.ext" |
awk -F'[^\\\\] ' '{print $NF}'
/home/user/public_html/this\ space.ext
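This defines the field delimiter as a space that follows a non-backslash character. The delimiter also consumes that preceding character, so only the last field ($NF) comes out intact.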
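Because the regex delimiter above swallows the character in front of each space, the earlier fields are no longer usable. An alternative is to strip the leading columns and keep the rest of the line. This is only a minimal sketch, assuming the usual `ls -l` layout in which the name starts at the ninth column; `name_field` is a hypothetical helper name, not part of the answer above.

```shell
# Hypothetical helper: drop the first 8 space-separated columns of an
# `ls -l` style line and print whatever remains (the file name).
name_field() {
    awk '{
        for (i = 1; i <= 8; i++)
            sub(/^[^ ]+ +/, "")   # remove one leading column plus trailing spaces
        print
    }'
}

printf '%s\n' '-rw-r--r-- 1 user user 0 Aug 26 15:20 /home/user/public_html/this\ space.ext' |
    name_field
```

The loop avoids regex interval expressions such as `{8}`, which not every awk implementation accepts.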

Related

AWK Assignment and execute operation with variables

I would like to find out how to assign a value to a variable and then run a command with it.
Suppose that I get these files as a result of ls -A *.pdf | grep -v '^d':
firstOne.pdf
ordenSiq.pdf
Now I'm trying to run a command on the result after the assignment, for example:
% ls -lAh ordenSiq.pdf
-rw-r--r--@ 1 joseluisbz staff 47K Jun 29 15:35 ordenSiq.pdf
Here is my attempt (but it is not working!):
awk -v thelast="$(ls -A *.pdf | grep -v '^d' | tail -n 1 | awk '{print}')" 'BEGIN {ls -lAh thelast;}'
EDIT
Obtaining The Last File With Extension!
% thelast=$(ls -A *.pdf | grep -v '^d' | tail -n 1 | awk '{print}'); awk -v result=$thelast 'BEGIN{print result}'
Output:
ordenSiq.pdf
Extracting Only The Name (split)
% thelast=$(ls -A *.pdf | grep -v '^d' | tail -n 1 | awk '{print}'); thename=$(echo ${thelast} | awk '{split($0,a,"."); print a[1]}'); awk -v result="$thename" 'BEGIN{print result}'
Output:
ordenSiq
ALL Extensions For name (concatenation)
% thelast=$(ls -A *.pdf | grep -v '^d' | tail -n 1 | awk '{print}'); thename=$(echo ${thelast} | awk '{split($0,a,"."); print a[1]}'); allexts=$(echo ${thename}'.*'); awk -v result="$allexts" 'BEGIN{print result}'
Output:
ordenSiq.*
or
thelast=$(ls -A *.pdf | grep -v '^d' | tail -n 1 | awk '{print}'); thename=$(echo ${thelast} | awk '{split($0,a,"."); print a[1]}'); awk -v allexts=${thename}".*" 'BEGIN{print allexts}'
Execution of command with variable
I would like to obtain something like:
% ls -lAh ordenSiq.*
-rw-r--r-- 1 joseluisbz staff 0B Jul 15 12:34 ordenSiq.abc
-rw-r--r-- 1 joseluisbz staff 0B Jul 15 12:34 ordenSiq.def
-rw-r--r--@ 1 joseluisbz staff 47K Jun 29 15:35 ordenSiq.pdf
%
ERROR:
% thelast=$(ls -A *.pdf | grep -v '^d' | tail -n 1 | awk '{print}'); thename=$(echo ${thelast} | awk '{split($0,a,"."); print a[1]}'); awk -v allexts=${thename}".*" 'BEGIN{system(ls -lAh allexts)}'
Output:
sh: 0ordenSiq.*: command not found
And with
thelast=$(ls -A *.pdf | grep -v '^d' | tail -n 1 | awk '{print}'); thename=$(echo ${thelast} | awk '{split($0,a,"."); print a[1]}'); awk -v allexts=${thename}".*" 'BEGIN{system(ls -lAh $allexts)}'
Output:
awk: illegal field $(ordenSiq.*), name "allexts"
source line number 1
Some Working example:
% root1="/webroot"; echo | awk -v r=$root1 '{ print "shell variable $root1 value is " r}'
Output:
shell variable $root1 value is /webroot
Statically Works!
% ls -lAh ordenSiq.*
-rw-r--r-- 1 joseluisbz staff 0B Jul 15 12:34 ordenSiq.abc
-rw-r--r-- 1 joseluisbz staff 0B Jul 15 12:34 ordenSiq.def
-rw-r--r--@ 1 joseluisbz staff 47K Jun 29 15:35 ordenSiq.pdf
%
And the variable's value is correct!
% thelast=$(ls -A *.pdf | grep -v '^d' | tail -n 1 | awk '{print}'); thename=$(echo ${thelast} | awk '{split($0,a,"."); print a[1]}'); allexts=$(echo ${thename}'.*'); echo ${allexts}
Output:
ordenSiq.*
But doing this does not work:
% thelast=$(ls -A *.pdf | grep -v '^d' | tail -n 1 | awk '{print}'); thename=$(echo ${thelast} | awk '{split($0,a,"."); print a[1]}'); allexts=$(echo ${thename}'.*'); ls -lAh $allexts
Output:
ls: ordenSiq.*: No such file or directory
QUESTION:
What is wrong in my steps in order to perform the final operation with variables (with and without AWK)?
{ls -lAh thelast;}
This is not the correct way to run a shell command in GNU AWK. You should build a string containing the command and then pass it to the system function. For example, if you want to sleep for 5 seconds at the beginning, you might do:
awk 'BEGIN{system("sleep 5")}'
Keep in mind that the system function returns the command's exit status (see the linked documentation for further discussion), not the output of the command.
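The difference between the exit status and the command output can be sketched as follows. This is only an illustration: `exists_rc` and `list_all_exts` are hypothetical helper names, and the sketch assumes file names without spaces or shell metacharacters, because they are pasted into the command string unquoted.

```shell
# Assumption: "$1" is a plain name with no spaces or shell
# metacharacters, because it is pasted into a command string unquoted.

# system() runs the command and hands back only its exit status.
exists_rc() {
    awk -v f="$1" 'BEGIN { rc = system("test -e " f); print rc }'
}

# "cmd | getline" reads the command output line by line instead.
list_all_exts() {
    awk -v base="$1" 'BEGIN {
        cmd = "ls -d " base ".*"      # the shell started by awk expands the glob
        while ((cmd | getline line) > 0)
            print "found: " line
        close(cmd)
    }'
}
```

For example, `list_all_exts ordenSiq` would print one `found:` line per file matching `ordenSiq.*` in the current directory.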
I think this is what you're trying to do:
thelast=$(find . -type f -name '*.pdf' -printf '%T@\t%p\n' | sort -n | cut -f2- | tail -n 1)
thename="${thelast%.pdf}"
allexts=( "$thename".* )
ls -lh "${allexts[@]}"
As for what's wrong with your original code:
% thelast=$(ls -A *.pdf | grep -v '^d' | tail -n 1 | awk '{print}'); thename=$(echo ${thelast} | awk '{split($0,a,"."); print a[1]}'); allexts=$(echo ${thename}'.*'); ls -lAh $allexts
It's trying to parse the output of ls, see https://mywiki.wooledge.org/ParsingLs
It's removing the names of files that start with d for unknown reasons.
It's not listing the files in time order
It's using awk '{print}' which does nothing but copy the input to the output.
It's got unquoted variables (copy/paste your code into http://shellcheck.net and it'll tell you about the basic issues)
echo ${thename}'.*' is leaving the part that needs to be double-quoted unquoted and then single-quoting the part that needs to be unquoted.
Using split() in awk as you are would corrupt file names that contain multiple .s.
There may be other issues that aren't as obvious, idk.
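On systems without GNU find's -printf, the same idea (pick the most recently modified .pdf, then list every file sharing its base name) can be sketched with a plain glob loop. This is a hedged sketch, not the answer's method: `newest_pdf` and `all_exts` are hypothetical helper names, and the `-nt` test operator is assumed to be available (it is widely supported but not required by POSIX).

```shell
# Print the most recently modified *.pdf in the current directory.
newest_pdf() {
    newest=
    for f in ./*.pdf; do
        [ -f "$f" ] || continue            # skips the literal pattern when nothing matches
        if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
            newest=$f
        fi
    done
    printf '%s\n' "$newest"
}

# Print every file that shares the base name of the given .pdf.
all_exts() {
    base=${1%.pdf}                         # strip only the final .pdf suffix
    set -- "$base".*                       # quoted variable, unquoted glob
    [ -e "$1" ] || return 1
    printf '%s\n' "$@"
}
```

A possible use is `all_exts "$(newest_pdf)"`. Quoting `"$base"` while leaving `.*` unquoted keeps names with spaces intact and still lets the shell expand the pattern.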

Converting output of ls -ltr to date format %m%d %H:%M

I am writing an awk script that gets a file's modification date and converts it, but I am having trouble converting the output of ls -lrt to the date format "month/day hour:minute".
My awk script:
awk 'BEGIN{
"ls -lrt "ARGV[1] "| awk '{\"print $6$7$8\" +\"%Y%m/%d %H:%M\"}'" | getline cdatefile1
}
{
print cdatefile1
}' file1
Assuming GNU ls, you want the --time-style option:
$ touch afile
$ ls -l afile
-rw-r--r-- 1 jackman jackman 0 Apr 2 08:54 afile
$ ls -l --time-style='+%m%d %H:%M' afile
-rw-r--r-- 1 jackman jackman 0 0402 08:54 afile
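If only the timestamp is needed, it may be simpler to skip ls entirely. The sketch below assumes GNU coreutils, where `date -r FILE` prints the file's modification time; `mtime_fmt` and `mtime_in_awk` are hypothetical helper names, and the awk variant assumes the file name contains no double quotes, backslashes, or dollar signs.

```shell
# Assumes GNU coreutils: `date -r FILE` prints the last modification
# time of FILE (BSD date interprets -r differently).
mtime_fmt() {
    date -r "$1" '+%m%d %H:%M'
}

# The same command read from inside awk with "cmd | getline".
mtime_in_awk() {
    awk -v f="$1" 'BEGIN {
        cmd = "date -r \"" f "\" \"+%m%d %H:%M\""
        if ((cmd | getline stamp) > 0)
            print stamp
        close(cmd)
    }'
}
```

Both helpers print the time in the local time zone, the same as ls does.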

GNU parallel used with xargs and awk

I have two large tab separated files A.tsv and B.tsv, they look like (the header is not in the file):
A.tsv:
ID AGE
User1 18
...
B.tsv:
ID INCOME
User4 49000
...
I want to select the list of IDs in A such that 10 <= AGE <= 20, and then select the rows in B that match that list. I want to use the GNU parallel tool. My attempt has two steps:
cat A.tsv | parallel --pipe -q awk '{ if ($3 >= 10 && $3 <= 20) print $1}' > list.tsv
cat list.tsv | parallel --pipe -q xargs -I% awk 'FNR==NR{a[$1];next}($1 in a)' % B.tsv > result.tsv
The first step works but the second one comes with error like:
awk: cannot open User1 (No such file or directory)
How can I fix this? Does this method work even if A.tsv and list.tsv are 2 to 3 times bigger than the memory?
$ for I in $(seq 8 2 22); do echo -e "User$I\t$I" >> A.txt; done; cat A.txt
User8 8
User10 10
User12 12
User14 14
User16 16
User18 18
User20 20
User22 22
$ for I in $(seq 8 2 22); do echo -e "User$I\t100${I}00" >> B.txt; done; cat B.txt
User8 100800
User10 1001000
User12 1001200
User14 1001400
User16 1001600
User18 1001800
User20 1002000
User22 1002200
$ cat A.txt | parallel --pipe -q awk '{if ($2 >= 10 && $2 <= 20) print $1}' > list.txt
$ cat B.txt | parallel --pipe -q grep -f list.txt
User10 1001000
User12 1001200
User14 1001400
User16 1001600
User18 1001800
User20 1002000
My solution:
It uses only xargs and awk, fits on one line with no intermediate file, and does not require installing a new tool:
awk '{if ($2 >= 10 && $2 <= 20) print $1}' A.tsv | xargs -I myItem awk --assign quebuscar=myItem '$1==quebuscar {print}' B.tsv
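The one-liner above starts a new awk over B.tsv for every selected ID. A variation that reads each file only once is sketched below, using the same FNR==NR idiom as the question. It is a sketch under assumptions: `select_income` is a hypothetical helper name, and both files are taken to be tab separated with the ID in column 1 and no header line.

```shell
# Hypothetical helper: $1 is the age file, $2 the income file, both
# tab separated with the ID in column 1 and no header line.
select_income() {
    awk -F'\t' '
        NR == FNR {                        # first file: remember wanted IDs
            if ($2 >= 10 && $2 <= 20) keep[$1]
            next
        }
        $1 in keep                         # second file: print matching rows
    ' "$1" "$2"
}
```

Called as `select_income A.tsv B.tsv`, it holds only the selected IDs in memory while the second file is streamed, so memory use grows with the number of matching IDs rather than with the size of B.tsv.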

Exclude characters using awk

I am trying to find a way to exclude numbers from a file when I cat it, but I only want to exclude the numbers in print $1 and keep the number that is in front of the word. I have something that I thought might work, but it is not quite giving me what I want. I have also shown an example of what the file looks like. The file is separated by pipes.
cat files | awk -F '|' ' {print $1 "\t" $2}' |sed 's/0123456789//g'
input:
b1ark45 | dog23 | brown
m2eow66| cat24 |yellow
h3iss67 | snake57 | green
Output
b1ark dog23
m2eow cat24
h3iss snake57
try this:
awk -F'|' -v OFS='|' '{gsub(/[0-9]/,"",$1)}7' file
the output of your example would be:
bark | dog23 | brown
meow| cat24 |yellow
hiss | snake57 | green
EDIT
This outputs col1 (without trailing numbers and spaces) and col2, separated by <tab>:
kent$ echo "b1ark45 | dog23 | brown
m2eow66| cat24 |yellow
h3iss67 | snake57 | green"|awk -F'|' -v OFS='\t' '{gsub(/[0-9]*\s*$/,"",$1);print $1,$2}'
b1ark dog23
m2eow cat24
h3iss snake57
This might work for you (GNU sed):
sed -r 's/[0-9]*\s*\|\s*(\S*).*/ \1/' file
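The `\s` and `\S` escapes used above are GNU extensions. A more portable variation can be sketched with plain bracket expressions. The helper name `strip_first_col` is hypothetical, and the sketch assumes pipe-separated input with at least two columns.

```shell
# Assumes pipe-separated input with at least two columns.
strip_first_col() {
    awk -F'|' -v OFS='\t' '{
        sub(/[0-9]*[ \t]*$/, "", $1)        # drop trailing digits and blanks from column 1
        gsub(/^[ \t]+|[ \t]+$/, "", $2)     # trim blanks around column 2
        print $1, $2
    }' "$@"
}

printf '%s\n' 'b1ark45 | dog23 | brown' 'm2eow66| cat24 |yellow' |
    strip_first_col
```

Unlike the versions above, this also trims the blanks around the second column, so the output contains exactly one tab between the two values.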

grep and awk parse line

I have a line that looks like:
Feb 21 1:05:14 host kernel: [112.33000] SRC=192.168.0.1 DST=90.90.90.90 PREC=0x40 TTL=51 ....
I would like to get a list of unique IPs from SRC=
How can I do this? Thanks
This will work, although you could probably simplify it further in a single awk script if you wanted:
awk '{print $7}' <your file> | awk -F= '{print $2}' | sort -u
grep -o 'SRC=\([^ ]\+\)' | cut -d= -f2 | sort -u
cat thefile | grep SRC= | sed -r 's/^.*SRC=([^ ]+).*$/\1/' | sort | uniq
This awk script will do:
{a[$7]=1}
END{for (i in a) print i}
This will print the IP addresses in sorted order without the "SRC=" string (asort requires GNU awk):
awk '{a[$7] = $7} END {n = asort(a); for (i = 1; i <= n; i++) {split(a[i], b, "="); print b[2]}}' inputfile
Example output:
192.168.0.1
192.168.0.2
192.168.1.1
grep -Po "SRC=(.[^\s]*)" file | sed 's/SRC=//' | sort -u
Ruby(1.9+)
ruby -ne 'puts $_.scan(/SRC=(.[^\s]*)/)[0] if /SRC=/' file| sort -u
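Several of the answers above rely on SRC= always being the seventh field. A single awk pass that searches every field instead can be sketched as follows. This is an illustration only: `unique_src` is a hypothetical helper name, and the output is in first-seen order rather than sorted.

```shell
# Print each distinct SRC= address once, in order of first appearance.
unique_src() {
    awk '{
        for (i = 1; i <= NF; i++)
            if (index($i, "SRC=") == 1) {   # field starts with SRC=
                ip = substr($i, 5)          # text after the 4-character prefix
                if (!seen[ip]++)
                    print ip
            }
    }' "$@"
}
```

It reads the named files, or standard input when none are given, so both `unique_src logfile` and `grep kernel logfile | unique_src` are possible uses. Pipe the result through `sort` if sorted output is wanted.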