I have CSV files in which, between commas, every f needs to become 0 and every t needs to become 1, in every CSV file that matches. From:
,t,t,f,f,a,t,f,t,f,f,t,f,
tftf
to:
,1,1,0,0,a,1,0,1,0,0,1,0,
tftf
It works this way, but I want to know a better way that reduces the time the replacement takes:
for i in 1 2 3 4 5 6
do
echo "converting tables for mariaDB"
find ./ -type f -name "*.csv" -print0 | xargs -0 sed -i 's/\,t\,/\,1\,/g'
find ./ -type f -name "*.csv" -print0 | xargs -0 sed -i 's/\,f\,/\,0\,/g'
echo "$i time(s) changed "
done
I expect a single command to change the lines.
Could you please try the following. It is not a perfect solution, but it is the simplest one in case you don't have a recent gawk, where the -i inplace edit option is present. Note that the gsub calls are repeated on purpose: a single gsub skips overlapping matches such as ,t,t,, and the second pass picks up the occurrences the first pass skipped.
for file in *.csv
do
  awk '{gsub(/,t,/,",1,");gsub(/,f,/,",0,");gsub(/,t,/,",1,");gsub(/,f,/,",0,")} 1' "$file" > temp && mv temp "$file"
done
OR
for file in *.csv
do
  awk -v t_val="1" -v f_val="0" 'BEGIN{FS=OFS=","}{for(i=2;i<NF;i++){$i=($i=="t"?t_val:$i=="f"?f_val:$i)}} 1' "$file" > temp && mv temp "$file"
done
2nd solution: using a recent gawk, where we can save the edits into the input files themselves.
gawk -i inplace '{gsub(/,t,/,",1,");gsub(/,f,/,",0,");gsub(/,t,/,",1,");gsub(/,f,/,",0,")} 1' *.csv
OR
gawk -i inplace -v t_val="1" -v f_val="0" 'BEGIN{FS=OFS=","}{for(i=2;i<NF;i++){$i=($i=="t"?t_val:$i=="f"?f_val:$i)}} 1' *.csv
The main problem, in this case, is that a regular expression does not allow overlap when parsing it with sed 's/ere/str/g' or awk '{gsub(ere,str,$0)}'. This comment nicely explains how you can circumvent this in sed using the t<label> command, which means: if a change happened to the pattern space, move to <label>. The comment shows a generic way of doing it. The awk alternative to this rule would be:
$ awk '{while(match($0,ere)) gsub(ere,str)}1'
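Applied to the OP's substitutions, the sed version of that loop could be sketched like this: both substitutions are re-run until neither of them matches anymore, so the occurrences skipped because of overlap get picked up on the next pass.
$ sed -e ':a' -e 's/,t,/,1,/g' -e 's/,f,/,0,/g' -e 'ta' file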
An alternative sed solution in the case of the OP's example could use the following idea:
1. Duplicate all commas. Since we are searching for strings of the form ",t,", this duplication avoids overlap when using the s command.
2. Since no overlap is possible now, replace all ",f," with ",0," and all ",t," with ",1,".
3. Revert all duplicated commas again. As no overlap is allowed, a sequence like ,,,, is nicely converted to ,, and not to ,
In POSIX sed this looks like:
$ sed -e 's/,/,,/g' -e 's/,f,/,0,/g' \
-e 's/,t,/,1,/g' -e 's/,,/,/g' file > file.tmp
$ mv file.tmp file
With GNU sed we can do it in one go:
$ sed -i 's/,/,,/g;s/,f,/,0,/g;s/,t,/,1,/g;s/,,/,/g' file
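As a quick check, the OP's sample line converts as expected:
$ echo ',t,t,f,f,a,t,f,t,f,f,t,f,' | sed 's/,/,,/g;s/,f,/,0,/g;s/,t,/,1,/g;s/,,/,/g'
,1,1,0,0,a,1,0,1,0,0,1,0,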
With awk, this would look like:
$ awk 'BEGIN{FS=",";OFS=FS FS}
{$1=$1;gsub(/,f,/,",0,");gsub(/,t,/,",1,");gsub(OFS,FS)}1' file > file.tmp
$ mv file.tmp file
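The same check works for the awk version:
$ echo ',t,t,f,f,a,t,f,t,f,f,t,f,' | awk 'BEGIN{FS=",";OFS=FS FS}
{$1=$1;gsub(/,f,/,",0,");gsub(/,t,/,",1,");gsub(OFS,FS)}1'
,1,1,0,0,a,1,0,1,0,0,1,0,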
If I use cp inside a bash script, the copied file has weird characters around the destination filename.
The destination name comes from the result of an operation; it is put inside a variable, and echoing the variable shows normal output.
The objective is to name a file after a string.
#!/bin/bash
newname=`cat outputfile | grep 'hostname ' | sed 's/hostname //g'`
newecho=`echo $newname`
echo $newecho
cp outputfile "$newecho"
If I launch the script, the echo looks OK:
$ ./rename.sh
mo-swc-56001
However, the file is named differently:
~$ ls
'mo-swc-56001'$'\r'
As you can see, the filename contains extra characters which the echo does not show.
Edit: the line terminators of the file are like this:
# file outputfile
outputfile: ASCII text, with CRLF, CR line terminators
I tried every possible way to get rid of the ^M character; this is one example of the hundreds of attempts:
# cat outputfile | grep 'hostname ' | sed 's/hostname //g' | cat -v
mo-swc-56001^M
# cat outputfile | grep 'hostname ' | sed 's/hostname //g' | cat -v | sed 's/\r//g' | cat -v
mo-swc-56001^M
The carriage return just stays there. Any ideas?
Edit: crazy, the only way is to perform a dos2unix on the output...
It looks like your outputfile has \r characters in it, so you could add logic to remove them first and then give it a try:
#!/bin/bash
## remove the control-M (\r) characters from outputfile first with tr
tr -d '\r' < outputfile > temp && mv temp outputfile
newname=$(sed -n 's/hostname //p' outputfile)
echo "$newname"
cp outputfile "$newname"
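As an aside, a likely reason the earlier sed 's/\r//g' attempts appeared to do nothing is that not every sed understands the \r escape (BSD/macOS sed treats it as a literal r), which is why tr -d '\r' and dos2unix are the portable choices. And if you only need to clean the variable rather than rewrite the file, a sketch using bash parameter expansion:
newname=$(sed -n 's/hostname //p' outputfile)
newname=${newname//$'\r'/}    # delete any carriage returns from the value
cp outputfile "$newname"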
The only way was to use dos2unix
I want to get the filename from a long string in a shell script. After reading some examples from likegeeks.com, I wrote a simple solution:
#!/bin/bash
cdnurl="http://download.example.com.cn/download/product/vpn/rules/vpn_patch_20190218162130_sign.pkg?wsSecret=9cadeddedfr7bb85a20a064510cd3f353&wsABSTime=5c6ea1e7"
echo ${cdnurl}
url=`echo ${cdnurl} | awk -F'/' '{ print $NF }'`
result=`echo ${url} | awk -F '?' '{ print $1}'`
echo ${url}
echo ${result}
I just want to get vpn_patch_20190218162130_sign.pkg, and it does. I wonder whether there is a smarter way (maybe one line).
Also, what if the character after pkg is not a question mark? How could I use pkg itself to get the filename? I am not sure there is always a ? after pkg, but the filename is always *.pkg.
You can try this; it is more robust than the second awk command below:
echo "$cdnurl"|awk -v FS='/' '{gsub(/?.*/,"",$NF);print $NF}'
vpn_patch_20190218162130_sign.pkg
# less robust: relies on the URL always containing a ?
echo "$cdnurl"|awk -v FS='[?/]' '{print $(NF-1)}'
You should use sed:
echo "$cdnurl" | sed -r 's|.*/(.*\.pkg).*|\1|'
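If you want to avoid external processes entirely, bash parameter expansion can do both cuts; a sketch using the OP's cdnurl variable:
file=${cdnurl##*/}      # strip everything up to and including the last /
result=${file%%\?*}     # strip the first ? and everything after it
echo "$result"          # vpn_patch_20190218162130_sign.pkg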
I am not much of an awk user, but after some Googling I determined it would work best for what I am trying to do. The only problem is, I can't get it to work. I'm trying to print out the contents of sudoers while inserting the server name ($i) and a comma before each sudoers entry, since I'm directing the output to a .csv file.
egrep '^[aA-zZ]|^[%]' //$i/etc/sudoers | awk -v var="$i" '{print "$var," $0}' | tee -a $LOG
This is the output that I get:
$var,unixpvfn ALL = (root)NOPASSWD:/usr/bin/passwd
awk: no program given
Thanks in advance
egrep is superfluous here. Just awk:
awk -v var="$i" '/^[[:alpha:]%]/{print var","$0}' //"$i"/etc/sudoers | tee -a "$LOG"
Btw, you may also use sed:
sed "/^[[:alpha:]%]/s/^/${i},/" //"$i"/etc/sudoers | tee -a "$LOG"
You can save the grep and let awk do all the work:
awk -v svr="$i" '/^[aA-zZ%]/{print svr "," $0}' //$i/etc/sudoers
| tee -a $LOG
If you put things between ".." in awk, it is a literal string, and variables are not expanded inside it. Also, don't put $ before a variable name: in awk, $ selects a field by number, not the variable you meant.
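A quick illustration of both points, using a hypothetical server name srv1:
$ echo 'entry' | awk -v var="srv1" '{print "$var," $0}'
$var,entry
$ echo 'entry' | awk -v var="srv1" '{print var "," $0}'
srv1,entry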
How would I go about converting the following bash line into Perl? Could I run it via the system() command, or is there a better way? I'm looking for Perl to print out accesses per day from my Apache access_log file.
In bash:
awk '{print $4}' /etc/httpd/logs/access_log | cut -d: -f1 | uniq -c
Prints the following:
632 [27/Apr/2014
156 [28/Apr/2014
Your whole pipeline can be replaced with a single Perl one-liner:
perl -lane'
($val) = split /:/, $F[3]; # First colon-separated elem of the 4th field
++$c{$val}; # Increment number of occurrences of val
END { print for map { "$c{$_} $_" } keys %c } # Print results in no order
' access.log
Switches:
-l automatically appends a newline to the print statement.
-l also removes the newlines from lines read by -n (and -p).
-a splits the line on whitespace into the array @F.
-n loops over the lines of the input but does not print each line.
-e execute the given script body.
Your original command translated to a Perl one-liner:
perl -lane '($k) = $F[3] =~ /^(.*?):/; $h{$k}++ }{ print "$h{$_}\t$_" for keys %h' /etc/httpd/logs/access_log
You can combine all your commands into one, going from:
awk '{print $4}' /etc/httpd/logs/access_log | cut -d: -f1 | uniq -c
to
awk '{split($4,a,":");b[a[1]]++} END {for (i in b) print b[i],i}' /etc/httpd/logs/access_log