I have CSV files in which f must be changed to 0 and t to 1, but only where the letter stands alone between commas. This should be done in every CSV file that contains a match. From:
,t,t,f,f,a,t,f,t,f,f,t,f,
tftf
to:
,1,1,0,0,a,1,0,1,0,0,1,0,
tftf
It works this way, but I would like to know a better way that reduces the time spent on replacing:
for i in 1 2 3 4 5 6
do
echo "converting tables for mariaDB"
find ./ -type f -name "*.csv" -print0 | xargs -0 sed -i 's/\,t\,/\,1\,/g'
find ./ -type f -name "*.csv" -print0 | xargs -0 sed -i 's/\,f\,/\,0\,/g'
echo "$i time(s) changed "
done
I expect a single command to change the line.
Could you please try the following. It is not a perfect solution, but it is the simplest one to use if you don't have a recent version of gawk with the -i inplace option.
for file in *.csv
do
awk '{gsub(/,t,/,",1,");gsub(/,f,/,",0,");gsub(/,t,/,",1,");gsub(/,f,/,",0,")} 1' "$file" > temp && mv temp "$file"
done
OR
for file in *.csv
do
awk -v t_val="1" -v f_val="0" 'BEGIN{FS=OFS=","}{for(i=2;i<NF;i++){$i=($i=="t"?t_val:$i=="f"?f_val:$i)}} 1' "$file" > temp && mv temp "$file"
done
2nd solution: using a recent version of gawk, which can save the edits into Input_file itself.
gawk -i inplace '{gsub(/,t,/,",1,");gsub(/,f,/,",0,");gsub(/,t,/,",1,");gsub(/,f,/,",0,")} 1' *.csv
OR
gawk -i inplace -v t_val="1" -v f_val="0" 'BEGIN{FS=OFS=","}{for(i=2;i<NF;i++){$i=($i=="t"?t_val:$i=="f"?f_val:$i)}} 1' Input_file
The main problem here is that matches cannot overlap when a regular expression is applied globally with sed 's/ere/str/g' or awk '{gsub(ere,str,$0)}'. This comment nicely explains how you can circumvent this in sed using the t<label> command, which means: if a substitution changed the pattern space, branch to <label>. The comment shows a generic way of doing it. The awk alternative to this rule would be (the trailing 1 prints the modified line):
$ awk '{while(match($0,ere)) gsub(ere,str)} 1'
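As a concrete illustration of the loop above, here is a minimal sketch using the sample line from the question. The variable names single and looped are only for this example. It compares one gsub pass with a loop that repeats until nothing matches any more. The loop terminates because the replacements contain neither t nor f.

```shell
# Sample line taken from the question.
line=',t,t,f,f,a,t,f,t,f,f,t,f,'

# One pass only: matches cannot overlap, so some fields are left unconverted.
single=$(printf '%s\n' "$line" |
  awk '{gsub(/,t,/,",1,"); gsub(/,f,/,",0,")} 1')

# Repeat the substitutions until no ",t," or ",f," is left.
looped=$(printf '%s\n' "$line" |
  awk '{while (match($0, /,[tf],/)) {gsub(/,t,/,",1,"); gsub(/,f,/,",0,")}} 1')

printf 'single pass: %s\nlooped:      %s\n' "$single" "$looped"
```

The looped result should match the desired output from the question, while the single pass still contains some t and f fields.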
An alternative sed solution in the case of the OP's example could use the following idea:
Duplicate all commas. Since we are searching for strings of the form ",t,", this duplication avoids overlap when using the s command.
Since no overlap is possible any more, replace all ",f," with ",0," and all ",t," with ",1,".
Revert all duplicated commas. As no overlap is allowed, sequences like ,,,, will be nicely converted to ,, and not to a single ,
In POSIX sed this looks like:
$ sed -e 's/,/,,/g' -e 's/,f,/,0,/g' \
-e 's/,t,/,1,/g' -e 's/,,/,/g' file > file.tmp
$ mv file.tmp file
With GNU sed we can do it in one go:
$ sed -i 's/,/,,/g;s/,f,/,0,/g;s/,t,/,1,/g;s/,,/,/g' file
With awk, this would look like:
$ awk 'BEGIN{FS=",";OFS=FS FS}
{$1=$1;gsub(/,f,/,",0,");gsub(/,t,/,",1,");gsub(OFS,FS)}1' file > file.tmp
$ mv file.tmp file
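To connect this back to the original question, which ran find and sed twelve times over all files, the one-pass GNU sed command can be applied to every CSV file in a single find call. This is a sketch only: it assumes GNU sed (for -i without a suffix), and the scratch directory and sample.csv are hypothetical so that nothing real is touched.

```shell
# Scratch directory with a hypothetical sample file from the question's data.
workdir=$(mktemp -d)
printf ',t,t,f,f,a,t,f,t,f,f,t,f,\ntftf\n' > "$workdir/sample.csv"

# One pass over every CSV file; "{} +" hands many files to each sed call.
find "$workdir" -type f -name '*.csv' \
  -exec sed -i 's/,/,,/g;s/,f,/,0,/g;s/,t,/,1,/g;s/,,/,/g' {} +

result=$(cat "$workdir/sample.csv")
printf '%s\n' "$result"
```

The line tftf should stay untouched because it contains no commas.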
If I use cp inside a bash script, the copied file has weird characters around the destination filename.
The destination name comes from the results of an operation, it's put inside a variable, and echoing the variable shows normal output.
The objective is to name a file after a string.
#!/bin/bash
newname=`cat outputfile | grep 'hostname ' | sed 's/hostname //g'`
newecho=`echo $newname`
echo $newecho
cp outputfile "$newecho"
If I launch the script the echo looks ok
$ ./rename.sh
mo-swc-56001
However the file is named differently
~$ ls
'mo-swc-56001'$'\r'
As you can see, the file name contains extra characters which the echo does not show.
Edit: the newline of the file is like this
# file outputfile
outputfile: ASCII text, with CRLF, CR line terminators
I tried in every possible way to get rid of the ^M character; this is one example of the hundreds of attempts
# cat outputfile | grep 'hostname ' | sed 's/hostname //g' | cat -v
mo-swc-56001^M
# cat outputfile | grep 'hostname ' | sed 's/hostname //g' | cat -v | sed 's/\r//g' | cat -v
mo-swc-56001^M
This newline will stay there. Any ideas?
Edit: crazy, the only way is to perform a dos2unix on the output...
It looks like your outputfile has \r characters in it, so you could add logic to remove them first and then try again.
#!/bin/bash
##remove control M first from outputfile by tr command.
tr -d '\r' < outputfile > temp && mv temp outputfile
newname=$(sed -n 's/hostname //gp' outputfile)
newecho=`echo $newname`
echo $newecho
cp outputfile "$newecho"
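If rewriting outputfile is not wanted, the carriage returns can also be dropped inside the pipeline. Note that in the attempt shown in the question, sed ran after cat -v, which had already turned the carriage return into the two visible characters ^M, so there was no \r left for sed to match. The following is a minimal sketch; the temporary file and its contents are a hypothetical stand-in for outputfile.

```shell
# Hypothetical stand-in for outputfile, written with CRLF line endings.
tmpfile=$(mktemp)
printf 'version 1\r\nhostname mo-swc-56001\r\n' > "$tmpfile"

# Delete the carriage returns first, then extract the name.
newname=$(tr -d '\r' < "$tmpfile" | sed -n 's/hostname //p')

# cat -v would show ^M here if a carriage return had survived.
printf '%s\n' "$newname" | cat -v
```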
The only way was to use dos2unix
I have a variable and that variable only needs a '\' in front of it.
I would say that the sed command is the ideal tool for it?
I tried using single quotes, double quotes, multiple variables, combination of variables, ...
I don't get an error, but the end result is not what I need it to be
FOLDER=$(echo `cat file.XML | grep "Value" | cut -d \" -f2`)
echo $FOLDER
sed -i "s#"$FOLDER"#"\\$FOLDER"#g" ./file.XML
echo $FOLDER
After execution, I get
$ ./script.sh
b4c17422-1365-4fbe-bccd-04e0d7dbb295
b4c17422-1365-4fbe-bccd-04e0d7dbb295
Eventually I need to have a result like
$ ./script.sh
b4c17422-1365-4fbe-bccd-04e0d7dbb295
\b4c17422-1365-4fbe-bccd-04e0d7dbb295
Fixed thanks to the input of Cyrus and Ed Morton.
FOLDER=$(echo `cat file.XML | grep "Value" | cut -d \" -f2`)
NEW_FOLDER="\\$FOLDER"
sed -i "s#$FOLDER#\\$NEW_FOLDER#g" ./file.XML
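An alternative sketch, not the fix the poster used: sed's & in the replacement stands for the matched text, so a second variable is not strictly needed. Inside double quotes the shell reduces \\\\ to \\, which sed then reads as one literal backslash. The xml_line content below is hypothetical.

```shell
folder='b4c17422-1365-4fbe-bccd-04e0d7dbb295'
# Hypothetical line standing in for the content of file.XML.
xml_line="<Setting Value=\"$folder\"/>"

# "\\\\&" reaches sed as \\& : a literal backslash followed by the match.
escaped=$(printf '%s\n' "$xml_line" | sed "s#$folder#\\\\&#g")
printf '%s\n' "$escaped"
```

This assumes the value contains no # and no characters that are special in a regular expression, which holds for the UUID shown above.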
I am not much of an awk user, but after some Googling, determined it would work best for what I am trying to do...only problem is, I can't get it to work. I'm trying to print out the contents of sudoers while inserting the server name ($i) and a comma before the sudoers entry as I'm directing it to a .csv file.
egrep '^[aA-zZ]|^[%]' //$i/etc/sudoers | awk -v var="$i" '{print "$var," $0}' | tee -a $LOG
This is the output that I get:
$var,unixpvfn ALL = (root)NOPASSWD:/usr/bin/passwd
awk: no program given
Thanks in advance
egrep is superfluous here. Just awk:
awk -v var="$i" '/^[[:alpha:]%]/{print var","$0}' //"$i"/etc/sudoers | tee -a "$LOG"
Btw, you may also use sed:
sed "/^[[:alpha:]%]/s/^/${i},/" //"$i"/etc/sudoers | tee -a "$LOG"
You can save the grep and let awk do all the work:
awk -v svr="$i" '/^[[:alpha:]%]/{print svr "," $0}' //"$i"/etc/sudoers | tee -a "$LOG"
Anything you put between double quotes in awk is a literal string, so a variable name inside "..." is not expanded. Also, don't put $ before a variable name: in awk, $ refers to a field (column), not to the variable you meant.
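The difference can be seen side by side in this small sketch. The server name web01 is a made-up value; the input line is the one from the question's output.

```shell
server='web01'   # hypothetical server name
line='unixpvfn ALL = (root)NOPASSWD:/usr/bin/passwd'

# Inside awk's double quotes, $var is just the literal text "$var".
wrong=$(printf '%s\n' "$line" | awk -v var="$server" '{print "$var," $0}')

# Outside the quotes, var is the awk variable set with -v.
right=$(printf '%s\n' "$line" | awk -v var="$server" '{print var "," $0}')

printf '%s\n%s\n' "$wrong" "$right"
```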
I have a bash script that updates a table based on a file. As written, it opens and closes a database connection for every line in the file, and I would like to understand how to open it once, perform all the updates, and then close it. The current approach is fine for a few updates, but if it ever requires more than a few hundred it could be really taxing on the system.
#!/bin/bash
file=/export/home/dncs/tmp/file.csv
dateFormat=$(date +"%m-%d-%y-%T")
LOGFILE=/export/home/dncs/tmp/Log_${dateFormat}.log
echo "${dateFormat} : Starting work" >> $LOGFILE 2>&1
while IFS="," read mac loc; do
if [[ "$mac" =~ ^([0-9a-fA-F]{2}:){5}[0-9a-fA-F]{2}$ ]]; then
dbaccess thedb <<EndOfUpdate >> $LOGFILE 2>&1
UPDATE profile
SET local_code= '$loc'
WHERE mac_address = '$mac';
EndOfUpdate
else
echo "Error: $mac not valid format" >> $LOGFILE 2>&1
fi
IIH -i $mac >> $LOGFILE 2>&1
done <"$file"
Source File.
12:BF:20:04:BB:30,POR-4
12:BF:21:1C:02:B1,POR-10
12:BF:20:04:72:FD,POR-4
12:BF:20:01:5B:4F,POR-10
12:BF:20:C2:71:42,POR-7
This is more or less what I'd do:
#!/bin/bash
fmt_date() { date +"%Y-%m-%d.%T"; }
file=/export/home/dncs/tmp/file.csv
dateFormat=$(fmt_date)
LOGFILE="/export/home/dncs/tmp/Log_${dateFormat}.log"
exec >> $LOGFILE 2>&1
echo "${dateFormat} : Starting work"
valid_mac='/^\(\([0-9a-fA-F]\{2\}:\)\{5\}[0-9a-fA-F]\{2\}\),\([^,]*\)$/'
update_stmt="UPDATE profile SET local_code = '\3' WHERE mac_address = '\1';"
sed -n -e "$valid_mac s//$update_stmt/p" "$file" |
dbaccess thedb -
sed -n -e "$valid_mac d; s/.*/Error: invalid format: &/p" "$file"
sed -n -e "$valid_mac s//IIH -i \1/p" "$file" | sh
echo "$(fmt_date) : Finished work"
I changed the date format to a variant of ISO 8601; it is easier to parse. You can stick with your Y2K-non-compliant US-ish format if you prefer. The exec line arranges for standard output and standard error from there onwards to go to the log file. The sed commands all use the same structure, and all use the same pattern match stored in a variable, which makes consistency easier. The first sed script converts the data into UPDATE statements (which are fed to dbaccess). The second script identifies invalid MAC addresses; it deletes valid ones and maps the invalid lines into error messages. The third script ignores invalid MAC addresses but generates an IIH command for each valid one. The script also records an end time, which lets you assess how long the processing takes. Again, repetition is avoided by creating and using the fmt_date function.
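The effect of the exec line can be tried in isolation. This is a small sketch with a hypothetical temporary log file; the subshell only keeps the redirection from affecting the rest of the session.

```shell
logfile=$(mktemp)   # hypothetical log location

(
  # From here on, both streams of this subshell are appended to the log.
  exec >> "$logfile" 2>&1
  echo "written to standard output"
  echo "written to standard error" >&2
)

logged=$(cat "$logfile")
printf '%s\n' "$logged"
```

Both messages should end up in the log file without any redirection on the individual echo commands.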
Be cautious about testing this. I had a file data containing:
87:36:E6:5E:AC:41,loc-OYNK
B2:4D:65:70:32:26,loc-DQLO
ZD:D9:BA:34:FD:97,loc-PLBI
04:EB:71:0D:29:D0,loc-LMEE
DA:67:53:4B:EC:C4,loc-SFUU
I replaced dbaccess with cat, and sh with cat. I also relocated the log file to the current directory, leading to:
#!/bin/bash
fmt_date() { date +"%Y-%m-%d.%T"; }
#file=/export/home/dncs/tmp/file.csv
file=data
dateFormat=$(fmt_date)
#LOGFILE="/export/home/dncs/tmp/Log_${dateFormat}.log"
LOGFILE="Log-${dateFormat}.log"
exec >> $LOGFILE 2>&1
echo "${dateFormat} : Starting work"
valid_mac='/^\(\([0-9a-fA-F]\{2\}:\)\{5\}[0-9a-fA-F]\{2\}\),\([^,]*\)$/'
update_stmt="UPDATE profile SET local_code = '\3' WHERE mac_address = '\1';"
sed -n -e "$valid_mac s//$update_stmt/p" "$file" |
cat
#dbaccess thedb -
sed -n -e "$valid_mac d; s/.*/Error: invalid format: &/p" "$file"
#sed -n -e "$valid_mac s//IIH -i \1/p" "$file" | sh
sed -n -e "$valid_mac s//IIH -i \1/p" "$file" | cat
echo "$(fmt_date) : Finished work"
After I ran it, the log file contained:
2017-04-27.14:58:20 : Starting work
UPDATE profile SET local_code = 'loc-OYNK' WHERE mac_address = '87:36:E6:5E:AC:41';
UPDATE profile SET local_code = 'loc-DQLO' WHERE mac_address = 'B2:4D:65:70:32:26';
UPDATE profile SET local_code = 'loc-LMEE' WHERE mac_address = '04:EB:71:0D:29:D0';
UPDATE profile SET local_code = 'loc-SFUU' WHERE mac_address = 'DA:67:53:4B:EC:C4';
Error: invalid format: ZD:D9:BA:34:FD:97,loc-PLBI
IIH -i 87:36:E6:5E:AC:41
IIH -i B2:4D:65:70:32:26
IIH -i 04:EB:71:0D:29:D0
IIH -i DA:67:53:4B:EC:C4
2017-04-27.14:58:20 : Finished work
The UPDATE statements would have gone to DB-Access. The bogus MAC address was identified. The correct IIH commands would have been run.
Note that piping the output into sh requires confidence that the data you generate (the IIH commands) will be clean.
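If that confidence is lacking, one possible variation is to skip sh and pass each MAC address as a plain argument instead. The sketch below is an illustration under assumptions, not part of the script above: run_iih is a hypothetical stand-in that only prints what would be run, and the data comes from the test file shown earlier.

```shell
# Hypothetical stand-in for the real IIH command, so the sketch runs anywhere.
run_iih() { printf 'would run: IIH -i %s\n' "$1"; }

data='87:36:E6:5E:AC:41,loc-OYNK
ZD:D9:BA:34:FD:97,loc-PLBI
04:EB:71:0D:29:D0,loc-LMEE'

# Keep only lines with a valid MAC, then hand each MAC over as an argument.
# Nothing read from the data is ever interpreted as shell code.
result=$(printf '%s\n' "$data" |
  grep -E '^([0-9a-fA-F]{2}:){5}[0-9a-fA-F]{2},[^,]*$' |
  while IFS=, read -r mac loc; do
    run_iih "$mac"
  done)

printf '%s\n' "$result"
```

The trade-off is one loop iteration per line, which is fine for the IIH calls since they are separate processes anyway.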