Display an octal value without it being interpreted (with tr?) - scripting

I have
./script "test\42"
Here is an example script:
#!/bin/sh
echo "$1"
It gives me:
test" (\42 being interpreted as an octal ASCII value; octal 42 = ")
How can I display
test\42 (instead of test")?
Can this be done with the help of tr (the Unix translate command)?

Try either using single quotes:
./script 'test\42'
or escaping the backslash:
./script "test\\42"

Related

Substituting Variable in sed command

I have ./cpptest.sh to which I am passing a command line parameter
For example:
$./testcps.sh /srv/repository/Software/Wind_1.0.2/
The above command line parameter is stored in the variable $1.
When I echo $1, the output is correct (the path).
Actual issue...
There is another file, say abc.properties. In this file there is a key-value entry, something like location.1=stg_area.
I want to replace 'stg_area' with the value stored in $1 (the path), so that after the substitution it reads location.1=/srv/repository/Software/Wind_1.0.2/
To achieve this I have tried all the options below with sed, and none of them worked:
sed -i "s/stg_area/$1/" /srv/ppc/abc.properties //output is sed: -e expression #1, char 17: unknown option to `s'
sed -i 's/stg_area/'"$1'"/' /srv/ppc/abc.properties //output is sed: -e expression #1, char 18: unknown option to `s'
sed -i s/stg_area/$1/ /srv/ppc/abc.properties //output is sed: -e expression #1, char 17: unknown option to `s'
I think I have tried all possible ways... Any answer on this is appreciated. Thanks in advance.
You know that sed is using / as a special separator in the command s/pattern/replacement/, right? You've used it yourself in the question.
So obviously there's a problem when you have a replacement string containing /, like yours does.
As the documentation says:
The / characters may be uniformly replaced by any other single character within any given s command. The / character (or whatever other character is used in its stead) can appear in the regexp or replacement only if it is preceded by a \ character.
So the two available solutions are:
use a different separator for the s command, such as
s#stg_area#$1#
(although you still need to check there are no # characters in the replacement string)
sanitize the replacement string so it doesn't contain any special characters (either /, or sequences like \1, or anything else sed treats as special), for example by escaping them with \
sanitized=$(sed 's#/#\\/#g' <<< $1)
(and then use "$sanitized" instead of $1 in your sed script)
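Putting the first option together with the path from the question (which contains no # characters), a minimal sketch would be:
sed -i "s#stg_area#$1#" /srv/ppc/abc.properties
The double quotes still let the shell expand $1, while using # as the separator keeps the slashes in the path from terminating the s command early.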

Using awk to find special characters in a txt file

I need to scan a file with many different special characters and values.
Given a special character sequence, I need to provide the value next to it:
547 %$
236 \"
4523 &*
8876 (*
8756 "/
...
I am using an awk command with gsub in order to find the sequences as they are.
awk -v st="$match_string" 'BEGIN {gsub(/(\[|\]|\-|\$|\*|\:|\+|\"|\(|\))/,"\\\\&", st)} match($0,st) {print;exit}' file.txt
The command works great e.g.
> (*
>> 8876 (*
However, I am having trouble using the command to locate the \" sequence.
I have tried adding different strings to the gsub to represent the sequence:
|\\|
|\\\\|
|\\\\"|
...
But the result is always:
> \"
>> 8756 "/
while the result I am looking for would be:
> \"
>> 236 \"
It seems that the gsub does not work, and the \" is interpreted just as "
Any ideas?
Following is a short script to run:
- it should find the symbol attached to the value in first_num
- next, it should print the first value in the file attached to the symbol found
first_num=$1
echo "looking for : $first_num"
sym_to_check=$(awk -v s="$first_num" '$0~s {if ($0~s)print $2}' temp.txt)
echo "symbol - $sym_to_check"
first_val=$(awk -v s="$sym_to_check" 'BEGIN {gsub(/(\[|\]|\-|\$|\^|\*|\:|\+|\"|\(|\))/,"\\\\&",s)} $0~s {if ($0~s)print; if ($0~s)exit}' temp.txt)
echo "first val- $first_val"
suppose the txt file is:
547 %$
111 [*
222 ()
5655 (*
454 )"
35 #!
743 \"
657 #!
236 \"
4523 &*
8876 (*
456 \"
8756 "/
first run is good:
> bash temp1.sh 8876
looking for : 8876
symbol - (*
first val- 5655 (*
the script finds the first value attached to (*
but the next run is bad:
> bash temp1.sh 236
looking for : 236
symbol - \"
first val- 454 )"
The symbol is correct (it is looking for \"), but when searching for the first value attached to it, the script matches the first symbol that merely contains ".
This gives the value 454 )" instead of the desired 743 \".
The way you're initializing the awk variable st using -v st="$match_string" expands escape sequences by design (so a \t in "$match_string" would become a literal tab char in st, for example). On top of that, you're using a regexp operator, match(), while trying to escape the regexp metachars so it behaves like string matching, and you're doing partial matching on the whole line (e.g. $0~85 would match 1853) instead of full matching on a specific field ($1==85).
Here's how to initialize awk variables from the shell without interpreting escape sequences, and then test them as full-match literal strings or numbers on a specific field, rather than as partial-match regexps across the whole line:
$ match_string='\"'
$ st="$match_string" awk 'BEGIN{st=ENVIRON["st"]} $2==st{print; exit}' file
743 \"
$ awk 'BEGIN{st=ARGV[1]; ARGV[1]=""} $2==st{print; exit}' "$match_string" file
743 \"
$ awk 'BEGIN{st=ARGV[1]; ARGV[1]=""} $1==st{print; exit}' '743' file
743 \"
Not all awks support ENVIRON[], so the first approach won't work everywhere, but the second will.
See How do I use shell variables in an awk script? for how to set awk variables from the shell. When you want to do literal string comparisons, it's usually simpler to use string operators like == and index() than to use regexp operators like ~ or match() and try to escape all the regexp metacharacters so they act like plain strings.
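For instance, if a partial (substring) match on the second field were enough, index() would avoid regexps entirely; a sketch using the same file as above:
$ awk 'BEGIN{st=ARGV[1]; ARGV[1]=""} index($2,st){print; exit}' '\"' file
743 \"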
If you ever DID want to escape all regexp metachars, though, then the syntax to do that would be:
gsub(/[^^]/,"[&]",st); gsub(/\^/,"\\^",st)
rather than what you have in the code in your question:
gsub(/(\[|\]|\-|\$|\*|\:|\+|\"|\(|\))/,"\\\\&", st)
See Is it possible to escape regex metacharacters reliably with sed for an explanation of why that is the correct syntax.
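Applying that to the script in the question (a sketch reusing the temp.txt file and positional argument from the question), both lookups become exact string comparisons on a single field, so \" needs no escaping at all:
first_num=$1
echo "looking for : $first_num"
# exact match on the first field; ARGV[1] is consumed as a value, not opened as a file
sym_to_check=$(awk 'BEGIN{n=ARGV[1]; ARGV[1]=""} $1==n{print $2; exit}' "$first_num" temp.txt)
echo "symbol - $sym_to_check"
# exact match on the second field, so \" is compared as a literal string
first_val=$(awk 'BEGIN{s=ARGV[1]; ARGV[1]=""} $2==s{print; exit}' "$sym_to_check" temp.txt)
echo "first val- $first_val"
With the sample file above, bash temp1.sh 236 now reports symbol \" and first val 743 \".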

awk - Rounding all floating-point numbers in multi-line text file

Assume a multi-line text file that contains multiple floating-point numbers as well as alphanumeric strings and special characters on each line. The only consistency is that all floats are separated from any other string by a single whitespace. Further, assume that we wish to round each floating-point number to a maximum of n digits after the decimal point. All strings other than the floats shall remain in place and as is. Let us assume that n=5.
I know this can be implemented via awk easily. My current code (below) only rounds the last float of each line and swallows all strings that precede it. How do I improve it?
echo -e "\textit{foo} & 1234.123456 & -1234.123456\n1234.123456" |\
awk '{for(i=1;i<=NF;i++);printf("%.05f\n",$NF)}'
# -1234.12346
# 1234.12346
Using perl:
perl -i -pe 's/(\d+\.\d+)/sprintf "%.05f", $1/eg' file
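For example, dropping -i prints the result to stdout so the effect can be checked first; a sketch on a line like the one in the question:
$ printf '%s\n' '\textit{foo} & 1234.123456 & -1234.123456' | perl -pe 's/(\d+\.\d+)/sprintf "%.05f", $1/eg'
\textit{foo} & 1234.12346 & -1234.12346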
One solution:
$ echo -e "\textit{foo} & 1234.123456 & -1234.123456\n1234.123456" |
awk '{for(i=1;i<=NF;i++){if ($i ~ /[0-9]+\.[0-9]+/){printf "%.05f\n", $i}}}'
Output:
1234.12346
-1234.12346
1234.12346
Is this what you're trying to do?
$ printf '\textit{foo} & 1234.123456 & -1234.123456\n1234.123456\n' |
awk -F'[ ]' '{for(i=1;i<=NF;i++) if ($i+0 == $i) $i = sprintf("%.05f",$i)} 1'
extit{foo} & 1234.12346 & -1234.12346
1234.12346
if ($i+0 == $i) is the idiomatic awk way to test for a value being a number since only a number could have the same value on the left and right side of that comparison.
I'm setting FS to a literal single blank char ('[ ]') instead of its default, which, confusingly, is also a blank char (' ') but is treated specially: with the default, every run of contiguous whitespace acts as a single separator, and leading/trailing blanks are stripped whenever $0 is recompiled (e.g. when you assign to any field), so your original formatting would not be maintained in the output.
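A minimal illustration of the difference in any POSIX awk (assigning to a field forces $0 to be rebuilt; only the '[ ]' separator preserves the original spacing):
$ printf 'a  b\n' | awk '{$1=$1} 1'
a b
$ printf 'a  b\n' | awk -F'[ ]' '{$1=$1} 1'
a  b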

Field separators - trouble delimiting command characters

I'm trying to parse through HTML source code. In my example I'm just echoing it in, but in practice I am reading the HTML from a file.
Here is a bit of code that works, syntactically:
echo "<td>Here</td> some dynamic text to ignore <garbage> is a string</table>more junk" |
awk -v FS="(<td>|</td>|<garbage>|</table>)" '{print $2, $4}'
In the FS declaration I create 4 delimiters, which work fine, and I output the 2nd and 4th fields.
However, the 3rd field delimiter I actually need to use literally contains characters that are special to awk:
')">
such that when I change the above statement to:
echo "<td>Here</td> some dynamic text to ignore ')\"> is a string</table>more junk" |
awk -v FS="(<td>|</td>|')\">|</table>)" '{print $2, $4}'
I've tried escaping one, all, and every combination of characters in the offending string with the \ character, but nothing is working.
This might be what you're looking for:
$ echo "<td>Here</td> some dynamic text to ignore ')\"> is a string</table>more junk" |
awk -v FS='(<td>|</td>|\047\\)">|</table>)' '{print $2, $4}'
Here is a string
In shell, always enclose strings (and command-line scripts) in single quotes unless you NEED to use double quotes to expose your string's contents to the shell, e.g. to let the shell expand a variable.
Per shell rules, though, you cannot include a single quote within a single-quote-delimited string like 'foo'bar' (no amount of backslashes will escape that mid-string '). So you need to either jump back out of the single quotes to provide a single quote and then come back in, e.g. 'foo'\''bar', or use the octal escape sequence \047 wherever you want a single quote, e.g. 'foo\047bar' (do not use the hex equivalent, as it is error-prone). You then need to escape the ) twice: once for when awk converts the string to a regexp, and again when awk uses it as a regexp.
If you had been using double quotes around the string, you'd have needed one additional escape for when the shell parsed the string, but that's not needed when you surround your string in single quotes, since single quotes stop the shell from parsing the string.
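A quick sketch of both ways of getting a single quote into a single-quoted awk script:
$ awk 'BEGIN{print "it\047s"}'
it's
$ awk 'BEGIN{print "it'\''s"}'
it's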

gawk system-command ignoring backslashes

Consider a file with a list of strings,
string1
string2
...
abc\de
When using gawk's system command to execute a shell command, in this case printing the strings,
cat file | gawk '{system("echo " $0)}'
the last string comes out as abcde. ($0 denotes the whole record, which here is just the one string.)
Is this a limitation of gawk's system command, i.e. is it unable to pass gawk variables through unaltered?
Expanding on Mussé Redi's answer, observe that, in the following, the backslash does not print:
$ echo 'abc\de' | gawk '{system("echo " $0)}'
abcde
However, here, the backslash will print:
$ echo 'abc\de' | gawk '{system("echo \"" $0 "\"")}'
abc\de
The difference is that the latter command passes $0 to the shell with double-quotes around it. The double-quotes change how the shell processes the backslash.
The exact behavior will change from one shell to another.
To print while avoiding all the shell vagaries, a simple solution is:
$ echo 'abc\de' | gawk '{print $0}'
abc\de
In Bash we use a double backslash to denote an actual backslash; the function of a single backslash is to escape the following character. Hence the system command is not doing any formatting at all; the shell is.
The solution to this problem is to write a function in awk that converts each backslash to a double backslash before the string is passed to the system command.
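A sketch of that idea, doubling each backslash before handing the line to the shell (in a gsub replacement, & stands for the matched text, so "&&" repeats the matched backslash):
$ echo 'abc\de' | gawk '{t=$0; gsub(/\\/,"&&",t); system("echo " t)}'
abc\de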