from string to integer (scripts) - scripting

I have this snippet of code:
set calls = `cut -d" " -f2 ${2} | grep -c "$numbers"`
set messages = `cut -d" " -f2 ${3} | grep -c "$numbers"`
@ popularity = (calls * 3) + messages
and I get the error
@: Expression Syntax
What does it mean? grep -c returns a number, am I wrong? Thanks in advance.
In $numbers I have a list of numbers; parameters 2 and 3 also contain numbers.

Try
@ popularity = ($calls * 3) + $messages
The $ symbols are still needed to indicate variables.
See the C-shell Cookbook.
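For comparison, here is the same computation in POSIX shell syntax, where `$(( ))` arithmetic takes the place of csh's `@` command (a minimal sketch with assumed sample counts standing in for the grep -c results):

```shell
# POSIX-shell analogue of the csh arithmetic; sample values assumed
calls=3        # stand-in for: cut -d" " -f2 "$2" | grep -c "$numbers"
messages=4     # stand-in for: cut -d" " -f2 "$3" | grep -c "$numbers"
popularity=$(( (calls * 3) + messages ))
echo "$popularity"   # 13
```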


How to Field Separate and append text using awk

Experts,
I have the following text in an XML file (there will be 20,000 rows in the file).
<record record_no = "1" error_code="101">&quot;21006041&quot;;&quot;28006041&quot;;&quot;34006211&quot;;&quot;43&quot;;&quot;101210-0001&quot;
Here is how I need the result for each row to look, appended to a new file.
"21006041";"28006041";"34006211";"43";"101210-0001";101
Here is what I need to do to get the above result:
replace &quot; with "
remove <record record_no = "1" error_code="
get the text 101 (it can have any value in this position)
append it to the end.
Here is what I have been trying.
BEGIN { FS=OFS=";" }
/<record/ {
    gsub(/&quot;/,"\"")
    gsub(/&apos;/,"")
    gsub(/.*="|">.*/,"",$1)
    $(NF+1)=$1;
    $1="";
    print $0;
}
This should do the trick.
awk -F'">' -v OFS=';' '{gsub(/<record record_no = \"[0-9]+\" error_code="/,""); gsub(/&quot;/,"\""); print $2,$1}'
The strategy is to:
split the string at the closing characters of the XML element, ">
remove the first part of the XML element, including the attribute names, leaving only the error code
replace all &quot; XML entities with "
print the two fields in reverse order
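A quick sanity check of the one-liner against the sample record from the question (the file name sample.xml is assumed):

```shell
# Write the sample record, with its &quot; entities, to a file and run the one-liner.
printf '%s\n' '<record record_no = "1" error_code="101">&quot;21006041&quot;;&quot;28006041&quot;;&quot;34006211&quot;;&quot;43&quot;;&quot;101210-0001&quot;' > sample.xml
awk -F'">' -v OFS=';' '{gsub(/<record record_no = \"[0-9]+\" error_code="/,""); gsub(/&quot;/,"\""); print $2,$1}' sample.xml
```

which prints "21006041";"28006041";"34006211";"43";"101210-0001";101, matching the requested output.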
Test it out with the following data generation script. The script generates 500 files of 20,000 lines each, with records of random length, some with dashes in the values.
#!/bin/bash
recCount=0
for h in {1..500}; do
    for i in {1..20000}; do
        ((recCount++))
        error=$(( RANDOM % 998 + 1 ))
        record="<record record_no = "'"'"${recCount}"'"'" error_code="'"'"${error}"'"'">"
        upperBound=$(( RANDOM % 4 + 5 ))
        for (( k=0; k<${upperBound}; k++ )); do
            randomVal=$(( RANDOM % 99999999 + 1 ))
            record+="&quot;${randomVal}"
            if [[ $((RANDOM % 4)) == 0 ]]; then
                randomVal=$(( RANDOM % 99999999 + 1 ))
                record+="-${randomVal}"
            fi
            record+="&quot;"
            if [[ $k != $(( ${upperBound} - 1 )) ]]; then
                record+=";"
            fi
        done
        echo "${record}" >> "file-${h}.txt"
    done
done
On my laptop I get the following performance.
$ time cat file-*.txt | awk -F'">' -v OFS=';' '{gsub(/<record record_no = \"[0-9]+\" error_code="/,""); gsub(/&quot;/,"\""); print $2,$1}' > result
real 0m18.985s
user 0m17.673s
sys 0m2.697s
As an added bonus, here is the "equivalent" command in sed:
sed -e 's|\(&quot;\)|"|g' -e 's|^.*error_code="\([^>]\+\)">\(.\+\).*$|\2;\1|g'
Much slower, although the strategy is the same. Two expressions are used: the first replaces all &quot; XML entities with ", and the second captures the error code and everything after "> into groups. The remembered patterns are then printed in reverse order: \2;\1
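The same sanity check for the sed version, on the sample record (GNU sed assumed, since \+ is a GNU extension to basic regular expressions):

```shell
# Run the sed pipeline on the sample record from the question (GNU sed).
printf '%s\n' '<record record_no = "1" error_code="101">&quot;21006041&quot;;&quot;28006041&quot;;&quot;34006211&quot;;&quot;43&quot;;&quot;101210-0001&quot;' |
    sed -e 's|\(&quot;\)|"|g' -e 's|^.*error_code="\([^>]\+\)">\(.\+\).*$|\2;\1|g'
```

It produces the same "21006041";...;101 line as the awk version.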
Timing statistics:
$ time cat file-* | sed -e 's|\(&quot;\)|"|g' -e 's|^.*error_code="\([^>]\+\)">\(.\+\).*$|\2;\1|g' > result.sed
real 5m59.576s
user 5m56.136s
sys 0m9.850s
Is this too thick:
$ awk -F"&quot;+" -v OFS='";"' -v dq='"' '{gsub(/^.*="|">$/,"",$1);print dq""$2,$4,$6,$8,$10dq";"$1}' test.in
"21006041";"28006041";"34006211";"43";"101210-0001";101

read and set the value of variable from log file to KSH shell script

I have a log file (which is the output from running a Python script).
The log file has a list of variables that I want to pass to a shell script. How do I accomplish this?
Example
The log file has the following content, with the variables x, y, z.
Contents of file example.log:
2016-06-07 15:28:12.874 INFO x = (10, 11, 12)
2016-06-07 15:28:12.874 INFO y = case when id =1 then gr8 else ok end
2016-06-07 15:28:12.874 INFO z = 2016-06-07
I want the shell script to read the variables and use them in the shell program.
Sample shell
shell.ksh
Assign variables
var1 = read_value_x from example.log
var2 = read_value_y from example.log
Is there a generic shell function that I can use to read the log and parse the variable values?
Thanks
PMV
Here's how you can do it reasonably efficiently in ksh, for smallish files:
# Read into variables $var1, $var2, ...
n=0
while IFS='=' read -r unused value; do
    typeset "var$((++n))=${value# }"
done < example.log
# Enumerate the variables created.
# Equivalent to: `echo "$var1"`, `echo "$var2"`, ...
for (( i = 1; i <= n; ++i )); do
    eval echo \"\$'var'$i\"
done
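Run against the sample log from the question (recreated here as example.log; this also works in bash, which shares ksh's typeset and arithmetic syntax):

```shell
# Recreate the sample log file from the question.
cat > example.log <<'EOF'
2016-06-07 15:28:12.874 INFO x = (10, 11, 12)
2016-06-07 15:28:12.874 INFO y = case when id =1 then gr8 else ok end
2016-06-07 15:28:12.874 INFO z = 2016-06-07
EOF
n=0
while IFS='=' read -r unused value; do
    # $value is everything after the FIRST '=', so the extra '=' in the
    # y line is preserved; ${value# } strips the single leading space.
    typeset "var$((++n))=${value# }"
done < example.log
echo "$var1"   # (10, 11, 12)
```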
Read the log file, use a regex to get the value after = on each line, and assign it to a variable.
var1=$(awk -F " = " '$1 ~ /[x]$/' < file.log)
var2=$(awk -F " = " '$1 ~ /[y]$/' < file.log)
The awk command above uses the delimiter " = ", and the regex checks whether $1 ends in x or y; matching lines are assigned to the relevant variable.
In case you want to set only the 2nd part in the variable:
var1=$(awk -F " = " '$1 ~ /[x]$/{print $2}' < file.log)
var2=$(awk -F " = " '$1 ~ /[y]$/{print $2}' < file.log)
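For example, with the log contents above saved as example.log (recreated here), the second form yields just the values:

```shell
cat > example.log <<'EOF'
2016-06-07 15:28:12.874 INFO x = (10, 11, 12)
2016-06-07 15:28:12.874 INFO y = case when id =1 then gr8 else ok end
2016-06-07 15:28:12.874 INFO z = 2016-06-07
EOF
# " = " (space, equals, space) as field separator avoids splitting on the
# bare '=' inside the y line's value.
var1=$(awk -F " = " '$1 ~ /[x]$/{print $2}' < example.log)
var2=$(awk -F " = " '$1 ~ /[y]$/{print $2}' < example.log)
echo "$var1"   # (10, 11, 12)
echo "$var2"   # case when id =1 then gr8 else ok end
```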

match duplicate string before a specified delimiter

cat test.txt
serverabc.test.net
serverabc.qa.net
serverabc01.test.net
serverstag.staging.net
serverstag.test.net
Here I need to match the duplicate strings just before the first delimiter '.'.
The expected output would be as below, because the strings "serverabc" and "serverstag" are duplicated. Please help.
serverabc.test.net
serverabc.qa.net
serverstag.staging.net
serverstag.test.net
awk to the rescue!
$ awk -F\. '{c[$1]++; a[$1]=a[$1]?a[$1]RS$0:$0}
END{for(k in c) if(c[k]>1) print a[k]}' file
serverabc.test.net
serverabc.qa.net
serverstag.staging.net
serverstag.test.net
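The same logic spelled out with comments, written to a hypothetical script file dup.awk (note that `for (k in c)` iterates in unspecified order, so group order may vary between awk implementations):

```shell
cat > dup.awk <<'EOF'
BEGIN { FS = "." }                    # split hostnames at the dots
{
    c[$1]++                           # count how often each first label occurs
    a[$1] = a[$1] ? a[$1] RS $0 : $0  # collect the full lines per label
}
END {
    for (k in c)
        if (c[k] > 1)                 # labels seen more than once are duplicates
            print a[k]
}
EOF
awk -f dup.awk test.txt
```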
If it is not going to be used a lot, I would probably just do something like this:
cut -f1 -d\. foo.txt | sort | uniq -c | grep -v " 1 " | cut -c 9- | sed 's/\(.*\)/^\1\\./' > dup.host
grep -f dup.host foo.txt
serverabc.test.net
serverabc.qa.net
serverstag.staging.net
serverstag.test.net

insert variable output of script in multiple lines using sed

The test file contains:
$ cat test
i-d119c118,vol-37905322,,,2015-07-29T03:50:32.511Z,General Purpose SSD,15
i-2278b42e,vol-c90539cc,,,2014-11-12T04:27:22.618Z,General Purpose SSD,10
script output:
$ for instance_id in $(cut -d"," -f1 test); do python getattrib.py get $instance_id | cut -d"'" -f2; done
10.10.0.68
10.10.0.96
Inserting the variable using sed yields the following result; note the same IP address on both lines:
$ insert=( `for instance_id in $(cut -d"," -f1 test); do python getattrib.py get $instance_id | cut -d"'" -f2; done` )
$ sed "s|$|,${insert}|" test
i-d119c118,vol-37905322,,,2015-07-29T03:50:32.511Z,General Purpose SSD,15,10.10.0.68
i-2278b42e,vol-c90539cc,,,2014-11-12T04:27:22.618Z,General Purpose SSD,10,10.10.0.68
but I am looking for output as below:
10.10.0.68,i-d119c118,vol-37905322,,,2015-07-29T03:50:32.511Z,General Purpose SSD,15
10.10.0.96,i-2278b42e,vol-c90539cc,,,2014-11-12T04:27:22.618Z,General Purpose SSD,10
Use the start delimiter ^ instead of the end $, and move the comma:
sed "s/^/${insert},/" test
But the sed and the value retrieval need to be inside the loop, not after it; otherwise only the first result is used for every line.
Example in a loop:
for instance_id in $(cut -d"," -f1 test)
do
    insert="$( python getattrib.py get ${instance_id} | cut -d"'" -f2 )"
    sed -e "/^${instance_id}/ !d" -e "s|$|,${insert}|" test
done
insert=( `for instance_id in $(cut -d"," -f1 test); do python getattrib.py get $instance_id | cut -d"'" -f2; done` )
the insert variable is an array holding 2 elements
sed "s|$|,${insert}|" test
${insert} only retrieves the first element -- it is implicitly ${insert[0]}
I would rewrite that like this, to read the file line-by-line:
while IFS=, read -ra fields; do
    ip=$( python getattrib.py get "${fields[0]}" | cut -d"'" -f2 )
    printf "%s" "$ip"
    printf ",%s" "${fields[@]}"
    echo
done < test
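A runnable sketch of that rewrite, with the Python call stubbed out (lookup_ip is a hypothetical stand-in for `python getattrib.py get "$id" | cut -d"'" -f2`):

```shell
# Sample input file, as in the question.
cat > test <<'EOF'
i-d119c118,vol-37905322,,,2015-07-29T03:50:32.511Z,General Purpose SSD,15
i-2278b42e,vol-c90539cc,,,2014-11-12T04:27:22.618Z,General Purpose SSD,10
EOF

# Hypothetical stand-in for the getattrib.py lookup.
lookup_ip() {
    case $1 in
        i-d119c118) echo 10.10.0.68 ;;
        i-2278b42e) echo 10.10.0.96 ;;
    esac
}

while IFS=, read -ra fields; do
    ip=$( lookup_ip "${fields[0]}" )
    printf "%s" "$ip"          # IP first ...
    printf ",%s" "${fields[@]}" # ... then every original field, comma-prefixed
    echo
done < test
```

This prints each line with its own IP prepended, i.e. the two desired output lines.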

sort orders in nonalphabetic order with = followed by a string

Why does sort give a different order when = followed by a string is appended to a line? Is this the correct behaviour, or a bug in my version?
$ echo -e "a = T\nab = T"|sort
ab = T
a = T
$ echo -e "a = \nab = "|sort
a =
ab =
$ sort --version
sort (GNU coreutils) 8.13
To me this seems to happen when there are two lines and one starts with a word that is a substring of the first word of the other line.
It's your locale's collation rules ignoring spaces. Try:
echo -e "a = T\nab = T" | LC_ALL=C sort
or restrict the comparison to the first field:
echo -e "a = T\nab = T" | sort -k1,1
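Both workarounds put the shorter line first:

```shell
printf 'a = T\nab = T\n' | LC_ALL=C sort   # byte-wise comparison: the space sorts before 'b'
printf 'a = T\nab = T\n' | sort -k1,1      # compare only the first field: "a" < "ab"
```

Each prints `a = T` before `ab = T`, because the space (or the end of the shorter key) now sorts ahead of 'b' instead of being ignored by the locale's collation rules.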