Grep query in C shell script not performing properly - scripting

When I run the grep command at the command prompt, the output is correct. However, when I run it as part of a script, I only get partial output. Does anyone know what is wrong with this script?
#!/bin/csh
set res = `grep -E "OPEN *(OUTPUT|INPUT|I-O|EXTEND)" ~/work/lst/TXT12UPD.lst`
echo $res

Your wildcard is probably being processed by the shell calling grep rather than being passed through as part of the grep pattern.
Try escaping the * with a \ (i.e. \*).
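As a quick, generic illustration of what "the shell processing a wildcard" means (a sketch unrelated to this particular pattern, just showing the behaviour):
echo test *       # an unquoted * is expanded by the shell into matching filenames
echo 'test *'     # quoting (or escaping) passes the literal * through to the command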

WSL: bash-cmd.exe interoperability: get rid of carriage return

I am new to shell scripting. I am sourcing a file, which was created in Windows and has carriage returns, using the source command. After sourcing, when I append some characters to the variable, the output always jumps back to the start of the line.
test.dat (which has carriage return at end):
testVar=value123
testScript.sh (sources above file):
source test.dat
echo $testVar got it
The output I get is
got it23
How can I remove the '\r' from the variable?
Yet another solution uses tr:
echo $testVar | tr -d '\r'
cat myscript | tr -d '\r'
The option -d stands for delete.
You can use sed as follows:
MY_NEW_VAR=$(echo $testVar | sed -e 's/\r//g')
echo ${MY_NEW_VAR} got it
By the way, try running dos2unix on your data file.
Because the file you source ends lines with carriage returns, the contents of $testVar are likely to look like this:
$ printf '%q\n' "$testVar"
$'value123\r'
(The first line's $ is the shell prompt; the second line's $ is from the %q formatting string, indicating $'' quoting.)
To get rid of the carriage return, you can use shell parameter expansion and ANSI-C quoting (requires Bash):
testVar=${testVar//$'\r'}
Which should result in
$ printf '%q\n' "$testVar"
value123
Use this command on your script file after copying it to Linux/Unix:
perl -pi -e 's/\r//' scriptfilename
Pipe to sed -e 's/[\r\n]//g' to remove the Carriage Returns (\r) from each text line; since sed processes its input line by line, the Line Feeds themselves are left alone.
For a pure shell solution without calling an external program:
CR=$'\r' # define a variable holding a carriage return
testVar=${testVar%$CR} # removes a trailing carriage return from the string
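Putting the pieces above together, a minimal corrected testScript.sh might look like this (a sketch using the Bash parameter expansion shown earlier):
#!/bin/bash
source test.dat               # testVar now ends with a carriage return
testVar=${testVar//$'\r'}     # strip any carriage returns (Bash-only syntax)
echo "$testVar got it"        # prints: value123 got it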

How to run awk script on multiple files

I need to run a command on hundreds of files and I need help to get a loop to do this:
have a list of input files /path/dir/file1.csv, file2.csv, ..., fileN.csv
need to run a script on all those input files
script is something like: command input=/path/dir/file1.csv output=output1
I have tried things like:
for f in /path/dir/file*.csv; do command ...; done, but how do I get it to read each input file and write a new output file every time?
Thank you....
Try this (after changing /path/to/data to the correct path; do the same with /path/to/awkscript and the other placeholders so they point to your test data):
#!/bin/bash
cd /path/to/data
for f in *.csv ; do
echo "awk -f /path/to/awkscript \"$f\" > ${f%.csv}.rpt"
#remove_me awk -f /path/to/awkscript "$f" > ${f%.csv}.rpt
done
Make the script executable with:
chmod 755 myScript.sh
The echo version will help you ensure the script is going to work as expected. You still have to carefully examine that output, or work on a copy of your data, so you don't wreck your baseline data.
You could take the output of the last iteration
awk -f /path/to/awkscript myFileLast.csv > myFileLast.rpt
and copy/paste it to the command line to confirm it will work.
When you are comfortable that the awk script works as you need, comment out the echo awk .. line and uncomment the word #remove_me (and save your bash script).
for f in /path/to/files/*.csv ; do
bname=$(basename "$f")
pref=${bname%%.csv}
awk -f /path/to/awkscript "$f" > /path/to/store/output/${pref}_new.txt
done
Hopefully this helps, I am on my blackberry so there may be typos
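If, as in the question, your real program takes input=/output= arguments rather than an awk script, the same loop idea applies; here is a sketch (command stands for your actual program, and the output naming is just one possible choice):
for f in /path/dir/file*.csv ; do
out="output_${f##*/}"            # e.g. output_file1.csv
command input="$f" output="$out" # run your program on each file
done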

awk doesn't work in hadoop's mapper

This is my hadoop job:
hadoop streaming \
-D mapred.map.tasks=1\
-D mapred.reduce.tasks=1\
-mapper "awk '{if(\$0<3)print}'" \ # doesn't work
-reducer "cat" \
-input "/user/***/input/" \
-output "/user/***/out/"
this job always fails, with an error saying:
sh: -c: line 0: syntax error near unexpected token `('
sh: -c: line 0: `export TMPDIR='..../work/tmp'; /bin/awk { if ($0 < 3) print } '
But if I change the -mapper into this:
-mapper "awk '{print}'"
it works without any error. What's the problem with the if(..) ?
UPDATE:
Thanks to @paxdiablo for the detailed answer.
What I really want to do is filter out rows whose 1st column is greater than x, before piping the remaining data to my custom binary. So the -mapper actually looks like this:
-mapper "awk -v x=$x{if($0<x)print} | ./bin"
Is there any other way to achieve that?
The problem's not with the if per se, it's to do with the fact that the quotes have been stripped from your awk command.
You'll realise this when you look at the error output:
sh: -c: line 0: `export TMPDIR='..../work/tmp'; /bin/awk { if ($0 < 3) print } '
and when you try to execute that quote-stripped command directly:
pax> echo hello | awk {if($0<3)print}
bash: syntax error near unexpected token `('
pax> echo hello | awk {print}
hello
The reason the {print} one works is because it doesn't contain the shell-special ( character.
One thing you might want to try is to escape the special characters to ensure the shell doesn't try to interpret them:
{if\(\$0\<3\)print}
It may take some effort to get the correctly escaped string but you can look at the error output to see what is generated. I've had to escape the () since they're shell sub-shell creation commands, the $ to prevent variable expansion, and the < to prevent input redirection.
Also keep in mind that there may be other ways to filter depending on your needs, ways that can avoid shell-special characters. If you specify what your needs are, we can possibly help further.
For example, you could create a shell script (e.g., pax.sh) to do the actual awk work for you:
#!/bin/bash
awk -v x="$1" '{ if ($1 < x) print }'
then use that shell script in the mapper without any special shell characters:
hadoop streaming \
-D mapred.map.tasks=1 -D mapred.reduce.tasks=1 \
-mapper "pax.sh 3" -reducer "cat" \
-input "/user/***/input/" -output "/user/***/out/"

sed with doublequotes and $

I have a bash script that I use to set up a simple PHP script on my server. I am stuck on how to correctly change a variable with sed from within the script. Here is what I have tried:
echo "Enter Portal Password:"
read PORTPASS;
sed -i 's/$ppass =".*"/$ppass ="$PORTPASS"/' includes/config.php
The above changes the variable in the config file, but it only changes it to the literal string $PORTPASS; it does not change it to what I entered in the script.
I also tried this, and it does substitute $PORTPASS correctly, but it removes the quotes (" ") around the value in the file.
sed -i 's/$ppass ='".*"'/$ppass ='"$PORTPASS"';/' includes/config.php
Here is the line I'm trying to change in the config.php file: $ppass ="password";
Try:
sed -i "s/\$ppass =\".*\"/\$ppass =\"$PORTPASS\"/" includes/config.php
You have to use double quotes (") around the command so that the shell will evaluate $PORTPASS before passing it to sed, so then you have to "escape" all of the double quotes within the command.
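One extra point, not from the original answer: if the entered password can itself contain a / character, it will clash with the s/// delimiter; a common workaround is to pick a different delimiter, for example:
sed -i "s|\$ppass =\".*\"|\$ppass =\"$PORTPASS\"|" includes/config.php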

awk: setting environment variables directly from within an awk script

First post here, but I've been a lurker for ages. I have googled for ages, but can't find what I want (many ambiguous topic titles that don't deliver what they suggest ...). Not new to awk or scripting, just a little rusty :)
I'm trying to write an awk script which will set shell env values as it runs, for another bash script to pick up and use later on. I cannot simply use stdout from awk to report the value I want to set (i.e. "export whatever=awk cmd here"), as that's already directed to a 'results file' which the awk script is creating (plus I have more than one variable to export in the final code anyway).
As an example test script, to demo my issue:
echo $MYSCRIPT_RESULT # returns nothing, not currently set
echo | awk -f scriptfile.awk # do whatever, setting MYSCRIPT_RESULT as we go
echo $MYSCRIPT_RESULT # desired: returns the env value set in scriptfile.awk
Within scriptfile.awk, I have tried (without success):
1) building and executing an ad hoc string directly:
{
cmdline="export MYSCRIPT_RESULT=1"
cmdline
}
2) using the system() function:
{
cmdline="export MYSCRIPT_RESULT=1"
system(cmdline)
}
... but these do not work. I suspect that these two commands create a subshell within the shell awk is executing from, and do what I ask (proven by touching files as a test), but once the cmd/system calls have completed, the subshell dies, unfortunately taking whatever I have set with it, so my env setting changes don't stick from the perspective of awk's caller.
So my question is: how do you actually set env variables within awk directly, so that a calling process can access these env values after awk execution has completed? Is it actually possible?
Other than the ad hoc/system ways above, which I have proven fail for me, I cannot see how this could be done (other than writing these values to a 'random' file somewhere to be picked up and read by the calling script, which IMO is a little dirty anyway), hence, help!
All ideas/suggestions/comments welcomed!
You cannot change the environment of your parent process. If
MYSCRIPT_RESULT=$(awk stuff)
is unacceptable, what you are asking cannot be done.
You can also use something like what is described in
Set variable in current shell from awk
unset var
var=99
declare $( echo "foobar" | awk '/foo/ {tmp="17"} END {print "var="tmp}' )
echo "var=$var"
var=
The awk END clause is essential; otherwise, if there are no matches to the pattern, declare dumps the current environment to stdout and doesn't change the content of your variable.
Multiple values can be set by separating them with spaces.
declare a=1 b=2
echo -e "a=$a\nb=$b"
NOTE: declare is bash only, for other shells, use eval with the same syntax.
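For example, the same line rewritten with eval for non-bash shells might look like this (a sketch):
unset var
eval $( echo "foobar" | awk '/foo/ {tmp="17"} END {print "var="tmp}' )
echo "var=$var"   # prints var=17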
You can do this, but it's a bit of a kludge. Since awk does not allow redirection to a file descriptor, you can use a fifo or a regular file:
$ mkfifo fifo
$ echo MYSCRIPT_RESULT=1 | awk '{ print > "fifo" }' &
$ IFS== read var value < fifo
$ eval export $var=$value
It's not really necessary to split the var and value; you could just as easily have awk print the "export" and just eval the output directly.
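That variant might look something like this (a sketch):
eval "$(echo | awk '{ print "export MYSCRIPT_RESULT=1" }')"
echo "$MYSCRIPT_RESULT"   # now prints 1 in the calling shell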
I found a good answer. Encapsulate everything in a subshell!
The command declare works as below:
#Creates 3 variables
declare var1=1 var2=2 var3=3
ex1:
#Exactly the same as above
$(awk 'BEGIN{var="declare "}{var=var"var1=1 var2=2 var3=3"}END{print var}')
I found some really interesting uses for this technique. In the next example I have several partitions with labels. I create variables using the labels as variable names and the device names as variable values.
ex2:
#Partition data
lsblk -o NAME,LABEL
NAME LABEL
sda
├─sda1
├─sda2
├─sda5 System
├─sda6 Data
└─sda7 Arch
#Creates a subshell to execute the text
$(\
#Pipe lsblk to awk
lsblk -o NAME,LABEL | awk \
#Initiate the variable with the text for the declare command
'BEGIN{txt="declare "}'\
#Filters devices with labels Arch or Data
'/Data|Arch/'\
#Concatenate txt with itself plus text for the variables(name and value)
#substr eliminates the special characters before the device name
'{txt=txt$2"="substr($1,3)" "}'\
#AWK prints the text and the subshell execute as a command
'END{print txt}'\
)
The end result of this is 2 variables: Data with value sda6 and Arch with value sda7.
The same example in a single line:
$(lsblk -o NAME,LABEL | awk 'BEGIN{txt="declare "}/Data|Arch/{txt=txt$2"="substr($1,3)" "}END{print txt}')
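A quick way to check the result afterwards (assuming the labels Data and Arch exist as in the listing above):
echo "Data=$Data Arch=$Arch"   # expected: Data=sda6 Arch=sda7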