How do I search text in a file with DCL - vms

How do I search text in a file with DCL? Yes, I have to use DCL.
The file format is straightforward:
<NUMBER OF ENTRIES>
<ID> <DIRECTORY>
<ID> <DIRECTORY>
.
.
.
<ID> <DIRECTORY>
They're separated by a few white space characters. I just need to search the file for a given ID and extract the DIRECTORY.
It's a really simple task, but I can't seem to find any decent DCL documentation anywhere.

Edited: the forum 'eats' strings like <xx> unless they are marked as code.
Are there angle brackets on the data lines or not?
Please provide a REAL example.
Is it: <XX> <XXX-DIRECTORY>
or: XX XXX-DIRECTORY
I am assuming the first.
VMS as it ships does NOT have a standard tool to select a field from a record.
But there are a bunch of standard tools available for OpenVMS which can do this.
Most notably (g)AWK and Perl.
So that's what I would use:
$ gawk /comm="$1 == ""<xx>"" { print $2 }" tmp.tmp
<xxx-DIRECTORY>
or
$ perl -ne "print $1 if /^\s*<xx>.*?<([^>]*)/" tmp.tmp
xxx-DIRECTORY
Those can be adjusted for case and whitespace sensitivity, and to trim the <> as needed.
And maybe you need the search ID to be a parameter or not.
Anyway, in a pure DCL script it could look like....
$ IF p2.eqs."" then exit 16
$ CLOSE/NOLOG file
$ OPEN/READ file 'p1
$loop:
$ READ/END=done file rec
$ id = F$EDIT( F$ELEM(0,">",F$ELEM(1,"<",rec)), "UPCASE")
$ IF id.NES.p2 THEN GOTO loop
$ dir = F$ELEM(0,">",F$ELEM(2,"<",rec))
$ WRITE SYS$OUTPUT dir
$ GOTO loop
$done:
$CLOSE/NOLOG file
If the <> do not exist, use this as the core of the loop:
$ rec = F$EDIT(rec,"TRIM,COMPRESS")
$ id = F$EDIT(F$ELEM(0," ",rec),"UPCASE")
$ IF id.NES.p2 THEN GOTO loop
$ dir = F$ELEM(1," ",rec)
And the perl would be:
$ perl -ne "print $1 if /^\s*<xx>\s+(\S+)/" tmp.tmp
Good luck
Hein

Alternatively, if the ID field is fixed-width, you can convert the file to an RMS indexed file keyed on the ID field. A lookup is then just a READ/KEY='ID'.
See HELP for CONVERT, READ /KEY and perhaps SEARCH /KEY.

Related

awk/grep/sed: find multiple patterns at one line in my files

I read a lot here about awk and variables, but could not find what I want.
I have some files ($FILES) in a directory ($DIR) and I want to search in those files for all lines containing: both the 2 strings (SEARCH1 and SEARCH2). Using sh (/bin/bash): I do NOT want to use the read command, so I prefer awk/grep/sed. The wanted output is the line(s) containing the 2 strings and the corresp. file name(s) of the file(s).
When I use this code, everything is ok:
FILES="news_*.txt"
DIR="/news"
awk '/Corona US/&&/Infected/{print a[FILENAME]?$0:FILENAME RS $0;a[FILENAME]++}' ${DIR}/${FILES}
Now I want to replace the 2 patterns ('Corona US' and 'Infected') with variables in the awk command and I tried:
SEARCH1="Corona US"
SEARCH2="Infected"
awk -v str1="$SEARCH1" -v str2="$SEARCH2" '/str1/&&/str2/{print a[FILENAME]?$0:FILENAME RS $0;a[FILENAME]++}' ${DIR}/${FILES}
However that did not give me the right output: it came up empty (didn't find anything).
You have not shown any sample output, so I could not test this; it is an attempt to fix the OP's code.
awk -v str1="$SEARCH1" -v str2="$SEARCH2" 'index($0,str1) && index($0,str2){print (seen[FILENAME]++ ? "" : FILENAME ORS) $0;a[FILENAME]++}' ${DIR}/${FILES}
OR
awk -v str1="$SEARCH1" -v str2="$SEARCH2" '$0 ~ str1 && $0 ~ str2{print (seen[FILENAME]++ ? "" : FILENAME ORS) $0;a[FILENAME]++}' ${DIR}/${FILES}
The issue in the OP's code: awk does not expand variables inside /.../, so /str1/ matches the literal text str1. Use index($0, str1) or $0 ~ str1 instead.
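To make the difference concrete, here is a minimal sketch that runs the three matching styles against a small, made-up sample file (the file contents and the variable names literal, dynamic and substring are assumptions for illustration only):

```shell
#!/bin/sh
# Hypothetical sample data; it only illustrates the three matching styles.
sample=$(mktemp)
printf '%s\n' \
    'Corona US: 120 Infected' \
    'Corona US: no data' \
    'str1 and str2 appear literally here' > "$sample"

SEARCH1="Corona US"
SEARCH2="Infected"

# /str1/ is a regex literal: it looks for the text "str1", not the variable's value.
literal=$(awk -v str1="$SEARCH1" -v str2="$SEARCH2" '/str1/ && /str2/' "$sample")

# $0 ~ str1 uses the variable's value as a dynamic regular expression.
dynamic=$(awk -v str1="$SEARCH1" -v str2="$SEARCH2" '$0 ~ str1 && $0 ~ str2' "$sample")

# index() is a plain substring test, so regex metacharacters are not special.
substring=$(awk -v str1="$SEARCH1" -v str2="$SEARCH2" 'index($0, str1) && index($0, str2)' "$sample")

printf 'literal:   %s\n' "$literal"
printf 'dynamic:   %s\n' "$dynamic"
printf 'substring: %s\n' "$substring"
rm -f "$sample"
```

Only the last sample line matches the /str1/ form, while the other two forms pick out the line that really contains both search strings. If the search strings may contain regex metacharacters such as . or *, the index() form is probably the safer choice.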
It isn't 100% clear exactly what you are looking for, but it sounds like grep -H with an alternation would let you output the filename and every line that contains both $SEARCH1 and $SEARCH2, in either order. For example, you could do:
grep -H "$SEARCH1.*$SEARCH2\|$SEARCH2.*$SEARCH1" "$DIR/"$FILES
(note $FILES must NOT be quoted in order for * expansion to take place.)
If you just want a list of filenames that contain a match on any line, you can change -H to -l.
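A small self-contained check of the either-order pattern, assuming a grep that supports -H (GNU and BSD grep do). The temporary directory and the file contents are made up for the example, and -E is used here so that a plain | acts as alternation:

```shell
#!/bin/sh
# Hypothetical sample files in a throwaway directory.
tmpdir=$(mktemp -d)
printf '%s\n' 'Corona US: 120 Infected' 'weather report' > "$tmpdir/news_1.txt"
printf '%s\n' 'Infected count rises, says Corona US desk' > "$tmpdir/news_2.txt"
printf '%s\n' 'Corona US: no data' > "$tmpdir/news_3.txt"

DIR=$tmpdir
FILES="news_*.txt"
SEARCH1="Corona US"
SEARCH2="Infected"

# Lines containing both strings, in either order, prefixed with the file name.
# $FILES is deliberately unquoted so the shell expands the glob.
matches=$(grep -HE "$SEARCH1.*$SEARCH2|$SEARCH2.*$SEARCH1" "$DIR"/$FILES)

# Same pattern, but list only the names of files with at least one match.
names=$(grep -lE "$SEARCH1.*$SEARCH2|$SEARCH2.*$SEARCH1" "$DIR"/$FILES)

printf '%s\n' "$matches"
printf '%s\n' "$names"
rm -rf "$tmpdir"
```

news_3.txt contains only one of the two strings, so it appears in neither result.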

How to use sed/awk to replace the original file and get the following desired output?

I'm writing a bash script that would translate one file to another, and am encountering an issue.
Whenever the program sees something like this (...... not included):
......Mul(-a1+b2-c3...+f+e)......
change it to:
......M(-a1)*M(b2)*M(-c3)*...*M(f)*M(e)......
The number of variables in Mul is unknown, and there can be multiple occurrences of Mul in the file. There are also other places in the file where + or - appears, and variables can be one or more characters long.
I tried grouping in sed, with a group followed by a "*", but that does not seem to work because the number of variables to replace is unknown.
Here is a sed script that will do it:
:a
s/\(Mul(.[^)]*\)\([+-].\)/\1)*Mul(\2/
ta
s/Mul(+\{0,1\}/M(/g
The trick is to use the test to jump back to the beginning after making a substitution (e.g. "Mul(a+b+c)"=>"Mul(a)*Mul(+b+c)").
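As a quick way to try the loop from a shell, the same commands can be passed with -e options. This is only a sketch: expand_mul is a hypothetical wrapper name and the input lines are invented.

```shell
#!/bin/sh
# expand_mul is a hypothetical helper wrapping the sed script shown above.
expand_mul() {
    # :a / ta form a loop: each pass splits off the last signed term
    # inside a Mul(...), until no Mul(...) contains more than one term.
    # The final substitution renames Mul( to M( and drops a leading "+".
    sed -e ':a' \
        -e 's/\(Mul(.[^)]*\)\([+-].\)/\1)*Mul(\2/' \
        -e 'ta' \
        -e 's/Mul(+\{0,1\}/M(/g'
}

printf '%s\n' 'x = Mul(-a1+b2-c3+f+e) + y' | expand_mul
# prints: x = M(-a1)*M(b2)*M(-c3)*M(f)*M(e) + y

printf '%s\n' 'Mul(a+b) Mul(c-d)' | expand_mul
# prints: M(a)*M(b) M(c)*M(-d)
```

Because [^)]* cannot cross a closing parenthesis, the + outside the Mul(...) in the first line is left alone.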
$ cat tst.awk
match($0,/Mul\([^()]+\)/) {
    tgt = substr($0,RSTART+4,RLENGTH-5)
    gsub(/[-+][[:alnum:]]+/,"*M(&)",tgt)
    gsub(/\+/,"",tgt)
    sub(/^\*/,"",tgt)
    print substr($0,1,RSTART-1) tgt substr($0,RSTART+RLENGTH)
}
$ awk -f tst.awk file
......M(-a1)*M(b2)*M(-c3)*M(f)*M(e)......
The above was run on this input file:
$ cat file
......Mul(-a1+b2-c3+f+e)......

Find a word in a text file and replace it with the filename

I have a lot of text files in which I would like to find the word 'CASE' and replace it with the related filename.
I tried
find . -type f | while read file
do
awk '{gsub(/CASE/,print "FILENAME",$0)}' $file >$file.$$
mv $file.$$ >$file
done
but I got the following error
awk: syntax error at source line 1 context is >>> {gsub(/CASE/,print <<< "CASE",$0)}
awk: illegal statement at source line 1
I also tried
for i in $(ls *);
do
awk '{gsub(/CASE/,${i},$0)}' ${i} > file.txt;
done
getting an empty output and
awk: syntax error at source line 1 context is >>> {gsub(/CASE/,${ <<<
awk: illegal statement at source line 1
Why awk? sed is what you want:
while read -r file; do
sed -i "s/CASE/${file##*/}/g" "$file"
done < <( find . -type f )
or
while read -r file; do
sed -i.bak "s/CASE/${file##*/}/g" "$file"
done < <( find . -type f )
To create a backup of the original.
You didn't post any sample input and expected output so this is a guess but maybe this is what you want:
find . -type f |
while IFS= read -r file
do
awk '{gsub(/CASE/,FILENAME)} 1' "$file" > "${file}.$$" &&
mv "${file}.$$" "$file"
done
Every change I made to the shell code is important, so if you don't understand why I changed any part of it, ask.
btw if after making the changes you are still getting the error message:
awk: syntax error at source line 1
awk: illegal statement at source line 1
then you are using old, broken awk (/usr/bin/awk on Solaris). Never use that awk. On Solaris use /usr/xpg4/bin/awk (or nawk if you must).
Caveats: the above will fail if your file name contains newlines or ampersands (&) or escaped digits (e.g. \1). See Is it possible to escape regex metacharacters reliably with sed for details. If any of that is a problem, post some representative sample input and expected output.
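One way to sidestep the ampersand and backslash caveat is to avoid gsub's replacement syntax altogether and splice the string with index() and substr(). The sketch below is an assumption-laden illustration, not the method from the linked question: replace_case_literal and replace_all are hypothetical names, and the replacement text is passed through the environment so awk does not process escapes in it.

```shell
#!/bin/sh
# replace_case_literal: hypothetical helper; reads stdin and replaces every
# literal "CASE" with $1, treating $1 as plain text (no & or \1 handling).
replace_case_literal() {
    repl="$1" awk '
        # Literal (non-regex) replace-all built from index() and substr().
        function replace_all(s, old, new,    out, i) {
            out = ""
            while ((i = index(s, old)) > 0) {
                out = out substr(s, 1, i - 1) new
                s = substr(s, i + length(old))
            }
            return out s
        }
        { print replace_all($0, "CASE", ENVIRON["repl"]) }
    '
}

printf '%s\n' 'open CASE and close CASE' | replace_case_literal 'a&b\1.txt'
# prints: open a&b\1.txt and close a&b\1.txt
```

File names containing newlines would still need separate handling in the surrounding loop.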
print in that first script is the error.
The second argument to gsub is the replacement string not a command.
You want just FILENAME. (Note not "FILENAME" that's a literal string. FILENAME the variable.)
find . -type f -print0 | while IFS= read -r -d '' file
do
awk '{gsub(/CASE/,FILENAME,$0)} 7' "$file" >"$file.$$" &&
mv "$file.$$" "$file"
done
Note that I quoted all your variables and fixed your find | read pipeline to work correctly for files with odd characters in the names (see Bash FAQ 001 for more about that). I also fixed the erroneous > in the mv command.
See the answers on this question for how to properly escape the original filename to make it safe to use in the replacement portion of gsub.
Also note that recent versions of GNU awk (4.1+ I believe) have the -i inplace option.
To fix the second script you need to add the quotes you removed from the first script.
for i in *; do awk '{gsub(/CASE/,"'"${i}"'",$0)} 1' "${i}" > file.txt; done
Note that I got rid of the worse-than-useless use of ls (worse than useless because it actively breaks on files with spaces or shell metacharacters in their names; see Parsing ls for more on that).
That command is still somewhat ugly and unsafe for filenames containing various characters, and would be better written as:
for i in *; do awk -v fname="$i" '{gsub(/CASE/,fname,$0)} 1' "${i}" > file.txt; done
since that works correctly with filenames containing double quotes and the like, whereas the direct variable expansion version does not.
That being said the corrected first script is the right answer.

Using grep or sed in a foreach loop won't work

I've spent countless hours trying to get this work and I think it's time to get some help. I have a 2-column file - let's call it "result.txt" with a list of values like this:
fileA.ext -10.3
fileB.ext -9.8
fileC_1.ext -9.7
fileC_2.ext -9.5
fileD.ext -9.4
fileC_3.ext -9.3
I want to recreate this list using only unique results for each file type, so it should look like this:
fileA.ext -10.3
fileB.ext -9.8
fileC_1.ext -9.7
fileD.ext -9.4
I created a list of files which would be able to do this by using grep or sed to extract the first line containing the matching file:
fileA
fileB
fileC
fileD
We'll call this result2.txt.
I have attempted to write the following c-shell script:
foreach l (`cat result2.txt`)
set name = "$l"
echo "$name"
grep -m1 "$name" result.txt >> result3.txt
end
The output file, "result3.txt" is empty. The script runs perfectly up to the grep command. When I run the grep command outside of the loop, using a line from result2.txt, it works fine. I get the same result using this: sed -n '/"\$name\"/p'
And I think I tried an awk command at some point.
The problem seems to be in getting those programs to recognise the $name or $l variables. I have tried different combinations of " and ' around $name and I have tried adding backslashes: e.g. $\name. Can anyone please tell me what the issue is?
Thanks
Sounds like a job for awk. Use underscore or whitespace as the field separator, and print a line only if the first field has not been seen yet:
awk -F '[_[:space:]]+' '!seen[$1]++' << END
fileA.ext -10.3
fileB.ext -9.8
fileC_1.ext -9.7
fileC_2.ext -9.5
fileD.ext -9.4
fileC_3.ext -9.3
END
fileA.ext -10.3
fileB.ext -9.8
fileC_1.ext -9.7
fileD.ext -9.4
I've just tried in CSH and both your version and the following simplified version just work. Note, no quotation marks at all.
foreach name (`cat result2.txt`)
grep -m1 $name result.txt >>result3.txt
end
Could you please check whether result.txt really contains what you mentioned at the beginning?
cat result.txt
sed -n 's/.*/²&³/;H
$ {x;s/\(.\).*/&\1/
t again
: again
s/²\([^_]\{1,\}_\)\(.*\)\²\1[^³]*³./²\1\2/
t again
s/.\(.*\)./\1/;s/[²³]//g
p
}' YourFile
Two temporary delimiters, ² and ³, are used because of limitations in \n manipulation.

awk split question

I wrote a small script, using awk 'split' command to get the current directory name.
echo $PWD
I need to replace '8' with the number of tokens as a result of the split operation.
// If PWD = /home/username/bin. I am trying to get "bin" into package.
package="`echo $PWD | awk '{split($0,a,"/"); print a[8] }'`"
echo $package
Can you please tell me what do I substitute in place of 'print a[8]' to get the script working for any directory path ?
-Sachin
You don't need awk for that. If you always want the last dir in a path just do:
#!/bin/sh
cur_dir="${PWD##*/}/"
echo "$cur_dir"
The above has the added benefit of not creating any subshells and/or forks to external binaries. It's all native POSIX shell syntax.
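For reference, here is a minimal sketch of the same parameter expansion wrapped in a function, so it can be tried on arbitrary paths. last_component is a hypothetical name, and the trailing-slash handling is an addition not present in the answer above.

```shell
#!/bin/sh
# last_component: hypothetical helper used only for this sketch.
last_component() {
    path=${1%/}                  # drop a single trailing slash, if present
    printf '%s\n' "${path##*/}"  # remove the longest prefix ending in "/"
}

last_component /home/username/bin     # prints: bin
last_component /home/username/bin/    # prints: bin
```

The root directory "/" is not handled specially here and yields an empty line.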
You could use print a[length(a)], but it's better to avoid splitting and use a custom field separator and $NF:
echo $PWD | awk -F/ '{print $NF}'
But in that specific case you should rather use basename:
basename "$PWD"
The other answers are better replacements to perform the function you're trying to accomplish. However, here is the specific answer to your question:
package=$(echo $PWD | awk '{n = split($0,a,"/"); print a[n] }')
echo "$package"
split() returns the number of resulting elements.
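A small sketch of that return value in use, with the path passed as an argument instead of read from $PWD. The function name last_dir and the sample paths are assumptions made for the example.

```shell
#!/bin/sh
# last_dir: hypothetical helper; prints the last "/"-separated piece of $1.
last_dir() {
    printf '%s\n' "$1" | awk '{
        n = split($0, parts, "/")   # n is the number of pieces produced
        print parts[n]              # so parts[n] is always the last one
    }'
}

last_dir /home/username/bin    # prints: bin
last_dir /a/b/c/d/e/f/g/h/i    # prints: i
```

This works for any depth, which is what the hard-coded a[8] could not do.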