How to modify sed awk command to work with relative path - awk

Context
I had a SO question answered at https://stackoverflow.com/a/59244265/80353 and I have successfully used the command that was given:
cap()(cd /tmp;rm -f *.vtt;youtube-dl --skip-download --write-auto-sub "$1";\
sed '1,/^$/d' *.vtt|sed 's/<[^>]*>//g'|awk -F. 'NR%8==1{printf"%s ",$1}NR%8==3'\
|tee -a "$2")
What does this command do?
This command will download captions for a youtube video as a .vtt file from the URL given as parameter $1, then print a simplified version of the .vtt file into the file given as parameter $2.
This works as advertised.
How to call the command
In the terminal I will run the above command once and then run cap $youtube_url $full_path_to_output_file
What changes I would like
Currently, the $2 parameter must be a full path. Also, if the file at $2 doesn't exist yet, it gets created. I would like this behavior to remain the same when $2 is a relative path, i.e. a new file should still be created at that relative path.
Update
I see that comments are such that there's nothing wrong with the command.
However, I did try running
cap $youtube_url $relative_path_to_a_text_file and it definitely did not work for me on macOS.
Perhaps I am missing something else?
Update 2
This is a video of me running the awk/sed command. First I run it with just a relative path; no output file shows up in the current working directory. Then I type the full path and it works.
https://www.loom.com/share/1c179506fa5b48b4a3d62c81a9d2a411
I hope this clarifies the question I am raising, and that the commenters will kindly update their comments based on this video.

EDIT: Adding a solution after OP's comment which does the checks inside OP's function itself; warning, I have not tested it.
cap()(
  path_details="$2"
  PWD=`pwd`
  cd "$PWD"
  user_path=$(echo "$path_details" | awk 'match($0,/.*\//){print substr($0,RSTART,RLENGTH)}')
  if [[ -d "$user_path" ]]
  then
    echo "Present path $user_path."
    ##Call your program here....##
    cd /tmp;rm -f *.vtt;youtube-dl --skip-download --write-auto-sub "$1";\
    sed '1,/^$/d' *.vtt|sed 's/<[^>]*>//g'|awk -F. 'NR%8==1{printf"%s ",$1}NR%8==3'\
    |tee -a "$2"
  else
    echo "NOT present path $user_path."
    ##Can exit from here, if needed.##
  fi
)
I believe OP wants to check whether the directory of the relative path passed as the 2nd argument is present or not; if that is the case, then one could try the following.
cat file.ksh
path_details="$2"
PWD=`pwd`
##Why I am going to your path is, in case you are running this from cron, so in that case you can mention complete path here, rather than pwd as mentioned above.
cd "$PWD"
user_path=$(echo "$path_details" | awk 'match($0,/.*\//){print substr($0,RSTART,RLENGTH)}')
if [[ -d "$user_path" ]]
then
echo "Present path $user_path."
##Call your program here....##
else
echo "NOT present path $user_path."
##Can exit from here. if needed.##
fi
Explanation: Adding a detailed explanation of the above code.
cat file.ksh ##For OP's reference, showing the script content with cat script_name here.
path_details="$2" ##Creating variable path_details whose value is $2 (the 2nd argument passed to the script).
PWD=`pwd` ##Creating variable PWD whose value is pwd (the current working directory).
##Why I go to this path: in case you are running this from cron, you can set a complete path here rather than pwd as above.
cd "$PWD" ##Going to the current directory; you can set the PWD variable above to any path you need and navigate there, which helps when the script is run from cron.
user_path=$(echo "$path_details" | awk 'match($0,/.*\//){print substr($0,RSTART,RLENGTH)}') ##Now extracting the directory part of the 2nd argument passed to the script.
if [[ -d "$user_path" ]] ##Checking if user_path (the directory part) exists on the system.
then
  echo "Present path $user_path."
  ##Call your program here....## ##If the path exists, call your program.
else ##If the path does NOT exist, exit from the program or print a message, up to you :)
  echo "NOT present path $user_path."
  ##Can exit from here, if needed.##
fi ##Closing the if condition here.
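On the relative-path issue itself, here is a minimal sketch (not tested, and not part of the answer above) of one way to make tee write to a relative "$2": resolve it against the caller's working directory before the cd /tmp, since after that cd a relative path would otherwise end up inside /tmp.
cap()(
  # Anchor a relative output path to the directory the function is called from,
  # because the later cd /tmp changes what a relative path points at.
  case "$2" in
    /*) outfile="$2" ;;       # already absolute, use as-is
    *)  outfile="$PWD/$2" ;;  # relative, prefix the current working directory
  esac
  cd /tmp; rm -f *.vtt
  youtube-dl --skip-download --write-auto-sub "$1"
  sed '1,/^$/d' *.vtt | sed 's/<[^>]*>//g' \
    | awk -F. 'NR%8==1{printf"%s ",$1}NR%8==3' \
    | tee -a "$outfile"
)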

Related

How to modify this sed awk command so that the output goes to a file of choice?

I am using the last command from this SO answer https://stackoverflow.com/a/54818581/80353
cap()(cd /tmp;rm -f *.vtt;youtube-dl --skip-download --write-auto-sub "$1";sed '1,/^$/d' *.vtt|sed 's/<[^>]*>//g'|awk -F. 'NR%8==1{printf"%s ",$1}NR%8==3'|tee cap)
What this command currently does
This command will download captions for a youtube video as a .vtt file and then print a simplified version of the .vtt file on the terminal.
This command works as described.
How to use this command
In the terminal I will run the above command once and then run cap $youtube_url
What I would like to have
I would like to modify the original cap() function so that the original behavior remains with one extra part
This command will download captions for a youtube video as a .vtt file (unchanged)
then print out the simplified version of the .vtt file into another file that's stated as parameter $2 (changed)
How I expect to call the new command
Originally, I would call the original command as
cap $youtube_url
Now I would like to do this
cap $youtube_url $relative_or_absolute_path_of_text_or_markdown_file
How do I modify the original cap command to achieve the outcome I want?
Considering that you want to see the output on screen and also save it into an output file, could you please try the following.
cap()(cd /tmp;rm -f *.vtt;youtube-dl --skip-download --write-auto-sub "$1";sed '1,/^$/d' *.vtt|sed 's/<[^>]*>//g'|awk -F. 'NR%8==1{printf"%s ",$1}NR%8==3'|tee -a "$2")
Or, in non-one-liner form, use:
cap()(cd /tmp;rm -f *.vtt;youtube-dl --skip-download --write-auto-sub "$1";\
sed '1,/^$/d' *.vtt|sed 's/<[^>]*>//g'|awk -F. 'NR%8==1{printf"%s ",$1}NR%8==3'\
|tee -a "$2")
Please make sure that you have provided the complete path in your variable, e.g. relative_or_absolute_path_of_text_or_markdown_file="/full/path/output_file.txt" (just an example). I couldn't test it since I don't have a mechanism for .vtt files on my box.
In case you don't want to keep appending and simply want to overwrite the output file each time, then, as per @oguz ismail's comment, use tee "$2" rather than tee -a "$2" as shown above (and if you don't want anything printed on screen at all, use a plain > "$2" redirection instead of tee).
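As a quick illustration of the difference (out.txt is just a hypothetical file name):
echo hello | tee -a out.txt   # shows "hello" on screen AND appends it to out.txt
echo hello | tee out.txt      # shows "hello" on screen AND overwrites out.txt
echo hello > out.txt          # writes only to out.txt, nothing on screen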
Here's a detailed bash script for those who want to save the subs file with a relative path.
The result is saved as plaintext, with times, newlines and other markup removed.
#!/bin/bash
# video-cap.sh videoUrl sub.txt
# Download captions only and save in a .vtt file
youtube-dl --skip-download --write-auto-sub "$1";
# Find .vtt files in current directory created within last 3 seconds, limit to 1
vtt=$(find . -cmin -0.05 -name "*.vtt" | head -1)
# Extract the subs and save as plaintext, removing time, new lines and other markup
sed '1,/^$/d' "$vtt" \
| sed 's/<[^>]*>//g' \
| awk 'NR%8==3' \
| tr '\n' ' ' > "$2"
# Remove the original .vtt subs file
rm -f "$vtt"
Thank you @KimStacks, @RavinderSingh13 and @Oguz-Ismail for posting these solutions above and in the previous post.
I managed to get results in the .vtt file with youtube-dl --skip-download --write-auto-sub $youtube_url.
However, the format of the output is not ideal for my purpose: I have to delete it line by line in order to remove the times as well as the \n newlines. So I would like to customize the code to fit my requirements.
NOTE: Not sure whether this is a new query or not, so I will post it here for now:
I have tried all the steps suggested in the previous post and here as well, but I still cannot understand:
How do I insert the "$youtube_url" inside the code below?
cap()(cd /tmp;rm -f *.vtt;youtube-dl --skip-download --write-auto-sub "$1";\
sed '1,/^$/d' *.vtt|sed 's/<[^>]*>//g'|awk -F. 'NR%8==1{printf"%s ",$1}NR%8==3'\
|tee -a "$2")
I tried editing the numbers (from 0 to 3, and -1) in 'NR%8==1{printf"%s ",$1}NR%8==3' on both ends, but did not succeed in getting the right format out of the .vtt file. Thus, is it possible to have:
transcribed text printed continuously as sentences, rather than each subtitle printed on a new line?
the printout of the start time removed?
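A sketch addressing both points just above, based on the 8-line block structure assumed by the earlier answers (untested against real .vtt output; captions.txt is a hypothetical output name): keep only the caption lines (NR%8==3), drop the start-time lines (NR%8==1), and join everything with spaces.
sed '1,/^$/d' *.vtt | sed 's/<[^>]*>//g' | awk 'NR%8==3' | tr '\n' ' ' > captions.txt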

Apache Subversion pre-commit to restrict files

I'm new to Apache SVN and I need some help using a pre-commit script to filter which files are uploaded to my repository.
I searched a lot and found this script in another question, but it didn't work for me.
#!/bin/bash
REPOS=$1
TXN=$2
AWK=/usr/bin/awk
SVNLOOK="/usr/bin/svnlook";
#Put all the restricted formats in variable FILTER
FILTER=".(sh|xls|xlsx|exe|xlsm|XLSM|vsd|VSD|bak|BAK|class|CLASS)$"
# Figure out what directories have changed using svnlook.
FILES=`${SVNLOOK} changed -t ${REPOS} ${TXN} | ${AWK} '{ print $2 }'` > /dev/null
for FILE in $FILES; do
    #Get the base Filename to extract its extension
    NAME=`basename "$FILE"`
    #Get the extension of the current file
    EXTENSION=`echo "$NAME" | cut -d'.' -f2-`
    #Checks if it contains the restricted format
    if [[ "$FILTER" == *"$EXTENSION"* ]]; then
        echo "Your commit has been blocked because you are trying to commit a restricted file." 1>&2
        echo "Please contact SVN Admin. -- Thank you" 1>&2
        exit 1
    fi
done
exit 0
If I try to use svnlook changed -t repodirectory, it didn't work because of a missing subcommand.
I overwrote my pre-commit.tmpl, but it didn't work. Can someone help me?
First - it seems you use svnlook incorrectly. It should have these parameters:
svnlook changed ${REPOS} -t ${TXN}
-t means 'read from transaction', and TXN is the transaction name itself.
Second - not sure if I understand correctly, but the hook file should have the name pre-commit, not pre-commit.tmpl.
Third - pre-commit should have the correct permissions. For testing, try a+rwx.
Update: it is not easy to obtain a transaction object for testing, but you can use svnlook changed <repository_path> -r <revision> and experiment on already committed revisions.
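Putting the corrected svnlook call back into the hook from the question, the start of the script might look like this (untested; only the argument order of the svnlook call changes, and the rest of the loop can stay as it is):
#!/bin/bash
REPOS="$1"
TXN="$2"
AWK=/usr/bin/awk
SVNLOOK=/usr/bin/svnlook
# Repository path first, then -t with the transaction name.
FILES=$(${SVNLOOK} changed "${REPOS}" -t "${TXN}" | ${AWK} '{ print $2 }')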

How to run an .awk file without 'awk -f' command?

I am new to awk scripting. I am trying to figure out how to run an awk file without the awk -f command. I see people keep saying to add "#!bin/awk -f" as the first line of an awk file, but this didn't work for my awk script. It still gives me a "no file or directory" error.
My question is: what does "#!bin/awk -f" really mean, and what does it do?
It's #!/bin/awk -f, not #!bin/awk. That will probably work, but there's no guarantee: if someone who has awk installed in a different location runs your script, it won't work. What you want is this: #!/usr/bin/env awk -f.
#! is what tells the system what to use to interpret your script. It should go at the very top of your file. It's called a shebang. Right after it, you put the path to the interpreter.
/usr/bin/env finds where awk is located and uses that as the interpreter. So if awk is installed somewhere else, like /usr/local/bin, it will still be found. This probably won't matter for you, but it's a good habit to get into: it's more portable and can be shared more easily.
The -f says that awk is going to read its program from a file. You could run awk -f yourfilename.awk in bash; in the shebang, -f means the rest of the script file is the program awk reads from.
I hope this helped. Feel free to ask me any questions if it doesn't work, or isn't clear enough.
UPDATE
If you get the error message:
/usr/bin/env: ‘awk -f’: No such file or directory
/usr/bin/env: use -[v]S to pass options in shebang lines
then change the first line of your script to #!/usr/bin/env -S awk -f (tested with GNU bash, version 4.4.23)
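For example, a tiny script saved as hello.awk (a hypothetical name), using the GNU env -S form discussed above:
#!/usr/bin/env -S awk -f
# Print a greeting, then echo every input line prefixed with its line number.
BEGIN { print "hello from awk" }
{ print NR ": " $0 }
Make it executable and run it on any text file:
chmod +x hello.awk
./hello.awk somefile.txt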
You probably want
#!/bin/awk -f
(The first slash after the #! is important).
This tells Unix which program it should use to 'run' the script.
It is usually called the 'shebang' which comes from hash + bang.
If you want to run your script like this you need to make sure it is executable (chmod +x <script>).
Otherwise you can just run your script by typing the command /bin/awk -f <script>
The Shebang for Awk Explained
#! is the start of a shebang line, which tells the shell which interpreter to use for the script.
/bin/awk is the path to your awk executable. You may need to change this if your awk is installed elsewhere, or if you want to use a different version of awk.
-f is a flag to awk to tell it to interpret the flag's argument as an awk script. In a shebang, it tells some awks to interpret the remainder of the script instead of a file.
Your Shebang is (Probably) Broken
You are using #!bin/awk -f, which is unlikely to work unless you have awk installed as $PWD/bin/awk. You probably meant to use #!/bin/awk instead.
In some instances, passing a flag on the shebang line may not work with your shell or your awk. If you have the rest of the shebang line correct, you might try removing the -f flag and see if that works for you.

Redirect stderr to stdout in C shell

When I run the following command in csh, I get nothing, but it works in bash.
Is there an equivalent in csh that can redirect standard error to standard output?
somecommand 2>&1
The csh shell has never been known for its extensive ability to manipulate file handles in the redirection process.
You can redirect both standard output and error to a file with:
xxx >& filename
but that's not quite what you were after, redirecting standard error to the current standard output.
However, if your underlying operating system exposes the standard output of a process in the file system (as Linux does with /dev/stdout), you can use that method as follows:
xxx >& /dev/stdout
This will force both standard output and standard error to go to the same place as the current standard output, effectively what you have with the bash redirection, 2>&1.
Just keep in mind this isn't a csh feature. If you run on an operating system that doesn't expose standard output as a file, you can't use this method.
However, there is another method. You can combine the two streams into one if you send it to a pipeline with |&, then all you need to do is find a pipeline component that writes its standard input to its standard output. In case you're unaware of such a thing, that's exactly what cat does if you don't give it any arguments. Hence, you can achieve your ends in this specific case with:
xxx |& cat
Of course, there's also nothing stopping you from running bash (assuming it's on the system somewhere) within a csh script to give you the added capabilities. Then you can use the rich redirections of that shell for the more complex cases where csh may struggle.
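For a one-off command, that can be as simple as wrapping it in bash -c (somecommand here is just a placeholder); bash merges the two streams, and csh then sees them as a single standard output:
bash -c 'somecommand 2>&1' > all_output.txt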
Let's explore this in more detail. First, create an executable echo_err that will write a string to stderr:
#include <stdio.h>

int main (int argc, char *argv[]) {
    fprintf (stderr, "stderr (%s)\n", (argc > 1) ? argv[1] : "?");
    return 0;
}
Then a control script test.csh which will show it in action:
#!/usr/bin/csh
ps -ef ; echo ; echo $$ ; echo
echo 'stdout (csh)'
./echo_err csh
bash -c "( echo 'stdout (bash)' ; ./echo_err bash ) 2>&1"
The echo of the PID and ps are simply so you can ensure it's csh running this script. When you run this script with:
./test.csh >test.out 2>test.err
(the initial redirection is set up by bash before csh starts running the script), and examine the out/err files, you see:
test.out:
UID PID PPID TTY STIME COMMAND
pax 5708 5364 cons0 11:31:14 /usr/bin/ps
pax 5364 7364 cons0 11:31:13 /usr/bin/tcsh
pax 7364 1 cons0 10:44:30 /usr/bin/bash
5364
stdout (csh)
stdout (bash)
stderr (bash)
test.err:
stderr (csh)
You can see there that the test.csh process is running in the C shell, and that calling bash from within there gives you the full bash power of redirection.
The 2>&1 in the bash command quite easily lets you redirect standard error to the current standard output (as desired) without prior knowledge of where standard output is currently going.
I object to the above answer and provide my own: csh DOES have this capability, and here is how it's done:
xxx |& some_exec # will pipe merged output to your some_exec
or
xxx |& cat > filename
or if you just want it to merge streams (to stdout) and not redirect to a file or some_exec:
xxx |& tee /dev/null
As paxdiablo said, you can use >& to redirect both stdout and stderr. However, if you want them separated, you can use the following:
(command > stdoutfile) >& stderrfile
...as indicated the above will redirect stdout to stdoutfile and stderr to stderrfile.
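For instance, with a hypothetical build command and log file names, the build's normal output goes to one file and its errors to the other:
(make all > build.out) >& build.err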
xxx >& filename
Or do this to see everything on the screen and have it go to your file:
xxx |& tee ./logfile
What about just
xxx >& /dev/stdout
???
I think this is the correct answer for csh.
xxx >/dev/stderr
Note most csh are really tcsh in modern environments:
rmockler> ls -latr /usr/bin/csh
lrwxrwxrwx 1 root root 9 2011-05-03 13:40 /usr/bin/csh -> /bin/tcsh
Using a backtick-embedded statement to demonstrate this:
echo "`echo 'standard out1'` `echo 'error out1' >/dev/stderr` `echo 'standard out2'`" | tee -a /tmp/test.txt ; cat /tmp/test.txt
If this works for you, please vote it up. The other suggestions don't work for my csh environment.

script to run a certain program with input from a given directory

So I need to run a bunch of (maven) tests with testfiles being supplied as an argument to a maven task.
Something like this:
mvn clean test -Dtest=<filename>
And the test files are usually organized into different directories. So I'm trying to write a script which would execute the above 'command' and automatically feed the names of all files in a given directory to -Dtest.
So I started out with a shellscript called 'run_test':
#!/bin/sh
if test $# -lt 2; then
echo "$0: insufficient arguments on the command line." >&1
echo "usage: $0 run_test dirctory" >&1
exit 1
fi
for file in allFiles <<<<<<< what should I put here? Can I somehow iterate thru the list of all files' name in the given directory put the file name here?
do mvn clean test -Dtest= $file
exit $?
The part where I got stuck is how to get a list of filenames.
Thanks,
Assuming $1 contains the directory name (validation of the user input is a separate issue), then
for file in $1/*
do
[[ -f $file ]] && mvn clean test -Dtest=$file
done
will run the command on all the files. If you want to recurse into subdirectories, then you need to use the find command:
for file in $(find $1 -type f)
do
etc...
done
#! /bin/sh
# Set IFS to newline to minimise problems with whitespace in file/directory
# names. If we also need to deal with newlines, we will need to use
# find -print0 | xargs -0 instead of a for loop.
IFS="
"
if ! [[ -d "${1}" ]]; then
echo "Please supply a directory name" > &2
exit 1
else
# We use find rather than glob expansion in case there are nested directories.
# We sort the filenames so that we execute the tests in a predictable order.
for pathname in $(find "${1}" -type f | LC_ALL=C sort) do
mvn clean test -Dtest="${pathname}" || break
done
fi
# exit $? would be superfluous (it is the default)
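If file names may also contain newlines (the case the comment at the top warns about), a sketch using find -print0 with xargs -0 instead of the for loop (still taking the directory as $1, and assuming a find/xargs pair that supports -print0/-0, such as GNU or BSD):
#!/bin/sh
# Run one mvn invocation per file found under the given directory.
find "${1}" -type f -print0 | xargs -0 -I{} mvn clean test -Dtest={}
Note that this runs the tests in whatever order find emits them, and it does not stop on the first failure the way the || break above does.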