How do I iterate over all the lines output by a command in zsh?

How do I iterate over all the lines output by a command using zsh, without setting IFS?
The reason is that I want to run a command against every file output by a command, and some of these files contain spaces.
Eg, given the deleted file:
foo/bar baz/gamma
That is, a single directory 'foo', containing a sub directory 'bar baz', containing a file 'gamma'.
Then running:
git ls-files --deleted | xargs ls
Will result in that file being handled as two files: 'foo/bar' and 'baz/gamma'.
I need it to handle it as one file: 'foo/bar baz/gamma'.

If you want to run the command once for all the lines:
ls "${(@f)$(git ls-files --deleted)}"
The f parameter expansion flag splits the command's output on newlines. There's a more general form, (@s:||:), to split at an arbitrary string such as ||. The @ flag keeps empty records. Somewhat confusingly, the whole expansion needs to be inside double quotes to avoid IFS splitting on the output of the command substitution, yet it still produces a separate word for each record.
If you want to run the command for each line in turn, the portable idiom isn't particularly complicated:
git ls-files --deleted | while IFS= read -r line; do ls -- "$line"; done
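A small sketch of what IFS= and -r buy you in that loop. The function list_paths and its sample paths are made up for illustration and stand in for git ls-files --deleted:

```shell
# Stand-in for `git ls-files --deleted`; the paths are illustrative only.
list_paths() {
  printf '%s\n' 'foo/bar baz/gamma' '  leading/space' 'back\slash'
}

count=0
# Feed the loop from a here-document rather than a pipe so that, in shells
# that run pipeline stages in subshells, $count is still set afterwards.
while IFS= read -r line; do
  count=$((count + 1))
  printf 'line %d: [%s]\n' "$count" "$line"
  last=$line
done <<EOF
$(list_paths)
EOF

echo "read $count lines"
```

With IFS= the leading spaces of the second path survive, and with -r the backslash in the third path is kept rather than treated as an escape.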
If you want to run the command as few times as the command line length limit permits, use zargs.
autoload -U zargs
zargs -- "${(@f)$(git ls-files --deleted)}" -- ls

Using tr and the -0 option of xargs, assuming that the lines don't contain \000 (NUL), which is a fair assumption due to NUL being one of the characters that can't appear in filenames:
git ls-files --deleted | tr '\n' '\000' | xargs -0 ls
This turns the line foo/bar baz/gamma\n into foo/bar baz/gamma\000, which xargs -0 knows how to handle.
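To see the difference, here is a rough sketch that counts how many arguments xargs passes along in each mode. The sample paths and the helper name sample are assumptions for the demo, and -0 is a GNU/BSD extension rather than POSIX:

```shell
# Illustrative input: two paths, one of which contains a space.
sample() { printf '%s\n' 'foo/bar baz/gamma' 'foo/delta'; }

# `sh -c 'echo "$#"' sh` simply prints the number of arguments it received.
plain_count=$(sample | xargs sh -c 'echo "$#"' sh)
nul_count=$(sample | tr '\n' '\000' | xargs -0 sh -c 'echo "$#"' sh)

echo "default splitting: $plain_count arguments"  # the space splits one path in two
echo "NUL-delimited:     $nul_count arguments"
```

git ls-files also has a -z option that emits NUL-terminated names directly, so git ls-files -z --deleted | xargs -0 ls should make the tr step unnecessary.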

Related

Extract Regex Pattern For Each Line - Leave Blank Line If No Pattern Exists

I am working with the following input:
"visit_date":{"$date":"2017-11-28T04:43:00.000Z"},"phone":"549-287-5287","city":"Marshall","gender":"female","email":"mortina.curabia#gmail.com"
I need to be able to extract both the phone number and email of each line into separate files. However, both values don't always appear in the same field - they will always be prefaced with "phone": or "email":, but they may be in the first, second, third or even twentieth field.
I have tried chopping together solutions in sed and awk to remove everything up until "phone" and then everything after the next comma, but this does not work as desired. It also means that, if "phone" and/or "email" do not exist, the line is not changed at all.
I need a solution that will give me an output with the phone value of each line in one file, and the email value in another. HOWEVER, if no phone or email value exists, a blank line in the output needs to be in place.
Any ideas?
This might work for you (GNU sed):
sed -Ene 'h;/.*"phone":([^,]*).*/!z;s//\1/;w phoneFile' -e 'g;/.*"email":([^,]*).*/!z;s//\1/;w emailFile' file
Make a copy of line.
If the line does not contain a phone number empty the line, otherwise remove everything but the phone number.
Write the result to the phone number file.
Replace the current pattern space by the copy of the original line.
Repeat as above for an email address.
N.B. My first attempt used s/.*// instead of z to empty the line which worked but should not have. If the line contained no phone/email, the substitution should have reset default regexp and the second substitution should have objected that it did not contain a back reference. However the second substitution worked in either case.
After fixing your file to be valid json and adding an extra line missing the phone attribute so we can test more of your requirements:
$ cat file
{"visit_date":{"$date":"2017-11-28T04:43:00.000Z"},"phone":"549-287-5287","city":"Marshall","gender":"female","email":"mortina.curabia#gmail.com"}
{"visit_date":{"$date":"2017-11-28T04:43:00.000Z"},"city":"Marshall","gender":"female","email":"foo.bar#gmail.com"}
you can do whatever you like with the data:
$ jq -r '.email // ""' file
mortina.curabia#gmail.com
foo.bar#gmail.com
$
$ jq -r '.phone // ""' file
549-287-5287
$
As long as it doesn't contain embedded newlines you can use sed 's/.*/{&}/' file to convert the input in your question to valid json as in my answer:
$ cat file
"visit_date":{"$date":"2017-11-28T04:43:00.000Z"},"phone":"549-287-5287","city":"Marshall","gender":"female","email":"mortina.curabia#gmail.com"
"visit_date":{"$date":"2017-11-28T04:43:00.000Z"},"city":"Marshall","gender":"female","email":"foo.bar#gmail.com"
$ sed 's/.*/{&}/' file
{"visit_date":{"$date":"2017-11-28T04:43:00.000Z"},"phone":"549-287-5287","city":"Marshall","gender":"female","email":"mortina.curabia#gmail.com"}
{"visit_date":{"$date":"2017-11-28T04:43:00.000Z"},"city":"Marshall","gender":"female","email":"foo.bar#gmail.com"}
$ sed 's/.*/{&}/' file | jq -r '.email // ""'
mortina.curabia#gmail.com
foo.bar#gmail.com
but I'm betting you started out with valid json and removed the {} by mistake along the way so you probably just need to not do that.
Using grep
Try:
grep -o '"phone":"[0-9-]*"' < Input > phone.txt
grep -o '"email":"[^"]*"' < Input > email.txt
Demo:
$ echo '"visit_date":{"$date":"2017-11-28T04:43:00.000Z"},"phone":"549-287-5287","city":"Marshall","gender":"female","email":"mortina.curabia#gmail.com"' | grep -o '"phone":"[0-9-]*"'
"phone":"549-287-5287"
$ echo '"visit_date":{"$date":"2017-11-28T04:43:00.000Z"},"phone":"549-287-5287","city":"Marshall","gender":"female","email":"mortina.curabia#gmail.com"' | grep -o '"email":"[^"]*"'
"email":"mortina.curabia#gmail.com"
$
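Note that grep -o prints nothing at all for a line without a match, so the output files would not line up with the input. A possible sketch that keeps one output line per input line, using awk's match(); the function name extract_field is hypothetical, and it assumes the value is a double-quoted string without embedded quotes:

```shell
# Print the value of "key":"value" for each input line, or an empty line
# when the key is absent. extract_field is an illustrative helper name.
extract_field() {
  awk -v key="$1" '{
    pat = "\"" key "\":\"[^\"]*\""
    if (match($0, pat)) {
      val = substr($0, RSTART, RLENGTH)
      sub(/^"[^"]*":"/, "", val)   # drop the leading "key":"
      sub(/"$/, "", val)           # drop the closing quote
      print val
    } else {
      print ""
    }
  }'
}

# Two sample lines; the second has no phone field.
printf '%s\n' \
  '"phone":"549-287-5287","city":"Marshall"' \
  '"city":"Marshall","gender":"female"' |
  extract_field phone
```

Running it once with phone and once with email, each redirected to its own file, would give the two aligned outputs the question asks for.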

extract part of column to make cp command [duplicate]

I want to build a cp command that copies files from one directory into its parent directory, removing the date suffix from each name. There are multiple files.
E.g., for the file LOAN.DAILY.20191204
I want to create the command
cp LOAN.DAILY.20191204 ../LOAN.DAILY
My attempt
ls -lrt | awk ' /DAILY/{ print "cp " , $9 , "../" , sub(/\.20191204$/,""); $9 }'
I get the output
cp LOAN.DAILY.20191204 ../ 1
Why is this 1 appearing?
This might work for you (GNU sed):
ls *DAILY* | sed -E 's#^(.*)\..*#cp & ../\1#'
Once the output has been checked, use this version to perform the copy:
ls *DAILY* | sed -E 's#^(.*)\..*#cp & ../\1#e'
Or an alternative using GNU parallel:
parallel --dry-run cp {} ../{.} ::: *DAILY*
Again, check the result and, if all is OK, use:
parallel cp {} ../{.} ::: *DAILY*
One simple way:
shopt -s nullglob
for file in *.DAILY.* ; do cp "$file" ../"${file%.*}"; done
shopt -s nullglob: avoids any unnecessary copies in case the glob doesn't match anything.
"${file%.*}": shell parameter expansion that removes the shortest suffix matching .*, that is, everything from the last . to the end of the string.
I can't recall better and shorter ways to do this, although I suppose there are many.
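A throwaway sketch of the loop in a scratch directory, to check what the expansion produces before touching real files. The directory layout and the second file name are invented for the demo, and a POSIX [ -e ] guard is used here in place of the bash-only shopt -s nullglob:

```shell
name='LOAN.DAILY.20191204'
echo "${name%.*}"    # shortest suffix removed: LOAN.DAILY
echo "${name%%.*}"   # longest suffix removed:  LOAN

# Scratch area: files live in "in", copies should land in its parent.
cp_demo=$(mktemp -d)
mkdir "$cp_demo/in"
touch "$cp_demo/in/LOAN.DAILY.20191204" "$cp_demo/in/CARD.DAILY.20191204"

(
  cd "$cp_demo/in" || exit 1
  for file in *.DAILY.*; do
    [ -e "$file" ] || continue          # skip the literal pattern if nothing matched
    cp -- "$file" ../"${file%.*}"
  done
)

ls "$cp_demo"
```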
According to https://www.gnu.org/software/gawk/manual/html_node/String-Functions.html:
As mentioned, the third argument to sub() must be a variable, field, or array element. Some versions of awk allow the third argument to be an expression that is not an lvalue. In such a case, sub() still searches for the pattern and returns zero or one, but the result of the substitution (if any) is thrown away because there is no place to put it.
This explains why you get a 1 in the output.
If you want to modify the value of the ninth column you need to specify it in the sub call:
ls -lrt | awk ' /DAILY/{ orig=$9; sub(/\.20191204$/,"", $9); print "cp " , orig , "../", $9 }'
In this command, the original value of $9 is stored in a variable orig, then the date suffix is removed using sub, and finally the cp command is constructed using the old and new values.
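A minimal way to see both effects side by side, using a single sample name instead of ls output: sub() returns the number of substitutions made, and the edited text ends up in the target given as its third argument.

```shell
name='LOAN.DAILY.20191204'

# n receives sub()'s return value; $1 receives the edited string.
with_target=$(printf '%s\n' "$name" |
  awk '{ n = sub(/\.[0-9]+$/, "", $1); print n, $1 }')

echo "$with_target"   # prints: 1 LOAN.DAILY
```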

Search file contents recursively when know where in file

I am interested in efficiently searching files for content using bash and related tools (e.g. sed, grep) in the specific case where I know where in the file the content should be. For example, I want to replace a particular string on line 3 of each file that contains a specific string on line 3. I therefore don't want to run a recursive grep -r on the whole directory, since that would search the entirety of each file and waste time when I know the string of interest is on line 3, if it is there at all. This full-grep approach could be done with grep -rl 'string_to_find_in_files' base_directory_to_search_recursively.
Instead I am thinking about using sed -i ".bak" '3s/string_to_replace/string_to_replace_with/' files to search only line 3 of all files recursively in a directory; however, sed seems to be able to take only one file as an input argument. How can I apply sed to multiple files recursively? find -exec {} \; and find -print0 | xargs -0 seem to be very slow. Is there a faster method than using find?
I can achieve the desired effect very quickly with awk, but only on a single directory; it does not seem to be recursive, for example awk 'FNR==3{print $0}' directory/*. Is there any way to make this recursive? Thanks.
You can use find to produce the list of files and have xargs feed them to sed or awk one at a time.
For example, this prints the first line of every file listed by find:
$ find . -name "*.csv" | xargs -L 1 sed -n '1p'
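As a variation, find's -exec ... {} + form hands many files to a single awk process, and FNR restarts at 1 for each file, so the line-3 test from the question works recursively. A sketch on made-up files in a scratch directory (the file names and the word needle are assumptions for the demo):

```shell
line3_demo=$(mktemp -d)
mkdir -p "$line3_demo/a/b"
printf '%s\n' one two 'needle here' four > "$line3_demo/a/hit.txt"
printf '%s\n' 'needle here' two three   > "$line3_demo/a/b/miss.txt"

# Print the names of files whose third line contains "needle".
matches=$(find "$line3_demo" -type f \
  -exec awk 'FNR == 3 && /needle/ { print FILENAME }' {} +)

printf '%s\n' "$matches"
```

This still reads each file to the end. If your awk supports the nextfile statement (gawk does; check the others), adding a rule such as FNR > 3 { nextfile } should let it skip the rest of each file.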

SSH recursively change all subfolders names to a specific name

I have hundreds of folders, each with a subfolder named "thumbs". I need to rename the "thumbs" subfolder to "thumb" in each of them.
I tried
find . -type d -exec rename 's/^thumbs$/thumb/' {} ";"
I ran this in a shell from inside the folder that contains all the subfolders, each of which contains the "thumbs" folder that needs to be renamed to "thumb".
Well, I ran that command and the shell spent a long time thinking, so I pressed CTRL+C to stop it. I checked and no folder was renamed under the current directory; I don't know whether I renamed folders outside the directory I was in. Can someone tell me where my command goes wrong?
Goal 1: To change a subfolder "thumbs" to "thumb" if only one level deep.
Example Input:
./foo1/thumbs
./foo2/thumbs
./foo2/thumbs
Solution:
find . -maxdepth 2 -type d -name thumbs | sed 'p;s/thumbs$/thumb/' | xargs -n2 mv
Output:
./foo1/thumb
./foo2/thumb
./foo2/thumb
Explanation:
Use find to give you all "thumbs" folders only one level deep. Pipe the output to sed. The p command prints the input line, and the rest of the sed script changes "thumbs" to "thumb". Finally, pipe to xargs. The -n2 option tells xargs to take two arguments at a time from the pipe and pass them to the mv command.
Issue:
This will not catch deeper subfolders. You can't simply drop -maxdepth here, because find prints its output from the top down and the paths are rewritten with sed before mv runs, so mv will fail for deeper subfolders. For example, ./foo/thumbs/thumbs/ will not work: mv first renames ./foo/thumbs to ./foo/thumb, and the next output line then fails because ./foo/thumbs/thumbs/ no longer exists.
Goal 2: To change all subfolders "thumbs" to "thumb" regardless of how deep.
Example Input:
./foo1/thumbs
./foo2/thumbs
./foo2/thumbs/thumbs
./foo2/thumbs
Solution:
find . -type d -name thumbs | awk -F'/' '{print NF, $0}' | sort -k 1 -n -r | awk '{print $2}' | sed 'p;s/\(.*\)thumbs/\1thumb/' | xargs -n2 mv
Output:
./foo1/thumb
./foo2/thumb
./foo2/thumb/thumb
./foo2/thumb
Explanation:
Use find to give you all "thumbs" subfolders. Pipe the output to awk to print the number of '/'-separated fields in each path followed by the original path. Sort the output numerically, in reverse, by that count (to put the deepest paths on top). Pipe the sorted list to awk to remove the counts from each line. Pipe the output to sed. The p command prints the input line, and the rest of the sed script finds the last occurrence of "thumbs" and changes only it to "thumb". Since we are working with a list sorted from the deepest to the shallowest level, this gives mv the right commands in the right order. Finally, pipe to xargs. The -n2 option tells xargs to take two arguments at a time from the pipe and pass them to the mv command.
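A different technique from the sort pipeline above, offered only as a sketch: find's -depth option visits a directory's contents before the directory itself, so the deepest "thumbs" is renamed first without any sorting, and paths containing spaces are not split. The scratch layout below is invented for the demo:

```shell
thumbs_demo=$(mktemp -d)
mkdir -p "$thumbs_demo/foo1/thumbs" \
         "$thumbs_demo/foo2/thumbs/thumbs" \
         "$thumbs_demo/foo 3/thumbs"

# ${1%s} strips the trailing "s", turning .../thumbs into .../thumb.
find "$thumbs_demo" -depth -type d -name thumbs \
  -exec sh -c 'mv -- "$1" "${1%s}"' sh {} \;

find "$thumbs_demo" -type d | sort
```

On real data it may be worth replacing mv with echo mv first to preview the renames.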

Show filename and line number in grep output

I am trying to search my rails directory using grep. I am looking for a specific word and I want grep to print out the file name and line number.
Is there a grep flag that will do this for me? I have been trying to use a combination of -n and -l but these are either printing out the file names with no numbers or just dumping out a lot of text to the terminal which can't be easily read.
ex:
grep -ln "search" *
Do I need to pipe it to awk?
I think -l is too restrictive as it suppresses the output of -n. I would suggest -H (--with-filename): Print the filename for each match.
grep -Hn "search" *
If that gives too much output, try -o to only print the part that matches.
grep -nHo "search" *
grep -rin searchstring * | cut -d: -f1-2
This would say, search recursively (for the string searchstring in this example), ignoring case, and display line numbers. The output from that grep will look something like:
/path/to/result/file.name:100: Line in file where 'searchstring' is found.
Next we pipe that result to the cut command using colon : as our field delimiter and displaying fields 1 through 2.
When I don't need the line numbers I often use -f1 (just the filename and path), and then pipe the output to uniq, so that I only see each filename once:
grep -ir searchstring * | cut -d: -f1 | uniq
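A quick sketch of what these pipelines produce, run against a made-up file in a scratch directory (the file name and contents are assumptions for the demo; -r and -H are GNU/BSD grep options):

```shell
grep_demo=$(mktemp -d)
printf '%s\n' 'first line' 'a search term' 'another search' > "$grep_demo/notes.txt"

# Full matches: path:line-number:matching-line
hits=$(grep -rHn 'search' "$grep_demo")
printf '%s\n' "$hits"

# Keep only path:line-number
locations=$(printf '%s\n' "$hits" | cut -d: -f1-2)
printf '%s\n' "$locations"

# Keep only the path, once per file
files=$(printf '%s\n' "$hits" | cut -d: -f1 | uniq)
printf '%s\n' "$files"
```

The cut step assumes the paths themselves contain no colon. When only the file names are wanted, grep -rl 'search' lists each matching file once without the cut and uniq steps.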
I like using:
grep -niro 'searchstring' <path>
But that's just because I always forget the other ways and I can't forget Robert de grep - niro for some reason :)
The comment from @ToreAurstad can be spelled grep -Horn 'search' ./, which is easier to remember.
grep -HEroine 'search' ./ could also work ;)
For the curious:
$ grep --help | grep -Ee '-[HEroine],'
-E, --extended-regexp PATTERNS are extended regular expressions
-e, --regexp=PATTERNS use PATTERNS for matching
-i, --ignore-case ignore case distinctions
-n, --line-number print line number with output lines
-H, --with-filename print file name with output lines
-o, --only-matching show only nonempty parts of lines that match
-r, --recursive like --directories=recurse
Here's how I used the upvoted answer to search a tree for the Fortran files containing a string:
find . -name "*.f" -exec grep -nHo the_string {} \;
Without the nHo, you learn only that some file, somewhere, matches the string.