Save entire file from GREP results - awk

Okay so I'm using grep on an external HDD, for example in
M:/
grep -rhI "bananas" . > out.txt
which outputs any lines within M:/ containing "bananas".
However, I would like to output the entire contents of each matching file: if one line in example.txt contains "bananas", output the entire content of example.txt, and likewise for any other .txt file within the directory M:/ that contains "bananas".

To print the contents of any file whose content contains the string bananas:
find . -type f -exec grep 'bananas' -l --null {} + | xargs -0 cat
The above uses GNU tools to handle file names containing newlines.
Forget you ever saw any grep args to recursively find files; adding those args to GNU grep was a terrible idea that just makes your code more complicated. Use find to find files and grep to g/re/p within files.
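If you want to limit this to the .txt files from the question and label each file's contents, here is a small variation (a sketch, GNU tools assumed; tail -n +1 prints each whole file and adds a ==> name <== header when given more than one file):
find . -type f -name '*.txt' -exec grep -lI --null 'bananas' {} + | xargs -0 tail -n +1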

Related

Using grep to only obtain first match in EACH file

I have a bunch of output files labelled file1.out, file2.out, file3.out, ...,fileN.out.
All of these files contain multiple instances of the string "keystring". However, only the first instance of "keystring" is meaningful to me; the other lines are not required.
When I do
grep 'keystring' *.out
I reach all files, and they output every instance of keystring.
When I do grep -m1 'keystring' *.out I only get the first instance, from file1.out.
I want to extract the line where keystring appears FIRST in all these output files. How can I pull this off?
You can use awk:
awk '/keystring/ {print FILENAME ":", $0; nextfile}' *.out
nextfile moves to the next file as soon as the first match in the current file has been printed.
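If your awk lacks nextfile (it is a widely supported extension rather than a classic POSIX feature), a portable sketch keeps a per-file flag instead:
awk 'FNR==1 {done=0} /keystring/ && !done {print FILENAME ":", $0; done=1}' *.out
This still reads each file to the end, so it is slower than nextfile on large files.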
Use find -exec like so:
find . -name '*.out' -exec grep -m1 'keystring' {} \;
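Note that when grep is handed a single file name, as happens with -exec ... \;, it omits the file name from its output. To keep the file name, add -H (a common extension, though not strict POSIX), or append /dev/null as a second argument as a portable trick:
find . -name '*.out' -exec grep -Hm1 'keystring' {} \;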
SEE ALSO:
GNU find manual

What is a more efficient way to get the first match from a reverse file search using some combination of awk, grep and sed

I am working on an operating system with limited utilities. Utilities like tail, head, and tac are not available! sed, awk, and grep are available, but grep does not have the -m option for stopping after the first match. See the list of available options here.
My goal is to search for a line containing a string in a potentially large log.txt file, maybe ~100 MB, from the end in reverse, and print it. The trick is that the operation has to be fast: no more than 3-4 seconds tops.
I tried using sed to reverse the contents of the file into another file and then using awk and grep in a loop to search chunks of 10,000 lines, but the sed reversal was far too slow for anything beyond a few MB.
Something I tried:
self.sed_line_search = 10001
self.sed_cmd = "sed -e :a -e '$q;N;" + str(self.sed_line_search) + ",$D;ba'"
self.awk_cmd = "awk '/Version/{print}'"
self.Command = self.sed_cmd + " " + LOGFILE_PATH + " | " + self.awk_cmd + "\n"
tries, max_tries = 1, 5
while tries < max_tries:
    ret = execute(self.Command)
    if not ret:
        # No match yet: widen the sed window by another 10,000 lines and retry.
        self.sed_line_search += 10000
        self.sed_cmd = "sed -e :a -e '$q;N;" + str(self.sed_line_search) + ",$D;ba'"
        self.Command = self.sed_cmd + " " + LOGFILE_PATH + " | " + self.awk_cmd + "\n"
        tries += 1
    else:
        break  # the pipeline produced output, so a match was found
Without knowing how to stop at the first match absent the grep -m 1 option, this partially achieves that goal by only looking at a few thousand lines at a time. But it does not search in reverse.
Not sure if this is what you want. It searches for all lines containing test and prints them in reverse.
cat file
dfsdf
test1
fsdfsdf
fdg
sfdgs
fdgsdf
gsfdg
sfdte
test2
dgsfdgsdf
fdgsfdg
sdfgs
df
test3
sfdgsfdg
awk '/test/ {a[++x]=$0} END {for (i=x;i>=1;i--) print a[i]}' file
test3
test2
test1
This might work for you (GNU sed):
sed -n '/regexp/h;$!b;x;p' file
This copies each line that matches regexp to the hold space (overwriting the previous one) and, at the end of the file, prints the hold space, i.e. the last match in the file.
IMHO the fastest you could do would be:
grep 'regexp' file | sed -n '$p'
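A single-pass awk equivalent that remembers only the most recent match (a sketch; regexp and file stand in for your actual pattern and log):
awk '/regexp/ {found=1; line=$0} END {if (found) print line}' file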

Cygwin - grep ( if file contains )

Okay so basically I have a list of emails in
EMAILS.TXT
Then I have another bunch of .txt files, each line containing email:phonenumber:name
Compiled1.txt
Compiled2.txt
Compiled3.txt
...
Can I use grep or gawk to search a folder containing Compiled1, Compiled2, ... to see which lines contain emails from the .txt file?
So, for example:
EMAILS.TXT contains "example@example.com" & "example1@example1.com"
The folder contains Compiled1.txt & Compiled2.txt, which both have lines for these emails.
Cygwin/GnuWin should output the lines from Compiled1 & Compiled2 that contain the emails specified in EMAILS.TXT:
Output > example@example.com:000000:ExampleUser
example1@example1.com:00010101:ExampleUser2
...
You can use
grep -Fi -f EMAILS.TXT Compiled*.txt
-f EMAILS.TXT uses the lines of EMAILS.TXT as search patterns.
-F disables special treatment of symbols like . or ?.
-i case insensitive search.
Output will be of the form
File-in-which-a-match-was-found:Matched-line-from-that-file
In case of your example:
Compiled1.txt:example@example.com:000000:ExampleUser
Compiled1.txt:example1@example1.com:00010101:ExampleUser2
Compiled2.txt:example@example.com:000000:ExampleUser
Compiled2.txt:example1@example1.com:00010101:ExampleUser2
If you are also interested in the line numbers, add -n to the command.
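One caveat: -F matches substrings anywhere on the line, so example@example.com would also match a line for notexample@example.com. If you need an exact match on the email field, here is a sketch using awk (which the question says is available), keyed on the first :-separated field and assuming one address per line in EMAILS.TXT:
awk -F: 'NR==FNR {emails[tolower($0)]; next} tolower($1) in emails {print FILENAME ":" $0}' EMAILS.TXT Compiled*.txt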

Show filename and line number in grep output

I am trying to search my rails directory using grep. I am looking for a specific word and I want to grep to print out the file name and line number.
Is there a grep flag that will do this for me? I have been trying to use a combination of -n and -l, but these either print out the file names with no line numbers or just dump a lot of text to the terminal, which can't be easily read.
ex:
grep -ln "search" *
Do I need to pipe it to awk?
I think -l is too restrictive, as it suppresses the output of -n. I would suggest -H (--with-filename): print the file name for each match.
grep -Hn "search" *
If that gives too much output, try -o to only print the part that matches.
grep -nHo "search" *
grep -rin searchstring * | cut -d: -f1-2
This would say, search recursively (for the string searchstring in this example), ignoring case, and display line numbers. The output from that grep will look something like:
/path/to/result/file.name:100: Line in file where 'searchstring' is found.
Next we pipe that result to the cut command using colon : as our field delimiter and displaying fields 1 through 2.
When I don't need the line numbers I often use -f1 (just the filename and path), and then pipe the output to uniq, so that I only see each filename once:
grep -ir searchstring * | cut -d: -f1 | uniq
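For the file-names-only case, grep can also do the deduplication itself with -l, which additionally stops reading each file at its first match:
grep -irl searchstring *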
I like using:
grep -niro 'searchstring' <path>
But that's just because I always forget the other ways and I can't forget Robert de grep - niro for some reason :)
The comment from @ToreAurstad can be spelled grep -Horn 'search' ./, which is easier to remember.
grep -HEroine 'search' ./ could also work ;)
For the curious:
$ grep --help | grep -Ee '-[HEroine],'
-E, --extended-regexp PATTERNS are extended regular expressions
-e, --regexp=PATTERNS use PATTERNS for matching
-i, --ignore-case ignore case distinctions
-n, --line-number print line number with output lines
-H, --with-filename print file name with output lines
-o, --only-matching show only nonempty parts of lines that match
-r, --recursive like --directories=recurse
Here's how I used the upvoted answer to search a tree for the Fortran files containing a string:
find . -name "*.f" -exec grep -nHo the_string {} \;
Without the -nHo, you learn only that some file, somewhere, matches the string, since grep run on one file at a time prints no file name.
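If your grep is GNU grep, --include can restrict a recursive search to the Fortran sources without find (a sketch of the same search):
grep -rnHo --include='*.f' the_string .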

Shell Script Search and Delete Non Text Files

I want to write a shell script to search for and delete all non-text files in a directory.
I basically cd into the directory that I want to iterate through in the script and search through all files.
-- Here is the part I can't do --
I want to check, using an if statement, whether the file is a text file.
If not, I want to delete it;
else continue.
Thanks
PS By the way, this is on Linux.
EDIT
I assume a file is a "text file" if and only if its name matches the shell pattern *.txt.
The file program always outputs the word "text" when passed the name of a file that it determines to contain a text format. You can test for that output using grep. For example:
find -type f -exec file '{}' \; | grep -v '.*:[[:space:]].*text.*' | cut -d ':' -f 1
I strongly recommend printing out the files to delete before deleting them, to the point of redirecting the output to a file and then doing:
rm $(<filename)
after reviewing the contents of "filename". And beware of file names with spaces; if you have those, things can get more involved.
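Under the same "file(1) says text" assumption, here is a dry-run sketch that is safe with spaces in file names; it prints the files whose file description lacks the word "text", and you can replace -print with -delete (GNU find) once the list looks right:
find . -type f ! -exec sh -c 'file -b "$1" | grep -q text' sh {} \; -print
This runs file once per file, so it is slow on big trees, but the output is easy to review.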
Use the opposite of the following, unless an if statement is mandatory:
find <dir-path> -type f -name "*.txt" -exec rm {} \;
What the opposite is exactly is an exercise for you. Hint: it comes before -name.
Your question was ambiguous about how you define a "text file", I assume it's just a file with extension ".txt" here.
find . -type f ! -name "*.txt" -exec rm {} +
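To preview what that would remove before committing, swap -exec rm {} + for -print first:
find . -type f ! -name "*.txt" -print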