Show filename and line number in grep output - awk

I am trying to search my rails directory using grep. I am looking for a specific word and I want grep to print out the file name and line number.
Is there a grep flag that will do this for me? I have been trying to use a combination of -n and -l but these are either printing out the file names with no numbers or just dumping out a lot of text to the terminal which can't be easily read.
ex:
grep -ln "search" *
Do I need to pipe it to awk?

I think -l is too restrictive as it suppresses the output of -n. I would suggest -H (--with-filename): Print the filename for each match.
grep -Hn "search" *
If that gives too much output, try -o to only print the part that matches.
grep -nHo "search" *
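A minimal sketch of the difference between the two flag combinations, assuming a scratch directory with two made-up files (a.txt and b.txt are hypothetical and exist only for this demonstration):

```shell
# Create a scratch directory with two hypothetical files.
dir=$(mktemp -d)
printf 'first line\nsearch here\n' > "$dir/a.txt"
printf 'search again\nnothing\n' > "$dir/b.txt"

# -H prints the file name and -n the line number, giving file:line:text.
with_names=$(cd "$dir" && grep -Hn "search" *)
printf '%s\n' "$with_names"

# -l lists only the names of matching files, so -n has nothing to number.
names_only=$(cd "$dir" && grep -ln "search" *)
printf '%s\n' "$names_only"

rm -rf "$dir"
```

The first command prints one file:line:text row per match; the second prints each matching file name once and nothing else.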

grep -rin searchstring * | cut -d: -f1-2
This says: search recursively (for the string searchstring in this example), ignore case, and display line numbers. The output from that grep will look something like:
/path/to/result/file.name:100: Line in file where 'searchstring' is found.
Next we pipe that result to the cut command using colon : as our field delimiter and displaying fields 1 through 2.
When I don't need the line numbers I often use -f1 (just the filename and path), and then pipe the output to uniq, so that I only see each filename once:
grep -ir searchstring * | cut -d: -f1 | uniq
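Here is a small sketch of both pipelines on throwaway data. The directory layout and file names (notes.txt, sub/more.txt) are assumptions made up for the example:

```shell
# Scratch tree with two hypothetical files, one in a subdirectory.
dir=$(mktemp -d)
mkdir "$dir/sub"
printf 'SearchString one\nother\nsearchstring two\n' > "$dir/notes.txt"
printf 'nothing\nSEARCHSTRING three\n' > "$dir/sub/more.txt"

# Keep only fields 1 and 2 of the colon-separated output: file and line number.
locations=$(cd "$dir" && grep -rin searchstring * | cut -d: -f1-2)
printf '%s\n' "$locations"

# Keep only field 1 and collapse the repeated names: one row per matching file.
files=$(cd "$dir" && grep -ir searchstring * | cut -d: -f1 | uniq)
printf '%s\n' "$files"

rm -rf "$dir"
```

Note that cut splits on every colon, so a file name that itself contains a colon would be truncated by this approach.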

I like using:
grep -niro 'searchstring' <path>
But that's just because I always forget the other ways and I can't forget Robert de grep - niro for some reason :)

The comment from @ToreAurstad can be spelled grep -Horn 'search' ./, which is easier to remember.
grep -HEroine 'search' ./ could also work ;)
For the curious:
$ grep --help | grep -Ee '-[HEroine],'
-E, --extended-regexp PATTERNS are extended regular expressions
-e, --regexp=PATTERNS use PATTERNS for matching
-i, --ignore-case ignore case distinctions
-n, --line-number print line number with output lines
-H, --with-filename print file name with output lines
-o, --only-matching show only nonempty parts of lines that match
-r, --recursive like --directories=recurse

Here's how I used the upvoted answer to search a tree to find the Fortran files containing a string:
find . -name "*.f" -exec grep -nHo the_string {} \;
Without the nHo, you learn only that some file, somewhere, matches the string.
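A hedged variant of the same idea: ending -exec with {} + instead of {} \; lets find pass many files to each grep invocation, and -H keeps the file name in the output even when a batch happens to contain a single file. The tree and file names below are hypothetical:

```shell
# Scratch tree: one Fortran file that matches and one non-Fortran file that also matches.
dir=$(mktemp -d)
mkdir -p "$dir/src/lib"
printf 'program main\ncall the_string\nend\n' > "$dir/src/main.f"
printf 'the_string\n' > "$dir/src/lib/notes.txt"

# Only *.f files are handed to grep; each hit is reported as file:line:text.
matches=$(cd "$dir" && find . -name "*.f" -exec grep -nH the_string {} +)
printf '%s\n' "$matches"

rm -rf "$dir"
```

The notes.txt file is skipped because the -name filter runs before grep ever sees it.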

Related

Delete everything before first pattern match with sed/awk

Let's say I have a line looking like this:
/Users/random/354765478/Tests/StoreTests/Base64Tests.swift
In this example, I would like to get the result:
Tests/StoreTests/Base64Tests.swift
How can I delete everything before the first pattern match (either Sources or Tests) using sed or awk?
I am using sed 's/^.*\(Tests.*\).*$/\1/' right now, but it's failing:
echo '/Users/random/354765478/Tests/StoreTests/Base64Tests.swift' | sed 's/^.*\(Tests\)/\1/'
Tests.swift
Here's another example using Sources (which seems to work):
echo '/Users/random/741672469/Sources/Store/StoreDataSource.swift' | sed 's/^.*\(Sources\)/\1/'
Sources/Store/StoreDataSource.swift
I would like to get everything before the first, and not the last Sources or Tests pattern match.
Any help would be appreciated!
How can I delete everything before the first pattern match (either Sources or Tests)?
Easier to use a grep -o here:
grep -Eo '(Sources|Tests)/.*' file
Tests/StoreTests/Base64Tests.swift
Sources/Store/StoreDataSource.swift
# where input file is
cat file
/Users/random/354765478/Tests/StoreTests/Base64Tests.swift
/Users/random/741672469/Sources/Store/StoreDataSource.swift
Breakdown:
Regex pattern (Sources|Tests)/.* matches any text that starts with Sources/ or Tests/ and runs to the end of the line.
-E: enables extended regex mode
-o: prints only matched text instead of full line
Alternatively you may use this awk as well:
awk 'match($0, /(Sources|Tests)\/.*/) {
print substr($0, RSTART)
}' file
Tests/StoreTests/Base64Tests.swift
Sources/Store/StoreDataSource.swift
Or this sed:
sed -E 's~.*/((Sources|Tests)/.*)~\1~' file
Tests/StoreTests/Base64Tests.swift
Sources/Store/StoreDataSource.swift
With your shown samples, please try the following GNU grep command. It looks for the very first match of /Sources or /Tests and prints everything from that string to the end of the line.
grep -oP '^.*?\/\K(Sources|Tests)\/.*' Input_file
Using sed
$ sed -E 's~([^/]*/)+((Tests|Sources).*)~\2~' input_file
Tests/StoreTests/Base64Tests.swift
would like to get everything before the first, and not the last
Sources or Tests pattern match.
First, understand the reason for this. You are using
sed 's/^.*\(Tests.*\).*$/\1/'
Observe that * is greedy, i.e. it matches as much as possible, so the leading .* always runs up to the last Tests. A non-greedy quantifier would find the first Tests, but sed does not support one. If you are using Linux, there is a good chance that you have the perl command, which does. Let the content of file.txt be
/Users/random/354765478/Tests/StoreTests/Base64Tests.swift
then
perl -p -e 's/^.*?(Tests.*)$/\1/' file.txt
gives output
Tests/StoreTests/Base64Tests.swift
Explanation: -p -e engages a sed-like mode. Changes made to the regular expression: parentheses no longer require escapes, the first .* (greedy) became .*? (non-greedy), and the last .* was deleted as superfluous (observe that the capturing group always extends to the end of the line).
(tested in perl 5, version 30, subversion 0)
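The greedy behaviour discussed in this thread can be reproduced with sed alone. This sketch reuses the path and the two sed commands quoted earlier in the thread; only the variable names are made up:

```shell
path='/Users/random/354765478/Tests/StoreTests/Base64Tests.swift'

# Greedy .* runs up to the last "Tests", so only the tail survives.
greedy=$(printf '%s\n' "$path" | sed 's/^.*\(Tests\)/\1/')
printf '%s\n' "$greedy"

# Requiring a slash on both sides of the directory name leaves a single candidate here.
anchored=$(printf '%s\n' "$path" | sed -E 's~.*/((Sources|Tests)/.*)~\1~')
printf '%s\n' "$anchored"
```

The anchored form works for these samples because only one path component is exactly Tests. It is still greedy, so a path containing several /Tests/ or /Sources/ components would be cut at the last one.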

Script to display only comments from /etc/services file

I need to write a bash script that takes a service name as a parameter and displays only the comment that follows the hash symbol in /etc/services, but I have no idea how to cut out only the comment part.
The "it's working" solution for me is just:
grep "^$1" /etc/services | awk '{print $3,$4 ...
but I don't think this is a good one
I'm searching for something like:
[find the service] -> print only the part from # till the end of the line
I'm still learning so any solution with explanation or just a hint will be very helpful for me.
Chances are this is what you're looking for:
awk -v svc="$1" '($1==svc) && sub(/[^#]+#/,"")' /etc/services
but without sample input/output it's a guess.
The above will work using any awk in any shell on every Unix box.
Try this:
SERVICE_NAME=linuxconf; grep -Po "^$SERVICE_NAME.*# \K.*$" /etc/services
-P tells grep to use perl regex.
-o trims the output so that it only includes the regex match.
\K tells the regex engine to exclude previously matched part of the string from the match, i.e. only the part after \K will be present in the final match.
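As a rough sketch of how the awk approach could be wrapped for reuse, the function below takes the service name and the file to read. The function name service_comment and the sample file are assumptions for illustration, and the pattern is a slight variation on the one above that also trims the spaces after the hash:

```shell
# Hypothetical stand-in for /etc/services so the sketch is reproducible.
services=$(mktemp)
cat > "$services" <<'EOF'
ssh             22/tcp          # SSH Remote Login Protocol
smtp            25/tcp          mail
http            80/tcp          www             # WorldWideWeb HTTP
EOF

# Print the comment of every entry whose first field is exactly the service name.
# sub() returns 0 when the line has no "#", so such lines print nothing.
service_comment() {
    awk -v svc="$1" '$1 == svc && sub(/^[^#]*# */, "")' "$2"
}

ssh_comment=$(service_comment ssh "$services")
smtp_comment=$(service_comment smtp "$services")
printf '%s\n' "$ssh_comment"

rm -f "$services"
```

In a real script the second argument would be /etc/services. Comparing the first field with == avoids the prefix matches that grep "^$1" would produce, such as ssh also matching sshell.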

Search file contents recursively when know where in file

I am interested in efficiently searching files for content using bash and related tools (e.g. sed, grep), in the specific case where I have additional information about where in the file the content is. For example, I want to replace a particular string on line 3 of each file that contains a specific string on line 3. I therefore don't want to run a recursive grep -r on the whole directory, as that would search the entirety of each file, wasting time when I know that the string of interest is on line 3 if it is there at all. That full-grep approach could be done with grep -rl 'string_to_find_in_files' base_directory_to_search_recursively.
Instead, I am thinking about using sed -i ".bak" '3s/string_to_replace/string_to_replace_with/' files to search only line 3 of all files recursively in a directory; however, sed seems to only be able to take one file as an input argument. How can I apply sed to multiple files recursively? find -exec {} \; and find -print0 | xargs -0 seem to be very slow. Is there a faster method than using find?
I can achieve the desired effect very quickly with awk, but only on a single directory; it does not seem to be recursive, e.g. awk 'FNR==3{print $0}' directory/*. Is there any way to make this recursive? Thanks.
You can use find to build the list of files and feed them to sed or awk one at a time with xargs.
For example, this prints the first line of every file listed by find:
$ find . -name "*.csv" | xargs -L 1 sed -n '1p'
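One possible way to apply the same idea to the line-3 case from the question, sketched under the assumption that only the names of the matching files are needed first. The tree, the file names and the old_value string are made up; sed quits right after printing line 3, so the rest of each file is never read:

```shell
# Scratch tree: hit.txt has the string on line 3, miss.txt has it on line 1 only.
dir=$(mktemp -d)
mkdir -p "$dir/a/b"
printf 'one\ntwo\nold_value here\nfour\n' > "$dir/a/hit.txt"
printf 'old_value on line one\ntwo\nthree\n' > "$dir/a/b/miss.txt"

# find hands batches of files to a small inline shell loop ({} +).
# For each file, sed prints line 3 and exits; grep -q tests that single line.
hits=$(cd "$dir" && find . -type f -exec sh -c '
    for f in "$@"; do
        if sed -n "3{p;q;}" "$f" | grep -q "old_value"; then
            printf "%s\n" "$f"
        fi
    done' sh {} +)
printf '%s\n' "$hits"

rm -rf "$dir"
```

The resulting list could then be passed to the in-place sed command from the question. Whether this is actually faster than a plain grep -rl depends on file sizes, since it starts two small processes per file.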

Using grep or sed in a foreach loop won't work

I've spent countless hours trying to get this to work and I think it's time to get some help. I have a 2-column file - let's call it "result.txt" - with a list of values like this:
fileA.ext -10.3
fileB.ext -9.8
fileC_1.ext -9.7
fileC_2.ext -9.5
fileD.ext -9.4
fileC_3.ext -9.3
I want to recreate this list using only unique results for each file type, so it should look like this:
fileA.ext -10.3
fileB.ext -9.8
fileC_1.ext -9.7
fileD.ext -9.4
I created a list of file name prefixes so that I could use grep or sed to extract the first line matching each one:
fileA
fileB
fileC
fileD
We'll call this result2.txt.
I have attempted to write the following c-shell script:
foreach l (`cat result2.txt`)
set name = "$l"
echo "$name"
grep -m1 "$name" result.txt >> result3.txt
end
The output file, "result3.txt" is empty. The script runs perfectly up to the grep command. When I run the grep command outside of the loop, using a line from result2.txt, it works fine. I get the same result using this: sed -n '/"\$name\"/p'
And I think I tried an awk command at some point.
The problem seems to be in getting those programs to recognise the $name or $l variables. I have tried different combinations of " and ' around $name and I have tried adding backslashes: e.g. $\name. Can anyone please tell me what the issue is?
Thanks
Sounds like a job for awk. Use underscore or whitespace as the field separator, and print a line only if the first field has not been seen yet:
awk -F '[_[:space:]]+' '!seen[$1]++' << END
fileA.ext -10.3
fileB.ext -9.8
fileC_1.ext -9.7
fileC_2.ext -9.5
fileD.ext -9.4
fileC_3.ext -9.3
END
fileA.ext -10.3
fileB.ext -9.8
fileC_1.ext -9.7
fileD.ext -9.4
I've just tried in CSH and both your version and the following simplified version just work. Note, no quotation marks at all.
foreach name (`cat result2.txt`)
grep -m1 $name result.txt >>result3.txt
end
Could you please check whether result.txt really contains what you mentioned at the beginning?
cat result.txt
sed -n 's/.*/²&³/;H
$ {x;s/\(.\).*/&\1/
t again
: again
s/²\([^_]\{1,\}_\)\(.*\)\²\1[^³]*³./²\1\2/
t again
s/.\(.*\)./\1/;s/[²³]//g
p
}' YourFile
This uses two temporary delimiters, ² and ³, because of limitations in \n manipulation.

grep a number from the line and append it to a file

I went through several grep examples, but I don't see how to do the following.
Say I have a file with the line
! some test here and number -123.2345 text
I can get this line using
grep ! input.txt
but how do I get the number (possibly positive or negative) from this line and append it to the end of another file? Is it possible to apply grep to grep results?
If yes, then I could get the number via something like
grep -Eo "[0-9]{1,}|\-[0-9]{1,}"
P.S. I am using OS X.
P.P.S. I'm trying to fetch data from several files and put it into a single file for later plotting.
The format with your commands would be:
grep ! input.txt | grep -Eo "[0-9]{1,}|\-[0-9]{1,}" >> output
To grep from grep we use the pipe operator |, which lets us chain commands together. To append the output to a file we use the redirection operator >>.
However, there are a couple of problems. Your regexp is better written as grep -Eoe '-?[0-9.]+', which allows for the decimal point and returns a single number instead of two. And if you want lines that start with !, then grep '^!' is better, to avoid matching lines that contain ! but don't start with it. Better to do:
grep '^!' input | grep -Eoe '-?[0-9.]+' >> output
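A quick sketch of that pipeline on throwaway files. The temporary file names and the second input line are assumptions added to show that unmarked lines are skipped:

```shell
# Hypothetical input: one marked line and one unmarked line that also has a number.
input=$(mktemp)
output=$(mktemp)
printf '%s\n' '! some test here and number -123.2345 text' \
              'no marker but a number 345.566' > "$input"

# First grep keeps lines starting with "!", second grep extracts the number,
# and >> appends it to the output file instead of overwriting it.
grep '^!' "$input" | grep -Eoe '-?[0-9.]+' >> "$output"

extracted=$(cat "$output")
printf '%s\n' "$extracted"

rm -f "$input" "$output"
```

Because the bracket expression accepts any run of digits and dots, a stray full stop in a marked line would also be reported as a match, so a stricter pattern may be needed for messier input.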
perl -lne 'm/.*?([\d\.\-]+).*/g;print $1' your_file >>anotherfile_to_append
$ foo="! some test here and number -123.2345 text"
$ echo $foo | sed -e 's/[^0-9\.-]//g'
-123.2345
Edit:-
for a file,
[ ]$ cat log
! some test here and number -123.2345 text
some blankline
some line without "the character" and with number 345.566
! again a number 34
[ ]$ sed -e '/^[^!]/d' -e 's/[^0-9.-]//g' log > op
[ ]$ cat op
-123.2345
34
Now let's see the toothpicks :) In '/^[^!]/d', the slashes delimit the pattern, ^ anchors it to the start of the line, [^!] matches any character other than !, and d deletes the line, so every line that starts with something other than ! is dropped. In the second expression, [^0-9.-] matches any character that is not a digit, a . or a -, and the empty replacement (//) deletes it. Done :)