In spacemacs, search occurances of pattern A only in files with name matching pattern B - spacemacs

In spacemacs, I often search for patterns within my project via SPC-* or SPC-/. These commands allow me to input a pattern to search for, such as the name of a function I would like to jump to the definition of.
Sometimes, I would like to restrict that search to files of only a certain type, such as searching only *.elm files and omitting all others (*.hs, *.sql, etc.).
How can I specify filenames for my pattern search?
I.e., How to search for pattern A only in files with name matching pattern B?
I'm wondering if there is some special key I can type as part of my search query to accomplish this.

If you use ag as a search backend, you can do SPC-/ -G<file name regexp> <search pattern>, see man ag for details.
I am not sure if the grep backend can do something similar, I think the internal call to grep is much more complex and adding flags tends to hang my emacs session. From a shell you can run grep -n <search pattern> <file pattern>

Related

Recursively search directory for occurrences of each string from one column of a .csv file

I have a CSV file--let's call it search.csv--with three columns. For each row, the first column contains a different string. As an example (punctuation of the strings is intentional):
Col 1,Col 2,Col 3
string1,valueA,stringAlpha
string 2,valueB,stringBeta
string'3,valueC,stringGamma
I also have a set of directories contained within one overarching parent directory, each of which have a subdirectory we'll call source, such that the path to source would look like this: ~/parentDirectory/directoryA/source
What I would like to do is search the source subdirectories for any occurrences--in any file--of each of the strings in Col 1 of search.csv. Some of these strings will need to be manually edited, while others can be categorically replaced. I run the following command . . .
awk -F "," '{print $1}' search.csv | xargs -I# grep -Frli # ~/parentDirectory/*/source/*
What I would want is a list of files that match the criteria described above.
My awk call gets a few hits, followed by xargs: unterminated quote. There are some single quotes in some of the strings in the first column that I suspect may be the problem. The larger issue, however, is that when I did a sanity check on the results I got (which seemed far too few to be right), there was a vast discrepancy. I ran the following:
ag -l "searchTerm" ~/parentDirectory
Where searchTerm is a substring of many (but not all) of the strings in the first column of search.csv. In contrast to my above awk-based approach which returned 11 files before throwing an error, ag found 154 files containing that particular substring.
Additionally, my current approach is too low-resolution even if it didn't error out, in that it wouldn't distinguish between which results are for which strings, which would be key to selectively auto-replacing certain strings. Am I mistaken in thinking this should be doable entirely in awk? Any advice would be much appreciated.

How can i query for specific matching substrings while ignoring the rest in a SCAN query?

trying to do a redis SCAN command and trying to figure out how to do glob-pattern substring matching for words instead of single characters (using ruby redis gem)
redis.set("first:url:123", "val1")
redis.set("second:url:123", "val2")
redis.set("third:url:123", "val3")
redis.set("fourth:url:123", "val4")
cursor = 0
pattern = "[first,second]:url:*" ## I only want the first and second keys
redis.scan(cursor, match: pattern)
# => ...
--
according to the docs here i found these available options but it looks like it only works for single characters, how can i use it for words?
h[ae]llo matches hello and hallo, but not hillo
Edit:
https://globster.xyz/
makes me think that using {first,second}:url:123 should work, but that doesnt seem to work either
Redis doesn't support regex expressions for key name patterns, only glob-like expressions.
You can revert to an EVAL Lua script if you are amendment about it. Here's one that does it, but do read the comments: https://gist.github.com/itamarhaber/19c8393f465b62c9cfa8

How to write a redis scan command to match either of two patterns

I want to use HSCAN match clause to match a key which is of type 1 or type 2. A regex would be like ^match1|^match2. Is it possible to do this in glob style pattern.
Redis does not offer a straight forward way to match multiple patterns.
Redis matchs using glob-style pattern which is very limited.
Supported glob-style patterns:
h?llo matches hello, hallo and hxllo
h*llo matches hllo and heeeello
h[ae]llo matches hello and hallo, but not hillo
h[^e]llo matches hallo, hbllo, ... but not hello
h[a-b]llo matches hallo and hbllo
Use \ to escape special characters if you want to match them verbatim.
You can't use glob-style for that, but you can Lua your way around this. That means you can use
EVAL and a script (can be similar to https://github.com/itamarhaber/redis-lua-scripts/blob/master/scanregex.lua) for performing the HSCAN match1* first, and then filtering with Lua on match2.*.

Targeting a string for deletion with grep, sed, awk (or cut)

I am trying to parse some logs to gain the user agent and account id per line. I have already managed to pull the user agent and a string which contains the account id all on the same line.
The next step is to extract the account id from its longer string. I thought this would be fairly simple as I will know the start of the string and there are / slashes for the delimiter but the user agent also contains slashes and have varied number of fields.
The log file currently looks something like the following example but there are hundreds to thousands of lines to parse. Luckily I am working off a partition with plenty of space to spare.
USER_AGENT_PART ACCOUNT_ID_Part_/plus/path/to/stuff/they/access
some user agent/1.3 KnownString1_32d4-56e-009f98/some/stuff/here
user/agent KnownString1_12d3-345e-4c534/more/stuff/here
User/Agent cURL/1.5.0 KnownString2_12d34e56/stuff/things/stuff/stuff
one/User Agent/2.0 KnownString1_12d3_456e_7g8/more/random/stuff/stuff
So the goal is to keep the user agent part and the account id part and drop the path of the stuff they are accessing in the last string. But I can't use / or spaces as general delimiters because many user agents have / and various amounts of spaces in their name.
Also, the different types of user agents is way more than this little sample I have here. There are anywhere from 25 - 50 distinct types depending on the log. So it doesn't seem worth it to target the user agent and try to exclude it.
It seems the logical way to start is by targeting the part of the account ID which is a known string (KnownString1 or KnownString2) and grab everything from there (which is unknown numbers and letters with dashes) up until the first / of that account string.
Then I would delete the first / (In the account ID string) and everything after. I expect I will need to do this in two passes to utilize the two known parts of the user IDs.
This seemed like it would be easy but I just can't wrap my head around how to start targeting that last string. I don't even have a good example of something that is close to working because I don't know how to target the last string by delimiters without catching the same delimiters in the user agent part.
Any ideas?
Edit: Every line will have an account id that starts with one of two common KnownString_ in it but then is followed by a series of unknown digits and dashes until it gets to the first /. So I don't need to search for lines containing that before targeting the string.
Edit2: My original examples of the Account ID did not reflect there were letters mixed in with the numbers.
Edit3: Thanks to the responses from oguz ismail and kesubagu I was able to solve this using egrep. Looks like I was trying to make things more complicated than they were. I also realized I need to revisit grep as its capable of doing far more than what I tend to use it for.
This is what I ended up using which worked in one pass:
egrep -o ".+(KnownString1|KnownString2)_[^/]+" logfile > logfile2
Using grep:
$ grep -o '.*KnownString[^/]*' file
some user agent/1.3 KnownString1_32d4-56e-009f98
user/agent KnownString1_12d3-345e-4c534
User/Agent cURL/1.5.0 KnownString2_12d34e56
one/User Agent/2.0 KnownString1_12d3_456e_7g8
.* matches everything before KnownString, and [^/]* matches everything after KnownString until the first /.
You can use egrep with the -o option which will only output the part of that matches the provided regex, so you could do something like this
cat test | egrep -o ".+(KnownString1|KnownString2)_[_0-9-]+"
where the test file contains the input you've given, the output in this case was
some user agent/1.3 KnownString1_324-56-00998
user/agent KnownString1_123-345-4534
User/Agent cURL/1.5.0 KnownString2_123456
one/User Agent/2.0 KnownString1_123_456_78

DCL sort - different start positions

I have a DCL script that creates a .txt file that looks something like this
something,somethingelse,00000004
somethingdifferent,somethingelse1,00000002
anotherline,line,00000015
I need to sort the file by the 3rd column highest to lowest
ex:
anotherline,line,00000015
something,somethingelse,00000004
somethingdifferent,somethingelse1,00000002
Is it best to use the sort command, if so everything i've seen required a position number, how can this be done if each line would have a different start position?
If sort is a bad way to handle this is there something else or can I somehow handle this while writing the lines to the file.
I've only been working with VMS/DCL for a few weeks now so i'm not fimilar with all of the commands yet.
Thanks!
As you already noticed, the VMS sort expects fields with a fixed start position within a record. You can not specify a field by a separator. If you want to use the VMS sort you have to make sure your third field starts at the same column, for all records. In other words, you have to pad preceding fields. If you have control on how the file is created, this may work for you. If you don't or you don't know how big the string in front of the sort field will be, this may not be a workaround. Maybe changing the order of the fields is an option.
On the other hand, you may find GNV installed on your system. Then you can try to use its sort, which is a GNU style sort. That is, $ mcr gnv$gnu:[bin]sort -t, -k3 -r x.txt may get you the wanted results.
VMS Sort is indeed not really equipped for this.
Reformatting as you did is about the only way.
If you so not have access to GNV sort on the OpenVMS system then perhaps you have, or can install PERL? Is is somewhat easier to install.
In perl there are of course many ways.
For example using an anonymous sort function ( $a is first arg, $b second; <> reads all input )
$ perl -e "print sort { 0+(split /,/,$b)[1] <=> 0+(split /,/,$a)[2]} <>" x.x
where the 0 + forces numeric evaluation. For (fixed length?) string compare use:
$ perl -e "print sort { (split /,/,$b)[2] cmp (split /,/,$a)[2]} <>" x.x
hth,
Hein.enter code here