grep each line from a file recursively [closed] - awk

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 1 year ago.
I have an alphabetically sorted file names.txt containing names:
Dio Brando
Erina Pendleton
Jonathan Joestar
Mario Zeppeli
....
and a folder containing files:
a1.txt
a2.txt
....
zn.txt
each having lines of the form Name: phone number. Assuming the recursion goes through the files alphabetically, I'd like to grep for each line from names.txt and switch to the next name whenever it fails to find a match in the next file.
For example: I grep for "Dio Brando" and it reaches in file, say, D2.txt following lines:
Dio Brando: phone number 1
Dio Brando: phone number 2
Dio Brando: phone number 3
<not Dio Brando>: phone number 1
When I hit the <not Dio Brando> line, I want it to start searching for "Erina Pendleton" where it left off (from D2.txt onward). It's guaranteed to find at least one match, and when it fails after that, it should switch to "Jonathan Joestar", etc. Is it possible without some serious scripting?
If resuming with the next name is not doable, just performing grep on each line through the whole folder will do.

It sounds like this might be all you need:
awk '
BEGIN { nameNr = 1 }
NR==FNR {                   # first file: collect the names
    names[NR] = $0
    next
}
{
    if ( index($0,names[nameNr]) ) {    # current name matches this line
        print FILENAME, FNR, $0
        found = 1
    }
    else if ( found ) {     # a run of matches just ended:
        nameNr++            # move on to the next name
        found = 0
    }
}
' names.txt dir/*.txt       # the name list must be the first argument
but since you didn't provide sample input/output we could test with, that's just an untested guess.
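The same "advance to the next name when a run of matches ends" idea can be sketched in Python; the names and lines below are made up for illustration:

```python
# Scan lines in order; once the current name stops matching after having
# matched at least once, switch to the next name without restarting.
names = ["Dio Brando", "Erina Pendleton"]
lines = [
    "Dio Brando: phone number 1",
    "Dio Brando: phone number 2",
    "Someone Else: phone number 1",
    "Erina Pendleton: phone number 1",
]
i = 0          # index of the name we are currently looking for
found = False  # has the current name matched at least once?
hits = []
for line in lines:
    if i < len(names) and names[i] in line:
        hits.append(line)
        found = True
    elif found:        # the run of matches just ended:
        i += 1         # switch to the next name
        found = False
print(hits)
```

Note that, like the awk above, the line that ends a run is not re-checked against the new name.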


How to rename files by reducing the number value within filenames by the same amount for each file [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 7 days ago.
I have a list of files that are of this format:
"stuff that happened B05T300 today.png"
"stuff that happened B05T301 today.png"
"stuff that happened B05T302 today.png"
"stuff that happened B05T303 today.png"
I would like to rename the files in 2 ways.
1. B05 needs to become B01
2. T300 needs to become T001, T301 -> T002, etc.; essentially the Txxx number becomes xxx - 299.
Potentially also have the leading 0 chars removed immediately after the T, i.e. T1, T10.
Using Perl's rename:
$ rename -n 's/B05T/B01T/;s/(?<=B01T)(\d+)/sprintf "%03d", ($1 - 299)/e' *.png
rename(stuff that happened B05T300 today.png, stuff that happened B01T001 today.png)
rename(stuff that happened B05T301 today.png, stuff that happened B01T002 today.png)
rename(stuff that happened B05T302 today.png, stuff that happened B01T003 today.png)
rename(stuff that happened B05T303 today.png, stuff that happened B01T004 today.png)
Remove -n (aka dry-run) when the output looks satisfactory.
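If you also want the leading-zero variant (T1, T10), the arithmetic is the same; here is a rough Python sketch of both renamings (pure string logic, no files are touched, and the filenames are the question's examples):

```python
import re

def renumber(name, pad=True):
    """Rewrite 'B05Txxx' to 'B01T' + (xxx - 299), zero-padded if pad=True."""
    fmt = "%03d" if pad else "%d"
    return re.sub(r"B05T(\d+)",
                  lambda m: "B01T" + fmt % (int(m.group(1)) - 299),
                  name)

print(renumber("stuff that happened B05T300 today.png"))
# stuff that happened B01T001 today.png
print(renumber("stuff that happened B05T310 today.png", pad=False))
# stuff that happened B01T11 today.png
```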

Script that removes all occurrences of duplicated lines from a file + keeps the original order of lines (perl + python + lua) [closed]

Closed. This question needs debugging details. It is not currently accepting answers.
Closed 2 years ago.
As the title says, I need to make a script in Perl, one in Python and one in Lua that removes all occurrences of a duplicated line (it can even be a one-line command). For example, let's say the file has the following lines (I don't know exactly what the file has and need a generic command; this is just an example):
apple
orange
banana
banana
berry
cherry
orange
melon
The output should be like :
apple
berry
cherry
melon
Another thing to note is that I need the file to keep the same line order as it had at the beginning. I managed to pull off multiple commands using awk and sed, but I couldn't find anything related to removing all occurrences in Python / Lua / Perl.
In Perl, you'd keep a hash to record what you've already seen.
#!/usr/bin/perl
use strict;
use warnings;
my %seen;
while (<>) {
    print unless $seen{$_}++;
}
This reads from STDIN and writes to STDOUT, so you can use it as a Unix filter. If it's in a file called filter:
$ filter < input_data > filtered_data
Update: Ok, I misunderstood the requirement. You can't do this without iterating across the data twice. Here's a Perl solution.
#!/usr/bin/perl
use strict;
use warnings;
my @data;
my %count;

# Store the input data and also
# keep a count.
while (<>) {
    $count{$_}++;
    push @data, $_;
}

# Print the input data, but only
# records which only appear once.
print grep { $count{$_} == 1 } @data;
In Python you can just use the following script (note: this keeps the first occurrence of each duplicated line rather than removing every copy):
file = open("myfile.txt", "r")
no_duplicates = list(dict.fromkeys(file.readlines()))
readlines() returns an ordered list of the file content. Each line gives a list item.
dict.fromkeys(a) generates a dict whose keys are the items of the list a. It reads the list in order, and doesn't re-add already existing keys.
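For example, with the question's sample lines, dict.fromkeys keeps the first copy of each line in order:

```python
# dict.fromkeys preserves insertion order and silently drops repeated keys
a = ["apple", "orange", "banana", "banana",
     "berry", "cherry", "orange", "melon"]
print(list(dict.fromkeys(a)))
# ['apple', 'orange', 'banana', 'berry', 'cherry', 'melon']
```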
Algorithm of the code
walk through the data
++$count{$_} counts the occurrences of each word
store a word in the array @words only when seen for the first time (to preserve word order)
use the array @words to output a word only if its $count{$_} == 1
use strict;
use warnings;

my %count;
my @words;

for( <DATA> ) {
    ++$count{$_} == 1 && push @words, $_;
}

$count{$_} == 1 && print for @words;
__DATA__
apple
orange
banana
banana
berry
cherry
orange
melon
Output
apple
berry
cherry
melon
Sorry for that one... here is the corrected code in Python:
x = open("dupli", "r")
ar = x.readlines()
new = []
for i in ar:
    co = ar.count(i)
    if co == 1:
        new.append(i)
print(new)
The result will be:
['apple\n', 'berry\n', 'cherry\n', 'melon\n']
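The ar.count(i) call inside the loop makes that quadratic; the same result can be computed in two linear passes with collections.Counter (a sketch using the example data above):

```python
from collections import Counter

lines = ["apple", "orange", "banana", "banana",
         "berry", "cherry", "orange", "melon"]
counts = Counter(lines)                        # one pass to count
unique = [l for l in lines if counts[l] == 1]  # one pass to filter, order kept
print(unique)
# ['apple', 'berry', 'cherry', 'melon']
```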

Grep from specific file and print after matching strings [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
I want to get grep matching output from 2 list files, but still print the previous information.
Here are the example files and the wanted output.
Input_1
AAAAA
CCCCC
DDDDD
EEEEE
Input_2 (tab_delimited) (there might be some blank cells)
AAAAA:1 - -
- BBBBB:0.5 -
- - CCCCC:0.2
- DDDDD:0 -
Wanted output
AAAAA:1
CCCCC:0.2
DDDDD:0
I have been trying all sorts of grep options and couldn't figure it out. Other methods (other than using grep) are all welcome!
Thank you in advance!
grep -Eof <(awk '{print $0":[0-9.]+"}' <input1>) <input2>
I use awk to turn each line of input1 into a pattern (appending ":[0-9.]+") that can then be matched against input2.
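The same lookup can be sketched in Python, hard-coding the question's sample data in place of the two files:

```python
# Split each tab-delimited row into fields and keep the "NAME:value"
# fields whose NAME appears in the first list.
names = ["AAAAA", "CCCCC", "DDDDD", "EEEEE"]
rows = [
    "AAAAA:1\t-\t-",
    "-\tBBBBB:0.5\t-",
    "-\t-\tCCCCC:0.2",
    "-\tDDDDD:0\t-",
]
matches = []
for row in rows:
    for field in row.split("\t"):
        name, _, value = field.partition(":")
        if value and name in names:
            matches.append(field)
print(matches)
# ['AAAAA:1', 'CCCCC:0.2', 'DDDDD:0']
```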

make wrapped text one line using awk [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
I have a file with following records:
1 Error 19-03-23 02:02:26 LPU 6 : RX_PWR_L_ALARM of SFPF ALARM of
PIC1 is abnormal[OID:1.3.6.1.4.1.201
1.5.25.129.2.1.9,BasCode:67697]
2 Error 19-03-20 07:50:40 The air filter : Maybe it is not clean
ed as scheduled. Please clean it and
run the reset dustproof run-time comman
d[OID:1.3.6.1.4.1.2011.5.25.129.2.1.9,
BasCode:67995]
I want to output:
1 Error 19-03-23 02:02:26 LPU 6 : RX_PWR_L_ALARM of SFPF ALARM of PIC1 is abnormal[OID:1.3.6.1.4.1.2011.5.25.129.2.1.9,BasCode:67697]
2 Error 19-03-20 07:50:40 The air filter : Maybe it is not cleaned as scheduled. Please clean it and run the reset dustproof run-time command[OID:1.3.6.1.4.1.2011.5.25.129.2.1.9,BasCode:67995]
GNU sed:
sed -En '/^[[:digit:]]+[[:blank:]]+/{:l1;N;/]$/!{b l1};s/\n +//g;s/ +/ /g;s/ /\t/g;s/\t/ /5gp}' file
Output
1 Error 19-03-23 02:02:26 LPU 6 : RX_PWR_L_ALARM of SFPF ALARM of PIC1 is abnormal[OID:1.3.6.1.4.1.2011.5.25.129.2.1.9,BasCode:67697]
2 Error 19-03-20 07:50:40 The air filter : Maybe it is not cleaned as scheduled. Please clean it and run the reset dustproof run-time command[OID:1.3.6.1.4.1.2011.5.25.129.2.1.9,BasCode:67995]
Could you please try the following:
awk '
{
    if($0 ~ / +$/){
        sub(/ +$/," ")
    }
}
FNR==1{
    printf("%s",$0)
    next
}
{
    if($0 ~ /^ +/){
        check=1
        sub(/^ +/,"")
    }
    printf("%s%s",check==1?"":ORS,$0)
    check=""
}' Input_file
Following is the output I am getting:
1 Error 19-03-23 02:02:26 LPU 6 : RX_PWR_L_ALARM of SFPF ALARM of PIC1 is abnormal[OID:1.3.6.1.4.1.2011.5.25.129.2.1.9,BasCode:67697]
2 Error 19-03-20 07:50:40 The air filter : Maybe it is not cleaned as scheduled. Please clean it and run the reset dustproof run-time command[OID:1.3.6.1.4.1.2011.5.25.129.2.1.9,BasCode:67995]
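A rough Python equivalent of the unwrapping, assuming (like the sed above) that every new record starts with a digit and that continuation lines never do; the data below is made up and shortened:

```python
# Glue continuation lines onto the record that precedes them.
lines = [
    "1 Error 19-03-20 07:50:40 The air filter : Maybe it is not clean",
    "ed as scheduled[BasCode:67995]",
    "2 Error 19-03-23 02:02:26 Another alarm te",
    "xt[BasCode:67697]",
]
records = []
for line in lines:
    if line[:1].isdigit():       # a new record begins
        records.append(line)
    else:                        # continuation: append to the previous record
        records[-1] += line
print(records)
```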

print from match & process several input files

When you scrutinize my questions from the past weeks, you'll find I asked questions similar to this one. I had problems asking in the demanded format since I did not really know where my problems came from. E. Morton tells me not to use range expressions. Well, I do not know exactly what they are. I found in this forum many questions like mine with working answers.
Like: "How to print following line from a match" (e.g.)
But all the solutions I found stop working when I process more than one input file, and I need to process many.
I use this command:
gawk -f 1.awk print*.csv > new.txt
while 1.awk contains:
BEGIN {
    OFS = FS = ";"
    pattern = "row4"
}
go { print }
$0 ~ pattern { go = 1 }
input file 1 print1.csv contains:
row1;something;in;this;row;;;;;;;
row2;something;in;this;row;;;;;;;
row3;something;in;this;row;;;;;;;
row4;don't;need;to;match;the;whole;line,;
row5;something;in;this;row;;;;;;;
row6;something;in;this;row;;;;;;;
row7;something;in;this;row;;;;;;;
row8;something;in;this;row;;;;;;;
row9;something;in;this;row;;;;;;;
row10;something;in;this;row;;;;;;;
Input file 2 print2.csv contains the same just for illustration purpose.
The 1.awk (and several others ways I found in this forum to print from match) works for one file. Output:
row5;something;in;this;row;;;;;;;
row6;something;in;this;row;;;;;;;
row7;something;in;this;row;;;;;;;
row8;something;in;this;row;;;;;;;
row9;something;in;this;row;;;;;;;
row10;something;in;this;row;;;;;;;
BUT it does not work when I process more input files.
Each time I process more than one input file this way, the awk commands 'to print from match' seem to be ignored.
As said, I was told not to use range expressions. I do not know why, and maybe the problem is linked to the way I input several files?
Just reset your match indicator at the beginning of each file:
$ awk 'FNR==1{p=0} p; /row4/{p=1} ' file1 file2
row5;something;in;this;row;;;;;;;
row6;something;in;this;row;;;;;;;
row7;something;in;this;row;;;;;;;
row8;something;in;this;row;;;;;;;
row9;something;in;this;row;;;;;;;
row10;something;in;this;row;;;;;;;
row5;something;in;this;row;;;;;;;
row6;something;in;this;row;;;;;;;
row7;something;in;this;row;;;;;;;
row8;something;in;this;row;;;;;;;
row9;something;in;this;row;;;;;;;
row10;something;in;this;row;;;;;;;
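In Python terms the fix is the same: clear the flag at the start of each input. A rough sketch with made-up data, using lists of lines in place of files:

```python
# Print lines after the match, resetting the flag per "file"
# (the equivalent of awk's FNR==1{p=0}).
def after_match(files, pattern="row4"):
    out = []
    for lines in files:            # each element stands for one file's lines
        printing = False           # reset at the start of each file
        for line in lines:
            if printing:
                out.append(line)
            if pattern in line:
                printing = True
    return out

file1 = ["row3;x", "row4;don't;need", "row5;y", "row6;z"]
print(after_match([file1, file1]))
# ['row5;y', 'row6;z', 'row5;y', 'row6;z']
```

Without the per-file reset, a match in the first file would keep the flag set for all of the second file.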
UPDATE
From the comments:
is it possible to combine your awk with: "If $1=="row5" then write in $6="row5" and delete the value "row5" in $5"? In other words, to move the content "row5" in column 1, if found there, to a new column 6? I could do this with another awk but a combination into one would be nicer
... $1=="row5"{$6=$5; $5=""} ...
or, if you want to use another field instead of $5, replace $5 with the corresponding field number.