I have curl output similar to the lines below; I'm working on a sed/awk script to eliminate the unwanted strings.
File
{id":"54bef907-d17e-4633-88be-49fa738b092d","name":"AA","description","name":"AAxxxxxx","enabled":true}
{id":"20000000000000000000000000000000","name":"BB","description","name":"BBxxxxxx","enabled":true}
{id":"542ndf07-d19e-2233-87gf-49fa738b092d","name":"AA","description","name":"CCxxxxxx","enabled":true}
{id":"20000000000000000000000000000000","name":"BB","description","name":"DDxxxxxx","enabled":true}
......
I'd like to modify this file so that only something like the following remains:
AA AAxxxxxx
BB BBxxxxxx
AA CCxxxxxx
BB DDxxxxxx
AA n.....
BB n.....
Is there a way I could remove the words/commas/colons in between so that only these values remain?
Try this awk:
curl your_command | awk -F\" '{print $(NF-9),$(NF-3)}'
Or:
curl your_command | awk -F\" '{print $7,$13}'
A semantic approach using perl:
curl your_command | perl -lane '/"name":"(\w+)".*"name":"(\w+)"/;print $1." ".$2'
For any number of name occurrences:
curl your_command | perl -lane 'printf $_." " for ( $_ =~ /"name":"(\w+)"/g);print ""'
This might work for you (GNU sed):
sed -r 's/.*("name":")([^"]*)".*\1([^"]*)".*/\2 \3/p;d' file
This extracts the fields following the two name keys and prints them if successful.
Alternatively, using simple pattern matching:
sed -r 's/.*:.*:"([^"]*)".*:"([^"]*)".*:.*/\1 \2/p;d' file
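Run against the sample file, either sed should print:
AA AAxxxxxx
BB BBxxxxxx
AA CCxxxxxx
BB DDxxxxxx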
In this particular case, you could do
awk -F ":|," '{print $4,$7}' file2 |tr -d '"'
and get
AA AAxxxxxx
BB BBxxxxxx
AA CCxxxxxx
BB DDxxxxxx
Here, the field separator is either : or , ; we print the fourth and seventh fields (all lines carry the wanted values in those two positions), and finally we use tr to delete the " since you don't want it.
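If you'd rather skip the extra tr, the quote-stripping can be done inside awk; a sketch of the same idea:
awk -F ':|,' '{gsub(/"/,""); print $4,$7}' file
The gsub rewrites $0, which makes awk re-split the line, and the wanted values still land in fields 4 and 7.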
I have a file containing lines with TABLE schema.table, and I want to put strings around the table name to make a command like MARK string REJECT.
The file contains many lines like
TABLE SCHEMA.MYTAB, etc. etc....
or
TABLE SCHEMA.MYTAB , etc. etc....
The desired result is
MARK SCHEMA.MYTAB REJECT
..etc
I have
grep TABLE dirx/myfile.txt | awk -F, '{print $1}' | awk '{print $2}' | sed -e 's/^/MARK /' | sed -e 's/$/ REJECT/'
It works, but can this be tidier? I think I can combine the awk and sed invocations into a single command, but I'm not sure how.
Maybe:
awk '/^TABLE/ {gsub(/,.*$/, ""); print "MARK " $2 " REJECT"}' dirx/myfile.txt
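Given the two input variants shown above, this should print one MARK line per matching line:
$ awk '/^TABLE/ {gsub(/,.*$/, ""); print "MARK " $2 " REJECT"}' dirx/myfile.txt
MARK SCHEMA.MYTAB REJECT
MARK SCHEMA.MYTAB REJECT
The gsub strips everything from the first comma onward, so $2 is the bare SCHEMA.MYTAB whether or not a space precedes the comma.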
Input:
line1 a gh
line2 a dd
line3 c dd
line4 a gg
line5 b ef
Desired output:
line3 c dd
line5 b ef
That is, I want to output a line only if no other line has the same value in column 2. I thought I could do this with a combination of sort (e.g. sort -k2,2 input) and uniq, but it appears that with uniq I can only skip fields from the left (-f avoids comparing the first N fields). Surely there's some straightforward way to do this with awk or something.
You can do this as a two-pass awk script:
awk 'NR==FNR{a[$2]++;next} a[$2]<2' file file
This runs through the file once incrementing a counter in an array whose key is the second field of each line, then runs through a second time printing only those lines whose counter is less than 2.
You'd need multiple reads of the file because at any point during the first read, you can't possibly know whether there will be another instance of the second field of that line later in the file.
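With the sample input, this should give:
$ awk 'NR==FNR{a[$2]++;next} a[$2]<2' file file
line3 c dd
line5 b ef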
Here is a one-pass awk solution:
awk '{a1[$2]++;a2[$2]=$0} END{for (a in a1) if (a1[a]==1) print a2[a]}' file
The original order of the file will be lost, however.
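If the order matters, a sketch of the same idea that also records each line's position:
awk '{a1[$2]++; a2[NR]=$0; k[NR]=$2} END{for (i=1; i<=NR; i++) if (a1[k[i]]==1) print a2[i]}' file
This should print line3 and line5 in their original relative order.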
You can combine awk, grep, sort and uniq for a quick one-liner:
grep -v "^[^ ]* $(awk '{print $2}' input.txt | sort | uniq -d) " input.txt
Edit, to escape regex metacharacters in the extracted values (so they cannot act as \+, backreferences, etc.):
grep -v "^[^ ]* $(awk '{print $2}' input.txt | sort | uniq -d | sed 's/[^+0-9]/\\&/g') " input.txt
An alternative to awk, to demonstrate that it can still be done with sort and uniq (there is the -u option for this); however, setting up the right format requires some juggling (the decorate/do stuff/undecorate pattern).
$ paste file <(cut -d' ' -f2 file) | sort -k2 | uniq -uf3 | cut -f1
line5 b ef
line3 c dd
As a side effect you lose the original order, which can be recovered as well if you add line numbers.
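For example, a sketch where cat -n supplies the line numbers and a final numeric sort restores file order:
$ paste <(cat -n file) <(cut -d' ' -f2 file) | sort -k3 | uniq -uf4 | sort -k1,1n | cut -f2
line3 c dd
line5 b ef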
cat test.txt
serverabc.test.net
serverabc.qa.net
serverabc01.test.net
serverstag.staging.net
serverstag.test.net
Here I need to match the strings that are duplicated just before the first '.' delimiter.
So the expected output would be as below, because the strings "serverabc" and "serverstag" are found to be duplicates. Please help.
serverabc.test.net
serverabc.qa.net
serverstag.staging.net
serverstag.test.net
awk to the rescue!
$ awk -F\. '{c[$1]++; a[$1]=a[$1]?a[$1]RS$0:$0}
END{for(k in c) if(c[k]>1) print a[k]}' file
serverabc.test.net
serverabc.qa.net
serverstag.staging.net
serverstag.test.net
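The same program spread out with comments (awk accepts # comments inside a script):
awk -F\. '
  { c[$1]++                            # count lines per prefix (text before the first dot)
    a[$1] = a[$1] ? a[$1] RS $0 : $0   # collect the full lines belonging to each prefix
  }
  END { for (k in c) if (c[k] > 1) print a[k] }   # print only prefixes seen more than once
' file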
If it is not going to be used a lot, I would probably just do something like this:
cut -f1 -d\. foo.txt | sort | uniq -c | grep -v " 1 " | cut -c 9- | sed 's/\(.*\)/^\1\\./' > dup.host
grep -f dup.host foo.txt
serverabc.test.net
serverabc.qa.net
serverstag.staging.net
serverstag.test.net
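The same pipeline annotated step by step (the shell allows comments after a pipe):
cut -f1 -d\. foo.txt |                 # keep the part before the first dot
  sort | uniq -c |                     # count how often each prefix occurs
  grep -v " 1 " |                      # drop prefixes that occur exactly once
  cut -c 9- |                          # strip the count column uniq -c prepends
  sed 's/\(.*\)/^\1\\./' > dup.host    # anchor each prefix as ^prefix\.
grep -f dup.host foo.txt               # keep the lines whose prefix is duplicated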
I am trying to do something which I guess could be done very easily, but I can't seem to find the answer. I want to use awk to pick out lines between two patterns, but I also want the pattern to match two consecutive lines. I have tried to find the solution on the Internet, but perhaps I did not search for the right keywords. An example will describe this better.
Suppose I have the following file called test:
aaaa
bbbb
SOME CONTENT 1
ddddd
fffff
aaaa
cccc
SOME CONTENT 2
ccccc
fffff
For example, let's say I would like to find "SOME CONTENT 1".
Then I would use awk like this:
cat test | awk '/aaa*/ { show=1} show; /fff*/ {show=0}'
But that is not what I want. I want somehow to enter the pattern:
aaaa*\nbbbb*
And the same for the end pattern. Any suggestions on how to do this?
You can use this:
awk '/aaa*/ {f=1} /bbb*/ && f {show=1} show; /fff*/ {show=f=0}' file
bbbb
SOME CONTENT 1
ddddd
fffff
If pattern1 (aaa*) matches, set flag f
If pattern2 (bbb*) matches and flag f is set, then set the show flag
If you also need to print the pattern1 line, the aaa* one:
awk '/aaa*/ {f=$0} /bbb*/ && f {show=1;$0=f RS $0} show; /fff*/ {show=f=0}' file
aaaa
bbbb
SOME CONTENT 1
ddddd
fffff
If every record ends with fffff, and GNU awk is available, you could do something like this:
$ awk '/aaa*\nbbbb*/' RS='fffff' file
aaaa
bbbb
SOME CONTENT 1
ddddd
Or if you want just SOME CONTENT 1 to be visible, you can do:
$ awk -F $'\n' '/aaa*\nbbbb*/{print $3}' RS='fffff' file
SOME CONTENT 1
I searched for the two patterns and checked that they were consecutive using line numbers; having the line numbers lets sed insert a line between them, i.e. right after the first line/pattern.
# emit "line2-1 line1" pairs from the line numbers of the two patterns
awk '$0 ~ "Encryption" {print NR} $0 ~ "Bit Rates:1" {print NR}' /tmp/mainscan |
  while read line1; do read line2; echo "$((line2 - 1)) $line1"; done > /tmp/this

while read line
do
  pato=$(echo "$line" | cut -f1 -d' ')
  patt=$(echo "$line" | cut -f2 -d' ')
  if [[ "$pato" = "$patt" ]]; then    # the two matches sit on consecutive lines
    inspat=$((patt + 1))
    # insert the placeholder between them, then rewrite it as ""
    # (note: each insertion shifts the line numbers below it, so with many
    # matches you would want to process /tmp/this bottom-up, e.g. via sort -rn)
    sed -i "${inspat}iESSID:##" /tmp/mainscan
    sed -i 's/##/""/g' /tmp/mainscan
  fi
done < /tmp/this
I have a document containing several lines of text.
Example (not actual):
*Prepare 42 Locked delete from table where type='test' and user_id='099'and number='+66719919*
I want to be able to search for user_id wherever it occurs in the document (which does not follow a pattern) and have the output as:
user_id=099
OR
099
How do I achieve this using awk, please?
Thanks.
awk '/user_id/{for(i=1;i<=NF;i++){if($i~/user_id/){split($i,a,"=");print a[2]}}}' your_file
tested:
> echo "*Prepare type='test' and user_id='099' and number='+66719919*" | awk '/user_id/{for(i=1;i<=NF;i++){if($i~/user_id/){split($i,a,"=");print a[2]}}}'
'099'
another one:
> echo "*Prepare type='test' and user_id='099' and number='+66719919*" | awk '/user_id/{for(i=1;i<=NF;i++){if($i~/user_id/){ print $i}}}'
user_id='099'
You could also use grep:
grep -o "user_id='\?[0-9]*'\?"
Append tr to remove the quotes:
grep -o "user_id='\?[0-9]*'\?" | tr -d \'