replace newline with end of line in sed - awk

What is the difference between the following two commands?
cat script | sed -n ';s/\W|\W/\n/g; s/\W>\W/\n/g;p' | sed -n 's/\W/\n/1;p ' | sed -n 'x;n;x;p'
cat script | sed -n -e ';s/\W|\W/\n/g; s/\W>\W/\n/g;p' -e 's/\W/\n/1;p ' -e 'x;n;x;p'
Using the l command to inspect the output, there is an apparent difference in how \n and the end of line ($) are treated: sed does not treat streams delimited by \n as separate lines, but each line still ends with its original end of line. Is there a way to insert an end of line without pipelining?
sample script
prog1 arg1 | prog2 opt1
prog3 arg3 > filename
sample output
prog1
prog2
prog3
filename
The script supports redirection (>), pipelining (|), and a limited set of executables.
The output is each executable name, or the redirection file name.
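Under those assumptions, one way to produce the sample output in a single pass, without chaining several sed invocations, is to let awk split each line on | and > and print the first word of every field (a sketch for this limited syntax, not a general shell parser):

```shell
awk -F'[|>]' '{
    # each field is one command (or a redirection target);
    # its first whitespace-separated word is what we want
    for (i = 1; i <= NF; i++) {
        split($i, w, " ")
        print w[1]
    }
}' script
```

For the sample script above this prints prog1, prog2, prog3, filename, one per line.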


sed cut grep linux command output

I have a string, and I want to cut all occurrences from the match until the first comma. Example:
[{"value":1,"btata":"15","Id":"17","","url":"","time":"222"{"value":1,"secId":"16","Id":"19","time":"20218 22status":""}
I want to get Id:17 Id:19
I have been able to get Id using sed -e 's/Id/_/g' -e 's/[^_]//g' -e 's/_/Id /g', but I couldn't match up to the comma.
You can do it with sed but it requires two expressions. Essentially you need to remove all '"' characters and then split the input on ',' by replacing them with '\n'. The second expression simply locates the lines beginning with Id, e.g.
sed 's/"//g;s/,/\n/g' | sed -n /^Id/p
Example Use/Output
$ echo '[{value:1,btata:15,Id:17,,url:,time:222{value:1,secId:16,Id:19,time:20218 22status:}' |
sed 's/"//g;s/,/\n/g' | sed -n /^Id/p
Id:17
Id:19
(Note: this all comes with the caveat that you should not process JSON with shell commands. Using a JSON-aware tool like jq is recommended -- though this doesn't appear to be valid JSON either.)
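For comparison, if the quoting in the original string can be relied on, grep -o can pull the Id pairs out directly (again a text-matching sketch, not a JSON parser):

```shell
echo '[{"value":1,"btata":"15","Id":"17","","url":"","time":"222"{"value":1,"secId":"16","Id":"19","time":"20218 22status":""}' |
grep -oE '"Id":"[0-9]+"' | sed 's/"//g'
```

The leading quote in the pattern keeps secId from matching.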

bash script variables weird results when using cp

If I use cp inside a bash script, the copied file will have weird characters around the destination filename.
The destination name comes from the results of an operation, it's put inside a variable, and echoing the variable shows normal output.
The objective is to name a file after a string.
#!/bin/bash
newname=`cat outputfile | grep 'hostname ' | sed 's/hostname //g'`
newecho=`echo $newname`
echo $newecho
cp outputfile "$newecho"
If I launch the script the echo looks ok
$ ./rename.sh
mo-swc-56001
However the file is named differently
~$ ls
'mo-swc-56001'$'\r'
As you can see, the file name contains extra characters which the echo does not show.
Edit: the line endings of the file are like this:
# file outputfile
outputfile: ASCII text, with CRLF, CR line terminators
I tried in every possible way to get rid of the ^M character, but this is an example of the hundreds of attempts:
# cat outputfile | grep 'hostname ' | sed 's/hostname //g' | cat -v
mo-swc-56001^M
# cat outputfile | grep 'hostname ' | sed 's/hostname //g' | cat -v | sed 's/\r//g' | cat -v
mo-swc-56001^M
This carriage return will stay there. Any ideas?
Edit: crazy, the only way is to perform a dos2unix on the output...
Looks like your outputfile has \r characters in it, so you could add logic to remove them first and then give it a try.
#!/bin/bash
## Remove the control-M (carriage return) characters from outputfile with the tr command first.
tr -d '\r' < outputfile > temp && mv temp outputfile
newname=$(sed 's/hostname //g' outputfile)
newecho=`echo $newname`
echo $newecho
cp outputfile "$newecho"
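If rewriting outputfile is not an option, the carriage return can instead be stripped from the variable itself with bash parameter expansion (a variant of the same fix):

```shell
#!/bin/bash
newname=$(grep 'hostname ' outputfile | sed 's/hostname //g')
newname=${newname//$'\r'/}   # drop any carriage returns left over from CRLF line endings
cp outputfile "$newname"
```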
The only way was to use dos2unix

How to extract the final word of a sentence

For a given text file I'd like to extract the final word in every sentence to a space-delimited text file. It would be acceptable to have a few errors for words like Mr. and Dr., so I don't need to try to achieve that level of precision.
I was thinking I could do this with Sed and Awk, but it's been too long since I've worked with them and I don't remember where to begin. Help?
(Output example: For the previous two paragraphs, I'd like to see this):
file Mr Dr precision begin Help
Using this regex:
([[:alpha:]]+)[.!?]
Grep can do this:
$ echo "$txt" | grep -o -E '([[:alpha:]]+)[.!?]'
file.
Mr.
Dr.
precision.
begin.
Help?
Then if you want only the words, a second time through:
$ echo "$txt" | grep -o -E '([[:alpha:]]+)[.!?]' | grep -o -E '[[:alpha:]]+'
file
Mr
Dr
precision
begin
Help
In awk, same regex:
$ echo "$txt" | awk '/[[:alpha:]]+[.!?]/{for(i=1;i<=NF;i++) if($i~/[[:alpha:]]+[.!?]/) print $i}'
Perl, same regex, allows capture groups and maybe a little more direct syntax:
$ echo "$txt" | perl -ne 'print "$1 " while /([[:alpha:]]+)[.!?]/g'
file Mr Dr precision begin Help
And with Perl, it is easier to refine the regex to be more discriminating about the words captured:
echo "$txt" | perl -ne 'print "$1 " while /([[:alpha:]]+)(?=[.!?](?:(?:\s+[[:upper:]])|(?:\s*\z)))/g'
file precision begin Help
gawk:
$ gawk -v ORS=' ' -v RS='[.?!]' '{print $NF}' w.txt
file Mr Dr precision begin Help
(Note that plain awk does not support assigning a regular expression to RS.)
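As a portable workaround under that limitation, the sentence terminators can be turned into newlines with tr before a plain awk pass (a sketch that, as the question allows, makes no attempt to handle Mr. and Dr.):

```shell
# convert each of . ? ! to a newline, then print the last field of every
# non-empty line, i.e. the last word of each sentence
tr '.?!' '\n\n\n' < w.txt | awk 'NF { printf "%s ", $NF }'
```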
This might work for you (GNU sed):
sed -r 's/^[^.?!]*\b(\w+)[.?!]/\1\n/;/\n/!d;P;D' file
This prints one word per line; for a single line, pipe through paste, like so:
sed -r 's/^[^.?!]*\b(\w+)[.?!]/\1\n/;/\n/!d;P;D' file | paste -sd' '
For another solution just using sed:
sed -r 'H;$!d;x;s/\n//g;s/\b(\w+)[.?!]/\n\1\n/g;/\n/!d;s/[^\n]*\n([^\n]*)\n/ \1/g;s/.//' file
Easy in Perl:
perl -ne 'print "$1 " while /(\w+)[.!?]/g'
-n reads the input line by line.
\w matches a "word character".
\w+ matches one or more word characters.
[.!?] matches any of the sentence-end markers.
/g stands for "globally" - it remembers where the last match occurred and tries to match after it.

Trying to use variable in sed or awk

I have 2 separate text files, each in the same exact format. I can grep FILE1.txt for a specific search term and output the line numbers of every match. The line numbers are outputted in numeric order to a file or a variable.
I want to use each line number to print that line from FILE2.txt, in numeric order, to a single OUTPUT.txt. Does anyone know a way to do this using awk or sed?
I have a string variable $linenumbers with values of 25 26 27 28.
I use the following command:
for i in $linenumbers; do sed -n "/$I/p" $i test_read2.fastq >> test.fastq; done
I get errors of
sed: can't read 25: No such file or directory
sed: can't read 26: No such file or directory
sed: can't read 27: No such file or directory
sed: can't read 28: No such file or directory
If I do this sed command one by one, I can pull line number 25, 26, 27 and 28 from the file and print it to file using the following command;
sed -n "25p" test_read2.fastq >> test.fastq
I want to replace "25p" with variable so it pulls out multiple lines (25,26,27,28) from the file without doing this one by one...
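The errors happen because the loop passes $i to sed in a file-name position, so sed tries to open files named 25, 26, and so on. Moving the line number inside the sed expression fixes it:

```shell
for i in $linenumbers; do
    sed -n "${i}p" test_read2.fastq   # print only line number $i
done >> test.fastq
```

With GNU sed, the loop can also be avoided by building a single expression, e.g. sed -n "$(printf '%sp;' $linenumbers)" test_read2.fastq.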
Try this:
grep -n interesting FILE1.txt | cut -d: -f1 | while read l
do
sed -n "$l p" FILE2.txt
done
Example:
$ cat FILE1.txt
foo
bar
baz
$ cat FILE2.txt
qux
quux
quuux
$ grep -n bar FILE1.txt | cut -d: -f1 | while read l; do sed -n "$l p" FILE2.txt; done
quux
Not sure what exactly you want to do. If you want to print the lines of file whose numbers are listed in lines, you could do:
awk 'NR==FNR{a[$0];next}FNR in a' lines file
test:
$ cat lines
1
3
7
$ cat file
a
b
c
d
e
f
g
$ awk 'NR==FNR{a[$0];next}FNR in a' lines file
a
c
g
sed -n "`grep -n 'YourPattern' File1.txt | sed 's/:.*/p/'`" File2.txt
Be careful with substitutions and (double) quotes in YourPattern.

How to remove comments from a file using "grep"?

I have an SQL file from which I need to remove all the comments:
-- Sql comment line
How can I achieve this in Linux using grep or another tool?
The grep tool has a -v option which reverses the sense of the filter. For example:
grep -v pax people
will give you all lines in the people file that don't contain pax.
An example is:
grep -v '^ *-- ' oldfile >newfile
which gets rid of lines with only white space preceding a comment marker. It won't however change lines like:
select blah blah -- comment here.
If you wanted to do that, you would use something like sed:
sed -e 's/ --.*$//' oldfile >newfile
which edits each line removing any characters from " --" to the end of the line.
Keep in mind you need to be careful with finding the string " --" in real SQL statements like (the contrived):
select ' -- ' || colm from blah blah blah
If you have these, you're better off creating/using an SQL parser rather than a simple text modification tool.
A transcript of the sed in operation:
pax$ echo '
...> this is a line with -- on it.
...> this is not
...> and -- this is again' | sed -e 's/ --.*$//'
this is a line with
this is not
and
For the grep:
pax$ echo '
-- this line starts with it.
this line does not
and -- this one is not at the start' | grep -v '^ *-- '
this line does not
and -- this one is not at the start
You can use sed as sed -i '/--/d' filename, but note that this deletes every line containing --, not just the comment text.
Try using sed in the shell:
sed -e "s/--.*//" sql.filename