I have a folder with more than 100 .gz files. I need the output in the form:
file name : count
E.g.:
abc.gz : 123456
cde.gz : 123456
test.gz : 456896
To count the lines of each file in the current directory, you can do:
wc -l *
The above will generate a warning for any subdirectories present. To avoid those warnings, use find instead:
find . -maxdepth 1 -type f -exec wc -l {} +
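When given multiple files, wc also prints a total line, so the output looks like this (file names and counts here are hypothetical):

$ find . -maxdepth 1 -type f -exec wc -l {} +
  10 ./a.txt
  25 ./b.txt
  35 total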
Of course, the above counts the lines in the files as they are. Since your files are compressed and you need the line counts of the uncompressed contents, you can use the following script:
#!/bin/bash
for i in *.gz
do
    echo "$i : $(zcat -- "$i" | wc -l)"
done
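One caveat: on some systems (macOS, for example) zcat only handles .Z files. If that is the case, gzip -cd is a drop-in replacement for zcat in the loop above:

echo "$i : $(gzip -cd "$i" | wc -l)"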
E.g.
abc_def_ghi_xyz
uvw_mno_gab_xyz
bac_cab_lmn_xyz
should be replaced with
ABC_xyz
ABC_xyz
ABC_xyz
How can this be done using awk, sed, or Vim's :%s command?
Using %s:
:%s/.*_xyz/ABC_xyz/
I would also suggest looking at :h :substitute for more about Vim search and replace.
Using awk you can do:
awk -F_ '{print "ABC_"$NF}' file
Where:
-F_ .............. _ as field separator
"ABC_" ............ literal ABC_
$NF ............... last field
Using Vim (\ze marks the end of the match, so only the text before _xyz is replaced):
:%s/.*\ze_xyz/ABC
Using sed:
sed -r 's/.*(_xyz)/ABC\1/g' file
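Any of the above produces the requested output for the sample input:

ABC_xyz
ABC_xyz
ABC_xyz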
I am searching for the string ABCDEFGH in a very large file like the one below, and I don't know at which position a newline may split it. My first thought was to remove all \n characters, but the file is over 3 GB... I think there is a smarter way to do this (sed, awk, ...?)
efhashivia
hjasjdjasd
oqkfoflABC
DEFGHqpclq
pasdpapsda
Assuming that your search string cannot span more than 2 lines, you can use this awk, which keeps the previous line in s and tests each pair of consecutive lines joined together:
awk -v p="ABCDEFGH" '(s $0) ~ p {print NR, s $0} {s = $0}' file
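For the sample input this prints the number of the line where the match ends, followed by the joined pair:

4 oqkfoflABCDEFGHqpclq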
Or you can paste each line with the next one and grep the result. Note that this creates an intermediate file roughly double the size of your large input.
tail -n +2 file | paste -d '\0' file - > output.txt
$ cat output.txt
efhashiviahjasjdjasd
hjasjdjasdoqkfoflABC
oqkfoflABCDEFGHqpclq
DEFGHqpclqpasdpapsda
pasdpapsda
$ grep -n ABCDEFGH output.txt
3:oqkfoflABCDEFGHqpclq
I have a large file with 38000 rows and 5001 columns. The first column is position information and the rest are signals. I also have another file that contains pairs of positions which also exist in the large file. I need to split the large file into multiple small files, where each file contains only the rows whose positions fall within a certain range.
I know this is almost a duplicate question, and I have tried everything that has been suggested before; it's not working, which is why I'm posting my code here. Here's what I've tried with awk.
The file that contains the pairs of ranges is named after the lowest and highest values. For example, a range file might be named blah_blah_30000_40000.txt. It contains pairs of values spaced 500 apart, such as:
30000 30000
30000 30500
30000 31000
.
.
.
40000 30000
40000 30500
.
.
.
40000 40000
First I extracted the lowest and highest value from the file name.
IFS='_' read -a splittedName <<< "${fileName}"
startRange=${splittedName[2]}
endRange=${splittedName[3]}
Now, to make these two strings into numbers:
starting=$((startRange + 0))
ending=$((endRange + 0))
Then I used awk like so:
awk -F, '{ if($1 >= "$startRange" && $1 <= "$endRange") { print >"test.txt"} }' $InputFile
Could anyone tell me what I'm doing wrong?
You should rewrite your command this way:
awk -F, -v start="$startRange" -v end="$endRange" -v fname="$fileName" \
    '$1 >= start && $1 <= end { print > (fname ".txt") }' "$InputFile"
As mentioned in the comments, you can't expand shell variables inside a single-quoted awk script; pass them in with -v instead.
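If the goal is one output file per range pair, here is a minimal sketch (assuming the range file is whitespace-separated, the data file is comma-separated, and the range_<lo>_<hi>.txt output names are placeholders):

while read -r lo hi; do
    awk -F, -v lo="$lo" -v hi="$hi" \
        '$1 >= lo && $1 <= hi { print > ("range_" lo "_" hi ".txt") }' "$InputFile"
done < "$fileName"

This makes one pass over the data per range; with many ranges, a single awk pass that loads the range file first would be faster.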
Remove all lines that don't contain a letter from the alphabet (upper or lower case)
Input:
34
76
0hjjAby68xp
H5e
895
Output:
0hjjAby68xp
H5e
With GNU grep:
grep '[[:alpha:]]' file
or GNU sed:
sed '/[[:alpha:]]/!d' file
Output:
0hjjAby68xp
H5e
Using awk:
$ awk '/[[:alpha:]]/' file
0hjjAby68xp
H5e
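The inverse, keeping only the lines without any letters, is the negated match:

grep -v '[[:alpha:]]' file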
I am trying to write a script that greps a range from a file based on text markers, not line numbers.
For instance, in my text file I want to start capturing at a line matching $hostname, take everything up to a line matching $endText, and write the data between those markers to a file named $hostname.txt.
Since you didn't provide any details, here is the boilerplate:
$ sed -n '/start/,/end/p' file > outputfile
or
$ awk '/start/,/end/' file > outputfile
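To plug in the variables from the question (a sketch assuming $hostname and $endText contain no regex metacharacters or slashes):

$ sed -n "/$hostname/,/$endText/p" file > "$hostname.txt"

or

$ awk -v start="$hostname" -v end="$endText" '$0 ~ start, $0 ~ end' file > "$hostname.txt"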