How do I correctly retrieve, using bash's cut, the first field from a line with only 1 field in a text file?

In a text file (accounts.txt) with (financial) accounts, the sub-accounts are, and need to be, separated by underscores, like this:
assets
assets_hh
assets_hh_reimbursements
assets_hh_reimbursements_ff
... etc.
Now I want to get specific sub-accounts from specific line numbers, e.g.:
field 3 from line 4:
$ lnr=4; fnr=3
$ cut -d $'\n' -f "$lnr" < accounts.txt | cut -d _ -f "$fnr"
reimbursements
$
But both fnr=1 and fnr=2 give the same result for the first line, which has only 1 field:
$ cut -d $'\n' -f 1 < accounts.txt | cut -d _ -f "$fnr"
assets
$
which is undesired behaviour.
Now I can get around this by prefixing an underscore to each account and adding 1 to each required field number, but this is not an elegant solution.
Am I doing something wrong and/or can this be changed by issuing a different retrieval command?

Using cut -d $'\n' -f "$lnr" to get the lnr-th line from the file is somewhat unusual. A more common approach is to use sed, like:
sed -n "${lnr}p" file | cmd ...
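For example, plugged into the OP's pipeline (a sketch; cut's -s option additionally suppresses lines that contain no delimiter at all, which covers the one-field case):
lnr=4; fnr=3
sed -n "${lnr}p" accounts.txt | cut -s -d _ -f "$fnr"   # prints: reimbursements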
However, awk is a better fit here: a single invocation can handle both the lnr and the fnr.
file=accounts.txt
lnr=1
fnr=2
awk -F_ -v l=$lnr -v f=$fnr 'NR==l{print $f}' "$file"
The above, for all combinations of lnr and fnr, produces:
line                          field1  field2  field3          field4
---------------------------------------------------------------------
assets                        assets
assets_hh                     assets  hh
assets_hh_reimbursements      assets  hh      reimbursements
assets_hh_reimbursements_ff   assets  hh      reimbursements  ff
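If this lookup is needed repeatedly, the one-liner can also be wrapped in a small shell function (a sketch; the name get_field is made up):
get_field() { awk -F_ -v l="$2" -v f="$3" 'NR==l{print $f; exit}' "$1"; }
get_field accounts.txt 4 3   # prints: reimbursements
The exit stops awk from reading the rest of the file once the requested line has been printed.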

Check the solution below:
cat f
assets
assets_hh
assets_hh_reimbursements
assets_hh_reimbursements_ff
Based on your comment, try the commands below:
$ lnr=1; fnr=2
$ echo $lnr $fnr
1 2
$ awk -v lnr=$lnr -v fnr=$fnr -F'_' 'NR==lnr {print $fnr}' f
###Output is nothing as line 1 column 2 is blank when FS="_"
$ lnr=4;fnr=1
$ echo $lnr $fnr
4 1
$ awk -v lnr=$lnr -v fnr=$fnr -F'_' 'NR==lnr {print $fnr}' f
assets
$ lnr=4;fnr=3
$ echo $lnr $fnr
4 3
$ awk -v lnr=$lnr -v fnr=$fnr -F'_' 'NR==lnr {print $fnr}' f
reimbursements

One solution is to head|tail and read into an array so it's easier to work with the items:
lnr=4
fnr=2
IFS=_ read -r -a arr < <(head -n "$lnr" accounts.txt | tail -n 1)
# note that the array is 0-indexed, so the field number has to fit that
echo "${arr[$fnr]}"
Then you could expand the idea into a more usable function:
get_field_from_file() {
    local fname="$1"
    local lnr="$2"
    local fnr="$3"
    IFS=_ read -r -a arr < <(head -n "$lnr" "$fname" | tail -n 1)
    # arr is 0-indexed, so a valid fnr must be smaller than the element count
    if (( fnr >= ${#arr[@]} )); then
        return 1
    else
        echo "${arr[$fnr]}"
    fi
}
field=$(get_field_from_file "accounts.txt" "4" "2") || echo "no such line or field"
[[ -n $field ]] && echo "field: $field"

Related

Sed and count awk

I need to split a text file into sliding windows and count "0/0" in each segment.
For example, if I have a file with 20 lines and the window size is 10, the commands are as follows:
sed -n '1,11p' input.txt |grep -c "0/0"
sed -n '2,12p' input.txt |grep -c "0/0"
sed -n '3,13p' input.txt |grep -c "0/0"
.
.
.
sed -n '8,18p' input.txt |grep -c "0/0"
sed -n '9,19p' input.txt |grep -c "0/0"
But if I have a large file this method won't help me do the same. Is there any way to automate this?
awk -v k=11 -v str="0/0" '{
cnt += found[NR%k] = index($0,str)>=1;
}
NR>=k {
print 1+NR-k "-" NR, cnt+0;
cnt -= found[(NR+1)%k];
}' file
Here k is the window size. The output prints a line-number range and how many of those lines contained the string str (matched with index to avoid regex interpretation).
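For comparison, the brute-force approach the OP sketched can be automated with a plain shell loop, but it rereads the file once per window, which the single-pass awk above avoids (a sketch, assuming the OP's input.txt and the 11-line ranges shown):
k=11
total=$(wc -l < input.txt)
for ((start=1; start+k-1<=total; start++)); do
    end=$((start+k-1))
    printf '%d-%d %d\n' "$start" "$end" "$(sed -n "${start},${end}p" input.txt | grep -c '0/0')"
done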

sed replace text between comma

I have CSV files that need f changed to 0 and t changed to 1, but only between commas, in every single CSV where it matches. From:
,t,t,f,f,a,t,f,t,f,f,t,f,
tftf
to:
,1,1,0,0,a,1,0,1,0,0,1,0,
tftf
It works this way, but I want to know a better way that could reduce the time the replacement takes:
for i in 1 2 3 4 5 6
do
echo "converting tables for mariaDB"
find ./ -type f -name "*.csv" -print0 | xargs -0 sed -i 's/\,t\,/\,1\,/g'
find ./ -type f -name "*.csv" -print0 | xargs -0 sed -i 's/\,f\,/\,0\,/g'
echo "$i time(s) changed "
done
I expect that one single command could do the change.
Could you please try the following. It is not a perfect solution, but it is the simplest; use it in case you don't have the latest gawk, where the -i inplace edit option is present.
for file in *.csv
do
  # the gsub calls are repeated to catch matches skipped by overlapping hits such as ,t,t,
  awk '{gsub(/,t,/,",1,");gsub(/,f,/,",0,");gsub(/,t,/,",1,");gsub(/,f,/,",0,")} 1' "$file" > temp && mv temp "$file"
done
OR
for file in *.csv
do
  awk -v t_val="1" -v f_val="0" 'BEGIN{FS=OFS=","}{for(i=2;i<NF;i++){$i=($i=="t"?t_val:$i=="f"?f_val:$i)}} 1' "$file" > temp && mv temp "$file"
done
2nd solution: using the latest gawk, where we can save the edit into Input_file itself.
gawk -i inplace '{gsub(/,t,/,",1,");gsub(/,f,/,",0,");gsub(/,t,/,",1,");gsub(/,f,/,",0,")} 1' *.csv
OR
gawk -i inplace -v t_val="1" -v f_val="0" 'BEGIN{FS=OFS=","}{for(i=2;i<NF;i++){$i=($i=="t"?t_val:$i=="f"?f_val:$i)}} 1' Input_file
The main problem, in this case, is that a regular expression does not allow overlap when parsing it with sed 's/ere/str/g' or awk '{gsub(ere,str,$0)}'. This comment nicely explains how you can circumvent this in sed using the t<label> command, which means: if a change happened to the pattern space, move to <label>. The comment shows a generic way of doing it. The awk alternative to this rule would be:
$ awk '{while(match($0,ere)) gsub(ere,str)}'
An alternative sed solution for the OP's example could use the following idea:
1. Duplicate all commas. Since we are searching for strings of the form ",t,", this duplication avoids overlap between matches.
2. Since no overlap is possible any more, replace all ",f," with ",0," and all ",t," with ",1,".
3. Revert all duplicated commas again. As no overlap is allowed, a sequence like ,,,, is nicely converted to ,, and not to ,.
In POSIX sed this looks like:
$ sed -e 's/,/,,/g' -e 's/,f,/,0,/g' \
-e 's/,t,/,1,/g' -e 's/,,/,/g' file > file.tmp
$ mv file.tmp file
With GNU sed we can do it in one go:
$ sed -i 's/,/,,/g;s/,f,/,0,/g;s/,t,/,1,/g;s/,,/,/g' file
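As a quick check on the OP's sample line (using echo in place of a file):
$ echo ',t,t,f,f,a,t,f,t,f,f,t,f,' | sed 's/,/,,/g;s/,f,/,0,/g;s/,t,/,1,/g;s/,,/,/g'
,1,1,0,0,a,1,0,1,0,0,1,0,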
With awk, this would look like:
$ awk 'BEGIN{FS=",";OFS=FS FS}
{$1=$1;gsub(/,f,/,",0,");gsub(/,t,/,",1,");gsub(OFS,FS)}1' file > file.tmp
$ mv file.tmp file

How to replace a string in a file in KSH

My KSH script should replace a string in a txt file in the same directory.
sed -i 's/"$original"/"$reversed"/' inputtext.txt
is what I'm currently using, but it doesn't work. There is no error message or anything like that. It just doesn't work.
Here is my whole code:
#!/bin/ksh
original=$1
reversed=""
counter=0
echo $original | awk -v ORS="" '{ gsub(/./,"&\n") ; print }' | \
while read char
do
letters[$counter]+="$char"
((counter=counter+1))
done
length=${#original}
((length=length-1))
echo $original | awk -v ORS="" '{ gsub(/./,"&\n") ; print }' | \
while read char
do
reversed+=${letters[$length]}
((length=length-1))
done
echo $reversed
sed -i 's/"$original"/"$reversed"/' inputtext.txt
exit 0
I want every word in the file "inputtext.txt" (same dir as the .sh file) that equals "$original" to be changed to "$reversed".
What am I doing wrong?
I think single quotes prevent variable expansion. You can try this:
sed -i "s/$original/$reversed/" inputtext.txt

Shell script: How to split line?

here's my scenario:
my input file looks like:
/tmp/abc.txt
/tmp/cde.txt
/tmp/xyz/123.txt
and I'd like to obtain the following output in 2 files:
first file
/tmp/
/tmp/
/tmp/xyz/
second file
abc.txt
cde.txt
123.txt
thanks a lot
Here it is, all in one single awk:
awk -F\/ -vOFS=\/ '{print $NF > "file2";$NF="";print > "file1"}' input
cat file1
/tmp/
/tmp/
/tmp/xyz/
cat file2
abc.txt
cde.txt
123.txt
Here we set the input and output separator to /.
Then print the last field, $NF, to file2.
Then set the last field to nothing and print the rest to file1.
I realize you already have an answer, but you might be interested in the following two commands:
basename
dirname
If they're available on your system, you'll be able to get what you want just piping through these:
cat input | xargs -l dirname > file1
cat input | xargs -l basename > file2
Enjoy!
Edit: Fixed per quantdev's comment. Good catch!
Through grep,
grep -o '.*/' file > file1.txt
grep -o '[^/]*$' file > file2.txt
.*/ Matches all the characters from the start up to the last / symbol.
[^/]*$ Matches any character other than /, zero or more times; $ asserts that we are at the end of the line.
The awk solution is probably the best, but here is a pure sed solution:
#n sed script to get base and file paths
h
s/.*\/\(.*\.txt\)/\1/
w file2
g
s/\(.*\)\/.*\.txt/\1/
w file1
Note how we hold the buffer with h, and how we use the write (w) command to produce the output files. There are many other ways to do it with sed, but I like this one for using multiple different commands.
To use it :
> sed -f sed_script testfile
Here is another one-liner that uses tee:
cat f1.txt | tee >(xargs -n 1 dirname >> f2.txt) >(xargs -n 1 basename >> f3.txt) &>/dev/random

awk capability cut capability

I am using the following ssh command to get a list of ids. Now I want to
get only ids greater than a given number in the list of ids; let's say "231219" in this case. How can I incorporate that?
I have a local file "ids_ignore.txt"; any id we put in this list should be ignored by the command.
Can awk or cut do the above?
ssh -p 29418 company.com gerrit query --commit-message --files --current-patch-set \
status:open project:platform/code branch:master |
grep refs | cut -f4 -d'/'
OUTPUT:-
231222
231221
231220
231219
230084
229092
228673
228635
227877
227759
226138
226118
225817
225815
225246
223554
223527
223452
223447
226137
... | awk '$1 > max' max=8888 | grep -v -F -f ids_ignore.txt
Or, if you want to do it all with awk:
... | awk 'NR==FNR{ no[$1]++ }
NR!=FNR && $1 > max && ! no[$1]' max=NNN ids_ignore.txt -
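Spelled out with the OP's pipeline and the 231219 threshold, the first variant would look like this (a sketch):
ssh -p 29418 company.com gerrit query --commit-message --files --current-patch-set \
status:open project:platform/code branch:master |
grep refs | cut -f4 -d'/' |
awk '$1 > max' max=231219 | grep -v -F -f ids_ignore.txt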
cut cannot do numeric comparison on the input fields; it's just a simple field-extraction tool. awk can do the work of grep and cut:
ssh -p 29418 company.com gerrit ... |
awk -F/ -v min=231219 '
NR == FNR {ignore[$1]; next}
/refs/ && $4>min && !($4 in ignore) {print $4}
' ids_ignore.txt -
The trailing - is important at the end of the awk command: it tells awk to read from stdin after it reads the ids_ignore file.