Unix shell: how to get the first 3 letters of the filename - filenames

I want to know how to get the first 3 letters of a filename in a very simple way. Thanks and regards.

Depends on the shell you're using. In bash, you can just use the substring extraction, something like:
pax> fname=xyzzy.txt
pax> echo ${fname}
xyzzy.txt
pax> first3=${fname:0:3} ; echo ${first3}
xyz
If you're not using bash, another option is to use cut, which tends to be available on most systems. It's an external program, meaning it's not as efficient as the internal bash method above, but you'll probably only notice that if you're doing it thousands of times per second.
pax> first3=$(echo ${fname} | cut -c1-3) ; echo ${first3}
xyz
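If neither bash nor an external program is wanted, plain POSIX parameter expansion can probably do the same job. This is only a minimal sketch and not part of the answer above; the function name first3 is made up for illustration.

```sh
#!/bin/sh
# first3 is a hypothetical helper name, not a standard command.
# It prints the first three characters of its argument using only
# POSIX parameter expansion (no bash substring syntax, no cut).
first3() {
    name=$1
    # Strip the first three characters; if nothing was stripped,
    # the name is shorter than three characters.
    rest=${name#???}
    if [ "$rest" = "$name" ]; then
        printf '%s\n' "$name"
    else
        # Remove the remainder from the end, leaving the 3-character prefix.
        printf '%s\n' "${name%"$rest"}"
    fi
}

first3 xyzzy.txt   # prints: xyz
```

Names shorter than three characters are printed unchanged, which is a design choice of this sketch rather than something the question specifies.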

Related

Renaming directories

I've got like 230 directories of this kind (1367018589_name_nameb_namec_named) and would like to rename them into (Name Nameb Namec Named).
To be more precise:
Removing numbers
Replacing underscores with spaces (except the first underscore, which comes after the numbers)
First letter into capital letter
An easy one-liner is preferred since I'm quite a newbie regarding Linux and bash.
A bash script wouldn't be a problem either; just a small explanation of how to use it would be very much appreciated.
Meaning that I can understand it once I know the command, but I'm having trouble coming up with it on my own.
Many thanks in advance.
In one line (updated to capitalize first letter of each word -- missed that the first time):
$ for f in * ; do g=$(echo "$f" | sed 's/^[0-9_]*//' | sed 's/_/ /g' | sed 's/\b\(.\)/\u\1/g') ; echo "mv \"$f\" to \"$g\"" ; done
Once you are happy that it is going to do what you want change
echo "mv \"$f\" to \"$g\""
to
mv -i "$f" "$g"
Note, the -i option is to avoid the case of accidentally overwriting a file (say if you had files 123_test and 345_test for instance)
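The \b and \u escapes used in the one-liner are GNU sed extensions, so it may not behave the same elsewhere. As a hedged alternative, the sketch below does the capitalization in awk instead. The function name pretty_name and the glob pattern are assumptions made for illustration, and the loop only prints the mv commands it would run.

```sh
#!/bin/sh
# pretty_name is a hypothetical helper: it drops the leading digits and
# the first underscore, then capitalizes each underscore-separated word.
pretty_name() {
    printf '%s\n' "$1" |
        sed 's/^[0-9]*_//' |
        awk -F_ '{
            out = ""
            for (i = 1; i <= NF; i++) {
                word = toupper(substr($i, 1, 1)) substr($i, 2)
                out = out (i > 1 ? " " : "") word
            }
            print out
        }'
}

# Dry run: show what would be renamed without touching anything.
for d in [0-9]*_*/; do
    [ -d "$d" ] || continue          # skip if the glob matched nothing
    d=${d%/}
    printf 'mv -i -- "%s" "%s"\n' "$d" "$(pretty_name "$d")"
done
```

Once the printed commands look right, the printf line could be swapped for the real mv -i call, in the same spirit as the answer above.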

Mining dictionary for sed search strings

For fun I was mining the dictionary for words that sed could use to modify strings. Example:
sed settee <<< better
sed statement <<< dated
Outputs:
beer
demented
These sed swords must be at least 5 letters long, and begin with s, then another letter, which can appear only 3 times, with at least one other letter between the first and second instances, and with the third instance as the final letter.
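To make the rule concrete: sed takes the character right after s as the delimiter of a substitute command, so each of these words is parsed as s, delimiter, pattern, delimiter, replacement, delimiter. A small sketch of the equivalence, using printf and a pipe instead of the bash-only <<< syntax:

```sh
#!/bin/sh
# "settee" is read as s e tt e (empty) e, i.e. s/tt//
printf '%s\n' better | sed settee
printf '%s\n' better | sed 's/tt//'

# "statement" is read as s t a t emen t, i.e. s/a/emen/
printf '%s\n' dated | sed statement
printf '%s\n' dated | sed 's/a/emen/'
```

Each pair prints the same result (beer, then demented).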
I used sed to generate a word list, and it seems to work:
d=/usr/share/dict/american-english
sed -n '/^s\([a-z]\)\(.*\1\)\{2\}$/{
/^s\([a-z]\)\(.*\1\)\{3\}$/!{/^s\([a-z]\)\1/!p}}' $d |
xargs echo
Output:
sanatoria sanitaria sarcomata savanna secede secrete secretive segregate selective selvedge sentence sentience sentimentalize septette sequence serenade serene serpentine serviceable serviette settee severance severe sewerage sextette stateliest statement stealthiest stoutest straightest straightjacket straitjacket strategist streetlight stretchiest strictest structuralist
But that sed code runs three passes through each line, which seems excessively long and kludgy. How can that code be simplified, while still outputting the same word list?
grep or awk answers would also be OK.
awk to the rescue!
code is cleaner with awk and reads as the spec: split the word based on the second char, three instances of the char will split the word into 4 segments; 2nd one should have at least one char and the last one should be empty.
$ awk '/^s/{n=split($1,a,substr($1,2,1));
if(n==4 && length(a[2])>0 && a[4]=="") print}' /usr/share/dict/american-english | xargs
sanatoria sanitaria sarcomata savanna secede secrete secretive
segregate selective selvedge sentence sentience sentimentalize
septette sequence serenade serene serpentine serviceable serviette
settee severance severe sewerage sextette stateliest statement
stealthiest stoutest straightest straightjacket straitjacket strategist
streetlight stretchiest strictest structuralist
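To try the split-based rule without a dictionary file, the same awk test can be wrapped in a filter and fed a handful of words. This is just a sketch; the function name sed_words and the sample words are made up for the demonstration.

```sh
#!/bin/sh
# sed_words is a hypothetical filter: it reads one word per line and
# prints only those that are valid "sed words" under the split rule.
sed_words() {
    awk '/^s/ {
        # Split on the second character: three occurrences give 4 parts.
        n = split($1, parts, substr($1, 2, 1))
        # parts[2] non-empty: something sits between occurrences 1 and 2.
        # parts[4] empty: the third occurrence is the final letter.
        if (n == 4 && length(parts[2]) > 0 && parts[4] == "") print
    }'
}

# seethe fails (nothing between the first two e's), secedes fails
# (does not end in e), sassy fails (only one a).
printf '%s\n' settee statement seethe secedes sassy | sed_words
```

Only settee and statement should pass the filter.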
very cool idea. I think you're more restrictive than necessary
sed -nE '/^s(.)[^\1]+\1[^\1]*\1g?$/p'
seems to work fine. It generated 518 words for me. I only have /usr/share/dict/words dictionary file though.
sabadilla sabakha sabana sabbatia sabdariffa sacatra saccharilla
saccharogalactorrhea saccharorrhea saccharosuria saccharuria sacralgia
sacraria sacrcraria sacrocoxalgia sadhaka sadhana sahara saintpaulia
salaceta salada salagrama salamandra saltarella salutatoria
...
stuntist subbureau sucuriu sucuruju sulphurou surucucu
syenite-porphyry symphyseotomy symphysiotomy symphysotomy symphysy
symphytically syndactyly synonymity synonymously synonymy
syzygetically syzygy
an interesting find is
$ sed snow-nodding <<< now-or-never
noddior-never
A speedy pcregrep method, (.025 seconds user time):
d=/usr/share/dict/american-english
pcregrep '^s(.)((?!\1).)+\1((?!\1).)*\1$' $d | xargs echo
Output:
sanatoria sanitaria sarcomata savanna secede secrete secretive segregate selective selvedge sentence sentience sentimentalize septette sequence serenade serene serpentine serviceable serviette settee severance severe sewerage sextette stateliest statement stealthiest stoutest straightest straightjacket straitjacket strategist streetlight stretchiest strictest structuralist
Code inspired by: Regex: Match everything except backreference

Retain backslashes with while read loop in multiple shells

I have the following code:
#!/bin/sh
while read line; do
printf "%s\n" $line
done < input.txt
Input.txt has the following lines:
one\two
eight\nine
The output is as follows
onetwo
eightnine
The "standard" solutions to retain the slashes would be to use read -r.
However, I have the following limitations:
must run under #!/bin/sh for reasons of portability/posix compliance.
not all systems will support the -r switch to read under /sh
The input file format cannot be changed
Therefore, I am looking for another way to retain the backslash after reading in the line. I have come up with one working solution, which is to use sed to replace the \ with some other value (e.g.||) into a temporary file (thus bypassing my last requirement above) then, after reading them in use sed again to transform it back. Like so:
#!/bin/sh
sed -e 's/[\/&]/||/g' input.txt > tempfile.txt
while read line; do
printf "%s\n" $line | sed -e 's/||/\\/g'
done < tempfile.txt
I'm thinking there has to be a more "graceful" way of doing this.
Some ideas:
1) Use command substitution to store this into a variable instead of a file. Problem - I'm not sure command substitution will be portable here either and my attempts at using a variable instead of a file were unsuccessful. Regardless, file or variable the base solution is really the same (two substitutions).
2) Use IFS somehow? I've investigated a little, but not sure that can help in this issue.
3) ???
What are some better ways to handle this given my constraints?
Thanks
Your constraints seem a little strict. Here's a piece of code I jotted down (I'm not sure how much your while loop matters for the other things you want to do, so I removed it for simplicity). I don't guarantee that this code is robust, but the logic should give you hints about the direction you may wish to take. (temp.dat is the input file.)
#!/bin/sh
var1="$(cut -d\\ -f1 temp.dat)"
var2="$(cut -d\\ -f2 temp.dat)"
iter=1
set -- $var2
for x in $var1;do
if [ "$iter" -eq 1 ];then
echo $x "\\" $1
else
echo $x "\\" $2
fi
iter=$((iter+1))
done
As Larry Wall once said, writing a portable shell is easier than writing a portable shell script.
perl -lne 'print $_' input.txt
The simplest possible Perl script is simpler still, but I imagine you'll want to do something with $_ before printing it.
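Another workaround that stays in sh, offered here only as a sketch and not taken from the answers above, is to double every backslash before the loop sees it. A read without -r turns each \\ back into a single \, so no temporary file or second sed pass is needed. The function name read_raw_lines is made up for illustration.

```sh
#!/bin/sh
# read_raw_lines is a hypothetical helper: it reads lines from stdin
# and prints them unchanged, without relying on read -r.
read_raw_lines() {
    # Double each backslash; plain read then collapses \\ back to \.
    sed 's/\\/\\\\/g' | while IFS= read line; do
        # Quote the variable so printf sees the line as one argument.
        printf '%s\n' "$line"
    done
}

printf '%s\n' 'one\two' 'eight\nine' | read_raw_lines
```

With a real file the call would be read_raw_lines < input.txt. Setting IFS to empty for the read keeps leading and trailing whitespace intact.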

grep a number from the line and append it to a file

I went through several grep examples, but don't see how to do the following.
Say, i have a file with a line
! some test here and number -123.2345 text
i can get this line using
grep ! input.txt
but how do i get the number (possibly positive or negative) from this line and append it to the end of another file? Is it possible to apply grep to grep results?
If yes, then i could get the number via something like
grep -Eo "[0-9]{1,}|\-[0-9]{1,}"
p/s/ i am using OS-X
p/p/s/ i'm trying to fetch data from several files and put into a single file for later plotting.
The format with your commands would be:
grep ! input.txt | grep -Eo "[0-9]{1,}|\-[0-9]{1,}" >> output
To grep from grep we use the pipe operator |, which lets us chain commands together. To append this output to a file we use the redirection operator >>.
However, there are a couple of problems. Your regexp is better written as grep -Eoe '-?[0-9.]+', which allows for the decimal point and returns a single number instead of two. And if you want lines that start with !, then grep '^!' is better, to avoid matching lines that contain ! but don't start with it. Better to do:
grep '^!' input | grep -Eoe '-?[0-9.]+' >> output
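As a possible refinement, the pattern [0-9.]+ also matches a lone dot, so a slightly stricter expression may be safer. The sketch below wraps the two greps in a function; the name extract_numbers is hypothetical. It relies on grep -o, which is available in GNU and BSD/macOS grep but is not required by POSIX.

```sh
#!/bin/sh
# extract_numbers is a hypothetical helper: from lines starting with !,
# it prints each signed integer or decimal number on its own line.
extract_numbers() {
    grep '^!' | grep -Eo -e '-?[0-9]+(\.[0-9]+)?'
}

printf '%s\n' \
    '! some test here and number -123.2345 text' \
    'some line without the character and with number 345.566' \
    '! again a number 34' | extract_numbers
```

To collect values from several files, the call could be written as extract_numbers < input.txt >> output, once per input file.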
perl -lne 'm/.*?([\d\.\-]+).*/g;print $1' your_file >>anotherfile_to_append
$ foo="! some test here and number -123.2345 text"
$ echo $foo | sed -e 's/[^0-9\.-]//g'
-123.2345
Edit:-
for a file,
[ ]$ cat log
! some test here and number -123.2345 text
some blankline
some line without "the character" and with number 345.566
! again a number 34
[ ]$ sed -e '/^[^!]/d' -e 's/[^0-9.-]//g' log > op
[ ]$ cat op
-123.2345
34
Now let's see the toothpicks :) In '/^[^!]/d', the slashes delimit the pattern, the first ^ anchors it to the start of the line, [^!] matches any character other than !, and d deletes the line, so every line that does not start with ! is dropped. In the second expression, [^0-9.-] matches any character that is not a digit, a . or a -, and the empty replacement // replaces it with nothing (i.e. deletes it), and done :)

Need help in executing the SQL via shell script and use the result set

I currently have a request to build a shell script to get some data from a table using SQL (Oracle). The query which I'm running returns a number of rows. Is there a way to use something like a result set?
Currently, I'm re-directing it to a file, but I'm not able to reuse the data again for the further processing.
Edit: Thanks for the reply Gene. The result file looks like:
UNIX_PID 37165
----------
PARTNER_ID prad
--------------------------------------------------------------------------------
XML_FILE
--------------------------------------------------------------------------------
/mnt/publish/gbl/backup/pradeep1/27241-20090722/kumarelec2.xml
pradeep1
/mnt/soar_publish/gbl/backup/pradeep1/11089-20090723/dataonly.xml
UNIX_PID 27654
----------
PARTNER_ID swam
--------------------------------------------------------------------------------
XML_FILE
--------------------------------------------------------------------------------
smariswam2
/mnt/publish/gbl/backup/smariswam2/10235-20090929/swam2.xml
There are multiple rows like this. My requirement is only to use shell script and write this program.
I need to take each of the pid and check if the process is running, which I can take care of.
My question is how do I check for each PID so I can loop and get corresponding partner_id and the xml_file name? Since it is a file, how can I get the exact corresponding values?
Your question is pretty short on specifics (a sample of the file to which you've redirected your query output would be helpful, as well as some idea of what you actually want to do with the data), but as a general approach, once you have your query results in a file, why not use the power of your scripting language of choice (ruby and perl are both good choices) to parse the file and act on each row?
Here is one suggested approach. It wasn't clear from the sample you posted, so I am assuming that this is actually what your sample file looks like:
UNIX_PID 37165 PARTNER_ID prad XML_FILE /mnt/publish/gbl/backup/pradeep1/27241-20090722/kumarelec2.xml pradeep1 /mnt/soar_publish/gbl/backup/pradeep1/11089-20090723/dataonly.xml
UNIX_PID 27654 PARTNER_ID swam XML_FILE smariswam2 /mnt/publish/gbl/backup/smariswam2/10235-20090929/swam2.xml
I am also assuming that:
There is a line-feed at the end of the last line of your file.
The columns are separated by a single space.
Here is a suggested bash script (not optimal, I'm sure, but functional):
#! /bin/bash
cat myOutputData.txt |
while read line;
do
myPID=`echo $line | awk '{print $2}'`
isRunning=`ps -p $myPID | grep $myPID`
if [ -n "$isRunning" ]
then
echo "PARTNER_ID `echo $line | awk '{print $4}'`"
echo "XML_FILE `echo $line | awk '{print $6}'`"
fi
done
The script iterates through every line (row) of the input file. It uses awk to extract column 2 (the PID), and then does a check (using ps -p) to see if the process is running. If it is, it uses awk again to pull out and echo two fields from the file (PARTNER ID and XML FILE). You should be able to adapt the script further to suit your needs. Read up on awk if you want to use different column delimiters or do additional text processing.
Things get a little more tricky if the output file contains one row for each data element (as you indicated). A good approach here is to use a simple state mechanism within the script and "remember" whether or not the most recently seen PID is running. If it is, then any data elements that appear before the next PID should be printed out. Here is a commented script to do just that with a file of the format you provided. Note that you must have a line-feed at the end of the last line of input data or the last line will be dropped.
#! /bin/bash
cat myOutputData.txt |
while read line;
do
# Extract the first (myKey) and second (myValue) words from the input line
myKey=`echo $line | awk '{print $1}'`
myValue=`echo $line | awk '{print $2}'`
# Take action based on the type of line this is
case "$myKey" in
"UNIX_PID")
# Determine whether the specified PID is running
isRunning=`ps -p $myValue | grep $myValue`
;;
"PARTNER_ID")
# Print the specified partner ID if the PID is running
if [ -n "$isRunning" ]
then
echo "PARTNER_ID $myValue"
fi
;;
*)
# Check to see if this line represents a file name, and print it
# if the PID is running
inputLineLength=${#line}
if (( $inputLineLength > 0 )) && [ "$line" != "XML_FILE" ] && [ -n "$isRunning" ]
then
isHyphens=`expr "$line" : -`
if [ "$isHyphens" -ne "1" ]
then
echo "XML_FILE $line"
fi
fi
;;
esac
done
I think that we are well into custom software development territory now so I will leave it at that. You should have enough here to customize the script to your liking. Good luck!
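The state-tracking idea could also be expressed with a single awk pass that remembers the most recent PID and partner ID and emits one flat row per remaining data line. This is only a sketch under the assumption that the file looks like the sample in the question; the function names flatten_records and running_records are made up for illustration.

```sh
#!/bin/sh
# flatten_records is a hypothetical helper: it turns the multi-line
# report into "PID PARTNER_ID VALUE" rows, one per remaining data line.
flatten_records() {
    awk '
        $1 == "UNIX_PID"   { pid = $2; next }      # remember current PID
        $1 == "PARTNER_ID" { partner = $2; next }  # remember current partner
        $1 == "XML_FILE" || /^-+$/ || NF == 0 { next }  # headers, rules, blanks
        { print pid, partner, $0 }                 # everything else is a value
    '
}

# running_records is also hypothetical: it keeps only the rows whose
# PID is currently running, using ps -p as in the scripts above.
running_records() {
    flatten_records | while read -r pid partner file; do
        if ps -p "$pid" >/dev/null 2>&1; then
            printf 'PARTNER_ID %s XML_FILE %s\n' "$partner" "$file"
        fi
    done
}

flatten_records <<'EOF'
UNIX_PID 37165
----------
PARTNER_ID prad
----------
XML_FILE
----------
/mnt/publish/gbl/backup/pradeep1/27241-20090722/kumarelec2.xml
EOF
```

With the real data the call would be running_records < myOutputData.txt. Like the script above, this treats every leftover line as a value, so stray lines such as pradeep1 in the sample would also be reported.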