CSH: How to tokenize a string

I'm making a CSH script where I am looping through the file names in a directory:
foreach i ($INPUTDIR/*)
$i
end
i ends up being something like this:
/dir1/dir2/dir3/dir4/fileNameHead_middle_2016080924
My question is, using CSH, how can I tokenize each path, first splitting on the forward slashes, then on the underscores, collecting only the last token?

The basename utility deletes any prefix ending with the last slash / character present in string (after first stripping trailing slashes), and a suffix, if given. On my system there's also a gbasename which is part of GNU coreutils which does essentially the same thing with a few more options.
basename is part of POSIX, so it's safe to use everywhere.
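As a minimal sketch of how basename can be combined with an underscore split, the snippet below uses POSIX sh rather than csh (an assumption made here for portability; the path is the example from the question). The ${name##*_} expansion strips everything up to and including the last underscore.

```sh
#!/bin/sh
# Example path taken from the question.
path=/dir1/dir2/dir3/dir4/fileNameHead_middle_2016080924

# basename drops everything up to and including the last slash.
name=$(basename "$path")

# Strip the longest prefix ending in "_" to keep only the last token.
last=${name##*_}

echo "$last"
```

In csh itself, the :t modifier ($i:t) should give the same file name part as basename; the underscore split would then need an external tool such as awk or sed.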

Related

Trino regexp_replace this character in the beginning but not in the middle Trino [duplicate]

I am a complete Reg-exp noob, so please bear with me. Tried to google this, but haven't found it yet.
What would be an appropriate way of writing a Regular expression matching files starting with a dot, such as .buildpath or .htaccess?
Thanks a lot!
In most regex languages, ^\. or ^[.] will match a leading dot.
The ^ matches the beginning of a string in most languages, so the pattern below matches a leading dot. You need to append your filename expression to it.
^\.
Likewise, $ will match the end of a string.
You may need to replace the \ with the escape character of your language. Under PowerShell, the regex I use is: ^(\.)+\/
Test case:
"../NameOfFile.txt" -match '^(\.)+\/'
works, while
"_./NameOfFile.txt" -match '^(\.)+\/'
does not.
So what is happening here?
The (\.) matches a literal ., and the + repeats that group one or more times.
Finally, the \/ requires a forward slash after the dots, so the string must begin like a relative path such as ./ or ../.
It depends a bit on the regular expression library you use, but you can do something like this:
^\.\w+
The ^ anchors the match to the beginning of the string, the \. matches a literal period (since an unescaped . in a regular expression typically matches any character), and \w+ matches 1 or more "word" characters (alphanumeric plus _).
See the perlre documentation for more info on Perl-style regular expressions and their syntax.
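To try the leading-dot pattern from a shell, here is a small sketch using grep -E. The sample names are made up for illustration, and [[:alnum:]_] is used in place of \w because \w is not part of POSIX extended regular expressions.

```sh
#!/bin/sh
# Hypothetical sample names, one per line.
names='.htaccess
.buildpath
index.html
a.b'

# ^\. anchors a literal dot at the start of the line;
# [[:alnum:]_]+ stands in for \w+ (one or more word characters).
matches=$(printf '%s\n' "$names" | grep -E '^\.[[:alnum:]_]+')

echo "$matches"
```

Only the two names that begin with a dot are printed; a.b is skipped because its dot is not at the start.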
It depends on what characters are legal in a filename, which depends on the OS and filesystem.
For example, in Windows that would be:
^\.[^<>:"/\\\|\?\*\x00-\x1f]+$
The above expression means:
Match a string starting with the literal character .
Followed by at least one character which is not one of (whole class of invalid chars follows)
I used this as reference regarding which chars are disallowed in filenames.
To match a string starting with a dot in Java, a simple expression is enough (the backslash is doubled because the pattern is written as a Java string literal):
^\\..*
^ means regular expression is to be matched from start of string
\. means it will start with string literal "."
.* means dot will be followed by 0 or more characters

batch-file variable %CD% adding a backslash when run from drive root

I have a problem with the variable %CD% in a batch-file. It adds a backslash if the script is run from the root of a drive.
For example, set updatedir=%CD%\Update followed by echo %updatedir% returns something like:
From a folder: E:\New Folder\Update
From a drive root: E:\\Update
Is there any way to get rid of the extra backslash if run from root?
Yes, %CD% only has a trailing \ if the current directory is the root. You could strip any trailing backslash that might be there, but there is a simpler solution.
Use the undocumented %__CD__% instead, which always appends the trailing backslash. This makes it easy to build a clean path, regardless of the current directory.
set "updatedir=%__CD__%Update"
You can do something like this:
set "CurrentDir=%CD%"
if "%CD:~-1%"=="\" set "CurrentDir=%CD:~0,-1%"
Since you don't want to go changing the system variable %CD%, this sets a new variable %CurrentDir% to the current value of %CD%. Then, it checks to see if the last character in %CD% is a \, and if it is, sets %CurrentDir% to the value of %CD%, minus the last character.
This question/answer has more information on using substrings in batch files.
Replace every occurrence of \\ with \:
echo %updatedir:\\=\%

Renaming file via UNC path

I need to have my VB.NET program rename a file over the network.
Microsoft says that My.Computer.FileSystem.RenameFile does not work if the file path starts with two backslashes ("\\"). So, what other way is there of doing this? I just need to rename a file in the domain, for instance:
rename("\\domain\1\exemple.txt", "\\domain\1\exemple2.txt")
The second parameter of RenameFile should be just the new file name, without a path, e.g.:
My.Computer.FileSystem.RenameFile("C:\Test.txt", "SecondTest.txt")
So try changing your code to this:
My.Computer.FileSystem.RenameFile("\\domain\1\exemple.txt", "exemple2.txt")
Note that in VB.NET the backslash is not an escape character inside string literals, so the UNC path can be written as-is with no prefix. (In C#, where \ does escape, you would use a verbatim string with an @ prefix instead.)

Is there any limitation in giving file name in Unix?

We are using crontab to schedule jobs, and it was not picking up files for processing whose names contain [ or ] or ¿. Is there any limitation on file names, or do these characters mean something in UNIX? Are there any other characters like these that we shouldn't use in file names? Thanks in advance.
The following are general rules for Linux and Unix-like (including *BSD) systems:
All file names are case sensitive, so vivek.txt, Vivek.txt, and VIVEK.txt are three different files.
You can use upper and lowercase letters, numbers, "." (dot), and "_" (underscore) symbols.
You can use other special characters such as blank space, but they are hard to use and it is better to avoid them.
In short, filenames may contain any character except / (root directory), which is reserved as the separator between files and directories in a pathname. You cannot use the null character.
There is no need to use . (dot) in a filename, but a dot can sometimes improve readability.
You can also use a dot-based filename extension to identify the file type. For example:
.sh = Shell file
.tar.gz = Compressed archive
Most modern Linux and UNIX systems limit filenames to 255 characters (255 bytes). However, some older versions of UNIX limit filenames to 14 characters.
A filename must be unique inside its directory. For example, inside the /home/vivek directory you cannot create both a file and a directory named demo.txt. However, other directories may contain entries with the same name. For example, you can create a demo.txt directory in /tmp.
Linux / UNIX: Reserved Characters And Words
Avoid using the following characters in file names:
/
>
<
|
:
&
Please note that Linux and UNIX allow white space, <, >, |, \, :, (, ), &, ;, as well as wildcards such as ? and *, in file names, provided they are quoted or escaped using the \ symbol.
It is best to avoid white space in your filenames. It will make your scripting a lot easier.
I got the answer from this link. I am just pasting it here so that this info will be available even if that website goes down.
The only characters that are actually illegal in *nix filenames are / (reserved as the directory separator) and NUL (because it's the C string terminator). Everything else is fair game, although various utilities may fail on certain characters - typically characters that have special meaning to the shell. These will need quoting or escaping to be handled correctly.
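The effect of shell-special characters such as [ and ] can be seen in a short sketch. The file names below are hypothetical, and mktemp -d is assumed to be available (it is common but not specified by POSIX). Unquoted, [1] is read as a glob character class; quoted, the brackets are taken literally.

```sh
#!/bin/sh
# Work in a throwaway directory (assumes mktemp -d exists on this system).
tmpdir=$(mktemp -d)
touch "$tmpdir/report[1].txt" "$tmpdir/report1.txt"

# Unquoted: [1] is a character class, so the pattern expands to report1.txt,
# not to the file whose name literally contains brackets.
unquoted=$(cd "$tmpdir" && ls report[1].txt)

# Quoted: no globbing happens and the brackets are literal.
quoted=$(cd "$tmpdir" && ls "report[1].txt")

echo "unquoted: $unquoted"
echo "quoted:   $quoted"

rm -rf "$tmpdir"
```

A script run from cron that passes such names around unquoted may therefore pick up a different file, or none at all.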

sed replacing without untouching a string

I'm trying to replace all lines within files that contain:
/var/www/webxyz/html
to
/home/webxyz/public_html
The string webxyz is variable, e.g. web1 or web232.
So only the parts before and after webxyz should be replaced.
I tried this without success:
sed -i 's/"var/www/web*/html"/"home/web*/public_html"/g'
I also want this to check and replace in all files (including subdirectories),
but the * operator doesn't work.
Within a regular expression, you’ll need to escape the delimiting character that surrounds it, in your case the /. But you can also use a different delimiter, such as a comma:
sed -i 's,"var/www/web*/html","home/web*/public_html",g'
But to get it working as intended, you’ll also need to remove the " characters and replace the b* (sed doesn’t understand globbing wildcards) with something like this:
sed -i 's,var/www/web\([^/]*\)/html,home/web\1/public_html,g'
Here \([^/]*\) is used to match anything after web except a /. The matching string is then referenced by \1 in the replacement part.
Here is what your replacement operation should look like (see sed for more info):
sed -i 's/var\/www\(\/.*\/\)html/home\1public_html/g'
Note that \(...\) is a grouping, and specifies a "match variable" which shows up in the replacement side as \1. Note also the .* which says "match any single character (dot) zero or more times (star)". Note further that your / characters must be escaped so that they are not treated as part of the sed control structure.
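The capture-group substitution can be checked on a single line before touching any files. This is a sketch only: the input line is invented, the leading slashes are added to the pattern here for clarity, and the recursive command in the trailing comment assumes GNU sed's -i option and a hypothetical /path/to/tree.

```sh
#!/bin/sh
# Hypothetical input line containing one of the variable webNNN paths.
input='DocumentRoot /var/www/web232/html'

# \([^/]*\) captures whatever follows "web" up to the next slash;
# \1 puts it back in the replacement. The comma delimiter avoids escaping /.
output=$(printf '%s\n' "$input" |
  sed 's,/var/www/web\([^/]*\)/html,/home/web\1/public_html,g')

echo "$output"

# To apply it to every file under a directory (assumes GNU sed -i):
#   find /path/to/tree -type f -exec sed -i 's,...,...,g' {} +
```

Running the substitution through a pipe first makes it easy to confirm the result before editing files in place.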