Trying to cat a header into source files but a Unicode BOM is getting in the way

Following the instructions at Add header (copyright) information to existing source files, I need to add copyright headers to a bunch of source files we're sending out of the building. (I know, I hate copyright headers too, but it's policy for when we release proprietary source files. Please consider "persuade someone to waive the policy" as unhelpful and as not answering the question.)
I have two copies of all the files (in dir and dir.orig) and, from within dir.orig, I'm using
find . -name \*.cs -exec sh -c "mv '{}' tmp && cp ../header.txt '../dir/{}' && cat tmp >> '../dir/{}' && rm tmp" \;
This works, but the result ends up with the header, then the BOM from the original source file, whereas I'd prefer the BOM either to move to the start or to be removed.
(Looking at this, I realise that moving the file to tmp is unnecessary, given I'm not overwriting the original, but I didn't bother removing that from the example from the other SO question.)
How can I remove (or move) the BOM so that I end up without it appearing immediately after the newly-added header?

I think I may have found my solution, thanks to being pointed to uconv from this answer from Steven R. Loomis on a related question.
If I use
find . -name \*.cs -exec sh -c "cp ../header.txt '../dir/{}' && uconv --remove-signature -f UTF-8 -t UTF-8 '{}' >> '../dir/{}'" \;
then uconv assumes both input (-f) and output (-t) encodings are UTF-8, and --remove-signature causes it to remove any BOM it finds.
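For completeness: if uconv isn't available, the same effect should be achievable with GNU sed, since a UTF-8 BOM is just the three bytes EF BB BF at the start of the file (a minimal sketch; the \xHH escapes are a GNU sed extension):
find . -name \*.cs -exec sh -c "cp ../header.txt '../dir/{}' && sed '1s/^\xEF\xBB\xBF//' '{}' >> '../dir/{}'" \;
Unlike uconv, this does no encoding validation; it only drops the BOM bytes when they are present.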

Related

Remove specific suffix from all files containing it

Long story short, OneDrive has taken all my files and renamed them to include the string "-DESKTOP-9EI0FN7" at the end of the file name, resulting in files such as:
myTextFile-DESKTOP-9EI0FN7.txt
myVideo-DESKTOP-9EI0FN7.mp4
So I'd like to write a batch script that finds all the files with that string in them, and renames them to remove the string, so:
myTextFile-DESKTOP-9EI0FN7.txt becomes myTextFile.txt
The problem is, I know nothing about writing batch files. Any advice?
Test out with this bad boy:
find . -type f -exec rename -n -e 's/(.*)\-DESKTOP\-9EI0FN7(.*)/$1$2/' {} \;
If the output satisfies you, remove the -n flag and run it again to actually apply the changes.
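If the Perl rename isn't available, a plain shell loop using bash parameter substitution is a workable fallback (a sketch, assuming bash is available on the machine, e.g. via Git Bash, since the files live on Windows):
for f in *-DESKTOP-9EI0FN7*; do mv "$f" "${f/-DESKTOP-9EI0FN7/}"; done
The ${f/pattern/} expansion deletes the first occurrence of the string from each matching name.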
Good luck, sir!

Is there a way to move a file from one branch to another in ClearCase?

A user checked in new files on the wrong branch. I would like to move them in the most efficient way, as there are a lot of them. My first thought is to remove the elements from the branch and have the user check the files in again on the proper branch. But I was hoping there was a way I could change the pointers?
/VOB/DIRECTORY/file##/main/1.00/1 to /VOB/DIRECTORY/file##/main/2.00/1
Whenever there are a lot of files to checkout and move, clearfsimport is a viable option.
Simply set a view to the destination branch, and import the files found in the source (and wrong) view.
See "How can I use ClearCase to “add to source control …” recursively?"
That will checkout, add, modify or remove files in the destination view in order to mirror the ones from the source (here the source is a ClearCase view, but it could actually be any folder, ClearCase view or not, where the files are).
That will be enough to "recheck in the files on the proper branch", but it won't remove the versions from the wrong branch, and I would advise against using cleartool rmver (even though I used it here).
Perhaps a subtractive merge is better.
If you know where they are, and where you want them, you could:
1) Merge the directory and files over.
2) Use cleartool ln in a view in the destination branch to link in the files, and then merge the files individually.
If you use clearfsimport, and don't purge the added-in-the-wrong-place files, you can set yourself up for down-the-road "fun" caused by "evil twins."
Personally, since you know the files and directories that got added, where, when, and by whom, you could do something like this (command lines are off the top of my head):
Get the list of files to copy/merge
cleartool find -type d -element "created_by(baduser) && created_since(25-Jul-2016) && !created_since(26-Jul-2016)" -print > dirlist.txt
cleartool find -type fl -element "created_by(baduser) && created_since(25-Jul-2016) && !created_since(26-Jul-2016)" -print > filelist.txt
Pull the directories over by merging the parent directories while CD'd/set into a view on the destination path. Not knowing the OS involved, I can't say which way you would need to parse this. If you use perl, you can grab the offset of the last instance of the directory separator and use that in substr to get the parent directory path (sketched after the commands below). In the Windows command prompt, you can do something like this:
SET SRCDRIVE=D:
for /f "delims==" %x in (dirlist.txt) do cleartool co -nc %~px & cleartool merge -to %~px %SRCDRIVE%~px
for /f "delims==" %x in (dirlist.txt) do cleartool co -nc %~px & cleartool merge -to %~px\%~nx %SRCDRIVE%~px\%~nx
Yes, you can do all that in a single script, with better error checking and without trying 40 times to check out the same directory.
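The perl parent-directory extraction mentioned above might look like this (just a sketch; it prints the parent of each path in dirlist.txt, and you'd swap "/" for "\\" if the list holds Windows-style paths):
perl -ne 'chomp; $i = rindex($_, "/"); print substr($_, 0, $i), "\n" if $i > 0;' dirlist.txt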
You might also be able to merge them to the 2.0 branch (using a view selecting the 2.0 branch). To identify the elements involved, you can run a 'cleartool find' command something like this:
% cd /vobs/myvob
% cleartool find -all -version 'brtype(1.0) && created_by(user_x)' -print
The 'created_since(date-time)' query might also be useful in the compound query.
Once you're convinced you have the right set of versions, you can use '-exec' in place of the '-print' to actually perform the merge. It might look something like this:
% cleartool find -all -version 'brtype(1.0) && created_by(user_x) && created_since(29-Jun)' -exec 'cleartool merge -to $CLEARCASE_PN -version $CLEARCASE_ID_STR'
If you're happy with the results, check everything in. Then you just have to decide if you need to remove the versions on the 1.0 branch (which you can do with another 'cleartool find ... -exec ...' command).
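Once the merges look right, one way to find and check in everything still checked out in your view is (a sketch, assuming pathnames without spaces):
% cleartool lsco -recurse -short | xargs cleartool ci -nc
-short prints bare pathnames, and ci -nc checks them in without prompting for a comment.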

Recursive rsync over ssh, include only one file extension

I'm trying to rsync files over ssh from a server to my machine. Files are in various subdirectories, but I only want to keep the ones that match a certain pattern (e.g. blah.txt). I have done extensive googling and searching on Stack Overflow, and I've tried just about every permutation of --include and --exclude that has been suggested. No matter what I try, rsync grabs all the files.
Just as an example of one of my attempts, I have used:
rsync -avze 'ssh' --include='*blah*.txt' --exclude='*' myusername@myserver.com:/path/top/files/directory /path/to/local/directory
To troubleshoot, I tried this command:
rsync -avze 'ssh' --exclude='*' myusername@myserver.com:/path/top/files/directory /path/to/local/directory
expecting it to not copy anything, but it still grabbed all of the files.
I am using rsync version 2.6.9 on OSX.
Is there something obvious I'm missing? I've been struggling with this for quite a while.
I was able to find a solution, with a caveat. Here is the working command:
rsync -vre 'ssh' --prune-empty-dirs --include='*/' --include='*blah*.txt' --exclude='*' user#server.com:/path/to/server/files /path/to/local/files
However! If I type this into my command line directly, it works. If I save it to a file, myfile.txt, and I try `cat myfile.txt`, it no longer works! This makes no sense to me.
OS X ships a BSD-style rsync; see:
https://www.freebsd.org/cgi/man.cgi?query=rsync&apropos=0&sektion=0&manpath=FreeBSD+8.0-RELEASE+and+Ports&format=html
-C, --cvs-exclude
This is a useful shorthand for excluding a broad range of files
that you often don't want to transfer between systems. It uses a
similar algorithm to CVS to determine if a file should be
ignored.
The exclude list is initialized to exclude the following items
(these initial items are marked as perishable -- see the FILTER
RULES section):
RCS SCCS CVS CVS.adm RCSLOG cvslog.* tags TAGS
.make.state .nse_depinfo *~ #* .#* ,* _$* *$ *.old *.bak
*.BAK *.orig *.rej .del-* *.a *.olb *.o *.obj *.so *.exe
*.Z *.elc *.ln core .svn/ .git/ .bzr/
then, files listed in a $HOME/.cvsignore are added to the list
and any files listed in the CVSIGNORE environment variable (all
cvsignore names are delimited by whitespace).
Finally, any file is ignored if it is in the same directory as a
.cvsignore file and matches one of the patterns listed therein.
Unlike rsync's filter/exclude files, these patterns are split on
whitespace. See the cvs(1) manual for more information.
If you're combining -C with your own --filter rules, you should
note that these CVS excludes are appended at the end of your own
rules, regardless of where the -C was placed on the command-line.
This makes them a lower priority than any rules you specified
explicitly. If you want to control where these CVS excludes get
inserted into your filter rules, you should omit the -C as a
command-line option and use a combination of --filter=:C and
--filter=-C (either on your command-line or by putting the ":C"
and "-C" rules into a filter file with your other rules). The
first option turns on the per-directory scanning for the
.cvsignore file. The second option does a one-time import of
the CVS excludes mentioned above.
-f, --filter=RULE
This option allows you to add rules to selectively exclude
certain files from the list of files to be transferred. This is
most useful in combination with a recursive transfer.
You may use as many --filter options on the command line as you
like to build up the list of files to exclude. If the filter
contains whitespace, be sure to quote it so that the shell gives
the rule to rsync as a single argument. The text below also
mentions that you can use an underscore to replace the space
that separates a rule from its arg.
See the FILTER RULES section for detailed information on this
option.
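Building on that man page, the working include/exclude chain can also be expressed as filter rules kept in a file that rsync reads itself, which sidesteps the save-it-to-a-file problem entirely (a sketch; the ". FILE" rule merges rules from a file, per the FILTER RULES section):
rsync -vre 'ssh' --prune-empty-dirs --filter='. rules.txt' user@server.com:/path/to/server/files /path/to/local/files
where rules.txt contains, one rule per line:
+ */
+ *blah*.txt
- *
As for the `cat myfile.txt` caveat: command substitution re-splits the saved text into words but does not re-parse the embedded quotes, so rsync most likely received patterns containing literal quote characters.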

Split a batch of text files using pattern

I have a directory of almost a thousand html files. Each file needs to be split up into multiple text files, based on a recurring pattern (a heading). I am on a windows machine, using GnuWin32 tools.
I've found a way to do this, for a single file:
csplit 1.html -b "%04d.txt" /"Words in heading"/ {*}
But I don't know how to repeat this operation over the entire set of HTML files. This:
csplit *.html -b "%04d.txt" /"Words in heading"/ {*}
doesn't work, and neither does this:
for %i in (*.html) do csplit *.html -b "%04d.txt" /"Words in heading"/ {*}
Both result in an invalid pattern error. Help would be much appreciated!
The options/arguments order is important with csplit, and it won’t accept multiple files. Its help gets you there:
% csplit --help
Usage: csplit [OPTION]... FILE PATTERN...
I’m surprised your first example works for the single file. It really should be changed to:
% csplit -b "%04d.txt" 1.html "/Words in heading/" "{*}"
         ^^^^^^^^^^^^^ ^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^^^
         OPTS/ARGS     FILE   PATTERNS
Notice also that I changed your quoting to be around the arguments. You probably also need to quote your last "{*}".
I’m not sure what shell you’re using, but if that for-loop syntax is appropriate, then the fixed command should work in the loop.
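With the corrected argument order, that loop might look like this (a sketch, assuming cmd's interactive for syntax; double the % signs inside a batch file, and note the added -f option, which gives each HTML file's pieces a distinct prefix so they don't overwrite one another):
for %i in (*.html) do csplit -f "%~ni-" -b "%04d.txt" "%i" "/Words in heading/" "{*}"
Here %~ni expands to the file name without its extension, so 1.html would split into 1-0000.txt, 1-0001.txt, and so on.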

add prefix to files with rename - error argument too long

I have thousands of files inside a directory that I need to rename by adding a prefix like "th_", so that files become th_65461516846.jpg,
but I can't due to the error "Argument list too long".
I have used this command
rename 's/^/th_/' *
thanks!
The xargs program is used to break command lines into multiple commands to avoid exceeding the system's argument-length limit. In your case, you'd use:
ls | xargs rename 's/^/th_/'
Which repeatedly executes rename with a portion of the output of ls until the list of files is exhausted. Do be aware this idiom requires special attention if the file names have spaces or other funny characters in them (which I'm assuming isn't so based on your example).
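If the names might contain whitespace after all, a null-delimited variant is safer (a sketch assuming GNU xargs and the Perl rename; printf is a shell builtin, so the glob expansion here isn't subject to the argument-length limit that broke the original command):
printf '%s\0' * | xargs -0 rename 's/^/th_/'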
This one did the job:
for f in *; do mv "$f" "${f/#/th_}"; done
or
for f in * ; do mv "$f" "th_${f#}" ; done
I don't know what differs between the two (${f#} with an empty pattern is effectively just $f), but in my case they both work.