Method to inspect first 4 bytes and rename file extension - filenames

I have a large batch of assorted files, all missing their file extension.
I'm currently using Windows 7 Pro. I am able to "open with" and experiment to determine what application opens these files, and rename manually to suit.
However I would like some method to identify the correct file type (typically PDF, others include JPG, HTML, DOC, XLS and PPT), and batch rename to add the appropriate file extension.
I am able to open some files with Notepad and review the first four bytes, which in some cases show "%PDF".
I figure a small script could inspect these bytes and rename the file as appropriate. However, not all files are that easy: HTML, JPG, DOC etc. do not appear to have such an obvious identifier.
This PowerShell method appears to be close: https://superuser.com/questions/186942/renaming-multiple-file-extensions-based-on-a-condition
The difficulty is restricting the method to files with no extension, and then deciding what to do with files that don't have an identifier in the first four bytes.
Appreciate any help!!
EDIT: Solution using TriD seen here: http://mark0.net/soft-trid-e.html
And recursive method using Powershell to execute TriD here: http://mark0.net/forum/index.php?topic=550.0

You could probably save some time by getting a port of the Linux file utility for Windows (see What is the equivalent to the Linux File command for windows?) and then writing a simple script that maps from file type to extension.
EDIT: Looks like the TriD utility that's mentioned on that page can do what you want out of the box (see the -ae and -ce options).
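A minimal sketch of the "map from file type to extension" script, in Python. It assumes a port of file is on the PATH and accepts the --brief and --mime-type options; the MIME_TO_EXT table and both function names are made up for illustration and would need extending for other types.

```python
import os
import subprocess

# Assumed mapping; extend it for whatever types turn up in the batch.
MIME_TO_EXT = {
    "application/pdf": ".pdf",
    "image/jpeg": ".jpg",
    "text/html": ".html",
    "application/msword": ".doc",
    "application/vnd.ms-excel": ".xls",
    "application/vnd.ms-powerpoint": ".ppt",
}

def detect_mime(path):
    """Ask the `file` utility for the MIME type (assumes `file` is on PATH)."""
    out = subprocess.run(["file", "--brief", "--mime-type", path],
                         capture_output=True, text=True, check=True)
    return out.stdout.strip()

def add_extensions(folder, detect=detect_mime):
    """Rename extensionless files in `folder`; return (old, new) name pairs."""
    renamed = []
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        # Skip sub-folders and files that already have an extension.
        if not os.path.isfile(path) or os.path.splitext(name)[1]:
            continue
        ext = MIME_TO_EXT.get(detect(path))
        if ext:  # unknown types are left untouched
            os.rename(path, path + ext)
            renamed.append((name, name + ext))
    return renamed
```

Passing the detector in as a parameter means the renaming logic can be tried on a copy of the folder with a stand-in detector before pointing it at the real utility.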

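Use Python 3. The files have to be opened in binary mode, and the first bytes compared against known signatures rather than turned into an extension directly:
import os
fldrPth = "path/to/folder"  # folder that holds the extensionless files
SIGS = {b"%PDF": ".pdf", b"\xff\xd8\xff": ".jpg", b"\xd0\xcf\x11\xe0": ".doc"}  # last one is OLE2: also XLS/PPT
for name in os.listdir(fldrPth):
    path = os.path.join(fldrPth, name)
    if not os.path.isfile(path) or os.path.splitext(name)[1]:
        continue  # skip folders and files that already have an extension
    with open(path, "rb") as doc:  # binary mode, and closed before the rename
        head = doc.read(4)
    for sig, ext in SIGS.items():
        if head.startswith(sig):
            os.rename(path, path + ext)
            break
Hopefully this works. Files whose first bytes match none of the signatures (HTML, for example) are left untouched, and old DOC, XLS and PPT files share the same header, so they all end up as .doc.
I don't have test files to check the code. Take a backup, then run it and let me know if it works.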
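For types without a fixed leading signature, such as HTML, a looser check on the start of the file is one option. This is only a heuristic sketch: the function names, the 512-byte read size and the markers it looks for are assumptions, and it will miss HTML fragments that contain neither a doctype nor an <html> tag.

```python
import codecs

def read_head(path, size=512):
    """Return the first `size` bytes of a file, read in binary mode."""
    with open(path, "rb") as f:
        return f.read(size)

def looks_like_html(head: bytes) -> bool:
    """Heuristic: does this chunk from the start of a file look like HTML?"""
    # Drop a UTF-8 byte-order mark, then ignore leading whitespace and case.
    if head.startswith(codecs.BOM_UTF8):
        head = head[len(codecs.BOM_UTF8):]
    text = head.lstrip().lower()
    return text.startswith(b"<!doctype html") or b"<html" in text
```

A file for which looks_like_html(read_head(path)) is true could then be renamed with an .html extension in the same way as the signature matches.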

Related

Using DSPSTMF to display a stmf in the browser, but it is all junk and it downloads the file instead of displaying it. Also, any idea about the CONTTYPES file?

I am using the CGI DSPSTMF command to display a stmf file in a web browser. I am copying a spool file to a stmf file using the CPYSPLF *STMF option. Once it is copied, I pass the IFS location to the DSPSTMF command, but the file is downloaded automatically, and when I open the downloaded file I get nothing but junk data. Any idea why?
Also, I noticed it uses the CONTTYPES file in CGILIB, and on my server that file is empty. What values should it contain, and what should I do to show the correct data instead of junk? I tried different methods of copying the file to the IFS, such as cpytostmf instead of cpysplf; the file looks correct on the IFS, but the downloaded version does not.
What CCSID is the resulting stream file tagged with?
use WRKLNK and option 8=Display attributes
If 65535, that tells the system the data is binary and it won't try to translate the EBCDIC to ASCII.
The correct fix is to properly configure your IBM i so that the stream file is tagged with its correct CCSID.
Do a WRKSYSVAL QCCSID ... if your system is still set to 65535, that's the start of your problem. But this isn't programming related, you can try posting to Server Fault but you might get better responses on the Midrange mailing list
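To see why untranslated data shows up as junk, here is a small Python illustration. It uses cp037 as a stand-in EBCDIC code page, which is an assumption: the CCSID actually in use on the system in question is not stated.

```python
def as_ebcdic(text):
    """Encode text with code page 037, one common EBCDIC code page."""
    return text.encode("cp037")

raw = as_ebcdic("HELLO")

# The bytes on disk: none of them are the ASCII codes for H, E, L, O.
print(raw)

# Served untranslated and interpreted as Latin-1/ASCII: unreadable.
print(raw.decode("latin-1"))

# Translated with the matching code page: the original text.
print(raw.decode("cp037"))
```

A file tagged 65535 is handed over in the first form, which is what the browser ends up with.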

custom file open with custom application only

I am working on a VB.NET application in which I want to create and read a file. The file will have a specific extension, e.g. .abcd. The way I want my application to work is:
1. It can create a file with the .abcd extension.
2. It should read .abcd files only (and only files created by the application, so a file with an altered extension shouldn't work).
3. .abcd files should show only garbage data when opened in any other application (e.g. Word, Notepad, any image viewer, etc.).
Right now my application does steps 1 and 2 (partly), i.e. it creates a file and loads the data, and it reads .abcd files only (not the altered files),
but the created file can also be read by other software. I tried searching a lot but have not found anything and don't know where to start.
Any help is appreciated!
If you don't want other programs to be able to read the content of your file, then you're going to have to mask it in some way, which is usually done with encryption.
Assuming you're not too worried about the key being compromised, the easiest way to accomplish this would be to generate a key with something like System.Security.Cryptography and use that key to encrypt everything you write to the file and decrypt everything you read from it.
As for making your own file extension, you can give a file whatever extension you want when you create it:
Dim fs As FileStream = File.Create("/path/to/file/filename" & ".abcd")
The only thing the extension does is tell the OS which program to use by default when opening the file, which will probably be Notepad since you're making your own extension.

Preventing other application from opening custom file vb.net

I have a text file. Now I have changed its file type from .txt to .abc. My VB.NET program loads the text into textboxes from that file. After changing the file type, however, other apps like NotePad and Word are able to open and read my .abc file.
Is there any way that only my application will be able to open/read from the file and no other app would be able to do so? What I mean is, suppose I have a Photoshop document (.psd file): no other app, other than Photoshop itself, can open it. How do I make my file unreadable by other apps?
There is no way to prevent an app that you don't develop from opening any file. The extensions are just there for helping us humans, and maybe a bit for the computer to know the default app you select for an extension.
Like you said, a .txt file can be opened by many many apps. You can open a .txt file with Notepad, Firefox, VSCode, and many others.
Same way, a .psd file can be opened by many many apps. You can open that .psd file with Photoshop, but also Notepad, Firefox, and VSCode, and probably the same apps as above.
The difference is which apps can read and understand the file.
In order to make a file not understandable by other apps, you need to put it into a format that they cannot recognize, because you designed it "in secret".
Like Visual Vincent said above, you could encrypt the file in some way, or you can use a binary file format that basically only your app knows how to understand.
Since you don't control the other apps, you either have to accept that the file can be opened by any app that can open files, or encrypt the file outside your app, for example by zipping it with a password, and then decrypt or unzip it when you want to use it.
Firstly, any file can be read unless it is still open by a particular process or service. Even PhotoShop files can be 'read' by NotePad - try it!
So, an attempt at my first answer...
You can try a couple of methods to prevent opening the file, for instance, applying a file lock. As an example, SQL Server .mdf files are locked by the SQL Server service. This happens because the files are kept open; your application, however, would have to remain running to keep its files open. Technically, though, the files can still be copied.
Another way is to set the hidden attribute for the file. This hides the file from less savvy users, but it will be displayed if the user shows hidden files.
And my second answer: You refer to the format of files by saying only PhotoShop can read or write its own files (not true, but I know what you're saying).
The format of the file must be decided by yourself. You must determine how you are going to store the data that you output from your application. It looks like you have been attempting to write your application data into a text file. Perhaps you should try writing to binary files instead. Binary files, while not encrypted, as suggested by Visual Vincent in the comments to your question, still provide a more tailored approach to storing your data.
Binary files write raw binary data instead of humanised text. For instance, if you write an integer to the file it will appear as a string of four bytes, not your usual 123456789 textual format.
So, you really need to clarify what data you want to write to the file, decide on a set structure to your file (as you also have to be able to read it back in to your application) and then be able to write the information.
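As a rough illustration of a set binary structure, here is a sketch in Python rather than VB.NET. The layout (a 4-byte signature, a 4-byte integer, a 2-byte length, then UTF-8 text) and all names are hypothetical choices, not a recommended format.

```python
import struct

MAGIC = b"ABCD"  # hypothetical signature marking files written by this app

def save(path, value, label):
    """Write MAGIC, a 4-byte little-endian int, a 2-byte length, then the text."""
    data = label.encode("utf-8")
    with open(path, "wb") as f:
        f.write(MAGIC)
        f.write(struct.pack("<iH", value, len(data)))
        f.write(data)

def load(path):
    """Read a file written by save(); reject anything without the signature."""
    with open(path, "rb") as f:
        if f.read(4) != MAGIC:
            raise ValueError("not a file written by this application")
        value, size = struct.unpack("<iH", f.read(6))
        return value, f.read(size).decode("utf-8")
```

With this layout the integer 123456789 is stored as the four bytes 15 CD 5B 07 rather than as nine digit characters, and a text file renamed to the custom extension is rejected because it lacks the signature. The text portion is still visible in a hex editor, so this is a format check, not encryption.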

Can DM script read images with the same extension (like *.mrc) from a folder?

I have a bunch of MRC images saved in a folder. I want my DM script to read and process them one by one. Now I just open some of them (like 20 images) at once and use FindNextImage in my script to sequentially process them.
I am wondering if I can define a path and let DM script go to this path to read out the MRC images one by one.
The best way to do this is with the function GetFilesInDirectory. An example of its use was posted with the answer to the following question:
Opening multiple files from folder ...
An example of how to extract the files with a particular file type extension was given in the answer to this question:
How could I open more than one image ...
In particular, have a close look at the method CreateFilteredFileList.

Creating a search app like EasyFind

On OS X there is a popular app called EasyFind that searches for strings inside a file's contents, or you can just do a name search. More importantly, it searches in hidden files and inside package contents.
So my research with using the Spotlight API leads me to believe that it is not possible to do this. Should I assume EasyFind is doing this all manually without using any Cocoa search API?
If that is true, does anyone know of some code to get me started, even just pseudo?
Basically I want to build an app that will find every single image on the drive no matter where it is or what permissions it has. This also includes icon files.
One other thing I can't seem to find an answer to is whether or not you can do a search like this on the command line in OS X.
Thanks!
On the command line you can use the find tool. It gives you access to all the files in the filesystem if you run it with root permissions (sudo). You can pipe its results to grep to search for strings inside the files. You can also use the strings command line tool to look for strings inside binary files.
This is not very complicated to implement within a Cocoa App. Just Google for how to iterate through all the hard drive contents. NSFileManager could be a good place to start digging.
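The general shape of such a walk, sketched in Python rather than Cocoa; the extension list and function name are assumptions for illustration. Hidden folders and package contents are ordinary directories on disk, so a plain recursive walk visits them, and os.walk skips directories it cannot read unless an error handler is supplied.

```python
import os

# Assumed set of image extensions, including icon files; adjust as needed.
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".gif", ".tiff", ".icns", ".ico"}

def find_images(root):
    """Yield paths under `root` whose extension is in IMAGE_EXTS."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if os.path.splitext(name)[1].lower() in IMAGE_EXTS:
                yield os.path.join(dirpath, name)
```

Running it over the whole drive with root permissions would be the equivalent of the sudo find approach above; matching on extension alone will miss images that have no extension or the wrong one.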
Also check out FindAnyFile. It is a nice app that does something similar to EasyFind, but only on file properties (name, dates, etc.). It doesn't read file contents.