ColdFusion indexing a corrupt PDF file

I am running a refresh on the first file and then an update on all the others. Recurse may not be necessary.
<cfindex
action="refresh"
collection="this_name_solr"
key="c:\inetpub\wwwroot\myappname\thefolder\thesubfolder\thefile.ext"
type="file"
urlpath="http://#application.root#/appname/thefolder/thesubfolder/thefile.ext"
extensions=".html, .htm, .xls, .xlsm, .doc, .docx, .pdf, .txt"
recurse="yes"
status="alldocs"
language="English">
It works fine until it hits a corrupt PDF file. If I try to open the file manually in a PDF reader, I get a message that the file may be corrupt.
I need it to get past this file and continue indexing the rest. I have tried a request timeout of three minutes, but that does not work. I have also attempted CFPDF info extraction, but it hangs reading the file too. I do not know how to test the document to see whether it is corrupt.
Ultimately, I would like it to give up on the file after about 3 minutes.
Any suggestions?
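One way to test a document before handing it to the indexer is a cheap structural check. The sketch below is in Python rather than ColdFusion and only illustrates the idea: a PDF normally starts with a `%PDF-` header and ends with a `startxref` entry followed by an `%%EOF` marker, so a file missing either is probably truncated or damaged. The function name `looks_like_intact_pdf` and the `tail_bytes` parameter are made up for this example. Passing the check does not prove the file is valid, and a file that hangs a parser may still pass it.

```python
from pathlib import Path


def looks_like_intact_pdf(path, tail_bytes=2048):
    """Heuristic only: check for the PDF header and the trailer markers.

    Returns False for files that are clearly truncated or not PDFs.
    A True result does NOT guarantee the file can be parsed.
    """
    data = Path(path).read_bytes()
    head = data[:1024]          # the header should appear near the start
    tail = data[-tail_bytes:]   # the trailer markers should appear near the end
    return b"%PDF-" in head and b"startxref" in tail and b"%%EOF" in tail
```

Files that fail the check could be skipped and logged, so the remaining documents are still indexed.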

Related

Repairing corrupted file while merging (Using PDFBox Merge)

I encountered a file which seems to be corrupted. Whenever I open and then close that file in Adobe Reader, a pop-up asks me to save the file again even though I made no changes. This made me believe that the file is corrupted.
I am using PDFBox to merge multiple files. While merging this particular file I get the error below:
Error: Expected a long type at offset 8489, instead got ''
I am looking for a way to first repair this file and then merge it, so that the code doesn't fail on errors in the file.
PS: If I save this particular file using Adobe Reader and then try to merge it, the merge succeeds.

Block PDF, DOCX, XLSX download, allow print

In our online laboratory software (PHP) we want to upload some files and allow users to print these documents. But we have to block downloading them, because a user can print from a downloaded file, and if we upload a newer version of the document this will be a problem.
Is there any way to block downloading these documents?
I am not sure if what you ask for is possible, because I can always print a file as PDF to my desktop instead of to my printer if I wanted. My organisation handles this issue by having a watermark on all PDF files that says "Uncontrolled copy when printed", so even if someone prints it out they will know that it might be outdated and should be careful.

Preventing other applications from opening a custom file in VB.NET

I have a text file. Now I have changed its file type from .txt to .abc. My VB.NET program loads the text into textboxes from that file. After changing the file type, however, other apps like Notepad and Word are still able to open and read my .abc file.
Is there any way that only my application will be able to open/read from the file and no other app would be able to do so? What I mean is, suppose I have a Photoshop document (.psd file): no app other than Photoshop itself can open it. How do I make my file unreadable by other apps?
There is no way to prevent an app that you don't develop from opening any file. The extensions are just there for helping us humans, and maybe a bit for the computer to know the default app you select for an extension.
Like you said, a .txt file can be opened by many many apps. You can open a .txt file with Notepad, Firefox, VSCode, and many others.
Same way, a .psd file can be opened by many many apps. You can open that .psd file with Photoshop, but also Notepad, Firefox, and VSCode, and probably the same apps as above.
The difference is which apps can read and understand the file.
In order to make a file not understandable by other apps, you need to put it into a format that they cannot recognize, because you designed it "in secret".
Like Visual Vincent said above, you could encrypt the file in some way, or you can use a binary format that basically only your app knows how to understand.
Since you don't own the app you want the file to be understood by, you either have to accept that it can be opened by any app that can open files, or you can try to encrypt the file outside the app, for example by zipping it with a password, and then decrypt or unzip it when you want to use it.
Firstly, any file can be read unless it is still held open by a particular process or service. Even Photoshop files can be 'read' by Notepad - try it!
So, an attempt at my first answer...
You can try a couple of methods to prevent opening the file, for instance, applying a file lock. As an example, SQL Server .mdf files are locked by the SQL Server service. This happens because the files are maintained in an open state; however, your application would have to remain running to keep these files open. Technically, though, the files can still be copied.
Another way is to set the hidden attribute for the file. This hides the file from less savvy users, but it will be displayed if the user shows hidden files.
And my second answer: You refer to the format of files by saying only Photoshop can read or write its own files (not true, but I know what you're saying).
The format of the file must be decided by yourself. You must determine how you are going to store the data that you output from your application. It looks like you have been attempting to write your application data into a text file. Perhaps you should try writing to binary files instead. Binary files, while not encrypted, as suggested by Visual Vincent in the comments to your question, still provide a more tailored approach to storing your data.
Binary files write raw binary data instead of humanised text. For instance, if you write an integer to the file it will appear as a string of four bytes, not your usual 123456789 textual format.
So, you really need to clarify what data you want to write to the file, decide on a set structure for your file (as you also have to be able to read it back into your application) and then be able to write the information.
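To make the text-versus-binary difference concrete, here is a small sketch in Python (not VB.NET) using the standard `struct` module. The helper names are made up for this illustration, and the choice of a little-endian signed 32-bit layout (`"<i"`) is an assumption; your own format could use a different layout. It shows the same integer stored as nine text characters and as four raw bytes.

```python
import struct


def int_as_text(value):
    # Textual form: one ASCII byte per digit, readable in any text editor.
    return str(value).encode("ascii")


def int_as_binary(value):
    # Binary form: "<i" packs a little-endian signed 32-bit integer (4 bytes).
    return struct.pack("<i", value)


def read_binary_int(raw):
    # Reverse of int_as_binary; the reader must know the layout in advance.
    return struct.unpack("<i", raw)[0]


text_bytes = int_as_text(123456789)      # 9 bytes of digits
binary_bytes = int_as_binary(123456789)  # 4 bytes, not human-readable
```

Note that this only makes the file harder to read by eye. Any program that knows or guesses the layout can still decode it, so it is not a substitute for encryption.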

Raw Access Logs File Too Big to Open

My raw access log file is 420 MB, and when I try to open it, the program says it is too big and can't open it. I tried opening it with Notepad, Notepad++ and Excel, and none of them could open it. And the file covers only 3 days' worth of logs. How can I view the file?
Instead of reading the whole file at once, I suggest using a tool to split the text file into smaller parts. On Windows there is a file splitter called HJSplit, and on Linux you can use the split command.
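If neither tool is at hand, the same splitting can be sketched in a few lines of Python. This is only an illustration: the function name `split_by_lines`, the `part_NNNN.log` naming, and the default of 100,000 lines per part are all arbitrary choices made for this example. The file is streamed line by line, so the whole 420 MB never has to fit in memory.

```python
from pathlib import Path


def split_by_lines(source, out_dir, lines_per_part=100_000):
    """Stream `source` line by line into numbered part files.

    Returns the list of part file paths in order.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    parts = []
    out = None
    # errors="replace" keeps going if the log contains odd bytes.
    with open(source, "r", encoding="utf-8", errors="replace") as src:
        for i, line in enumerate(src):
            if i % lines_per_part == 0:
                # Start a new part file every `lines_per_part` lines.
                if out is not None:
                    out.close()
                part = out_dir / f"part_{len(parts):04d}.log"
                parts.append(part)
                out = open(part, "w", encoding="utf-8")
            out.write(line)
    if out is not None:
        out.close()
    return parts
```

Each resulting part should then be small enough to open in an ordinary editor.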

FTP client sees a file that isn't there... How can I successfully delete/overwrite this "ghost" file?

So we have a client that creates "training packages" and then uploads them via FTP to their website. They create the training packages in PowerPoint, and then use some program to convert them into HTML/SWF files and package them within a folder. When they upload, they use FileZilla, and just transfer the entire folder over. The folder is uniquely named and uses no spaces or special characters.
These files have uploaded fine for about a year. Recently, they've run into a problem. Whenever they try to upload a training package folder, they are immediately presented with the "This file already exists, do you want to overwrite?" message. Except... the folder they're moving is brand new, and the file it's asking to overwrite DOESN'T EXIST. When they choose "Overwrite", the file looks like it transfers, but the file size is wrong, and the training package doesn't work correctly.
This happens with every training package they try to upload. It's not just a badly outputted package. Also, it's always the same file that has the problem--it's the main "player" for the training package, and though it contains different content for every package, it is the same file name (cplayer.swf) every time.
Things they've tried without success:
-Re-uploading the file again by itself, and overwriting
-Deleting the "bad" file and re-uploading the single file - Get the overwrite message again, even though the file DOES NOT EXIST.
-Renaming the file on the server and re-uploading the single file - Get the overwrite message.
-Renaming the single file locally within the package and uploading/renaming it - Won't let us rename because the file already exists.
-Used another FTP client - Same results as above, so not a client specific problem.
-Used a different FTP login - Same results as above, so not a permissions problem.
Other things of note:
-The file is small--it's not a time out problem. Plus, all other files upload fine, and some are a lot larger.
-They've emailed this file to me, and I've uploaded it successfully.
I am completely at my wits' end. Does anyone have any ideas where I can at least troubleshoot a little further?
Thanks for the non-help, the downvote, and the general lack of response on what was a pretty serious issue for me.
In case anyone else has a similar problem, here's what was going on:
Antivirus software (specifically Malwarebytes) was blocking THIS ONE SINGLE FILE. All I had to do was exclude the folder that contained the file.