Decoding the Print Job Files - pdf

When I give a print command, the print job file is stored in the /var/spool/cups directory, but it is in PDF format. Is there a way to decode that PDF file so that I can inspect what data it contains and take action on that user accordingly?

The scheduler stores job files in a spool directory, typically
/var/spool/cups. Two types of files will be found in the spool
directory: control files starting with the letter "c" ("c00001",
"c99999", "c100000", etc.) and data files starting with the letter "d"
("d00001-001", "d99999-001", "d100000-001", etc.) Control files are
IPP messages based on the original IPP Print-Job or Create-Job
messages, while data files are the original print files that were
submitted for printing. There is one control file for every job known
to the system and 0 or more data files for each job.
https://www.cups.org/doc/spec-design.html
You have to search for files like d000234 (data files, not c000234 print control files).
You can run file d000234 to find out the file format.
E.g.:
[root#pc cups]# file d000234
d000234: PostScript document text conforming DSC level 3.0, Level 2
For this job, I printed a PDF from my default system print dialog; somewhere along the way it was converted to PostScript. Open it with any application that can display PostScript.
E.g.:
okular d000234
Data files are only kept if you have enabled the "PreserveJobFiles" and "PreserveJobHistory" directives in cupsd.conf.
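As a rough illustration of the steps above, the following Python sketch lists the data files in a spool directory and guesses their format from the leading bytes. The function name list_data_files and the two-entry signature table are assumptions made for this example; the file command recognises far more formats.

```python
from pathlib import Path

# Leading bytes for the two formats mentioned in this thread (illustrative, not exhaustive).
SIGNATURES = {b"%PDF-": "PDF", b"%!PS": "PostScript"}


def list_data_files(spool_dir):
    """Return (name, format) pairs for CUPS data files ("d..." names) in spool_dir."""
    results = []
    for path in sorted(Path(spool_dir).iterdir()):
        # Control files start with "c"; only "d" files hold the submitted document.
        if not path.is_file() or not path.name.startswith("d"):
            continue
        with path.open("rb") as handle:
            head = handle.read(8)
        kind = next(
            (name for sig, name in SIGNATURES.items() if head.startswith(sig)),
            "unknown",
        )
        results.append((path.name, kind))
    return results


if __name__ == "__main__":
    # /var/spool/cups is normally readable only by root.
    for name, kind in list_data_files("/var/spool/cups"):
        print(name, kind)
```

Files reported as PDF or PostScript can then be opened with a viewer such as okular, as described above.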

Related

Using PDFtk to Update Web Server Files in Many Directories

Long-time reader, first-time poster. I am trying to automate a process that takes many .PDF floorplan files and combines them into a single .PDF floorplan, which will be referenced by a website.
To cut down on manual cut-and-paste from network shares to a web server as is current practice, I've written a PowerShell command as follows:
$SourcePath = '\\network\share\location\CAD Miniatures'
$DestinationPath = 'C:\inetpub\wwwroot\floorplans'
$LogFile = 'C:\Floorplan Transfer Logs\TransferLog.txt'
Robocopy $SourcePath $DestinationPath *.pdf /E /MIR /ZB /DCOPY:DAT /R:5 /W:10 /LOG+:$LogFile
My plan is to have this script run every hour as a Scheduled Task to mirror our local files and web files to ensure they remain up-to-date automatically.
The curve ball is that the files being copied are individual files within directories. I would like to take all .pdf files in a given folder and combine them into a single .pdf.
File structure is as such:
/floorplans
    /ABC
        /ABC-01.pdf
        /ABC-02.pdf
        /ABC-03.pdf
    /XYZ
        /XYZ-01.pdf
        /XYZ-02.pdf
        /XYZ-03.pdf
        /XYZ-04.pdf
        /XYZ-05.pdf
        /XYZ-06.pdf
Within each directory (or in a subdirectory), I would like the combined output file to be named simply abc.pdf and xyz.pdf, as per the examples above.
The file naming always follows the same format, but the number of files varies from a single file to over a dozen.
I would like to run the Robocopy and PDFtk tasks in the same script if possible (the idea being to update all files and then combine them). There is also no need to merge files in folders where no updates have been detected.
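One possible shape for the merge step, sketched in Python rather than PowerShell and offered only as an assumption about how it could be done: walk each folder, collect the NAME-NN.pdf pages, and call pdftk's cat operation only when the combined file is missing or older than its newest page. The helper names plan_merges and run_merges are hypothetical, and pdftk is assumed to be on the PATH.

```python
import subprocess
from pathlib import Path


def plan_merges(root):
    """Return (inputs, output) pairs for folders whose combined PDF is missing or stale."""
    plans = []
    for folder in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        output = folder / (folder.name.lower() + ".pdf")
        # Page files follow the NAME-NN.pdf pattern; sorting keeps page order.
        inputs = sorted(p for p in folder.glob("*-*.pdf") if p != output)
        if not inputs:
            continue
        newest = max(p.stat().st_mtime for p in inputs)
        if output.exists() and output.stat().st_mtime >= newest:
            continue  # nothing changed since the last merge
        plans.append((inputs, output))
    return plans


def run_merges(root):
    """Run pdftk once per folder that needs a fresh combined file."""
    for inputs, output in plan_merges(root):
        # "pdftk <inputs...> cat output <file>" concatenates the inputs in order.
        cmd = ["pdftk", *map(str, inputs), "cat", "output", str(output)]
        subprocess.run(cmd, check=True)
```

Calling run_merges on the floorplans directory after the Robocopy step would produce abc.pdf and xyz.pdf in their folders. Note that Robocopy's /MIR purges destination files that are absent from the source, so the combined files may need to be written outside the mirrored tree or otherwise excluded.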

Visual Basic read text file and delete files from that file

I know how to make my program read a file, but I don't know how to use that information to delete the files listed in that text file.
Example:
I have a text file called ban.txt. Inside that file there are two lines with the text abc.exe and cba.exe.
I want my program to read the content of ban.txt and then delete those specified files.
Assuming that you know how to read the file and find the file names, then just add this statement to a for-each:
My.Computer.FileSystem.DeleteFile(strFilename)
There are also options for displaying error messages and sending the file to the recycle bin.
My.Computer.FileSystem.DeleteFile("C:\Test.txt", FileIO.UIOption.AllDialogs, FileIO.RecycleOption.SendToRecycleBin)
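For comparison, the same read-then-delete loop can be sketched in Python. This is only an illustration of the for-each idea, not the Visual Basic answer itself; the function name delete_listed_files and the base_dir parameter are assumptions for the example.

```python
from pathlib import Path


def delete_listed_files(list_file, base_dir):
    """Delete each file named in list_file (one name per line) from base_dir."""
    deleted = []
    for line in Path(list_file).read_text().splitlines():
        name = line.strip()
        if not name:
            continue  # skip blank lines
        target = Path(base_dir) / name
        if target.is_file():
            target.unlink()  # permanent delete, no recycle bin
            deleted.append(name)
    return deleted
```

The function returns the names it actually removed, which makes it easy to log or report entries in ban.txt that were not found.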

finding a corrupted part from the parts of a split archive

I have 7 files with extensions like xyz.rar.001 through xyz.rar.007; clearly they are parts of a single file, and I have all 7 parts. I joined them with a file joiner into a single file, xyz.rar, and tried to extract it with WinRAR, which says that the archive is corrupted. Clearly one or two parts are corrupted. Is there any way to find them? I don't want to re-download all of them. Note: WinRAR can detect a corrupt part if the parts were split using WinRAR (with extensions like part1.rar, part2.rar, etc.) but not if they are named rar.001.
Parts .001 - .006 should have the same size. Check if there is a file with a different byte size.
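The size check can be scripted. The sketch below, in Python, reports every part except the last whose size differs from the most common size; the function name odd_sized_parts and the default glob pattern are assumptions for this example.

```python
from collections import Counter
from pathlib import Path


def odd_sized_parts(folder, pattern="xyz.rar.*"):
    """Return names of parts, other than the last, whose size differs from the most common size."""
    parts = sorted(Path(folder).glob(pattern))
    if len(parts) < 3:
        return []  # too few parts to establish a common size
    body = parts[:-1]  # the last part is allowed to be shorter
    sizes = Counter(p.stat().st_size for p in body)
    expected = sizes.most_common(1)[0][0]
    return [p.name for p in body if p.stat().st_size != expected]
```

This only catches truncated or padded parts; a part with the right size can still contain damaged bytes.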
Are there multiple files in the RAR or just the one? With multiple you could run a Test and see which is the first file to fail.
I think it's strange that a second tool (e.g. HJSplit) was used to split the RAR archive. This makes me think that .002 could be a RAR archive too. Try opening xyz.rar.001 with WinRAR and test/extract it. It is fairly common for RAR archives to have the extension .001 instead of .rar.
Naming your archives in WinRAR like this can be accomplished by putting "xyz.rar.001" as Archive name on the General tab and checking "Old style volume names" on the Advanced tab.
If I then join the files with HJSplit, I get one .rar file (that is corrupt). When I Test it, it says "Next volume is required". In the diagnostic messages I can see "The required volume is absent" and "CRC failed in X. The file is corrupt"
If there is one file stored inside the RAR and the RAR is indeed just chopped up into 7 pieces, there is no way of telling without additional files such as .sfv or .par2. (unless the RAR does not use compression: you can parse the underlying file for errors and calculate the part where it goes wrong)
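If an .sfv file is available, each part can be checked against its recorded CRC32. The following Python sketch assumes the usual SFV layout (one "filename checksum" pair per line, with ";" starting a comment) and that the parts sit next to the .sfv file; the function names are hypothetical.

```python
import zlib
from pathlib import Path


def crc32_of(path, chunk_size=1 << 20):
    """Compute the CRC32 of a file, reading it in chunks."""
    crc = 0
    with open(path, "rb") as handle:
        while chunk := handle.read(chunk_size):
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF


def failed_parts(sfv_path):
    """Return names listed in the .sfv whose CRC32 does not match, or that are missing."""
    sfv_path = Path(sfv_path)
    bad = []
    for line in sfv_path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith(";"):
            continue  # ";" starts a comment in SFV files
        name, expected = line.rsplit(None, 1)
        part = sfv_path.parent / name
        if not part.is_file() or crc32_of(part) != int(expected, 16):
            bad.append(name)
    return bad
```

Only the parts named in the returned list would need to be downloaded again.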

recovering files with scrambled file names

I have a folder of files with scrambled file names. The file extensions are scrambled too. The folder contains a variety of different file formats. The files are not encrypted.
example: original file name = abcde.pdf
scrambled file name = !##FDZ13
Is there a way to recover the original file names? If not, is there a way to differentiate the file formats (.pdf, .png, ...)? Ultimately, I wish to access and use these files again.
I am working with Windows.
Wei, in principle, the case is quite easy.
I assume you know the set of file types that can possibly appear there. Let's say we expect there to be DOC, PDF and PNG files.
Then I would go ahead and do the following:
- create a subdirectory for every file type you expect
- for each file f
  - for each file type t
    - move f under a nice name with appropriate file extension to the subdirectory for file type t
    - try to open the file with the correct application for t
    - continue with next file if it works
    - otherwise continue with next file type
- at this point the directory should contain no files anymore
- move all files from the subdirectories back to this one
- remove the subdirectories.
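A variation on the procedure above, sketched in Python: instead of trying each application in turn, it inspects the first bytes of every file (its "magic number") and renames the file with a matching extension. This is a different technique from the trial-opening loop, and it cannot recover the original names. The signature table covers only the three example types, and the recovered_NNNN naming scheme is an assumption for this sketch.

```python
from pathlib import Path

# Leading bytes ("magic numbers") for the three example formats.
MAGIC = {
    b"%PDF-": ".pdf",
    b"\x89PNG\r\n\x1a\n": ".png",
    b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1": ".doc",  # OLE2 container, also used by .xls/.ppt
}


def guess_extension(path):
    """Return the extension whose signature matches the file's first bytes, or None."""
    with open(path, "rb") as handle:
        head = handle.read(16)
    for signature, extension in MAGIC.items():
        if head.startswith(signature):
            return extension
    return None


def add_extensions(folder):
    """Rename each recognised file to recovered_NNNN<ext>; return {old name: new name}."""
    renamed = {}
    files = sorted(p for p in Path(folder).iterdir() if p.is_file())
    for index, path in enumerate(files, start=1):
        extension = guess_extension(path)
        if extension is None:
            continue  # leave unrecognised files untouched
        new_path = path.with_name(f"recovered_{index:04d}{extension}")
        path.rename(new_path)
        renamed[path.name] = new_path.name
    return renamed
```

Files that remain without an extension afterwards are the ones that still need the manual trial-and-error approach.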

Programmatically splitting PDF pages into their own PDFs in UNIX

I am trying to write a program that takes a .pdf file as input and separates each page into its own .pdf file from the UNIX command line. I have tried SplitPdf, but for some reason I keep getting errors.
Update: I have already tried pdftk, but it has poor performance and a limitation on the size of the PDF file.
Use pdftk.
The burst command is what you are after.
Man page section: http://www.pdflabs.com/docs/pdftk-man-page/#dest-op-burst
burst
Splits a single, input PDF
document into individual pages. Also
creates a report named doc_data.txt
which is the same as the output from
dump_data. If the output section is
omitted, then PDF pages are named:
pg_%04d.pdf, e.g.: pg_0001.pdf,
pg_0002.pdf, etc. To name these pages
yourself, supply a printf-styled
format string in the output section.
For example, if you want pages named:
page_01.pdf, page_02.pdf, etc., pass
output page_%02d.pdf to pdftk.
Encryption can be applied to the
output by appending output options
such as owner_pw, e.g.: pdftk in.pdf
burst owner_pw foopass
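To call burst from a program, the command line described in the man page excerpt can be assembled and run as a subprocess. The Python sketch below assumes pdftk is installed and on the PATH; the helper names burst_command and burst are made up for this example.

```python
import subprocess


def burst_command(pdf_path, name_format="page_%02d.pdf"):
    """Build the pdftk argument list that splits pdf_path into one file per page."""
    return ["pdftk", str(pdf_path), "burst", "output", name_format]


def burst(pdf_path, name_format="page_%02d.pdf"):
    """Run pdftk burst; raises CalledProcessError if pdftk exits with an error."""
    subprocess.run(burst_command(pdf_path, name_format), check=True)
```

For example, burst("in.pdf") would write page_01.pdf, page_02.pdf, and so on into the current directory, along with the doc_data.txt report mentioned above.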