Extract and recompose PDF file using Origami

This is regarding Origami, the Ruby tool for exploring PDF files at http://esec-lab.sogeti.com/pages/Origami
By way of example, I am trying to open a PDF file, extract it, and then rewrite the original PDF. This is the complete code I am trying to use to accomplish this:
hg clone https://code.google.com/p/origami-pdf/
cd origami-pdf/
rake
cd ..
curl 'http://www.ada.gov/hospcombrprt.pdf' -o hospcombrprt.pdf
origami-pdf/bin/pdf2ruby -x hospcombrprt.pdf
mv hospcombrprt.pdf hospcombrprtORIG.pdf
cd hospcombrprt
ruby hospcombrprt.rb # THIS STEP PRODUCES ERRORS
cmp hospcombrprt.pdf ../hospcombrprtORIG.pdf || echo FAILED
However, this produces the following error:
/Users/williamentriken/Developer/origami-pdf/lib/origami/page.rb:75:in `pages': Invalid page tree (Origami::InvalidPDFError)
from /Users/williamentriken/Developer/origami-pdf/lib/origami/pdf.rb:689:in `compile'
from /Users/williamentriken/Developer/origami-pdf/lib/origami/pdf.rb:233:in `save'
from hospcombrprt.rb:189:in `<main>'
Has anyone else had success performing this operation with this library? If so, could you please share how?

Original Post:
I played around with the library for a while, but I kept getting errors and minor bugs, such as replicated pages and missing pages...
...you should read the author's comment about the limits of using the Origami library.
I recommend the combine_pdf gem; it's great for simple PDF manipulations, such as merging, stamping and the like.
Update:
I looked at the specific PDF file, and the issue might be related to an unsupported PDF version.
The http://www.ada.gov/hospcombrprt.pdf file is encrypted with version 4 encryption, which the PDF standard (starting with PDF 1.5) describes as:
"(PDF 1.5) The security handler defines the use of encryption and decryption in the document, using the rules specified by the CF, StmF, and StrF entries."
The encryption uses AES v.2, which is limited to PDF 1.6 and above:
"AESV2 (PDF 1.6) The application shall ask the security handler for the encryption key and shall implicitly decrypt data with "Algorithm 1: Encryption of data using the RC4 or AES algorithms", using the AES algorithm in Cipher Block Chaining (CBC) mode with a 16-byte block size and an initialization vector that shall be randomly generated and placed as the first 16 bytes in the stream or string."
So, even if the decryption code were written, the way to apply that code might not be known, due to the way the PDF file is structured...
...It might be better to start with simple PDF files and then patch anything that isn't supported just yet.
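As a crude, library-agnostic sketch (Scala, not Origami; the file name is the one from the question), one way to check up front whether a PDF declares an encryption handler at all, before trying to extract and rebuild it, is to look for an /Encrypt entry near the trailer:

import java.nio.file.{Files, Paths}

object EncryptCheck extends App {
  val bytes = Files.readAllBytes(Paths.get("hospcombrprt.pdf"))
  // the trailer (or cross-reference stream dictionary) sits near the end of the file
  val tail = new String(bytes.takeRight(4096), "ISO-8859-1")
  if (tail.contains("/Encrypt"))
    println("This PDF declares an /Encrypt dictionary, so decryption support is required")
  else
    println("No /Encrypt entry found near the trailer")
}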

Related

How to decode a base64 PDF without encryption in TCL?

I am using Tcl in an integration suite called Cloverleaf.
The command I'm using, set dcpdf [::base64 -mode decode $pdf], is correct, and the package is loaded earlier in the code. The problem is that after the base64 is decoded and we try to open the file, it asks for a password, even though the file has no password of its own. Is there an argument to the decode command that prevents this issue?
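This is not a Tcl answer, just a minimal sketch of the same operation in Scala (file names are placeholders): decode the Base64 payload, check the "%PDF" magic bytes, and write the raw bytes with no text-mode translation. If the result still asks for a password, the source PDF is genuinely encrypted and no decode argument will change that.

import java.nio.file.{Files, Paths}
import java.util.Base64

object DecodePdf extends App {
  // read the Base64 text and strip whitespace/newlines before decoding
  val encoded = Files.readString(Paths.get("payload.b64")).replaceAll("\\s", "")
  val bytes = Base64.getDecoder.decode(encoded)
  require(bytes.take(4).sameElements("%PDF".getBytes), "decoded data is not a PDF")
  Files.write(Paths.get("decoded.pdf"), bytes) // raw binary write, no newline translation
}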

Implementing basic S3 compatible API with akka-http

I'm trying to implement a file storage service with a basic S3-compatible API using akka-http.
I use the S3 Java SDK to test my service API and ran into a problem with the putObject(...) method. I can't consume the file properly on my akka-http backend. I wrote a simple route for test purposes:
// imports this snippet relies on
import java.io.File
import akka.http.scaladsl.server.Directives._
import akka.stream.scaladsl.FileIO

def putFile(bucket: String, file: String) = put {
  extractRequestEntity { ent =>
    // stream the raw request body straight into a file on disk
    val finishedWriting = ent.dataBytes.runWith(FileIO.toPath(new File(s"/tmp/$file").toPath))
    onComplete(finishedWriting) { ioResult =>
      complete("Finished writing data: " + ioResult)
    }
  }
}
It saves the file, but the file is always corrupted. Looking inside the file, I found lines like this:
"20000;chunk-signature=73c6b865ab5899b5b7596b8c11113a8df439489da42ddb5b8d0c861a0472f8a1".
When I try to PUT a file with any other REST client, it works as expected.
I know the S3 SDK uses the "Expect: 100-continue" header, and maybe that is what causes problems.
I really can't figure out how to deal with that. Any help appreciated.
This isn't exactly corrupted. Your service is not accounting for one of the four¹ ways S3 supports uploads to be sent on the wire, using Content-Encoding: aws-chunked and x-amz-content-sha256: STREAMING-AWS4-HMAC-SHA256-PAYLOAD.
It's a non-standards-based mechanism for streaming an object, and includes chunks that look exactly like this:
string(IntHexBase(chunk-size)) + ";chunk-signature=" + signature + \r\n + chunk-data + \r\n
...where IntHexBase() is pseudocode for a function that formats an integer as a hexadecimal number as a string.
This chunk-based algorithm is similar to, but not compatible with, Transfer-Encoding: chunked, because it embeds checksums in the stream.
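As an illustration of the framing only, here is a minimal sketch of a helper (the name is mine, not part of akka-http or any SDK) that strips this framing from a fully buffered body while ignoring the chunk signatures entirely:

import akka.util.ByteString

def stripAwsChunkedFraming(body: ByteString): ByteString = {
  val CRLF = ByteString("\r\n")
  @annotation.tailrec
  def loop(rest: ByteString, acc: ByteString): ByteString = {
    val headerEnd = rest.indexOfSlice(CRLF)
    if (headerEnd < 0) acc
    else {
      // e.g. "20000;chunk-signature=73c6b865ab..."
      val header = rest.take(headerEnd).utf8String
      val size = Integer.parseInt(header.takeWhile(_ != ';'), 16)
      if (size == 0) acc // final zero-length chunk ends the stream
      else {
        val dataStart = headerEnd + CRLF.length
        val data = rest.slice(dataStart, dataStart + size)
        loop(rest.drop(dataStart + size + CRLF.length), acc ++ data)
      }
    }
  }
  loop(body, ByteString.empty)
}

Buffering the whole entity first (for example with ent.dataBytes.runFold(ByteString.empty)(_ ++ _)) is only reasonable for small test uploads; a real implementation would decode incrementally and verify each chunk signature.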
Why did they make up a new HTTP transfer encoding? It's potentially useful on the client side because it eliminates the need to either "read your payload twice or buffer [the entire object payload] in memory [concurrently]" -- one or the other of which is otherwise necessary if you are going to calculate the x-amz-content-sha256 hash before the upload begins, as you otherwise must, since it's required for integrity checking.
I am not overly familiar with the internals of the Java SDK, but this type of upload might be triggered by using .withInputStream(), or it might be standard behavior for files too, or for files over a certain size.
Your minimum workaround would be to throw an HTTP error if you see x-amz-content-sha256: STREAMING-AWS4-HMAC-SHA256-PAYLOAD in the request headers since you appear not to have implemented this in your API, but this would most likely only serve to prevent storing objects uploaded by this method. The fact that this isn't already what happens automatically suggests that you haven't implemented x-amz-content-sha256 handling at all, so you are not doing the server-side payload integrity checks that you need to be doing.
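As a sketch of that minimum workaround in akka-http terms (the wrapper name is mine, not from the question), you could refuse the upload outright whenever the client announces the streaming payload encoding:

import akka.http.scaladsl.model.StatusCodes
import akka.http.scaladsl.server.{Directives, Route}
import Directives._

def rejectAwsChunkedUploads(inner: Route): Route =
  optionalHeaderValueByName("x-amz-content-sha256") {
    case Some("STREAMING-AWS4-HMAC-SHA256-PAYLOAD") =>
      complete(StatusCodes.NotImplemented -> "aws-chunked (streaming SigV4) uploads are not supported")
    case _ => inner
  }

The existing route would then be wrapped as rejectAwsChunkedUploads(putFile(bucket, file)).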
For full compatibility, you'll need to implement the algorithm supported by S3 and assumed to be available by the SDKs, unless the SDKs specifically support a mechanism for disabling this algorithm -- which seems unlikely, since it serves a useful purpose, particularly (it appears) for streams whose length is known but that aren't seekable.
¹ one of four -- the other three are a standard PUT, a web-based html form POST, and the multipart API that is recommended for large files and mandatory for files larger than 5 GB.

Enable Transfer Acceleration for AWS SDK for macos

Disclaimer: I posted this originally on Code Review but as the code does not currently work I was suggested to post at SO instead.
I have converted the official AWS SDK for iOS (v2.5.0) framework (focusing on S3) to macOS, and everything is working as expected with uploads and downloads. However, I also wanted to enable Transfer Acceleration for S3 using AWSTransferManager. I know that you can enable Transfer Acceleration (TA) using AWSTransferUtility, but that utility uses pre-signed requests that are valid for only 50 minutes (useful for iOS but not for macOS). I would like to be able to transfer files that are large and can take hours even when using Transfer Acceleration.
I have edited the original code from AWS to enable TA for AWSTransferManager; however, I still cannot get this to work properly, as the final signing of the upload/download request fails. The error message is:
Message=The request signature we calculated does not match the signature you provided. Check your key and signing method.}]
For the most part, I have edited the files AWSSignature, AWSS3TransferManager, and AWSService (AWSServiceConfiguration). I think that the signing error occurs because I am editing a path or URL without correctly re-signing the changed request (probably in AWSSignature.m). As I am uncertain where my code breaks, I have created a repository with all of the AWS SDK macOS files required to compile the framework, including a unit test. If I run the test initializing the call to AWSServiceConfiguration with:
AWSServiceConfiguration *configuration = [[AWSServiceConfiguration alloc] initWithRegion:[region aws_regionTypeValue] credentialsProvider:credentialsProvider
                                           accelerateModeEnabled:@(NO)
                                           bucketName:self.testBucketName];
Then everything works as expected, and the test file uploads and downloads correctly. However, if I try to turn Transfer Acceleration on (I have already made sure that my bucket has acceleration enabled), then it fails with the signing error above.
AWSServiceConfiguration *configuration = [[AWSServiceConfiguration alloc] initWithRegion:[region aws_regionTypeValue] credentialsProvider:credentialsProvider
                                           accelerateModeEnabled:@(YES)
                                           bucketName:self.testBucketName];
In fact, it seems like the test script is trying to upload a file much larger than the test file (3 MB) really is. I assume my error is related to the signing of the URL body in some way, as the size of the file seems wrong.
I know that finding this error requires a bigger effort than is usually expected on SO, and it is not something many will throw themselves at (but hopefully some), as it involves complex code and is time-consuming. However, I do believe that if we could make this work, many would find an AWS SDK for macOS framework with Transfer Acceleration very useful.
I hope you will take a look and try to identify what may be breaking the request signing.
All code for the framework + test example is available here: https://github.com/trondkr/aws-sdk-macos-TA. To run the unit test, you need to provide the secret key and access ID for your AWS S3 account and the name of a bucket that has Transfer Acceleration enabled.
Thanks. Cheers, Trond

.scr file and APDU

I am using jCardSim with Java Card version 2.2.2, and I want to know how the .scr file is associated with the .java file. (I am using the Java Card simulator in the NetBeans IDE, not an actual smart card.)
If someone can provide me with some useful links on how these two files are related, I would greatly appreciate it.
I have looked through the following links, but they were not specifically helpful in illustrating how I can modify the .scr file in association with my .java file:
Chapter 5 - Converting Java Class Files
How to write a Java Card applet: A developer's guide - JavaWorld
Basically, what I am trying to do is create a test applet (without needing .scr files to send and receive APDUs for my other files):
- I want to be able to read an APDU which contains the parameters for a function in my process method
- That function will then create another APDU as its output, which another function will read as one of its parameters
As far as I understand, the .scr file is used to send command APDUs that are read by the applet, but there is no way to write to the .scr file.
How can I create my own .java test file that sends and receives APDUs instead of relying on the .scr?
I can provide more details of what my code looks like, if required.
Thanks
You can communicate with the simulator using the method described in the quick start guide of jCardSim, which also describes how to select an applet using the correct AID. The inherited process(APDU) method will receive any APDU sent using the methods described in the quick start guide, starting with the SELECT by NAME APDU (INS = A4). After that, it is normal APDU processing.
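For example, here is a minimal test sketch (Scala syntax to match the other threads; the same calls translate one-to-one into a .java test class), assuming a recent jCardSim release that exposes the CardSimulator API from the quick start guide; MyApplet and the AID bytes are placeholders for your own applet:

import com.licel.jcardsim.smartcardio.CardSimulator
import com.licel.jcardsim.utils.AIDUtil
import javax.smartcardio.CommandAPDU

object ApduTest extends App {
  val simulator = new CardSimulator()
  val aid = AIDUtil.create("F000000001")          // placeholder applet AID
  simulator.installApplet(aid, classOf[MyApplet]) // install your applet in the simulator
  simulator.selectApplet(aid)                     // SELECT by NAME (INS = A4)

  // Any APDU sent here is delivered straight to the applet's process(APDU) method,
  // so no .scr script is involved at all.
  val response = simulator.transmitCommand(new CommandAPDU(0x00, 0x01, 0x00, 0x00))
  println(f"SW = ${response.getSW}%04X")
}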

Getting EPG info from DVB-T

I'm interested in grabbing the EPG data from DVB-T streams. Does anyone know of any C libraries or an alternative means of getting the data?
tv_grab_dvb can do this. See the subversion repository for sources.
tv_grab_dvb is made to work with the stream grabbed from the DVB-T card using dvbtools on Linux, but it may be portable to other platforms - I think it just works with the raw data from the stream.
...a new answer to an old question:
I wrote a utility called dvbtee that can be used as a C++ library, a cross-platform command line utility, or a Node.js module.
(Despite it being a C++ library, one can still link to it from C code.)
The command line utility will parse your streams and output the EPG; depending on the arguments you specify, it can generate plain text or a JSON block of data.
dvbtee: a digital television streamer / parser / service information aggregator supporting various interfaces including telnet CLI & http control
The Node.js module will emit events containing the PSIP table data (along with EPG info).
node-dvbtee: MPEG2 transport stream parser for Node.js with support for television broadcast PSIP tables