Parse PDF content stream as a string in Xcode? - objective-c

I am trying to get the contents stream out of a PDF's internal structure using Xcode.
I have managed to get to the array of contents using:
CGPDFDictionaryGetArray(str, "Contents", &val)
Then, counting the number of objects within the array, it's returning 8, which is far fewer than what Acrobat Pro shows.
The objects within the array seem to be of type kCGPDFObjectTypeStream; I'm not sure what I can do with this.
Any help would be much appreciated, many
thanks,
Jacob

The page /Contents entry can be a stream object or an array of stream objects. When you have an array of stream objects, you get the complete page content by merging these streams into a single one (append one stream after another).
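The CGPDF calls for getting each stream's decoded bytes are macOS-specific, but the merge itself is just concatenation. A minimal sketch in Python, assuming you have already obtained the decoded bytes of each stream in the /Contents array (e.g. via CGPDFStreamCopyData):

```python
def merge_content_streams(streams):
    """Concatenate the decoded bytes of a page's /Contents array.

    The PDF spec treats an array of content streams as one logical
    stream; a whitespace separator is inserted between parts so a
    token ending one stream can't fuse with a token starting the next.
    """
    return b"\n".join(streams)

# e.g. two fragments that together draw some text
page_content = merge_content_streams([b"BT /F1 12 Tf", b"(Hello) Tj ET"])
```

The resulting single byte string is what you would then tokenize as one page content stream.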

Related

PDF 1.6 cross reference stream decoding

I am trying to duplicate the solution shown here but no luck.
Basically, Ivan Kuckir managed to decompress a PDF 1.6 xref stream by first decrypting it and then decompressing it. That stream, like mine, belongs to an encrypted PDF file.
One issue here, however, is that the PDF 1.6 spec states on p. 83 that "The cross-reference stream must NOT be encrypted, nor may any strings appearing in the cross-reference stream dictionary. It must not have a Filter entry that specifies a Crypt filter (see 3.3.9, “Crypt Filter”)." What I understand from this is that, like the cross-reference tables before them, xref streams must not be encrypted.
When I try to inflate the stream, the zlib DLL crashes. It also crashes when I decrypt first and then inflate... Has anyone managed to duplicate Ivan Kuckir's solution? Thanks.
P.S. I tried to ask the question in the above thread but for some reason it was deleted by the admin...
This is the link to the object: https://drive.google.com/file/d/1DwOf3zarg9p_B8DNZ2gZdaBr43NKDWR3/view?usp=sharing
I replaced the stream characters with a hex string so it can be pasted safely.
So, as you read in the spec, xref streams are not encrypted. So you don't need to decrypt any strings in the xref stream dictionary, nor the stream itself. What you need to take into account are the /Filter and /DecodeParms entries when decoding the stream.
Most of the time an xref stream uses the /FlateDecode filter together with parameters that allow for better compression due to the way an xref stream is structured. So have a look at sections 7.4.4.1 and 7.4.4.4 of the PDF specification.
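To illustrate what those sections describe, here is a hedged sketch in Python (using the zlib module in place of the zlib DLL) of decoding a FlateDecode'd xref stream whose /DecodeParms specify a PNG predictor. /Columns is the row width in bytes (the sum of the /W widths), and the common case is that every row carries the PNG "Up" predictor tag (2); other tags exist but are not handled in this sketch:

```python
import zlib

def decode_xref_stream(raw, columns, predictor=12):
    """Inflate an xref stream and undo the PNG 'Up' row predictor.

    `columns` comes from /DecodeParms /Columns; `predictor` >= 10
    means each row is prefixed with a one-byte PNG predictor tag.
    Only tag 2 ('Up') is handled here; real files may use others.
    """
    data = zlib.decompress(raw)
    if predictor < 10:           # no predictor, plain flate output
        return data
    out, prev = [], bytes(columns)
    stride = columns + 1         # 1 tag byte + `columns` data bytes per row
    for i in range(0, len(data), stride):
        tag, row = data[i], bytearray(data[i + 1:i + stride])
        if tag != 2:
            raise NotImplementedError("PNG predictor tag %d" % tag)
        for j in range(columns):
            row[j] = (row[j] + prev[j]) & 0xFF   # add the byte directly above
        prev = bytes(row)
        out.append(prev)
    return b"".join(out)
```

In my experience, zlib failing outright usually means the input bytes aren't a valid zlib stream at all (wrong offset into the file, or an unexpected /Filter), which is worth checking before suspecting encryption.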

Vulkan: Ways of reading attachment data in subsequent RenderPasses

Given 2 RenderPasses, A and B, and an attachment X accessed by both, if A does a .storeOp=store on X on its last subpass, and B does a .loadOp=load on X on its first subpass, can B read from X as an input attachment?
Furthermore, I can think of 3 ways of reading attachment data from a previous RenderPass.
Using a sampler.
(Potentially) as an input attachment.
As a storage image.
Are there any other ways?
Once a render pass instance has concluded, all attachments cease to be attachments. They're just regular images from that point forward. The contents of the image are governed by the render pass's storage operation. But once the storage operation is done (subject to the correct use of dependencies), the image has the data produced by the storage operation.
So there is no such thing as an attachment "from a previous RenderPass". There is merely an image and its data. How that image got its data (again, subject to the correct use of dependencies) is irrelevant to how you're going to use it now. The data is there, and it can be accessed in any way that any image can be accessed, subject only to the restrictions you choose to impose.
So if an image has some data, you use it as an attachment, and you use a load operation of load, that attachment will contain the data the image held before it became an attachment, regardless of how the data got there. That's how load operations work.

SBJson Stream Parser

I'm working in Xcode 4.3.2 + building for an app in iOS 5.
I've decided to use SBJson to parse streams of data from our server. I've verified that I'm receiving a valid JSON response from the server. My question concerns the design behind the classes SBJsonStreamParser and the SBJsonParser.
It appears that in SBJsonParser the method "objectWithData" takes the data received from the JSON response and uses the SBJsonStreamParserAccumulator to append the stream of data into a single JSON document. Once the data stream is gathered into one object, it is then parsed by the "parse" method in SBJsonStreamParser.
I've run into several issues when requesting larger JSON documents. The size of the responses seems to be reasonable (specifically a 9.4 KB response). It appears that SBJsonStreamParser breaks when getting a data stream greater than a certain size. The parser succeeds when the response is small (~3 KB), but fails when the response is larger (~10 KB).
I used NSLog to verify that in both cases, pulling a small & large stream, the methods are successfully receiving the full JSON document - because it looks like [{"id": .... 123}]. I'm convinced that the issue is that the data stream is too long.
I'm wondering if I'm using SBJson incorrectly or is this simply a limitation of the parser? Is there anything that I can configure that allows SBJsonStreamParser to not throw an error for larger (but reasonable) data streams & continue to parse the full response?
Thanks in advance!
Actually you have the workings of objectWithData: backwards. SBJsonStreamParserAccumulator is used to accumulate the parsed output, not the unparsed data stream.
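As a quick sanity check that a ~10 KB document is well within what any conforming JSON parser handles (so size alone shouldn't be the culprit), here is a sketch in Python that builds and parses a response shaped roughly like the [{"id": ...}] document above:

```python
import json

# Build a ~10 KB JSON array shaped roughly like [{"id": ...}, ...]
doc = json.dumps([{"id": i, "name": "item-%d" % i} for i in range(400)])

parsed = json.loads(doc)  # parses fine despite the ~10 KB size
```

If a parser fails at this scale, the cause is more likely truncated or concatenated input than the document's length.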

.NET ZipPackage vs DotNetZip when getting streams to entries

I have been using the ZipPackage-class in .NET for some time and I really like the simple and intuitive API it has. When reading from an entry I do entry.GetStream() and I read from this stream. When writing/updating an entry I do entry.GetStream(FileAccess.ReadWrite) and write to this stream. Very simple and useful because I can hand over the reading/writing to some other code not knowing where the Stream comes from originally.
Now, since the ZipPackage API doesn't contain support for entry properties such as LastModified etc., I have been looking into other zip APIs such as DotNetZip. But I'm a bit confused over how to use it. For instance, when wanting to read from an entry, I first have to extract the entire entry into a MemoryStream, seek to the beginning and hand over this stream to my other code. And to write to an entry, I have to input a stream that the ZipEntry itself can read from. This seems very backwards to me. Am I using this API the wrong way?
Isn't it possible for the ZipEntry to deliver the file straight from the disk where it is stored and extract it as the reader reads it? Does it really need to be fully extracted into memory first? I'm no expert but it seems wrong to me.
Using the DotNetZip library does not require you to read the entire zip file into a memory stream. When you instantiate an instance of ZipFile as shown below, the library only reads the zip file headers. The zip file headers contain properties such as last modified, etc. Here is an example of opening a zip file. The DotNetZip library then reads the zip file headers and constructs a list of all entries in the zip:
using (Ionic.Zip.ZipFile zipFile = Ionic.Zip.ZipFile.Read(this.FileAbsolutePath))
{
...
}
It's up to you to then extract zip entries, either to a stream, to the file system, etc. In the example below, I'm using a string property accessor on zipFile to get the entry named SomeFile.txt. The matching ZipEntry object is then extracted to a memory stream.
MemoryStream memStr = new MemoryStream();
zipFile["SomeFile.txt"].Extract(memStr); // or extract directly to e.g. Response.OutputStream
Zip entries must be read into the .NET process space in order to be inflated; there's no way to bypass that by going straight to the filesystem. This is similar to how a Windows Explorer shell zip extractor works - the Windows shell extensions for 7-Zip, or the Windows built-in Compressed Folders, have to read entries into memory and then write them to the file system in order for you to be able to open an entry.
Okay, I'm answering this myself because I found the answers. There are apparently methods for both of the things I wanted in DotNetZip: for opening a read stream -> myZipEntry.OpenReader(), and for opening a write stream -> myZipFile.UpdateEntry(e, (fn, obj) => Serialize(obj)). This works fine.
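For comparison, the same streaming pattern exists in Python's zipfile module (not DotNetZip, just an analogue to OpenReader()): ZipFile.open() hands back a file-like object that inflates bytes on demand, so the entry is never fully materialized up front:

```python
import io
import zipfile

# Build a small zip in memory with one deflated entry
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as z:
    z.writestr("SomeFile.txt", "hello " * 1000)

# Stream the entry back out: each read() inflates only what is asked for
with zipfile.ZipFile(buf) as z:
    with z.open("SomeFile.txt") as stream:
        first = stream.read(5)   # -> b"hello"
        rest = stream.read()     # the remainder, inflated as it is read
```

The design choice is the same in both libraries: the reader pulls compressed bytes through the inflater incrementally instead of requiring a full extraction first.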

Sharing Data between different AppDomains

I'm trying to send data from newDomain to currentDomain.
I used DoCallBack to load a list of .dll files and extract file & assembly information as a Dictionary.
Then I tried to send the key/value data to currentDomain.
It's my first time using AppDomains, so I've only found a rough way: Set/GetData.
Using that requires too much converting, and it looks like it can throw exceptions in a variety of situations.
If I could send the Dictionary directly, that would be an excellent way to do it.
Please let me know!