How to match a ZLib stream between VBA 6/VBA 7 and Java 8?

We are able to do the following.
In VBA 6/VBA 7:
Reference a 32-bit zlibwapi.dll (VBA 6) or a 64-bit zlibwapi.dll (VBA 7).
Invoke the compress() or compress2() functions to generate compressed streams.
Invoke the uncompress() and uncompress2() functions to decompress compressed streams.
In Java 8 (JDK 1.8 on Tomcat 8):
Have a simple Java program that compresses data using a new Deflater() instance.
Have a simple Java program that decompresses using an Inflater() instance.
We fail when VBA sends a compressed stream for the Java servlet to decompress, or when the Java servlet sends compressed response data for VBA to decompress.
We are aware of the following facts.
There are three formats provided by zlib (raw, zlib, and gzip).
The compress() and compress2() functions in zlibwapi.dll generate compressed bytes in the zlib format. This has been mentioned in a similar thread at Java decompressing array of bytes.
An Inflater() instance on the Java side can decompress zlib-format data, as per a code sample posted at Compression / Decompression of Strings using the deflater.
Java 8 has zlib version 1.2.5 integrated as part of the java.util.zip package. We have ensured that we are using zlibwapi.dll version 1.2.5 on the VBA side as well.
We have used hex editors to compare byte streams of compressed data generated independently by VBA and Java. We notice some differences in the generated compressed data, and we think it is this difference that is causing the two environments to misunderstand each other.
Additionally, we think that when communication occurs, there has to be some common charset that governs the encoding/decoding scheme between the two endpoints. We have also compared the hex codes of the byte stream generated by VBA with the stream actually communicated across to the Java servlet.
The bytes seem to get some additional 0 bytes inserted in between the actual compressed bytes while communication occurs. This happens on the VBA side, maybe because of some Unicode interpretation.
Whatever bytes get communicated across to Java appear entirely different in their representation.
We need to fix our independently working code so the two sides can communicate, compressing and decompressing each other's data. We think there are two things to address: getting the formats to match, and using a charset that sends the bytes as-is. We are looking for any assistance from experts who can help us find the correct path to a solution. We need answers to the following:
Does compress2() or compress() really generate the zlib format?
Which charset will allow us to send bytes as-is? If there are 10 bytes, we want to send 10 bytes, not 20. With Unicode, 0 bytes get inserted in between, and 10 bytes become 20.

Yes, compress() and compress2() generate the zlib format.
Don't send characters. Send bytes.
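As a rough illustration of both points, here is a minimal sketch of the Java receiving side, assuming the VBA client POSTs the compressed bytes unmodified in the HTTP request body (the class and method names are illustrative, not from the question's code). The key is to read the body as raw bytes, never as a String, and pass them straight to an Inflater, whose default mode expects exactly the zlib format that compress()/compress2() produce:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Inflater;

public class ZlibBytes {

    // Read a request body (or any stream) as raw bytes. Wrapping the stream in a
    // Reader would apply a charset conversion, which is where stray 0 bytes come from.
    public static byte[] readAllBytes(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }

    // Decompress zlib-format data (RFC 1950), which is Inflater's default mode.
    public static byte[] inflate(byte[] compressed) throws DataFormatException {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        while (!inflater.finished()) {
            int n = inflater.inflate(buffer);
            if (n == 0 && inflater.needsInput()) {
                break; // truncated or malformed input
            }
            out.write(buffer, 0, n);
        }
        inflater.end();
        return out.toByteArray();
    }
}
```

The same rule applies on the VBA side: transmit the Byte array itself (for example, as an application/octet-stream request body), never a VBA String. VBA strings are UTF-16 internally, and converting compressed bytes to a string is precisely what interleaves the extra 0 bytes you observed.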

Related

Reading DIEs in ELF file

Hello, I'm fairly new to the DWARF standard and the ELF format, and I have a few questions. I am using the DWARF 2 standard, I have a pretty basic understanding of how DIEs work, and I need more clarity on how they are represented in bytes.
The ELF Wiki provides a good table for the order in which the bytes go in the program header, sections, and segments. But what is the correct way to represent DIEs in bytes for the DWARF 2 standard?
I have tried to dive deep into the DWARF standard's PDF documents to try to understand how DIEs are represented in bytes. Perhaps there is a section I am missing?
I would like to use this information to delete certain DIEs to save space in the debugging section. I am only interested in the DIEs that provide variable addresses.
I recommend that anyone starting out in DWARF begin with the Introduction to the DWARF Debugging Format. It's a very concise overview that provides an excellent foundation for exploring the format in further depth. Armed with this background, compile a debug version of a very simple program and compare a hex dump of the two ELF sections .debug_abbrev and .debug_info with the output of dwarfdump or readelf.
Once you are broadly familiar with the encoding of a DIE you will see that simply deleting its corresponding bytes from .debug_info would corrupt the entire file — in terms of both DWARF and ELF. For example, each DIE is identified by its relative file offset; deleting one DIE's bytes would alter the offsets of all subsequent DIEs and any references to them would therefore be broken. A robust solution would require parsing the DWARF to create an internal representation of the tree before eliminating unwanted nodes and writing out new DWARF. After modifying .debug_info you'd then need to edit the fabric of the ELF itself: at the very least, this would involve updating the section header table to reflect the new offsets for any shifted sections and updating any relocations.
If your principal concern is indeed space saving then I suggest you instead investigate what compiler options you have. The Oracle Studio Compilers, for example, allow fine control over the content included in the DWARF. Depending on your compiler and OS it may also be possible to emit files with compressed DWARF sections (e.g. .zdebug_info) or even leave the DWARF in different files altogether. The problem of DWARF bloat is well known and, if you are interested in tackling it at a low level yourself, you will find other suggestions in Michael Eager's introduction and in later versions of the standard.
The format is explained on page 66, in sections 7.5.2 and 7.5.3.
The example in appendix 2, page 93, is much clearer:
Each DIE references a corresponding entry in .debug_abbrev which defines a given DIE "signature", i.e.:
its type (DW_TAG_*);
whether it has children DIEs;
its attributes (DW_AT_*) and their forms (DW_FORM_*).
The format of the DIE itself is:
a reference to an abbreviation (LEB128, i.e. variable length), where 0 is used for ending a list of children;
one value per attribute (using the encoding associated with the given form).
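To make "LEB128" concrete, here is a hedged sketch of an unsigned LEB128 decoder (Java used purely for illustration); the abbreviation code at the start of every DIE is encoded this way:

```java
// Decode an unsigned LEB128 value starting at data[offset].
// Each byte contributes its low 7 bits, least significant group first;
// a set high bit means another byte follows.
static long readULEB128(byte[] data, int offset) {
    long result = 0;
    int shift = 0;
    while (true) {
        byte b = data[offset++];
        result |= (long) (b & 0x7F) << shift;
        if ((b & 0x80) == 0) { // high bit clear: this was the last byte
            return result;
        }
        shift += 7;
    }
}
```

An abbreviation code of 0 read at a point where a sibling DIE is expected is the end-of-children marker mentioned above.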

ELF representation in HEX

I am working on understanding some ground concepts in embedded systems. My question is similar to understand hexedit of an elf.
In order to burn compiler output to ROM, the .out file is converted to HEX (say, Intel HEX). I wonder how the following information is preserved in the HEX format:
1. Section header
2. Symbol tables, debug symbols, linker symbols, etc.
3. ELF header
4. If these are preserved in the HEX file, how can they be read from it?
5. A bit off-topic, but how does the microcontroller on boot know where .data, .bss, etc. exist in the HEX file and what is to be copied to RAM?
None of that is preserved. A HEX file only contains the raw program and data. https://en.wikipedia.org/wiki/Intel_HEX
The microcontroller does not know where .data and .bss are located - it doesn't even know that they exist. The start-up code which is executed before main() is called contains the start addresses of those sections - the addresses are hard-coded into the program. This start-up code will be in the HEX file like everything else.
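As a hedged sketch of what such start-up code typically looks like (the symbol names follow common linker-script conventions and will vary by toolchain; this is not from any specific vendor's files):

```c
/* Symbols defined by the linker script; their addresses are hard-coded
   into the program at link time. */
extern unsigned long _etext;  /* end of code in flash = start of .data image */
extern unsigned long _sdata;  /* start of .data in RAM */
extern unsigned long _edata;  /* end of .data in RAM */
extern unsigned long _sbss;   /* start of .bss in RAM */
extern unsigned long _ebss;   /* end of .bss in RAM */

extern int main(void);

void Reset_Handler(void)
{
    unsigned long *src = &_etext;
    unsigned long *dst = &_sdata;

    /* Copy initialised data from flash to RAM. */
    while (dst < &_edata)
        *dst++ = *src++;

    /* Zero-fill .bss. */
    for (dst = &_sbss; dst < &_ebss; dst++)
        *dst = 0;

    main();
}
```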
The elements in points 1 to 3 are not included in the raw binary, since they serve no purpose in the application; rather, they are used by the linker and the debugger on the development host and are unnecessary for program execution, where all you need is the byte values and the addresses to write them to, which is more or less all the hex file contains (it may also contain a start address record).
Systems that have dynamic linking or self-hosted debug capabilities (such as VxWorks, for example) use the object file.
With respect to point 5, the microcontroller does not need to know; the linker uses that information when resolving absolute and relative addresses in the object code. Once fully resolved (linked), the addresses are embedded in the code directly. Again, where dynamic loading/linking is used, the object file metadata is required, and such systems do not normally load a raw hex file or binary.
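To make the "byte values plus addresses" point concrete, here is a hedged sketch of parsing a single Intel HEX data record, following the field layout documented on the Wikipedia page linked above (the example record in the comment is taken from that page):

```c
#include <stdio.h>
#include <stdlib.h>

/* Convert two ASCII hex digits to a byte value. */
static int hexbyte(const char *s)
{
    char buf[3] = { s[0], s[1], '\0' };
    return (int)strtol(buf, NULL, 16);
}

/* Parse one record, e.g. ":10010000214601360121470136007EFE09D2190140".
   Layout: ':' | byte count | 16-bit address | record type | data | checksum */
static void parse_record(const char *line)
{
    if (line[0] != ':') {
        fprintf(stderr, "missing record mark\n");
        return;
    }
    int count   = hexbyte(line + 1);                            /* data bytes    */
    int address = (hexbyte(line + 3) << 8) | hexbyte(line + 5); /* load address  */
    int type    = hexbyte(line + 7);                            /* 0=data, 1=EOF */
    unsigned char data[255];
    for (int i = 0; i < count; i++)
        data[i] = (unsigned char)hexbyte(line + 9 + 2 * i);

    /* Note what is absent: no section names, no symbols, no ELF header.
       Only the bytes and the address they belong at. */
    printf("type=%d address=0x%04X bytes=%d first=0x%02X\n",
           type, address, count, count ? data[0] : 0);
}
```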

Lazily Read Stream of Bytes from File in Java 8

So Java 8 introduces a lot of lazily loaded Streams, including one for reading lines from a text file using a particular character encoding.
But after doing a LOT of reading I've determined that there is no out-of-the-box method to lazily read chunks of bytes from a file, and I'm a bit confused as to why this is the case. This is a pretty common use case, so there must be a good reason for it not to have been included, right?
My best solution to this seems to be a custom implementation of a Spliterator to read byte chunks using some guidance from this post:
https://www.airpair.com/java/posts/parallel-processing-of-io-based-data-with-java-streams
But I would love to know why Java 8 doesn't have this feature out of the box?
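In the absence of a built-in, here is a minimal sketch of the kind of custom Spliterator that post describes, assuming fixed-size sequential chunks and no attempt at parallel splitting (the class and method names are illustrative):

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.Consumer;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

public class ByteChunks {

    // Lazily stream a file as byte[] chunks of at most chunkSize bytes.
    public static Stream<byte[]> chunks(Path path, int chunkSize) throws IOException {
        InputStream in = Files.newInputStream(path);
        Spliterator<byte[]> sp = new Spliterators.AbstractSpliterator<byte[]>(
                Long.MAX_VALUE, Spliterator.ORDERED | Spliterator.NONNULL) {
            @Override
            public boolean tryAdvance(Consumer<? super byte[]> action) {
                try {
                    byte[] buf = new byte[chunkSize];
                    int n = in.read(buf);          // reads at most one chunk
                    if (n < 0) {
                        return false;              // end of file
                    }
                    action.accept(n == chunkSize ? buf : Arrays.copyOf(buf, n));
                    return true;
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }
        };
        // Tie the InputStream's lifetime to the Stream; callers should use
        // try-with-resources on the returned Stream.
        return StreamSupport.stream(sp, false).onClose(() -> {
            try {
                in.close();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
    }
}
```

Usage would be something like `try (Stream<byte[]> s = ByteChunks.chunks(path, 4096)) { s.forEach(...); }` so the underlying stream gets closed.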

Is there any way in Fortran 90 to read data at a specified byte?

I have encountered a problem that demands reading data at a specified byte from a binary input file, like reading at a location 40000 bytes off the start of the file. I intend to use direct access to the file, but that requires each segment to be the same size, as specified in the recl argument. Can anybody provide a feasible solution? Some programming languages like C provide functions that can jump to a specified byte.
The Fortran 2003 standard introduced unformatted stream access to do pretty much exactly this. Once the file has been opened appropriately, you can just use a POS specifier in the relevant read statement. Support for this Fortran 2003 feature is reasonably widespread amongst the actively supported Fortran compilers. The compiler needs to use a file storage unit of one byte, but all compilers that I am aware of do this (this is also what the standard recommends).
Otherwise, the closest standard Fortran 90 approach is to use unformatted direct access with a record length that is some reasonable common factor of the desired position and the size of the elements of data to be read. For instance, if you were reading eight-byte real numbers from the file, then a record length of eight might work; you would start reading at record number 5001. This requires both that the file storage unit of the Fortran processor be a byte (common, perhaps with compile options) and that no record delimiters or similar exist in the file for unformatted direct access (mostly the case, again perhaps with compile options).
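A minimal sketch of the stream-access approach (the file name and the eight-byte real are illustrative). Note that POS is 1-based, so a value 40000 bytes off the start of the file begins at position 40001:

```fortran
program read_at_byte
  implicit none
  real(kind=8) :: x

  ! Fortran 2003 unformatted stream access.
  open(unit=10, file='input.bin', access='stream', form='unformatted', &
       status='old', action='read')

  ! Jump straight to byte offset 40000 (position 40001) and read one value.
  read(10, pos=40001) x

  close(10)
  print *, x
end program read_at_byte
```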

Out of memory error when merging large numbers of PDFs using Zend_PDF

We're using the Zend_PDF module in SugarCRM to merge PDF invoices that our system generates. I have been able to successfully merge a number of PDFs (around 10 to 30 in my tests), but we're getting memory errors when we try to merge larger numbers of PDF files. The error looks something like this:
[30-Jan-2012 14:10:20] PHP Fatal error: Allowed memory size of 268435456 bytes exhausted at /usr/local/src/php-5.3.8/Zend/zend_operators.c:1265 (tried to allocate 68134 bytes) in /srv/www/htdocs/sugar6_mf/Zend/Pdf/Element/Object/Stream.php on line 442
The above error was generated when we tried to merge 457 pdf files - that's files, not pages. We're going to need to merge 5,000 and more at a time eventually.
Can anyone offer any help/advice on how to address this?
If needed, ask, and I'll post the code on how the merged pdf is being generated.
Thanks.
I should preface this answer by saying that I know nothing about SugarCRM - my response is based solely on my knowledge of Zend_Pdf.
If my understanding is correct, you have a PHP script (hopefully not running inside Apache considering the length of time it will take to process 5,000 files) that is taking multiple PDF files as input using the Zend_Pdf::load() method and then iterating through the pages of each PDF object and adding them to one target instance of Zend_Pdf, which you are then writing to a file using the save() method.
Using this approach, even if you unset() each of the source PDF objects after you've added the pages to the target PDF object, you'll still need enough memory to store the entire output file. If you blew through 250MB with only 457 files, then I'm guessing your input PDF files are probably about 500KB, so your output file is going to be absolutely huge, so you are still going to end up running out of memory.
My advice would be to ditch this method entirely and use pdftk instead, which you could invoke using the exec() function. I'm sure there's a limit to the size of the arguments you can provide to exec(), so it will probably be a multi-step process with several intermediate files, but ultimately I think this will be a faster, more robust solution.
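A hedged sketch of that multi-step approach (the paths, the batch size, and the output locations are all illustrative):

```php
<?php
// Merge a large number of PDFs with pdftk in batches, then merge the batches.
$files = glob('/path/to/invoices/*.pdf'); // illustrative source directory
$batchSize = 100;                         // keeps each command line short
$intermediates = array();

foreach (array_chunk($files, $batchSize) as $i => $batch) {
    $out = "/tmp/batch_$i.pdf";
    $cmd = 'pdftk ' . implode(' ', array_map('escapeshellarg', $batch))
         . ' cat output ' . escapeshellarg($out);
    exec($cmd, $cmdOutput, $status);
    if ($status !== 0) {
        die("pdftk failed on batch $i\n");
    }
    $intermediates[] = $out;
}

// Final pass: merge the intermediate files into the real output.
$cmd = 'pdftk ' . implode(' ', array_map('escapeshellarg', $intermediates))
     . ' cat output ' . escapeshellarg('/tmp/merged.pdf');
exec($cmd, $cmdOutput, $status);

// Clean up the intermediate files.
foreach ($intermediates as $f) {
    unlink($f);
}
```

Because each pdftk invocation only ever sees one batch, peak memory is bounded by the batch rather than by the total number of files.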
And just to reiterate an earlier point, I would not run this process within Apache. I would set up a cron job that runs at appropriate intervals and drops the output file into a secure area on your web/file server.